AI Project

Borrower Risk Evaluator

Loan-default classifier that measures its own accuracy

Lee Yih Ven

Most AI demos report a confidence score that nobody ever checks. Borrower Risk Evaluator checks it.

Classifying records is the easy half of an AI build. The hard half, usually skipped, is evaluation: proving with evidence that the output is any good. This project keeps the classifier deliberately simple and spends its effort on the proof.

The data is a public lending dataset of 32,577 labelled loans, about 21% of them ending in default. The pieces:

The brain: Google Gemini (gemini-2.5-flash) returns a typed verdict for each borrower through structured output: a status, a confidence, and a one-line reason. The borrower's facts go in; the real outcome never does.
The gate: a short deterministic rule, not a model, routes each borrower to auto-approve, auto-flag, or send-to-human based on how sure the model is. Not every part needs an LLM.
The proof: every prediction is scored against the real Current_loan_status label, producing accuracy, precision, recall, F1, a confusion matrix, and accuracy on the cases the model cleared on its own. Each number carries a 95% confidence interval, so the margin of error is visible rather than hidden.
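
As a rough illustration of the gate and the proof, here is a minimal Python sketch. Everything named in it is an assumption made for the example and is not taken from the project's source: the Verdict fields, the "default" / "no_default" label strings, the 0.85 threshold, and the choice of a Wilson score interval for the 95% confidence intervals. F1 is reported without an interval because it is not a simple proportion; a bootstrap would be one way to add one.

```python
import math
from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical shape of the model's typed verdict."""
    status: str        # assumed labels: "default" or "no_default"
    confidence: float  # model's self-reported confidence, 0.0 to 1.0
    reason: str        # one-line explanation


def route(verdict: Verdict, threshold: float = 0.85) -> str:
    """Deterministic gate: low confidence goes to a human.

    The 0.85 threshold is an illustrative assumption.
    """
    if verdict.confidence < threshold:
        return "send-to-human"
    return "auto-flag" if verdict.status == "default" else "auto-approve"


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if n == 0:
        return (0.0, 1.0)  # no data, so the interval is uninformative
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def score(predicted: list, actual: list, positive: str = "default") -> dict:
    """Compare predictions with true labels; 'default' is the positive class."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    n = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": ((tp + tn) / n if n else 0.0, wilson_interval(tp + tn, n)),
        "precision": (precision, wilson_interval(tp, tp + fp)),
        "recall": (recall, wilson_interval(tp, tp + fn)),
        "f1": f1,
    }
```

Under the same assumptions, accuracy on the automatically handled cases could be obtained by keeping only the borrowers whose route is not "send-to-human" and passing that subset to score.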

A lazy model that calls every borrower "no default" already scores about 79% accuracy here while catching zero defaults. That is exactly why precision and recall, not accuracy, carry the story, and why measuring confidence matters when language-model confidence is so often miscalibrated.
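
The arithmetic behind that baseline can be checked with a short sketch. It assumes the default count is exactly 21% of 32,577 rounded to the nearest loan; the text only says "about 21%", so the real count may differ slightly. The function name is made up for this example.

```python
def always_no_default_baseline(n_loans: int, default_rate: float) -> dict:
    """Metrics for a model that predicts 'no default' for every borrower."""
    defaults = round(n_loans * default_rate)  # assumed count, from "about 21%"
    non_defaults = n_loans - defaults

    # The model never predicts the positive class ("default").
    tp, fp = 0, 0
    fn, tn = defaults, non_defaults

    accuracy = (tp + tn) / n_loans
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision is 0/0 here: no positive predictions were made at all.
    precision = tp / (tp + fp) if (tp + fp) else None
    return {"accuracy": accuracy, "recall": recall, "precision": precision}


baseline = always_no_default_baseline(32_577, 0.21)
print(f"accuracy={baseline['accuracy']:.3f}, recall={baseline['recall']:.3f}")
```

Accuracy lands near 0.79 purely because of the class balance, while recall is zero and precision is undefined, which is the gap the accuracy figure alone would hide.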

The demo classifies 150 held-out borrowers live with your own Gemini key, and a built-in guide explains what would need to change to run this on thousands. The tool is framed as decision support, not auto-rejection: a human owns the final call.

Live demo → Source →

#AIEngineering #MachineLearning #ModelEvaluation