The operating layer behind the evaluator: the fixed classification yardstick, worked examples, the data behind every signal, and the standard procedure an adjuster team runs each week.
A claim scores points from the signals below, and the total decides the category. That is the entire rule: there is no model and no judgement call, so the same claim always lands in the same category.
The point score (max 11)
| Signal in the claim | Points |
| --- | --- |
| Policy holder at fault | +2 |
| All Perils policy (Collision +1, Liability +0) | +2 |
| Recent address change at claim (under 6 months or 2 to 3 years) | +2 |
| Accident at policy start (zero days policy-to-accident) | +2 |
| Rural accident | +1 |
| Vehicle price at an extreme (under 20k or over 69k) | +1 |
| Vehicle 0 to 4 years old | +1 |
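The point table can be written as a small scoring function. This is a minimal sketch, not the evaluator's actual code: the `Claim` record, its field names, the address-change band labels, and the reading of "20k" and "69k" as 20,000 and 69,000 are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Points per policy type, taken from the point table.
POLICY_POINTS = {"All Perils": 2, "Collision": 1, "Liability": 0}

# Address-change bands that score. The label spellings are assumed.
SCORING_ADDRESS_BANDS = {"under 6 months", "2 to 3 years"}


@dataclass(frozen=True)
class Claim:
    """Hypothetical claim record; field names and encodings are assumptions."""
    policy_holder_at_fault: bool
    policy_type: str              # "All Perils", "Collision" or "Liability"
    address_change: str           # e.g. "none", "under 6 months", "2 to 3 years"
    days_policy_to_accident: int  # 0 means accident at policy start
    rural: bool
    vehicle_price: float          # assumed to be in the same units as 20k / 69k
    vehicle_age_years: int


def score_claim(claim: Claim) -> int:
    """Return the total points for one claim (0 to 11)."""
    points = 0
    if claim.policy_holder_at_fault:
        points += 2
    points += POLICY_POINTS[claim.policy_type]
    if claim.address_change in SCORING_ADDRESS_BANDS:
        points += 2
    if claim.days_policy_to_accident == 0:
        points += 2
    if claim.rural:
        points += 1
    if claim.vehicle_price < 20_000 or claim.vehicle_price > 69_000:
        points += 1
    if 0 <= claim.vehicle_age_years <= 4:
        points += 1
    return points


# A claim that fires every signal reaches the stated maximum of 11.
worst_case = Claim(True, "All Perils", "under 6 months", 0, True, 15_000, 2)
print(score_claim(worst_case))
```

Because the function is a fixed sum of fixed weights, the same inputs always give the same total, which is the property the rule relies on.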
The four categories
| Total points | Category | Action | Share of book |
| --- | --- | --- | --- |
| 0 to 2 | Fast track | auto-clear, minimal effort | 44% |
| 3 | Approve | pay after standard processing | 27% |
| 4 to 5 | Investigate | refer to the fraud unit before any decision | 26% |
| 6 or more | Repudiate | recommend denial, a person decides, never auto-rejected | 3% |
Investigate and Repudiate are the two fraud flags, and they are what gets scored against the real label. Repudiate here means strong fraud grounds; this public dataset has no coverage or policy-status field, so coverage-based repudiation is out of scope and would be added in a real system.
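The category bands translate directly into a lookup on the total. The sketch below assumes the total is already computed; the names `categorize`, `ACTIONS`, and `is_fraud_flag` are illustrative and do not come from the evaluator.

```python
# Action text per category, copied from the category table.
ACTIONS = {
    "Fast track": "auto-clear, minimal effort",
    "Approve": "pay after standard processing",
    "Investigate": "refer to the fraud unit before any decision",
    "Repudiate": "recommend denial, a person decides, never auto-rejected",
}

# The two categories that count as fraud flags.
FRAUD_FLAGS = {"Investigate", "Repudiate"}


def categorize(total: int) -> str:
    """Map a point total to one of the four categories."""
    if total < 0:
        raise ValueError("point total cannot be negative")
    if total <= 2:
        return "Fast track"
    if total == 3:
        return "Approve"
    if total <= 5:
        return "Investigate"
    return "Repudiate"


def is_fraud_flag(total: int) -> bool:
    """True when the total lands in Investigate or Repudiate (4 points and above)."""
    return categorize(total) in FRAUD_FLAGS


for total in (0, 3, 4, 6):
    category = categorize(total)
    print(total, category, "|", ACTIONS[category])
```

Note that Repudiate maps to a recommendation only; nothing in this lookup denies a claim.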
2. Worked Examples
Four claims scored point by point, one per category. Use these to onboard a handler and to show exactly why a claim lands where it does.
| Claim facts | Points scored | Total | Category |
| --- | --- | --- | --- |
| Third party at fault, Liability policy, no address change, urban, mid price, 7-year-old vehicle | nothing scores | 0 | Fast track |
| Policy holder at fault, Collision policy, no other signals | fault +2, Collision +1 | 3 | Approve |
| Policy holder at fault, All Perils policy, no other signals | fault +2, All Perils +2 | 4 | Investigate |
| Policy holder at fault, All Perils policy, address changed 2 to 3 years ago | fault +2, All Perils +2, recent address change +2 | 6 | Repudiate |
Rule of thumb for handlers: you can always re-derive a category by hand from the point table. If a claim feels misplaced, check its facts against the signals; the rule does not bend.
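The hand re-derivation can also be scripted for onboarding. This sketch takes the points a claim scored as given (it does not re-score the facts) and prints the arithmetic behind the category. The function name `explain` and the output format are assumptions for illustration.

```python
# Upper bound of each band and its category, from the category table.
BANDS = [(2, "Fast track"), (3, "Approve"), (5, "Investigate")]


def explain(points: dict[str, int]) -> str:
    """Render the fired signals, their sum, and the resulting category."""
    total = sum(points.values())
    category = "Repudiate"  # 6 or more, unless a lower band matches
    for upper, name in BANDS:
        if total <= upper:
            category = name
            break
    fired = ", ".join(f"{signal} +{p}" for signal, p in points.items())
    return f"{fired or 'nothing scores'} = {total} -> {category}"


# The four worked examples, one per category.
print(explain({}))
print(explain({"fault": 2, "Collision": 1}))
print(explain({"fault": 2, "All Perils": 2}))
print(explain({"fault": 2, "All Perils": 2, "recent address change": 2}))
# Last line printed: fault +2, All Perils +2, recent address change +2 = 6 -> Repudiate
```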
3. How the Signals Were Chosen
Every weight is grounded in the true fraud rate by field across all 15,420 claims (base rate 6.0%). Signals that did not separate fraud, or that ran inverse to intuition, were left out.
Signals kept (real fraud lift)
| Field value | Fraud rate | vs 6.0% base |
| --- | --- | --- |
| Address change under 6 months | 75.0% | 12.5x |
| Accident at policy start (zero days) | 16.4% | 2.7x |
| Address change 2 to 3 years | 17.5% | 2.9x |
| All Perils policy | 10.2% | 1.7x |
| Vehicle price under 20k | 9.4% | 1.6x |
| Policy holder at fault | 7.9% | 1.3x (third party 0.9%) |
| Rural accident | 8.3% | 1.4x |
Signals dropped (intuitive but wrong here)
Prior-claims pattern. Folklore says more priors means more fraud. In this data it is inverse: claimants with no prior claims have the highest fraud (7.8%), and those with more than 4 the lowest (3.4%). Dropped.
Late reporting. 99% of claims are reported more than 30 days out, so it does not discriminate. Dropped.
No police report. Only a weak lift (6.0% vs 3.7%), and almost every claim has no report. Dropped.
This is the honest core of the build: the rule uses what the data proves, not what sounds right. To re-derive the yardstick on a new book, recompute fraud rate by field and keep the signals with real lift.
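Recomputing fraud rate by field comes down to a grouped count and a division by the base rate. The sketch below assumes claims arrive as dictionaries with a 0/1 fraud label; the function name, the key names, and the toy records are made up for illustration and are not the dataset's real column names or figures.

```python
from collections import defaultdict


def fraud_lift_by_value(claims, field, label="fraud"):
    """Return {field value: {"n", "fraud_rate", "lift"}} for one field.

    `claims` is an iterable of dicts; `label` holds 1 for fraud, 0 otherwise.
    Lift is the value's fraud rate divided by the overall base rate.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [claims, fraud claims]
    total = fraud = 0
    for claim in claims:
        cell = counts[claim[field]]
        cell[0] += 1
        cell[1] += claim[label]
        total += 1
        fraud += claim[label]
    if total == 0 or fraud == 0:
        raise ValueError("need at least one claim and one fraud case")
    base_rate = fraud / total
    return {
        value: {"n": n, "fraud_rate": k / n, "lift": (k / n) / base_rate}
        for value, (n, k) in counts.items()
    }


# Toy data only: 10 claims, 2 of them fraud, both rural.
toy = (
    [{"area": "rural", "fraud": 1}] * 2
    + [{"area": "rural", "fraud": 0}] * 2
    + [{"area": "urban", "fraud": 0}] * 6
)
for value, stats in fraud_lift_by_value(toy, "area").items():
    print(value, stats)
```

Returning `n` alongside the rate keeps small cells visible, so a high lift that rests on a handful of claims can be judged before it becomes a weight. The same arithmetic reproduces the table: 16.4% against a 6.0% base is a lift of about 2.7x.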
4. QC Standard Operating Procedure
How an adjuster team runs the queue day to day, and the weekly check that keeps the rule honest.
4.1 Daily review queue
Work by category. Repudiate and Investigate first, then Approve; Fast track clears in a batch with a logged reason.
Read the points. Every claim shows the points it scored and why. Confirm the facts before acting.
Decide and record. Confirm, downgrade, or escalate. A flag is a review trigger, never an auto-denial; a repudiation is a human decision with reasons recorded.
Log the outcome. Note the final decision and, once known, the true fraud outcome. This feeds the weekly scorecard.
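The queue order in the steps above can be sketched as a sort on category priority. The tuple layout `(claim_id, category, points)` and the tie-break of higher points first within a category are assumptions; the procedure itself only fixes the order of the categories.

```python
# Lower number is worked first. Fast track is handled as a separate batch.
PRIORITY = {"Repudiate": 0, "Investigate": 1, "Approve": 2}


def build_queue(claims):
    """Split claims into an ordered review queue and a Fast track batch.

    `claims` is a list of (claim_id, category, points) tuples.
    """
    review = [c for c in claims if c[1] in PRIORITY]
    # Sort by category priority, then by points descending (assumed tie-break).
    review.sort(key=lambda c: (PRIORITY[c[1]], -c[2]))
    fast_track_batch = [c for c in claims if c[1] == "Fast track"]
    return review, fast_track_batch


# Hypothetical claims for illustration.
claims = [
    ("C-101", "Approve", 3),
    ("C-102", "Fast track", 1),
    ("C-103", "Investigate", 4),
    ("C-104", "Repudiate", 7),
    ("C-105", "Investigate", 5),
]
review, batch = build_queue(claims)
print([c[0] for c in review])  # ['C-104', 'C-105', 'C-103', 'C-101']
print([c[0] for c in batch])   # ['C-102']
```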
4.2 Weekly QC scorecard
Pull the week's closed claims that now carry a confirmed fraud outcome.
Recompute the scorecard: catch rate, flag accuracy, false-alarm rate, and F1, each with a 95% confidence interval.
Check calibration. Does a higher point score still land on fraud more often? If the climb flattens, the signals need a refresh.
Set the review threshold. Move the flag line (default 4 points and above) to the agreed point on the catch-rate versus false-alarm trade-off, and record why.
Re-derive on drift. If catch rate falls two weeks running, recompute fraud rate by field on recent claims and update the weights.
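The scorecard step can be sketched as follows. Several things here are assumptions rather than the team's documented method: catch rate is read as recall, flag accuracy as precision, and false-alarm rate as the share of non-fraud claims that were flagged; the three proportions get a Wilson score interval and F1 gets a percentile bootstrap interval. All function names are illustrative.

```python
import math
import random


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for k successes in n trials."""
    if n == 0:
        return (math.nan, math.nan)
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def confusion(pairs):
    """Count (tp, fp, fn, tn) from (flagged, is_fraud) pairs."""
    tp = fp = fn = tn = 0
    for flagged, is_fraud in pairs:
        if flagged and is_fraud:
            tp += 1
        elif flagged:
            fp += 1
        elif is_fraud:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else math.nan


def ratio(k, n):
    return k / n if n else math.nan


def scorecard(pairs, n_boot=2000, seed=0):
    """Return {metric: (estimate, (low, high))} for one week's closed claims."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no closed claims with a confirmed outcome")
    tp, fp, fn, tn = confusion(pairs)

    # Percentile bootstrap for F1: resample claims with replacement.
    rng = random.Random(seed)
    boots = []
    for _ in range(n_boot):
        btp, bfp, bfn, _btn = confusion(rng.choices(pairs, k=len(pairs)))
        if 2 * btp + bfp + bfn:
            boots.append(f1_score(btp, bfp, bfn))
    boots.sort()
    if boots:
        f1_ci = (boots[int(0.025 * len(boots))],
                 boots[min(len(boots) - 1, int(0.975 * len(boots)))])
    else:
        f1_ci = (math.nan, math.nan)

    return {
        "catch_rate": (ratio(tp, tp + fn), wilson_interval(tp, tp + fn)),
        "flag_accuracy": (ratio(tp, tp + fp), wilson_interval(tp, tp + fp)),
        "false_alarm_rate": (ratio(fp, fp + tn), wilson_interval(fp, fp + tn)),
        "f1": (f1_score(tp, fp, fn), f1_ci),
    }
```

A week with few confirmed fraud cases will produce wide intervals, which is worth reading before reacting to a single week's drop in catch rate.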
The scorecard is the trust contract. The team keeps using the rule only as long as the weekly numbers hold, and updates it the moment they do not.
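The calibration check, the threshold setting, and the drift trigger from the weekly steps above can each be sketched in a few lines. The input layout `(points, is_fraud)`, the function names, and the strict reading of "falls two weeks running" as two consecutive week-on-week drops are assumptions.

```python
import math
from collections import defaultdict


def fraud_rate_by_score(scored):
    """Return {points: fraud rate} from (points, is_fraud) pairs, sorted by points."""
    counts = defaultdict(lambda: [0, 0])  # points -> [claims, fraud claims]
    for points, is_fraud in scored:
        counts[points][0] += 1
        counts[points][1] += int(is_fraud)
    return {p: counts[p][1] / counts[p][0] for p in sorted(counts)}


def flat_steps(rates):
    """List (lower, higher) score pairs where the fraud rate failed to rise."""
    keys = sorted(rates)
    return [(a, b) for a, b in zip(keys, keys[1:]) if rates[b] <= rates[a]]


def threshold_tradeoff(scored, thresholds):
    """Return {threshold: (catch rate, false-alarm rate)} when flagging points >= threshold."""
    scored = list(scored)
    n_fraud = sum(1 for _, y in scored if y)
    n_clean = len(scored) - n_fraud
    out = {}
    for t in thresholds:
        tp = sum(1 for p, y in scored if y and p >= t)
        fp = sum(1 for p, y in scored if not y and p >= t)
        out[t] = (tp / n_fraud if n_fraud else math.nan,
                  fp / n_clean if n_clean else math.nan)
    return out


def fell_two_weeks_running(weekly_catch_rates):
    """True when each of the last two weeks is below the week before it."""
    if len(weekly_catch_rates) < 3:
        return False
    a, b, c = weekly_catch_rates[-3:]
    return c < b < a


# Toy data for illustration only.
scored = [(0, False)] * 4 + [(2, True), (4, True), (4, False), (6, True)]
rates = fraud_rate_by_score(scored)
print(rates)
print(flat_steps(rates))
print(threshold_tradeoff(scored, [4, 6]))
print(fell_two_weeks_running([0.62, 0.58, 0.51]))
```

With small weekly samples a single flat step is often noise, so `flat_steps` is better treated as a prompt to look than as an automatic trigger.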