Motor Claim QC Pack - Yardstick, Worked Examples, Signal Analysis, SOP

1. Classification Yardstick

A claim scores points from the signals below, and the total decides the category. That is the entire rule: there is no model and no judgement call, so the same claim always lands in the same category.

The point score (max 11)

Signal in the claim	Points
Policy holder at fault	+2
All Perils policy (Collision +1, Liability +0)	+2
Recent address change at claim (under 6 months or 2 to 3 years)	+2
Accident at policy start (zero days policy-to-accident)	+2
Rural accident	+1
Vehicle price at an extreme (under 20k or over 69k)	+1
Vehicle 0 to 4 years old	+1

The four categories

Total points	Category	Action	Share of book
0 to 2	Fast track	auto-clear, minimal effort	44%
3	Approve	pay after standard processing	27%
4 to 5	Investigate	refer to the fraud unit before any decision	26%
6 or more	Repudiate	recommend denial, a person decides, never auto-rejected	3%

Investigate and Repudiate are the two fraud flags, and they are what gets scored against the true fraud label. Repudiate here means strong fraud grounds only. This public dataset has no coverage or policy-status field, so coverage-based repudiation is out of scope; a real system would need to add it.
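The point table and the four category bands above can be sketched as a small Python scorer. This is a minimal sketch under stated assumptions, not the pack's own implementation: the field names (fault, policy_type, address_change, days_policy_to_accident, area, vehicle_price, vehicle_age_years) and their value encodings are hypothetical, and vehicle price, vehicle age and days are assumed to be numeric. Map them to your own claim schema before use.

```python
from typing import Dict, List, Tuple

# Hypothetical encodings; the points themselves come from the point table.
POLICY_POINTS = {"All Perils": 2, "Collision": 1, "Liability": 0}
FLAGGED_ADDRESS_CHANGE = {"under 6 months", "2 to 3 years"}


def score_claim(claim: Dict) -> List[Tuple[str, int]]:
    """Return the itemised points a claim scores as (signal, points) pairs."""
    items: List[Tuple[str, int]] = []
    if claim["fault"] == "Policy Holder":
        items.append(("policy holder at fault", 2))
    policy_points = POLICY_POINTS[claim["policy_type"]]
    if policy_points:
        items.append((f"{claim['policy_type']} policy", policy_points))
    if claim["address_change"] in FLAGGED_ADDRESS_CHANGE:
        items.append(("recent address change", 2))
    if claim["days_policy_to_accident"] == 0:
        items.append(("accident at policy start", 2))
    if claim["area"] == "Rural":
        items.append(("rural accident", 1))
    if claim["vehicle_price"] < 20_000 or claim["vehicle_price"] > 69_000:
        items.append(("vehicle price at an extreme", 1))
    if 0 <= claim["vehicle_age_years"] <= 4:
        items.append(("vehicle 0 to 4 years old", 1))
    return items


def categorize(total: int) -> str:
    """Map a point total to one of the four categories."""
    if total <= 2:
        return "Fast track"
    if total == 3:
        return "Approve"
    if total <= 5:
        return "Investigate"
    return "Repudiate"


# Usage: policy holder at fault, All Perils, address changed 2 to 3 years ago,
# with no other signal firing (urban, mid price, older vehicle).
example = {
    "fault": "Policy Holder",
    "policy_type": "All Perils",
    "address_change": "2 to 3 years",
    "days_policy_to_accident": 30,
    "area": "Urban",
    "vehicle_price": 30_000,
    "vehicle_age_years": 7,
}
breakdown = score_claim(example)
total = sum(points for _, points in breakdown)
print(total, categorize(total))  # 6 Repudiate
```

Returning the itemised breakdown rather than only the total keeps the "why" visible to the handler, which is what lets a category be re-derived by hand.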

2. Worked Examples

Four claims scored point by point, one per category. Use these to onboard a handler and to show exactly why a claim lands where it does.

Claim facts	Points scored	Total	Category
Third party at fault, Liability policy, no address change, urban, mid price, 7-year-old vehicle	nothing scores	0	Fast track
Policy holder at fault, Collision policy, no other signals	fault +2, Collision +1	3	Approve
Policy holder at fault, All Perils policy, no other signals	fault +2, All Perils +2	4	Investigate
Policy holder at fault, All Perils policy, address changed 2 to 3 years ago	fault +2, All Perils +2, recent address change +2	6	Repudiate

Rule of thumb for handlers: you can always re-derive a category by hand from the point table. If a claim feels misplaced, check its facts against the signals; the rule does not bend.

3. How the Signals Were Chosen

Every weight is grounded in the true fraud rate by field across all 15,420 claims (base rate 6.0%). Signals that did not separate fraud, or that ran inverse to intuition, were left out.

Signals kept (real fraud lift)

Field value	Fraud rate	Lift vs 6.0% base
Address change under 6 months	75.0%	12.5x
Address change 2 to 3 years	17.5%	2.9x
Accident at policy start (zero days)	16.4%	2.7x
All Perils policy	10.2%	1.7x
Vehicle price under 20k	9.4%	1.6x
Rural accident	8.3%	1.4x
Policy holder at fault	7.9%	1.3x (third party 0.9%)

Signals dropped (intuitive but wrong here)

Prior-claims pattern. Folklore says more priors means more fraud. In this data the relationship is inverse: claimants with no prior claims have the highest fraud rate (7.8%), and those with more than 4 prior claims the lowest (3.4%). Dropped.
Late reporting. 99% of claims are reported more than 30 days out, so it does not discriminate. Dropped.
No police report. Only a weak lift (6.0% without a report vs 3.7% with one), and almost every claim has no report, so the signal barely separates claims. Dropped.

This is the honest core of the build: the rule uses what the data proves, not what sounds right. To re-derive the yardstick on a new book, recompute fraud rate by field and keep the signals with real lift.
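The re-derivation step, fraud rate by field value and its lift over the base rate, can be sketched as follows. It is a minimal sketch assuming each claim is a dict with a 0/1 fraud label; the function name fraud_lift_by_value and the field and label names are hypothetical. The claim count per value is returned alongside the rate so that a high rate resting on a handful of claims can be spotted before it is given a weight.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fraud_lift_by_value(
    claims: Iterable[Dict], field: str, label: str = "fraud"
) -> List[Tuple[str, int, float, float]]:
    """Return (value, claim count, fraud rate, lift vs base), highest lift first."""
    claims = list(claims)
    if not claims:
        raise ValueError("no claims supplied")
    base_rate = sum(c[label] for c in claims) / len(claims)
    if base_rate == 0:
        raise ValueError("no fraud in the sample, lift is undefined")

    tally = defaultdict(lambda: [0, 0])  # value -> [fraud count, claim count]
    for c in claims:
        tally[c[field]][0] += c[label]
        tally[c[field]][1] += 1

    rows = []
    for value, (frauds, n) in tally.items():
        rate = frauds / n
        rows.append((value, n, rate, rate / base_rate))
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows


# Usage on a toy sample (illustrative data only, not the 15,420-claim book).
toy = (
    [{"area": "Rural", "fraud": 1}] * 2
    + [{"area": "Rural", "fraud": 0}] * 2
    + [{"area": "Urban", "fraud": 1}] * 1
    + [{"area": "Urban", "fraud": 0}] * 5
)
for value, n, rate, lift in fraud_lift_by_value(toy, "area"):
    print(f"{value}: n={n}, fraud rate={rate:.1%}, lift={lift:.1f}x")
```

Run this once per candidate field; values whose lift sits near 1.0x, or runs opposite to intuition, are the ones the pack drops.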

4. QC Standard Operating Procedure

How an adjuster team runs the queue day to day, and the weekly check that keeps the rule honest.

4.1 Daily review queue

Work by category. Repudiate and Investigate first, then Approve; Fast track clears in a batch with a logged reason.
Read the points. Every claim shows the points it scored and why. Confirm the facts before acting.
Decide and record. Confirm, downgrade, or escalate. A flag is a review trigger, never an auto-denial; a repudiation is a human decision with reasons recorded.
Log the outcome. Note the final decision and, once known, the true fraud outcome. This feeds the weekly scorecard.
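The daily steps above can be sketched as a queue builder plus a decision record. This is a sketch under assumptions: the record fields (claim_id, category, points, handler, fraud_outcome) are hypothetical, and ordering claims by higher points within a category is an illustrative choice the pack does not specify.

```python
from typing import Dict, List, Optional, Tuple

# Review order from the SOP: Repudiate and Investigate first, then Approve.
REVIEW_ORDER = {"Repudiate": 0, "Investigate": 1, "Approve": 2}
ALLOWED_DECISIONS = {"confirm", "downgrade", "escalate"}


def build_daily_queue(claims: List[Dict]) -> Tuple[List[Dict], List[Dict]]:
    """Split claims into an ordered review queue and a Fast track batch."""
    review = [c for c in claims if c["category"] in REVIEW_ORDER]
    # Assumption: within a category, higher point totals are looked at first.
    review.sort(key=lambda c: (REVIEW_ORDER[c["category"]], -c["points"]))
    batch = [c for c in claims if c["category"] == "Fast track"]
    return review, batch


def record_decision(claim: Dict, decision: str, reason: str, handler: str) -> Dict:
    """Build a log entry; a decision without a named person and a reason is refused."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"decision must be one of {sorted(ALLOWED_DECISIONS)}")
    if not reason.strip():
        raise ValueError("every decision needs a recorded reason")
    if not handler.strip():
        raise ValueError("a named person must make the decision")
    fraud_outcome: Optional[bool] = None  # filled in later, once known
    return {
        "claim_id": claim["claim_id"],
        "category": claim["category"],
        "points": claim["points"],
        "decision": decision,
        "reason": reason,
        "handler": handler,
        "fraud_outcome": fraud_outcome,
    }


# Usage with illustrative claims.
claims = [
    {"claim_id": "A", "category": "Approve", "points": 3},
    {"claim_id": "B", "category": "Fast track", "points": 1},
    {"claim_id": "C", "category": "Investigate", "points": 4},
    {"claim_id": "D", "category": "Repudiate", "points": 6},
    {"claim_id": "E", "category": "Investigate", "points": 5},
]
review, batch = build_daily_queue(claims)
print([c["claim_id"] for c in review])  # ['D', 'E', 'C', 'A']
```

Refusing an entry that has no reason or no handler is one way to make "a flag is a review trigger, never an auto-denial" hold in the log itself.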

4.2 Weekly QC scorecard

Pull the week's closed claims that now carry a confirmed fraud outcome.
Recompute the scorecard: catch rate, flag accuracy, false-alarm rate, and F1, each with a 95% confidence interval.
Check calibration. Does a higher point score still land on fraud more often? If the climb flattens, the signals need a refresh.
Set the review threshold. Move the flag line (default 4 points and above) to the agreed point on the catch-rate versus false-alarm trade-off, and record why.
Re-derive on drift. If catch rate falls two weeks running, recompute fraud rate by field on recent claims and update the weights.
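The scorecard step can be sketched as below. Two assumptions are labelled here and in the code: "flag accuracy" is read as precision (the share of flagged claims that turn out to be fraud), and the 95% intervals use a percentile bootstrap, which is one reasonable choice rather than the method the pack prescribes. The function names scorecard and bootstrap_ci are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple


def scorecard(points: Sequence[int], fraud: Sequence[int], threshold: int = 4) -> Dict[str, float]:
    """Compute the four weekly metrics for claims flagged at `threshold` points or more."""
    tp = fp = fn = tn = 0
    for p, y in zip(points, fraud):
        flagged = p >= threshold
        if flagged and y:
            tp += 1
        elif flagged:
            fp += 1
        elif y:
            fn += 1
        else:
            tn += 1

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "catch_rate": ratio(tp, tp + fn),        # recall
        "flag_accuracy": ratio(tp, tp + fp),     # assumption: precision
        "false_alarm_rate": ratio(fp, fp + tn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_ci(
    points: Sequence[int],
    fraud: Sequence[int],
    threshold: int = 4,
    n_boot: int = 2000,
    seed: int = 0,
) -> Dict[str, Tuple[float, float]]:
    """Percentile-bootstrap 95% interval per metric (assumed method)."""
    rng = random.Random(seed)
    n = len(points)
    samples: Dict[str, List[float]] = {k: [] for k in scorecard(points, fraud, threshold)}
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        metrics = scorecard([points[i] for i in idx], [fraud[i] for i in idx], threshold)
        for name, value in metrics.items():
            if value == value:  # skip NaN from resamples with an empty denominator
                samples[name].append(value)
    intervals = {}
    for name, values in samples.items():
        if not values:
            intervals[name] = (float("nan"), float("nan"))
            continue
        values.sort()
        lo = values[int(0.025 * (len(values) - 1))]
        hi = values[int(0.975 * (len(values) - 1))]
        intervals[name] = (lo, hi)
    return intervals


# Usage on illustrative data: point totals and confirmed fraud outcomes (1 = fraud).
week_points = [0, 1, 4, 5, 6, 2, 3, 4]
week_fraud = [0, 0, 1, 0, 1, 1, 0, 0]
print(scorecard(week_points, week_fraud))
print(bootstrap_ci(week_points, week_fraud, n_boot=500))
```

Calling scorecard with several threshold values gives the catch-rate versus false-alarm trade-off from which the flag line is set. On a single week of closed claims the intervals are likely to be wide, which is itself useful to see before moving the threshold.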

The scorecard is the trust contract. The team keeps using the rule only as long as the weekly numbers hold, and updates it the moment they do not.
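The calibration and drift checks from the weekly procedure can be sketched as follows. This is a minimal sketch: the function names and the tolerance parameter are hypothetical, and "falls two weeks running" is read as two consecutive week-on-week drops in catch rate.

```python
from collections import defaultdict
from typing import Dict, Sequence


def fraud_rate_by_score(points: Sequence[int], fraud: Sequence[int]) -> Dict[int, float]:
    """Fraud rate for each point total, in ascending order of points."""
    tally = defaultdict(lambda: [0, 0])  # points -> [fraud count, claim count]
    for p, y in zip(points, fraud):
        tally[p][0] += y
        tally[p][1] += 1
    return {p: frauds / n for p, (frauds, n) in sorted(tally.items())}


def climb_is_intact(rates: Dict[int, float], tolerance: float = 0.0) -> bool:
    """True if the fraud rate never drops by more than `tolerance` as points rise."""
    values = list(rates.values())
    return all(later >= earlier - tolerance for earlier, later in zip(values, values[1:]))


def needs_rederive(weekly_catch_rates: Sequence[float]) -> bool:
    """True if the catch rate fell in each of the last two weeks."""
    if len(weekly_catch_rates) < 3:
        return False
    a, b, c = weekly_catch_rates[-3:]
    return c < b < a


# Usage on illustrative data.
pts = [0, 0, 3, 3, 4, 4, 6, 6]
outcome = [0, 0, 0, 1, 1, 0, 1, 1]
rates = fraud_rate_by_score(pts, outcome)
print(rates, climb_is_intact(rates))
print(needs_rederive([0.70, 0.65, 0.60]))  # True
```

A small non-zero tolerance may be sensible on weekly samples, where per-score fraud rates are noisy; the right value is a judgement for the team to agree and record.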