Credit scores and other profiling
Aug. 4th, 2025 12:32 pm
One of the most frequently misinterpreted data-science metrics is the likelihood "score". The most popular score classes are the credit score, the trust score, and the automation score, which estimate, respectively, how likely a person is to repay a given loan, to be an honest user rather than a bad actor, and to be a script rather than a live person. These days, such scores are computed as the prediction probability of a machine learning classifier trained on past data that maps a bunch of environmental, metadata, and prior-behaviour information about a person/user to a known binary outcome.
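For concreteness, here is a minimal sketch of that pipeline, assuming a scikit-learn-style workflow; the features and data are entirely invented for illustration, not any real scoring model:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy training data: environmental/metadata/behaviour features mapped to
# a known binary outcome (1 = repaid the loan, 0 = defaulted).
X_train = rng.normal(size=(1000, 3))  # e.g. income, debt ratio, tenure
y_train = (X_train[:, 0] + rng.normal(size=1000) > 0).astype(int)

clf = LogisticRegression().fit(X_train, y_train)

# The "credit score" is just the predicted probability of the good outcome.
applicant = rng.normal(size=(1, 3))
score = clf.predict_proba(applicant)[0, 1]
print(f"score = {score:.2f}")  # a probability, not a verdict
```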
The misinterpretation that makes me categorize this post as fac is the human tendency to treat statistical estimates as deterministic parameters. Humans are bad at probabilities (proof: gambling). Basically, most of us, when hearing "the credit scores for John and Paul are 0.6 and 0.9 respectively", would perceive Paul as 9/6 = one and a half times more financially responsible than John.
But that is bullshit.
It's best understood with the automation trust score. A person cannot be 90% bot or 60% bot. You either are a human or you ain't; a human who got a 0.9 automation score is 100% human, and a bot that got a 0.6 score is 100% bot. The same goes for the credit score: a borrower with a credit score of 600 who repays the loan is a 100% good borrower, and a 900-credit-score borrower who defaults is a 100% bad one. The only way a score can and should be treated is as a statement about populations and risks: "out of a thousand applicants with a credit score of 0.9, a hundred will not pay off properly".
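A quick simulation makes the population reading concrete, under the assumption that the score really is a calibrated probability of repayment:

```python
import numpy as np

rng = np.random.default_rng(42)
score = 0.9
outcomes = rng.random(1000) < score  # each applicant either repays or doesn't

print(outcomes.sum(), "repaid,", (~outcomes).sum(), "defaulted")
# Roughly 900 repaid, 100 defaulted -- but every individual outcome is
# 0 or 1; no single borrower is "90% repaid".
```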
The most obvious (and correct!) parallel here is profiling. Yes, treating people by credit score alone is just that -- good old profiling. And just like with profiling, you have to know when to use it and when not to. Declaring someone definitely good or bad just because they fit the profile is wrong. Using it to plan your business, policy, or any other large-scale endeavour where you have to care about return on investment and cost-benefit analysis is right, and the only way to go. That simple.
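Here is a sketch of what that cost-benefit use looks like, with invented profit and loss figures, just to show the shape of the calculation:

```python
# Expected value of approving one loan, given a calibrated score.
# The interest and loss figures are made up for illustration.
def expected_profit(score: float, interest: float = 200.0,
                    loss_on_default: float = 1000.0) -> float:
    return score * interest - (1 - score) * loss_on_default

for s in (0.6, 0.9, 0.95):
    print(f"score {s:.2f}: expected profit {expected_profit(s):+.0f}")
# Approve where the expectation is positive; with these numbers the
# break-even score is 1000/1200, about 0.83 -- a portfolio decision,
# not a judgment of any individual.
```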
So, what's the takeaway? Don't stop computing scores, but know when to use them. Don't stop profiling, but know why and where you do it.