The Issue of Fairness in AI is Overblown

As AI is used to make decisions that affect people, such as screening job applicants at Amazon, vetting babysitters, or setting insurance premiums, the question of fairness has come up. For example, Amazon’s system was found to rate women as less promising than men. This was perceived as unfair, and the system was shut down.

This is an example of the fairness issue in AI.

How should we look at this debate?

First, we should be skeptical and demand proof that AI systems actually do what they promise. The builders of a system talk it up in glowing terms to get customers to buy and use it. Even for internal tools, I’ve observed that the teams building them tend to present them in the most positive light, to improve their standing within the company, get promoted, gain influence, and so on. In today’s marketing-riddled world, every product is supposedly amazing, great or <insert superlative here>. We should ignore this marketing bullshit. A company licensing an AI system from a vendor should insist that the vendor share all the data they have, so that the buyer can evaluate its pros and cons for themselves. Raw data, not marketing bullet points like “Up to 45% better!” This data should also be released publicly, instead of being treated as a competitive advantage. It should, of course, be anonymised to protect privacy, with the un-anonymised, or less anonymised, dataset made available to researchers at universities. With this transparency in place, society as a whole can have an informed discussion about the issue, as opposed to falling back on black-and-white preconceived positions like “AI will improve the world!” vs “AI will perpetuate unfairness!”

Second, the alternative to AI is humans, who have their own biases. So the question is whether the AI is more or less fair than the humans it’s replacing, not whether the AI is fair in every single case. Unfortunately, this nuance is often missing from discussions. Saying AI is unfair is like saying traveling by train is unsafe. No, trains are actually far safer than driving, probably 100x safer in India [1]. But if you make a blanket statement that traveling by train is unsafe, someone might listen to you and choose to drive, thus taking a far bigger risk. That’s the danger of black-and-white statements like “Traveling by train is dangerous” or “AI is unfair.” Critics would be more intellectually honest to phrase it as an open question for us to consider, like “Is AI fair?”, not a predetermined conclusion, like “AI is unfair.”

Third, it’s easy to point out cases where an AI unfairly rejected a candidate, but what about the qualified people selected by the AI who wouldn’t otherwise have been? If for every person in the former group there’s more than one in the latter, the AI is actually fairer than the system it’s replacing.

As a concrete example, Google hires only 1 in 428 applicants. The hiring process is a funnel, with most people rejected in the first stage. If we assume that 90% are rejected without even being called for a phone interview, and thus never given a chance to prove their skills, that’s roughly 385 people per hire. If even one among those 385 gets hired because of the AI who’d have been overlooked by a human recruiter, that’s a win for the AI.
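Here’s a minimal sketch of that funnel arithmetic in Python. The 90% early-rejection rate is my assumption from the paragraph above, not a published Google figure:

```python
# Sanity check of the funnel arithmetic above.
applicants_per_hire = 428  # reported ratio: Google hires 1 in 428 applicants
early_reject_rate = 0.90   # assumption: 90% screened out before any interview

rejected_without_interview = applicants_per_hire * early_reject_rate
print(f"Rejected before a phone interview, per hire: {rejected_without_interview:.0f}")  # ~385

# If an AI screener surfaces even 1 of these ~385 overlooked candidates
# per hire, while wrongly rejecting fewer than the human process does,
# it's a net win by the criterion above.
```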

Again, we don’t live in a utopia, so the question to ask is whether the new way is better than the old way, not whether it meets some absolute standard of perfection the critic has set in their mind.

Fourth, AI can augment existing systems, not just replace them. In the above example of Amazon potentially rejecting qualified women, nothing prevents Amazon recruiters from manually reviewing women with borderline scores, to offset the known bias of the system. For example, if humans look at the top 5% of men by score, they can look at the top 8% of women. A known bias that can be compensated for is not an argument for rejecting the system.
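As a rough illustration, here’s what that per-group review band could look like in Python. The score distributions, group sizes, and the 5%/8% cutoffs are all illustrative assumptions, not details of Amazon’s actual system:

```python
# A minimal sketch of the per-group review band described above.
# Score distributions and the 5% / 8% cutoffs are illustrative
# assumptions, not details of any real system.
import numpy as np

def review_band(scores: np.ndarray, top_fraction: float) -> np.ndarray:
    """Indices of candidates in the top `top_fraction` by model score."""
    cutoff = np.quantile(scores, 1 - top_fraction)
    return np.where(scores >= cutoff)[0]

rng = np.random.default_rng(0)
men = rng.normal(size=1000)    # model scores for male candidates
women = rng.normal(size=1000)  # model scores for female candidates

# Review a wider band for the group the model is known to under-score.
men_reviewed = review_band(men, 0.05)      # top 5% of men
women_reviewed = review_band(women, 0.08)  # top 8% of women
print(len(men_reviewed), len(women_reviewed))  # ~50 vs ~80
```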

Fifth, saying that rejecting women is unfair is a political opinion. One could just as well argue that if women are actually less likely to succeed at a job, the AI is being accurate, not biased, in taking that into account [2].

Part of this comes from US political correctness, which holds that there’s no significant difference in workplace outcomes across genders, that men and women are identical except for sex organs. Nobody who told me that could give me evidence to support this hypothesis, and even asking for it once resulted in a conversation with HR. Maybe it really is true, in which case it shouldn’t be hard to make an argument backed by evidence. Or it’s not, in which case we should be free to discard this belief, just as we’ve discarded other beliefs, like the idea that encountering a black cat on your way will lead to failure unless counterbalanced by a white cat. Really, I’ve had a well-wisher tell me that.

Having interviewed hundreds of candidates, I’ve noticed that women perform worse on average. An Aspiring Minds report found the same: female graduates are less likely to be able to write functionally correct code. I mention that in case you think I’m biased or mistaken. At least in some cases, some groups in society really do perform worse than others, and we should acknowledge that fact rather than stick our heads in the sand.

Once we acknowledge that fact, it’s fair for a company to take that into account to make their hiring process more efficient [3]. Hiring is all about efficiency, about finding the best people in the time available, not finding the absolute best people assuming unlimited time.

As another example, some kinds of insurance in some countries cost more for men than for women. If some do-gooder equalised rates for both genders, I as a man would consider it unfair to dump my costs on women.

Sixth, critics cited the example of a company that uses AI to vet babysitters by evaluating, among other factors, whether they’re likely to be violent. What if a person is wrongly flagged as violent, a critic asked. But the counter-argument is: what if a baby is abused in a way that could’ve been prevented? Will you, the critic, volunteer to explain that to the parents? It’s easy to be an armchair critic pointing out only one side of the coin, but that doesn’t help anyone.

Seventh, there won’t be just one AI system; there’ll be many, built by different vendors, some internal and some not. These systems are unlikely to share exactly the same biases, so a person unfairly rejected by one stands a chance of being selected by another. Or a person rejected by an AI may be hired by a human. It’s not as if all decisions will be made by AI in 2019.
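As a toy illustration of why multiple systems help, assuming, optimistically, that each system’s mistakes are independent (the rejection rate below is a made-up number):

```python
# Toy model: each screening system wrongly rejects a qualified candidate
# with probability p, independently. In reality, systems trained on
# similar data will have correlated biases, so this is a best case.
p = 0.3  # assumed per-system false-rejection rate
for k in (1, 2, 3, 5):
    print(f"{k} independent system(s): rejected by all with probability {p**k:.3f}")
```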

In summary, it’s time to raise the level of this debate. Participants should acknowledge valid points on both sides. They should ask whether the new system is more or less fair than the old one, not whether it’s fair in every single case. Open questions should be phrased as such, not as predetermined conclusions. And arguments should be backed by data: how often something happens and whether the tradeoff is worth it, not merely that it happens.

If we do this, we’ll have a reasoned debate that achieves something, as opposed to blindly rushing ahead without sufficient consideration of the downsides, or, at the other extreme, killing technology that would actually push the world forward by holding it to an impossible standard. Neither extreme is good. To find the right balance, we need an informed debate, along the lines laid out in this post.

[1] Trains are 9x safer in India, and that’s not accounting for the fact that most people who die in train accidents are not passengers but people on the tracks. Take that into account and the advantage grows by another order of magnitude, to something like 100x.

[2] If the system has an actual bug that causes it to make predictions that don’t match outcomes, that should, of course, be fixed. But that’s no more a reason to rule out AI than the recent crashes of the Boeing 737 Max are a reason to rule out aviation.

[3] One could say that it’s still unfair, and it is, in an idealistic sense. But fixing it imposes more costs on the company. For that matter, a lot of the screening process is unfair. Penalising people without a degree is unfair to those who lack one but are equally skilled. Likewise for penalising people without a CS degree, people who didn’t go to a top-tier college, or people who aren’t good at English because they were born into a poor family. And so on.

Fixing all these would require companies to spend an order of magnitude more time on hiring. I run a startup, and I know what a huge effort hiring is, to the point where it’s a risk to the business. Make it an order of magnitude bigger, and my startup would go under. And it’s not just startups: Google gets more than 400 applicants for each opening, and there’s no way they can give everyone a fair chance, unless you want them to stop building products and spend all their time interviewing.
