Written by Roman Stanek
As AI systems become more common, Gartner predicts that we’ll see 2.3 million new AI-related jobs by 2020. These positions could include personality trainers and algorithm forensic analysts, but data hygienists have become a particularly popular topic of discussion. Essentially, data hygienists will ensure that the data used to train AI systems is “free from any slanted perspective” by monitoring the data that’s fed into the AI. This may seem like a reasonable new role as the use of AI spreads, but as I explained for Datanami, the chances that a human (or a machine) could operate without any bias are basically zero.
We can never be free of our biases because there are simply too many of them, and I’m not just talking about the familiar ones, like racial or gender discrimination. There are 188 known cognitive biases. They range from the backfire effect, where someone doubles down on their beliefs when confronted with contradicting information, to anchoring, the tendency we all have to rely too heavily on one piece of information (usually the first piece we encounter) when making a decision. How could a data hygienist possibly assess data without falling victim to even one of those 188 biases, all of which are deeply ingrained in the way humans process information? What makes a data hygienist any different from the rest of us when it comes to bias?
It would be impossible to interact with an AI system in a way that doesn’t pass some of these biases on, which in turn affects the way the machine processes future information. Even the algorithms we use were created by someone who, by definition, has their own cognitive biases, which are present in models, regardless of whether they were intended to be there. As the use of AI becomes more widespread, we all need to be aware of how we unintentionally teach AI systems to respond to something in the same way that our own biases lead us to respond.
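To see how a human’s bias becomes the machine’s bias, consider a toy sketch. Everything here is invented for illustration: the loan-approval labels, the `train_majority_model` helper, and the skewed sample. The point is not the (deliberately trivial) model but the data: whoever collects the training sample passes their own selection bias straight into the model’s behavior, even if they believe the sample is neutral.

```python
from collections import Counter

def train_majority_model(labels):
    """Return a 'model' that always predicts the most frequent training label."""
    most_common_label, _ = Counter(labels).most_common(1)[0]
    return lambda _features: most_common_label

# Hypothetical ground truth: loan outcomes lean slightly toward approval.
population = ["approve"] * 55 + ["deny"] * 45

# A biased collector anchors on the first records they reviewed and then
# oversamples denials, so the sample no longer reflects the population.
biased_sample = population[:10] + ["deny"] * 30

model = train_majority_model(biased_sample)
print(model({"income": 55000}))  # prints "deny": the skew in the sample,
                                 # not the population, decides the output
```

A model trained on the full population would predict "approve"; the one trained on the skewed sample predicts "deny". Nothing in the algorithm changed, only the data a biased human handed it.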
It’s tempting to claim that data can remain unblemished by the biases that dictate how we interact with one another and how we process information, but with so many at play, eliminating them all is impossible. AI systems and the data they learn from will always bear some mark of our human inclinations. What we can do is recognize our own tendencies and make a conscious effort to minimize their effects.