Earlier this month, The Washington Post looked under the hood of some of the artificial intelligence systems that power increasingly popular chatbots. These bots answer questions about a vast array of topics, engage in a conversation and even generate a complex — though not necessarily accurate — academic paper.
To “instruct” English-language AIs, called large language models (LLMs), companies feed the LLMs enormous amounts of data collected from across the web, ranging from Wikipedia to news sites to video game forums. The Post’s investigation found that the news sites in Google’s C4 data set, which has been used to instruct high-profile LLMs like Facebook’s LLaMA and Google’s own T5, include not only content from major newspapers, but also from far-right sites like Breitbart and VDare.
For computer scientist and journalist Meredith Broussard, author of the new book "More Than a Glitch: Confronting Race, Gender, and Ability Bias in Tech," the Post’s findings are both deeply troubling and business as usual. “All of the preexisting social problems are reflected in the training data used to train AI systems,” she said. “The real error is assuming the AI is doing something better than humans. It’s simply not true.”
There has been an explosion of interest in chatbots since the release of OpenAI’s ChatGPT last year. People have reported using ChatGPT to help with a growing list of tasks and activities, including homework, gardening, coding, gaming, writing and editing. New York Times columnist Farhad Manjoo reported it has changed the way he and other journalists do their work, but warned they need to proceed with caution. “ChatGPT and other chatbots are known to make stuff up or otherwise spew out incorrect information,” he wrote. “They’re also black boxes. Not even ChatGPT’s creators fully know why it suggests some ideas over others, or which way its biases run, or the myriad other ways it may screw up.”
But Broussard points out that bias problems plagued tech well before the chatbot craze. In her 2018 book "Algorithms of Oppression," internet studies scholar Safiya U. Noble exposed how racism was baked into the algorithm that powers Google’s search engine. For example, Noble, now a professor at UCLA, found that when Googling the terms “Black girls,” “Latina girls” or “Asian girls,” the top results were pornography. In other contexts, artificial intelligence used for tasks like approving mortgage applications led to Black applicants being 40% to 80% more likely to be denied a loan than similarly qualified white applicants.
Anyone who has searched the web for information on a topic knows that it can sometimes land them on a site spewing bigoted content or disinformation. The building blocks of chatbots have been scraped from the same internet. An offended user can navigate away from a toxic site in disgust. But because the data collection for LLMs is automated, such content gets included in the “instruction” for them. So if an LLM includes information from sites like Breitbart and VDare, which publish transphobic, anti-immigrant and racist content, that information — or disinformation — could be incorporated in a chatbot’s responses to your questions or requests for help.
“LLMs have been trained with white supremacist language and toxic material,” Broussard said, and “will definitely output white supremacist language.”
After reading the Washington Post story, I looked at VDare, a site I’ve reported on in the past but had not visited in some time. One front-page story, reflecting a preoccupation of the site, claimed that “black-on-white homicides” were contributing to the “death of white America” — an argument purportedly based on FBI statistics. It reminded me of how Donald Trump, when running for president in 2016, retweeted a tweet from a white supremacist account that included a racist image and numbers falsely claiming that Black people are responsible for 81% of homicides of White people. The fact-checking site PolitiFact deemed the tweet “pants on fire” for its lies, but the power of technology — in that case, the retweet by someone who was famous, rich and a candidate for president — imbued the lie with a kind of imprimatur that mainstreamed white supremacist hate and far outstripped any corrective.
If an LLM ingests information from VDare and is unable to distinguish it from a reputable source, what would a chatbot powered by that LLM tell you if you asked it to summarize crime statistics?
Many people, said Broussard, assume “that technology is superior, that it is objective, that it is unbiased.” That “technochauvinism” — the assumption that a technological fix, more code, can solve AI’s problems — is itself a bias. “But if you’re using an anti-racist mindset, you look at this method of training an AI and you say, the AI is not going to work, because why would you want to train a model with white supremacist content unless you are trying to promote that viewpoint?” she said.
Why do companies go this route? It is far cheaper to automate LLM instruction than to hire humans to supervise it. And beyond cost, they fear that deciding to include or exclude certain sites will be seen as taking an ideological position — even though allowing white supremacist content to automatically instruct their LLMs is itself an ideological position.
Broussard says the experience of using ChatGPT is “fun and nifty, and if you read the hype you may be forgiven for assuming it’s the best new thing.” Every “best new thing” that seems novel and revolutionary — as Google's search engine did 20 years ago — eventually becomes an unremarkable aspect of daily life. But, Broussard adds, “We can look back at the past 20-30 years of tech hype and we have the opportunity to make better decisions now, and to understand the way that social problems like racism, sexism or ableism always manifest in tech systems.”
That can start with journalists — and users — asking the right questions, as the Post did, about what is at the core of the AI that might increasingly be helping them do their work.