LLMs become more covertly racist with human intervention
Even when the two sentences had the same meaning, the models were more likely to apply adjectives like “dirty,” “lazy,” and “stupid” to speakers of AAE than speakers of Standard American English (SAE). The models associated speakers of AAE with less prestigious jobs (or didn’t associate them with having a job at all), and when asked to pass judgment on a hypothetical criminal defendant, they were more likely to recommend the death penalty. An even more notable finding may be a flaw the study pinpoints in the ways…