Stopping hate speech online is difficult. New research is teaching machines to find white nationalist content.
Mitigating the impact of online extremism has proven to be a difficult task for companies wishing to protect freedom of expression.
Social media companies have struggled to moderate hate speech and adapt to changes in the way white supremacists broadcast their views on digital platforms. Libby Hemphill, associate professor at the University of Michigan’s School of Information, believes machine learning technology may provide an answer.
“We know white supremacists and other types of extremists use social media to talk to each other, to recruit, to try to get their point across,” Hemphill said. “The challenge has been that the platforms haven’t really stepped up the fight against hate on their platforms.”
Through a partnership with the Anti-Defamation League, Hemphill set out to teach algorithms to distinguish white supremacist and extremist speech from the typical conversations people have on social media. It turns out that extremist groups are quite good at hiding in plain sight, but algorithms can get even better at finding them.
“We can’t moderate content without the help of a machine,” Hemphill said. “There is just too much content.”
Hemphill began by collecting a sample of 275,000 blog posts from Stormfront, a white nationalist website. The data was fed to algorithms that study the sentence structure of the posts, detect specific sentences and flag recurring topics.
The goal was to train a machine to identify toxic language using Stormfront conversations as a model. The algorithms compared data from Stormfront to 755,000 Twitter posts from users affiliated with the alt-right movement and another set of 510,000 Reddit posts collected from general users.
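For readers curious what training on one labeled corpus and comparing against another can look like in code, here is a minimal sketch of a bag-of-words text classifier (multinomial Naive Bayes with add-one smoothing). It is not Hemphill's method, which the article does not describe in detail. The function names, the labels "reference" and "general," and the neutral placeholder posts are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder data: neutral stand-ins for a labeled benchmark
# corpus ("reference") and ordinary social media posts ("general").
SAMPLE_POSTS = [
    ("alpha beta gamma alpha", "reference"),
    ("beta gamma delta", "reference"),
    ("coffee weather weekend", "general"),
    ("weather game coffee tonight", "general"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_posts):
    """Count words per label and posts per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_posts:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest smoothed log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][word] + 1  # add-one smoothing
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(SAMPLE_POSTS)
print(classify("alpha gamma", model))
print(classify("coffee tonight", model))
```

A real system would need far more data, richer features than single words, and careful human review of what the model flags.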
The results are still being compiled. Hemphill hopes to unveil a public tool later this fall.
Big tech companies like Facebook and Twitter have been accused of stifling free speech by removing users who violate community guidelines banning hate speech. The offending posts are identified by algorithms trained to detect hate speech and by user reports.
However, advocacy organizations are not satisfied with the enforcement standards.
The Center for Countering Digital Hate found that the top five social media companies took no action on 84% of the anti-Semitic posts reported to them. CCDH, a nonprofit group based in the US and UK, reported 714 posts through the platforms’ user reporting tools. Those posts had been viewed up to 7 million times.
“The platforms must aggressively remedy their moderation systems which have proven insufficient,” wrote CCDH CEO Imran Ahmed in a study announcing the group’s findings.
Measures to ban white nationalist content have also inspired the rise of new platforms with more flexible content standards. Gab, MeWe and Parler claim to defend free speech, but have been criticized as havens for extremists.
Gab CEO Andrew Torba promotes his website as “the only place you can criticize groups like the US Jewish Congress and ADL” in emails to Gab’s user base.
Hemphill said there is significant conflict around what constitutes hate speech and what social platforms should do about it. She also acknowledged fears that machines could make mistakes and unfairly punish users.
“The challenge is that we, as a community, don’t share a set of values,” Hemphill said. “We do not agree on what is hateful and what should be allowed. One of the things I would love to see come out of a job like this is more explicit discussions of what our values are and what works for us and what doesn’t. Then we can worry about what to teach the machines.”
Finding the line can be difficult. It’s harder to pinpoint nuanced racial stereotypes and microaggressions than it is to find insults, Hemphill said. There are also a large number of posts with images and videos of memes that are more difficult to catalog with precision.
Stormfront was used as a benchmark in Hemphill’s research because the group openly identifies itself as a forum for white nationalists. The alt-right Twitter users were selected from a study commissioned by the Center for American Progress.
Beyond the use of racial slurs and other types of toxic language, Hemphill said subtle differences were found between white nationalists and the average internet user. For example, white nationalists swear less often, perhaps a tactic to appear more acceptable to the general public.
“They’re kind of politely hateful,” Hemphill said. “If you’ve spent any time on the Internet, you know it’s a pretty profane place, but white supremacists are not profane. They are marked by what they do not do as well.”
Hate speech constitutes a relatively small amount of content on the internet, Hemphill said, although it has a big impact on anyone who sees it. She plans to collect more data from smaller websites like Gab, 4chan and Parler to continue teaching algorithms to identify hate speech.
Ultimately, the goal is to encourage social media companies to create inclusive chat rooms.
“It’s okay to be afraid of big tech, and it’s okay to be afraid of university researchers and what we can do,” Hemphill said. “I think the more this conversation takes place in the open and the more data is shared among people who are trying to understand how these decisions are made, the better the fair decisions will be about what is allowed of this openness.”