Protect your players with ToxMod’s new violent radicalization category

June 15, 2023

Over 50% of gamers in five of the world’s top video game markets have seen some form of extremist language while playing multiplayer games in the past year, according to a survey conducted by NYU.

At Modulate, we’ve spent years helping game developers regulate and eliminate hate speech and toxic behavior. With tools like ToxMod, these behaviors have become much easier for game moderators to catch, and enforcement has been mostly straightforward thanks to the transparency provided by features such as user report correlation.

But extremism and radicalization? The perpetrators are smarter and more elusive. To combat this, we’ve added a new detection category to ToxMod that helps moderators catch the more insidious behavior that often goes unchecked until it has already escalated.

Why is it so hard to catch extremists?

Hate speech and toxic behavior are hard to miss when you have the right moderation tools. The language is often overtly aggressive and riddled with keywords that are obviously racist, sexist, and vulgar. These bad actors often rely on manipulation tactics (reporting their victim before they can be reported themselves, for example) to avoid being penalized or banned. But with ToxMod, it’s easy to gather the context needed to work around these manipulation tactics and take action.

Extremists, on the other hand, are more subtle, especially those seeking to recruit others. They intentionally avoid keywords and phrases that may trigger alerts for moderators. These bad actors often fly under the radar for months or years, sometimes resulting in real violence and death at the hands of either the extremist or those they recruited.

Modulate’s solution: Violent radicalization detection

As more and more extremist, white supremacist, and alt-right movements proliferate in the multiplayer gaming world, it has become increasingly urgent to give moderators the tools needed to catch those spreading dangerous ideologies.

And now that tool finally exists. ToxMod—the signature voice chat moderation software you know—now has a new detection category: violent radicalization.

This new category allows gaming’s only proactive, voice-native moderation solution to go even further in making multiplayer games safer. With this category, video game voice chat moderators can flag and take action on terms and phrases relating to white supremacist groups, radicalization, and extremism, all in real time.

The category detects:

Promotion or sharing of ideology.
Recruitment or convincing others to join a group or movement.
Targeted grooming or convincing vulnerable individuals (i.e., children and teens) to join a group or movement.
Planning violent actions or actively planning to commit physical violence.

This wasn’t an overnight development, nor was it taken lightly. Our team spent months drawing on data and research from nonprofits and research organizations, most notably the Anti-Defamation League (ADL), to develop this new category. The Modulate team also held a number of internal ethics discussions to gather more perspectives, and made sure to address every concern raised in order to ensure fairer outcomes and reduce the risk of punishing innocent players.

The work doesn’t stop there

Extremists are constantly looking for new ways to recruit and promote their ideology without repercussion. That means Modulate will constantly look for new ways to stand in their way.

This initial release focuses on “utterance-level” data. Using research from groups like the ADL, studies like the one conducted by NYU, current thought leadership, and conversations with folks in the gaming industry, we’ve developed the category to identify signals that have a high correlation with extremist movements, even if the language itself isn’t violent. (For example, “let’s take this to Discord” could be innocent, or it could be a recruiting tactic.)

As we gather more specific data, we will continue to expand our knowledge of violent radicalization terminology and tactics. With this data, our machine learning team will be able to retrain the models for better accuracy and, eventually, track larger patterns of behavior so that moderators can investigate and take confident action.

Bad actors abandon ethics. We don’t.

Because our mission is to create spaces that are safer and more equitable, we approach features like this with ethics in mind as well (even when the folks we’re working against don’t do the same). ToxMod should not only detect and report these nuanced harms, but also do so accurately, to protect innocent players from unfair accusations.

We are taking several steps to handle this category with more sensitivity toward false positives. As part of this initial release, ToxMod looks for a smaller set of unambiguous signals when escalating offenses in the Violent Radicalization category. Feedback from our customers is being used to tune ToxMod to identify these harms more precisely, and model updates are already in the works. In addition to this continuous, iterative development of precise signals, the category has been released with modified escalation so that customers can investigate these offenses in a dedicated queue. Moderation through this queue requires human-in-the-loop validation to determine whether someone is truly a threat before suggesting different ways to engage with and regulate the content.
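
For teams thinking about how a dedicated queue with human validation might look on their side, here is a minimal Python sketch. It is not ToxMod’s API or Modulate’s implementation: the category label, the signal names, and the ReviewQueues class are all hypothetical, invented only to illustrate the idea that a detection in this category is routed to its own queue and nothing becomes actionable until a human moderator confirms it.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical category label and signal names, invented for this sketch.
VIOLENT_RADICALIZATION = "violent_radicalization"
UNAMBIGUOUS_SIGNALS = {"recruitment", "planning_violence"}


@dataclass
class Detection:
    player_id: str
    category: str
    signal: str
    confirmed: Optional[bool] = None  # stays None until a human reviews it


class ReviewQueues:
    """Routes detections and withholds action until a human validates them."""

    def __init__(self) -> None:
        self.general: List[Detection] = []
        self.radicalization: List[Detection] = []

    def route(self, detection: Detection) -> str:
        # Other categories follow the normal moderation path.
        if detection.category != VIOLENT_RADICALIZATION:
            self.general.append(detection)
            return "general"
        # Only a small allowlist of clear signals is escalated at all.
        if detection.signal not in UNAMBIGUOUS_SIGNALS:
            return "not_escalated"
        self.radicalization.append(detection)
        return "radicalization"

    def review(self, detection: Detection, is_threat: bool) -> None:
        # Records the human moderator's decision on a queued detection.
        detection.confirmed = is_threat

    def actionable(self) -> List[Detection]:
        # Only detections a human has confirmed as threats are returned.
        return [d for d in self.radicalization if d.confirmed is True]
```

In this sketch, a detection that reaches the dedicated queue sits there with no consequence for the player until review() is called, which is one simple way to keep an unreviewed flag from turning into a penalty.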

Our aim is to continue to improve ToxMod’s ability to identify and escalate Violent Radicalization offenses to Trust & Safety teams. The checks and balances built into the release of this category, along with our strong code of ethics as a guiding principle, increase the likelihood that extremists are penalized and innocent gamers are protected, not the other way around.