HOW TOXMOD WORKS

ToxMod™ is the world's only full-coverage voice moderation solution. Where existing voice moderation tools focus only on the ~8% of players who submit reports, ToxMod goes beyond, independently catching all the worst harms on your platform - empowering your team to take action while harm is happening, and to mitigate the damage before good players churn away.

key facts

Voice-Native

ToxMod was born to understand all the nuance of voice. It goes beyond transcription to consider emotion, speech acts, listener responses, and much more.

Intelligent

ToxMod becomes an expert in your game's code of conduct and escalates what matters most with high priority to your team.

Secure

All user data is anonymized and protected to ISO 27001 standards. Modulate will not sell or rent your data, ever.

Plug-and-Play

ToxMod ships with a variety of plugins for different combinations of game engine and voice infrastructure. You can integrate in less than a day.

Flexible

ToxMod provides the reports - your Trust & Safety team decides which action to take for each detected harm.

Detailed

Review an annotated back-and-forth between all participants to understand what drew ToxMod’s attention and who instigated things.

basics

How does ToxMod work?

ToxMod uses sophisticated machine learning to recognize when conversations begin to take a bad turn, at which point it generates automated reports to be analyzed by human moderators.

Is this just using transcription and keyword detection?

No. Transcription and keyword detection are part of the puzzle, but ToxMod weighs a variety of factors, including emotion and contextual phrase detection, when analyzing a conversation. ToxMod listens not only to what is being said, but also to how it is said and how other players respond to it.

Is ToxMod in addition to player reports then, or a replacement for them?

Definitely in addition! It’s crucial to maintain player reports as a way for your players to flag to you when they have a problematic experience, and know that you’ll engage materially with them and your community. That said, relying only on those reports means you’ll miss out on a lot of players who need your support, so ToxMod aggregates both player reports and its automated ones to ensure you’re not leaving any victims of hate, harassment or other toxicity to fend solely for themselves. 

Why not just rely on player reports?

The sad reality is that more than 90% of harms online don’t get reported by players. Further, the worst harms like child grooming and violent radicalization tend to involve victims who are either unaware that they are in danger, or who lack the wherewithal to defend themselves. And ultimately, relying only on player reports puts the responsibility on the victims to protect themselves, when players and studios alike agree that platforms should be more proactive in preventing these kinds of online harms.

Does “automated voice moderation” mean ToxMod is snooping on our players?

No! While ToxMod processes all of your player audio, so does your voice chat system. What really matters is whether that data is ever seen by a human or used in any automated actions or decisions. And the answer is that ToxMod - just like a player report - will trigger if and only if it detects high risk of toxicity or harm, and only then will send the relevant data to your team of moderators to respond to.

But wouldn’t ToxMod still be storing copies of everyone’s voice chats, then?

While ToxMod does observe all voice chats across your platform, it can actually perform its initial analysis on-device, meaning that data would not reach our servers unless it was flagged as relevant to a harmful behavior. It should be noted that this is an optional setting, and that some studios may prefer to send all audio to ToxMod's servers, but even in this case, ToxMod immediately analyzes this audio to split out the relevant audio from the irrelevant audio. The irrelevant audio is archived temporarily in case it is later determined to be relevant to a harmful behavior, and ultimately deleted after no more than 30 days. This irrelevant data is never viewed by a human - including ToxMod employees - and is similarly never made accessible to the customer even through automated APIs.
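
The retention rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not Modulate's implementation: the `ArchivedAudio` record, its fields, and the `purge_expired` helper are hypothetical names, and only the 30-day ceiling comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# The 30-day ceiling is stated in the text; all names here are hypothetical.
MAX_RETENTION = timedelta(days=30)


@dataclass
class ArchivedAudio:
    segment_id: str
    archived_at: datetime


def purge_expired(archive, now):
    """Return only the archived segments that are still inside the retention window."""
    return [seg for seg in archive if now - seg.archived_at < MAX_RETENTION]
```

Running a purge like this on a schedule means unflagged audio disappears once it reaches the ceiling, without anyone having to review it.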

If it’s listening to everyone, is it building predictive models to guess who will transgress, a la Minority Report?

Absolutely not. ToxMod only flags harms that are happening live - it will never try to flag players based only on a guess that they’ll misbehave in the future. ToxMod does track player history - i.e. how many times they’ve committed offenses - in order to help prioritize repeat or escalating offenders, but this only determines how urgently it flags new offenses after the player misbehaves again. 
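
One way to picture this rule is shown below. It is a hypothetical sketch, not ToxMod's actual scoring: the function name, the 0.25 weight, and the cap of 8 prior offenses are assumptions chosen for illustration. The point it shows is that history only scales the urgency of an offense detected now, and never creates a flag on its own.

```python
def flag_priority(severity, prior_offenses):
    """Urgency score for a newly detected offense (hypothetical weights).

    severity: 0.0 means nothing harmful was detected in the live conversation.
    prior_offenses: number of earlier confirmed offenses by the same player.
    """
    if severity <= 0.0:
        # No live offense: player history alone never produces a flag.
        return 0.0
    # History raises urgency, capped so one long record cannot dominate.
    history_boost = 1.0 + 0.25 * min(prior_offenses, 8)
    return severity * history_boost
```

With these assumed weights, a first-time offender and a repeat offender committing the same offense are both flagged, but the repeat offender reaches moderators with a higher priority.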

How quickly does ToxMod notice harms happening? Can it intervene directly during the conversation?

When you first deploy it, ToxMod does not take any immediate action on its own. Instead, it watches for harmful behavior and escalates it promptly to your moderators - typically within 45 seconds from when the offense began. This means your moderators can often mute the offender or take more extreme action if needed; and as ToxMod learns about your ecosystem, you can also begin using its automation features to take these actions directly for the most overt offenses without needing to loop your moderation team in first.

Don’t players reject the idea of an AI moderation system?

Some players surely will, but the majority are actually quite supportive. Major platforms such as Riot and Sony have announced voice moderation in recent years, and the general response has been positive, with notes of “it’s about time.” Modulate has also found that providing clarity on how ToxMod works is extremely important. While many players express initial concerns regarding privacy or the risk of false positives, more than half have converted to supporters after a conversation offering additional detail about how ToxMod actually works - especially when it’s clarified that human moderators will make the final decisions based on AI recommendations, rather than AI acting directly.

Does ToxMod really understand nuance? Will it know the difference between friendly trash talk and genuine harm?

ToxMod is designed specifically to understand this kind of subtlety. It takes advantage of a wide range of signals like emotions (of both the speaker and respondent), speech behaviors like interruption or awkward silence, and speech modes like laughter or crying, to recognize whether someone has been harmed or it is just hearing coarse language among friends. 

This might sound like a grand claim - after all, everyone knows AI doesn’t understand nuance! - but the numbers back us up here. Within 2-3 weeks of a standard deployment, ToxMod can already correctly separate harm from other behaviors with an accuracy of 80% (compared to player reports, which tend to have accuracy closer to 15-30%!); and that number can be increased to 95% or above as ToxMod continues to automatically improve.

privacy

If ToxMod flags an offense based on its on-device analysis, where exactly does it send that data next?

That data will be sent to Modulate’s secure servers for deeper contextual analysis, after which it is shared through a secure web console or API for your moderators to examine and respond to. Our servers are secured with industry best practices in an isolated AWS environment, and can be linked through VPC Peering with your existing AWS environment to further minimize any security risks related to the sending or receiving of data. If absolutely necessary, we are also able to deploy ToxMod on-premises within your own environment, though this may result in increased costs and changes to support responsiveness.

Can you give me more details about how Modulate protects its servers?

To start, we don’t collect any PII about your users; just the relevant voice chat information and an anonymized user id. ToxMod only stores data to aid your moderators in making their decision. Moderators will be able to see saved audio, transcriptions, and other conversational context (for harmful conversations only, of course!) for a set number of days before that data is deleted. And of course, all data is stored and transferred using industry-standard encryption. Modulate also conducts regular penetration and security tests and is certified with full ISO 27001 compliance.

How do you use our player data?

Modulate never shares your player data with any third parties. Some studios authorize Modulate to use portions of their data (after additional anonymization is done) to improve ToxMod’s core services, but by default, each customer’s data is used only within their own ToxMod ecosystem.

How do we assure client data safety when using ToxMod?

The data is associated only with a User ID (or other identifier specified by the client) and all data is automatically deleted after thirty days. Of course, we also support data subject requests (as defined in e.g. GDPR), and can delete data for users immediately upon request - though we require that request to be validated by you first, since we don’t have any data on our side to tie any Modulate user ID to a specific real person. For more, please see the Privacy and Data section of our website.

Why should I trust Modulate to manage our data responsibly?

We can’t answer that for you, but we can show you that our commitment to ethics is more than just words. Modulate exists to make online communities better, and we’re not afraid to pump the brakes on features or ideas that could cause more harm than good or put your security at risk.

details

How many simultaneous chats can ToxMod handle?

ToxMod is designed to scale to many millions of simultaneous chats without issue. This is made possible by our revolutionary triaging technology, which uses multiple algorithmic “gates” to quickly identify non-toxic chats which don’t require moderation.
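
The idea of chained "gates" can be sketched as follows. This is a generic illustration of staged filtering, not ToxMod's actual triage: the gate functions, the chat fields (`speech_seconds`, `emotion_score`), and the 0.6 threshold are all hypothetical.

```python
def has_speech(chat):
    """Cheap first gate (hypothetical): silent chats need no further work."""
    return chat["speech_seconds"] > 0.0


def sounds_heated(chat):
    """Costlier second gate (hypothetical threshold on an assumed emotion score)."""
    return chat["emotion_score"] >= 0.6


def triage(chats, gates):
    """Return only the chats that every gate still considers risky.

    all() stops at the first gate that clears a chat, so later (more
    expensive) gates never run on chats that were already dismissed.
    """
    return [chat for chat in chats if all(gate(chat) for gate in gates)]
```

Ordering the gates from cheapest to most expensive is what lets a design like this scale: most chats are dismissed by the first check and never reach the heavier analysis.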

How do you handle large groups of users chatting at once? Is each user’s voice processed independently?

When you integrate with our SDK, we strongly recommend you send us individual audio streams for each user rather than a mixed stream. If this is impossible, ToxMod can still function while processing the mixed audio, but its performance and accuracy will be lower than with the single-stream-per-speaker approach.

How do you handle slang, game-specific vocab, and mixed language usage?

We actively train our models to incorporate commonly used slang and gaming terminology. Additionally, we collaborate with our customers to ensure any vocabulary that is specific to their game is included in our models.

What languages does ToxMod support?

ToxMod is currently trained only for the English language, but our model architecture can support other languages in a straightforward way. Several languages, including Korean, German, Spanish, Mandarin, and French, are on our near-term roadmap, and we are happy to work with customers on any additional language support needs.

What platforms do you support?

Platforms: Windows 7+, macOS, Android (including Oculus), iOS, PS4, PS5, Xbox One

ToxMod can be integrated with any game engine and VoIP solution fairly seamlessly; but we additionally offer example plugins to expedite integration for some of the most common setups.


How much compute / memory does the Client-Side SDK require?

The exact values vary a bit depending on your platform, but typically the SDK uses between 8 and 16 MB of memory and about 0.5% CPU.

Can I integrate ToxMod data into our existing moderation tools?

Yes. By default, ToxMod data is visible on your ToxMod Web Console, but this data is supplied through a straightforward HTTP API. If you already have a moderation platform you wish to continue using, we’re happy to work with you to connect our API to that system.

Can I decide what constitutes toxicity or disruptive behavior?

Absolutely! ToxMod automatically learns from your moderators’ behavior what should and shouldn’t count as disruptive, but if you’d like to do more, we give you additional levers for different types of offenses (such as racial offenses vs. religious hostility). You can set your tolerance for each category individually, so if your game has substantial violence built in, you might moderate only the most severe violent dialogue, whereas a game catering to a younger audience might set a stricter threshold. These settings can be adjusted live from your ToxMod Web Console at any time.
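
Per-category tolerance can be pictured as a simple lookup. The sketch below is an assumption-laden illustration rather than the real console settings: the category keys, the 0-to-1 score scale, and the `should_escalate` helper are hypothetical names invented here.

```python
# Hypothetical settings: a higher number means more tolerance for that category.
# A game with built-in violence tolerates violent dialogue more than other harms.
SHOOTER_TOLERANCE = {
    "violent_speech": 0.9,
    "racial_offense": 0.3,
    "religious_hostility": 0.3,
}


def should_escalate(category, score, tolerance, default=0.5):
    """Escalate when the detected score meets or exceeds the category's tolerance.

    Categories missing from the settings fall back to an assumed default.
    """
    return score >= tolerance.get(category, default)
```

Under these example settings, violent dialogue scoring 0.7 would pass unflagged, while a racial offense with the same score would be escalated.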

What kind of software engineers do I need to integrate ToxMod?

ToxMod is built around a core SDK that can integrate smoothly into any game regardless of your specific game engine, VoIP solution, or platform, along with some convenience wrappers that speed up integration further for certain common setups (such as Unreal Engine + Epic Voice Chat + PC).

If you’re looking for an engineer on your team to integrate one of these convenience “plugins”, you’ll need someone who meets the following criteria:

  • They are a game developer who knows how to use a game engine (Unity / Unreal), and they know how to install a plugin for that game engine
  • They have integrated (or are integrating) voice chat into your game (presumably through a plugin)
  • They do NOT have to know audio programming or the ins and outs of how voice chat works. 
  • They SHOULD know where in your code a player joins and leaves a voice chat room, and how you communicate that to your voice chat framework
  • They are comfortable with the primary programming language for the game engine you are using
  • They don’t need to know how to work with DLLs or shared libraries, but it can sometimes aid debugging if your game already utilizes similar dependencies to those our plugin will include (particularly libopus)

If you’ll instead be using the C++ Core SDK directly, you’ll want someone with the above expertise as well as a few other points of knowledge. The specific expertise you need will depend on whether you’ll be implementing our Server-Side SDK (Enterprise customers only) or our Client-Side SDK.

Client-Side SDK prerequisites

  • They’re capable of managing state, creating/destroying resources “responsibly”, and working with structs, pointers, and C arrays
  • If your game (or more precisely, your voice chat framework’s interface with your game) is in another language, they are comfortable writing a wrapper for that language around our C interface
  • They know audio programming - they know not to allocate memory in the audio thread, they know how to do format conversion (e.g. shorts to floats), etc.
  • They know how your voice chat framework works - they know where the callback to get raw audio is, they know where the information on player and session identities lives, etc.
  • They know how to deal with incorporating and maintaining shared libraries on your platform
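
As an illustration of the "shorts to floats" conversion mentioned in the list above, here is a minimal sketch. It is written in Python for readability; a real integration would do this in C or C++ against the SDK's C interface, whose function names are not shown here. The helper name is hypothetical. It writes into a buffer the caller allocated ahead of time, mirroring the rule about not allocating memory in the audio thread.

```python
from array import array


def int16_to_float32_into(src, dst):
    """Convert signed 16-bit PCM samples to floats in [-1.0, 1.0).

    dst must be preallocated by the caller with at least len(src) slots,
    so the conversion itself never grows a buffer.
    """
    if len(dst) < len(src):
        raise ValueError("destination buffer is too small")
    for i, sample in enumerate(src):
        dst[i] = sample / 32768.0  # 32768 = 2**15, the magnitude of the int16 minimum
    return len(src)


# Allocate once, outside the audio callback, then reuse for every frame.
frame = array("h", [0, 16384, -32768, 32767])
scratch = array("f", [0.0] * 480)
written = int16_to_float32_into(frame, scratch)
```

Dividing by 32768 rather than 32767 keeps the mapping symmetric around zero, at the cost of the maximum positive sample landing just below 1.0.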

Server-Side SDK prerequisites

  • All Client-Side SDK prerequisites, plus…
  • They know how to interact with raw Opus packets from your voice chat stream on the server side
  • They are comfortable with multithreading (voice chat servers may receive many packets on distinct threads at once)
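
The multithreading point can be illustrated with a thread-safe queue that funnels packets from several receive threads into one processing step. This is a generic sketch, not the Server-Side SDK's design: the function name is hypothetical, and the payload is a placeholder byte string rather than real Opus data.

```python
import queue
import threading


def collect_packets(receiver_count, packets_per_receiver):
    """Simulate several receive threads handing packets to a single consumer."""
    inbox = queue.Queue()  # thread-safe FIFO from the standard library

    def receive(receiver_id):
        for seq in range(packets_per_receiver):
            # Placeholder payload; a real server would enqueue an Opus packet here.
            inbox.put((receiver_id, seq, b"\x00"))

    threads = [
        threading.Thread(target=receive, args=(i,)) for i in range(receiver_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All producers have finished, so draining until empty is safe here.
    processed = []
    while not inbox.empty():
        processed.append(inbox.get())
    return processed
```

Packets from different threads may interleave in any order, but each thread's own packets stay in sequence, which is the property a per-speaker audio pipeline typically needs.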

