Modulate is a tech company, and that means that our lifeblood is innovation - doing something new and powerful that changes the world in a material way.
Too often, though, companies prioritize this innovation above all else, and forget that their original goal was to take care of people. "Move fast and break things" isn't exactly a guiding light that worries about unintended consequences. "Don't be evil" might have gotten closer for a time, but it's still too loose, permitting wide debates about what constitutes "evil" all while forgetting that the road to hell is just as easily paved with good intentions.
At Modulate, we take a different approach. Our core value facets define the way that we operate, and set clear guideposts for where we will and won't compromise. Among these, our Accrual and Net Impact values both speak to our position on ethics. Our Accrual value states, among other things, that we recognize the importance of investing in people. That means we refuse to 'take advantage' of our employees, we reject the 'brilliant assholes' regardless of their skill level, and we believe what ultimately makes or breaks a team is the people on it.
Perhaps less straightforward, though, is our Net Impact value. This value doesn't care about intentions, strategies, or efforts. It cares about outcomes. "Leave everything better than how you found it." It's a simple but powerful statement.
Ignorance isn't an excuse. That we wanted to move fast or make some more revenue isn't an excuse. That few others seem to hold themselves to the same bar, or that it's hard to stay accountable in this way, aren't excuses either. The only rule is that whatever you do, you should be making the world better through it.
Now, it bears stating that this IS a hard task - monumentally so. We don't expect that we'll necessarily always succeed - unintended consequences do happen sometimes. But it means that we'll never permit ourselves the excuse that we couldn't have seen something coming. It means we'll constantly be striving not just to do better next time, but to rectify the wrong choices we might have made. And it means that, when there's no perfect answer and we're forced to choose between two shades of grey, we'll push ourselves to be explicit in our thinking, so that we can hear feedback, gather other perspectives, and navigate the challenging ethical questions that inevitably arise at a fast-growing machine learning startup in the best possible way.
So how does one actually guarantee this? Just saying we want to be ethical isn't enough - we need to take action. Modulate has a number of policies and practices designed around this goal - read on for more detail.
Modulate has put substantial effort into reducing bias during the recruiting and interview process. Of course, we first and foremost invest in getting the job postings out to a diverse range of communities and avoiding restrictive language; but the most interesting part of our process is that we avoid ever learning demographic info (age, name, sex/gender, etc.) not only during the initial application, but during the phone screen as well! We do this by using an in-house-developed voice changing system (VoiceWear), so that we can conduct phone screens with candidates and still understand their words and emotion, while ensuring everyone's voice sounds generically like it comes from the same speaker. (For more on this, check out our careers page.)
Once a candidate reaches the final interview stage, we will ask them for their name and pronouns, and team members will meet them directly. But at this stage, we still use a structured rubric to ensure we’re assessing each candidate along the same dimensions, and to minimize the opportunity for biases to creep back into the process.
To start, we’ll of course never just say “yes” - there will always be room for us to gain by bringing in additional folks with unique or underrepresented perspectives. That said, we’ve invested significantly in making sure that, especially while the team is small, we are bringing in experts to help us fill in the gaps in our knowledge. One prominent such expert is Dr. Kishonna Gray, who meets with the Modulate team regularly to lend her wisdom relating to the experiences of underrepresented and intersectional communities online.
We try to constantly emphasize the importance of work-life balance, but we also know that’s not always enough, especially at a fast-growing startup where there’s always more to do. So we also provide a number of benefits to ensure our team can find a way to work that keeps them healthy and avoids undue stress. Among these, some of the more notable ones are our unlimited vacation policy; paid leave offered to new parents and others experiencing major life events; and a compensation plan that ensures new hires are given multiple offers, enabling them to choose between earning a bit more cash in the short term versus receiving a larger option grant for a greater payoff down the line.
We’ve actually got a pretty concrete “ethics review” process for new features or products which would come with significant risks of harm. The linked blog post goes into more detail, but briefly, this process is specifically designed to give the wider team an opportunity to think about possible risks and ensure they are seriously considered and mitigated before we ship - and in some cases, leads us to scrap an idea for a product or feature entirely!
We consider it not only vital for our business, but also our responsibility, to ensure that all of our machine learning systems are trained on a representative and broad set of different voices corresponding to a range of age, gender identity, race, body type, and more. Whenever possible, we endeavor to utilize large public datasets which already have been studied for representativeness, though in some cases we are required to gather our own reference speech. When this is the case, we’ve made a point to either hire a wide range of voice actors (where acted speech is sufficient) or to work with our trained data labeling team to identify a sufficiently diverse test set. That said, we acknowledge that as a still-growing company, we likely do not yet have the full range of coverage we’d like, and are constantly looking for opportunities and partnerships that will allow us to do even better. If you find that your voice is not handled well by our system, or work with an organization interested in helping collect equitable training data, please don’t hesitate to reach out to us at email@example.com!
ToxMod is designed carefully and its models are routinely tested to ensure we are not more likely to flag certain demographics as harmful, given the same behavior. That said, we do occasionally consider an individual’s demographics when determining the severity of a harm. For instance, if we detect a prepubescent speaker in a chat, we might rate certain kinds of offenses more severely due to the risk to the child.
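As a rough illustration of what testing for equal treatment "given the same behavior" can look like, here is a minimal Python sketch. It is not ToxMod's actual test suite: the function names, group labels, behavior labels, and sample records are all hypothetical, and a real evaluation would use far more data and a proper statistical test. The idea is simply to compare flag rates across groups within each behavior category and look at the largest gap.

```python
from collections import defaultdict


def flag_rates_by_group(records):
    """Compute flag rates per behavior and group.

    records: iterable of (group, behavior, flagged) tuples (hypothetical format).
    Returns a dict of the form {behavior: {group: flag_rate}}.
    """
    # counts[behavior][group] holds [number_flagged, number_total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for group, behavior, flagged in records:
        cell = counts[behavior][group]
        cell[0] += int(flagged)
        cell[1] += 1
    return {
        behavior: {group: f / t for group, (f, t) in groups.items()}
        for behavior, groups in counts.items()
    }


def max_disparity(rates):
    """Largest flag-rate gap between any two groups for the same behavior."""
    return max(max(g.values()) - min(g.values()) for g in rates.values())


# Hypothetical labeled evaluation clips: (group, behavior, was_flagged)
records = [
    ("group_a", "insult", True),
    ("group_a", "insult", True),
    ("group_b", "insult", True),
    ("group_b", "insult", False),
    ("group_a", "banter", False),
    ("group_b", "banter", False),
]

rates = flag_rates_by_group(records)
disparity = max_disparity(rates)
print(rates)
print(disparity)
```

A gap near zero for every behavior suggests the groups are being treated alike; a large gap is a signal to investigate before shipping, not proof of bias on its own.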
We also recognize that certain behaviors may be fundamentally different depending on the demographics of the participants. While the n-word is typically considered a vile slur, many players who identify as black or brown have reclaimed it and use it positively within their communities. While Modulate does not detect or identify the ethnicity of individual speakers, it will listen to conversational cues to determine how others in the conversation are reacting to the use of such terms. If someone says the n-word and clearly offends others in the chat, that will be rated much more severely than what appears to be reclaimed usage that is incorporated naturally into a conversation.
While we acknowledge that the risk of such misuse will never quite be zero, we’ve taken several steps to ensure you can rest easy.
Firstly, we only distribute pre-selected voices to our customers, rather than allowing anyone to design and use any voice without oversight. This ensures that specific high-profile voices, such as politicians, are unavailable unless the person in question gives explicit approval.
In addition, even if someone does misuse a voice skin to create misleading audio, we will be able to detect it. We do this by watermarking all the audio we generate. While we cannot confirm whether an audio clip was synthesized or altered elsewhere, the presence of our watermark makes it easy to identify any content created here, and ensure it isn't treated as evidence of a real event.
No, Modulate will not enable voice fraud.
There may be some applications where customers will wish to use VoiceWear to create voice skins which are inspired by or based on real people. In these cases, we require that the customer demonstrate their legal right to use that voice in their final application before we will agree to work with them.
Modulate's voice skins are not meant to cut humans out of the loop - rather, our goal is to give people increased freedom to express themselves. Because of this, our voice skins only affect the vocal cords (or more accurately, the "timbre") of the voice, while leaving other aspects of the performance (such as emotion and prosody) intact. So anyone hiring a voice actor for a great performance today will still want a voice actor tomorrow - the only difference will be that, thanks to Modulate, that voice actor will be able to take on additional new roles outside of their biological range!
Overall, we actually feel our voice skins will be a hugely positive tool for voice actors as well as ordinary users. That said, if you harbor any concerns, we want to hear from you! Please feel free to email us at firstname.lastname@example.org with your worries - it's important to us to make sure we understand the impact of our technology, and always guarantee it's used in a positive way for everyone!
Thanks so much for asking! The first and most important thing you can do is to recognize that there are subtleties here. Society has actually faced challenges like this before - including the initial release of Photoshop, which led us to realize that online photos can’t always be taken at face value. Synthetic media will surely lead to us having to rethink some of our beliefs in a similar way, but we have confidence that we as a society can handle it - and we’ll do much better if we’re all exploring these questions together, rather than trying to label specific technologies as “inherently good” or “inherently bad.”
We also recommend that, as a consumer, you think carefully about which companies you’re supporting. Wherever possible, try to prioritize products and services from organizations with a clear commitment to responsible business practices. The ethics of innovation is something we will all have to develop together - so as long as the company in question wants to participate honestly, like we do, we hope you will give them a fair chance to ask questions, and even to occasionally fail, provided they are always pushing to improve.
At the end of the day, ethics is hard - we all need to be proactive to shape the world we want!