Q: Who are you? Where are you from?
A: I'm Carter Huffman (he/him/his) - I'm Modulate's CTO and one of the two cofounders! I was born and raised in northern Virginia, before I moved up to Boston for school.
Q: What’s your background? What did you do before Modulate?
A: I spent part of my undergrad at MIT working on early universe cosmology - asking questions like "If the universe started out with such and such properties, how would it evolve? And what traces would that leave on the visible universe today that we can observe?" A lot of that time was working out some basic equations, writing physics simulations to test them out, and then checking the results to make sure that they were actually consistent with physics!
After undergrad I went to the Jet Propulsion Laboratory, working on ways to make spacecraft more capable and intelligent, as part of the Machine Learning and Instrument Autonomy group! Doing machine learning for spacecraft is pretty different from the kinds of deep neural network research that I do today - they use much older, radiation-hardened CPUs and have a tight time and power budget, so simpler and faster algorithms are preferred. Also, the stakes can be pretty high: in a flyby mission, there's only one shot at taking measurements and gathering as much science as possible, so any ML/AI algorithms onboard absolutely have to work the first time in an unknown environment!
Q: What inspired you to work on developing voice skins? In particular, what gave you the passion to cofound a company for this technology?
A: For me, it's always been the understanding that this technology is "part of the future": in the year 2100, of course you can sound like anyone you want to, just like you can look however you want to with VR and body tracking! This is all pretty much taken for granted in fiction (e.g. Ready Player One) - and actually, when I first started working on voice skins, I discovered that many people just assumed that this technology already existed. After I started working on voice manipulation with deep learning, I became convinced that this was a piece of the future that I could help build, which has remained my motivation for the past three and a half years!
Q: How did you first discover how to make working voice skins? How long did it take?
A: I knew the basic idea pretty much from the outset: take a clip of speech, force a neural network to separate out the "content" from the "voice", change the voice part, then recombine them. The tricky bit is how to force the network to separate those components. I spent several months trying to just build a meaningful "latent space" and manipulate it manually (if you've seen demos where people morph one face into another, it's that kind of tech), but relying on the neural network to just happen to learn a good space doesn't give you any control over the fidelity of the results, so I had to abandon that.
At that point, I had the thought that adversarial training could help solve the problem by explicitly forcing the network to mimic a particular voice - and after some time I became convinced that this was the "right way" to build a voice skin. All told it was about a year from starting the project to getting anything that sounded even vaguely human, and then another year and a half from that before I started getting really high quality results!
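Editor's sketch: the "separate, swap, recombine" idea described above can be illustrated with a deliberately tiny toy. This is not Modulate's method. It assumes, purely for illustration, that each audio frame is a small feature vector equal to "content" plus a fixed per-speaker "voice" offset, so the voice can be estimated as the clip's average. The names split_clip, recombine, and apply_voice_skin are hypothetical. A real voice skin would learn the separation with neural networks (per the interview, using adversarial training), which this toy does not attempt.

```python
from statistics import fmean

def split_clip(frames):
    """Split a clip into per-frame 'content' and one per-clip 'voice' vector.

    Toy assumption: voice is the mean frame, content is what is left over.
    """
    dims = len(frames[0])
    voice = [fmean(frame[d] for frame in frames) for d in range(dims)]
    content = [[frame[d] - voice[d] for d in range(dims)] for frame in frames]
    return content, voice

def recombine(content, voice):
    """Rebuild frames by adding a voice vector back onto each content frame."""
    return [[c[d] + voice[d] for d in range(len(voice))] for c in content]

def apply_voice_skin(source_frames, target_frames):
    """Keep the source clip's content but swap in the target clip's voice."""
    content, _ = split_clip(source_frames)
    _, target_voice = split_clip(target_frames)
    return recombine(content, target_voice)

# Made-up two-dimensional "frames" for a source and a target speaker.
source = [[1.0, 2.0], [3.0, 4.0]]
target = [[10.0, 20.0], [12.0, 20.0]]
converted = apply_voice_skin(source, target)
```

In this toy, the converted clip keeps the frame-to-frame variation of the source while its average matches the target speaker. The hard part the interview describes, getting a network to actually separate the two components, is exactly what the additive assumption skips over.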
Q: What’s something a potential Modulate employee should know about you?
A: I love talking to people about their interests and passions! Anything that has enough depth and complexity for someone to be interested in it is fascinating - from sports, to fiction, to reality TV, to AI, to cars, to microscopic sea creatures (my brother's PhD topic). If you're ever wondering how to start a conversation with me, just bring up something you're interested in!
Q: What’s the voice skin you’re most excited to use, and how do you plan to use it?
A: As far as celebrity voices go, I'd love it if we could make a voice skin of Jennifer Hale! She's a legend in video game voice acting - I loved the Role Playing Game genre growing up, where there's an emphasis on characters and story, and Jennifer was the voice of so many of the characters that I grew up with.
That said, I'm even more excited to build my own custom voices - I've always liked creating my own unique character in games, but I've never been able to give them their own voice - they've always just sounded like Carter. It's a bit awkward for me when I'm trying to play a different character and I can't get their voice right - it feels too much like I'm faking something, and it's a constant reminder that there's "someone behind the curtain". I'm looking forward to not having to worry about that.
Q: What’s your ideal work environment? Any special strategies you use to stay effective?
A: I love being surrounded by people working and chatting, while I can wear headphones and focus down into what I'm doing. Working at home can get pretty lonely, while being in an open office where people can ask me for things at any time can get distracting if I want to focus. I like being around noise and commotion, but not a part of it!
Q: Who are you outside of work?
A: I like to think that I'm pretty cheerful and easy to get along with! I like meeting people and learning about them, but I'm a bit of an introvert so I need to intersperse that with quiet evenings and weekends sometimes!
Q: What’s something you’re great at that few people realize?
A: I'm pretty good at breaking game systems - video games, pen and paper RPGs, board games - by coming up with edge cases that the designers didn't think of! I love trying to combine elements of games in ways that don't initially make sense but still kind of work, and usually those combinations aren't well tested and lead to unexpected consequences. Mike likes to tell a story about a video game he tried to make, where the first thing I did was run into a wall for a minute before I clipped through and broke the game - so that's something that I find fun!
Q: Leave us with a fun tidbit - a favorite joke, a story from your past, an obscure riddle, whatever you like!
A: Did you know that ants can count? If you put stilts on an ant that's trying to go somewhere, it will end up going too far and get lost, because it's counting its steps to measure distance!