Research Engineer / Scientist, Speech Generation

About Fixie

We’re a Seattle-based AI startup (with support for working remotely). We’ve raised $17M in seed funding. Our vision is simple: build artificial intelligences that can communicate as naturally as humans. We’re a small team of researchers and engineers with a deep focus on speech and real-time technologies. Our core model, Ultravox, is open-source. We also build a serving stack that’s optimized for very low-latency interactions.

The Role

As a Research Engineer / Scientist working on foundational multimodal models, you will lead the effort to develop next-generation speech generation capabilities for Ultravox, our open-source speech-to-speech model.

What you’ll do

Lead critical research on speech generation in both pre-training and post-training stages, addressing core challenges in natural and expressive speech synthesis.
Collaborate with a team of researchers and engineers to develop foundational multimodal models with comprehensive capabilities in speech understanding, speech generation, and full-duplex real-time communication.
Develop novel models based on public and proprietary data sources.
Build tools to improve our data flywheel and measure model quality.
Drive the optimization and deployment of AI models for real-world applications in partnership with engineering and product teams.

Things we’re looking for

An incredibly strong AI researcher with a track record of contributions to AI research, systems, and products.
Experience with large language models, speech models, and multimodal models.
Strong experience with Python and, ideally, PyTorch.
Ability to roll up your sleeves and get things done.
A great communicator and team player.

Benefits

Generous equity package
Unlimited PTO (take time when you need it)
Top-of-market salary