About Fixie
We’re a Seattle-based AI startup (with support for working remotely). We’ve raised $17M in seed funding. Our vision is simple: build artificial intelligences that can communicate as naturally as humans. We’re a small team of researchers and engineers with a deep focus on speech and real-time technologies. Our core model, Ultravox, is open-source. We also build a serving stack that’s optimized for very low-latency interactions.
The Role
As a Research Engineer & Scientist working on foundational multimodal models, you will lead the effort to develop next-generation speech generation capabilities for Ultravox, our open-source speech-to-speech model.
What you’ll do
- Lead critical research on speech generation in both pre-training and post-training stages, addressing core challenges in natural and expressive speech synthesis.
- Collaborate with a team of researchers and engineers to develop foundational multimodal models with comprehensive capabilities in speech understanding, speech generation, and full-duplex real-time communication.
- Develop novel models based on public and proprietary data sources.
- Build tools to improve our data flywheel and measure model quality.
- Drive the optimization and deployment of AI models for real-world applications in partnership with engineering and product teams.
Things we’re looking for
- An incredibly strong AI researcher with a track record of contributions to AI research, systems, and products.
- Experience with large language models, speech models, and multimodal models.
- Strong experience in Python and, ideally, PyTorch.
- Ability to roll up your sleeves and get things done.
- A great communicator and team player.
Benefits
- Generous equity package
- Unlimited PTO (take time when you need it)
- Top-of-market salary