- Location: San Francisco
- Last Published: Nov. 9, 2025
- Sector: AI/ML
Location: HQ - San Francisco, CA
Employment Type: Full time
Location Type: On-site
Department: Staff
Compensation: $180K – $250K • Offers Equity
About Cartesia
Our mission is to build the next generation of AI: ubiquitous, interactive intelligence that runs wherever you are. Today, not even the best models can continuously process and reason over a year-long stream of audio, video and text—1B text tokens, 10B audio tokens and 1T video tokens—let alone do this on-device.
We're pioneering the model architectures that will make this possible. Our founding team met as PhDs at the Stanford AI Lab, where we invented State Space Models (SSMs), a new primitive for training efficient, large-scale foundation models. Our team pairs deep expertise in model innovation and systems engineering with design-minded product engineering to build and ship cutting-edge models and experiences.
We're funded by leading investors at Index Ventures and Lightspeed Venture Partners, along with Factory, Conviction, A Star, General Catalyst, SV Angel, Databricks and others. We're fortunate to have the support of many amazing advisors and 90+ angels across many industries, including the world's foremost experts in AI.
The Role
The future of AI training will be built on a foundation of high-quality synthetic data. We are looking for a creative and resourceful Synthetic Data Specialist to design and build the systems that generate training data at unprecedented scale. This is a unique, high-impact role where you will solve critical data bottlenecks and directly accelerate our research progress.
What you’ll do
- Evaluate fidelity, diversity, and usefulness of synthetic data across LLMs, audio generation, and audio understanding.
- Implement techniques for steering data generation to improve model intelligence and mitigate bias.
- Build automated quality control systems to validate and filter generated data.
- Design synthetic datasets at large scale to develop model capabilities.
- Stay on the cutting edge of research in synthetic data generation, data augmentation, and generative models.
What we’re looking for
- Experience with generative models (speech, text, or multimodal).
- Strong applied ML background with a focus on data-centric approaches.
- Understanding of evaluation methods for synthetic data quality.
- Excitement for building scalable systems that bridge research and production.
- Familiarity with building large-scale distributed systems for synthetic data generation.
Our culture
🏢 We’re an in-person team based out of San Francisco. We love being in the office, hanging out together and learning from each other every day.
🚢 We ship fast. All of our work is novel and cutting edge, and execution speed is paramount. We have a high bar, and we don’t sacrifice quality and design along the way.
🤝 We support each other. We have an open and inclusive culture that’s focused on giving everyone the resources they need to succeed.
Our perks
🍽 Lunch, dinner and snacks at the office.
🏥 Fully covered medical, dental, and vision insurance for employees.
🏦 401(k).
✈️ Relocation and immigration support.
🦖 Your own personal Yoshi.