Beyond Black Box Scores: How Musubi Trains Custom AI for Trust and Safety Teams
Episode description
Guests
- Nikki Marinsek, Data Scientist, Musubi
- Brian McCaffrey, Software Engineer, Musubi
- Dan Means, Machine Learning Engineer, Musubi
In this episode
- Why off-the-shelf moderation scores fail and how custom-trained models fix that
- How Musubi combines traditional ML with LLMs for different moderation tasks
- The discovery that AI can outperform human moderators—and how to communicate that to clients
- Using AI as a judge to referee disagreements between AI and human decisions
- How Musubi onboards new customers with "reverse demos"
- What custom model training actually means: fine-tuning, feature engineering, and reusable deployment pipelines
- The policy optimizer: an agentic flow that helps customers iterate on their LLM moderation policies
- Why pushing eval tools directly to customers is a core product strategy
- How Musubi is building flexible orchestration workflows for non-technical trust and safety teams
Resources & Links
- Musubi — AI-powered trust and safety toolkit for content platforms
- Maven AI Evals Course — The course Teresa took to learn about evals (get 35% off with Teresa's affiliate link)
Chapters
00:00 Meet the Team
01:18 Why Everyone Wears Product
02:32 What Musubi Builds
04:51 AI for Human Moderation
09:59 Adversaries and Asymmetry
11:48 Early Days and Low Latency
13:35 First Prototype Slice
15:33 Traditional ML Meets LLMs
19:52 Benchmarking Against Humans
23:09 LLM as Judge and Policy Gaps
29:53 From Prototype to Platform
31:15 Customer Onboarding Reverse Demos
36:08 Custom Models Per Customer
38:05 Fine-Tuning vs Training
39:14 Embedding-Driven Classification
40:04 Cost and Latency Tradeoffs
43:21 Productizing Customization
49:16 Scaling Prototypes to Production
51:58 Golden Sets and Policy Loops
56:17 Coaching Customers Safely
01:02:06 Gamified Feedback Signals
01:06:19 Agentic Toolkit Roadmap
01:09:05 Workflow Orchestration Future
01:12:06 Wrap Up and Thanks
