Beyond Black Box Scores: How Musubi Trains Custom AI for Trust and Safety Teams

Jun 11, 2026 | 1 hr 13 min | Ep. 27

Episode description

Guests

  • Nikki Marinsek, Data Scientist, Musubi
  • Brian McCaffrey, Software Engineer, Musubi
  • Dan Means, Machine Learning Engineer, Musubi

In this episode

  • Why off-the-shelf moderation scores fail and how custom-trained models fix that
  • How Musubi combines traditional ML with LLMs for different moderation tasks
  • The discovery that AI can outperform human moderators, and how to communicate that to clients
  • Using AI as a judge to referee disagreements between AI and human decisions
  • How Musubi onboards new customers with "reverse demos"
  • What custom model training actually means: fine-tuning, feature engineering, and reusable deployment pipelines
  • The policy optimizer: an agentic flow that helps customers iterate on their LLM moderation policies
  • Why pushing eval tools directly to customers is a core product strategy
  • How Musubi is building flexible orchestration workflows for non-technical trust and safety teams

Resources & Links

  • Musubi — AI-powered trust and safety toolkit for content platforms
  • Maven AI Evals Course — The course Teresa took to learn about evals (get 35% off with Teresa's affiliate link)

Chapters

00:00 Meet the Team
01:18 Why Everyone Wears Product
02:32 What Musubi Builds
04:51 AI for Human Moderation
09:59 Adversaries and Asymmetry
11:48 Early Days and Low Latency
13:35 First Prototype Slice
15:33 Traditional ML Meets LLMs
19:52 Benchmarking Against Humans
23:09 LLM as Judge and Policy Gaps
29:53 From Prototype to Platform
31:15 Customer Onboarding Reverse Demos
36:08 Custom Models Per Customer
38:05 Fine Tuning vs Training
39:14 Embedding Driven Classification
40:04 Cost and Latency Tradeoffs
43:21 Productizing Customization
49:16 Scaling Prototypes to Production
51:58 Golden Sets and Policy Loops
56:17 Coaching Customers Safely
01:02:06 Gamified Feedback Signals
01:06:19 Agentic Toolkit Roadmap
01:09:05 Workflow Orchestration Future
01:12:06 Wrap Up and Thanks
