172: Transformers and Large Language Models - podcast episode cover

172: Transformers and Large Language Models

Mar 11, 20241 hr 26 minEp. 172
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

172: Transformers and Large Language Models


Intro topic: Is WFH actually WFC?

News/Links:


Book of the Show


Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h


Tool of the Show


Topic: Transformers and Large Language Models

  • How neural networks store information
    • Latent variables
  • Transformers
    • Encoders & Decoders
  • Attention Layers
    • History
      • RNN
        • Vanishing Gradient Problem
      • LSTM
        • Short term (gradient explodes), Long term (gradient vanishes)
    • Differentiable algebra
    • Key-Query-Value
    • Self Attention
  • Self-Supervised Learning & Forward Models
  • Human Feedback
    • Reinforcement Learning from Human Feedback
    • Direct Policy Optimization (Pairwise Ranking)



★ Support this podcast on Patreon ★
For the best experience, listen in Metacast app for iOS or Android