Title: How ChatGPT is Trained
Duration: 13:43
Viewed: 512,812
Published: 24-01-2023
Source: YouTube
This short tutorial explains the training objectives used to develop ChatGPT, the new chatbot language model from OpenAI.

Timestamps:
0:00 - Non-intro
0:24 - Training overview
1:33 - Generative pretraining (the raw language model)
4:18 - The alignment problem
6:26 - Supervised fine-tuning
7:19 - Limitations of supervision: distributional shift
8:50 - Reward learning based on preferences
10:39 - Reinforcement learning from human feedback
13:02 - Room for improvement

ChatGPT: https://openai.com/blog/chatgpt

Relevant papers for learning more:
InstructGPT: Ouyang et al., 2022 - https://arxiv.org/abs/2203.02155
GPT-3: Brown et al., 2020 - https://arxiv.org/abs/2005.14165
PaLM: Chowdhery et al., 2022 - https://arxiv.org/abs/2204.02311
Efficient reductions for imitation learning: Ross & Bagnell, 2010 - https://proceedings.mlr.press/v9/ross10a.html
Deep reinforcement learning from human preferences: Christiano et al., 2017 - https://arxiv.org/abs/1706.03741
Learning to summarize from human feedback: Stiennon et al., 2020 - https://arxiv.org/abs/2009.01325
Scaling laws for reward model overoptimization: Gao et al., 2022 - https://arxiv.org/abs/2210.10760
Proximal policy optimization algorithms: Schulman et al., 2017 - https://arxiv.org/abs/1707.06347

Special thanks to Elmira Amirloo for feedback on this video.

Links:
YouTube: https://www.youtube.com/ariseffai
Twitter: https://twitter.com/ari_seff
Homepage: https://www.ariseff.com

If you'd like to help support the channel (completely optional), you can donate a cup of coffee via the following:
Venmo: https://venmo.com/ariseff
PayPal: https://www.paypal.me/ariseff