ChatGPT, DALL-E, and Stable Diffusion were the buzz of 2022. As a software engineer with no formal education in machine learning or artificial intelligence, I have been watching the latest progress from the sidelines, with awe.
This holiday break I decided to develop a basic understanding of the field and demystify some of these concepts for myself, especially around the sub-field of deep learning and neural nets.
I didn't find anything readily available that fit my needs, i.e. those of a 'classical' software engineer with experience building products and platforms, including data analytics platforms, looking to ramp up on deep learning and software 2.0. Resources on the web were either too high-level, aimed at giving non-technical folks a cursory understanding of the field, or too low-level, meant for students or professionals with several years of ML and AI experience.
So here it is, my set of resources assembled for the purpose. Hope you find them useful too!
University courses (Free!)
- Stanford CS231n (Winter 2016) – Deep Learning for Computer Vision by Andrej Karpathy & Justin Johnson: One of the best intros to the overall field of computer vision and machine learning; covers backpropagation, convolutional neural networks (CNNs), recurrent neural networks (RNNs), attention, etc. The beauty is that this was recorded in 2016, *before* exciting new developments such as Transformers emerged, so the course provides good historical context on the techniques of the first half of the 2010s.
- Stanford CS224n (Winter 2021) – Natural Language Processing with Deep Learning by Chris Manning: A great intro to the field of natural language processing and related deep learning concepts such as RNNs, long short-term memory (LSTM) networks, self-attention and Transformers, and fine-tuning, plus some great guest lectures.
- A few other notable courses and lecture series worth skimming:
- MIT 6.S191 (Spring 2022) – Intro to Deep Learning by Alexander Amini and Ava Soleimany: A good lecture series to skim quickly (it has some overlap with the series above); it touches on deep reinforcement learning, which the other series don't.
- Stanford CS25 (Spring 2022) – Transformers United: A lecture series on applications of Transformers across various domains
- [Hands-on course] Neural Networks – From Zero to Hero by Andrej Karpathy (a toy backpropagation sketch in the spirit of this course follows this list)
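As a taste of what these courses build up to, here is a minimal sketch of backpropagation through a tiny one-hidden-layer network. This is my own toy example, not taken from any of the lectures; the data, layer sizes, and squared-error loss are arbitrary choices for illustration.

```python
import numpy as np

# Toy data: learn y = 2x from a handful of points (arbitrary choice).
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 1))
y = 2.0 * x

# One hidden layer with a tanh nonlinearity; shapes are illustrative.
W1 = rng.normal(scale=0.5, size=(1, 8))
b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1))
b2 = np.zeros(1)

lr = 0.1
for step in range(200):
    # Forward pass.
    h_pre = x @ W1 + b1               # (16, 8)
    h = np.tanh(h_pre)                # (16, 8)
    y_hat = h @ W2 + b2               # (16, 1)
    loss = np.mean((y_hat - y) ** 2)

    # Backward pass: apply the chain rule layer by layer.
    d_yhat = 2.0 * (y_hat - y) / len(x)   # dL/dy_hat
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    d_h = d_yhat @ W2.T
    d_hpre = d_h * (1.0 - h ** 2)         # tanh'(z) = 1 - tanh(z)^2
    dW1 = x.T @ d_hpre
    db1 = d_hpre.sum(axis=0)

    # Gradient descent step.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss: {loss:.4f}")
```

The same chain-rule bookkeeping, automated and generalized to arbitrary computation graphs, is what autograd frameworks such as PyTorch do for you.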
Important papers from the post-Transformer era: Here's my attempt at a list of 'seminal papers in machine learning and deep learning since 2017'
- Attention Is All You Need (the Transformer paper) (2017) – a minimal self-attention sketch follows this list
- GPT-1 (2018)
- GPT-2 (2019)
- BERT (2019)
- GPT-3 (2020)
- T5 (2020)
- DALL-E (2021)
- CLIP (2021)
- WebGPT (2022)
- Stable Diffusion (2022)
- LaMDA (2022)
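Since nearly every paper on this list builds on the Transformer, here is a minimal sketch of the scaled dot-product attention at its core, softmax(QK^T / sqrt(d_k)) V from the 2017 paper. This is a single head with no masking, multi-head split, or output projection, and all the dimensions are arbitrary choices for illustration.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (no mask, no output projection)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv                # (seq, d_k) each
    scores = q @ k.T / np.sqrt(k.shape[-1])          # (seq, seq): similarity of every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax over the other tokens
    return weights @ v                               # each output is a weighted mix of values

# Arbitrary sizes: 5 tokens, model width 16, head width 8.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(16, 8)) for _ in range(3))
print(self_attention(x, Wq, Wk, Wv).shape)  # (5, 8)
```

In the full Transformer, several such heads run in parallel and are wrapped with residual connections, layer normalization, and feed-forward layers.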
Podcasts with experts in the field: These won't teach core concepts, but they are helpful for gaining a general understanding of trends and ideas in the field
- Lex Fridman and Andrej Karpathy
- Deep Learning Deep Dive (only two episodes: Andrej Karpathy and Justin Johnson on DALL-E, with Aditya Ramesh)
- ML Street Talk
- The Robot Brains Podcast
Notable Blog Posts: I will list just a few that I think capture the key developments in the field
- Software 2.0 (by Andrej Karpathy)
- Pathways – Next Gen AI architecture (by Jeff Dean)
- PapersWithCode Methods
- Jay Alammar's blog
- Transformers from Scratch: https://e2eml.school/transformers.html
Please share this post if you found it useful, and suggest corrections if I got something wrong. Please also suggest resources that I can add to make it more useful.
I will try to cover a broader set of papers, data sources, and more hands-on resources in a follow-up blog post. Stay tuned!
Happy Learning,
Abhi Khune