I believe that most of the papers presented here focus on acquiring knowledge rather than deep understanding. If you’re completely unfamiliar with the subject, I recommend starting with textbooks rather than papers. Bishop’s latest, "Deep Learning: Foundations and Concepts" (2024) [1], is an excellent resource that covers the "basics" of deep learning and is quite up to date. Another good option is Chip Huyen’s "AI Engineering" (2024) [2]. Other excellent choices are "Dive into Deep Learning" [3] and "Understanding Deep Learning" [4] — or just read anything from fast.ai and watch Karpathy's lectures on YouTube.
I don't know what an "AI Engineer" is, but is reading research papers actually necessary if the half-life of many of these papers' relevance is only a few months, until the next breakthrough happens?
I have a feeling that, unless you're dabbling at the cutting edge of AI, there's no point in reading research papers. Just get a feel for how these LLMs respond, then build a pretty and user-friendly app on top of them. Knowing the difference between "multi-head attention" and "single-head attention" isn't very useful if you're just using OpenAI or Groq's API.
Am I missing something here? I'd love to know where I'm wrong
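For what it's worth, the distinction the parent dismisses is small enough to show in code: multi-head attention is essentially several single-head attentions run in parallel over slices of the embedding. A minimal numpy sketch (the learned projection matrices W_q, W_k, W_v, W_o of a real transformer are omitted to keep the contrast clear):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: softmax(QK^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    return softmax(scores) @ v

def multi_head_attention(x, n_heads):
    # Split the model dimension into n_heads independent subspaces,
    # run single-head attention in each, then concatenate the results.
    seq, d_model = x.shape
    assert d_model % n_heads == 0
    heads = x.reshape(seq, n_heads, d_model // n_heads).transpose(1, 0, 2)
    out = attention(heads, heads, heads)   # self-attention within each head
    return out.transpose(1, 0, 2).reshape(seq, d_model)

x = np.random.randn(5, 16)                 # 5 tokens, d_model = 16
single = attention(x, x, x)                # one head over all 16 dims
multi = multi_head_attention(x, n_heads=4) # four heads of 4 dims each
print(single.shape, multi.shape)           # both (5, 16)
```

Whether that distinction matters for someone building on a hosted API is a fair question; the point is only that the concepts are cheap to pick up.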
hi! author here! putting together a list like this is intimidating - for everything i pick there are a dozen other suitable candidates, so please view this as a curriculum with broadly prescriptive weightings, with the understanding that the CURRENTLY_RELEVANT_PAPER is always a moving pointer rather than a fixed reference.
Heh, always fascinating to see how the term “AI” has been swallowed nigh-completely by the recent exciting developments in DL. All those papers and not a single mention of Russell & Norvig, Minsky, Shannon, Lenat, etc.!
I’m sure it’s a great list for what it is, I just wanted to be pedantic for a bit ;). If you’re interested in an introduction to AI as a broader topic, most graduate courses use the same book (Russell & Norvig) and some publish their syllabi online.
AI Engineer Reading List
(latent.space) | 481 points by ingve | 13 January 2025 | 67 comments
[1]: https://www.bishopbook.com [2]: https://www.oreilly.com/library/view/ai-engineering/97810981... [3]: https://d2l.ai [4]: https://udlbook.github.io/udlbook/
we went through this specific reading list in our paper club, if you are interested in a narrative version: https://www.youtube.com/watch?v=hnIMY9pLPdg
> 1. GPT1, GPT2, GPT3, Codex, InstructGPT, GPT4 papers. Self explanatory. (...)
> 2. Claude 3 and Gemini 1 papers to understand the competition. (...)
> 3. LLaMA 1, Llama 2, Llama 3 papers to understand the leading open models. (...)
I agree that you should have read most of these papers at the time they were released, but I wonder whether reading them now is still that useful. Perhaps it would be better to highlight one or two important papers from this section?
And the one referenced in there on synthetic data generation: https://arxiv.org/abs/2212.10560
https://www.trybackprop.com/blog/top_ml_learning_resources
Don't waste time skimming, reading, and trying to understand every LLM and AI paper.
Read about ELIZA. Build your own.
Get Tensors, Vectors, Fields, Linguistics, Computer Architectures, Networks.
Focus on the subjects themselves, not on them in the context of Neural Networks, "Deep Learning", et al.
In 2025 one's only focus should be distillation & optimization.
In 2025 CoT is not new; corrected CoT is the key and all you need.
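The "build your own ELIZA" suggestion above is a weekend project: the whole mechanism is regex rules plus pronoun reflection. A toy sketch (this rule set is invented here for illustration, not Weizenbaum's original 1966 script):

```python
import re
import random

# Crude pronoun reflection so captured text reads back naturally
# (e.g. "my coffee" -> "your coffee").
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# (pattern, candidate responses); "%1" is filled with the reflected capture.
RULES = [
    (r"i need (.*)", ["Why do you need %1?",
                      "Would it really help you to get %1?"]),
    (r"i am (.*)",   ["Why do you think you are %1?",
                      "How long have you been %1?"]),
    (r"(.*) mother(.*)", ["Tell me more about your mother."]),
    (r"(.*)",        ["Please tell me more.",
                      "How does that make you feel?"]),
]

def reflect(fragment):
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(sentence):
    # Try each rule in order; the catch-all "(.*)" always matches last.
    for pattern, responses in RULES:
        m = re.match(pattern, sentence.lower())
        if m:
            reply = random.choice(responses)
            if "%1" in reply:
                reply = reply.replace("%1", reflect(m.group(1)))
            return reply

print(respond("I need my coffee"))  # e.g. "Why do you need your coffee?"
```

It is a useful exercise precisely because it shows how far pure pattern matching gets you before anything resembling understanding is needed.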