I believe that most of the papers presented here focus on acquiring knowledge rather than deep understanding. If you’re completely unfamiliar with the subject, I recommend starting with textbooks rather than papers. Bishop’s latest, "Deep Learning: Foundations and Concepts" (2024) [1], is an excellent resource that covers the "basics" of deep learning and is quite up to date. Another good option is Chip Huyen’s "AI Engineering" (2024) [2]. Other excellent choices are "Dive into Deep Learning" [3] and "Understanding Deep Learning" [4] — or just read anything from fast.ai and watch Karpathy's lectures on YouTube.
I don't know what an "AI Engineer" is, but is reading research papers actually necessary if the half-life of many of these papers' relevance is only a few months, until the next breakthrough happens?
I have a feeling that, unless you're dabbling at the cutting edge of AI, there's no point in reading research papers. Just get a feel for how these LLMs respond, then build a pretty and user-friendly app on top of them. Knowing the difference between "multi-head attention" and "single-head attention" isn't very useful if you're just using OpenAI or Groq's API.
Am I missing something here? I'd love to know where I'm wrong
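For what it's worth, the distinction the parent dismisses is small enough to show in code: multi-head attention is essentially several single-head attentions run in parallel over slices of the embedding. A minimal numpy sketch (the learned projection matrices W_q, W_k, W_v, W_o of a real transformer are omitted to keep the contrast clear):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product attention: softmax(QK^T / sqrt(d)) V
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    return softmax(scores) @ v

def multi_head_attention(x, n_heads):
    # Split the model dimension into n_heads independent subspaces,
    # run single-head attention in each, then concatenate the results.
    seq, d_model = x.shape
    assert d_model % n_heads == 0
    heads = x.reshape(seq, n_heads, d_model // n_heads).transpose(1, 0, 2)
    out = attention(heads, heads, heads)   # self-attention within each head
    return out.transpose(1, 0, 2).reshape(seq, d_model)

x = np.random.randn(5, 16)                 # 5 tokens, d_model = 16
single = attention(x, x, x)                # one head over all 16 dims
multi = multi_head_attention(x, n_heads=4) # four heads of 4 dims each
print(single.shape, multi.shape)           # both (5, 16)
```

Whether that distinction matters for someone building on a hosted API is a fair question; the point is only that the concepts are cheap to pick up.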
hi! author here! putting together a list like this is intimidating - for everything i pick there are a dozen other suitable candidates, so please view this as a curriculum with broadly prescriptive weightings, with the understanding that the CURRENTLY_RELEVANT_PAPER is always a moving pointer rather than a fixed reference.
Heh, always fascinating to see how the term “AI” has been swallowed nigh-completely by the recent exciting developments in DL. All those papers and not a single mention of Russell & Norvig, Minsky, Shannon, Lenat, etc.!
I’m sure it’s a great list for what it is, I just wanted to be pedantic for a bit ;). If you’re interested in an introduction to AI as a broader topic, most graduate courses use the same book (Russell & Norvig) and some publish their syllabi online.
AI Engineer Reading List
(latent.space) | 481 points by ingve | 13 January 2025 | 67 comments
[1]: https://www.bishopbook.com [2]: https://www.oreilly.com/library/view/ai-engineering/97810981... [3]: https://d2l.ai [4]: https://udlbook.github.io/udlbook/
we went through this specific reading list in our paper club, if you are interested in a narrative version: https://www.youtube.com/watch?v=hnIMY9pLPdg
> 1. GPT1, GPT2, GPT3, Codex, InstructGPT, GPT4 papers. Self explanatory. (...)
> 2. Claude 3 and Gemini 1 papers to understand the competition. (...)
> 3. LLaMA 1, Llama 2, Llama 3 papers to understand the leading open models. (...)
I agree that you should have read most of these papers at the time they were released, but I wonder whether reading them now is still that useful. Perhaps it would be better to highlight one or two important papers from this section?
And the one referenced in there on synthetic data generation: https://arxiv.org/abs/2212.10560
https://www.trybackprop.com/blog/top_ml_learning_resources
Don't waste time skimming, reading, and trying to understand every LLM and AI paper.
Read about ELIZA. Build your own.
Get Tensors, Vectors, Fields, Linguistics, Computer Architectures, Networks.
Focus on the subjects themselves, not on them in the context of Neural Networks, "Deep Learning", et al.
In 2025 one's only focus should be distillation & optimization.
In 2025 CoT is not new; corrected CoT is the key and all you need.
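The "build your own ELIZA" suggestion above is a weekend project: the whole mechanism is regex rules plus pronoun reflection. A toy sketch (this rule set is invented here for illustration, not Weizenbaum's original 1966 script):

```python
import re
import random

# Crude pronoun reflection so captured text reads back naturally
# (e.g. "my coffee" -> "your coffee").
REFLECTIONS = {"i": "you", "my": "your", "am": "are", "me": "you"}

# (pattern, candidate responses); "%1" is filled with the reflected capture.
RULES = [
    (r"i need (.*)", ["Why do you need %1?",
                      "Would it really help you to get %1?"]),
    (r"i am (.*)",   ["Why do you think you are %1?",
                      "How long have you been %1?"]),
    (r"(.*) mother(.*)", ["Tell me more about your mother."]),
    (r"(.*)",        ["Please tell me more.",
                      "How does that make you feel?"]),
]

def reflect(fragment):
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(sentence):
    # Try each rule in order; the catch-all "(.*)" always matches last.
    for pattern, responses in RULES:
        m = re.match(pattern, sentence.lower())
        if m:
            reply = random.choice(responses)
            if "%1" in reply:
                reply = reply.replace("%1", reflect(m.group(1)))
            return reply

print(respond("I need my coffee"))  # e.g. "Why do you need your coffee?"
```

It is a useful exercise precisely because it shows how far pure pattern matching gets you before anything resembling understanding is needed.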