
5 AI Books for Software Engineers

Books to get your hands dirty in ML

Benedict Neo
bitgrit Data Science Publication
5 min read · Apr 22, 2024


image by author

I wrote an AI roadmap for hackers and mentioned a few books.

In this article, we’ll look at just 5 books that strike a healthy balance between theory and practice.

You don’t need all 5 books.

Just pick the ones that you find interesting and would actually read.

Don’t start with, “I want to learn deep learning, so I should read this entire book.” Maybe that works for people who prefer the bottom-up approach, reading about hard concepts and soaking everything in, but personally, I have more intrinsic motivation when I’m reading for a specific purpose.

Let’s say I want to build a model that can predict the next word. I would open up these books, go to the NLP section, flip through those chapters, write some code, and build a mental map of NLP, developing the foundational skills required for this project.
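As a toy illustration of that next-word idea (my own minimal sketch, not taken from any of the books), a bigram model simply counts which word follows which, then predicts the most frequent successor:

```python
from collections import Counter, defaultdict

# Count word-pair (bigram) frequencies from a toy corpus.
corpus = "the cat sat on the mat the cat ran".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent word observed after `word`, or None."""
    counts = bigrams.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat": it follows "the" twice, "mat" only once
```

Real language models are vastly more sophisticated, but even a sketch like this makes the NLP chapters concrete: everything from n-grams to transformers is a better way of estimating "what comes next."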

So, just cherry-pick the books and sections that you personally find interesting.

Machine Learning with PyTorch and Scikit-Learn

Sebastian Raschka, Yuxi Liu, Vahid Mirjalili

This book is written by Sebastian Raschka, previously a professor at the University of Wisconsin-Madison and now a Staff Research Engineer at Lightning AI. He has a great AI newsletter called Ahead of AI.

The book covers both traditional ML and deep learning.

I particularly like the chapter “Going deeper — The mechanics of PyTorch”.


And the chapters on GANs, graph neural networks, and reinforcement learning.


Little Book of Deep Learning

Francois Fleuret

Francois Fleuret, professor and head of the Machine Learning Group at the University of Geneva, wrote this book as a short introduction to deep learning for readers with a STEM background. It was originally designed to be read on a phone screen.

He also has a comprehensive deep learning course that he teaches in Geneva and has open-sourced: UNIGE 14x050 — Deep Learning.

Below is a sneak peek of the book.

Dive into Deep Learning

Aston Zhang, Zachary C. Lipton, Mu Li, Alex J. Smola

Dive into Deep Learning is an open-source textbook with code snippets implemented using PyTorch, NumPy/MXNet, JAX, and TensorFlow.

It’s been adopted by 500 universities in 70 countries.

If you want to go deep into the landscape of deep learning today, these chapters are all you need: convolutional neural networks (CNNs) and modern CNNs, recurrent neural networks (RNNs) and modern RNNs, and the attention mechanism and transformers.
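To give a flavor of what those attention chapters build up to, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside transformers (my own toy code, not an excerpt from the book):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V for 2-D query/key/value matrices."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)     # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # rows sum to 1
    return weights @ V                               # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # 3 queries of dimension 4
K = rng.normal(size=(5, 4))  # 5 keys
V = rng.normal(size=(5, 4))  # 5 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per query
```

Each output row is a weighted average of the value rows, with weights determined by how well the query matches each key; the book's attention chapters extend this to multi-head attention and full transformer blocks.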

There’s also a useful chapter on tools for deep learning, covering AWS EC2, Amazon SageMaker, Google Colab, and selecting GPUs.


Understanding Deep Learning

Simon J.D. Prince

Simon J.D. Prince, Honorary Professor of Computer Science at the University of Bath, tweeted out his book on Twitter.

It starts with the basics: supervised learning, then shallow and deep neural networks, fitting models, backpropagation, then CNNs and residual networks, and finally transformers, GANs, and diffusion.

The book ends with two interesting chapters: “Why does deep learning work?” and the ethics of AI.

He released a ton of Python notebooks for you to practice writing code, from neural networks and backpropagation to self-attention and the diffusion encoder. They’ll make sure you internalize what you learn in the textbook.


He also wrote a few articles on mastering language models.


Deep Learning

Ian Goodfellow, Yoshua Bengio, and Aaron Courville

This is THE deep learning textbook that everyone recommends.

Vikas Paruchuri went from knowing nothing about deep learning to training SOTA open-source models, and he recently joined Answer.AI, Jeremy Howard’s new AI R&D lab.

He recommends the following:

Read it slowly. A lot of the terminology and math will be unfamiliar — look them up. You may need to sketch some things out or code them to get them — give yourself the space to do that. If the math is unfamiliar, a good complementary resource is Math for Machine Learning.

Even though architectures like CNN or RNN might seem out of date in a world that is moving towards transformers for everything, CNNs are still widely used, and everything old is new again with RNNs.

When you’re done with the first 2 parts of the book (you can skip part 3), you should be at a point where you can code up any of the main neural network architectures in plain NumPy (forward and backward passes).
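To make that last recommendation concrete, here's a hypothetical example of what "plain NumPy, forward and backward passes" means in practice: a tiny one-hidden-layer network with a hand-derived backward pass, checked against a finite difference (my own sketch, not code from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
# Tiny network: pred = ReLU(x @ W1) @ W2, with squared-error loss.
x = rng.normal(size=(8, 3))          # batch of 8 inputs
y = rng.normal(size=(8, 1))          # regression targets
W1 = rng.normal(size=(3, 5)) * 0.1
W2 = rng.normal(size=(5, 1)) * 0.1

# Forward pass
h_pre = x @ W1
h = np.maximum(h_pre, 0.0)           # ReLU
pred = h @ W2
loss = ((pred - y) ** 2).mean()

# Backward pass (manual chain rule)
d_pred = 2.0 * (pred - y) / len(x)   # dL/dpred
d_W2 = h.T @ d_pred                  # gradient w.r.t. W2
d_h = d_pred @ W2.T
d_hpre = d_h * (h_pre > 0)           # ReLU passes gradient only where active
d_W1 = x.T @ d_hpre                  # gradient w.r.t. W1

# Sanity check: compare one analytic gradient to a finite difference
eps = 1e-6
W1[0, 0] += eps
loss2 = ((np.maximum(x @ W1, 0.0) @ W2 - y) ** 2).mean()
numeric = (loss2 - loss) / eps
print(abs(numeric - d_W1[0, 0]))     # tiny: analytic grad matches numeric grad
```

Being able to write (and gradient-check) something like this from memory is a good signal you've absorbed the first two parts of the book.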


Want more?

Check out our AI roadmap 👇

Thanks for reading!

Be sure to follow the bitgrit Data Science Publication to keep updated!

Want to discuss the latest developments in Data Science and AI with other data scientists? Join our Discord server!

Follow Bitgrit below to stay updated on workshops and upcoming competitions!

Discord | Website | Twitter | LinkedIn | Instagram | Facebook | YouTube
