50 ML projects to understand LLMs

Code repository for the book "50 ML projects to understand LLMs: Investigate transformer mechanisms through data analysis, visualization, and experimentation" by Mike X Cohen, PhD.

About This Repository

This repository contains all Python code and Jupyter notebooks for the 50 projects in the book. Each project includes:

Helper notebook: Incomplete code to work through the projects yourself (hints and guidance are in the book).
Solutions notebook: Complete, working implementation corresponding to the detailed explanations in the book.

All code runs on Google Colab, so you don't need to install anything locally or manage library configurations.

About the Book

Learn how LLMs like GPT and BERT actually work by applying machine learning techniques to their internal activations. This book takes a unique approach: rather than building LLMs from scratch or using them via APIs, you'll investigate their mechanisms by treating hidden states, attention patterns, and embeddings as data to analyze.

Through 50 hands-on projects, you'll learn to:

Inspect and visualize transformer internals
Analyze attention mechanisms and layer dynamics
Apply statistical and causal methods to understand model behavior
Manipulate activations to test hypotheses about LLM mechanisms

Each project teaches skills in three areas: machine learning techniques, LLM mechanisms, and Python coding with data visualization.
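
As a rough illustration of what treating attention patterns as data can look like, the sketch below computes scaled dot-product attention weights for three toy tokens and then summarizes each token's attention distribution with its Shannon entropy. It is a minimal pure-Python sketch: the function names and the 2-D vectors are assumptions made up for this example and are not taken from the book or its notebooks, which work with real pretrained models.

```python
import math


def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights: one probability row per query."""
    scale = math.sqrt(len(keys[0]))  # sqrt of the vector dimensionality
    rows = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        rows.append(softmax(scores))
    return rows


def entropy(probs):
    """Shannon entropy in bits; low values mean sharply focused attention."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical 2-D query/key vectors for three tokens (illustration only).
toy_queries = [[4.0, 0.0], [0.0, 0.0], [0.0, 4.0]]
toy_keys = [[4.0, 0.0], [0.0, 4.0], [2.0, 2.0]]

weights = attention_weights(toy_queries, toy_keys)
entropies = [entropy(row) for row in weights]

for i, (row, h) in enumerate(zip(weights, entropies)):
    rounded = [round(w, 3) for w in row]
    print(f"token {i}: weights={rounded} entropy={h:.3f} bits")
```

In this toy setup the all-zero query attends uniformly to every key and so has the highest entropy, while the other two queries concentrate on the key they align with. With a real model, the same kind of summary statistic could be computed per head and per layer from the attention tensors the model returns.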

Overview of projects, ML skills, and LLM concepts

See the spreadsheet for detailed information about the projects, ML skills, and LLM concepts.

See the PDF (ml4llm_TOC_ch1.pdf) for the table of contents and Chapter 1 (introductions).

Purchase the Book

Format	Link
Paperback	Amazon
PDF	Gumtree

No installation is required: all notebooks run directly in Google Colab.

Prerequisites

Python programming experience (beginner to intermediate level)
Basic familiarity with machine learning concepts (helpful but not required)
Curiosity about how LLMs work

The book introduces ML techniques as needed, so you don't need to be an ML expert to get started.

Citation

If you use this code in your research or projects, please cite the URL of this GitHub repository.

License

This code is released under the MIT License. See the LICENSE file for details.

About the Author

Mike X Cohen, PhD, is a former neuroscience professor, full-time educator, and Udemy bestselling instructor with 25 years of experience teaching machine learning, mathematics, and data science.

Other books by Mike X Cohen:

Linear Algebra: Theory, Intuition, Code
Modern Statistics: Intuition, Math, Python, R
Calculus Unraveled: Intuition, Proofs, and Python

Questions and Support

For questions about the code or book content, please open an issue in this repository.
