mikexcohen/ML4LLM_book

50 ML projects to understand LLMs

Code repository for the book "50 ML projects to understand LLMs: Investigate transformer mechanisms through data analysis, visualization, and experimentation" by Mike X Cohen, PhD.

About This Repository

This repository contains all Python code and Jupyter notebooks for the 50 projects in the book. Each project includes:

  • Helper notebook: Incomplete code to work through the projects yourself (hints and guidance are in the book).
  • Solutions notebook: Complete, working implementation corresponding to the detailed explanations in the book.

All code runs on Google Colab, so you don't need to install anything locally or manage library configurations.
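As a rough illustration of opening a notebook without a local install, the sketch below builds a Colab link from a GitHub path using Colab's `colab.research.google.com/github/...` URL scheme. The function name `colab_url`, the branch name `main`, and the notebook filename are assumptions made for this example; check the repository for the actual branch and file names.

```python
from urllib.parse import quote


def colab_url(owner, repo, notebook_path, branch="main"):
    """Build a link that opens a GitHub-hosted notebook in Google Colab."""
    path = quote(notebook_path)  # escape spaces and other unsafe characters
    return (
        "https://colab.research.google.com/github/"
        f"{owner}/{repo}/blob/{branch}/{path}"
    )


# "project01_helper.ipynb" is a hypothetical filename and "main" is an
# assumed branch name; neither is confirmed by this README.
url = colab_url("mikexcohen", "ML4LLM_book", "project01_helper.ipynb")
print(url)
```

Pasting the printed link into a browser should open the notebook in Colab, provided the path and branch match what is actually in the repository.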

About the Book

Learn how LLMs like GPT and BERT actually work by applying machine learning techniques to their internal activations. This book takes a unique approach: rather than building LLMs from scratch or using them via APIs, you'll investigate their mechanisms by treating hidden states, attention patterns, and embeddings as data to analyze.

Through 50 hands-on projects, you'll learn to:

  • Inspect and visualize transformer internals
  • Analyze attention mechanisms and layer dynamics
  • Apply statistical and causal methods to understand model behavior
  • Manipulate activations to test hypotheses about LLM mechanisms

Each project teaches skills in three areas: machine learning techniques, LLM mechanisms, and Python coding with data visualization.
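To give a flavor of what "treating attention patterns as data" can mean, here is a minimal sketch in plain Python. It is not taken from the book or its notebooks: the query and key vectors are invented toy numbers, no causal mask is applied, and the helper names (`softmax`, `attention_weights`, `entropy_bits`) are chosen for this example only. It computes scaled dot-product attention weights and then summarizes each token's attention distribution with its entropy.

```python
import math


def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights, one row per query token."""
    scale = math.sqrt(len(keys[0]))
    rows = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        rows.append(softmax(scores))
    return rows


def entropy_bits(probs):
    """Shannon entropy in bits; low values mean focused attention."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Toy stand-ins for one attention head: 3 tokens, 2-dimensional vectors.
# These numbers are invented for illustration, not taken from a real model.
queries = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]

weights = attention_weights(queries, keys)
entropies = [entropy_bits(row) for row in weights]

for token, (row, h) in enumerate(zip(weights, entropies)):
    formatted = ", ".join(f"{w:.3f}" for w in row)
    print(f"token {token}: weights=[{formatted}] entropy={h:.3f} bits")
```

With a real model, the weights would come from the model's attention outputs rather than hand-written vectors, but the analysis step (here, entropy per token) would follow the same pattern.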

Overview of projects, ML skills, and LLM concepts

See the spreadsheet for detailed information.

See the PDF for the table of contents and Chapter 1 (introductions).

Purchase the Book

Format      Link
Paperback   Amazon
PDF         Gumtree

No installation required: all notebooks run directly in Google Colab.

Prerequisites

  • Python programming experience (beginner to intermediate level)
  • Basic familiarity with machine learning concepts (helpful but not required)
  • Curiosity about how LLMs work

The book introduces ML techniques as needed, so you don't need to be an ML expert to get started.

Citation

If you use this code in your research or projects, please cite this GitHub URL.

License

This code is released under the MIT License. See the LICENSE file for details.

About the Author

Mike X Cohen, PhD is a former neuroscience professor, full-time educator, and Udemy bestselling instructor with 25 years of experience teaching machine learning, mathematics, and data science.

Other books by Mike X Cohen:

  • Linear Algebra: Theory, Intuition, Code
  • Modern Statistics: Intuition, Math, Python, R
  • Calculus Unraveled: Intuition, Proofs, and Python

Questions and Support

For questions about the code or book content, please open an issue in this repository.
