Conversation

@edoardob90 (Member) commented May 7, 2025

The Part 1 notebook introduces the fundamentals of neural-network-based language modeling, from the traditional bigram approach to the simplest neural networks.

  • Implementation of a single-layer neural network for character-level language modeling
  • Comparison with the bigram model approach, highlighting similarities in performance but differences in flexibility
  • Step-by-step explanation of the neural network pipeline:
    • One-hot encoding of character inputs
    • Forward pass through a weight matrix
    • Softmax transformation to obtain probability distributions
    • Loss calculation using negative log-likelihood
    • Backward pass for gradient computation
    • Weight updates using gradient descent
  • Introduction to regularization
  • Demonstration of sampling from the trained model

It's the first step of a step-by-step introduction to language modeling with the PyTorch library. A minimal sketch of the pipeline described above is included below for reference.
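For reference, here is a minimal end-to-end sketch of the pipeline the bullet points describe: a character-level bigram model implemented as a single linear layer. The `data/lm/names.txt` path, the 27-symbol vocabulary, the learning rate, and the regularization strength are placeholder assumptions for illustration, not necessarily what the notebook uses.

```python
import torch
import torch.nn.functional as F

# Assumed setup: lowercase names, one per line, with '.' used as a start/end token
# (26 letters + '.' = 27 symbols). Path and hyperparameters are placeholders.
words = open('data/lm/names.txt', 'r').read().splitlines()
chars = ['.'] + sorted(set(''.join(words)))
stoi = {c: i for i, c in enumerate(chars)}
itos = {i: c for c, i in stoi.items()}
vocab_size = len(chars)  # 27 for lowercase names plus the '.' token

# Build bigram training pairs: each character is used to predict the next one
xs, ys = [], []
for w in words:
    s = ['.'] + list(w) + ['.']
    for c1, c2 in zip(s, s[1:]):
        xs.append(stoi[c1])
        ys.append(stoi[c2])
xs, ys = torch.tensor(xs), torch.tensor(ys)

g = torch.Generator().manual_seed(42)
W = torch.randn((vocab_size, vocab_size), generator=g, requires_grad=True)  # the single-layer "network"

for step in range(100):
    # Forward pass: one-hot encode the inputs and multiply by the weight matrix
    xenc = F.one_hot(xs, num_classes=vocab_size).float()
    logits = xenc @ W
    # Softmax: exponentiate and normalize each row into a probability distribution
    counts = logits.exp()
    probs = counts / counts.sum(dim=1, keepdim=True)
    # Negative log-likelihood of the correct next character, plus an L2 penalty (regularization)
    loss = -probs[torch.arange(len(ys)), ys].log().mean() + 0.01 * (W ** 2).mean()

    # Backward pass and gradient-descent weight update
    W.grad = None
    loss.backward()
    W.data += -10.0 * W.grad

# Sampling: start from the '.' token and repeatedly draw the next character
out, ix = [], 0
while True:
    p = (F.one_hot(torch.tensor([ix]), num_classes=vocab_size).float() @ W).exp()
    p = p / p.sum()
    ix = torch.multinomial(p, num_samples=1, generator=g).item()
    if ix == 0:  # back at the end token
        break
    out.append(itos[ix])
print(''.join(out))
```

The small L2 penalty on `W` is the regularization step; it is roughly analogous to the smoothing used in the counting-based bigram model.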

@edoardob90 (Member, Author) commented May 7, 2025

Left to do:

  • Add more references to extra material
  • Update Table of Contents
  • Finalize the exercises

@edoardob90 (Member, Author) commented May 10, 2025

Left to do:

  • Add solutions notebook

@edoardob90 force-pushed the new-material/pytorch-llm-tutorial branch from f36ddee to 9dad59f on May 12, 2025 at 08:28
@despadam (Contributor)

Also, this should be included in 00_index.ipynb

@despadam (Contributor) left a comment

LGTM 👏

@edoardob90 merged commit cbd5572 into main on May 12, 2025 (1 check passed)
@edoardob90 deleted the new-material/pytorch-llm-tutorial branch on May 12, 2025 at 20:23
@Snowwpanda (Collaborator) left a comment

Looks nice, good work.

"device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')\n",
"\n",
"# Load dataset\n",
"words = open('data/names.txt', 'r').read().splitlines()\n",

Suggested change
- words = open('data/names.txt', 'r').read().splitlines()
+ words = open('data/lm/names.txt', 'r').read().splitlines()

names.txt is in a subfolder.
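A small sketch of the corrected loading step, assuming the `data/lm/` location from the suggestion above (`pathlib` is just one way to make the path explicit):

```python
from pathlib import Path

# The dataset lives in a subfolder; 'data/lm/names.txt' follows the suggested change above
data_file = Path('data') / 'lm' / 'names.txt'
words = data_file.read_text().splitlines()
```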
