Tried every approach across iterations 1-5.
Most of them failed terribly or were too GPU intensive, except two.
Deleted all of those iterations.
Iteration 6 is built upon the knowledge obtained from the failures of the previous 5 iterations.
Now the AIM is to make a robust transformer that can write poems nearly as good as Devkota's (although it can't).
Mistakes in previous approaches
no clean data
a poorly suited tokenizer
limited GPU
high perplexity
What to be improved
use cleaner data - collecting it myself now
tokenizer - using a custom tokenizer (a rough sketch follows this list)
limited GPU - using Colab for training and keeping resource usage in mind (see the training sketch below)
increase the number of epochs to reduce perplexity (the old models converged to lower perplexity at higher epoch counts; see the perplexity sketch below)
and the MOST IMPORTANT thing:
NO AI for code; this project is to understand the internals of an LLM.
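A minimal sketch of what the custom tokenizer could look like, assuming a character-level vocabulary built straight from the poem corpus (the class name `CharTokenizer` and the file `poems.txt` are placeholders, not actual project files):

```python
# Hypothetical character-level tokenizer for the poem corpus.
class CharTokenizer:
    def __init__(self, text):
        # build the vocabulary from every unique character in the corpus
        chars = sorted(set(text))
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    def encode(self, s):
        # map each character to its integer id
        return [self.stoi[ch] for ch in s]

    def decode(self, ids):
        # map integer ids back to characters
        return "".join(self.itos[i] for i in ids)


if __name__ == "__main__":
    corpus = open("poems.txt", encoding="utf-8").read()  # placeholder data file
    tok = CharTokenizer(corpus)
    ids = tok.encode(corpus[:50])
    print(ids[:10], tok.decode(ids[:10]))
```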
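And a hedged sketch of how training might stay within a Colab GPU budget, using gradient accumulation plus mixed precision; the tiny model, optimizer, and fake loader here are stand-ins, not the real training code:

```python
import torch
import torch.nn.functional as F

# toy stand-ins so the sketch runs on its own; the real project would plug in
# its transformer, optimizer, and DataLoader here
model = torch.nn.Linear(64, 100).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
loader = [(torch.randn(8, 64), torch.randint(0, 100, (8,))) for _ in range(16)]

scaler = torch.cuda.amp.GradScaler()   # mixed precision to save GPU memory
accum_steps = 4                        # effective batch = 8 * 4 = 32

for step, (x, y) in enumerate(loader):
    x, y = x.cuda(), y.cuda()
    with torch.cuda.amp.autocast():
        loss = F.cross_entropy(model(x), y) / accum_steps
    scaler.scale(loss).backward()      # gradients accumulate across small steps
    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
```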
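Perplexity is just the exponential of the average cross-entropy per token, so tracking it across more epochs is straightforward; this evaluation helper is a sketch and assumes the model returns (batch, seq, vocab) logits:

```python
import math
import torch
import torch.nn.functional as F

@torch.no_grad()
def perplexity(model, loader, device="cuda"):
    """exp(mean cross-entropy per token) over a held-out set; lower is better."""
    total_loss, total_tokens = 0.0, 0
    for x, y in loader:                      # x, y: (batch, seq) token ids
        x, y = x.to(device), y.to(device)
        logits = model(x)                    # assumed shape (batch, seq, vocab)
        loss = F.cross_entropy(
            logits.view(-1, logits.size(-1)), y.view(-1), reduction="sum"
        )
        total_loss += loss.item()
        total_tokens += y.numel()
    return math.exp(total_loss / total_tokens)
```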
Happy Coding:)