Can I get the augmented dataset?
As mentioned in the research paper, after data augmentation the SFT dataset comprises 9,828 instances, while the KTO alignment dataset contains 19,563 instances.
Could you explain how you went from 1,763 seed problems to 9,828 and 19,563 instances? How much of the data is augmented, and how much external data is used apart from the datasets mentioned?