Are F* and S* generated for every example, or only for the augmented data?

Can I get the augmented dataset?
As mentioned in the research paper, after data augmentation the SFT dataset comprises 9,828 instances, while the KTO alignment dataset contains 19,563 instances.

Could you explain how you went from 1,763 seed problems to 9,828 and 19,563 instances? How much augmented data is used, and how much external data is used apart from the mentioned datasets?