I am interested in using XingTian for multi-agent training with the PPO algorithm in the SMARTS environment. An example of using the SMARTS environment is available here.
Could you provide detailed step-by-step instructions and an example of how to use XingTian with our own custom environment for multi-agent training?