This project is a Python-based tool designed to generate synthetic panel data for research or experimentation purposes. Panel data, also known as longitudinal or cross-sectional time-series data, provides valuable insights into the dynamics of various phenomena over time across multiple entities, such as individuals, firms, or countries. The Synthetic Panel Data Generator allows users to specify parameters such as the list of entities (e.g., countries), the time period of observation, and the characteristics of features to be simulated. Users can choose between a default version with pre-defined features and distributions or a custom version allowing full customization of feature settings.
- Default Version: Provides a set of default features and distributions for quick generation of synthetic panel data.
- Custom Version: Allows users to customize feature settings, including distributions and parameters, to tailor the generated data to their specific needs.
- Export to CSV: Enables users to export the generated synthetic panel data to a CSV file for further analysis.
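To make the custom version more concrete, the following is a minimal sketch of how a panel could be generated from a list of entities, a time period, and per-feature distribution settings. It is not the project's actual implementation: the names `FEATURES`, `draw`, `generate_panel`, and `to_csv`, the feature names, and the distribution parameters are all hypothetical, and only the standard library is used.

```python
import csv
import io
import random

# Hypothetical feature specification: name -> (distribution, parameters).
FEATURES = {
    "gdp_growth": ("normal", {"mu": 2.0, "sigma": 1.5}),
    "inflation": ("uniform", {"low": 0.0, "high": 10.0}),
}


def draw(rng, dist, params):
    """Draw one value from the named distribution."""
    if dist == "normal":
        return rng.gauss(params["mu"], params["sigma"])
    if dist == "uniform":
        return rng.uniform(params["low"], params["high"])
    raise ValueError(f"unsupported distribution: {dist}")


def generate_panel(entities, start_year, end_year, features, seed=0):
    """Return one row per (entity, year) pair, with one column per feature."""
    rng = random.Random(seed)  # seeded so repeated runs give the same data
    rows = []
    for entity in entities:
        for year in range(start_year, end_year + 1):
            row = {"entity": entity, "year": year}
            for name, (dist, params) in features.items():
                row[name] = draw(rng, dist, params)
            rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    if not rows:
        raise ValueError("no rows to export")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


panel = generate_panel(["France", "Japan", "Brazil"], 2000, 2004, FEATURES)
csv_text = to_csv(panel)
```

Three entities observed over five years yield fifteen rows. Every row is drawn independently here, so the sketch does not model persistence within an entity over time; a more realistic generator would likely add entity-level effects or autocorrelation.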
To use the Synthetic Panel Data Generator, follow these steps:
- Clone or download the repository to your local machine.
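- Install the required dependencies listed in `requirements.txt`.
- Run the `streamlit_app.py` script. As a Streamlit app, it is typically launched with `streamlit run streamlit_app.py` rather than by calling the Python interpreter on it directly.
- Choose between the default or custom version and specify the desired parameters.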
- Click on "Generate Synthetic Data" to generate the synthetic panel data.
- Optionally, explore the generated data and visualize features using built-in functionalities.
- Export the generated data to a CSV file using the "Export to CSV" button.
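Once a CSV file has been exported, a quick structural check can be useful before further analysis. The sketch below assumes a hypothetical column layout (`entity`, `year`, `value`) and uses a tiny made-up sample in place of a real export; the actual column names depend on the features you selected. It checks whether the panel is balanced (every entity observed in the same years) and computes a per-entity mean.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up sample standing in for an exported file; column names are assumptions.
SAMPLE_CSV = """entity,year,value
A,2000,1.0
A,2001,3.0
B,2000,2.0
B,2001,4.0
"""


def load_panel(handle):
    """Read CSV rows and convert the year and value columns to numbers."""
    return [
        {"entity": r["entity"], "year": int(r["year"]), "value": float(r["value"])}
        for r in csv.DictReader(handle)
    ]


def is_balanced(rows):
    """Return True if every entity is observed in the same set of years."""
    years_by_entity = defaultdict(set)
    for r in rows:
        years_by_entity[r["entity"]].add(r["year"])
    year_sets = list(years_by_entity.values())
    return all(s == year_sets[0] for s in year_sets)


def entity_means(rows):
    """Return the mean of the value column for each entity."""
    values = defaultdict(list)
    for r in rows:
        values[r["entity"]].append(r["value"])
    return {entity: mean(vals) for entity, vals in values.items()}


rows = load_panel(io.StringIO(SAMPLE_CSV))
balanced = is_balanced(rows)
means = entity_means(rows)
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, for example `open("your_export.csv", newline="")`, and adjust the column names to match.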
Using synthetic panel data for research or experiments requires users to maintain rigorous standards in data collection, cleaning, and analysis. While synthetic panel data offers valuable insights into various phenomena over time, users must exercise diligence to ensure the validity and integrity of their analyses. Adherence to ethical standards, transparency, and proper acknowledgment of data sources are essential. Users should recognize potential biases and take steps to mitigate them. Ultimately, users bear the responsibility of conducting research with integrity and contributing to the advancement of knowledge.
This project is licensed under the MIT License - see the LICENSE file for details.