
bundy92/Synthetic_panel_data_generator


Synthetic Panel Data Generator

This project is a Python-based tool for generating synthetic panel data for research or experimentation. Panel data, also known as longitudinal or cross-sectional time-series data, records repeated observations of multiple entities (such as individuals, firms, or countries) over time, which makes it possible to study how those entities change. The Synthetic Panel Data Generator lets users specify the list of entities (e.g., countries), the observation period, and the characteristics of the features to simulate. Users can choose between a default version with pre-defined features and distributions and a custom version that allows full control over feature settings.

Features

  • Default Version: Provides a set of default features and distributions for quick generation of synthetic panel data.
  • Custom Version: Allows users to customize feature settings, including distributions and parameters, to tailor the generated data to their specific needs.
  • Export to CSV: Enables users to export the generated synthetic panel data to a CSV file for further analysis.
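To make the default/custom distinction concrete, here is a minimal sketch of how a long-format panel (one row per entity and year) could be generated from per-feature distribution settings. It uses only the Python standard library and is not the repository's implementation: the function names `draw` and `generate_panel`, the feature names, and the `DEFAULT_FEATURES` settings are all assumptions chosen for illustration.

```python
import random

# Hypothetical default settings: feature name -> (distribution, parameters).
DEFAULT_FEATURES = {
    "gdp_growth": ("normal", {"mu": 2.0, "sigma": 1.5}),
    "inflation": ("uniform", {"a": 0.0, "b": 10.0}),
}


def draw(rng, dist, params):
    """Draw one value from the named distribution."""
    if dist == "normal":
        return rng.gauss(params["mu"], params["sigma"])
    if dist == "uniform":
        return rng.uniform(params["a"], params["b"])
    raise ValueError(f"unsupported distribution: {dist}")


def generate_panel(entities, start_year, end_year, features=None, seed=None):
    """Return a list of row dicts, one per (entity, year) pair."""
    # Fall back to the default settings when no custom ones are given.
    features = DEFAULT_FEATURES if features is None else features
    rng = random.Random(seed)  # a seed makes the output reproducible
    rows = []
    for entity in entities:
        for year in range(start_year, end_year + 1):
            row = {"entity": entity, "year": year}
            for name, (dist, params) in features.items():
                row[name] = draw(rng, dist, params)
            rows.append(row)
    return rows


# "Default version": two entities observed over five years.
panel = generate_panel(["Germany", "France"], 2000, 2004, seed=42)

# "Custom version": the caller supplies its own feature settings.
custom = generate_panel(
    ["A"], 2010, 2011,
    features={"score": ("uniform", {"a": 1.0, "b": 5.0})},
    seed=0,
)
```

Passing a `features` dictionary replaces the defaults entirely, which mirrors the custom version described above; omitting it corresponds to the default version.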

Usage

To use the Synthetic Panel Data Generator, follow these steps:

  1. Clone or download the repository to your local machine.
  2. Install the required dependencies listed in requirements.txt (for example, pip install -r requirements.txt).
  3. Launch the app with Streamlit: streamlit run streamlit_app.py.
  4. Choose between the default or custom version and specify the desired parameters.
  5. Click "Generate Synthetic Data" to generate the synthetic panel data.
  6. Optionally, explore the generated data and visualize features using the built-in tools.
  7. Export the generated data to a CSV file using the "Export to CSV" button.
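After exporting, it can be worth checking that the file is a balanced panel, meaning every entity is observed exactly once in every period. The sketch below is one way to do that with the standard library. It assumes the CSV has columns named `entity` and `year`; the actual column names in the exported file may differ, and the helper name `is_balanced` is hypothetical.

```python
import csv
import io
from collections import defaultdict


def is_balanced(csv_file, entity_col="entity", time_col="year"):
    """Return True if every entity has exactly one row per period."""
    periods = defaultdict(set)  # entity -> set of periods seen
    n_rows = 0
    for row in csv.DictReader(csv_file):
        periods[row[entity_col]].add(row[time_col])
        n_rows += 1
    if not periods:
        return True  # an empty file is trivially balanced
    all_periods = set().union(*periods.values())
    # Duplicate (entity, period) rows would make the row count exceed
    # the number of distinct pairs.
    no_duplicates = n_rows == sum(len(p) for p in periods.values())
    return no_duplicates and all(p == all_periods for p in periods.values())


# In-memory stand-in for an exported file.
sample = io.StringIO(
    "entity,year,value\n"
    "A,2000,1.0\n"
    "A,2001,1.2\n"
    "B,2000,0.7\n"
    "B,2001,0.9\n"
)
balanced = is_balanced(sample)
```

For a real export, open the file with `open(path, newline="")` and pass the file object to `is_balanced`, adjusting the column names as needed.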

Disclaimer

Using synthetic panel data for research or experiments requires users to maintain rigorous standards in data collection, cleaning, and analysis. While synthetic panel data offers valuable insights into various phenomena over time, users must exercise diligence to ensure the validity and integrity of their analyses. Adherence to ethical standards, transparency, and proper acknowledgment of data sources are essential. Users should recognize potential biases and take steps to mitigate them. Ultimately, users bear the responsibility of conducting research with integrity and contributing to the advancement of knowledge.

License

This project is licensed under the MIT License - see the LICENSE file for details.

