A comprehensive Data Analytics & Visualization Project focused on extracting meaningful insights from employee data using Python.
Click below to open the notebook:
This project provides a detailed analysis of employee data from ABC Company.
- Dataset contains 458 rows and 9 columns
- Focuses on workforce insights, salary distribution, and team structure
- Builds a data-driven report for business decision-making
The project includes:
- Data preprocessing
- Exploratory data analysis (EDA)
- Business-driven analytical tasks
- Data visualizations
- Insight generation and storytelling
- Clean and reproducible Google Colab notebook
| Column Name | Description |
|---|---|
| empid | Employee ID |
| age | Age of the employee |
| gender | Gender |
| height | Height (corrected during preprocessing) |
| weight | Weight |
| team | Department |
| position | Job role |
| salary | Monthly salary |
| experience | Years of experience |
The following preprocessing steps were performed:
- Ensured dataset consistency and quality
- Standardized column names (lowercase, underscores)
- Handled missing values
  - Numeric → Median
  - Categorical → Mode / "Unknown"
- Removed duplicate records
- Corrected height values (150–180 cm range)
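The preprocessing steps above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sample rows, the `preprocess` helper, and the `seed` parameter are hypothetical, and replacing out-of-range heights with random integers between 150 and 180 is an assumption about how the correction was done.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample standing in for "ABC Company.xlsx" (illustrative values only).
raw = pd.DataFrame({
    "Emp ID": [1, 2, 2, 3, 4],
    "Age": [25.0, 31.0, 31.0, np.nan, 40.0],
    "Team": ["Sales", "HR", "HR", None, "Sales"],
    "Height": [5.9, 172.0, 172.0, 210.0, 165.0],
})

def preprocess(df: pd.DataFrame, seed: int = 0) -> pd.DataFrame:
    """Hypothetical helper mirroring the listed preprocessing steps."""
    out = df.copy()
    # Standardize column names: trimmed, lowercase, underscores instead of spaces.
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    # Numeric columns: fill missing values with the column median.
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = out[num_cols].fillna(out[num_cols].median())
    # Categorical columns: fill with the mode, or "Unknown" when no mode exists.
    for col in out.select_dtypes(exclude="number").columns:
        mode = out[col].mode()
        out[col] = out[col].fillna(mode.iloc[0] if not mode.empty else "Unknown")
    # Remove exact duplicate records.
    out = out.drop_duplicates().reset_index(drop=True)
    # Assumption: heights outside 150-180 cm are replaced with random integers in that range.
    rng = np.random.default_rng(seed)
    bad = ~out["height"].between(150, 180)
    out.loc[bad, "height"] = rng.integers(150, 181, size=int(bad.sum()))
    return out

clean = preprocess(raw)
print(clean)
```

Fixing the seed keeps the adjusted heights reproducible between runs, which matters because those values are synthetic rather than measured.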
Team distribution:
- Employee count by team
- Percentage distribution
- Identified largest team
- Visualized using bar & pie charts
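A possible way to compute the team counts, percentages, and largest team with pandas is shown below. The `team` column name comes from the dataset table; the sample rows and variable names are made up for illustration.

```python
import pandas as pd

# Made-up team labels for illustration; the real dataset has 458 rows.
df = pd.DataFrame({"team": ["Sales", "Sales", "HR", "IT", "Sales", "IT"]})

team_counts = df["team"].value_counts()             # employees per team
team_pct = (team_counts / len(df) * 100).round(2)   # share of the workforce in percent
largest_team = team_counts.idxmax()                 # team with the most employees

summary = pd.DataFrame({"employees": team_counts, "percent": team_pct})
print(summary)
print("Largest team:", largest_team)
```

From here, `summary["employees"].plot(kind="bar")` or `plot(kind="pie")` would be one way to draw the charts, assuming Matplotlib is installed.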
Position segregation:
- Grouped employees by job role
- Percentage distribution
- Identified common roles
- Visualized using count plot
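The position breakdown might look like the sketch below. The role names are invented, and treating the top two roles as the "common" ones is an arbitrary choice for the example.

```python
import pandas as pd

# Invented job roles for illustration only.
df = pd.DataFrame({
    "position": ["Analyst", "Manager", "Analyst", "Engineer", "Analyst", "Engineer"]
})

# Group employees by job role and count them, most frequent first.
position_counts = df.groupby("position").size().sort_values(ascending=False)

# Percentage share of each role.
position_pct = df["position"].value_counts(normalize=True).mul(100).round(1)

# Assumption: "common roles" means the two most frequent positions.
common_roles = position_counts.head(2).index.tolist()
print(position_counts)
print(common_roles)
```

A count plot of the same column could then be drawn with `seaborn.countplot(data=df, x="position")`.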
Age groups:
- Created age groups (bins)
- Counted distribution
- Visualized using bar chart
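Age binning of this kind is commonly done with `pd.cut`. The bin edges and labels below are assumptions chosen for the example; the notebook may use different cut points.

```python
import pandas as pd

# Made-up ages for illustration.
df = pd.DataFrame({"age": [22, 27, 31, 34, 38, 45, 52]})

# Assumed bin edges and labels (not taken from the notebook).
bins = [18, 25, 35, 45, 60]
labels = ["18-25", "26-35", "36-45", "46-60"]

# pd.cut uses right-inclusive intervals by default: (18, 25], (25, 35], ...
df["age_group"] = pd.cut(df["age"], bins=bins, labels=labels)

# Count employees per group, keeping the groups in age order.
age_group_counts = df["age_group"].value_counts().sort_index()
print(age_group_counts)
```

Because the result is an ordered categorical, `sort_index()` keeps the bars in age order rather than sorting by frequency.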
Salary analysis:
- Total salary by team
- Total salary by position
- Visualized using bar charts
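Salary totals per team and per position can be sketched with `groupby`. The figures below are invented and serve only to show the shape of the computation.

```python
import pandas as pd

# Invented salaries for illustration only.
df = pd.DataFrame({
    "team": ["Sales", "Sales", "HR", "IT"],
    "position": ["Manager", "Analyst", "Analyst", "Manager"],
    "salary": [5000, 3000, 2800, 5200],
})

# Total salary expenditure per team, highest first.
salary_by_team = df.groupby("team")["salary"].sum().sort_values(ascending=False)

# Total salary expenditure per position, highest first.
salary_by_position = df.groupby("position")["salary"].sum().sort_values(ascending=False)

print(salary_by_team)
print(salary_by_position)
```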
Age vs. salary:
- Calculated correlation between age and salary
- Visualized using scatter plot
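One way to compute the age and salary correlation is `Series.corr`, which uses the Pearson coefficient by default. Whether the notebook uses Pearson or another method is not stated, so that choice is an assumption here, as are the sample values.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only.
df = pd.DataFrame({
    "age": [25, 30, 35, 40, 45],
    "salary": [3000, 3400, 3300, 4200, 4600],
})

# Pearson correlation coefficient between age and salary (pandas default method).
r = df["age"].corr(df["salary"])

# Cross-check against NumPy's correlation matrix.
r_numpy = np.corrcoef(df["age"], df["salary"])[0, 1]

print(f"Correlation between age and salary: {r:.3f}")
```

A value near +1 indicates that salary tends to rise with age in the sample, but it does not show that age causes higher pay.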
| Analysis | Visualization |
|---|---|
| Team Distribution | Bar chart, Pie chart |
| Position Segregation | Count plot |
| Age Groups | Bar chart |
| Salary Analysis | Bar charts |
| Age vs Salary | Scatter plot |
A consistent theme, labels, and color palette were applied across all charts.
- Majority of employees fall in the 25–35 age group
- A positive correlation exists between age and salary
- Senior roles contribute significantly to salary expenditure
This analysis helps organizations to:
- Plan workforce distribution
- Optimize salary budgets
- Improve hiring strategies
- Support data-driven decision-making
Limitations:
- The dataset is relatively small (458 records), which may limit the depth of insights.
- Data is static and does not reflect real-time workforce changes.
- Some values (e.g., height) were artificially adjusted, which may affect analysis accuracy.
- Features are limited; important factors such as performance ratings, department budgets, and education level are not included.
- Correlation analysis does not imply causation between variables (e.g., age and salary).
- Results may not generalize to other organizations with different workforce structures.
- Visualization-based insights are descriptive and do not include predictive modeling.
| Tool | Purpose |
|---|---|
| Python | Programming language |
| Pandas | Data processing |
| NumPy | Numerical operations |
| Matplotlib / Seaborn | Visualization |
| Google Colab | Development environment |
ABC-Company-Employee-Analysis/
│
├── ABC Company.xlsx
├── Python Module End Assessment 2.ipynb
└── README.md
1. Click the Google Colab link above
2. Execute all cells sequentially
3. Explore visualizations and insights
This repository was created as part of a Python Module End Assessment in a Data Analytics program to demonstrate data preprocessing, exploratory data analysis (EDA), visualization techniques, and deriving business insights from employee data.
Name: Laya Mary Joy
Organization: Entri Elevate
Date: January 15, 2026
Thanks to Entri Elevate for guidance and support.
Future enhancements:
- Add interactive dashboards (Power BI / Streamlit)
- Include advanced statistical analysis
- Integrate real-time employee datasets
- Enhance visualization interactivity