This project performs exploratory data analysis (EDA) on a Gurgaon real estate dataset to answer business questions about property pricing, locality trends, builder premiums, and investment potential. The goal is to extract insights that help buyers, investors, and real estate businesses make data-driven decisions.
This analysis answers the following key business questions:
- Which is the costliest flat in the dataset?
- Which locality has the highest average price?
- Which locality has the highest rate per square foot?
- Do ready-to-move properties cost more than under-construction properties?
- Do RERA-approved properties command a price premium?
- How does area (sqft) impact property price?
- Which BHK configuration is the most expensive on average?
- Which property type (Apartment, Floor, Plot) is the costliest?
- Do certain builders consistently price their properties higher?
- Are larger homes always more expensive per square foot?
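Several of the questions above reduce to a sort or a group-by aggregation. The following is a minimal sketch of that idea, not the project's actual code: the column names (`locality`, `price`, `area_sqft`) and the sample rows are assumptions made for illustration and may not match the dataset's real schema.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "locality": ["Locality A", "Locality A", "Locality B", "Locality B", "Locality C"],
    "price": [45_000_000, 30_000_000, 12_000_000, 9_000_000, 8_000_000],  # INR
    "area_sqft": [3000, 2500, 1500, 1200, 1600],
})

# Rate per square foot for every listing.
df["rate_per_sqft"] = df["price"] / df["area_sqft"]

# Costliest single listing in the data.
costliest = df.loc[df["price"].idxmax()]

# Average price and average rate per locality, sorted from high to low price.
by_locality = (
    df.groupby("locality")
      .agg(avg_price=("price", "mean"), avg_rate=("rate_per_sqft", "mean"))
      .sort_values("avg_price", ascending=False)
)

top_price_locality = by_locality["avg_price"].idxmax()
top_rate_locality = by_locality["avg_rate"].idxmax()
print(by_locality)
```

The same pattern (group by a category, aggregate price or rate) would apply to BHK configuration, property type, and builder, assuming those columns exist.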
The analysis uses the following tools:
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
The analysis proceeds in three stages:
- Data Cleaning and Preprocessing
  - Handling missing values
  - Removing duplicates
  - Standardizing price and area columns
- Exploratory Data Analysis (EDA)
  - Price distribution analysis
  - Locality-wise comparison
  - BHK-based pricing trends
  - Builder-wise analysis
- Visualization
  - Bar charts
  - Scatter plots
  - Box plots
  - Correlation analysis
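The cleaning steps listed above (duplicates, missing values, standardized price and area columns) could look roughly like the sketch below. It assumes prices arrive as strings such as `1.2 Cr` or `85 Lac` and areas as strings such as `1,500 sqft`; those formats, the column names, and the helper `price_to_inr` are hypothetical and should be adapted to the actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical raw rows; column names and string formats are assumptions.
raw = pd.DataFrame({
    "locality": ["Locality A", "Locality A", "Locality B", "Locality C", None],
    "price": ["1.2 Cr", "1.2 Cr", "85 Lac", None, "60 Lac"],
    "area": ["1,500 sqft", "1,500 sqft", "1200 sqft", "900 sqft", "800 sqft"],
})

def price_to_inr(text):
    """Convert strings like '1.2 Cr' or '85 Lac' to rupees; NaN if unparseable."""
    if not isinstance(text, str):
        return np.nan
    parts = text.strip().split()
    if len(parts) != 2:
        return np.nan
    multipliers = {"cr": 1e7, "lac": 1e5}  # crore and lakh
    try:
        return float(parts[0]) * multipliers[parts[1].lower()]
    except (ValueError, KeyError):
        return np.nan

# 1. Remove exact duplicate rows.
clean = raw.drop_duplicates().copy()

# 2. Standardize price to a numeric rupee value.
clean["price_inr"] = clean["price"].map(price_to_inr)

# 3. Standardize area: strip thousands separators, keep the leading number.
clean["area_sqft"] = pd.to_numeric(
    clean["area"].str.replace(",", "", regex=False).str.extract(r"(\d+(?:\.\d+)?)")[0],
    errors="coerce",
)

# 4. Drop rows still missing a field the analysis depends on.
clean = clean.dropna(subset=["locality", "price_inr", "area_sqft"]).reset_index(drop=True)
print(clean[["locality", "price_inr", "area_sqft"]])
```

Dropping incomplete rows is only one option; imputing values may be preferable when missing data is widespread.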
- Data cleaning and preprocessing
- Group-by aggregation analysis
- Price comparison by locality and property type
- Visualization of price trends
- Statistical comparison of property categories
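As an illustration of comparing property categories statistically, the sketch below contrasts ready-to-move and under-construction listings and checks how area relates to price and to rate per square foot. The `status` labels, column names, and numbers are made-up assumptions, so the printed figures say nothing about the real Gurgaon market.

```python
import pandas as pd

# Hypothetical listings; labels, columns, and values are illustrative assumptions.
df = pd.DataFrame({
    "status": ["Ready to Move"] * 3 + ["Under Construction"] * 3,
    "price": [12_000_000, 15_000_000, 9_000_000, 10_000_000, 8_000_000, 6_000_000],
    "area_sqft": [1500, 2000, 1000, 1600, 1250, 1000],
})
df["rate_per_sqft"] = df["price"] / df["area_sqft"]

# Per-category summary statistics for price.
summary = df.groupby("status")["price"].agg(["count", "mean", "median"])

# Median-based premium of ready-to-move over under-construction, in percent.
premium_pct = (
    summary.loc["Ready to Move", "median"]
    / summary.loc["Under Construction", "median"]
    - 1
) * 100

# Pearson correlation of area with total price and with rate per square foot.
area_price_corr = df["area_sqft"].corr(df["price"])
area_rate_corr = df["area_sqft"].corr(df["rate_per_sqft"])

print(summary)
print(f"Median premium: {premium_pct:.1f}%")
print(f"area vs price: {area_price_corr:.2f}, area vs rate: {area_rate_corr:.2f}")
```

The median is used for the premium because a few very expensive listings can distort the mean. A weak or negative correlation between area and rate per square foot would suggest that larger homes are not always costlier per square foot.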
Gurgaon Real Estate Dataset (Kaggle)
Jaivika Agare