A machine learning project that classifies Iris flower species using Logistic Regression. This serves as my second machine learning model, built to demonstrate the foundational concepts of data preprocessing, feature utilization, and categorical classification.
The goal of this project is to predict the species of an Iris flower from its sepal and petal dimensions. Logistic Regression suits this task because the classes are largely linearly separable in these features (Iris-setosa completely so, while Iris-versicolor and Iris-virginica overlap slightly), so it is used here to classify each flower into one of three species.
This project utilizes the classic Iris Dataset, a staple benchmark in machine learning. The dataset contains 150 instances evenly balanced across three classes (50 samples each):
- Iris-setosa
- Iris-versicolor
- Iris-virginica
The model trains on four numeric structural attributes (measured in centimeters):
- Sepal Length
- Sepal Width
- Petal Length
- Petal Width
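As a quick check of the dataset description above, the snippet below loads the copy of Iris bundled with Scikit-Learn and counts samples per class. This is a minimal sketch that assumes the `load_iris` loader; the notebook may read the data from a CSV file instead, and the bundled copy names the classes `setosa`, `versicolor`, and `virginica` without the `Iris-` prefix.

```python
from collections import Counter

from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the Iris dataset.
iris = load_iris()
X, y = iris.data, iris.target

# Four numeric features per flower, all measured in centimeters.
print(iris.feature_names)
print(X.shape)

# Count how many samples belong to each species.
class_counts = Counter(str(iris.target_names[label]) for label in y)
print(class_counts)
```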
The project is built with the following tools:
- Python (Core language)
- NumPy & Pandas (Data manipulation and cleaning)
- Matplotlib & Seaborn (Exploratory data analysis and data visualization)
- Scikit-Learn (Model training, data splitting, scaling, and evaluation)
The workflow proceeds in five stages:
- Data Exploration: Analyzing feature distributions and plotting pair-plots to visualize the natural clustering and boundaries between species.
- Preprocessing: Standardizing features (rescaling each to zero mean and unit variance) and encoding the categorical target labels as integers.
- Data Splitting: Partitioning the data into separate training and testing sets so that performance is measured fairly on unseen samples.
- Model Training: Fitting a multi-class Logistic Regression model on the processed training set.
- Evaluation: Measuring performance with a confusion matrix, precision, recall, and overall classification accuracy.
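The preprocessing, splitting, training, and evaluation stages above can be sketched roughly as follows. This is an illustration rather than the notebook's exact code: the 80/20 stratified split, `random_state=42`, `max_iter=200`, and the use of `StandardScaler` inside a pipeline are all assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

iris = load_iris()
X = iris.data

# Recreate string labels to mimic a CSV source, then encode them as integers.
species = iris.target_names[iris.target]
encoder = LabelEncoder()
y = encoder.fit_transform(species)

# Hold out 20% of the samples, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# The scaler is fitted on the training data only, which avoids test-set leakage.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Confusion matrix, per-class precision and recall, and overall accuracy.
cm = confusion_matrix(y_test, y_pred)
acc = accuracy_score(y_test, y_pred)
print(cm)
print(classification_report(y_test, y_pred, target_names=encoder.classes_))
print(f"Accuracy: {acc:.3f}")
```

Wrapping the scaler and classifier in one pipeline means the same scaling learned from the training set is applied automatically whenever the model predicts on new data.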
To interact with the code, follow these steps:
- Clone this repository to your local machine:
git clone https://github.com
- Open the notebook file iris-classification-machine-learning-model.ipynb in your preferred environment:
  - Google Colab
  - Jupyter Notebook / JupyterLab
  - VS Code (with Jupyter Extension)
- Ensure you have the required libraries installed:
pip install numpy pandas matplotlib seaborn scikit-learn
- Run the code cells sequentially from top to bottom.