Qiaoli-Li-Res/histopathology-classification

Histopathology Image Classification

A deep learning project for colorectal cancer tissue classification using H&E-stained histopathology image patches. This repository fine-tunes an ImageNet-pretrained EfficientNet-B0 model on the NCT-CRC-HE-100K dataset and evaluates it on the independent CRC-VAL-HE-7K test set.

Overview

This project performs patch-level classification of colorectal histology images into 9 tissue categories:

  • ADI: adipose tissue
  • BACK: background
  • DEB: debris
  • LYM: lymphocytes
  • MUC: mucus
  • MUS: smooth muscle
  • NORM: normal colon mucosa
  • STR: cancer-associated stroma
  • TUM: colorectal adenocarcinoma epithelium
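
The nine categories above can be kept in one lookup table so that training and inference agree on label indices. This is a minimal sketch, not code from this repository: the names `CLASS_NAMES`, `CLASS_TO_IDX`, `IDX_TO_CLASS`, and `describe` are hypothetical, and the alphabetical index order is an assumption (it matches what a folder-per-class loader that sorts folder names would produce).

```python
# Abbreviation -> description, copied from the category list above.
CLASS_NAMES = {
    "ADI": "adipose tissue",
    "BACK": "background",
    "DEB": "debris",
    "LYM": "lymphocytes",
    "MUC": "mucus",
    "MUS": "smooth muscle",
    "NORM": "normal colon mucosa",
    "STR": "cancer-associated stroma",
    "TUM": "colorectal adenocarcinoma epithelium",
}

# Assumption: integer labels follow the alphabetical order of the abbreviations.
CLASSES = sorted(CLASS_NAMES)
CLASS_TO_IDX = {abbr: idx for idx, abbr in enumerate(CLASSES)}
IDX_TO_CLASS = {idx: abbr for abbr, idx in CLASS_TO_IDX.items()}


def describe(index: int) -> str:
    """Return a readable label such as 'ADI (adipose tissue)' for a class index."""
    abbr = IDX_TO_CLASS[index]
    return f"{abbr} ({CLASS_NAMES[abbr]})"
```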

The goal is to build a compact and reproducible baseline for colorectal tissue recognition using transfer learning.

Dataset

The project uses the public colorectal cancer histology datasets by Kather et al.

| Dataset | Usage | Size | Description |
|---|---|---|---|
| NCT-CRC-HE-100K | Training | 100,000 patches | H&E-stained colorectal tissue image patches |
| CRC-VAL-HE-7K | Testing | 7,180 patches | Independent external validation set |

All images are 224 x 224 pixel H&E-stained tissue patches. The patients in the external validation set do not overlap with those in the training set, which makes it suitable for evaluating model generalization.
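
Before training, it can help to confirm that the downloaded folders contain the expected number of patches per class. The sketch below assumes each dataset is extracted as one sub-folder per class abbreviation (for example `NCT-CRC-HE-100K/ADI/...`); that layout, the helper name `count_patches`, and the accepted file suffixes are assumptions, not something this repository specifies.

```python
from pathlib import Path

# Assumption: patches are stored as common image files; adjust as needed.
IMAGE_SUFFIXES = {".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def count_patches(root: str) -> dict[str, int]:
    """Count image files in each class sub-folder directly under `root`."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Summing the returned values should give 100,000 for the training set and 7,180 for the test set if the download is complete and the layout assumption holds.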

Method

The pipeline uses transfer learning with EfficientNet-B0:

  1. Load EfficientNet-B0 with ImageNet-pretrained weights.
  2. Replace the classifier head with a 9-class output layer.
  3. Resize images to 224 x 224 and normalize them with the ImageNet channel mean and standard deviation.
  4. Train with cross-entropy loss.
  5. Optimize using Adam.
  6. Save the best model according to test accuracy.

Model Configuration

| Component | Setting |
|---|---|
| Architecture | EfficientNet-B0 |
| Pretraining | ImageNet |
| Input size | 224 x 224 |
| Number of classes | 9 |
| Optimizer | Adam |
| Learning rate | 0.001 |
| Loss function | Cross-Entropy Loss |
| Batch size | 64 |
| Epochs | 10 |
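
The settings above can be collected in a single configuration object, which also makes the training length easy to work out. The sketch below is illustrative only: `TrainConfig` and its field names are hypothetical, and the step count assumes the last partial batch of each epoch is kept rather than dropped.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the Model Configuration table."""
    architecture: str = "efficientnet_b0"
    pretraining: str = "imagenet"
    input_size: int = 224
    num_classes: int = 9
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    loss: str = "cross_entropy"
    batch_size: int = 64
    epochs: int = 10

    def steps_per_epoch(self, num_samples: int) -> int:
        # Assumption: the final, smaller batch is still used.
        return math.ceil(num_samples / self.batch_size)

    def total_steps(self, num_samples: int) -> int:
        return self.epochs * self.steps_per_epoch(num_samples)


cfg = TrainConfig()
print(cfg.steps_per_epoch(100_000))  # 1563
print(cfg.total_steps(100_000))      # 15630
```

Under these assumptions, 10 epochs over the 100,000 training patches correspond to 15,630 optimizer updates.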

Results

| Metric | Value |
|---|---|
| Test Accuracy | 96.76% |
| Epochs | 10 |
| Batch Size | 64 |

The model reaches 96.76% accuracy on the independent CRC-VAL-HE-7K test set, which suggests that EfficientNet-B0 transfer learning is an effective baseline for colorectal histopathology patch classification.

Repository Structure

histopathology-classification/
├── README.md
├── models.py
├── train.py
├── predict.py
└── requirements.txt

| File | Description |
|---|---|
| models.py | Defines the EfficientNet-B0 classification model |
| train.py | Training and evaluation pipeline |
| predict.py | Inference script for single-image prediction |
| requirements.txt | Python dependencies |

Installation

Clone the repository:

git clone https://github.com/Qiaoli-Li-Res/histopathology-classification.git
cd histopathology-classification

Create a Python environment:

conda create -n histopathology python=3.10 -y
conda activate histopathology

Install dependencies:

pip install -r requirements.txt

Usage

Train the model

python train.py

This trains EfficientNet-B0 on NCT-CRC-HE-100K and evaluates the model on the CRC-VAL-HE-7K test set.

Run prediction

python predict.py

This script can be used to run inference on a histopathology image patch with the trained model.
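
At inference time, the model's nine output scores (logits) are typically converted to probabilities with a softmax, and the highest-scoring class is reported. The sketch below shows only that post-processing step in plain Python. It is not taken from `predict.py`; the function names are hypothetical and the class order is assumed to be alphabetical.

```python
import math

# Assumption: output index i corresponds to the i-th abbreviation below.
CLASSES = ["ADI", "BACK", "DEB", "LYM", "MUC", "MUS", "NORM", "STR", "TUM"]


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits: list[float]) -> tuple[str, float]:
    """Return the most likely class abbreviation and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]
```

For example, `top_prediction([0.0] * 8 + [5.0])` returns `"TUM"` together with its softmax probability.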

Example Workflow

# 1. Install dependencies
pip install -r requirements.txt

# 2. Train the classifier
python train.py

# 3. Run prediction
python predict.py

Notes

  • This project performs patch-level tissue classification, not whole-slide image diagnosis.
  • The reported result is based on the CRC-VAL-HE-7K external test set.
  • For clinical usage, additional validation, calibration, interpretability analysis, and expert pathology review are required.
  • Accuracy alone may not fully reflect model reliability, especially when class imbalance or staining variation exists.
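
The last point can be illustrated with a toy example: when one class dominates, overall accuracy can look acceptable while a minority class is never recognized. The sketch below uses made-up labels, not results from this project, and the function names are hypothetical.

```python
from collections import defaultdict


def accuracy(y_true, y_pred) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred) -> dict:
    """For each true class, the fraction of its samples predicted correctly."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return {c: correct[c] / total[c] for c in total}


def balanced_accuracy(y_true, y_pred) -> float:
    """Mean of per-class recalls, so every class counts equally."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Toy data: 8 TUM patches, 2 MUC patches, and a model that always says TUM.
y_true = ["TUM"] * 8 + ["MUC"] * 2
y_pred = ["TUM"] * 10
print(accuracy(y_true, y_pred))           # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Reporting per-class recall or balanced accuracy alongside overall accuracy makes this kind of failure visible.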

References

  • Kather, J. N., Halama, N., & Marx, A. 100,000 histological images of human colorectal cancer and healthy tissue. Zenodo, 2018. https://doi.org/10.5281/zenodo.1214456
  • Tan, M., & Le, Q. V. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML, 2019.
