
Hi, I'm Mahan Balooei 👋🧬

🎓 Master's student in Bioinformatics at Alma Mater Studiorum – Università di Bologna
🔬 Focused on machine learning for biological sequences, protein language models, and computational biology
📍 Modena, Italy


🧠 About Me

I am an MSc Bioinformatics student at the University of Bologna, focusing on protein sequence modelling, protein language models, and ML-based sequence-function prediction.

My current work includes ESM-2-based protein sequence classification, HMMER/MMseqs2 workflows for protein domain annotation, and computational biology pipelines. I am especially interested in applying machine learning to protein function prediction, protein representation learning, and generative protein design.


🛠️ Tech Stack

Languages

Python, R, Bash

ML & Data Science

scikit-learn, NumPy, Pandas, Matplotlib

Bioinformatics

HMMER, Bioconductor, minfi

Tools

Git, Jupyter, Conda, Linux


📌 Featured Projects

eukaryotic-signal-peptide-prediction
A protein sequence-function prediction pipeline for eukaryotic signal peptide detection. It uses UniProtKB/Swiss-Prot annotations, MMseqs2 redundancy reduction, classical biological baselines, SVMs with biochemical features, and CNN-BiLSTM models on ESM-2 protein language model embeddings.
Python, PyTorch, ESM-2, MMseqs2, Protein Language Models, Bioinformatics


HMM_KunitzDomain
A protein domain annotation workflow that uses profile Hidden Markov Models, HMMER, and MMseqs2 to compare sequence-based and structure-informed approaches for Kunitz domain detection. The structure-based HMM achieved a peak MCC of 0.997.
Python, HMMER, MMseqs2, Protein Domains, Structural Bioinformatics
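
For readers unfamiliar with the metric reported above, the Matthews correlation coefficient (MCC) summarizes a binary confusion matrix in a single value between -1 and 1. The following is a minimal sketch of the standard formula, not the project's evaluation code; the function name `mcc` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a marginal is empty and the
    # denominator would be zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not the project's results).
print(round(mcc(tp=90, tn=95, fp=5, fn=10), 3))
```

Unlike accuracy, MCC stays informative when positives are rare, which is typically the case when scanning a large protein set for a single domain family.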


epigenetic-methylation-450k
Differential DNA methylation analysis of CpG sites between healthy and diseased individuals using Illumina HumanMethylation450k array data. The workflow covers preprocessing, normalization, PCA-based quality assessment, and differential methylation testing.
R, Bioconductor, minfi, Epigenomics


📊 GitHub Stats

[Images: Mahan's GitHub stats; Top Languages]


📫 Contact

Pinned

  1. epigenetic-methylation-450k (Public)

    Differential DNA methylation analysis of CpG sites between healthy and diseased individuals using Illumina HumanMethylation450k array data, R, and Bioconductor.

    R

  2. HMM_KunitzDomain (Public)

    Building structure-informed Profile HMMs for Kunitz/BPTI-type protease inhibitor domain detection, comparing sequence-based vs. structure-based approaches with 2-fold cross-validation.

    Jupyter Notebook

  3. eukaryotic-signal-peptide-prediction (Public)

    A protein ML pipeline for eukaryotic signal peptide prediction using UniProtKB, MMseqs2, classical baselines, SVM biochemical features, and a CNN-BiLSTM classifier on ESM-2 protein embeddings.

    Jupyter Notebook