louiske65/CRE-Prediction

Predicting CRE-Gene Interactions from DNA Sequence by Fine-Tuning Enformer with Single-Cell Multiome Data

This repository contains the data preparation and training scripts. The feature_linkages folder, hg38.pk, and gencode.v32.annotation.gtf are omitted because of GitHub storage limits.

The omitted files and the processed data are available on Google Drive: https://drive.google.com/drive/folders/1u6fTEUJmviggkTk2OXMYRfot0fvLfZj8?usp=drive_link

Although deep learning models like Enformer perform well on a variety of genomic prediction tasks, they struggle to accurately model variation in gene expression across individuals. This suggests a gap in the model's understanding of how cis-regulatory elements (CREs) and genes interact. To address this, we hypothesized that training a model to explicitly predict CRE-gene linkages would improve its regulatory understanding. In this study, we fine-tuned Enformer on the 10x Genomics Human PBMC Single-Cell Multiome dataset, using its paired chromatin accessibility and gene expression data to generate high-confidence peak-gene linkages. We used a supervised learning approach with a balanced mean squared error loss function and added explicit distance tracks to the model input. We compared a linear probe baseline against a model in which the last transformer layer was fine-tuned. The baseline model failed to localize regulatory elements ($r=0.1296$), whereas the fine-tuned model reached a substantially higher Pearson correlation on peak regions ($r=0.6168$). These results indicate that although Enformer may encode rich representations for many genomic tasks, task-specific fine-tuning is needed to adapt it to predicting gene regulation.
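To make the loss and the evaluation metric concrete, here is a minimal NumPy sketch. It is not taken from the repository's training scripts. The function names `balanced_mse` and `pearson_on_peaks`, the boolean `peak_mask`, and the reading of "balanced" as an equal-weight average of the peak and non-peak errors are all assumptions made for illustration.

```python
import numpy as np


def balanced_mse(pred, target, peak_mask):
    """Assumed form of a balanced MSE: compute the mean squared error
    separately over peak and non-peak positions, then average the two,
    so that the (usually rare) peak positions are not swamped."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    peak_mask = np.asarray(peak_mask, dtype=bool)

    sq_err = (pred - target) ** 2
    group_errors = []
    for mask in (peak_mask, ~peak_mask):
        if mask.any():  # skip a group that has no positions
            group_errors.append(sq_err[mask].mean())
    return float(np.mean(group_errors))


def pearson_on_peaks(pred, target, peak_mask):
    """Pearson correlation restricted to positions flagged as peaks."""
    peak_mask = np.asarray(peak_mask, dtype=bool)
    p = np.asarray(pred, dtype=float)[peak_mask]
    t = np.asarray(target, dtype=float)[peak_mask]
    return float(np.corrcoef(p, t)[0, 1])


if __name__ == "__main__":
    # Toy data: one peak position out of four, and only the peak is mispredicted.
    pred = [0.0, 0.0, 0.0, 1.0]
    target = [0.0, 0.0, 0.0, 0.0]
    mask = [False, False, False, True]

    plain = float(np.mean((np.array(pred) - np.array(target)) ** 2))
    print("plain MSE:   ", plain)                             # 0.25
    print("balanced MSE:", balanced_mse(pred, target, mask))  # 0.5
```

In the toy example the single peak error contributes a quarter of the plain MSE but half of the balanced one, which is the motivation for reweighting when linked peaks are sparse relative to background. The weighting actually used in this project may differ.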
