This project is a real-time, explainable system for detecting misinformation in AI-generated text. It combines LLMs (such as GPT-3.5) with fact-checking APIs, knowledge graph queries, and semantic similarity models to classify each claim as either Reliable or Misinformation, and reports a confidence score with a detailed breakdown of the reasoning.
Live Demo: Watch the 5-minute project demo on YouTube
- Claim-level misinformation detection
- LLM-based semantic similarity (Sentence-BERT)
- Stance detection using DistilBERT sentiment analysis
- Fallback logic for claims that are novel or unseen
- Real-time classification via Streamlit app
- Detailed explanation of predictions (stance, source quality, similarity)
- Preloaded reliable claim set + Phase 5 embeddings
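The features above imply that each prediction carries a label, a confidence score, and a per-signal breakdown. A minimal sketch of such a result record follows. The `Prediction` class and its field names are hypothetical and chosen for illustration; they are not taken from `app.py`.

```python
from dataclasses import dataclass, field


@dataclass
class Prediction:
    """Hypothetical container for one classified claim (not the project's actual schema)."""

    claim: str
    label: str          # "Reliable" or "Misinformation"
    confidence: float   # assumed to lie in [0, 1]
    breakdown: dict = field(default_factory=dict)  # e.g. stance, source_quality, similarity

    def explain(self) -> str:
        """Render the label, confidence, and per-signal scores as readable text."""
        lines = [f"{self.label} ({self.confidence:.0%} confidence)"]
        for name, score in sorted(self.breakdown.items()):
            lines.append(f"  {name}: {score:.2f}")
        return "\n".join(lines)
```

Keeping the breakdown as a plain dictionary means new signals can be added to the explanation without changing the class.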
Screenshots of a Reliable Claim result and a Misinformation result are in screenshots/.
1. Input Claim: The user submits a natural-language factual claim.
2. Semantic Matching & Verification: The system checks similarity to known verified claims and searches knowledge bases.
3. Fallback Model: If no match is found, the system estimates stance, semantic consistency, and source quality.
4. Classification: Outputs a label, Reliable ✅ or Misinformation ❌, with a confidence percentage.
5. Explainability: Shows why the claim was classified that way (stance score, source quality, etc.).
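The match-then-fallback flow described above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: the function names, the 0.8 match threshold, the `(embedding, is_reliable)` pair format, and the signal names are all hypothetical, and a simple average stands in for the trained classifier stored in `production_model_v2.pkl`.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(claim_vec, verified, signals, match_threshold=0.8):
    """Label a claim from its nearest verified claim, or fall back to averaged signals.

    verified: list of (embedding, is_reliable) pairs for known claims (assumed format).
    signals:  dict of fallback scores in [0, 1], e.g. stance, consistency, source_quality.
    """
    if not signals:
        raise ValueError("at least one fallback signal is required")

    # Find the most similar known claim; (0.0, False) is used when none are loaded.
    best_sim, best_reliable = max(
        ((cosine(claim_vec, emb), rel) for emb, rel in verified),
        key=lambda pair: pair[0],
        default=(0.0, False),
    )
    if best_sim >= match_threshold:
        label = "Reliable" if best_reliable else "Misinformation"
        return {"label": label, "confidence": best_sim, "path": "match"}

    # Fallback: a plain average stands in for the project's trained classifier.
    score = sum(signals.values()) / len(signals)
    label = "Reliable" if score >= 0.5 else "Misinformation"
    return {"label": label, "confidence": max(score, 1 - score), "path": "fallback"}
```

Returning the `path` alongside the label lets the explanation step say whether the result came from a verified match or from the fallback estimate.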
1. Clone the repository:

       git clone https://github.com/rishika7006/misinformation-detector.git
       cd misinformation-detector

2. Create a virtual environment and activate it:

       python -m venv venv
       source venv/bin/activate  # On Windows: venv\Scripts\activate

3. Install dependencies:

       pip install -r requirements.txt

4. Place the following model and data files in the root directory:
   - production_model_v2.pkl
   - phase5_claims.json
   - phase5_embeddings.npy

5. Start the app, then visit http://localhost:8501 in your browser:

       streamlit run app.py
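The setup above expects three model and data files in the repository root. A small pre-flight check, sketched below, can report which ones are missing before the app is launched. The `missing_artifacts` helper is hypothetical and not part of the repository; only the three file names come from this README.

```python
from pathlib import Path

# File names listed in this README; the helper itself is an illustrative addition.
REQUIRED_FILES = (
    "production_model_v2.pkl",
    "phase5_claims.json",
    "phase5_embeddings.npy",
)


def missing_artifacts(root="."):
    """Return the required file names that are not present under `root`."""
    root = Path(root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_artifacts()` from the repository root returns an empty list when all three files are in place.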
Project structure:

    .
    ├── app.py                   # Streamlit app entry point
    ├── requirements.txt         # Python dependencies
    ├── production_model_v2.pkl  # Trained classifier model
    ├── phase5_claims.json       # Pre-verified claims with metadata
    ├── phase5_embeddings.npy    # Embeddings for phase 5 verified claims
    ├── screenshots/             # App screenshots for README
    └── README.md

- Streamlit – for interactive web app
- Transformers (HuggingFace) – for stance detection
- Sentence-BERT (SBERT) – for semantic similarity
- scikit-learn – for final classification
- Wikidata / DBPedia – for future knowledge graph integration
- OpenAI GPT-3.5 (for claim extraction & guidance)
- HuggingFace Transformers
- Streamlit for fast UI development
- FEVER, LIAR datasets for reference
- Course: CS 6375 – Machine Learning
- Instructor: Dr. Wei Yang
- Institution: The University of Texas at Dallas
This project is for educational purposes only. All data and model outputs are used in accordance with academic fair use.

