Instagram Post Analysis

Bauhaus-Universität Weimar - MSc in Computer Science for Digital Media -Research Project I

📌 Summary

This project explores the automatic classification of Instagram posts into sentiment categories (positive, neutral, negative) using multimodal models. Inspired by the concept of “Information Nutrition Labels,” the project aims to provide a clearer understanding of online content quality through image-caption analysis.

🎯 Objective

The goal was to analyze social media posts through a machine learning pipeline and compare human-labeled results with AI-generated predictions. Multiple approaches, including zero-shot and fine-tuned models, were explored to improve accuracy and robustness in classifying online data.

🧠 Background

The idea stems from the article Shaping the Information Nutrition Label, which proposes labeling online content similarly to food nutrition labels. The project adapts this idea to Instagram data, focusing on the semantic quality and sentiment of visual + textual content.

📦 Dataset

Custom Dataset A: 132 Instagram posts manually labeled by the author.

External Dataset B: 952 Instagram posts with extended coverage.

Challenges faced:

Class imbalance (initially few negative samples).
Language mismatch due to non-English captions (translation issues).

Final dataset: Combined English-only captions from both sets.

🛠️ Technologies & Tools

Data collection: 4Cat, Instagram

Figure 1. Data collection tool

Figure 2. Instagram accounts to collect data

Figure 3. Instagram post hashtags to collect data

Image model: ResNet50

Text model: BERT

Multimodal model: Combined ResNet50 + BERT pipeline

Prompt-based model: LLaVA (Large Language and Vision Assistant)

Frameworks: PyTorch, HuggingFace, CLIP

🔍 Methodology

Approaches:

1. Zero-Shot Classification with LLaVA on both datasets.

2. Feature Extraction: ResNet50 + BERT, using frozen weights except last layers.

3. Fine-Tuning:

    On Custom Dataset A
    On Custom Dataset A + External Dataset B

Approach	Dataset	Accuracy	Fine-Tuned
LLaVA (zero-shot)	Custom Dataset A	0.3401	No
LLaVA (zero-shot)	Combined Dataset (A + B)	0.3504	No
ResNet50+BERT (FE)	Custom Dataset A	0.6500	Partial
ResNet50+BERT (FE)	Combined Dataset (A + B)	0.7431	Partial
ResNet50+BERT (FT)	Custom Dataset A	0.5500	Full
ResNet50+BERT (FT)	Combined Dataset (A + B)	0.8194	Full

📈 Impact

Demonstrated that even small, real-world datasets can be used to train effective multimodal sentiment classifiers.
Showed that fine-tuned models significantly outperform zero-shot prompting.
Validated the viability of applying “information nutrition labels” to social media using machine learning.

📝 Notes

Translating captions from non-English sources introduced semantic loss; future work could explore multilingual embeddings.
Manual labeling was time-consuming but necessary to build initial benchmarks.
The best-performing model was fine-tuned ResNet50+BERT on the combined dataset, reaching over 81% accuracy.
Prompting alone (LLaVA) is not yet reliable for this task without additional tuning.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
config		config
images		images
multimodal_fusion		multimodal_fusion
src		src
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Instagram Post Analysis

📌 Summary

🎯 Objective

🧠 Background

📦 Dataset

🛠️ Technologies & Tools

🔍 Methodology

Approaches:

1. Zero-Shot Classification with LLaVA on both datasets.

2. Feature Extraction: ResNet50 + BERT, using frozen weights except last layers.

3. Fine-Tuning:

📈 Impact

📝 Notes

✅ Future Work

1. Collect more balanced datasets (especially negative examples).

2. Experiment with newer multimodal models like BLIP-2, Flamingo, or GIT.

3. Develop a web interface for visualizing sentiment analysis as a nutrition label.

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Instagram Post Analysis

📌 Summary

🎯 Objective

🧠 Background

📦 Dataset

🛠️ Technologies & Tools

🔍 Methodology

Approaches:

1. Zero-Shot Classification with LLaVA on both datasets.

2. Feature Extraction: ResNet50 + BERT, using frozen weights except last layers.

3. Fine-Tuning:

📈 Impact

📝 Notes

✅ Future Work

1. Collect more balanced datasets (especially negative examples).

2. Experiment with newer multimodal models like BLIP-2, Flamingo, or GIT.

3. Develop a web interface for visualizing sentiment analysis as a nutrition label.

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages