3D Models

Video-level classification: the input is 16 consecutive frames from a clip → one prediction per clip.

Pipeline

| Folder | Backbone | Batch × accum | Effective batch | Epochs | Default loss | Notes |
|---|---|---|---|---|---|---|
| `swin3d` | `torchvision.models.video.swin3d_t` (K400) | 2 × 8 | 16 | 40 | `asl` | SWA from epoch 30, EigenCAM |
| `videomae` | `OpenGVLab/VideoMAEv2-Base` (HF) | 8 × 16 | 128 | 20 | `focal` | `trust_remote_code=True` |
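
The effective batch column is the per-step batch multiplied by the number of gradient-accumulation steps. The sketch below reproduces that arithmetic and shows when an optimizer step would fire in a typical accumulation loop. The function names are hypothetical, and stepping once more on the final partial group is an assumption, not something confirmed by the repository.

```python
from __future__ import annotations


def effective_batch(batch_size: int, accum_steps: int) -> int:
    """Number of samples that contribute to one optimizer update."""
    return batch_size * accum_steps


def optimizer_step_indices(num_batches: int, accum_steps: int) -> list[int]:
    """Batch indices after which an optimizer step would run.

    Assumption: a step fires every `accum_steps` batches, plus one final
    step on the last batch so leftover gradients are not dropped.
    """
    steps = [i for i in range(num_batches) if (i + 1) % accum_steps == 0]
    if num_batches and (num_batches - 1) not in steps:
        steps.append(num_batches - 1)
    return steps


print(effective_batch(2, 8))    # swin3d: 2 x 8
print(effective_batch(8, 16))   # videomae: 8 x 16
```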

Common Configuration

| Parameter | swin3d | videomae |
|---|---|---|
| Input shape | `(B, 3, 16, 224, 224)` | `(B, 16, 3, 224, 224)` |
| Normalization | ImageNet mean/std | ImageNet mean/std |
| Backbone LR | 5e-5 | 1e-4 |
| Head LR | 5e-4 (backbone × 10) | 1e-3 (backbone × 10) |
| Weight decay | 3e-4 | 1e-4 |
| Freeze epochs | 3 | 5 |
| Warmup epochs | n/a (OneCycleLR, pct_start=0.1) | 2 (linear, after unfreeze) |
| Grad clip | 1.0 | 1.0 |
| Dropout | 0.5 | 0.5 |
| Early stopping | patience=8 | n/a |
| Seed | 42 | 42 |
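
The backbone and head learning rates differ by a factor of 10, which is commonly done with optimizer parameter groups. The following is a minimal sketch under stated assumptions: `build_param_groups` and the `head.` name prefix are hypothetical, and applying the same weight decay to both groups is an assumption. The returned list of dicts has the shape that `torch.optim` optimizers accept, and `named_params` can be the output of `model.named_parameters()`.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Values taken from the table above.
CONFIG = {
    "swin3d": {"lr_backbone": 5e-5, "weight_decay": 3e-4},
    "videomae": {"lr_backbone": 1e-4, "weight_decay": 1e-4},
}
HEAD_LR_MULT = 10  # head LR = backbone LR x 10


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    pipeline: str,
    head_prefix: str = "head.",  # hypothetical naming of the classifier head
) -> List[Dict[str, Any]]:
    """Split parameters into backbone and head groups with separate LRs."""
    cfg = CONFIG[pipeline]
    backbone, head = [], []
    for name, param in named_params:
        (head if name.startswith(head_prefix) else backbone).append(param)
    return [
        {"params": backbone, "lr": cfg["lr_backbone"],
         "weight_decay": cfg["weight_decay"]},
        {"params": head, "lr": cfg["lr_backbone"] * HEAD_LR_MULT,
         "weight_decay": cfg["weight_decay"]},
    ]
```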

Swin3D uses OneCycleLR (with 10% warmup) during the normal phase, then switches to SWALR (swa_lr=1e-5) in the SWA phase starting at epoch 30. VideoMAE uses ConstantLR while the backbone is frozen → LinearLR warmup → CosineAnnealingLR.
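
To make the shape of the VideoMAE schedule concrete, the sketch below computes a per-epoch learning-rate multiplier for the three phases, using the freeze, warmup, and total epoch counts from the tables. The warmup start factor of 0.1, a multiplier of 1.0 during the frozen phase, a cosine decay to zero, and per-epoch stepping are all assumptions. In PyTorch the same shape could be built by chaining the three schedulers with `torch.optim.lr_scheduler.SequentialLR`; whether the repository does so is not stated here.

```python
import math

FREEZE_EPOCHS = 5    # from the configuration table
WARMUP_EPOCHS = 2    # from the configuration table
TOTAL_EPOCHS = 20    # from the pipeline table
WARMUP_START = 0.1   # assumption: starting factor of the linear warmup


def lr_multiplier(epoch: int) -> float:
    """Multiplier applied to the base LR at a given (0-indexed) epoch."""
    if epoch < FREEZE_EPOCHS:
        return 1.0  # assumption: the constant phase keeps the base LR
    warmup_end = FREEZE_EPOCHS + WARMUP_EPOCHS
    if epoch < warmup_end:
        progress = (epoch - FREEZE_EPOCHS) / WARMUP_EPOCHS
        return WARMUP_START + (1.0 - WARMUP_START) * progress
    # Cosine annealing over the remaining epochs, decaying towards 0.
    cosine_len = TOTAL_EPOCHS - warmup_end
    t = min(epoch - warmup_end, cosine_len)
    return 0.5 * (1.0 + math.cos(math.pi * t / cosine_len))


for epoch in range(TOTAL_EPOCHS):
    print(epoch, round(lr_multiplier(epoch), 4))
```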

Per-pipeline structure

<model>/
├── README.md
├── config.py
├── dataset.py
├── model.py
├── loss.py
├── train.py
├── sweep.py
├── _sweep_worker.py
├── gradcam.py
└── crop_faces_video.py   # videomae only

Input & Dataset

Temporal subsampling: NUM_FRAMES = 16 frames per clip
Per-frame resolution: IMG_SIZE = 224 × 224
Swin3D: the dataset consists of frame folders (frame_00.jpg through frame_15.jpg)
VideoMAE: the dataset consists of MP4 video files (read via OpenCV)
Label CSVs: Label3d/train.csv, Label3d/val.csv, Label3d/test.csv
Columns: video_path, Boredom, Engagement, Confusion, Frustration
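
As an illustration of the layout described above, the sketch below builds the 16 frame paths for one Swin3D clip folder and parses a label CSV with the listed columns. The helper names are hypothetical, the sample row is made up for demonstration, and parsing label values as floats is an assumption about the CSV contents.

```python
import csv
import io
from pathlib import Path

NUM_FRAMES = 16
LABELS = ["Boredom", "Engagement", "Confusion", "Frustration"]


def frame_paths(clip_dir):
    """Paths frame_00.jpg .. frame_15.jpg inside one clip folder."""
    return [Path(clip_dir) / f"frame_{i:02d}.jpg" for i in range(NUM_FRAMES)]


def read_labels(csv_file):
    """Return (video_path, [4 label values]) for every row of a label CSV."""
    rows = []
    for row in csv.DictReader(csv_file):
        target = [float(row[name]) for name in LABELS]  # assumes numeric labels
        rows.append((row["video_path"], target))
    return rows


# Made-up row, only to show the expected column layout.
sample = io.StringIO(
    "video_path,Boredom,Engagement,Confusion,Frustration\n"
    "clips/example_clip,0,1,0,0\n"
)
for video_path, target in read_labels(sample):
    print(video_path, target, frame_paths(video_path)[0].name)
```

With a real file, `read_labels(open("Label3d/train.csv", newline=""))` would be the equivalent call.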

Common Conventions

Multi-label sigmoid over 4 emotions: Boredom, Engagement, Confusion, Frustration
Per-label threshold search on one half of the validation set; evaluation on the other half (avoids leakage)
eval_criterion omits pos_weight for val/test so that losses are comparable across runs
Each sweep run is executed in a separate subprocess via _sweep_worker.py
VideoMAE requires running crop_faces_video.py first for the supercrop/faceonly modes
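
The split-validation threshold search can be sketched as follows: thresholds are tuned per label on one half of the validation predictions and scored on the other half, so the reported metric never sees the samples used for tuning. Using F1 as the selection metric, a 0.05 to 0.95 grid, and a seeded random split are assumptions, and all function names are hypothetical.

```python
import random

LABELS = ["Boredom", "Engagement", "Confusion", "Frustration"]


def f1_score(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def split_halves(n, seed=42):
    """Shuffle sample indices and cut them into two disjoint halves."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[: n // 2], idx[n // 2:]


def best_threshold(probs, targets, grid):
    """First grid value with the highest F1 on the given samples."""
    best_t, best_f1 = 0.5, -1.0
    for t in grid:
        score = f1_score(targets, [1 if p >= t else 0 for p in probs])
        if score > best_f1:
            best_t, best_f1 = t, score
    return best_t


def search_thresholds(probs, targets, seed=42):
    """probs/targets: one list per sample, with one entry per label."""
    search_idx, eval_idx = split_halves(len(probs), seed)
    grid = [i / 100 for i in range(5, 96, 5)]
    thresholds, scores = [], []
    for j in range(len(LABELS)):
        t = best_threshold([probs[i][j] for i in search_idx],
                           [targets[i][j] for i in search_idx], grid)
        preds = [1 if probs[i][j] >= t else 0 for i in eval_idx]
        thresholds.append(t)
        scores.append(f1_score([targets[i][j] for i in eval_idx], preds))
    return thresholds, scores
```

The first return value holds one threshold per label; the second holds the held-out F1 for each label at that threshold.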

For full details on parameters, visualization, and usage, see each pipeline's README.
