New standalone project built from your voice-emotion baseline concept, without modifying:
- emotion-voice-app (GitHub)
- ShiroOnigami23/emotion-voice-engine (HF model)
Unified hybrid emotion detection with three modalities from one video clip:
- Audio signal extracted from video
- Face expression from key frame extraction
- Temporal video branch (frame sequence)
Final output is one fused emotion prediction from all three models working together.
Emotion classes (expanded): angry, calm, disgust, fear, happy, neutral, sad, surprise
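As a rough illustration of how three per-branch outputs could be combined into one prediction, the sketch below uses weighted averaging of class probabilities (late fusion). This is an assumption for illustration only: the actual fusion rule and its weights are defined by the training kernel and `fusion_config.json`, and the function name `fuse_predictions` and the class ordering are hypothetical.

```python
# Class names from the list above; the index order used by the real models is an assumption.
CLASSES = ["angry", "calm", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def fuse_predictions(audio_probs, face_probs, video_probs, weights=(1.0, 1.0, 1.0)):
    """Late fusion sketch: weighted average of per-class probabilities.

    Each *_probs argument is a sequence of len(CLASSES) probabilities.
    `weights` is hypothetical; a real run might read it from fusion_config.json.
    """
    branches = (audio_probs, face_probs, video_probs)
    if any(len(p) != len(CLASSES) for p in branches):
        raise ValueError("each branch must provide one probability per class")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean per class across the three branches.
    fused = [
        sum(w * p[i] for w, p in zip(weights, branches)) / total_weight
        for i in range(len(CLASSES))
    ]
    best = max(range(len(CLASSES)), key=fused.__getitem__)
    return CLASSES[best], fused
```

With equal weights, two branches that lean towards `happy` outvote a third that leans towards `sad`; raising one branch's weight lets it dominate the fused result.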
- Hugging Face Space (connected app): https://huggingface.co/spaces/ShiroOnigami23/emotion-multimodal-app
- Hugging Face model repo: https://huggingface.co/ShiroOnigami23/emotion-multimodal-engine
Kernel path: kaggle_kernel/
Datasets used:
- uwrfkaggler/ravdess-emotional-speech-audio
- ejlok1/cremad
- adrivg/ravdess-emotional-speech-video
- astraszab/facial-expression-dataset-image-folders-fer2013
Note: FER2013 folder labels are numeric (0..6) and are mapped internally to emotion classes.
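The numeric-to-name mapping mentioned in the note could look like the sketch below. It assumes the dataset folders follow the original FER2013 index convention (0=angry, 1=disgust, 2=fear, 3=happy, 4=sad, 5=surprise, 6=neutral); the mapping actually used inside the training kernel may differ, and the helper name `fer_folder_to_class_index` is hypothetical.

```python
# Assumed FER2013 convention; verify against the dataset before relying on it.
FER2013_LABELS = {
    0: "angry",
    1: "disgust",
    2: "fear",
    3: "happy",
    4: "sad",
    5: "surprise",
    6: "neutral",
}

# Project class list; the index order is an assumption for illustration.
CLASSES = ["angry", "calm", "disgust", "fear", "happy", "neutral", "sad", "surprise"]


def fer_folder_to_class_index(folder_name: str) -> int:
    """Map a numeric FER2013 folder name such as "3" to an index in CLASSES."""
    label = FER2013_LABELS[int(folder_name)]
    return CLASSES.index(label)
```

Under this assumption FER2013 contributes no `calm` samples, so that class would have to come from the other datasets.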
Start long run without waiting:
python scripts/start_kaggle_training.py --owner aryanchande23l
Check later (quick):
python scripts/check_kaggle_status.py --owner aryanchande23l
Auto-retry until compatible GPU allocation:
python scripts/relaunch_until_gpu_compatible.py --owner aryanchande23l --max-attempts 5
When complete:
kaggle kernels output aryanchande23l/emotion-multimodal-hybrid-trainer-v1 -p kaggle_pull
Expected outputs:
- audio_model.pt
- face_model.pt
- video_model.pt
- fusion_config.json
- metrics.json
- run_version.json
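Before uploading, it may help to confirm that the pull produced every expected file. The sketch below is not part of the repository's scripts; the helper name `missing_outputs` is hypothetical, and it only checks that each file exists in the output directory (default `kaggle_pull`), not that its contents are valid.

```python
from pathlib import Path

# File names taken from the expected outputs listed above.
EXPECTED_OUTPUTS = [
    "audio_model.pt",
    "face_model.pt",
    "video_model.pt",
    "fusion_config.json",
    "metrics.json",
    "run_version.json",
]


def missing_outputs(outputs_dir="kaggle_pull"):
    """Return the expected file names that are not present in outputs_dir."""
    root = Path(outputs_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]
```

Calling `missing_outputs()` after the pull returns an empty list when all six files are in place; any names it returns point to a run that did not finish or a partial download.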
Upload the trained outputs to the Hugging Face model repo:
set HF_TOKEN=YOUR_TOKEN
python scripts/upload_to_hf.py --model-repo-id ShiroOnigami23/emotion-multimodal-engine --outputs-dir kaggle_pull
Run the app locally:
pip install -r requirements.txt
streamlit run app.py
Publish the Hugging Face Space:
set HF_TOKEN=YOUR_TOKEN
python scripts/publish_space.py --space-id ShiroOnigami23/emotion-multimodal-app
Android wrapper project is under android_app/ and opens the deployed HF Space.
- CI workflow: .github/workflows/android-release.yml
- Trigger release by pushing a tag (example v1.0.0) or using workflow dispatch.
- Signed APK is uploaded to GitHub Releases automatically.
This is an affect-recognition research tool. It is not a diagnostic, clinical, legal, or hiring decision system.