MOVA: Open-Source Synchronized Video & Audio Gen

MOVA is an open-source model that generates video and audio together, rather than producing them separately and stitching them together after the fact.

Quick start:

Two model options on Hugging Face:

MOVA-360p - https://huggingface.co/OpenMOSS-Team/MOVA-360p (faster, lower resolution)
MOVA-720p - https://huggingface.co/OpenMOSS-Team/MOVA-720p (better quality)
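
To pull either checkpoint locally, something like the following should work. This is a minimal sketch, assuming the huggingface_hub package is installed (pip install huggingface_hub). The repo IDs come from the links above; the helper names repo_for and download_mova and the checkpoints folder are made up for illustration. It only fetches the files; how to actually run inference is up to the model card.

```python
from pathlib import Path

# Repo IDs taken from the Hugging Face links above.
MOVA_REPOS = {
    "360p": "OpenMOSS-Team/MOVA-360p",
    "720p": "OpenMOSS-Team/MOVA-720p",
}


def repo_for(variant: str) -> str:
    """Return the Hugging Face repo ID for a MOVA variant ("360p" or "720p")."""
    if variant not in MOVA_REPOS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {sorted(MOVA_REPOS)}"
        )
    return MOVA_REPOS[variant]


def download_mova(variant: str = "360p", target_dir: str = "checkpoints") -> Path:
    """Download every file in the chosen repo and return the local folder.

    Hypothetical helper: the folder layout is an assumption, not something
    the MOVA project prescribes.
    """
    # Imported lazily so repo_for() works even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    repo_id = repo_for(variant)
    local_dir = Path(target_dir) / f"MOVA-{variant}"
    snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return local_dir
```

Calling download_mova("720p") would fetch the higher-quality checkpoint instead. Expect a large download either way, so check free disk space first.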

The interesting bit is the synchronized generation: audio and visual elements stay matched throughout instead of drifting apart, as they can with typical post-sync approaches. This seems particularly useful for dialogue scenes, where lip-sync matters.

Original announcement: https://x.com/Open_MOSS/status/2016820157684056172

Worth checking out if you are working on video generation where audio timing actually matters.
