
MOVA: Open-Source Model Generates Synced Video+Audio

MOVA is an open-source AI model that generates synchronized video and audio content together, enabling creators to produce multimodal media.

Someone found MOVA, an open-source model that generates video and audio together instead of separately.

Available models:

The interesting part is how it keeps audio and video in sync. Most models generate them separately and then try to match them up, which gets messy. MOVA does both at once.
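To make "in sync" a bit more concrete, here is a rough sketch of the bookkeeping a separate-then-match pipeline has to get right. It is not MOVA's code or method. The function names are made up for illustration, and the 24 fps and 48 kHz values are assumed example numbers, not MOVA's actual settings.

```python
# Illustrative only: names and default values are assumptions, not taken from MOVA.

def frame_to_sample_range(frame_index, fps=24, sample_rate=48_000):
    """Return the [start, end) audio sample indices that play during one video frame."""
    start = round(frame_index * sample_rate / fps)
    end = round((frame_index + 1) * sample_rate / fps)
    return start, end


def drift_ms(num_frames, fps, audio_samples, sample_rate):
    """Audio duration minus video duration, in milliseconds (positive = audio runs long)."""
    video_seconds = num_frames / fps
    audio_seconds = audio_samples / sample_rate
    return (audio_seconds - video_seconds) * 1000.0


if __name__ == "__main__":
    print(frame_to_sample_range(0))
    print(drift_ms(240, 24, 482_400, 48_000))
```

When the two streams come from separate models, nothing guarantees the drift is zero or that an audio event lands inside the sample range of the frame where it visibly happens. Generating both together is meant to avoid having to patch that up afterwards.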

Setup from their repo:

Runs on consumer GPUs apparently, though 720p needs beefier hardware. The 360p version works fine for testing things out without burning through cloud credits.
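For a rough sense of why 720p is heavier, a back-of-the-envelope pixel count helps. This assumes 16:9 frames, which the post does not confirm, and the helper name is hypothetical. Actual memory use depends on the model and will not scale exactly with pixel count.

```python
# Back-of-the-envelope only: assumes 16:9 frames, which may not match MOVA's outputs.

def pixels_per_frame(height, aspect=(16, 9)):
    """Pixel count of one frame at the given height and aspect ratio."""
    width = height * aspect[0] // aspect[1]
    return width * height


if __name__ == "__main__":
    ratio = pixels_per_frame(720) / pixels_per_frame(360)
    print(f"720p has {ratio:.0f}x the pixels of 360p per frame")
```

So each 720p frame carries about four times the data of a 360p frame, which fits with 360p being the cheaper option for quick tests.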