Dynamic Text-to-4D Scene Generation

MAV3D (Make-A-Video3D) is a method for generating three-dimensional dynamic scenes from text descriptions. Our method employs a 4D dynamic Neural Radiance Field (NeRF) that is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model. The dynamic video output generated from the provided text can be viewed from any camera location and angle, and it can be composited into any 3D environment. MAV3D does not require any 3D or 4D data, and the T2V model is trained solely on Text-Image pairs and unlabeled videos.
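To make "viewed from any camera location and angle" concrete, here is a toy sketch of how a time-dependent radiance field can be rendered along a camera ray using the standard NeRF volume-rendering quadrature. It is not MAV3D's scene representation or its T2V-guided optimization, neither of which is detailed here: `toy_field`, `render_ray`, the moving-ball scene, and every constant are hypothetical and chosen only for illustration.

```python
import math


def toy_field(x, y, z, t):
    """Hypothetical 4D field: a soft ball whose centre slides along x over time.

    Returns (density, rgb) for a point in space at time t. A real dynamic
    NeRF would learn this mapping; here it is hard-coded.
    """
    cx = 0.5 * math.sin(2.0 * math.pi * t)  # ball centre depends on time
    d2 = (x - cx) ** 2 + y ** 2 + z ** 2
    sigma = 20.0 * math.exp(-d2 / 0.02)  # density falls off away from the centre
    rgb = (1.0, 0.5, 0.2)  # constant colour, for simplicity
    return sigma, rgb


def render_ray(origin, direction, t, near=0.0, far=4.0, n_samples=128):
    """Standard NeRF quadrature: accumulate colour along one ray at time t."""
    delta = (far - near) / n_samples
    transmittance = 1.0  # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    opacity = 0.0
    for i in range(n_samples):
        s = near + (i + 0.5) * delta  # midpoint of the i-th segment
        px, py, pz = (o + s * d for o, d in zip(origin, direction))
        sigma, rgb = toy_field(px, py, pz, t)
        alpha = 1.0 - math.exp(-sigma * delta)  # absorption in this segment
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        opacity += weight
        transmittance *= 1.0 - alpha
    return tuple(color), opacity


# The same ray sees the ball at t = 0 and empty space at t = 0.25,
# because the ball has moved out of the ray's path.
ray_origin, ray_dir = (0.0, 0.0, -2.0), (0.0, 0.0, 1.0)
color_t0, opacity_t0 = render_ray(ray_origin, ray_dir, t=0.0)
color_t1, opacity_t1 = render_ray(ray_origin, ray_dir, t=0.25)
```

Because the field takes both a position and a time, any camera ray can be rendered at any moment, which is the property that lets a scene of this kind be viewed from arbitrary viewpoints and composited into other 3D environments.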
