
Dynamic Text-to-4D Scene Generation

MAV3D (Make-A-Video3D) is a method for generating three-dimensional dynamic scenes from text descriptions. Our method employs a 4D dynamic Neural Radiance Field (NeRF) that is optimized for scene appearance, density, and motion consistency by querying a Text-to-Video (T2V) diffusion-based model. The dynamic video output generated from the provided text can be viewed from any camera… Continue reading “Dynamic Text-to-4D Scene Generation”
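A minimal sketch of the optimization loop the excerpt describes, not the authors' implementation: the 4D radiance field is a toy MLP over (x, y, z, t), the renderer is a coarse stub, and `t2v_denoiser` is a hypothetical stand-in for the frozen Text-to-Video diffusion model being queried. All function names, shapes, and hyperparameters here are assumptions.

```python
# Sketch only: toy 4D NeRF updated with a score-distillation-style gradient
# from a (hypothetical) frozen Text-to-Video diffusion model.
import torch
import torch.nn as nn

class DynamicNeRF(nn.Module):
    """Toy 4D field: maps (x, y, z, t) to (RGB, density)."""
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(4, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),            # 3 colour channels + 1 density
        )

    def forward(self, xyzt):
        out = self.net(xyzt)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3:])

def render_video(field, n_frames=8, res=32):
    """Stub renderer: samples the field on a coarse grid per time step.
    A real implementation would ray-march along camera rays."""
    frames = []
    for t in torch.linspace(0, 1, n_frames):
        xy = torch.stack(torch.meshgrid(
            torch.linspace(-1, 1, res), torch.linspace(-1, 1, res),
            indexing="ij"), dim=-1).reshape(-1, 2)
        xyzt = torch.cat([xy, torch.zeros(res * res, 1),
                          torch.full((res * res, 1), float(t))], dim=-1)
        rgb, sigma = field(xyzt)
        frames.append((rgb * sigma).reshape(res, res, 3))
    return torch.stack(frames)               # (T, H, W, 3)

def t2v_denoiser(noisy_video, t, prompt):
    """Hypothetical placeholder for a frozen T2V diffusion model that
    predicts the noise added at timestep t, conditioned on the prompt."""
    return torch.randn_like(noisy_video)

def sds_step(field, optimizer, prompt):
    """One score-distillation-style update: render, add noise, query the
    T2V model, and push the rendering toward its prediction."""
    video = render_video(field)
    t = torch.randint(1, 1000, (1,)).item()
    noise = torch.randn_like(video)
    noisy = video + 0.1 * noise               # simplified forward process
    with torch.no_grad():
        pred_noise = t2v_denoiser(noisy, t, prompt)
    grad = pred_noise - noise                  # SDS gradient (weights omitted)
    loss = (video * grad.detach()).sum()       # surrogate loss carrying that gradient
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

field = DynamicNeRF()
opt = torch.optim.Adam(field.parameters(), lr=1e-3)
for step in range(3):
    sds_step(field, opt, "a corgi running on the beach")
```

The key design point the excerpt highlights is that no 4D training data is needed: the T2V model acts as the only supervision signal, so the field is shaped entirely by how well its renderings score under that model for the given prompt.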

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video

We present a novel framework named NeuralRecon for real-time 3D scene reconstruction from a monocular video. Unlike previous methods that estimate single-view depth maps separately on each key-frame and fuse them later, we propose to directly reconstruct local surfaces represented as sparse TSDF volumes for each video fragment sequentially by a neural network. A learning-based… Continue reading “NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video”
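A minimal sketch of the fragment-wise idea described above, not the NeuralRecon implementation (which back-projects image features into a voxel grid and uses sparse 3D convolutions with a GRU-based fusion module). Here a toy dense network predicts a coarse TSDF volume per fragment and a running global volume is updated sequentially; all shapes and names are illustrative assumptions.

```python
# Sketch only: per-fragment TSDF prediction with naive sequential fusion.
import torch
import torch.nn as nn

class FragmentTSDFNet(nn.Module):
    """Toy network: maps a stack of fragment key-frames to a TSDF volume."""
    def __init__(self, n_frames=9, grid=16):
        super().__init__()
        self.grid = grid
        self.encoder = nn.Sequential(
            nn.Conv2d(3 * n_frames, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.decoder = nn.Linear(32, grid ** 3)

    def forward(self, frames):                 # frames: (N, 3, H, W)
        x = frames.reshape(1, -1, *frames.shape[-2:])
        feat = self.encoder(x).flatten(1)
        tsdf = torch.tanh(self.decoder(feat))  # truncated SDF in [-1, 1]
        return tsdf.reshape(self.grid, self.grid, self.grid)

def reconstruct(video_frames, fragment_size=9, grid=16):
    """Process key-frames fragment by fragment and fuse each local TSDF
    into one global volume (simple closest-to-surface fusion here)."""
    net = FragmentTSDFNet(n_frames=fragment_size, grid=grid)
    global_tsdf = torch.ones(grid, grid, grid)   # 1 = far from any surface
    for start in range(0, len(video_frames) - fragment_size + 1, fragment_size):
        fragment = video_frames[start:start + fragment_size]
        with torch.no_grad():
            local = net(fragment)
        # keep whichever prediction is closer to the surface at each voxel
        global_tsdf = torch.where(local.abs() < global_tsdf.abs(),
                                  local, global_tsdf)
    return global_tsdf

frames = torch.rand(18, 3, 64, 64)              # dummy monocular key-frames
volume = reconstruct(frames)
print(volume.shape)                             # torch.Size([16, 16, 16])
```

The contrast with depth-map pipelines is that each fragment yields a volumetric surface estimate directly, so per-view depth prediction and a separate fusion stage are avoided and the global volume stays coherent as new fragments arrive.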