princeton-vl/SimpleProc
SimpleProc: Fully Procedural Synthetic Data from Simple Rules for Multi-View Stereo

We explore the design space of procedural rules for multi-view stereo (MVS). We demonstrate that we can generate effective training data using SimpleProc: a fully procedural generator driven by a very small set of rules using Non-Uniform Rational Basis Splines (NURBS), as well as basic displacement and texture patterns. At a modest scale of 8,000 images, our approach achieves superior results compared to manually curated images (at the same scale) sourced from games and real-world objects. When scaled to 352,000 images, our method yields performance comparable to—and in several benchmarks, exceeding—models trained on over 692,000 manually curated images.

Figure 1: Fully procedural synthetic data from simple rules (top) is as effective as curated data from artists or 3D scans (bottom) for training multi-view stereo models.

If you find our work useful, please consider citing our paper:

Zeyu Ma, Alexander Raistrick, Jia Deng

@misc{ma2026fullyproceduralsyntheticdata,
      title={Fully Procedural Synthetic Data from Simple Rules for Multi-View Stereo}, 
      author={Zeyu Ma and Alexander Raistrick and Jia Deng},
      year={2026},
      eprint={2604.04925},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.04925}, 
}

Data Generator

1) Create Conda Environment + Prepare Infinigen

conda create -n datagen python=3.11 -y
conda activate datagen

git clone git@github.com:princeton-vl/infinigen.git
cd infinigen
pip install -e ".[dev,terrain,vis]"
cd ..
cd datagen
g++ -O3 -std=c++17 -shared -fPIC nurbs_utils.cpp -o nurbs_utils.so
cd ..

Note: this project uses Infinigen's APIs, not its assets.
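The generator's shapes are driven by NURBS. As a quick illustration of the underlying math only (this is not the repo's `nurbs_utils.cpp` implementation), here is a minimal Cox-de Boor evaluation of a point on a NURBS curve:

```python
import numpy as np

def bspline_basis(i, p, u, knots):
    """B-spline basis function N_{i,p}(u) via Cox-de Boor recursion."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = 0.0
    denom = knots[i + p] - knots[i]
    if denom > 0:
        left = (u - knots[i]) / denom * bspline_basis(i, p - 1, u, knots)
    right = 0.0
    denom = knots[i + p + 1] - knots[i + 1]
    if denom > 0:
        right = (knots[i + p + 1] - u) / denom * bspline_basis(i + 1, p - 1, u, knots)
    return left + right

def nurbs_point(u, ctrl, weights, knots, degree):
    """Rational (weighted) combination of control points at parameter u."""
    num = np.zeros(ctrl.shape[1])
    den = 0.0
    for i in range(len(ctrl)):
        b = bspline_basis(i, degree, u, knots) * weights[i]
        num += b * ctrl[i]
        den += b
    return num / den

# Quadratic NURBS arc: with these weights this is an exact quarter circle.
ctrl = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
weights = np.array([1.0, np.sqrt(2) / 2, 1.0])
knots = np.array([0, 0, 0, 1, 1, 1], dtype=float)
p = nurbs_point(0.5, ctrl, weights, knots, 2)  # → (√2/2, √2/2), on the unit circle
```

The rational weights are what let NURBS represent conics exactly, which plain B-splines cannot; varying control points, weights, and knots gives the kind of compact rule space the paper builds on.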

2) Install Blender 4.5 for EEVEE

wget https://download.blender.org/release/Blender4.5/blender-4.5.0-linux-x64.tar.xz
tar -xf blender-4.5.0-linux-x64.tar.xz
mv blender-4.5.0-linux-x64 blender
rm -rf blender-4.5.0-linux-x64.tar.xz
./blender/4.5/python/bin/python3.11 -m pip install opencv-python OpenEXR==3.3.5 matplotlib

3) Run Locally (single-scene style)

# Generate one local scene
python datagen/run_local_scene_generation.py \
  --output-scene output/test_scene.blend \
  --seed 0

# Render pass (a GPU is required because EEVEE is used)
python datagen/run_local_scene_generation.py \
  --output-scene output/test_scene.blend \
  --seed 0 \
  --render-output output/test_renders

4) Run on Cluster

export DATASET_FOLDER=output/dataset

python datagen/run_cluster_generation.py --start-index 0 --end-index 1
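Each cluster job covers a contiguous scene-index range via `--start-index`/`--end-index`. A small hypothetical helper (`index_chunks` is our name, not part of the repo) for splitting a dataset into such per-job ranges:

```python
def index_chunks(total_scenes, scenes_per_job):
    """Yield (start, end) pairs covering [0, total_scenes) in order,
    suitable for --start-index/--end-index of one job each."""
    return [(s, min(s + scenes_per_job, total_scenes))
            for s in range(0, total_scenes, scenes_per_job)]

jobs = index_chunks(10, 4)  # → [(0, 4), (4, 8), (8, 10)]
```

Each pair then maps to one `python datagen/run_cluster_generation.py --start-index START --end-index END` invocation, e.g. one per SLURM array task.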

Training MVSAnywhere with Generated Data

To use the 44k scenes from the paper, download them from https://www.dropbox.com/scl/fo/snn2vebj5a84ro2m7mqou/AKCdc6-RFRMfpICQ5_PItsY?rlkey=0erbn8i2yhacy3xb54m27ber7&st=k22pyfky&dl=0 and unzip; this gives a layout like:

  • datafolder_0/
    • renders
      • scene_0
      • ...
  • datafolder_1/
    • renders
      • scene_0
      • ...

Run the following to link the raw folders into a single dataset folder (dst_folder):

export data_folder_root=/path/to/data_folder_root
export dst_folder=/path/to/dataset
python scripts/link_data.py

Clone our fork of MVSAnywhere and install its environment: https://github.com/mazeyu/mvsanywhere. Then run the command below to train MVSAnywhere on this data (replace /path/to/dataset). GPU memory requirement: 48 GB.

In configs/data/infinigen_cubism/44k_scenes.yaml, set "dataset_path" to your dataset path. Also download the small validation set used for monitoring during training (https://www.dropbox.com/scl/fi/9zcrtpb3xdzz71s7fkgg7/rmvd_samples_eval.zip?rlkey=kgekhjpe4begrhhl4fop41bhg&st=hwdm3ctt&dl=0) and point RMVD_SAMPLES_VAL_DIR at it.

cd mvsanywhere

export RMVD_SAMPLES_VAL_DIR=/path/to/rmvd/samples

python src/mvsanywhere/train.py \
  --log_dir /path/to/log_dir \
  --name NAME \
  --config_file configs/models/mvsanywhere_model.yaml \
  --data_config configs/data/infinigen_cubism/44k_scenes.yaml \
  --val_data_config "" \
  --batch_size 6 \
  --da_weights_path weights/depth_anything_v2_vitb.pth \
  --gpus 1 \
  --max_steps 1600000 \
  --val_interval 5000

You can download our trained models here: https://www.dropbox.com/scl/fi/0gy1hxo41r0sudtzz4sml/checkpoints.zip?rlkey=er1h7vim2rnom2f8c580wcy5w&st=3po98loz&dl=0 (3 checkpoints from 3 different training runs).
