DROID Flow Processing Pipeline

This pipeline processes the DROID Raw dataset: it extracts RGB images, depth images, and optical flow, and computes scene flow from the stereo camera recordings.

Download the DROID Raw dataset

# Raw DROID dataset in stereo HD, stored as MP4 videos (8.7TB)
gsutil -m cp -r gs://gresearch/robotics/droid_raw <path_to_your_target_dir>

File Structure

├── droid_raw/
│   └── droid/
│       ├── 1.0.0
│       └── 1.0.1
│           ├── AUTOLab
│           │   ├── success
│           │   │   └── ...
│           │   └── failure
│           ├── CLVR
│           └── ...
├── droid_flow/
└── droid_processed/
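
As a quick check that the download matches the layout above, the lab folders and their outcome subfolders can be listed with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `list_lab_outcomes` is hypothetical, and it only assumes that each lab folder (for example AUTOLab) sits directly under a version folder such as droid_raw/droid/1.0.1.

```python
from pathlib import Path


def list_lab_outcomes(version_dir):
    """Map each lab folder name to the sorted names of its subfolders.

    `version_dir` is assumed to be a version folder such as
    droid_raw/droid/1.0.1 (hypothetical helper, not part of the pipeline).
    """
    version_dir = Path(version_dir)
    labs = {}
    for lab in sorted(p for p in version_dir.iterdir() if p.is_dir()):
        # Typically "success" and/or "failure", per the tree above.
        labs[lab.name] = sorted(c.name for c in lab.iterdir() if c.is_dir())
    return labs


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        for lab, outcomes in list_lab_outcomes(sys.argv[1]).items():
            print(lab, outcomes)
```

Passing the version folder as the only command-line argument prints one line per lab.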

Environment Setup

To run the pipeline locally, set up the environment:

# Create and activate conda environment
conda create -n droid_flow python=3.10 -y
conda activate droid_flow

# Install dependencies
pip install torch torchvision
pip install requests

# Install ZED SDK
# Download from: https://www.stereolabs.com/developers/release/
# Run: ./ZED_SDK_Ubuntu22_cuda12.1_v4.1.4.zstd.run -- silent
cd /usr/local/zed/ && python get_python_api.py

# Fix dependencies
conda install -c conda-forge libstdcxx-ng -y
pip install h5py scipy opencv-python==4.10.0.84

# Resolve numpy compatibility
pip uninstall -y numpy
pip install numpy==1.24.0
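
After installation, it can help to confirm that every package is importable from the active environment before starting a long run. The sketch below is an assumption-labeled helper, not part of the repository: `check_modules` is a hypothetical name, `cv2` is the import name of opencv-python, and `pyzed` is the module that the ZED SDK's get_python_api.py installs.

```python
import importlib.util

# Import names of the packages installed in the steps above.
REQUIRED_MODULES = [
    "torch", "torchvision", "requests", "pyzed",
    "h5py", "scipy", "cv2", "numpy",
]


def check_modules(names=REQUIRED_MODULES):
    """Return {module_name: True/False} without actually importing anything."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Because `find_spec` only locates a module, this does not catch binary incompatibilities (such as a numpy version mismatch); those only surface on a real import.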

Quick Start

git clone https://github.com/SalesforceAIResearch/droid_flow.git
cd droid_flow
# Run the pipeline (example for the TRI dataset)
python droid_pipeline_main.py --dataset_name TRI --num_workers 1
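
To process several datasets in sequence, the command above can be wrapped in a small driver. This is a sketch under stated assumptions: `build_command` and `run_datasets` are hypothetical helpers, only the `--dataset_name` and `--num_workers` flags shown above are used, and it is assumed (not confirmed by this README) that lab folder names other than TRI are accepted as dataset names.

```python
import subprocess
import sys


def build_command(dataset_name, num_workers=1):
    """Build the argument list for one pipeline run (flags as shown above)."""
    return [
        sys.executable, "droid_pipeline_main.py",
        "--dataset_name", dataset_name,
        "--num_workers", str(num_workers),
    ]


def run_datasets(dataset_names, num_workers=1, dry_run=True):
    """Print (dry_run=True) or execute the pipeline once per dataset name."""
    commands = [build_command(name, num_workers) for name in dataset_names]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing dataset.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_datasets(["TRI"])  # dry run: prints the command only
```

The dry-run default makes it easy to inspect the commands before committing to a long job; run it from the repository root with `dry_run=False` to execute them.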

Data Format

All processed data is saved as PNG images with channels in BGR order. The exception is depth, which is saved as a single channel: values are multiplied by 10000 before being cast to uint16, to preserve precision.
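
The depth scaling can be illustrated with a short round trip. This is a minimal sketch, assuming only the 10000 scale factor and uint16 storage stated above; the rounding and clipping behavior here is an assumption, the pipeline's own code may differ, and the helper names are hypothetical. When reading a saved file, a 16-bit PNG can be loaded unchanged with `cv2.imread(path, cv2.IMREAD_UNCHANGED)`.

```python
import numpy as np

DEPTH_SCALE = 10000  # scale factor stated in this README
UINT16_MAX = 65535


def encode_depth(depth):
    """Scale depth values and cast to uint16 (rounding/clipping are assumptions)."""
    scaled = np.round(np.asarray(depth, dtype=np.float64) * DEPTH_SCALE)
    return np.clip(scaled, 0, UINT16_MAX).astype(np.uint16)


def decode_depth(depth_png):
    """Invert the scaling: uint16 pixel values back to floating-point depth."""
    return depth_png.astype(np.float32) / DEPTH_SCALE


if __name__ == "__main__":
    stored = encode_depth(np.array([[0.5, 1.2345]]))
    print(stored, decode_depth(stored))
```

With this scale, the largest value a uint16 pixel can hold corresponds to a depth of 6.5535 in the original units, so anything beyond that cannot be represented exactly.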

Optical Flow (2 channels + mask)

Channel  Meaning                 Formula
R        Normalized Δx (pixels)  (Δx + 1/4 * w) / (1/2 * w) * 65536
G        Normalized Δy (pixels)  (Δy + 1/4 * h) / (1/2 * h) * 65536
B        Valid pixel mask (0/1)  65536 = valid, 0 = invalid

Where:
- Δx, Δy = pixel displacements
- w, h = image width and height
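
Inverting the formulas above recovers pixel displacements from a stored image: Δx = value / 65536 * (w / 2) - w / 4, and likewise for Δy with h. The sketch below is illustrative only; `decode_optical_flow` is a hypothetical helper that expects the array in R, G, B channel order as listed in the table. Since `cv2.imread` returns BGR, an image loaded that way would need its last axis reversed first (`img[..., ::-1]`). Treating any nonzero mask value as valid is also an assumption.

```python
import numpy as np

SCALE = 65536.0  # normalization constant from the table above


def decode_optical_flow(png_rgb):
    """Decode an (H, W, 3) uint16 array in R, G, B order.

    Returns (dx, dy, valid): displacements in pixels and a boolean mask.
    """
    h, w = png_rgb.shape[:2]
    data = png_rgb.astype(np.float64)
    dx = data[..., 0] / SCALE * (w / 2) - w / 4
    dy = data[..., 1] / SCALE * (h / 2) - h / 4
    valid = png_rgb[..., 2] > 0  # assumption: nonzero means valid
    return dx, dy, valid


if __name__ == "__main__":
    img = np.zeros((4, 8, 3), dtype=np.uint16)
    img[..., 0] = 32768  # mid-range value decodes to zero displacement
    dx, dy, valid = decode_optical_flow(img)
    print(dx[0, 0], dy[0, 0], valid[0, 0])
```

The encoding implies that displacements are representable only within a quarter of the image size in each direction: a stored value of 0 decodes to -w/4 (or -h/4).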

Scene Flow (3 channels)

Channel  Meaning                 Formula
R        Normalized Δx (meters)  (Δx + 2) / 4 * 65536
G        Normalized Δy (meters)  (Δy + 2) / 4 * 65536
B        Normalized Δz (meters)  (Δz + 2) / 4 * 65536

Where:
- Δx, Δy, Δz = displacements in meters
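
The scene flow encoding maps displacements from the range [-2, 2) meters onto the uint16 range, so decoding is value / 65536 * 4 - 2 for every channel. The following is a minimal sketch of that inverse; `decode_scene_flow` is a hypothetical name, and the input is assumed to be in R, G, B order as in the table (reverse the last axis of a `cv2.imread` result, which is BGR).

```python
import numpy as np


def decode_scene_flow(png_rgb):
    """Decode an (H, W, 3) uint16 array into (Δx, Δy, Δz) in meters."""
    return png_rgb.astype(np.float64) / 65536.0 * 4.0 - 2.0


if __name__ == "__main__":
    pixel = np.array([[[0, 32768, 49152]]], dtype=np.uint16)
    print(decode_scene_flow(pixel))
```

A stored value of 32768 corresponds to zero displacement, and one uint16 step corresponds to 4 / 65536 meters, roughly 0.06 mm.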

Output Structure

Each processed episode creates the following directory structure:

droid_processed/
└── {dataset_name}/
    └── {episode_name}/
        ├── metadata.json  
        ├── trajectory.h5   
        ├── camera_left/
        │   ├── rgb/      
        │   ├── depth/      
        │   ├── optical_flow_with_mask/  
        │   └── scene_flow/ 
        ├── camera_right/
        │   ├── rgb/
        │   ├── depth/
        │   ├── optical_flow_with_mask/
        │   └── scene_flow/
        └── camera_wrist/
            ├── rgb/
            ├── depth/
            ├── optical_flow_with_mask/
            └── scene_flow/
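
A processed episode can be checked against the layout above with a short script. This is a sketch, not part of the repository: `missing_entries` is a hypothetical helper, and it only verifies that the files and folders named in the tree exist, not that their contents are complete.

```python
from pathlib import Path

# Names taken from the directory tree above.
CAMERAS = ("camera_left", "camera_right", "camera_wrist")
MODALITIES = ("rgb", "depth", "optical_flow_with_mask", "scene_flow")
FILES = ("metadata.json", "trajectory.h5")


def missing_entries(episode_dir):
    """Return the relative paths from the expected layout that are absent."""
    episode_dir = Path(episode_dir)
    missing = [name for name in FILES if not (episode_dir / name).is_file()]
    for camera in CAMERAS:
        for modality in MODALITIES:
            if not (episode_dir / camera / modality).is_dir():
                missing.append(f"{camera}/{modality}")
    return missing


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(missing_entries(sys.argv[1]) or "layout complete")
```

An empty list means every expected file and folder is present for that episode.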
