# Custom Data Usage Guide
This guide provides detailed instructions on how to use custom data for model training and inference in the Dexbotic framework.
## Table of Contents
- Data Format Requirements
- Dataset Preparation
- Dataset Registration
- Experiment Configuration
- Training and Inference
## Data Format Requirements

### Dexdata Format

Dexbotic uses the Dexdata format to store robotic datasets in a unified way. Your custom data needs to follow this format:
#### Dataset Directory Structure

```
your_custom_dataset/
    index_cache.json    # Global index file (auto-generated)
    episode1.jsonl      # Data for the first episode
    episode2.jsonl      # Data for the second episode
    ...
```

#### Episode Data Format

Each `.jsonl` file contains data for one robot episode, with each line corresponding to one frame:
```json
{
    "images_1": {"type": "video", "url": "url1", "frame_idx": 21},
    "images_2": {"type": "video", "url": "url2", "frame_idx": 21},
    "images_3": {"type": "video", "url": "url3", "frame_idx": 21},
    "state": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0],
    "prompt": "open the door",
    "is_robot": true
}
```

#### Field Specifications
**RGB data**

- Stored under keys of the form `images_*`.
- Multiple views can be added (`images_1`, `images_2`, …). The order in which they are used is specified by `data_keys` in the data configuration (`DataConfig`), and you can also choose to use only a subset of the views.
- We recommend storing the main view in `images_1`, the left-hand view in `images_2`, and the right-hand view in `images_3`.
- Data can be in video format, represented as `{"type": "video", "url": "video_url", "frame_idx": xx}`.
- Data can also be in image format, represented as `{"type": "image", "url": "image_url"}`.
**Robot state**

- Stored under the `state` key.
- Typically 7-dimensional: 3D position + 3D rotation + 1 gripper dimension.
- By default, actions are constructed online using built-in dataset transforms.
- Pre-processed actions can also be stored explicitly under the `action` key.
**Text data**

- Prompts are stored under the `prompt` key.
- Responses can be specified in two ways:
  - Directly: via the `answer` key.
  - Indirectly: leave `answer` empty, and Dexdata will use `ActionNormAnd2String` to convert actions into discretized textual responses.
**Robot vs. general data [Important]**

- The `is_robot` flag distinguishes robot data (`true`) from general data (`false`).
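A quick way to confirm that an episode file actually matches this format is to read it back and check the fields. The snippet below is a minimal sketch assuming the layout and field names described above; the file name `episode1.jsonl` is only an example:

```python
import json

# Keys every frame should carry according to the field specifications above
required_keys = {"images_1", "state", "prompt", "is_robot"}

with open("your_custom_dataset/episode1.jsonl", "r", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        frame = json.loads(line)
        missing = required_keys - frame.keys()
        assert not missing, f"frame {line_no} is missing keys: {missing}"
        # Typical state: 3D position + 3D rotation + 1 gripper dimension
        assert len(frame["state"]) == 7, f"frame {line_no}: unexpected state length"
```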
## Dataset Preparation

### 1. Data Collection
Collect your robot data, ensuring it includes:
- Image data
- Robot state information (`state` field)
- Corresponding text instructions (`prompt` field)
### 2. Data Conversion
Convert your raw data to Dexdata format:
```python
import json
import os


def convert_to_dexdata_format(episode_data, output_dir):
    """
    Convert raw data to Dexdata format

    Args:
        episode_data: List of episodes, where each episode is a list of frame dicts
        output_dir: Output directory
    """
    os.makedirs(output_dir, exist_ok=True)
    for i, episode in enumerate(episode_data):
        episode_file = os.path.join(output_dir, f"episode{i+1}.jsonl")
        with open(episode_file, 'w', encoding='utf-8') as f:
            for frame in episode:
                # Convert each frame of data
                dexdata_frame = {
                    "images_1": {
                        "type": "image",
                        "url": frame['image_path']
                    },
                    "state": frame['robot_state'],
                    "prompt": frame['instruction'],
                    "is_robot": True
                }
                # Write to the jsonl file (one JSON object per line)
                f.write(json.dumps(dexdata_frame, ensure_ascii=False) + '\n')
```
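As a usage sketch, the call below converts one hypothetical episode; the input field names (`image_path`, `robot_state`, `instruction`) are placeholders for whatever your own collection pipeline produces:

```python
# Hypothetical raw data: a single episode with two frames
raw_episodes = [
    [
        {"image_path": "/data/raw/ep1/frame_000.jpg",
         "robot_state": [0.10, 0.20, 0.30, 0.0, 0.0, 0.0, 1.0],
         "instruction": "open the door"},
        {"image_path": "/data/raw/ep1/frame_001.jpg",
         "robot_state": [0.11, 0.21, 0.29, 0.0, 0.0, 0.0, 1.0],
         "instruction": "open the door"},
    ],
]

# Writes episode1.jsonl into the dataset directory used in the rest of this guide
convert_to_dexdata_format(raw_episodes, "/path/to/your/custom_dataset")
```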
## Dataset Registration

### Create Data Source File

Create your data source file in the `dexbotic/data/data_source/` directory:
```python
# dexbotic/data/data_source/my_custom_dataset.py
import math

from dexbotic.data.data_source.register import register_dataset

# Define your dataset
MY_CUSTOM_DATASET = {
    "my_robot_data": {
        "data_path_prefix": "",  # Image path prefix
        "annotations": '/path/to/your/custom_dataset/',  # Dataset path
        "frequency": 1,  # Data sampling frequency
    },
}

# Define metadata
meta_data = {
    'non_delta_mask': [6],  # Indices of non-delta action dimensions, e.g. the gripper
    'periodic_mask': [3, 4, 5],  # Indices of periodic action dimensions (e.g. rotation), used to handle wrapping
    'periodic_range': 2 * math.pi  # Period of the periodic dimensions
}

# Register the dataset
register_dataset(MY_CUSTOM_DATASET, meta_data=meta_data, prefix='my_custom')
```
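Note that the dataset name referenced later in the experiment configuration (`my_custom_my_robot_data`) is the register prefix joined to the dictionary key. If you have more than one collection, several entries can be registered from the same dictionary; in the sketch below the second entry (`my_robot_data_v2` and its path) is purely hypothetical:

```python
MY_CUSTOM_DATASET = {
    "my_robot_data": {
        "data_path_prefix": "",
        "annotations": '/path/to/your/custom_dataset/',
        "frequency": 1,
    },
    # Hypothetical second collection; would be referenced as 'my_custom_my_robot_data_v2'
    "my_robot_data_v2": {
        "data_path_prefix": "",
        "annotations": '/path/to/your/second_dataset/',
        "frequency": 1,
    },
}

register_dataset(MY_CUSTOM_DATASET, meta_data=meta_data, prefix='my_custom')
```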
## Experiment Configuration

### Create Experiment File

Create your experiment file in the `playground/` directory:
```python
# playground/my_custom_experiment.py
from dataclasses import dataclass, field

from dexbotic.exp.cogact_exp import CogACTDataConfig, CogACTExp


@dataclass
class MyCustomDataConfig(CogACTDataConfig):
    """Data configuration"""
    dataset_name: str = field(default='my_custom_my_robot_data')  # Dataset name
    num_images: int = field(default=1)  # Number of images
    images_keys: list[str] = field(default_factory=lambda: ['images_1'])  # Image fields


@dataclass
class MyCustomExp(CogACTExp):
    """Main experiment class"""
    data_config: MyCustomDataConfig = field(default_factory=MyCustomDataConfig)


if __name__ == "__main__":
    exp = MyCustomExp()
    exp.train()  # Training
```
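If your frames carry several camera views, the data configuration can consume more than one of them. The variant below is a sketch assuming the frames also contain an `images_2` field (e.g. the left-hand view), as described in the field specifications:

```python
@dataclass
class MyMultiViewDataConfig(CogACTDataConfig):
    """Data configuration using two camera views"""
    dataset_name: str = field(default='my_custom_my_robot_data')
    num_images: int = field(default=2)  # Number of images per frame
    images_keys: list[str] = field(default_factory=lambda: ['images_1', 'images_2'])  # Main + left-hand view
```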
## Training and Inference

### 1. Start Training

```bash
# Use torchrun for distributed training
torchrun playground/my_custom_experiment.py
```
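Run as above, `torchrun` launches a single process. To train on several GPUs of one machine you can add the standard `torchrun` flags; the GPU count below is only an example:

```bash
torchrun --standalone --nproc_per_node=8 playground/my_custom_experiment.py
```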
### 2. Monitor Training Process

During training, the system will:

- Automatically compute action normalization parameters and save them to the `norm_assets/` directory
- Periodically save model checkpoints to `output_dir`
- Use Weights & Biases to record training metrics (if wandb is configured)
### 3. Inference Service
After training is complete, start the inference service:
```python
# playground/my_custom_experiment.py -- modify the experiment file to point at the trained model.
# MyCustomDataConfig and the imports from the training version above are kept as-is.
from dexbotic.exp.cogact_exp import InferenceConfig


@dataclass
class MyCustomInferenceConfig(InferenceConfig):
    """Inference configuration"""
    model_name_or_path: str = field(default='/path/to/trained/model')  # Trained model path
    port: int = field(default=7891)  # Inference service port


@dataclass
class MyCustomExp(CogACTExp):
    """Main experiment class"""
    data_config: MyCustomDataConfig = field(default_factory=MyCustomDataConfig)
    inference_config: MyCustomInferenceConfig = field(default_factory=MyCustomInferenceConfig)


if __name__ == "__main__":
    exp = MyCustomExp()
    exp.inference()  # Start inference service
```

```bash
# Start inference service
python playground/my_custom_experiment.py --task inference
```

### 4. Use Inference API
```bash
# Send an inference request
curl -X POST http://localhost:7891/process_frame \
  -F "text=Grab the red object" \
  -F "image=@/path/to/your/image.jpg"

# Example response
# {"response": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0]}
```