# Custom Data Usage Guide
This guide provides detailed instructions on how to use custom data for model training and inference in the Dexbotic framework.
## Table of Contents
- Data Format Requirements
- Dataset Preparation
- Dataset Registration
- Experiment Configuration
- Training and Inference
## Data Format Requirements

### Dexdata Format

Dexbotic uses the Dexdata format to store robotic datasets in a unified way. Your custom data needs to follow this format:
#### Dataset Directory Structure

```
your_custom_dataset/
    index_cache.json    # Global index file (auto-generated)
    episode1.jsonl      # Data for the first episode
    episode2.jsonl      # Data for the second episode
    ...
```

#### Episode Data Format

Each `.jsonl` file contains data for one robot episode, with each line corresponding to one frame:
```json
{
    "images_1": {"type": "video", "url": "url1", "frame_idx": 21},
    "images_2": {"type": "video", "url": "url2", "frame_idx": 21},
    "images_3": {"type": "video", "url": "url3", "frame_idx": 21},
    "state": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0],
    "prompt": "open the door",
    "is_robot": true
}
```

#### Field Specifications
**RGB data**

- Stored under keys of the form `images_*`.
- Multiple views can be added (`images_1`, `images_2`, …). The order in which they are used is specified by `data_keys` in the data configuration (`DataConfig`), and you can also choose to use only a subset of the views.
- We recommend storing the main view in `images_1`, the left-hand view in `images_2`, and the right-hand view in `images_3`.
- Data can be in video format, represented as `{"type": "video", "url": "video_url", "frame_idx": xx}`.
- Data can also be in image format, represented as `{"type": "image", "url": "image_url"}`.
**Robot state**

- Stored under the `state` key.
- Typically 7-dimensional: 3D position + 3D rotation + 1 gripper dimension.
- By default, actions are constructed online using built-in dataset transforms.
- Pre-processed actions can also be stored explicitly under the `action` key.
**Text data**

- Prompts are stored under the `prompt` key.
- Responses can be specified in two ways:
  - Directly: via the `answer` key.
  - Indirectly: leave `answer` empty, and Dexdata will use `ActionNormAnd2String` to convert actions into discretized textual responses.
**Robot vs. general data [Important]**

- The `is_robot` flag distinguishes robot data (`true`) from general data (`false`).
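A quick way to confirm that an episode file actually matches this format is to read it back and check the fields. The snippet below is a minimal sketch assuming the layout and field names described above; the file name `episode1.jsonl` is only an example:

```python
import json

# Keys every frame should carry according to the field specifications above
required_keys = {"images_1", "state", "prompt", "is_robot"}

with open("your_custom_dataset/episode1.jsonl", "r", encoding="utf-8") as f:
    for line_no, line in enumerate(f, start=1):
        frame = json.loads(line)
        missing = required_keys - frame.keys()
        assert not missing, f"frame {line_no} is missing keys: {missing}"
        # Typical state: 3D position + 3D rotation + 1 gripper dimension
        assert len(frame["state"]) == 7, f"frame {line_no}: unexpected state length"
```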
## Dataset Preparation

### 1. Data Collection
Collect your robot data, ensuring it includes:
- Image data
- Robot state information (`state` field)
- Corresponding text instructions (`prompt` field)
### 2. Data Conversion
Convert your raw data to Dexdata format:
```python
import json
import os


def convert_to_dexdata_format(episode_data, output_dir):
    """
    Convert raw data to Dexdata format

    Args:
        episode_data: List of episodes, where each episode is a list of frame dicts
        output_dir: Output directory
    """
    os.makedirs(output_dir, exist_ok=True)
    for i, episode in enumerate(episode_data):
        episode_file = os.path.join(output_dir, f"episode{i+1}.jsonl")
        with open(episode_file, 'w', encoding='utf-8') as f:
            for frame in episode:
                # Convert each frame of data
                dexdata_frame = {
                    "images_1": {
                        "type": "image",
                        "url": frame['image_path']
                    },
                    "state": frame['robot_state'],
                    "prompt": frame['instruction'],
                    "is_robot": True
                }
                # Write to the jsonl file (one JSON object per line)
                f.write(json.dumps(dexdata_frame, ensure_ascii=False) + '\n')
```
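As a usage sketch, the call below converts one hypothetical episode; the input field names (`image_path`, `robot_state`, `instruction`) are placeholders for whatever your own collection pipeline produces:

```python
# Hypothetical raw data: a single episode with two frames
raw_episodes = [
    [
        {"image_path": "/data/raw/ep1/frame_000.jpg",
         "robot_state": [0.10, 0.20, 0.30, 0.0, 0.0, 0.0, 1.0],
         "instruction": "open the door"},
        {"image_path": "/data/raw/ep1/frame_001.jpg",
         "robot_state": [0.11, 0.21, 0.29, 0.0, 0.0, 0.0, 1.0],
         "instruction": "open the door"},
    ],
]

# Writes episode1.jsonl into the dataset directory used in the rest of this guide
convert_to_dexdata_format(raw_episodes, "/path/to/your/custom_dataset")
```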
## Dataset Registration

### Create Data Source File

Create your data source file in the `dexbotic/data/data_source/` directory:
```python
# dexbotic/data/data_source/my_custom_dataset.py
import math

from dexbotic.data.data_source.register import register_dataset

# Define your dataset
MY_CUSTOM_DATASET = {
    "my_robot_data": {
        "data_path_prefix": "",  # Image path prefix
        "annotations": '/path/to/your/custom_dataset/',  # Dataset path
        "frequency": 1,  # Data sampling frequency
    },
}

# Define metadata
meta_data = {
    'non_delta_mask': [6],  # Indices of non-delta action dimensions, e.g. the gripper
    'periodic_mask': [3, 4, 5],  # Indices of periodic action dimensions (e.g. rotation), used to handle wrapping
    'periodic_range': 2 * math.pi  # Period of the periodic dimensions
}

# Register the dataset
register_dataset(MY_CUSTOM_DATASET, meta_data=meta_data, prefix='my_custom')
```
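Note that the dataset name referenced later in the experiment configuration (`my_custom_my_robot_data`) is the register prefix joined to the dictionary key. If you have more than one collection, several entries can be registered from the same dictionary; in the sketch below the second entry (`my_robot_data_v2` and its path) is purely hypothetical:

```python
MY_CUSTOM_DATASET = {
    "my_robot_data": {
        "data_path_prefix": "",
        "annotations": '/path/to/your/custom_dataset/',
        "frequency": 1,
    },
    # Hypothetical second collection; would be referenced as 'my_custom_my_robot_data_v2'
    "my_robot_data_v2": {
        "data_path_prefix": "",
        "annotations": '/path/to/your/second_dataset/',
        "frequency": 1,
    },
}

register_dataset(MY_CUSTOM_DATASET, meta_data=meta_data, prefix='my_custom')
```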
## Experiment Configuration

### Create Experiment File

Create your experiment file in the `playground/` directory:
```python
# playground/my_custom_experiment.py
from dataclasses import dataclass, field

from dexbotic.exp.cogact_exp import CogACTDataConfig, CogACTExp


@dataclass
class MyCustomDataConfig(CogACTDataConfig):
    """Data configuration"""
    dataset_name: str = field(default='my_custom_my_robot_data')  # Dataset name
    num_images: int = field(default=1)  # Number of images
    images_keys: list[str] = field(default_factory=lambda: ['images_1'])  # Image fields


@dataclass
class MyCustomExp(CogACTExp):
    """Main experiment class"""
    data_config: MyCustomDataConfig = field(default_factory=MyCustomDataConfig)


if __name__ == "__main__":
    exp = MyCustomExp()
    exp.train()  # Training
```
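If your frames carry several camera views, the data configuration can consume more than one of them. The variant below is a sketch assuming the frames also contain an `images_2` field (e.g. the left-hand view), as described in the field specifications:

```python
@dataclass
class MyMultiViewDataConfig(CogACTDataConfig):
    """Data configuration using two camera views"""
    dataset_name: str = field(default='my_custom_my_robot_data')
    num_images: int = field(default=2)  # Number of images per frame
    images_keys: list[str] = field(default_factory=lambda: ['images_1', 'images_2'])  # Main + left-hand view
```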
## Training and Inference

### 1. Start Training

```bash
# Use torchrun for distributed training
torchrun playground/my_custom_experiment.py
```
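Run as above, `torchrun` launches a single process. To train on several GPUs of one machine you can add the standard `torchrun` flags; the GPU count below is only an example:

```bash
torchrun --standalone --nproc_per_node=8 playground/my_custom_experiment.py
```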
### 2. Monitor Training Process

During training, the system will:

- Automatically compute action normalization parameters and save them to the `norm_assets/` directory
- Periodically save model checkpoints to `output_dir`
- Use Weights & Biases to record training metrics (if wandb is configured)
### 3. Inference Service
After training is complete, start the inference service:
```python
# playground/my_custom_experiment.py -- modify the experiment file to point at the trained model.
# MyCustomDataConfig and the imports from the training version above are kept as-is.
from dexbotic.exp.cogact_exp import InferenceConfig


@dataclass
class MyCustomInferenceConfig(InferenceConfig):
    """Inference configuration"""
    model_name_or_path: str = field(default='/path/to/trained/model')  # Trained model path
    port: int = field(default=7891)  # Inference service port


@dataclass
class MyCustomExp(CogACTExp):
    """Main experiment class"""
    data_config: MyCustomDataConfig = field(default_factory=MyCustomDataConfig)
    inference_config: MyCustomInferenceConfig = field(default_factory=MyCustomInferenceConfig)


if __name__ == "__main__":
    exp = MyCustomExp()
    exp.inference()  # Start inference service
```

```bash
# Start inference service
python playground/my_custom_experiment.py --task inference
```

### 4. Use Inference API
```bash
# Send an inference request
curl -X POST http://localhost:7891/process_frame \
  -F "text=Grab the red object" \
  -F "image=@/path/to/your/image.jpg"

# Example response
# {"response": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0]}
```