# Model Zoo

## General Pretrained Models

| Model | Description | Input Images | Action Dim | Model Size | Link |
| --- | --- | --- | --- | --- | --- |
| Dexbotic-Base | Discrete vision-language action model (similar to OpenVLA) | Single View | N/A | 7B | [🤗 Hugging Face](https://huggingface.co/Dexmal/Dexbotic-Base) |
| Dexbotic-CogACT-SArm | Single-arm CogACT model | Single View | 7D | 7B | [🤗 Hugging Face](https://huggingface.co/Dexmal/Dexbotic-CogACT-SArm) |
| Dexbotic-CogACT-HArm | Dual-arm CogACT model with multi-view input | Main View + Left Hand View + Right Hand View | 16D | 7B | [🤗 Hugging Face](https://huggingface.co/Dexmal/Dexbotic-CogACT-HArm) |

We recommend downloading the pretrained models into the following folders:

```bash
mkdir checkpoints
cd checkpoints
git clone https://huggingface.co/Dexmal/Dexbotic-Base Dexbotic-Base
git clone https://huggingface.co/Dexmal/Dexbotic-CogACT-SArm Dexbotic-CogACT-SArm
git clone https://huggingface.co/Dexmal/Dexbotic-CogACT-HArm Dexbotic-CogACT-HArm
```
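
Cloning from the Hub fetches the weight files through git-lfs. As an alternative, the snippet below is a minimal sketch using the `huggingface_hub` Python client, with the repo IDs taken from the clone URLs above:

```python
from huggingface_hub import snapshot_download

# Download each pretrained model into checkpoints/<name>,
# mirroring the git-clone layout above.
for name in ["Dexbotic-Base", "Dexbotic-CogACT-SArm", "Dexbotic-CogACT-HArm"]:
    snapshot_download(repo_id=f"Dexmal/{name}", local_dir=f"checkpoints/{name}")
```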

## Action Dimension Description

Users need to map their data to the action dimension of the pretrained model. If the data dimension is smaller than the pretrained model's action dimension, padding is applied automatically.
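
As a rough illustration of what this padding amounts to, the sketch below zero-pads the last axis of an action array up to the model's action dimension. The function name and the zero-fill are assumptions for illustration; the framework performs this step internally.

```python
import numpy as np

def pad_action(action: np.ndarray, target_dim: int) -> np.ndarray:
    """Zero-pad the last axis of an action array up to target_dim.

    Illustrative only; Dexbotic applies its own padding internally.
    """
    missing = target_dim - action.shape[-1]
    assert missing >= 0, "data dim must not exceed the pretrained action dim"
    widths = [(0, 0)] * (action.ndim - 1) + [(0, missing)]
    return np.pad(action, widths)

# e.g. lift a chunk of 7D single-arm actions into the 16D action space
chunk = np.random.randn(10, 7)   # 10 timesteps of 7D actions
padded = pad_action(chunk, 16)   # shape (10, 16)
```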

We recommend the following data layouts to make full use of the pretrained models (a construction sketch follows the list):

  1. Single-arm end-effector pose: Organize 7D action data as [xyz + rpy + gripper]
  2. Single-arm joint angles: Organize 8D action data as [joints + gripper]
  3. Dual-arm end-effector pose: Organize 14D action data as [left_arm_xyz + left_arm_rpy + left_arm_gripper + right_arm_xyz + right_arm_rpy + right_arm_gripper]
  4. Dual-arm joint angles: Organize 16D action data as [left_arm_joints + left_arm_gripper + right_arm_joints + right_arm_gripper]
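
A minimal sketch of layout 3, the 14D dual-arm end-effector format; the variable names are illustrative placeholders, not part of Dexbotic's API:

```python
import numpy as np

# Hypothetical per-arm readings; the names are placeholders for illustration.
left_xyz,  left_rpy,  left_grip  = np.zeros(3), np.zeros(3), np.zeros(1)
right_xyz, right_rpy, right_grip = np.zeros(3), np.zeros(3), np.zeros(1)

# Layout 3: [left_arm_xyz + left_arm_rpy + left_arm_gripper
#            + right_arm_xyz + right_arm_rpy + right_arm_gripper]
action_14d = np.concatenate([left_xyz, left_rpy, left_grip,
                             right_xyz, right_rpy, right_grip])
assert action_14d.shape == (14,)
```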