Develop Your Own Model

This guide shows the minimal changes required to add a new model to Dexbotic.

You only need to define three classes:

  1. Config – extend DexboticConfig with your custom parameters.
  2. Model – extend DexboticVLMModel to define how the model is built.
  3. ForCausalLM – extend DexboticForCausalLM to define the training and inference logic (the forward pass).

1. Config Example

python
# mymodel_arch.py
from dataclasses import dataclass, field
from dexbotic.modeling.dexbotic_arch import DexboticConfig

@dataclass
class MyModelConfig(DexboticConfig):
    """Custom config for MyModel"""
    action_dim: int = field(default=7)       # e.g. 3 pos + 3 rot + 1 gripper
    chunk_size: int = field(default=16)      # length of action sequence
    action_head_arg: str = field(default='xxx')  # example argument

👉 This is similar to CogActConfig in cogact_arch.py.
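
To see how such a config behaves when some defaults are overridden, here is a minimal sketch that runs without Dexbotic installed. BaseConfigStandIn and its hidden_size field are hypothetical stand-ins for DexboticConfig, and the override values are purely illustrative. Whether the real DexboticConfig accepts keyword overrides in exactly this way is an assumption.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class BaseConfigStandIn:
    """Hypothetical stand-in for DexboticConfig so the sketch is self-contained."""
    hidden_size: int = field(default=1024)  # illustrative field, not taken from Dexbotic


@dataclass
class MyModelConfig(BaseConfigStandIn):
    """Same custom fields as in the example above."""
    action_dim: int = field(default=7)
    chunk_size: int = field(default=16)
    action_head_arg: str = field(default='xxx')


# Defaults apply unless a field is overridden at construction time.
default_cfg = MyModelConfig()
custom_cfg = MyModelConfig(action_dim=14, chunk_size=8)  # illustrative values only

# Inherited and custom fields end up in one flat config object.
print(asdict(custom_cfg))
```

Keeping every custom parameter on the config, rather than passing it to the model separately, means the model constructor only ever needs the config object.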

2. Model Example

To implement a custom model based on DexboticVLMModel, you should:

  1. Implement your own _build_xxx_module methods (e.g., _build_action_head_module, _build_mm_vision_module, etc.) for each submodule you want to support.

  2. In each _build_xxx_module method, always check if the module has already been built (e.g., via getattr(self, 'xxx', None)) to avoid rebuilding or overwriting existing modules.

  3. In initialize_model, you can update config with extra_config, and then rebuild all submodules as needed.

Note:

  • Implementing every custom submodule as a _build_xxx_module method is recommended, since it keeps future extension and management consistent.
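
The build-once check described above can be tried in isolation. The following is a sketch in plain Python, not Dexbotic code: ActionHeadStandIn and ModelSketch are hypothetical placeholders for a real submodule and a DexboticVLMModel subclass. It only demonstrates that a guarded builder returns the existing module instead of constructing a second one.

```python
class ActionHeadStandIn:
    """Hypothetical placeholder for a real submodule such as an action head."""
    instances_built = 0  # counts how many times the constructor ran

    def __init__(self, action_dim: int):
        ActionHeadStandIn.instances_built += 1
        self.action_dim = action_dim


class ModelSketch:
    """Hypothetical stand-in for a DexboticVLMModel subclass."""

    def __init__(self, action_dim: int):
        self.action_dim = action_dim

    def _build_action_head_module(self):
        # Reuse the module if one is already attached to self.
        existing = getattr(self, 'action_head', None)
        if existing is not None:
            return existing
        # Otherwise construct it once and remember it.
        self.action_head = ActionHeadStandIn(self.action_dim)
        return self.action_head


model = ModelSketch(action_dim=7)
first = model._build_action_head_module()
second = model._build_action_head_module()

# Both calls return the same object, and the constructor ran only once.
print(first is second, ActionHeadStandIn.instances_built)
```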

python
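import torch
import torch.nn as nn
from dexbotic.modeling.dexbotic_arch import DexboticVLMModel

# MyModelConfig comes from the Config example above.
# build_action_model is not defined in this guide: supply or import your own
# factory that takes the config and returns the action head (an nn.Module).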

class MyModel(DexboticVLMModel):
    def __init__(self, config: MyModelConfig):
        super().__init__(config)
        if config.action_head_arg is not None:
            self.action_head = self._build_action_head_module(config)

    def _build_action_head_module(self, config: MyModelConfig):
        if getattr(self, 'action_head', None) is not None:
            return self.action_head
        self.action_head = build_action_model(config)
        return self.action_head

    @property
    def action_head_module(self) -> nn.Module:
        return self.action_head

    @property
    def action_head_prefix(self) -> str:
        return 'action_head'

    def initialize_model(self, extra_config: dict):
        for key, value in extra_config.items():
            setattr(self.config, key, value)
        self.mm_vision_tower = self._build_mm_vision_module(self.config.mm_vision_tower)
        self.mm_projector = self._build_mm_projector_module(self.config)
        self.action_head = self._build_action_head_module(self.config)

👉 This mirrors what CogActModel does in cogact_arch.py.
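
The interaction between extra_config and the build-once check is worth seeing on its own. Below is a minimal sketch using plain Python stand-ins: InitSketch is hypothetical, the config is a SimpleNamespace, and the "module" is just a dict recording the values it was built from. It assumes the same control flow as initialize_model above and says nothing about how Dexbotic's own builders behave internally.

```python
from types import SimpleNamespace


class InitSketch:
    """Hypothetical stand-in that follows the initialize_model flow shown above."""

    def __init__(self, config):
        self.config = config
        self.action_head = None  # nothing built yet

    def _build_action_head_module(self, config):
        # Build-once check: keep the existing module if there is one.
        if getattr(self, 'action_head', None) is not None:
            return self.action_head
        # Stand-in "module": records the config values used to build it.
        self.action_head = {
            'action_dim': config.action_dim,
            'chunk_size': config.chunk_size,
        }
        return self.action_head

    def initialize_model(self, extra_config: dict):
        # Copy every override onto the config, then build from the updated config.
        for key, value in extra_config.items():
            setattr(self.config, key, value)
        self.action_head = self._build_action_head_module(self.config)


config = SimpleNamespace(action_dim=7, chunk_size=16)
model = InitSketch(config)

# The override is applied before the head is first built.
model.initialize_model({'chunk_size': 8})
print(model.config.chunk_size, model.action_head)

# A later override updates the config, but the existing head is kept as built.
model.initialize_model({'chunk_size': 4})
print(model.config.chunk_size, model.action_head['chunk_size'])
```

In this sketch, overrides only affect modules that have not been built yet. If a submodule must pick up a changed setting, it would need to be cleared or rebuilt explicitly before the builder is called again.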

3. ForCausalLM Example

Subclass DexboticForCausalLM, build the model and the LM head in _real_init, and write your own forward method.

python
import torch.nn as nn
from dexbotic.modeling.dexbotic_arch import DexboticForCausalLM, CausalLMOutputDexbotic

# MyModelConfig and MyModel come from the examples above.


class MyModelForCausalLM(DexboticForCausalLM):
    config_class = MyModelConfig

    def _real_init(self, config: MyModelConfig):
        self.model = MyModel(config)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.post_init()

    def forward(self,
                ...
                ) -> CausalLMOutputDexbotic:

        ...

        return CausalLMOutputDexbotic(
            ...

        )

👉 This is the minimal wrapper, just like CogACTForCausalLM.