ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning

February 11, 2026

ABot-M0

ABot-M0 is a general-purpose robotics model with the following core highlights:

Overview of ABot-M0.

  • Massive & Unified Data: It integrates over 6 million open-source trajectories, forming one of the largest unified datasets for robotic manipulation and providing a strong foundation for generalization.
  • Innovative Action Paradigm: It pioneers Action Manifold Learning (AML), which directly predicts clean actions instead of noise, resulting in a more efficient and stable model.
  • Modular 3D Perception: It supports plug-and-play modules to enhance 3D spatial understanding, improving execution precision for complex tasks.

UniACT-dataset

UniACT-dataset: a large-scale unified robotic manipulation dataset behind ABot-M0.

Robotic data suffers from fragmentation and inconsistent representations. To address this challenge, we constructed UniACT-dataset, one of the largest non-private robotic manipulation datasets to date.

Data Sources and Scale

  • Integration: It aggregates 6 major public datasets, including OXE, OXE-AugE, and AgiBot-Beta.
  • Scale: The dataset comprises over 6 million trajectories, amounting to 9,500+ hours of interaction data.
  • Diversity: It covers more than 20 different robot morphologies.

Systematic Data Curation Pipeline

  • Cleaning: We filter out data with invalid instructions, visual anomalies, and erroneous actions.
  • Standardization: All actions are unified into delta actions in the end-effector coordinate frame, with rotations adopting the more stable rotation vector representation.
  • Unification: A “pad-to-dual” strategy is employed to enable a single model to handle both single-arm and dual-arm tasks.
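The "pad-to-dual" idea can be illustrated with a small numpy sketch. The 7-D per-arm layout (3 delta-position + 3 rotation-vector + 1 gripper dimensions), the function name, and the validity mask are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

SINGLE_ARM_DIM = 7          # assumed layout: 3 delta-pos + 3 rotation-vector + 1 gripper
DUAL_ARM_DIM = 2 * SINGLE_ARM_DIM

def pad_to_dual(action: np.ndarray, arm: str = "right"):
    """Pad a single-arm action chunk (T, 7) to the dual-arm layout (T, 14).

    Returns the padded chunk plus a per-dimension validity mask so a
    training loss can ignore the inactive (zero-padded) arm.
    """
    T = action.shape[0]
    padded = np.zeros((T, DUAL_ARM_DIM), dtype=action.dtype)
    mask = np.zeros(DUAL_ARM_DIM, dtype=bool)
    # Place the real action in the left or right half of the dual-arm vector.
    sl = slice(0, SINGLE_ARM_DIM) if arm == "left" else slice(SINGLE_ARM_DIM, DUAL_ARM_DIM)
    padded[:, sl] = action
    mask[sl] = True
    return padded, mask

# A single-arm chunk of 2 timesteps becomes a (2, 14) dual-arm chunk.
chunk = np.ones((2, SINGLE_ARM_DIM))
padded, mask = pad_to_dual(chunk, arm="right")
```

With this convention, one model consumes a fixed 14-D action space regardless of embodiment, and the mask keeps padded dimensions out of the loss.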

Action Manifold Learning (AML)

Action Manifold Learning (AML): learning robot actions on a low-dimensional, structured manifold.

Conventional diffusion models typically learn to predict noise, an approach that is inefficient and unstable for robotic control. We propose the Action Manifold Hypothesis: effective robot actions are not randomly distributed in a high-dimensional space, but rather lie on a low-dimensional, smooth manifold shaped by physical laws and task constraints.

Building on this, we design Action Manifold Learning (AML):

  • Direct Action Prediction (a-prediction): The model directly outputs clean action sequences instead of high-dimensional, unstructured noise.
  • Improved Efficiency and Stability: The learning objective shifts from “fitting noise” to “projecting onto the feasible manifold.” This enables the model to learn meaningful action structures more efficiently, thereby boosting decoding speed and policy stability.
  • Enhanced Scalability: The advantages of AML become more pronounced when handling higher-dimensional actions (e.g., for dexterous hands or whole-body control), laying a foundation for the long-term development of embodied intelligence.
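The difference between the two prediction targets can be sketched with a toy numpy example under a standard DDPM-style forward process. The action dimensionality, noise level, and variable names are illustrative assumptions, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(x0, abar, eps):
    """Standard forward diffusion step: x_t = sqrt(abar)*x0 + sqrt(1-abar)*eps."""
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps

# Toy clean action chunk: a smooth, structured 7-D action.
x0 = np.sin(np.linspace(0.0, np.pi, 7))[None, :]   # (1, 7)
eps = rng.standard_normal(x0.shape)                # unstructured Gaussian noise
abar = 0.5                                         # illustrative schedule value
xt = add_noise(x0, abar, eps)

# epsilon-prediction: the regression target is the unstructured noise.
eps_target = eps
# a-prediction (AML): the regression target is the clean action itself.
a_target = x0

# An epsilon-prediction model only recovers the action indirectly,
# through the noisy input and the noise schedule:
x0_from_eps = (xt - np.sqrt(1.0 - abar) * eps_target) / np.sqrt(abar)
```

Both parameterizations are algebraically related, but the a-prediction target lies on the structured action manifold, while the noise target is isotropic Gaussian; regressing the former lets the model learn action structure directly.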

Results

Success rates (%) on the LIBERO benchmark. ABot-M0 is one policy for all 4 suites.
| Method | L-Spatial | L-Object | L-Goal | L-Long | Average |
| --- | --- | --- | --- | --- | --- |
| Diffusion Policy | 78.5 | 87.5 | 73.5 | 64.8 | 76.1 |
| OpenVLA | 84.7 | 88.4 | 79.2 | 53.7 | 76.5 |
| SpatialVLA | 88.2 | 89.9 | 78.6 | 55.5 | 78.1 |
| CoT-VLA | 87.5 | 91.6 | 87.6 | 69.0 | 83.9 |
| π₀-Fast | 96.4 | 96.8 | 88.6 | 60.2 | 85.5 |
| GR00T-N1 | 94.4 | 97.6 | 93.0 | 90.6 | 93.9 |
| π₀ | 98.0 | 96.8 | 94.4 | 88.4 | 94.4 |
| F1 | 98.2 | 97.8 | 95.4 | 91.3 | 95.7 |
| InternVLA-M1 | 98.0 | 99.0 | 93.8 | 92.6 | 95.9 |
| Discrete Diffusion VLA | 97.2 | 98.6 | 97.4 | 92.0 | 96.3 |
| π₀.₅ | 98.8 | 98.2 | 98.0 | 92.4 | 96.9 |
| GR00T-N1.6 | 97.7 | 98.5 | 97.5 | 94.4 | 97.0 |
| OpenVLA-OFT | 97.6 | 98.4 | 97.9 | 94.5 | 97.1 |
| X-VLA | 98.2 | 98.6 | 97.8 | 97.6 | 98.1 |
| ABot-M0 (Ours) | 98.8 | 99.8 | 99.0 | 96.6 | 98.6 |

Video

We showcase representative rollouts from the four benchmark suites used to evaluate ABot-M0.

LIBERO

LIBERO-Plus

RoboCasa

RoboTwin 2.0

Author Team

Author contributions by area:

  • Data Collection & Analysis: Yandan Yang, Haoyun Liu, Ronghan Chen, Yuzhi Chen, Dekang Qi
  • Data Standardization: Yandan Yang, Tong Lin, Xinyuan Chang
  • Data Pipeline: Tong Lin, Shuang Zeng, Dongjie Huo
  • Model Architecture: Shuang Zeng, Junjin Xiao
  • Training: Shuang Zeng, Tong Lin, Junjin Xiao
  • Evaluation: Junjin Xiao, Shuang Zeng, Tong Lin
  • Project Lead: Xinyuan Chang, Feng Xiong
  • Advisors: Mu Xu†, Zhiheng Ma, Xing Wei

Citation

If you find our work helpful, please cite us:


@article{abot-m0,
  title={ABot-M0: VLA Foundation Model for Robotic Manipulation with Action Manifold Learning},
  author={AMAP CV Lab},
  year={2026}
}

Thank you!