The Alibaba AMAP CV Lab conducts computer vision research and applies it to build the core technological capabilities of the spatiotemporal internet. Positioned at the intersection of the physical and digital worlds, we apply AI-driven understanding and generation to smart mobility, daily life, and virtual spaces.
As the core technical driving force behind AMAP, we conduct research that spans the entire chain from perception to generation, and from human-centric intelligence to world modeling. Our work is organized into six major research domains:
- 🗺️ Map & Autonomous Driving: Integrating multimodal perception with high-definition map generation to enable spatial semantic understanding and regulation-aware intelligent driving.
- 🕺🏻 Human-Centric AI: Building AI systems that understand human emotion, identity, and behavior to achieve natural visual generation and interaction.
- 🧭 Embodied Intelligence: Studying agents that perceive, plan, and act within both virtual and physical environments, unifying vision, language, and motion intelligence.
- 🌐 World Modeling: Constructing dynamic, interactive models of the world to empower AI with the ability to understand, predict, and generate complex environments.
- 🧊 3D Generation & Reconstruction: Advancing 3D scene modeling, rendering, and generation with continuous level-of-detail control and physically realistic synthesis.
- 🧠 General Deep Learning: Exploring general representation learning, model optimization, and multimodal alignment as foundational algorithms for spatiotemporal intelligence.
The AMAP CV Lab stands at the forefront of computer vision research and application, serving as a key technical contributor to Alibaba’s spatially intelligent internet.
We believe that AI’s ability to understand the world defines the future of intelligent mobility and everyday life.