Teaching a robot is different from teaching ordinary students; you have to repeat an action hundreds, thousands, and even tens of thousands of times. In a new facility in Wuhan, China, covering up to 12,000 square meters, Chinese graduates spend long hours operating humanoid robots that serve steamed buns, wipe tables, and fold clothes.
Cameras and sensors record every movement made by these machines inside laboratory-like kitchens and bedrooms, built at a cost of 200 million Chinese yuan ($29 million). The Hubei Humanoid Robot Innovation Center is one of dozens of state-funded robot training centers that have proliferated across China to build a massive database specifically for robot training.
Zhang Jia, a 21-year-old program manager, says: “We are like teachers, and the robots are our students. When you teach a human, they get it after a few repetitions. But teaching a robot is different; you have to repeat the action hundreds, thousands, and even tens of thousands of times.” Officials hope that collecting this data will help the nascent humanoid robot industry overcome the key challenges facing artificial intelligence as it moves from the realm of software into the physical world.
This effort is part of President Xi Jinping's push to make China the world's leading superpower in science and technology. Last week, Beijing identified “embodied intelligence” as one of six future sectors to be fostered in its 2026-2030 five-year plan, calling for the development of training centers, AI models, and hardware to accelerate the rollout of humanoid robots.
Experts say the lack of robot-specific training data remains a major hurdle to translating recent AI advances into practical robotics applications. The large language models that power ChatGPT and DeepSeek are trained on vast amounts of text scraped from the internet, but robot data collection is still in its early stages.
Companies in China and the United States use a variety of methods to collect training data, including real-world operation, simulation, and AI-generated data. Tesla has explored using human demonstration videos to help train its Optimus robot, while Silicon Valley startup 1X Technologies aims to place robots in homes, where humans partly control them remotely as they learn.
Beijing’s multi-year plan for the humanoid robot industry includes expanding robot training and data collection. Local governments, from the rich coastal city of Hangzhou to the small inland city of Mianyang, are pumping vast sums of money into creating new training centers. Hubei Province, home to the Wuhan lab, revealed a 10-billion-yuan government fund dedicated to humanoid robots.
Jay Huang, head of Asian industrial technologies at Bernstein Research, said: “It is clear that China has become smarter in how it supports emerging industries that face obstacles, and this is what it is doing through data collection centers. The government supports data sharing, which benefits everyone. It pushes everyone to work in the same direction.”
In Wuhan, Zhang Jia oversees 70 young trainers working eight-hour shifts to train 46 robots. They use remote controls or handheld devices equipped with sensors to operate the machines, repeating the same movements over and over again. Nearby, rows of employees review the video output and annotate the footage every few seconds with labels such as “turn left” or “extend arm.” The site produces about 100 hours of usable data daily, according to Zhang. He said: “We collect and organize the data, then upload it to our platform, where we classify and process it. But we are still in the exploratory phase.”
The idea is to feed AI models for robots, known in the field as “Vision-Language-Action” models, a stream of videos and sensor readings that track the positions, speeds, and torques of robot parts as they move. The approach aims to replicate in robotics the gains achieved by large language models. It could enable machines to acquire more general skills, such as picking up a water bottle, without being specifically programmed for each one.
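To make the description above concrete, here is a minimal sketch of how one recorded demonstration might be stored before it is fed to a model. It is an illustration only: the class names (Frame, Annotation, Episode), the field layout, and the sample values are assumptions, not the format used by the Wuhan center or by any particular Vision-Language-Action model.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Frame:
    """One time step of recorded robot state (hypothetical layout)."""
    t: float                  # seconds since the start of the episode
    positions: List[float]    # one value per joint
    velocities: List[float]   # one value per joint
    torques: List[float]      # one value per joint


@dataclass
class Annotation:
    """A human-written label covering a span of footage."""
    start: float              # inclusive, in seconds
    end: float                # exclusive, in seconds
    label: str                # e.g. "turn left", "extend arm"


@dataclass
class Episode:
    """A single teleoperated demonstration of one task."""
    task: str
    frames: List[Frame] = field(default_factory=list)
    annotations: List[Annotation] = field(default_factory=list)

    def label_at(self, t: float) -> Optional[str]:
        """Return the label whose time span contains t, if any."""
        for a in self.annotations:
            if a.start <= t < a.end:
                return a.label
        return None

    def training_pairs(self) -> List[Tuple[Frame, str]]:
        """Pair each frame with its label, skipping unlabeled frames."""
        pairs = []
        for f in self.frames:
            label = self.label_at(f.t)
            if label is not None:
                pairs.append((f, label))
        return pairs


# Made-up sample values for a two-joint arm.
episode = Episode(
    task="fold clothes",
    frames=[
        Frame(0.0, [0.00, 0.10], [0.0, 0.0], [0.5, 0.2]),
        Frame(1.0, [0.05, 0.12], [0.05, 0.02], [0.6, 0.2]),
        Frame(5.0, [0.30, 0.40], [0.0, 0.0], [0.1, 0.1]),
    ],
    annotations=[
        Annotation(0.0, 2.0, "extend arm"),
        Annotation(2.0, 4.0, "turn left"),
    ],
)
pairs = episode.training_pairs()
```

In this sketch the frame at five seconds falls outside every annotated span and is dropped, which is one simple way to separate usable data from unlabeled footage.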
Zhao Xiang, a co-founder of the startup Motviz, which developed a platform for simulating embodied intelligence, said: “Getting data at scale is not easy at all.” Young engineers in the company's state-backed lab in Wuhan wear VR goggles to train robots, allowing Motviz to collect data more efficiently and at a lower cost. Zhao explained: “Simulation is essential. Achieving a breakthrough in intelligence may require hundreds of millions or even trillions of hours of data.”
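A rough back-of-the-envelope calculation shows the gap Zhao is describing. The sketch below takes the roughly 100 hours of usable data a day that Zhang reports for the Wuhan site and compares it with a target of 100 million hours. That target, the constant daily rate, and the 365-day collection year are all assumptions made for illustration, with 100 million chosen as the low end of "hundreds of millions."

```python
import math

WUHAN_HOURS_PER_DAY = 100      # usable data per day, as reported by Zhang
TARGET_HOURS = 100_000_000     # assumption: low end of "hundreds of millions"
DAYS_PER_YEAR = 365            # assumption: collection runs every day


def years_to_reach(target_hours, hours_per_day, sites=1):
    """Years needed for `sites` identical centers to record the target."""
    return target_hours / (hours_per_day * sites * DAYS_PER_YEAR)


def sites_needed(target_hours, hours_per_day, years):
    """Identical centers needed to record the target within `years`."""
    return math.ceil(target_hours / (hours_per_day * DAYS_PER_YEAR * years))


years_single_site = years_to_reach(TARGET_HOURS, WUHAN_HOURS_PER_DAY)
sites_for_one_year = sites_needed(TARGET_HOURS, WUHAN_HOURS_PER_DAY, years=1)

print(f"{years_single_site:,.0f} years at one site")
print(f"{sites_for_one_year:,} sites to finish in one year")
```

Under these assumptions a single center of this size would need about 2,740 years, or about 2,740 such centers would have to run for a full year, which helps explain why simulation is seen as a necessary complement to physical data collection.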
Experts point to another fundamental challenge for these ambitions: for now, data collected from one robot cannot easily be reused on another robot with different components. Huang of Bernstein said that data portability remains an active area of research, with progress expected. For example, AI models for robots from Google DeepMind have shown promise in transferring some skills between different hardware platforms.
China’s approach was outlined in a strategy document issued by the Ministry of Industry and Information Technology for developing the humanoid robot industry through the end of 2027. The plan identifies large-scale training databases and high-quality multimodal data as essential elements for building the “brain” of humanoid robots. Local governments have rushed to support the program, aiming to create jobs and nurture future industries in their regions.
But even top robot researchers and engineers involved in data collection said it remains unclear whether the effort will deliver the desired technological gains. There is one tangible benefit: robot purchases by data collection centers have helped humanoid robot manufacturers in China while actual demand for their devices is still emerging. The Wuhan center purchased dozens of robots from Shanghai-based Agibot at a price of 350,000 yuan per robot. Bernstein analysts estimate that sales to data collection centers accounted for about a fifth of the more than 20,000 humanoid robot shipments in China last year.