The Robot That Needs a Human to Hold Its Hand
X Square Robot has just raised $276M and counts ByteDance, Alibaba, Meituan and Xiaomi among its backers. The robots it has deployed in Shenzhen homes still need a human operator on call to correct them: $21.90 per three-hour shift for a machine that still puts slippers in the kitchen.

Four of China's biggest data-harvesting platforms are about to have a permanent physical presence in your home. Not an app. Not a suggestion. A robot.
X Square Robot, a Shenzhen startup backed by Xiaomi, ByteDance, Alibaba, and Meituan, this week unveiled Wall-B, its jointly trained embodied AI model. The pitch is autonomous household robots in 35 days. The company has already placed machines in Shenzhen homes through a partnership with 58.com, a classifieds and services platform. A three-hour shift costs 149 yuan, about $21.90. When the robot puts slippers in the kitchen or freezes mid-task, a company employee intervenes remotely, The Independent and Reuters confirmed.
That employee standing by is the part that never appears in the product announcement.
The $276 million Series B, disclosed this week, was led by Xiaomi's strategic investment arm. Total disclosed funding across two rounds in four months reaches roughly $416 million, with earlier backing from ByteDance, HongShan, Alibaba, and Meituan, China Daily and The AI Insider reported. The company was founded in December 2023, operating then as Variable Robotics Technology Co.
ByteDance, Alibaba, Meituan, and Xiaomi are not passive investors here. They are infrastructure. ByteDance's algorithm learns what holds a household's attention. Alibaba knows what that household buys. Meituan knows where it eats and shops. Xiaomi, through its smart device ecosystem, knows what is inside the home. A robot that navigates that home in three dimensions, in real time, learning the floor plan, the objects, the rhythms of daily life, is not comparable to a phone sitting in a pocket. It is the difference between a landlord who knows your name and one who has a key and a camera in every room.
That is the data play. The business model is the 58.com bundle: robot plus human cleaner, pay-as-you-go, generating real-world training data from every shift. CEO Wang Qian calls it a learning loop. The faster the deployment, the faster the model improves. It is a coherent theory of the case. It is also a theory that requires a human safety net permanently on call.
The architecture behind Wall-B is a genuine engineering departure. Rather than bolting on vision, language, and motion as separate modules, X Square trains perception, language, and physical prediction together from day one, PR Newswire reported. The argument is that home robots face a fundamentally different problem from factory robots: generalization rather than repetition. A factory robot performs one action 10,000 times. A home robot must perform 10,000 different actions, each in a slightly different context. Joint training is one way to attack that long tail. Whether Wall-B actually does so cannot be verified without access to the model or independent benchmarks.
The 35-day deployment claim points to May 25. What is promised for that date is a pilot in more than 50 households, not a general release. The claim that household labor represents roughly 20% of GDP, and that the market is therefore 20% of GDP, is CEO arithmetic, The Independent noted, not independent analysis.
The home robotics sector has spent years showing what it can do on stage. Humanoids run half-marathons, do backflips, dance in formation. What they cannot do, reliably, is load a dishwasher or fold a towel, Reuters reported. The gap between demo and domestic competence is not accidental. Viral footage of a robot putting slippers in the wrong room is bad for fundraising.
What X Square has built is a telepresence device with arms, bundled with a human worker to cover its gaps. That is a defensible early business. It is not what the press release describes.
The question for May 25 is whether the 35-day deployment expands that model or escapes it.





