Brown Team Uses Dog-Inspired Cues to Boost Robot Fetch Accuracy to 89%

Brown University researchers say they’ve improved robot object-finding by combining two human signals most systems treat separately: language and pointing gestures.
In work led by Ivy Xiao He, the team reports an 89% success rate in complex object-retrieval environments using a new planning framework called LEGS-POMDP (Language and Gesture-Guided Object Search in Partially Observable Environments), according to Brown’s announcement and the arXiv preprint.
The system uses a partially observable Markov decision process (POMDP) planner to handle uncertainty (for example, when objects are partially hidden, duplicated, or visually ambiguous) and updates its belief as the robot gathers more evidence. Critically, it fuses natural-language instructions with a gesture model informed by canine cognition research from Brown's Dog Lab.
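To make the belief-update idea concrete, here is a minimal sketch of the Bayesian filtering a POMDP-style searcher performs over candidate object locations. This is our illustration, not code from the paper; the uniform reset when an observation contradicts every hypothesis is one common convention, assumed here.

```python
import numpy as np

def update_belief(belief: np.ndarray, likelihood: np.ndarray) -> np.ndarray:
    """Bayesian belief update over candidate object locations.

    belief:     prior probability per location (sums to 1)
    likelihood: P(observation | object at location) per location
    """
    posterior = belief * likelihood
    total = posterior.sum()
    if total == 0:
        # Observation ruled out every hypothesis; fall back to uniform.
        return np.full_like(belief, 1.0 / belief.size)
    return posterior / total

# Example: three candidate locations; the camera weakly favors location 1.
belief = np.array([1/3, 1/3, 1/3])
belief = update_belief(belief, np.array([0.2, 0.7, 0.1]))
print(belief)  # probability mass shifts toward location 1
```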
The gesture model treats pointing as a probability cone rather than a single exact target, based on human eye-gaze and arm geometry (eye-elbow-wrist alignment). That gives the robot a more realistic estimate of what a person likely means when they point.
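As an illustration of the cone idea, the sketch below scores a candidate object by its angular deviation from a ray cast from the eye through the wrist. The eye-to-wrist ray, the Gaussian falloff, and the 15-degree cone width are our assumptions for the example; the paper's exact geometry and distribution may differ.

```python
import numpy as np

def pointing_likelihood(eye: np.ndarray, wrist: np.ndarray,
                        obj: np.ndarray, sigma_deg: float = 15.0) -> float:
    """Score an object under a probabilistic pointing cone.

    The cone axis runs from the eye through the wrist (one common
    ray model). Likelihood falls off with angular deviation via a
    Gaussian with standard deviation sigma_deg (an assumed width).
    """
    axis = wrist - eye
    axis /= np.linalg.norm(axis)
    to_obj = obj - eye
    to_obj /= np.linalg.norm(to_obj)
    angle = np.degrees(np.arccos(np.clip(axis @ to_obj, -1.0, 1.0)))
    return float(np.exp(-0.5 * (angle / sigma_deg) ** 2))

# Example positions in meters: a person pointing roughly at a mug.
eye   = np.array([0.0, 0.0, 1.6])
wrist = np.array([0.4, 0.1, 1.2])
mug   = np.array([2.0, 0.4, 0.8])
print(pointing_likelihood(eye, wrist, mug))  # near 1.0 if well aligned
```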
The team then combines that gesture signal with a vision-language model, so the robot can reason jointly over what a user says and where they appear to indicate. In lab tests on a quadruped robot, multimodal fusion outperformed language-only or gesture-only approaches.
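A back-of-the-envelope version of that fusion: if each cue yields a per-object score, treating the cues as independent and multiplying is the simplest combination. The scores below are invented, and the independence assumption is ours, not necessarily the paper's.

```python
import numpy as np

# Hypothetical per-object scores for the instruction "the red mug":
language_score = np.array([0.6, 0.3, 0.1])  # e.g., from a vision-language model
gesture_score  = np.array([0.1, 0.8, 0.1])  # from a pointing-cone model

# Naive fusion: multiply the cues, then renormalize to a distribution.
fused = language_score * gesture_score
fused /= fused.sum()
print(fused)  # object 2 wins: plausible by language AND pointed at
```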
The paper is scheduled for presentation at the ACM/IEEE International Conference on Human-Robot Interaction (HRI 2026) in Edinburgh.
What this means
This isn’t a flashy new foundation model. It’s something more useful: better interaction design for real robots. The notable step is not “robots understand language now” — we’ve heard that for years — but that the team formalizes messy human communication (words + pointing + ambiguity) into a deployable planning loop. If this transfers beyond lab setups, it could materially improve assistant robots in homes, hospitals, and warehouses.
