Embodied AI moves from screen to surgical suite in major healthcare shift
The COVID-19 pandemic accelerated the adoption of digital health tools, but the next evolution is already underway. Artificial intelligence is no longer confined to remote monitoring and data analysis. It is being embedded into machines that move, sense and intervene directly in patient care.
This transition is explored in "Embodied Artificial Intelligence in Healthcare: A Systematic Review of Robotic Perception, Decision-Making, and Clinical Impact," published in Healthcare. Researchers evaluate how embodied AI systems are advancing surgical robotics, stroke rehabilitation, hospital transport and telepresence medicine.
The researchers conducted a systematic review following PRISMA guidelines and a registered PROSPERO protocol. Out of 2,847 screened records, only 17 met strict inclusion criteria, underscoring both the rapid growth and the early-stage maturity of the field. The selected studies spanned randomized controlled trials, technical validations, feasibility studies and cohort analyses across multiple countries.
From perception to action: How embodied AI systems work
Embodied AI in healthcare operates through a layered model that links perception, decision processes, and physical action. Perception modules rely heavily on multimodal sensor fusion. Nearly all included studies used RGB cameras, while many incorporated depth sensing technologies such as structured light or stereo vision. In surgical and rehabilitation settings, force and torque sensors were critical to ensure safe physical contact with tissue or limbs. Some telepresence and rehabilitation platforms also integrated physiological signals such as electromyography or vital signs to guide adaptive responses.
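The gating logic described above can be sketched in a few lines. This is an illustrative toy, not any system from the review: the sensor names, thresholds, and the `fuse_and_gate` function are all hypothetical, but they show why a confident camera reading alone is not enough to authorize motion near tissue.

```python
from dataclasses import dataclass

@dataclass
class SensorFrame:
    """One synchronized reading from a hypothetical multimodal sensor stack."""
    rgb_confidence: float  # detection confidence from the camera pipeline, 0..1
    depth_mm: float        # distance to nearest obstacle from depth sensing, mm
    force_n: float         # contact force from a wrist force/torque sensor, N

def fuse_and_gate(frame: SensorFrame,
                  min_clearance_mm: float = 50.0,
                  max_force_n: float = 2.0) -> str:
    """Combine sensing streams into a single safe action decision.

    Force and depth readings can veto a visually confident action,
    mirroring the safety role of force/torque sensing described above.
    """
    if frame.force_n > max_force_n:
        return "retract"   # contact with tissue or limb too firm: back off
    if frame.depth_mm < min_clearance_mm:
        return "hold"      # obstacle too close: pause until clearance returns
    if frame.rgb_confidence < 0.5:
        return "hold"      # visual scene understanding too weak to act on
    return "proceed"

print(fuse_and_gate(SensorFrame(rgb_confidence=0.9, depth_mm=120.0, force_n=0.3)))  # proceed
print(fuse_and_gate(SensorFrame(rgb_confidence=0.9, depth_mm=120.0, force_n=3.1)))  # retract
```

Real systems fuse continuous estimates rather than applying hard vetoes, but the priority ordering, with contact safety first, is the essential design choice.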
This multimodal design reflects the complexity of clinical environments. Unlike industrial robotics operating in controlled factories, healthcare robots must navigate unpredictable human movement, variable lighting conditions, dynamic obstacles and fragile biological tissues. Visual input alone is insufficient. Combining sensory streams allows robots to build a more accurate understanding of their surroundings and respond safely.
Decision-making architectures largely rely on machine learning. In surgical applications, imitation learning dominates. Robots learn by observing expert demonstrations and replicating surgical maneuvers. Vision–language–action models, which connect visual input to linguistic instruction and motor output, represent a new direction in this space. Reinforcement learning, which trains robots through trial-and-error optimization in simulated environments, has gained traction in laparoscopic surgery through standardized frameworks that allow consistent benchmarking.
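In its simplest form, the imitation learning described here is supervised regression on expert state–action pairs (behavior cloning). The sketch below uses synthetic data, a hypothetical "expert" gain matrix, and a linear policy; real surgical systems learn far richer mappings, but the fit-to-demonstrations structure is the same.

```python
import numpy as np

# Hypothetical expert demonstrations: state = tool-tip position error (x, y),
# action = velocity command. The expert simply moves to cancel the error.
rng = np.random.default_rng(0)
states = rng.uniform(-1, 1, size=(200, 2))
expert_gain = np.array([[-0.8, 0.0],
                        [0.0, -0.8]])
actions = states @ expert_gain.T

# Behavior cloning: fit a linear policy a = s @ K by least squares.
K, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The cloned policy reproduces the expert's corrective maneuver on a new state.
test_state = np.array([0.5, -0.25])
print(np.round(test_state @ K, 3))  # close to the expert's [-0.4, 0.2]
```

Reinforcement learning differs in that no demonstrations are given: the policy is instead improved through trial-and-error rollouts against a reward signal, which is why simulated laparoscopic environments matter for benchmarking.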
Rehabilitation robotics frequently employs hybrid approaches that blend reinforcement learning with adaptive impedance control, enabling devices to adjust resistance and assistance based on patient progress. In hospital logistics, simultaneous localization and mapping algorithms guide autonomous navigation, while human-aware motion planning ensures safe interaction with staff and patients.
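The adaptive impedance idea can be illustrated with a one-line update law: stiffen (assist more) when the patient tracks worse than a target, relax when they track better. The function name, thresholds, and adaptation rate below are all invented for illustration; clinical devices use richer, validated controllers.

```python
def adapt_impedance(stiffness: float, tracking_error: float,
                    target_error: float = 0.05, rate: float = 0.2,
                    k_min: float = 5.0, k_max: float = 100.0) -> float:
    """One step of a performance-based stiffness adaptation (illustrative).

    Error above the target raises stiffness (more robot assistance);
    error below it lowers stiffness (assist-as-needed), clamped to bounds.
    """
    stiffness += rate * (tracking_error - target_error) * stiffness
    return min(max(stiffness, k_min), k_max)

# Simulated sessions: the patient's tracking error falls as they recover,
# so stiffness first rises above target error, then relaxes below it.
k = 40.0
for err in [0.12, 0.10, 0.06, 0.04, 0.03]:
    k = adapt_impedance(k, err)
print(round(k, 1))
```

The clamping bounds matter in practice: they keep a misestimated error signal from driving the device to unsafe stiffness in either direction.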
Across domains, autonomy levels vary. Surgical systems often operate under conditional or supervised autonomy, with human oversight maintained for safety. Logistics robots may function at higher autonomy levels in designated corridors. Telepresence platforms typically use shared control models, where clinicians direct high-level actions and robots manage low-level navigation and stability.
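The shared control split can be made concrete: the clinician supplies *where* to go, and the robot decides *how*, vetoing any step that violates obstacle clearance. The function and geometry below are a minimal 2D sketch under assumed units, not a navigation stack from the reviewed studies.

```python
def shared_control(operator_goal, robot_pose, obstacles,
                   step=0.5, clearance=1.0):
    """Blend a clinician's high-level goal with robot-level safety.

    Takes one step toward the operator's goal unless that step would
    bring the robot within `clearance` of an obstacle, in which case
    the low-level controller holds position (a safety veto).
    """
    dx = operator_goal[0] - robot_pose[0]
    dy = operator_goal[1] - robot_pose[1]
    dist = (dx * dx + dy * dy) ** 0.5
    if dist < 1e-9:
        return robot_pose  # already at the goal
    candidate = (robot_pose[0] + step * dx / dist,
                 robot_pose[1] + step * dy / dist)
    for ox, oy in obstacles:
        if ((candidate[0] - ox) ** 2 + (candidate[1] - oy) ** 2) ** 0.5 < clearance:
            return robot_pose  # low-level safety veto: hold position
    return candidate

print(shared_control((10, 0), (0, 0), [(5, 5)]))      # steps toward the goal
print(shared_control((10, 0), (0, 0), [(0.5, 0.2)]))  # holds: obstacle too near
```

Real shared-control systems replace the hold-in-place veto with local replanning, but the division of authority, human intent on top, machine safety underneath, is the same.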
Clinical gains and measured impact
The review categorizes outcomes into technical performance, functional improvement and operational feasibility. In surgical robotics, one of the most notable advances is the STAR system, which demonstrated autonomous intestinal anastomosis in animal models. The robot achieved greater consistency in suture spacing and bite depth compared to expert surgeons in controlled settings. While these findings represent a milestone in surgical automation, the authors emphasize that validation in human patients and regulatory approval remain necessary before clinical deployment.
Surgical research also highlights advances in autonomous camera positioning and partial task automation. Vision–language models enable robots to interpret high-level instructions and translate them into surgical motions, suggesting a move toward more flexible and generalizable systems.
Rehabilitation robotics presents the strongest clinical evidence base. An umbrella synthesis covering 396 randomized controlled trials found a pooled standardized mean difference of 0.29 in Fugl-Meyer Assessment scores for upper limb recovery after stroke when robot-assisted therapy was compared with conventional therapy. This effect size, though modest, is statistically significant and supported by multiple meta-analyses. Some multicenter trials demonstrated improvements in motor function metrics, indicating that robot-assisted therapy can enhance structured, repetitive rehabilitation tasks.
However, the authors caution that statistical significance does not automatically equate to meaningful clinical transformation. Questions remain regarding cost-effectiveness, patient selection criteria and integration into existing therapy workflows. Rehabilitation robots often require substantial equipment investment and clinician training.
Hospital logistics robotics demonstrated high operational success rates. Navigation systems achieved 94 to 98 percent success without human intervention in controlled hospital corridors. These robots transported medications, linens and laboratory samples, reducing staff exposure during infectious disease outbreaks and improving workflow efficiency. Yet the studies were often conducted in structured environments with predefined obstacle configurations. Real-world deployment in crowded or complex hospital layouts may yield different results.
Telepresence robotics gained momentum during the COVID-19 pandemic. Systems deployed in isolation wards achieved task completion rates of approximately 87 percent for predefined clinical activities. Nurses and physicians reported positive usability and acceptance in pilot trials, and patients expressed satisfaction with remote interactions in urology and emergency departments. Still, most telepresence studies focused on feasibility and user perception rather than direct measurement of patient health outcomes.
Barriers to translation and the road ahead
Despite technological progress, the review underscores substantial barriers to widespread adoption. Many studies involved small sample sizes, single-center trials or technical validation rather than large-scale clinical deployment. Risk of bias was common, particularly in non-randomized designs. The limited number of eligible studies reflects the emerging nature of embodied AI in healthcare.
Economic evaluation is notably scarce. The acquisition, maintenance and integration costs of advanced robotic systems are significant. Without rigorous cost-effectiveness analyses, healthcare administrators face uncertainty in adoption decisions. The review calls for health economic assessment to be integrated into future trial design.
Safety frameworks represent another critical gap. Learning-based systems may exhibit emergent behaviors that are difficult to predict with traditional verification methods. The authors recommend developing AI-specific safety-case methodologies, including hazard identification, risk mitigation strategies and continuous monitoring protocols.
Regulatory pathways also require adaptation. Existing frameworks were designed for static medical devices rather than continuously learning systems. Accountability for adverse outcomes becomes complex when autonomous decision-making is involved. Ethical considerations, including informed consent and patient trust, demand new governance models.
The workforce dimension cannot be overlooked. Training requirements for clinicians working alongside embodied AI systems remain underdefined. Certification pathways and standardized curricula are needed to ensure safe human–robot collaboration. Organizational readiness, workflow integration and staff acceptance may ultimately determine adoption success more than technical capability.
Future research priorities include the development of domain-specific benchmarking suites beyond surgical frameworks. Comparative configuration studies should examine which combinations of sensors, algorithms and autonomy levels produce superior patient outcomes. Prospective multicenter trials with patient-centered endpoints are necessary to move from experimental validation to routine clinical use. Hybrid effectiveness–implementation designs could accelerate real-world translation.
FIRST PUBLISHED IN: Devdiscourse