EmbodiedAgents
The intelligence layer of EMOS – production-grade orchestration for Physical AI
EmbodiedAgents enables you to create interactive, physical agents that do not just chat, but understand, move, manipulate, and adapt to their environment. It bridges the gap between foundation AI models and real-world robotic deployment, offering a structured yet flexible programming model for building adaptive intelligence.
Production-Ready Physical Agents – Designed for autonomous systems in dynamic, real-world environments. Components are built around ROS2 Lifecycle Nodes with deterministic startup, shutdown, and error-recovery. Health monitoring, fallback behaviors, and graceful degradation are built in from the ground up.
Self-Referential and Event-Driven – Agents can start, stop, or reconfigure their own components based on internal and external events. Switch from cloud to local inference, swap planners based on vision input, or adjust behavior on the fly. In the spirit of Gödel machines, agents can introspect and modify their own execution graph at runtime.
Semantic Memory & Agentic Planning – Hierarchical spatio-temporal memory and semantic routing for arbitrarily complex agentic information flow. The graph-backed Memory component keeps an episodic, entity-aware record of what the robot perceives and of its own internal state, while Cortex turns plain-language goals into ordered calls against every component in the graph – no bloated GenAI frameworks required.
Pure Python, Native ROS2 – Define complex asynchronous execution graphs in standard Python without touching XML launch files. Underneath, everything is pure ROS2 – fully compatible with the entire ecosystem of hardware drivers, simulation tools, and visualization suites.
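To make the event-driven idea above concrete, here is a minimal, framework-free sketch of a graph that reconfigures one of its own components when an event fires (the cloud-to-local inference switch mentioned earlier). Every name in it (`Component`, `Graph`, `on`, `emit`, `switch_to_local`, the `cloud_unreachable` event) is a hypothetical stand-in chosen for illustration; none of it is the EmbodiedAgents API, and no ROS2 machinery is involved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Component:
    """Hypothetical stand-in for a graph component (not the EmbodiedAgents API)."""
    name: str
    backend: str


@dataclass
class Graph:
    """Toy execution graph that maps event names to reconfiguration actions."""
    components: Dict[str, Component] = field(default_factory=dict)
    handlers: Dict[str, List[Callable[["Graph"], None]]] = field(default_factory=dict)

    def add(self, component: Component) -> None:
        self.components[component.name] = component

    def on(self, event: str, action: Callable[["Graph"], None]) -> None:
        # Register an action to run whenever `event` is emitted.
        self.handlers.setdefault(event, []).append(action)

    def emit(self, event: str) -> int:
        # Run every action registered for `event`; return how many ran.
        actions = self.handlers.get(event, [])
        for action in actions:
            action(self)
        return len(actions)


def switch_to_local(graph: Graph) -> None:
    # Assumed fallback policy: point the LLM component at a local backend.
    graph.components["llm"].backend = "local"


graph = Graph()
graph.add(Component(name="llm", backend="cloud"))
graph.on("cloud_unreachable", switch_to_local)

handled = graph.emit("cloud_unreachable")
print(graph.components["llm"].backend)
```

The point of the sketch is the shape of the pattern: the action receives the graph itself, so a handler can change any component, which is what makes the graph self-referential. In a real deployment the event would come from a health check or a topic, not a manual `emit` call.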
What You Can Build
Robots that listen, see, and speak – microphone in, visually-grounded answer out, all in one Python recipe. Ask “what’s on the table?” and get a real answer in real time.
One sentence in, the right capability fires. “How tall is Everest?” wakes the LLM. “What do you see?” wakes the VLM. “Take me to the kitchen.” dispatches the navigation stack. Behavior emerges from intent.
A robot arm that grabs “the red mug next to the laptop” without you writing a state machine for which mug. A VLM grounds the description; a VLA model translates straight to joint commands.
Every detection, every scene caption, every internal reading is folded into a graph indexed by meaning, place, and time – and persists across reboots. The robot starts knowing your space the way you do.
Drop a Cortex component into your recipe and your robot starts running missions, not commands. “Patrol the workshop and tell me if any lights are off.” Cortex auto-discovers every capability as an LLM tool, plans the steps, dispatches them, watches feedback, and replans on failure – with no orchestration code from you.
Compound goals like “go to the kitchen and tell me what’s on the counter” fall out of a single recipe. The robot recalls where the kitchen is from memory, navigates there with Kompass, looks at the counter, narrates the answer. End-to-end embodied reasoning, no behavior trees.
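The plan, dispatch, watch feedback, and replan cycle described for Cortex can be pictured as a small loop. The sketch below is an illustration only, under loud assumptions: tools are plain Python functions returning `True` on success, the plan is a hand-written list instead of LLM output, and `run_mission`, `replan`, `navigate`, and `describe` are hypothetical names that do not come from EmbodiedAgents.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]        # (tool name, argument)
Tool = Callable[[str], bool]  # returns True on success


def run_mission(
    plan: List[Step],
    tools: Dict[str, Tool],
    replan: Callable[[Step, List[Step]], List[Step]],
    max_replans: int = 2,
) -> List[str]:
    """Dispatch steps in order; on failure, ask `replan` for a new queue."""
    log: List[str] = []
    queue = list(plan)
    replans = 0
    while queue:
        name, arg = queue.pop(0)
        ok = tools[name](arg)
        log.append(f"{name}({arg}) -> {'ok' if ok else 'failed'}")
        if ok:
            continue
        if replans >= max_replans:
            log.append("aborted")
            break
        replans += 1
        queue = replan((name, arg), queue)
    return log


# Hypothetical tools: the hallway route is blocked, everything else succeeds.
def navigate(target: str) -> bool:
    return target != "kitchen via hallway"


def describe(target: str) -> bool:
    return True


def replan(failed: Step, remaining: List[Step]) -> List[Step]:
    # Stand-in for the planner: retry navigation along another route.
    name, _ = failed
    if name == "navigate":
        return [("navigate", "kitchen via lobby")] + remaining
    return remaining


tools = {"navigate": navigate, "describe": describe}
log = run_mission(
    [("navigate", "kitchen via hallway"), ("describe", "counter")],
    tools,
    replan,
)
for line in log:
    print(line)
```

Capping the number of replans keeps the loop from cycling forever when no alternative works; whether and how Cortex bounds its own replanning is not stated here, so treat that cap as a design choice of this sketch.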
Next Steps
AI Components – The core building blocks: components and topics.
Cortex – The agentic planner-executor that drives the rest of the graph from natural-language goals.
Memory – Graph-backed spatio-temporal memory with perception and interoception layers.
Inference Clients – How inference backends connect to components.
Models – Available model wrappers and vector databases.