ChatGPT got a body: Humanoid robot from OpenAI and Figure can fully communicate with people

March 14, 2024  12:06

The American startup Figure, working with OpenAI, has created Figure 01, a robot that can talk to a person in real time and carry out their commands.

The video released by the companies is genuinely impressive. Asked what it sees, the robot clearly and accurately describes the objects on the table in front of it, as well as the person it is talking to and even his posture. When the person asks for something to eat, the robot hands him an apple. The person then dumps a pile of trash in front of the robot and asks it to clear the trash away while explaining why it handed over the apple. The robot copes well with this task, as well as with the next one.

Figure, founded by businessman Brett Adcock, has attracted the attention of major players in the robotics and artificial intelligence market, including Boston Dynamics, Tesla, Google DeepMind, Archer Aviation and others. The startup's goal is “to create the world’s first commercially available general-purpose humanoid robot,” and it is moving towards that goal with confident steps.

In October 2023, Figure 01 demonstrated that it could perform basic tasks autonomously, and by the end of the year the robot had begun learning to perform a wider range of tasks. In January 2024, Figure signed its first commercial contract, for the use of Figure 01 at the BMW automobile plant in South Carolina. A month later, the startup showed a video of Figure 01 working in a warehouse.

Shortly afterwards, the startup announced that it was developing a second generation of the robot and collaborating with OpenAI on a new generation of AI models for humanoid robots. The video shows the first results of that collaboration. Adcock says that the robot was not remotely controlled during the demonstration and that the video is shown at normal speed.

It is no longer surprising that AI can speak freely with humans. You can already talk to ChatGPT itself using voice commands. But how does the new robot understand what it sees around it?

Adcock said on his X (formerly Twitter) page that the cameras built into Figure 01 serve as its eyes and feed data to a large vision-language model trained by OpenAI. According to him, OpenAI's models are also responsible for the robot’s ability to understand human speech, while Figure's own neural network converts the stream of incoming information into fast, low-level, dexterous robot actions.
