With robotics, AI behind ChatGPT tries to go beyond the digital world – 03/12/2024 – Tech

Companies like OpenAI and Midjourney develop chatbots, image generators and other artificial intelligence tools that operate in the digital world.

Now, a startup founded by three former OpenAI researchers is using the development methods behind chatbots to build AI that can interact with the physical world.

Covariant, a robotics company based in Emeryville, California, is creating technology that enables robots to pick up, move and sort items as they travel through warehouses and distribution centers.

The goal is to help robots understand what is happening around them and decide what they should do next.

The technology also gives robots a broad understanding of English, allowing people to talk to them as if they were chatting with ChatGPT.

Still in development, the technology is not perfect. But it is a clear sign that the AI systems behind the internet’s chatbots and image generators will also power machines in warehouses, on roads and in homes.

Just like chatbots and image generators, this robotics technology learns its skills by analyzing huge amounts of data. That means engineers can improve it by feeding it more and more data.

Covariant, backed by $222 million in funding, doesn’t build robots. It builds the software that powers the robots.

The company aims to deploy its new technology first with robots operating in warehouses, providing a model for others to do the same in factories and perhaps even on roads with self-driving cars.

These AI systems behind chatbots are called neural networks, referring to the network of neurons in the brain. By identifying patterns in large amounts of data, these systems can learn to recognize words, sounds and images — or even generate them on their own.
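
Neither Covariant nor OpenAI publishes the code behind these systems, but the pattern-finding idea itself is simple enough to sketch. The toy example below, written in Python with numpy, trains a tiny two-layer neural network to learn the XOR pattern from four examples; real systems apply the same principle with billions of parameters and vastly more data.

```python
import numpy as np

# Illustrative only: a tiny neural network that learns the XOR
# pattern purely from examples -- the same principle, at miniature
# scale, that lets large systems learn words, sounds and images.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10_000):
    # Forward pass: compute predictions from the current weights.
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass: gradient of the squared error w.r.t. the weights.
    dp = (p - y) * p * (1 - p)
    dW2, db2 = h.T @ dp, dp.sum(axis=0)
    dh = (dp @ W2.T) * h * (1 - h)
    dW1, db1 = X.T @ dh, dh.sum(axis=0)

    # Gradient descent: nudge the weights to shrink the error.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(np.round(p, 2))  # predictions approach [[0], [1], [1], [0]]
```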

This is how OpenAI built ChatGPT, which can answer questions instantly, write academic papers and generate computer programs. It learned these skills from text collected from across the internet.

Companies are now building systems that can learn from different types of data at the same time. By analyzing both a collection of photos and the captions that describe those photos, for example, a system can understand the relationships between the two. It can learn that the word “banana” describes a curved, yellow fruit.
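
A common recipe for learning that photo-caption relationship is contrastive training, the approach popularized by OpenAI’s CLIP model: matching image-caption pairs are pulled together in a shared embedding space while mismatched pairs are pushed apart. The sketch below shows a simplified, one-directional version of that loss, with random vectors standing in for the outputs of real image and text encoders.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, dim = 4, 32   # four image-caption pairs, shared embedding size

# Stand-ins for encoder outputs: img_emb[i] and txt_emb[i] come from
# the SAME image-caption pair, so they should end up most similar.
img_emb = rng.normal(size=(batch, dim))
txt_emb = rng.normal(size=(batch, dim))

# Normalize so dot products are cosine similarities.
img_emb /= np.linalg.norm(img_emb, axis=1, keepdims=True)
txt_emb /= np.linalg.norm(txt_emb, axis=1, keepdims=True)

# Similarity of every image with every caption in the batch.
logits = img_emb @ txt_emb.T / 0.07   # 0.07 is a temperature

# Contrastive loss: each image should be most similar to its own
# caption (the diagonal of the matrix), not to the other captions.
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(probs[np.arange(batch), np.arange(batch)]).mean()
print(f"contrastive loss: {loss:.3f}")  # training would minimize this
```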

OpenAI used this approach to build Sora, its video generator. By analyzing thousands of captioned videos, the system learned to generate videos based on a brief description of a scene.

Founded by UC Berkeley professor Pieter Abbeel and three of his former students, Peter Chen, Rocky Duan and Tianhao Zhang, Covariant used similar techniques in building a system that powers warehouse robots.

The company helps control sorting robots in warehouses around the world. It has spent years gathering data — from cameras and other sensors — that shows how these robots operate.

“The system collects all kinds of data that matters to robots — which can help them understand the physical world and interact with it,” Chen said.

By combining this data with the massive amounts of text used to train chatbots like ChatGPT, the company has built AI technology that gives its robots a much broader understanding of the world around them.

After identifying patterns in this mix of images, sensor data and text, the technology gives a robot the power to handle unexpected situations in the physical world. The robot knows how to pick up a banana, even if it has never seen a banana before.

It can also respond in simple English, much like a chatbot. If you tell it to grab a banana, it knows what that means. If you tell it to pick up a yellow fruit, it understands that too.

It can even generate videos that predict what is likely to happen as it tries to pick up a banana. These videos have no practical use in a warehouse, but they show the robot’s understanding of its surroundings.

“If it can predict the next frames in a video, it can identify the right strategy to follow,” Abbeel said.
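
Abbeel’s point can be made concrete in a few lines: if a model can forecast the outcome of each candidate action, the robot can simply choose the action whose predicted future lands closest to its goal. In the toy sketch below, a hand-written function stands in for a learned video-prediction model; every name and number here is invented for illustration.

```python
import numpy as np

# Hypothetical stand-in for a learned predictive model: given the
# current state and a candidate action, forecast the next state.
# In a real system this would be a video- or state-prediction network.
def predict_next_state(state, action):
    return state + action  # toy linear "physics"

goal = np.array([2.0, 1.0])    # e.g. gripper positioned over the banana
state = np.array([0.0, 0.0])   # current gripper position
candidate_actions = [np.array(a, dtype=float)
                     for a in [(1, 0), (0, 1), (1, 1), (-1, 0)]]

# Choose the action whose PREDICTED outcome lands closest to the goal:
# good prediction translates directly into a good strategy.
best = min(candidate_actions,
           key=lambda a: np.linalg.norm(predict_next_state(state, a) - goal))
print("chosen action:", best)  # -> [1. 1.]
```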

The technology, called RFM (robotics foundation model), makes mistakes, just as chatbots do. Although it often understands what people ask of it, there is always a chance that it will not. It drops objects from time to time.

Gary Marcus, an AI entrepreneur and professor emeritus of psychology and neural science at NYU, said the technology could be useful in warehouses and other situations where errors are acceptable.

But he said it would be more difficult and risky to implement in factories and other potentially dangerous situations.

“It comes down to the cost of error,” he said. “If a robot weighing almost 70 kg can do something harmful, that cost can be high.”

Researchers believe the system will improve rapidly as companies train it with increasingly large and varied collections of data.

This is very different from the way robots operated in the past. Typically, engineers programmed robots to perform the same precise movement over and over again — like picking up a box of a certain size or attaching a rivet to a specific location on a car’s rear bumper. But these robots could not deal with unexpected or random situations.

By learning from data — hundreds of thousands of examples of what happens in the physical world — robots can begin to deal with the unexpected. And when those examples are paired with language, robots can also respond to text and voice prompts, much like a chatbot would.

This means that, just like chatbots and image generators, robots will quickly become more capable. “What’s in digital data can transfer to the real world,” Chen said.
