Artificial intelligence (AI) has dominated many areas of technology in recent years, and now it’s time for a revolution in robotics. Jim Fan, a senior scientist at Nvidia, predicts that we’ll see a breakthrough in this field in the next few years. In an interview with Sequoia Capital, he spoke of the hope for a “GPT-3 moment for robotics”—a breakthrough in robotic models similar to what GPT-3 achieved in natural language processing.

A breakthrough in robotics in the coming years

Fan, who leads AI research for robots at Nvidia, expects significant progress on foundation models for robots in the next two to three years. His team is working on Project GR00T, which aims to create foundation models for humanoid robots. Despite the optimistic outlook, Fan emphasizes that widespread use of robots in everyday life will take longer. The obstacles are not only technical: the robots must also be affordable, mass-producible, safe at the hardware level, and privacy-compliant.

The potential of humanoid robots

Fan sees huge potential in humanoid robots, arguing that the world is designed with the human body in mind. “All of our restaurants, factories, hospitals, and tools and equipment—they’re all made for the human form and human hands,” he emphasizes. In his opinion, humanoid robots could eventually perform all the tasks that are currently reserved for humans. He predicts that within two to three years, the hardware ecosystem for humanoids will be ready on a large scale.

The new era of robotics: data and simulations

Nvidia’s approach to robotic AI development combines three types of data: internet data, simulation data, and data collected from real robots. Fan notes that each source has its advantages and disadvantages, and that combining them is key to success.

Fan compares the current state of robotics to the state of natural language processing before GPT-3. He envisions a similar evolution—from specialized models to more general approaches that can then be tailored to specific tasks. One of the biggest challenges he sees is the data collection process. Fan believes that the full potential of Transformer-type models in robotics has not yet been realized, and that once the data pipeline is fully developed, the models will be able to scale even further.

Training agents in the physical and virtual worlds

Nvidia is working on techniques like “Eureka,” which automates the design of reward functions for robots, a task that previously had to be done by hand. Fan’s research also extends to virtual environments, such as video games, where AI for virtual agents is being developed. The ultimate goal is to create a single model that can control both virtual and physical agents.

The future of robotics

In the interview, Fan quoted Nvidia CEO Jensen Huang as saying, “Everything that moves will eventually become autonomous.” According to Fan, if there are to be as many intelligent robots as smartphones in the future, the work of building them has to start now.

Despite his optimistic outlook, Fan acknowledges that many challenges remain. One is integrating fast, automatic responses with slower, deliberate planning and decision-making in a single model. Nevertheless, his team at Nvidia is working toward a breakthrough in robotics on a par with what GPT-3 brought to natural language processing.

Artificial intelligence is constantly evolving, and humanoid robots may soon become an everyday reality. Challenges remain, but the outlook for the technology is promising.