A mechanical hand is on display at the Robot Mall, the world’s first embodied intelligent robot 4S store, on August 13, 2025 in Beijing, China.
VCG | Visual China Group | Getty Images
BEIJING - Alibaba Cloud is investing in a new type of artificial intelligence designed to better replicate the real world using a different approach from chatbots such as OpenAI’s ChatGPT.
The shift acknowledges the limits of “large language models” trained mostly on text. Instead, developers are starting to focus more on “world models” built on videos and real-life physical scenarios.
To jump on the trend, Alibaba led a 2 billion yuan ($290 million) investment in ShengShu, the startup behind the AI video generation tool Vidu, the company announced Friday. TAL Education and Baidu Ventures also participated in the Series B funding round.
The investment comes about two months after ShengShu raised 600 million yuan from Qiming Venture Partners and other backers. The startup declined to disclose its valuation.
ShengShu said the latest funding will support the development of a “general world model” that uses AI to bridge two currently separate domains: the digital world of games and AI-generated video, and the physical world of autonomous driving and robots.
“ShengShu believes that a general world model, built on multimodal data such as vision, audio, and touch, more naturally captures how the physical world works than large language models,” the three-year-old startup said in a statement.
“We aim to connect perception and action,” Zhu Jun, founder of ShengShu, added in a statement, allowing AI systems to better model and predict real-world behavior consistently.
ShengShu’s latest Vidu Q3 Pro model, released in January, ranks among the top 10 AI models for generating videos from text and images, according to Artificial Analysis.
The company launched Vidu globally months before OpenAI made its now-shuttered Sora tool for AI video generation widely available. Chinese short-video companies Kuaishou and ByteDance have also released similar competing AI tools for generating videos.
World model competition
Alibaba has expanded its investments in related startups.
The Chinese tech giant and Baidu Ventures last month led a $50 million investment in Tripo AI, a platform that uses AI to quickly generate digital 3D models from images. Tripo said it is also moving away from techniques used by language models toward AI tools grounded in physical space and is developing its own world model.
In September, Alibaba also led a $60 million investment in PixVerse, which released an AI world model earlier this year that allows users to direct how a video unfolds while it’s being generated.
Alibaba, which got its start in e-commerce, has also released free, open-source AI models for video generation and, in February, released one for powering robots.
ShengShu said Friday it has strategic partnerships with companies developing embodied AI, meaning systems such as humanoid robots that interact with the physical world, for use across industrial, commercial and home settings.
World models are essential for robotics because the technology needs more than LLMs to work, Kevin Kelly, co-founder of the U.S. tech magazine Wired, wrote last month on his Substack.
Ultimately, to replicate human intelligence, AI will need three things: reasoning, an understanding of the physical world and continuous learning, Kelly said. While AI for the learning category hasn’t been developed yet, LLM-powered chatbots have created the knowledge element, he said, making world models a key area requiring a breakthrough.
