Cool-headed thinking amid the Sora explosion: the real excitement of generative artificial intelligence still lies beyond the bubble.

　　　　　　In fact, people don’t know what they want until you show it to them.

Text/Qingyuan

Editor/Cheng Mo

Source/ten thousand research

"There is no trace of wings in the sky, but I have flown." Tagore’s timeless line aptly describes the phenomenal impact that Sora (the Japanese word for "sky"), OpenAI’s video generation model, has had on public discourse.

In Sora’s wake, NVIDIA, regarded as the "hardware overlord of generative artificial intelligence", has passed the $2 trillion mark in market value, and its founder, Jensen Huang (Huang Renxun), has become the leading evangelist of technological progress. Just over a year earlier, NVIDIA, hit repeatedly by the bursting of the cryptocurrency bubble, had seen its share price plummet by 60%. The Economist even asked at the time: "When he looks through his glasses at the fancy new models that he thinks will change the face of AI, and at vaguer concepts like the metaverse, is he in danger of underestimating the cruelty of the here and now?"

Today a dazzling array of technical predictions about generative AI is emerging, and investors are growing ever more excited about its prospects.

In fact, as early as a quarter of a century ago, Ray Kurzweil, the evangelist of technological singularity theory, predicted the development of "massively parallel neural network computers" in his book "The Age of Spiritual Machines", and forecast that, on this computing foundation, artificial intelligence would reach several milestones around 2020:

Most business transactions involve a virtual person;

Most roads are equipped with automated driving systems;

People begin to form relationships with robots and regard them as companions, teachers, caregivers and even lovers;

Virtual artists appear in every field of the arts;

The media widely report that computers have passed the Turing test, although these tests do not meet the standards recognized by experts. …

After more than 20 years of twists and turns, of excitement and disappointment, it was only with the rise of OpenAI that we seemed to catch up with the milestones the futurists had described.

Geoffrey Hinton, the habitually low-key godfather of deep learning, has also praised the significance of generative artificial intelligence: "AI will change the world more than anything in human history, and it can be compared in scale with the industrial revolution, or with the invention of the wheel or electricity." ("AI is going to change the world more than anything in the history of humanity.")

Indeed, even without the quotable lines of a Hinton or a Jensen Huang, the general public can easily draw a simple, strong intuition from the viral spread of ChatGPT and Sora: a major change is under way right now. If the AlphaGo human-versus-machine match of 2016 demonstrated to the public the "usefulness" of artificial intelligence, then today’s ever-hotter large AI models can be seen as a clear demonstration of its "ease of use". With both prerequisites for technology diffusion in place, a "long summer" for artificial intelligence is in sight.

Some claim that Sora shows OpenAI’s underlying model can recognize and understand the real world, that AI can generate its own open world in which it interacts and evolves by itself, and that the road to artificial general intelligence (AGI) has been cleared.

However, even after filtering out Li Yizhou-style "indigenous" players, interpretations of generative AI remain overwhelming in number, yet there are still no clear answers to the essential questions: "What is it for? How useful is it?"

In fact, behind the polished stories of technological and commercial evolution, many key milestones never existed in advance in the mind of some technical genius. They were either the product of mutual inspiration and deepening understanding in engineering practice, gradually shaped by the consensus of a research community, or simply properties that neural network models "emerged" with on their own.

"People don’t know what they want until you show it in front of them." This famous saying of Jobs applies to innovators as much as to their audience.

Take OpenAI as an example. The "emergent" performance of its GPT models was an "accident" brought about by increasing the number of model parameters during engineering exploration. As for the inter-frame coherence and object consistency that Sora displays, project developer Tim Brooks also admits that these capabilities were not designed in advance. Judging by the engineering principles of its so-called Diffusion Transformer, Sora can hardly be called a "world model". In Yann LeCun’s description, a world model requires intuitive "common sense" about the real physical world, which is clearly at odds with the traditional neural network approach that excels at approximating implicit probability distributions. The stunning video quality may only prove that Sora has learned the probability distribution of physical laws, not the physical laws themselves.

From the perspective of the technical route, Sora has neither proved nor falsified an extremely important question: faced with the "black box" of neural networks, are scaling laws a viable brute-force path to AGI, or a sweet illusion left over once the low-hanging fruit has been picked clean?

If the answer is the former, then the United States has undoubtedly secured all the key bargaining chips on the road to AGI. From infrastructure providers represented by NVIDIA to model developers such as OpenAI and Google, its lead over foreign competitors is striking almost across the board, and its repeated suppression of China, its major competitor, shows how determined Americans are to defend that lead. Yet at this highlight moment, when the American AI industry seems to be winning everything, it may be necessary to keep one cold fact in mind.

In an earlier Accenture study of the impact of generative AI on human jobs, banking, insurance and software were the three industries with the highest risk exposure. As we all know, these are high-end pillars of the current American economy. Once the technical maturity of generative AI crosses a certain tipping point, its accelerating adoption will make the United States the first to feel the pain of transformation, and the one to feel it most deeply; the socio-economic consequences along the way remain unpredictable.

If the answer is the latter, then a verdict from the first great trough in the history of artificial intelligence applies just as well today: "The first person who climbs the tree can claim that this is a remarkable progress in flying to the moon."

If the returns from scaling laws are diminishing, can applications built on large language models overcome intermittent hallucinations and catastrophic forgetting, and avoid gaffes like the recent output claiming that "the temperature of EMU trains reached 1538℃"?

Take Sora as an example: can its application prospects really extend to the so-called "one sentence to generate a movie"? Judging from current speculation, if the model cannot iteratively correct a generated result, and users can only re-prompt again and again by trial and error, its use in film and video production will remain a mirage. Even for short advertising videos, can the size of that market segment support the current market value of generative AI concept stocks, which is no less than 10 trillion US dollars?

In any case, it is worth emphasizing that the public’s fervent expectations for Sora today have appeared many times since the beginning of the industrial revolution. Each time, people believed that a new era of human society brought about by automation was close at hand. Consider how Wiener, the father of cybernetics, discussed the possibility and impact of machines replacing human beings in his 1950 work "The Human Use of Human Beings", in terms strikingly similar to today’s public debate: "From this stage on, all work can be done by machines. This mechanized method is also applicable to most of the work of libraries and archive offices in industrial enterprises. In other words, machines play no favorites between manual labor and clerical work. Therefore, the new industrial revolution can penetrate into a wide range of fields, including all labor that does not demand much thought … The new industrial revolution is a double-edged sword, which can be used to benefit mankind or to destroy it. If we do not use it wisely, it may go very far in that direction."

Of course, today’s ChatGPT and Sora, like the earlier landmark AlphaGo, have had a clear and profound impact on public perception. Yet anyone who switches to the producer’s perspective will immediately see that a deep gap remains between their capabilities and use cases and what a productivity tool requires. Arousing public curiosity is only the first step in the long march from technological possibility to commercial change.

Perhaps the wisest attitude is to let time provide the answer.

Undoubtedly, the current AI frenzy is comparable to the dot-com bubble at the turn of the millennium. Back then, fevered investors and entrepreneurs were likewise willing to stake everything on an imagined vision of change, without any clear application scenarios. Shortly after that bubble burst so painfully, Amazon turned a profit in the 2001 Christmas shopping season, a sign that the Internet economy had found its sense of direction.

People set out again and again because they glimpse the summit, and lose heart again and again because they cannot find a path. Only then, at the bottom and at the margins, are major breakthroughs in engineering and application innovation ignited from the bottom up. The pattern of history has always been this simple and this profound.

Taking history as a mirror, I am afraid that the road to industrializing generative artificial intelligence will be much the same: hard going on the way forward, clear only in hindsight. Once the major platform giants have finished their "arms race" in AI computing power and AI models, I am afraid that the capital bubbles around OpenAI and even NVIDIA will not escape this fate either, and that they must race to realize their value while there is time. This may well be why the former has packaged Sora so carefully for publicity, but the industry’s truly exciting chapter will not begin until the bubble bursts.

Another prediction by Kurzweil may serve as both the conclusion and the hope of this article: "Considering all these factors, we have reason to estimate that a personal computer worth $1,000 will be equivalent to the human brain in terms of computing speed and capacity, especially in neural connection computing (the main computing mode of the human brain), by 2020."