Author: Haotian
After waking up, many friends asked me to look at #manus, which claims to be a truly general-purpose AI Agent that can think independently, plan and execute complex tasks, and deliver complete results. It sounds very cool, but beyond the many friends in the circle worrying about losing their jobs, what will it bring to the take-off of the web3 DeFai scene? Below, let me share my thoughts:
About a month ago, OpenAI launched a similar product called Operator, where AI can independently complete tasks such as restaurant reservations, shopping, ticket booking, and ordering takeout in the browser. Users can supervise visually and take control at any time.
That kind of Agent did not generate much discussion, because it is a single-model-driven, tool-calling framework. Once users realize that critical decisions still need their intervention, they give up on the idea of relying on it to carry out tasks.
In short, the idea is for AI to imitate the PDCA cycle that humans follow when executing work (plan - do - check - act), with multiple large models working together to complete it. Each model focuses on one specific step, which reduces the decision-making risk of a single model carrying out the whole task and improves execution efficiency. The so-called ‘multi-signature system’ is really a decision verification mechanism built on multi-model collaboration: it ensures the reliability of decisions and execution by requiring joint confirmation from several specialized models.
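The joint-confirmation idea can be illustrated with a minimal sketch. This is not how Manus implements it; the names `multi_sig_decision`, `Reviewer`, and `Verdict`, and the three stand-in reviewers, are all hypothetical. The only assumption is that each specialized model can be reduced to a function that approves or rejects a proposed action.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical reviewer type: takes a proposed action, returns True to approve.
Reviewer = Callable[[str], bool]


@dataclass
class Verdict:
    approved: bool
    votes: Dict[str, bool]


def multi_sig_decision(action: str, reviewers: Dict[str, Reviewer], threshold: int) -> Verdict:
    """Approve `action` only if at least `threshold` reviewers confirm it."""
    if not 1 <= threshold <= len(reviewers):
        raise ValueError("threshold must be between 1 and the number of reviewers")
    votes = {name: review(action) for name, review in reviewers.items()}
    return Verdict(approved=sum(votes.values()) >= threshold, votes=votes)


# Stand-in reviewers (assumptions); in a real system each would wrap a separate model call.
reviewers = {
    "planner": lambda a: a.startswith("swap"),
    "risk": lambda a: "100%" not in a,
    "checker": lambda a: len(a) < 80,
}

verdict = multi_sig_decision("swap 10% of USDC to ETH", reviewers, threshold=3)
print(verdict.approved, verdict.votes)
```

Requiring all three confirmations (`threshold=3`) is the strictest setting; a lower threshold trades some safety for a higher chance that a task goes through.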
For a system like this, the key questions are how complex the tasks it executes really are, how fault-tolerant the large models are, and how the success rate of delivered results is defined once non-standardized user Prompts come in. That aside, does this innovation mean the DeFai scene in web3 can immediately become a mature application? Obviously, it is not there yet:
For example, in a DeFai scenario where the Agent has to make trading decisions, there should be an Oracle layer Agent responsible for collecting and validating on-chain data, integrating and analyzing it, and monitoring on-chain prices in real time to capture trading opportunities. This puts heavy demands on real-time analysis. An opportunity that was valid a second ago may be gone by the time the Oracle’s large model has passed it to the transaction execution Agent (the arbitrage window has closed).
This exposes the biggest weakness of such multi-model systems when making execution decisions: how to go online, trigger on-chain calls to retrieve and analyze real-time data, identify trading opportunities in it, and then capture the trade. The ordinary web environment is actually manageable, since order prices on many e-commerce sites do not change in real time, so they rarely cause large dynamic balance problems for the whole multi-model collaboration. On-chain, such challenges exist almost all the time.
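The arbitrage-window problem can be made concrete with a small staleness check that an execution Agent might run before acting on a handed-off opportunity. This is only a sketch: the `Opportunity` type, the `still_actionable` function, and the 0.5-second limit are illustrative assumptions, not part of any real system.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Opportunity:
    pair: str
    observed_at: float      # time.monotonic() reading when the Oracle Agent saw the price
    expected_profit: float


def still_actionable(opp: Opportunity, max_age_s: float, now: Optional[float] = None) -> bool:
    """Return True only if the opportunity is fresh enough and still profitable."""
    now = time.monotonic() if now is None else now
    return (now - opp.observed_at) <= max_age_s and opp.expected_profit > 0


# Hypothetical hand-off: observed at t=100.0s, with a 0.5s freshness limit.
opp = Opportunity(pair="ETH/USDC", observed_at=100.0, expected_profit=12.5)
fresh = still_actionable(opp, max_age_s=0.5, now=100.2)   # handed off quickly
stale = still_actionable(opp, max_age_s=0.5, now=101.5)   # model latency ate the window
print(fresh, stale)
```

The longer each model in the chain takes to respond, the more often the second case occurs, which is why the hand-off between Agents matters more on-chain than on a typical e-commerce site.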
So we need an objective view of how much this development actually advances DeFai application scenarios in web3.
Its significance has to be acknowledged. After all, the concepts it proposes, LLM OS and Less Structure, More Intelligence, and especially the multi-signature system, offer a great deal of inspiration for web3 as it expands into the integration of DeFi and AI.
This actually corrects a major misconception held by most DeFai projects: do not rush to rely on a single large model to achieve complex goals such as autonomous thinking plus decision-making by an AI Agent, which is simply not practical in financial scenarios.
Realizing the true DeFai vision requires solving complex problems such as the capability ceiling of a single AI model, atomicity guarantees for multi-model interaction and collaboration, unified resource scheduling and allocation across multi-model systems, system fault tolerance, and fault handling mechanisms.
For example, an Oracle layer Agent is responsible for on-chain data collection, analysis, and price monitoring, forming an effective data source;
a decision-making layer Agent analyzes and evaluates risk based on the data fed by the Oracle and formulates a set of decisions and action plans;
and an execution layer Agent carries out the plans provided by the decision-making layer according to actual conditions, including gas cost optimization, cross-chain state, transaction ordering conflicts, and so on.
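The three layers above can be sketched as a simple pipeline. This is a toy illustration under stated assumptions: the function names, the data types, the 1% deviation threshold, and the gas caps are all hypothetical, and a real system would put model calls and on-chain queries where this sketch uses plain arithmetic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MarketSnapshot:
    pair: str
    price: float
    block_number: int


@dataclass
class Plan:
    action: str           # "buy" or "sell"
    size: float
    max_gas_gwei: float   # highest gas price at which this plan is still worth running


def oracle_agent(raw_prices: List[float], pair: str, block_number: int) -> MarketSnapshot:
    """Oracle layer: validate collected prices and form a single data source."""
    valid = [p for p in raw_prices if p > 0]
    if not valid:
        raise ValueError("no valid price data")
    middle = sorted(valid)[len(valid) // 2]  # middle value (upper median for even counts)
    return MarketSnapshot(pair, middle, block_number)


def decision_agent(snap: MarketSnapshot, fair_value: float, risk_limit: float) -> List[Plan]:
    """Decision layer: evaluate the data and propose a set of candidate plans."""
    deviation = (snap.price - fair_value) / fair_value
    if abs(deviation) < 0.01:  # hypothetical threshold: ignore moves under 1%
        return []
    action = "buy" if deviation < 0 else "sell"
    return [Plan(action, risk_limit, 50.0), Plan(action, risk_limit / 2, 30.0)]


def execution_agent(plans: List[Plan], current_gas_gwei: float) -> Optional[Plan]:
    """Execution layer: pick the first plan whose gas cap fits current conditions."""
    for plan in plans:
        if current_gas_gwei <= plan.max_gas_gwei:
            return plan
    return None


snap = oracle_agent([1990.0, 2000.0, -1.0, 2010.0], "ETH/USDC", block_number=123)
plans = decision_agent(snap, fair_value=2100.0, risk_limit=1.0)
chosen = execution_agent(plans, current_gas_gwei=40.0)
print(snap, chosen)
```

Each layer only sees the output of the one before it, so a failure or delay in any single Agent is visible at a clear boundary instead of being buried inside one large model call.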
Only when this whole series of Agents works in sync and a large system framework is in place can a true DeFai revolution be triggered.