AI Agent: Building Data-Driven Intelligent Agents

Yanlin

Mar 3, 2025

Author: Yanlin

Over the past year, the large-model field has had two main hotspots. One is large language models (LLMs), which are advancing at an almost monthly pace, with attention focused on effectiveness and cost. The other is AI Agents, which attempt to solve application problems across many fields, with an emphasis on scenarios and competitiveness. This article focuses on trends and practices in AI Agents.

AI Agent Insights

AI Agents are evolving rapidly from single-agent to multi-agent systems, and data-centric agent platform mechanisms are taking shape at an increasing pace. Building high-quality data, and the ability to keep improving its quality, will be crucial to the success of these agents.

AI Agent Architecture and Development Trends

What is an intelligent agent?

An AI Agent is an intelligent entity capable of perceiving its environment, making autonomous decisions, and executing actions. It works toward a goal step by step through independent reasoning and tool invocation.
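
The perceive, decide, and act cycle can be illustrated with a minimal Python sketch. It is an illustration only: the `decide` function stands in for an LLM call, and the `lookup_order` tool and the goal string are hypothetical names that do not come from the article.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool; a real agent would call an API or database here.
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"

TOOLS: Dict[str, Callable[[str], str]] = {"lookup_order": lookup_order}

def decide(goal: str, memory: List[str]) -> Tuple[str, str]:
    """Stand-in for the LLM: choose the next action from the goal and memory."""
    if not memory:
        # Nothing observed yet, so gather information with a tool.
        return ("lookup_order", goal.split()[-1])
    # An observation exists, so finish and return it as the answer.
    return ("finish", memory[-1])

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory: List[str] = []                    # observations collected so far
    for _ in range(max_steps):
        action, arg = decide(goal, memory)    # autonomous decision
        if action == "finish":
            return arg
        observation = TOOLS[action](arg)      # act by invoking a tool
        memory.append(observation)            # perceive the result
    return "gave up"

print(run_agent("check status of order 42"))
```

The loop is bounded by `max_steps` so that a poor decision policy cannot run forever, which is a common safeguard in agent loops.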

Why do we need agents?

An LLM only simulates the reasoning processes of the human brain. Carrying out concrete tasks in the real world also requires a perception system analogous to the human senses (hearing, touch, taste, and so on), supported by memory and experience to aid decision-making, and ultimately the ability to act.

Trends in Agent Development

Over the past year, many teams explored fixed, single-task agents that solve specific, narrow problems. Starting this year, efforts have shifted toward building agent platforms and paradigms that improve multi-agent collaboration, orchestration, and data quality systems. The ultimate goal is a super agent that offers a one-stop solution to all problems, which would mark the true arrival of artificial general intelligence (AGI).

Generality and specificity usually have to be balanced against each other, and cost effectiveness has to be weighed as well. AGI is still some time away, so we expect the main direction in the near future to be a data-centric, multi-agent collaboration model.

Building Competitiveness for AI Agents

During the construction of AI Agents, the first question we consider is: what is the core competitiveness of an intelligent agent?

We believe that model, data, and scenarios together are the three keys to building the competitiveness of AI products.

Model: models have already mined public-domain data thoroughly; the next focus may be on cost and performance (DeepSeek is accelerating this change).
Data: private-domain data is the core barrier for each company. The focus is on thoroughly mining private data, consolidating it, continuously optimizing it, and unlocking maximum customer value. With good production resources (data) and the productivity of the underlying model as support, continuous evolution can be achieved.
Scenarios: identify high-frequency, structured, and risk-controllable scenarios in your field, then progressively extend into more specialized scenarios to improve customer efficiency. For example, in the DevOps field, intelligent coding was the high-frequency scenario that broke through first, with Lingma improving the efficiency of building intelligent agents.

AI Agent Data Flywheel

Everyone knows that data is a core competitive advantage. So how do we create high-quality data in our respective fields?

First, each application should collect and consolidate personalized and specialized data from customers. Second, each field has its own specialized data and SOPs, which can solve customer problems effectively when combined with customer data.

Once an AI Agent has been built and is ready for release to customers, we need to prepare data assessment sets so that it meets customers' SLA requirements with certainty. After it goes online, we need to collect customer feedback data and use it to analyze and optimize our industry data, tool sets, and scenarios.

By continuously optimizing the private high-quality data system on the right through the evaluation data system on the left, we can achieve high-quality alignment between customer demands and data. Thus, the flywheel keeps turning, continuously enhancing the enterprise's competitiveness!
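
One turn of this flywheel can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the assessment set is a list of question and expected-keyword pairs, the keyword check stands in for a real quality metric, and `toy_agent` and all sample questions are hypothetical.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, str]  # (question, keyword the answer must contain)

def evaluate(agent: Callable[[str], str], cases: List[Case]) -> float:
    """Return the fraction of assessment cases the agent passes."""
    passed = sum(1 for q, expected in cases if expected in agent(q).lower())
    return passed / len(cases)

def failing_cases(agent: Callable[[str], str], cases: List[Case]) -> List[str]:
    """List the questions that point at gaps in the private data."""
    return [q for q, expected in cases if expected not in agent(q).lower()]

def add_feedback(cases: List[Case], question: str, expected: str) -> None:
    """Turn negative customer feedback into a new assessment case."""
    cases.append((question, expected))

# Hypothetical agent and assessment set, for illustration only.
def toy_agent(question: str) -> str:
    if "password" in question.lower():
        return "We will email you a reset link."
    return "Sorry, I do not know."

cases: List[Case] = [
    ("How do I reset my password?", "reset link"),
    ("What is the refund window?", "30 days"),
]

score_before = evaluate(toy_agent, cases)
add_feedback(cases, "Can I change my email?", "settings")
print(score_before, failing_cases(toy_agent, cases))
```

Each failing question identifies knowledge that the private data does not yet cover; filling that gap and rerunning the evaluation is one turn of the flywheel.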

AI Agent: Building a Data-Centric Intelligent Agent Platform

What system will carry and circulate the above four types of data?

Our answer is to construct a data-centric intelligent agent platform.

Build a corporate knowledge base: platform tools convert data into Markdown, which is then pushed to a vector database to form the domain data; tools give agents access to structured customer data.
Construct data assessment sets and an automated, intelligent data evaluation system.
Establish a customer feedback and tracking system at the frontend.
Automate the circulation of data and tasks through a multi-agent architecture.
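
The first step, turning Markdown into retrievable domain data, can be sketched as follows. This is a toy illustration, not the platform's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `VectorIndex` stands in for a vector database, and the sample document is hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

def split_markdown(doc: str) -> List[str]:
    """Split a Markdown document into one chunk per heading section."""
    parts = re.split(r"(?m)^(?=#{1,6} )", doc)
    return [p.strip() for p in parts if p.strip()]

def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class VectorIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[Counter, str]] = []

    def add(self, chunk: str) -> None:
        self.items.append((embed(chunk), chunk))

    def search(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

# Hypothetical knowledge-base document.
DOC = """# Gateway timeouts
Increase the upstream timeout when requests take longer than 60 seconds.

# Rate limiting
Configure a quota per consumer to limit token usage.
"""

index = VectorIndex()
for chunk in split_markdown(DOC):
    index.add(chunk)

print(index.search("how to limit token quota")[0])
```

Splitting on headings keeps each chunk topically coherent, which is one reason Markdown is a convenient intermediate format before vectorization.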

AI Agent: Global Technical Architecture

How do we build a multi-agent architecture?

Alibaba introduced the Spring-AI-Alibaba framework and ecosystem toolset at last year's Yunqi Conference (the Apsara Conference) to help enterprises build intelligent agents.

Integrate system data and toolsets with one click through Higress to obtain private customer data.
Observe data quality across the full chain through the OpenTelemetry (OTel) observability system.
Push prompt data updates dynamically and in real time via Nacos to see optimization effects promptly.
Use Apache RocketMQ to update RAG data dynamically for real-time feedback and optimization.

AI Agent Practice

The introduction above shows the significance of, and the trend toward, building data-centric intelligent agents. Next, we share Alibaba's best practices in implementing AI Agents as a reference, in the hope of accelerating the arrival of the AI era.

AI Agent Practice (Higress: One-Click Integration of Multiple Data Sources)

Higress is Alibaba's open-source, AI-native API gateway. It comes with a comprehensive set of AI ecosystem plugins and helps developers integrate multiple data sources with a single click.

Connect to a variety of models, with one-click integration across multiple models and unified protocols, permissions, and disaster recovery.
Access domain data through search tools and customer data through MCP Server, assembling the complete data needed for inference.
Standardize data format conversion, and build short-term and long-term memory through caching and vector retrieval, which reduces LLM calls and costs and improves performance and throughput.
Integrate observability to ensure data compliance and to assess data quality.
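
The idea of cutting LLM calls through caching can be shown with a small Python sketch. It is not how Higress implements its cache; it is an exact-match cache keyed on a normalized prompt, and `fake_model` is a hypothetical stand-in for a real model call.

```python
import hashlib
from typing import Callable, Dict

class CachedModel:
    """Wrap a model call with an exact-match cache on the normalized prompt."""

    def __init__(self, model: Callable[[str], str]) -> None:
        self.model = model
        self.cache: Dict[str, str] = {}
        self.calls = 0  # number of real model invocations

    @staticmethod
    def _key(prompt: str) -> str:
        # Lowercase and collapse whitespace so trivial variants share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        key = self._key(prompt)
        if key not in self.cache:
            self.calls += 1
            self.cache[key] = self.model(prompt)
        return self.cache[key]

# Hypothetical model used only for this example.
def fake_model(prompt: str) -> str:
    return f"answer to: {prompt.strip()}"

cached = CachedModel(fake_model)
first = cached.ask("What is Higress?")
second = cached.ask("  what is  higress? ")
print(cached.calls, first == second)
```

An exact-match cache only helps with repeated questions; matching paraphrased questions would need vector retrieval over past prompts, which is where the vector-based memory mentioned above comes in.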

AI Agent Practice (Otel: Full-Chain Data Quality Tracking)

With the OTel (OpenTelemetry) observability system, we can automatically analyze the effectiveness of the reasoning process and of the recall results. If results are subpar, end-to-end tracing lets us follow the customer's entire search and reasoning path and determine whether the issue lies in the knowledge base, RAG, or the toolset, which makes data optimization more efficient.
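
The diagnosis step can be illustrated with a small Python sketch. It deliberately does not use the OpenTelemetry SDK: the `Trace` class is a hypothetical stand-in that records one span per stage, and the stage names, the `score` attribute, and the 0.5 threshold are assumptions made for the example.

```python
import time
from contextlib import contextmanager
from typing import Dict, List, Optional

class Trace:
    """Hypothetical stand-in for a tracer: records one span per stage."""

    def __init__(self) -> None:
        self.spans: List[Dict] = []

    @contextmanager
    def span(self, name: str):
        record = {"name": name, "ok": True, "attrs": {}}
        start = time.perf_counter()
        try:
            yield record["attrs"]          # the stage writes its attributes here
        except Exception as exc:
            record["ok"] = False
            record["attrs"]["error"] = str(exc)
            raise
        finally:
            record["ms"] = (time.perf_counter() - start) * 1000
            self.spans.append(record)

def first_weak_stage(trace: Trace, min_score: float = 0.5) -> Optional[str]:
    """Return the first stage that failed or scored below the threshold."""
    for s in trace.spans:
        if not s["ok"] or s["attrs"].get("score", 1.0) < min_score:
            return s["name"]
    return None

trace = Trace()
with trace.span("knowledge_base") as attrs:
    attrs["score"] = 0.9
with trace.span("rag_recall") as attrs:
    attrs["score"] = 0.2               # poor recall quality in this request
with trace.span("tool_call") as attrs:
    attrs["score"] = 0.8

print(first_weak_stage(trace))
```

Attaching a quality score to each span is what allows a trace to answer "which stage degraded the answer" rather than only "which stage was slow".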

AI Agent Practice (Nacos: Dynamic Prompt Data Updates)

An agent contains a large number of prompts and algorithms. With Nacos, changes to them can be pushed dynamically in real time, so optimization effects are visible promptly. If there are concerns about changing a prompt after launch, a gray (canary) configuration release can roll the change out gradually while its effect is monitored.
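
A gray release of a prompt can be sketched in Python as below. This does not use the Nacos client API: `PromptStore` is a hypothetical in-memory stand-in for a configuration center, and routing users by a hash of their ID is one assumed way to pick the gray population.

```python
import hashlib
from typing import Callable, List, Optional

class PromptStore:
    """In-memory stand-in for a configuration center that holds one prompt."""

    def __init__(self, stable: str) -> None:
        self.stable = stable
        self.candidate: Optional[str] = None
        self.gray_percent = 0
        self.listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self.listeners.append(listener)

    def publish_gray(self, candidate: str, percent: int) -> None:
        """Publish a candidate prompt to a percentage of users and notify listeners."""
        self.candidate = candidate
        self.gray_percent = percent
        for listener in self.listeners:
            listener(candidate)

    def promote(self) -> None:
        """Make the candidate the stable prompt once it has proven itself."""
        if self.candidate is not None:
            self.stable, self.candidate, self.gray_percent = self.candidate, None, 0

    def prompt_for(self, user_id: str) -> str:
        # A stable hash keeps each user in the same bucket across requests.
        bucket = int(hashlib.sha256(user_id.encode("utf-8")).hexdigest(), 16) % 100
        if self.candidate is not None and bucket < self.gray_percent:
            return self.candidate
        return self.stable

store = PromptStore("You are a helpful support agent.")
seen: List[str] = []
store.subscribe(seen.append)
store.publish_gray("You are a concise support agent.", percent=10)

users = [f"user-{i}" for i in range(1000)]
gray_share = sum(store.prompt_for(u) != store.stable for u in users) / len(users)
print(f"{gray_share:.0%} of users see the candidate prompt")
```

Hashing the user ID instead of choosing randomly per request keeps the experience consistent for each user and makes before-and-after comparisons cleaner.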

AI Agent Practice (Apache RocketMQ: Enhancing RAG Data Timeliness)

Both system data and customer data are constantly updated. We can sync change events and data in real time through RocketMQ, so that each inference uses the most up-to-date data.
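
The pattern of applying change events to RAG data can be sketched as follows. The sketch does not use the RocketMQ client: a standard-library `queue.Queue` stands in for the message topic, a dictionary stands in for the retrieval index, and the event fields (`op`, `doc_id`, `text`) are hypothetical.

```python
import queue
from typing import Dict

def apply_changes(events: queue.Queue, index: Dict[str, str]) -> int:
    """Drain pending change events and apply them to the retrieval index."""
    applied = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return applied
        if event["op"] == "upsert":
            index[event["doc_id"]] = event["text"]   # insert or overwrite
        elif event["op"] == "delete":
            index.pop(event["doc_id"], None)         # ignore unknown ids
        applied += 1

# Hypothetical change events, in the order they were produced.
events: queue.Queue = queue.Queue()
events.put({"op": "upsert", "doc_id": "faq-1", "text": "Refunds take 7 days."})
events.put({"op": "upsert", "doc_id": "faq-1", "text": "Refunds take 3 days."})
events.put({"op": "upsert", "doc_id": "faq-2", "text": "Old promotion."})
events.put({"op": "delete", "doc_id": "faq-2"})

rag_index: Dict[str, str] = {}
count = apply_changes(events, rag_index)
print(count, rag_index)
```

Because later events for the same document overwrite earlier ones, the order in which events for one document are consumed matters in a real deployment.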

AI Agent Practice (AI Industry Expert Solutions)

Using the above technical system, we built an intelligent diagnostic system that combines open-source AI experts with Alibaba Cloud's cloud-native API gateway and Microservices Engine (MSE). It resolves over 95% of consulting issues and over 85% of anomalies.

Higress abstracts away the underlying models and tool systems, and on top of it we build secure data links and an account security system. Spring-AI-Alibaba helps construct and orchestrate agents, offering a chat mode for consultation questions and a composer mode for handling customer anomalies.

AI Agent Practice (DeepSeek Connected Search + Data Security Solutions)

DeepSeek has become popular, and those who have used it know that the web-connected (online search) version is the one that delivers its full strength.

Many customers currently use Higress for one-click integration with DeepSeek and its web search capability, merging in Quark search data for the best experience. With Higress, end-to-end TLS can be enabled on the model access chain to protect data security, and content safety measures address data compliance and security issues. API keys can also be managed centrally: this improves concurrency, lets agents use internal API keys so that the provider keys are not exposed to leakage, and allows traffic and quotas to be controlled per internal API key to avoid costly token consumption caused by code bugs.
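
The internal API key and quota idea can be sketched in Python as follows. This is not Higress configuration or its plugin logic: `KeyGateway`, the key names, and the token numbers are hypothetical, and the sketch only shows how a per-key quota caps the damage of a runaway caller.

```python
from typing import Dict

class KeyGateway:
    """Maps internal keys to token quotas and keeps the provider key private."""

    def __init__(self, provider_key: str) -> None:
        self._provider_key = provider_key     # never handed to agents
        self._quota: Dict[str, int] = {}
        self._used: Dict[str, int] = {}

    def issue(self, internal_key: str, token_quota: int) -> None:
        """Register an internal key with a token budget."""
        self._quota[internal_key] = token_quota
        self._used[internal_key] = 0

    def authorize(self, internal_key: str, tokens: int) -> bool:
        """Allow the call only if the key exists and stays within its quota."""
        if internal_key not in self._quota:
            return False
        if self._used[internal_key] + tokens > self._quota[internal_key]:
            return False
        self._used[internal_key] += tokens
        return True

    def remaining(self, internal_key: str) -> int:
        return self._quota[internal_key] - self._used[internal_key]

gateway = KeyGateway(provider_key="provider-secret")   # placeholder value
gateway.issue("agent-a", token_quota=1000)

results = [
    gateway.authorize("agent-a", 600),   # within quota
    gateway.authorize("agent-a", 600),   # would exceed quota, rejected
    gateway.authorize("agent-a", 400),   # exactly uses the remainder
    gateway.authorize("unknown", 1),     # unknown key, rejected
]
print(results, gateway.remaining("agent-a"))
```

A buggy agent that loops on model calls exhausts only its own budget, and the provider key can be rotated in one place without touching any agent.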