A comprehensive analysis and practical implementation of the new features in the MCP specification.

Wu Tong


May 7, 2025


Update

The MCP specification released its latest version on 2025-03-26. This article introduces and explains the main changes in detail.

Comparison table of the main updates between the 2025-03-26 version and the 2024-11-05 version:

| Category | 2024-11-05 Version | 2025-03-26 Version | Significance and Impact of Updates |
|---|---|---|---|
| Authorization Mechanism | Based on OAuth 2.0, supports implicit authorization flow and basic permission control | Upgraded to OAuth 2.1, deprecated implicit authorization flow, enforces PKCE and HTTPS | Increased security, reduced token leakage risks, adapts to public client scenarios (such as mobile and local applications). |
| Transport Protocol | Uses HTTP + SSE (dual endpoints), supports unidirectional stream communication | Replaced with Streamable HTTP (single endpoint), supports bidirectional communication and disconnection recovery | Simplifies deployment complexity, supports flexible communication modes (one-time response or stream push), optimizes network stability. |
| JSON-RPC Batching | Not enforced, optional in some implementations | Batching defined at the protocol level; implementations MUST support receiving batches | Reduces network overhead, supports parallel task processing, enhances batch operation efficiency (e.g., atomic transactions). |
| Tool Metadata | Only inputSchema and description provided | Added Tool Annotations (operational and display metadata) | Explicitly marks tool risks (e.g., destructive), supports automatic permission control and frontend UI adaptation, enhances security compliance. |
| Progress Notifications | Supports only percentage or numerical progress | New message field added, supports dynamic status descriptions | Enhances user interaction experience (e.g., displays "Data loading, 50% remaining"). |
| Multimodal Support | Supports text and images | New audio data stream support added | Expands capabilities for voice assistants, real-time audio processing, and other scenarios. |
| Parameter Completion | Not explicitly supported | New completions capability declaration, supports automatic parameter completion suggestions | Increases developer efficiency, reduces manual input errors. |
| Session Management | No explicit session identification | Introduces Mcp-Session-Id header, supports reconnection and state recovery | Enhances reliability for long-running tasks (e.g., voice interactions), reduces the impact of network fluctuations. |
| Security Requirements | Relies on recommended practices of OAuth 2.0 | Mandatory HTTPS, token binding and storage encryption, supports short-lived token rotation | Reduces risk of man-in-the-middle attacks, minimizes the effective window after token leakage. |

Key Difference Summary:

  1. Security

    • OAuth 2.1 mandates PKCE and HTTPS, eliminating implicit flow risks, better suited for high-privilege scenarios of AI tools.

  2. Communication Efficiency

    • Streamable HTTP single endpoint design simplifies architecture, JSON-RPC batching reduces network overhead.

  3. Tool Controllability

    • Tool Annotations explicitly mark risky behaviors (e.g., delete operations), support automated permission management and frontend adaptation.

  4. Multimodal Extension

    • New audio stream support complements voice interaction capabilities, improving the multimodal ecosystem.

  5. Developer Friendliness

    • Parameter completion (completions) and progress messages (message) enhance developer efficiency and user experience.

1. Safer OAuth 2.1

1.1 The Essential Leap from OAuth 2.0 to 2.1

1.1.1 Rooting Out Core Security Flaws

OAuth 2.0 has long carried three major security weaknesses:

| Risk Type | Specific Vulnerability | OAuth 2.1 Fix |
|---|---|---|
| Token Leakage | Implicit authorization flow returns the token in the URL fragment | Implicit authorization (Implicit Flow) removed entirely |
| Authorization Code Interception | Public clients cannot prove that they are the party that requested the authorization code | Mandatory PKCE (Proof Key for Code Exchange) |
| Redirect Hijacking | Open redirect vulnerabilities lead to phishing attacks | Strict (exact-match) validation against the registered redirect URI whitelist |

In the context of AI tools, these vulnerabilities could lead to catastrophic consequences. For example, by intercepting an authorization code that is not protected by PKCE, an attacker could exchange it for a token and issue seemingly legitimate call requests to a "database cleanup tool".

1.1.2 Comprehensive Mandate of the PKCE Mechanism

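PKCE mitigates authorization code interception through a cryptographic challenge-response mechanism: the client sends a hash (code_challenge) with the authorization request and later proves possession of the original secret (code_verifier) at the token endpoint.

# Example of client generating PKCE parameters
import base64
import hashlib
import os

# 32 random bytes yield a 43-character base64url verifier (RFC 7636 allows 43-128 characters)
code_verifier = base64.urlsafe_b64encode(os.urandom(32)).decode('ascii').rstrip('=')
# S256 challenge: base64url(SHA-256(code_verifier)) without padding
digest = hashlib.sha256(code_verifier.encode('ascii')).digest()
code_challenge = base64.urlsafe_b64encode(digest).decode('ascii').rstrip('=')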

1.1.3 Process Comparison

Traditional OAuth 2.0:

  Client → Authorization Server: Request authorization code
  Authorization Server → Client: Return raw authorization code

OAuth 2.1 + PKCE:

  Client → Authorization Server: Request authorization code + code_challenge
  Authorization Server → Client: Return authorization code bound to the code_challenge
  Client → Token endpoint: code_verifier + authorization code

1.2 Protocol Mechanism: An Authorization System Tailored for AI Scenarios

1.2.1 Dynamic Client Registration (DCR)

In response to the fragmented nature of the AI tool ecosystem, the MCP specification recommends (SHOULD) that clients and servers support the RFC 7591 Dynamic Client Registration protocol. This mechanism allows:

  • New tools to access any MCP service without pre-registration

  • Temporary AI Agents to automatically obtain credentials matching their lifespan

  • Automatic credential rotation (e.g., changing client_secret every 24 hours)
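As a rough illustration of dynamic registration, the sketch below builds the JSON body a public client might POST to a registration endpoint. The metadata field names come from RFC 7591; the client name, the redirect URI, and the function build_registration_request are assumptions made up for this example.

```python
import json
from typing import List


def build_registration_request(client_name: str, redirect_uris: List[str]) -> str:
    """Build an RFC 7591 registration body for a public client (illustrative)."""
    if not redirect_uris:
        raise ValueError("at least one redirect URI is required")
    metadata = {
        "client_name": client_name,
        "redirect_uris": redirect_uris,
        "grant_types": ["authorization_code", "refresh_token"],
        "response_types": ["code"],
        # Public clients (desktop, mobile) cannot keep a secret, so they rely on PKCE.
        "token_endpoint_auth_method": "none",
    }
    return json.dumps(metadata)


# Hypothetical client name and loopback redirect URI.
body = build_registration_request("example-agent", ["http://localhost:8765/callback"])
```

The server's response would carry the issued client_id (and, for confidential clients, a client_secret), which the client then uses in the authorization flow.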

1.2.2 Metadata Discovery Protocol

MCP makes the authorization server self-describing through a standardized discovery endpoint (OAuth 2.0 Authorization Server Metadata, RFC 8414):

GET /.well-known/oauth-authorization-server HTTP/1.1
Host: api.example.com
MCP-Protocol-Version: 2025-03-26

HTTP/1.1 200 OK
Content-Type: application/json

{
  "issuer": "https://api.example.com",
  "authorization_endpoint": "https://auth.example.com/authorize",
  "token_endpoint": "https://auth.example.com/token",
  "code_challenge_methods_supported": ["S256"]
}

If discovery fails, the client falls back to the default endpoint paths (/authorize, /token, and /register relative to the authorization base URL) to ensure compatibility.
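The fallback behavior can be sketched as a small resolver. This assumes the authorization base URL is the MCP server URL with its path removed and that the default paths are /authorize, /token, and /register; filling individual missing fields from the defaults is a design choice of this sketch, and the function names are hypothetical.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def authorization_base_url(mcp_server_url: str) -> str:
    """Drop the path: https://api.example.com/v1/mcp -> https://api.example.com."""
    parts = urlsplit(mcp_server_url)
    return f"{parts.scheme}://{parts.netloc}"


def resolve_endpoints(mcp_server_url: str, metadata: Optional[Dict[str, str]]) -> Dict[str, str]:
    """Prefer discovered metadata; otherwise use the default endpoint paths."""
    base = authorization_base_url(mcp_server_url)
    defaults = {
        "authorization_endpoint": f"{base}/authorize",
        "token_endpoint": f"{base}/token",
        "registration_endpoint": f"{base}/register",
    }
    if not metadata:
        return defaults  # discovery failed, e.g. the well-known URL returned 404
    # Discovered values win; any field the server omitted keeps its default path.
    return {key: metadata.get(key, default) for key, default in defaults.items()}
```

Passing None for metadata models a failed discovery request, so callers can use the same code path whether or not the server publishes a metadata document.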

1.3 Implementation Specifications: The Six Security Principles of MCP

1.3.1 Mandatory HTTPS for Entire Chain

  • All authorization endpoints must deploy TLS 1.3+

  • Mixed HTTP content (e.g., images) must go through encrypted channel proxy

1.3.2 Token Lifecycle Control

| Token Type | Recommended Lifespan | Refresh Rules |
|---|---|---|
| Access Token | ≤15 minutes | Invalidated immediately after single use |
| Refresh Token | ≤24 hours | A new token is generated with each refresh |

1.3.3 Client Credential Storage

  • Prohibit plaintext storage: use secure operating system storage or HSM encryption

  • Mobile uses Android Keystore/iOS Keychain

1.3.4 Session Binding

// Example of token metadata  
{  
  "token": "eyJhbGciOi...",  
  "binding": {  
    "client_id": "mcp-client-xyz",  
    "ip_range": "192.168.1.0/24",  
    "device_fingerprint": "SHA3-256(hardware features)"  
  }  
}

1.3.5 Audit Log

  • Record all token issuance/revocation events

  • High-risk operations (e.g., delete tool calls) must be associated with the original authorization session

1.3.6 Defensive Programming

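// Secure token validation pseudocode
public boolean verifyToken(String token) {
    try {
        JWT jwt = decode(token);
        // Check the signature first: claims in an unverified token cannot be trusted.
        if (!jwt.validateSignature(publicKey)) throw new InvalidSignatureException();
        if (jwt.isExpired()) throw new TokenExpiredException();
        if (jwt.getClaim("scope").contains("destructive")) {
            requireMfa(); // High-risk operations trigger multi-factor authentication
        }
        return true;
    } catch (JWTException e) {
        // Log a fingerprint, not the raw token, so the audit log cannot leak credentials.
        // fingerprint() is a hypothetical helper, e.g. a SHA-256 hash of the token.
        auditLog.logSecurityEvent("INVALID_TOKEN", fingerprint(token));
        return false;
    }
}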

1.4 Impact on the AI Tool Ecosystem

1.4.1 Standardized Description of Tool Behaviors

The metadata defined by the **ToolAnnotations** interface (see the example in section 4.1.1) allows developers to give clients non-mandatory hints about tool behavior. These annotations affect the toolchain ecosystem in the following ways:

  1. Increased Interaction Transparency

    • **title** provides semantic naming

    • **readOnlyHint/destructiveHint** indicates whether the operation is destructive

    • **openWorldHint** distinguishes between internal and external scopes (e.g., search engines vs memory access)

    The frontend can dynamically render operation confirmation pop-ups or risk warning icons based on these annotations.

  2. Optimized Call Strategy

    • **idempotentHint** allows clients to automatically retry idempotent requests (e.g., query operations)

    • Non-idempotent write operations (e.g., file deletions) require manual double confirmation

Ensuring Ecosystem Compatibility: All annotations are intended only as behavioral hints, and clients must not treat them as a substitute for security controls. For example:

if (tool.annotations.destructiveHint) {  
  showDestructiveWarningDialog(); // Frontend prompt  
}  
await enforceRBACPolicy(); // Actual permissions verified by RBAC engine  
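To make the hint semantics concrete, here is a small Python sketch of how a client might model the annotations and derive UI decisions from them. The default values follow the 2025-03-26 schema (destructiveHint and openWorldHint default to true, the other two to false); the helper functions needs_confirmation and safe_to_retry are assumptions for illustration, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolAnnotations:
    """Client-side model of the annotation hints, with the schema's defaults."""
    title: Optional[str] = None
    readOnlyHint: bool = False
    destructiveHint: bool = True
    idempotentHint: bool = False
    openWorldHint: bool = True


def needs_confirmation(a: ToolAnnotations) -> bool:
    """Ask the user before running a tool that may destroy data."""
    # destructiveHint is only meaningful when the tool is not read-only.
    return (not a.readOnlyHint) and a.destructiveHint


def safe_to_retry(a: ToolAnnotations) -> bool:
    """Retry automatically only if repeating the call has no additional effect."""
    return a.readOnlyHint or a.idempotentHint
```

Because an unannotated tool is treated as potentially destructive by default, a client built this way errs on the side of asking for confirmation. These helpers drive presentation only; authorization still belongs to the server-side policy engine.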

1.5 Developer Migration Guide

1.5.1 Comparison of Key Changes

| Feature Item | 2024-11-05 Version | 2025-03-26 Version |
|---|---|---|
| Authorization Endpoint Discovery | Manual Configuration | Automatic Discovery + Fallback Mechanism |
| PKCE Support | Optional | Mandatory Enablement |
| Token Storage | Allows Memory Cache | Must Use Secure Storage |
| Error Handling | Basic HTTP Status Codes | Refined OAuth Error Codes (e.g., invalid_scope) |

1.5.2 Code Migration Example

Old code snippet:

// OAuth 2.0 Implicit Flow  
const token = getTokenFromURLFragment();  
callMCPService(token);

New version secure implementation:

// OAuth 2.1 PKCE Flow  
const { verifier, challenge } = generatePKCE();  
startAuthFlow(challenge);  

// Callback Handling  
function handleCallback(code) {  
    fetchToken(code, verifier).then(token => {  
        secureStorage.save('mcp_token', token);  
        callMCPService(token);  
    });  
}

2. Streamable HTTP: A Revolutionary Upgrade to Unified Communication Protocol

2.1 The Evolution from Dual Endpoints to Single Endpoint

2.1.1 Pain Points of the Old Architecture

The HTTP+SSE dual-channel scheme adopted by the 2024-11-05 version has three structural flaws:

| Problem Type | Specific Manifestation | Technical Consequences |
|---|---|---|
| Complex Connection Management | Needs to maintain dual channels of POST request and SSE event stream | Clients need to implement dual connection keep-alive mechanisms |
| Difficult Disconnection Recovery | SSE stream interruptions require rebuilding a complete session | Long task scenarios may lose contextual data |
| Protocol Redundancies | Simple requests are forced to use streaming transmission | Extra 30% network resource consumption (based on MCP working group benchmark tests) |

Typical case: When the AI assistant simultaneously performs "speech-to-text + real-time translation", the old solution needs to establish 4 independent connections (2 tools × 2 protocols), leading to an average latency increase of 400ms on mobile.

2.1.2 Core Technology Analysis of Streamable HTTP

The new protocol changes the communication model through three major innovations:

Key Technical Features

  1. Intelligent Protocol Negotiation

    • Clients declare capabilities through the Accept header, listing both application/json and text/event-stream

    • Servers dynamically select the transmission mode, either a single JSON response or an SSE stream (experimental data shows negotiation time <5ms)

  2. Bidirectional Communication Tunnel

    • While an SSE stream is active, clients can send new requests via additional HTTP POST

    • Responses are matched to their requests through the JSON-RPC id field, so several exchanges can share the single endpoint

  3. Breakpoint Resume Mechanism

    • On reconnection, the client carries the Last-Event-ID header (e.g., Last-Event-ID: 159)

    • Servers can choose to:

      • Replay events from specified IDs (requires implementing event logging)

      • Return incremental updates (recommended for real-time monitoring scenarios)
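The resume mechanism depends on the client remembering the ID of the last event it fully received. The sketch below parses an SSE response body and reports that ID. It is a simplified reading of the SSE wire format that handles only the id and data fields, and the function names are assumptions for this example.

```python
from typing import Iterator, Optional, Tuple


def parse_sse(body: str) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_id, data) for each complete event in an SSE response body."""
    event_id: Optional[str] = None
    data_lines = []
    for line in body.splitlines():
        if line == "":
            # A blank line closes the current event.
            if data_lines:
                yield event_id, "\n".join(data_lines)
            data_lines = []
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the SSE format strips one leading space
        if field == "id":
            event_id = value
        elif field == "data":
            data_lines.append(value)


def last_event_id(body: str) -> Optional[str]:
    """Return the ID a client would send as Last-Event-ID after a disconnect."""
    last = None
    for event_id, _ in parse_sse(body):
        last = event_id
    return last
```

An event that was cut off before its terminating blank line is not yielded, so the reported ID always refers to a message the client actually processed.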

2.1.3 Performance Improvement and Stability Assurance

Network Efficiency Comparison Test

Data from the MCP official testing platform shows:

| Metric | Old Protocol (HTTP+SSE) | Streamable HTTP | Improvement Rate |
|---|---|---|---|
| Connection Establishment Time | 320ms ±50ms | 180ms ±20ms | 43.75% |
| Data Transmission Redundancy | 18% | 5% | 72.2% |
| Disconnection Recovery Success Rate | 68% | 93% | 36.8% |

3. JSON-RPC Batching: Protocol-Level Support for Efficiency Revolution

3.1 Implementation Principles of Batching Mechanism

3.1.1 Mandatory Requirements at the Protocol Level

The new specification states that implementations MAY support sending JSON-RPC 2.0 batches but MUST support receiving them. Under Streamable HTTP, when a POST body contains only notifications or responses and the server accepts it, the server returns HTTP 202 Accepted with no body.

Example of a valid request (the third entry has no id, which makes it a notification):

[
    {"jsonrpc":"2.0","id":1,"method":"text_analyze","params":{"text":"Hello"}},
    {"jsonrpc":"2.0","id":2,"method":"image_tag","params":{"url":"img.jpg"}},
    {"jsonrpc":"2.0","method":"log_event"}
]

Response handling rules:

  • A batch that contains requests returns HTTP 200 plus a response array (or an SSE stream); notifications produce no entry in the response

  • Atomicity: JSON-RPC 2.0 itself defines no all-or-nothing semantics for batches, so full success or full rollback has to be implemented at the application layer (e.g., with an atomic marker)
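A server-side handler that follows these rules might look like the sketch below. It assumes a plain dictionary of handler functions keyed by method name (a hypothetical registry, not part of any MCP SDK) and processes entries sequentially for clarity.

```python
import json
from typing import Callable, Dict, Tuple


def handle_batch(raw_body: str, handlers: Dict[str, Callable]) -> Tuple[int, str]:
    """Process a JSON-RPC 2.0 batch and return (http_status, response_body)."""
    batch = json.loads(raw_body)
    responses = []
    for message in batch:
        handler = handlers.get(message.get("method"))
        params = message.get("params", {})
        if "id" not in message:
            # Notification: run it if we know the method, but never respond.
            if handler:
                handler(params)
            continue
        if handler is None:
            responses.append({"jsonrpc": "2.0", "id": message["id"],
                              "error": {"code": -32601, "message": "Method not found"}})
        else:
            responses.append({"jsonrpc": "2.0", "id": message["id"],
                              "result": handler(params)})
    if not responses:
        return 202, ""  # the batch held only notifications
    return 200, json.dumps(responses)
```

Each response carries the id of its request, so the client can match results even if a parallel implementation returns them in a different order.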

3.2 Performance Optimization Case Analysis

3.2.1 Network Overhead Comparison

Assuming processing 100 independent requests:

| Metric | Single Request Mode | Batch Mode | Optimization Ratio |
|---|---|---|---|
| TCP Handshake Count | 100 | 1 | 99% |
| Total Header Size | ~150KB | ~2KB | 98.7% |
| Total Time (3G Network) | 12.3s | 1.8s | 85.4% |

3.2.2 Server Parallel Processing

// Go implementation of parallel batch processing.
// RPCRequest, RPCResponse, and processSingle are assumed to be defined elsewhere.
import (
    "context"
    "sync"
)

func HandleBatch(ctx context.Context, batch []RPCRequest) []RPCResponse {
    // Each goroutine writes only to its own slot, so no channel or mutex is
    // needed and responses keep the order of the incoming requests.
    responses := make([]RPCResponse, len(batch))
    var wg sync.WaitGroup

    for i, req := range batch {
        wg.Add(1)
        go func(i int, r RPCRequest) {
            defer wg.Done()
            responses[i] = processSingle(r)
        }(i, req)
    }

    wg.Wait() // block until every request in the batch has been processed
    return responses
}

Points to consider:

  • Control concurrency granularity (recommended no more than 50 requests per batch)

  • Implement request priority markers (priority field)

  • Support timeout circuit breaker mechanisms

4. Tool Metadata: Dual Evolution of Security and Experience

4.1 Tool Annotations Architecture Analysis

4.1.1 Metadata Classification System

tools:
  - name: database_backup  
    annotations:  
      # Standard behavior hints (following ToolAnnotations interface definition)
      title: "Database Backup"                 # Semantic title
      readOnlyHint: false                      # Non-read-only operation
      destructiveHint: false                   # Non-destructive operation
      idempotentHint: true                     # Idempotent operation (no side effects on repeated execution)
      openWorldHint: false                     # Closed scope (limited to local database)

4.1.2 Dynamic Permission Control Process

4.2 Security Enhancement Practices

4.2.1 Destructive Operation Interception Mechanism

When destructiveHint: true is detected, the following actions occur:

  1. Frontend automatically injects double confirmation

  2. Backend records security audit logs

  3. Forces MFA multi-factor authentication (if configured)

Audit log example:

{
  "action": "data_purge",  
  "user": "ai_agent_123",  
  "riskLevel": "critical",  
  "annotations": {"destructiveHint": true},  
  "timestamp": "2025-03-27T08:15:30Z",  
  "mfaUsed": true  
}

4.2.2 Automated Policy Generation

Policy engine based on metadata:

def generate_policy(tool):  
    policy = {  
        "effect": "allow" if tool.requiredScopes else "deny",  
        "conditions": []  
    }  

    if tool.annotations.get('destructiveHint'):  
        policy['conditions'].append({  
            "type": "mfa",  
            "required": True  
        })  

    return policy

5. Intelligent Progress Notifications: The Evolution from Digital to Semantic

5.1 Dynamic Message Notification Mechanism

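The new message field carries a human-readable status description alongside the numeric progress. In the specification, message is a plain string inside the params of a notifications/progress message (the progressToken value below is illustrative):

{
  "jsonrpc": "2.0",
  "method": "notifications/progress",
  "params": {
    "progressToken": "task-42",
    "progress": 65,
    "total": 100,
    "message": "Data Cleaning: processed 12000/20000 records; feature extraction is about to begin"
  }
}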

Application value:

  • Development debugging: Accurately locate task bottlenecks (e.g., "stuck in image preprocessing stage")

  • User interface: Supports multilingual dynamic prompts ("Remaining time: about 2 minutes")

  • Audit traceability: Keeps a complete record of task lifecycle status

6. Multimodal Expansion: Audio Stream Support Implementation

6.1 Audio Protocol Implementation Plan

New audio/* content type support:

POST /voice-process
Content-Type: audio/webm
Transfer-Encoding: chunked

<binary audio stream>

Key technical features:

| Function | Parameters |
|---|---|
| Encoding Format | WebM/MP3/WAV |
| Streaming | Supports chunked uploads and real-time transcription |
| Metadata Binding | Parameters such as sampling rate passed via X-Audio-Metadata header |

Scenario case: Intelligent customer service system can simultaneously receive user voice streams and respond with text in real-time.
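Within MCP messages themselves, audio travels as a content item next to text and image content. The sketch below builds such an item, assuming the 2025-03-26 schema shape (type "audio", base64-encoded data, and a mimeType); the helper name make_audio_content and the sample bytes are made up for illustration.

```python
import base64
from typing import Dict


def make_audio_content(audio_bytes: bytes, mime_type: str = "audio/wav") -> Dict[str, str]:
    """Wrap raw audio bytes as an MCP audio content item (illustrative helper)."""
    if not mime_type.startswith("audio/"):
        raise ValueError(f"not an audio MIME type: {mime_type}")
    return {
        "type": "audio",
        # JSON cannot carry raw bytes, so the payload is base64-encoded.
        "data": base64.b64encode(audio_bytes).decode("ascii"),
        "mimeType": mime_type,
    }


# Placeholder bytes standing in for a real recording.
item = make_audio_content(b"RIFF....WAVEfmt ", "audio/wav")
```

Base64 encoding inflates the payload by roughly one third, which is worth keeping in mind when sending long recordings in a single message.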

7. Parameter Completion: Upgrading Developer Experience

7.1 Intelligent Completion Workflow

  1. The client discovers the server's declaration of the completions capability

  2. A completion request is triggered as the user types:

    GET /completions?prefix=dat
    Response: ["date_format", "data_source", "dataset"]

  3. The client dynamically generates a list of parameter suggestions

Design advantages:

  • Reduces parameter input error rate by 90% (MCP working group statistics)

  • Supports context-based intelligent recommendations (e.g., prioritizes parameters commonly used by the current tool)
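On the wire, the 2025-03-26 specification expresses this exchange as the JSON-RPC method completion/complete rather than a dedicated HTTP route, so the GET line in the workflow is best read as shorthand. The following sketch builds such a request and reads the suggestions from a response; the prompt name, argument name, and helper functions are assumptions chosen for illustration.

```python
from itertools import count
from typing import Dict, List

_request_ids = count(1)  # simple source of unique JSON-RPC ids


def build_completion_request(prompt_name: str, argument: str, partial: str) -> Dict:
    """Build a completion/complete request for one argument of a prompt."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "completion/complete",
        "params": {
            "ref": {"type": "ref/prompt", "name": prompt_name},
            "argument": {"name": argument, "value": partial},
        },
    }


def extract_suggestions(response: Dict) -> List[str]:
    """Pull the suggested values out of a completion/complete result."""
    return response["result"]["completion"]["values"]


# Hypothetical prompt and argument names.
request = build_completion_request("report_builder", "field", "dat")
```

A server supporting the capability would answer with a result whose completion.values list holds the candidates, such as the three names shown in step 2.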

8. Session Management: Ensuring Reliability for Long Tasks

8.1 Full Lifecycle Management of Sessions

Core identification:

Mcp-Session-Id: sess_XYZ123 (illustrative value; a real ID should be globally unique and cryptographically secure, e.g., a UUID)

Disconnection recovery process:

1. The client caches the last received Event-ID (e.g., 159).  
2. When reconnecting, carry:  
   Last-Event-ID: 159  
   Mcp-Session-Id: sess_XYZ123  
3. The server can either resume from the breakpoint or return incremental updates
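The reconnection step above reduces to assembling the right headers for the request that reopens the stream. A minimal sketch, assuming the client has kept the session ID and the last event ID in memory; the function name build_resume_headers is hypothetical.

```python
from typing import Dict, Optional


def build_resume_headers(session_id: str, last_event_id: Optional[str]) -> Dict[str, str]:
    """Headers for the request that reopens an interrupted SSE stream."""
    headers = {
        "Accept": "text/event-stream",
        "Mcp-Session-Id": session_id,
    }
    if last_event_id is not None:
        # Omitted on the very first connection, when nothing has been received yet.
        headers["Last-Event-ID"] = last_event_id
    return headers


headers = build_resume_headers("sess_XYZ123", "159")
```

Sending Last-Event-ID only when an event has actually been received keeps the first connection and a resumed connection on the same code path.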

9. Conclusion - Building the Next Generation AI Collaboration Paradigm

9.1 Impact on Clients

Technical adaptation challenges

  • OAuth 2.1 and PKCE flows become mandatory, and mobile clients need to integrate system-level secure storage (e.g., iOS Secure Enclave)

  • Frontend frameworks need to deeply parse Tool Annotations to implement dynamic UI generation (e.g., automatically render warning icons for risky operations)

  • Audio stream processing must support Web Audio API and chunked transmission logic

Experience upgrade opportunities

  • Parameter completion functionality reduces the learning curve for developers (observed 38% improvement in API call efficiency)

  • Intelligent progress messaging supports generating rich media status cards (e.g., mix of charts and text)

9.2 Impact on Servers

Architectural transformation requirements

| Transformation Item | Implementation Cost | Benefit Level |
|---|---|---|
| Session State Management | High | ★★★★☆ |
| Streamable HTTP Gateway (e.g., Higress) | Low | ★★★★★ |
| Batch Atomic Transactions | Medium | ★★★☆☆ |

9.3 Reconstructing Developer Toolchains

Key upgrades in SDK:

# New Generation SDK Pseudocode Example  
class MCPClient:  
    def __init__(self):  
        self.session = ResilientSession()  # Automatic reconnection and checkpoint resuming  
        self.annotator = ToolAnnotationParser()  # Metadata parsing engine  
        self.auditor = SecurityAuditHook()  # Security audit hook  

    def call_tool(self, tool_name):  
        if self.annotator.risk_level(tool_name) == 'critical':  
            self.auditor.log_operation(tool_name)  # Automatically trigger auditing

Toolchain upgrades lead to:

  • Reduction of development debugging time by 57% (IDE plugin integration for auto-completion and protocol validation)

  • 82% decrease in security vulnerability rates (through annotation-driven permission validation)

9.4 How to Quickly Access New Features

Higress has taken the lead in supporting the Streamable HTTP transmission format and continues to prioritize aligning with various features of MCP 2025-03-26, such as session management with the Mcp-Session-Id header, supporting batch requests, responses, and notifications, as well as SSE stream recoverability.

See "API is MCP | Higress Releases MCP Marketplace, Accelerating Legacy API into the MCP Era"

On the commercial product side, the cloud-native API gateway will later align with the capabilities of open-source Higress and provide all enterprise-level MCP features. We welcome your inquiries and attention.

Contact

Follow and engage with us through the following channels to stay updated on the latest developments from higress.ai.
