Comparison of data before and after using Streamable HTTP

Jing Ze

Apr 28, 2025

MCP (Model Context Protocol) is a standard protocol for communication between AI models and tools. As AI applications become increasingly complex and widely deployed, the existing communication mechanism faces a number of challenges. Recently, the MCP repository's PR #206 introduced a brand new Streamable HTTP transport layer to replace the original HTTP+SSE transport layer. In brief, the two protocols compare as follows:

  • HTTP+SSE: The client sends requests via HTTP POST, and the server pushes responses through a separate SSE (Server-Sent Events) endpoint, requiring the maintenance of two separate connections.

  • Streamable HTTP: It uses a single HTTP endpoint to handle requests and responses uniformly, allowing the server to choose between returning standard HTTP responses or enabling SSE streaming as needed.

This article will detail the technical details and practical advantages of Streamable HTTP.

Key Points Overview

  • Streamable HTTP offers better stability compared to HTTP + SSE, performing better under high concurrency scenarios.

  • In terms of performance, Streamable HTTP has clear advantages over HTTP + SSE, with shorter and more stable response times.

  • The client implementation of Streamable HTTP is simpler than HTTP + SSE, requiring less code and lower maintenance costs.

Why Choose Streamable HTTP?

Problems with HTTP + SSE

In the HTTP+SSE transport, clients and servers communicate through two separate channels: (1) HTTP request/response: the client sends messages to the server via standard HTTP POST requests. (2) Server-Sent Events (SSE): the server pushes messages to the client through a dedicated /sse endpoint. This design leads to three main issues:

  • The server must maintain long connections, which can lead to significant resource consumption under high concurrency.

  • Server messages can only be transmitted via SSE, which creates unnecessary complexity and overhead.

  • Infrastructure compatibility: Many existing network infrastructures may not correctly handle long-term SSE connections. Enterprise firewalls may forcefully terminate timed-out connections, making the service unreliable.

Improvements in Streamable HTTP

Streamable HTTP is a significant upgrade to the MCP protocol, solving several key issues of the original HTTP + SSE transmission method through the following improvements:

1. Unified Endpoint Design

Streamable HTTP removes the dedicated /sse endpoint for establishing connections, integrating all communication into a single endpoint. The benefits of this design include:

  • Simplified architecture: Reduces the number of connections between the client and server, lowering system complexity.

  • Lower resource consumption: Single connection management is more efficient, reducing server resource usage.

  • Improved compatibility: Better adapts to existing network infrastructure, reducing compatibility issues with firewalls and proxy servers.

2. Flexible Transmission Modes

The server can flexibly choose to return standard HTTP responses or stream them via SSE based on the type and content of the request:

  • On-demand streaming: For simple requests, a direct HTTP response can be returned without establishing long connections.

  • Intelligent degradation: Can automatically degrade to standard HTTP mode in poor network conditions.

  • Resource optimization: Dynamically allocates server resources based on request complexity to improve overall efficiency.
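As a rough illustration of how one endpoint can serve both modes, here is a stdlib-only sketch. The handler and the helper `choose_response_mode` are hypothetical names of our own, not part of the MCP SDK; the idea is simply that the same POST handler either answers with a plain JSON body or upgrades its own response to an SSE stream, based on the client's Accept header:

```python
import json
from http.server import BaseHTTPRequestHandler


def choose_response_mode(accept_header: str) -> str:
    """Stream only when the client advertises SSE support; otherwise plain JSON."""
    return 'sse' if 'text/event-stream' in accept_header else 'json'


class UnifiedMCPHandler(BaseHTTPRequestHandler):
    """Hypothetical single-endpoint handler: no separate /sse endpoint."""

    def do_POST(self):
        length = int(self.headers.get('Content-Length', 0))
        request = json.loads(self.rfile.read(length) or b'{}')

        if choose_response_mode(self.headers.get('Accept', '')) == 'json':
            # Simple request: answer with one ordinary HTTP response
            body = json.dumps({'jsonrpc': '2.0',
                               'id': request.get('id'),
                               'result': {}}).encode()
            self.send_response(200)
            self.send_header('Content-Type', 'application/json')
            self.send_header('Content-Length', str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Long-running request: upgrade this same response to an SSE stream
            self.send_response(200)
            self.send_header('Content-Type', 'text/event-stream')
            self.end_headers()
            self.wfile.write(b'data: {"progress": "started"}\n\n')
```

No second endpoint or second connection is needed: the streaming case is just a different Content-Type on the same response.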

3. Powerful Session Management

A comprehensive session mechanism has been introduced to support state management and recovery:

  • Session consistency: Ensures state consistency across requests via the Mcp-Session-Id header.

  • Reconnect on disconnection: Supports the Last-Event-ID mechanism to ensure that messages not received after a disconnection can be recovered.

  • State recovery: Allows clients to restore previous session states upon reconnection, enhancing user experience.
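A minimal sketch of how a client might attach these headers when resuming a session (the helper name is ours; only the Mcp-Session-Id and Last-Event-ID header names come from the protocol):

```python
def reconnect_headers(session_id: str, last_event_id: str = None) -> dict:
    """Build the headers a client resends when resuming an MCP session."""
    headers = {'Mcp-Session-Id': session_id}  # keeps requests in the same session
    if last_event_id is not None:
        # Ask the server to replay every event after the last one we received
        headers['Last-Event-ID'] = last_event_id
    return headers
```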

HTTP + SSE vs Streamable HTTP

The comparison below illustrates the advantages of Streamable HTTP over HTTP + SSE from three aspects: stability, performance, and client complexity. The AI gateway Higress has implemented the Streamable HTTP protocol. For testing, an MCP Server using the HTTP + SSE protocol was deployed via a sample server from the official MCP Python SDK, and a Streamable HTTP MCP Server was deployed through Higress.

Stability Comparison

TCP Connection Count Comparison

A Python program was used to simulate 1000 users concurrently accessing a remote MCP Server and requesting the tool list. As the figure shows, the SSE Server's SSE connections cannot be reused and must be held open long-term, so high concurrency causes the number of TCP connections to rise sharply. The Streamable HTTP protocol, by contrast, returns responses directly and reuses the same TCP connection across multiple requests: its TCP connection count peaks at only a few dozen, and the overall execution time is only a quarter of the SSE Server's.

In the testing scenario with 1000 concurrent users, the number of TCP connections for the Streamable HTTP solution deployed by Higress is significantly lower than the HTTP + SSE solution:

  • HTTP + SSE: Requires maintaining a large number of long connections, with the TCP connection count continuously increasing over time.

  • Streamable HTTP: Establishes connections as needed, maintaining a low level of TCP connections.
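The connection-count gap can be illustrated with a toy asyncio simulation (purely illustrative; this is not the benchmark script used above). Each SSE user holds a dedicated connection open for its whole session, while Streamable HTTP requests borrow from a small reusable pool:

```python
import asyncio


class ConnectionCounter:
    """Toy model of TCP usage: tracks concurrently open connections."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    def open(self):
        self.active += 1
        self.peak = max(self.peak, self.active)

    def close(self):
        self.active -= 1


async def sse_user(counter, done):
    counter.open()           # dedicated long-lived SSE connection
    await done.wait()        # held for the entire session
    counter.close()


async def streamable_user(counter, pool):
    async with pool:         # borrow a pooled, reusable connection
        counter.open()
        await asyncio.sleep(0)   # stand-in for one request/response
        counter.close()


async def simulate(users=1000, pool_size=50):
    # HTTP+SSE: every user keeps its own connection open
    sse = ConnectionCounter()
    done = asyncio.Event()
    tasks = [asyncio.create_task(sse_user(sse, done)) for _ in range(users)]
    while sse.active < users:
        await asyncio.sleep(0)
    done.set()
    await asyncio.gather(*tasks)

    # Streamable HTTP: requests share a small connection pool
    pooled = ConnectionCounter()
    pool = asyncio.Semaphore(pool_size)
    await asyncio.gather(*(streamable_user(pooled, pool) for _ in range(users)))
    return sse.peak, pooled.peak
```

Running `asyncio.run(simulate())` shows the SSE model peaking at one connection per user, while the pooled model never exceeds the pool size.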

Request Success Rate Comparison

In practice, the number of connections a process can hold is capped by its file descriptor limit, which defaults to 1024 on Linux. A Python program was used to simulate different numbers of users accessing a remote MCP Server and requesting the tool list. Once the number of concurrent requests exceeds the connection limit, the SSE Server's success rate drops sharply: many concurrent requests fail because they cannot establish new SSE connections.

In the request success rate testing under different numbers of concurrent users, the success rate of the Streamable HTTP deployed by Higress is significantly higher than that of the HTTP + SSE solution:

  • HTTP + SSE: The success rate drops significantly as the number of concurrent users increases.

  • Streamable HTTP: Can maintain a relatively high success rate even under high concurrency conditions.

Performance Comparison

Here we compare the community Python version of the GitHub MCP Server with the GitHub MCP Server from Higress's MCP Market.

A Python program was used to simulate different numbers of users concurrently accessing a remote MCP Server and requesting the tool list, recording the response time. The response-time comparison in the figure uses a logarithmic scale. Under high concurrency, the SSE Server's average response time rises significantly, from 0.0018s to 1.5112s, while the Streamable HTTP Server deployed by Higress stays at 0.0075s. This also benefits from Higress's production-grade performance compared with the Python Starlette framework.

The performance testing results show that the Streamable HTTP deployed by Higress has a clear advantage in response time:

  • The average response time of Streamable HTTP is shorter, with less fluctuation in response time, and grows more steadily with increasing concurrent user numbers.

  • The average response time of HTTP + SSE is longer, with significant fluctuations in response time under high concurrency scenarios.
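Timings like these can be collected with a small harness along the following lines (illustrative only; this is not the article's benchmark script). It fires n concurrent calls at any awaitable request function and reports average and worst-case latency:

```python
import asyncio
import time


async def measure(make_request, n=100):
    """Run n concurrent requests and return (average, max) latency in seconds.

    `make_request` is any zero-argument callable returning an awaitable,
    e.g. a coroutine that POSTs a tools/list call to an MCP endpoint.
    """
    async def timed():
        start = time.perf_counter()
        await make_request()
        return time.perf_counter() - start

    latencies = await asyncio.gather(*(timed() for _ in range(n)))
    return sum(latencies) / n, max(latencies)
```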

Client Complexity Comparison

Streamable HTTP supports both stateless and stateful services. Most current scenarios can be solved with the stateless Streamable HTTP. By comparing the client implementation code of the two transmission solutions, the simplicity of the stateless Streamable HTTP client implementation can be clearly seen.

HTTP + SSE Client Sample Code

import json

import aiohttp


class SSEClient:
    def __init__(self, url: str, headers: dict = None):
        self.url = url
        self.headers = headers or {}
        self.endpoint = None  # POST endpoint, announced by the server's first event

    async def connect(self):
        # 1. Establish the long-lived SSE connection
        async with aiohttp.ClientSession(headers=self.headers) as session:
            async with session.get(self.url) as event_source:
                # 2. Handle connection errors and reconnect
                if event_source.status != 200:
                    print(f'SSE error: {event_source.status}')
                    await self.reconnect()
                    return
                print('SSE connection established')

                # 3. Parse and handle message events from the stream
                async for raw_line in event_source.content:
                    line = raw_line.decode('utf-8').strip()
                    if not line.startswith('data:'):
                        continue
                    payload = line[len('data:'):].strip()
                    if self.endpoint is None:
                        # The server's first event announces where to POST messages
                        # (a path relative to the server base URL in practice)
                        self.endpoint = payload
                    else:
                        await self.handle_message(json.loads(payload))

    async def send(self, message: dict):
        # Sending requires a second channel: an additional POST request
        if self.endpoint is None:
            raise RuntimeError('Not connected: endpoint not yet received')
        async with aiohttp.ClientSession(headers=self.headers) as session:
            async with session.post(self.endpoint, json=message) as response:
                return await response.json()

    async def handle_message(self, message: dict):
        # Handle the received message
        print(f'Received message: {message}')

    async def reconnect(self):
        # Implement reconnection logic
        print('Attempting to reconnect...')
        await self.connect()

Streamable HTTP Client Sample Code

import aiohttp


class StreamableHTTPClient:
    def __init__(self, url: str, headers: dict = None):
        self.url = url
        self.headers = headers or {}

    async def send(self, message: dict):
        # 1. Send a POST request to the single MCP endpoint
        #    (json= sets Content-Type: application/json automatically)
        async with aiohttp.ClientSession(headers=self.headers) as session:
            async with session.post(self.url, json=message) as response:
                # 2. Handle the response
                if response.status == 200:
                    return await response.json()
                raise Exception(f'HTTP error: {response.status}')

The code comparison shows:

  1. Complexity: Streamable HTTP does not require managing connections, reconnections, or other complex logic.

  2. Maintainability: Streamable HTTP has a clearer code structure, making it easier to maintain and debug.

  3. Error handling: Streamable HTTP’s error handling is more straightforward, with no need to consider connection states.

Conclusion and Outlook

With the continuous development of the MCP protocol, the introduction of the Streamable HTTP transmission mechanism marks a significant step towards a more efficient and stable protocol. Through unified endpoint design, flexible transmission modes, and powerful session management, Streamable HTTP addresses many pain points of the HTTP+SSE solution, providing a more reliable communication foundation for AI applications.

For developers and enterprises looking to quickly deploy high-performance MCP services, mcp.higress.ai offers an MCP server hosting market built on the open-source AI gateway Higress. The unique advantages of this market include:

  • Dual protocol support: Simultaneously provides high-performance Streamable HTTP protocol and a service compatible with POST+SSE protocols, ensuring compatibility with various clients.

  • Direct API transformation: No coding required, directly converting specification-compliant API documentation into production-ready MCP services.

  • Zero-cost deployment: Existing APIs can be transformed into MCP services with almost no cost.

  • Practical value orientation: All provided capabilities are based on the usage scenarios of the APIs themselves, providing real application value.

Unlike other MCP markets, mcp.higress.ai does not simply collect unverified Python/TS projects from the open-source community, but focuses on transforming mature APIs into high-quality MCP services, providing reliable tools and resource support for AI applications. With the popularity of the MCP protocol and the expansion of application scenarios, the Higress-based MCP services will provide a more solid infrastructure support for the AI ecosystem.

Contact

Follow and engage with us through the following channels to stay updated on the latest developments from higress.ai.