Smart Insurance for the Future: The Journey of Cathay Insurance's AI Gateway Innovation
higress | Dec 24, 2024
1. Introduction
In the wave of digital transformation, Cathay Property & Casualty Insurance embraces large model technology with a forward-looking perspective, achieving deep applications of large models in various business scenarios such as outbound calls, customer service, and content generation. By introducing Alibaba Cloud's cloud-native API gateway, Cathay Property & Casualty Insurance not only simplifies the complexity of integrating large models but also effectively enhances data security and cost control capabilities, becoming a model for digital transformation in the insurance industry.
2. Background
Cathay Property & Casualty Insurance Co., Ltd. (referred to as "Cathay Property & Casualty Insurance") was founded in Shanghai on August 28, 2008. The company has a registered capital of 2.63 billion yuan and has branches in several provinces and cities in the southeast coastal and central-western regions of China, covering various fields of non-life insurance, including short-term health insurance, accidental injury insurance, property loss insurance, and liability insurance. Cathay Property & Casualty Insurance adheres to the development concept of "customer first," fully embraces the digital wave, and opens a new chapter of "technology insurance," serving the personalized protection needs of families and micro-enterprises, safeguarding the happiness of every household. Cathay Property & Casualty Insurance has always strived to create a customer-centric technology insurance brand. The company has won numerous honors such as "Outstanding Insurance Company," "Best Service Insurance Company," "Outstanding Case of Digital Transformation," and "Outstanding Case of Inclusive Finance" for many consecutive years, achieving high-quality development and market competitive advantages through value innovation in niche markets.
Cathay Property & Casualty Insurance integrates the digital economy with the insurance industry to create a holistic data value delivery system—"Digital Intelligence Dual-Drive System." The system employs a "small front office + large middle platform" strategic framework and constructs an integrated insurance management platform based on mechanisms such as digital operations and technology co-construction. It helps Cathay Property & Casualty Insurance simplify product development and claims experience, making services more efficient and protections more comprehensive.
As digital transformation advances across the company, Cathay's business applications are actively adopting large models in scenarios such as outbound calls, customer service, and content generation. Cathay selects different large models and access methods for different scenarios, using both self-built foundation models and large model APIs from external vendors, such as the Tongyi series and Tsinghua Zhipu.
3. Core Challenges
Cathay Property & Casualty Insurance faces five major challenges in its digital transformation process: unified access for multiple models, multi-tenancy and authentication, content security, cost control, and audit and risk control.
Unified access for multiple models: Cathay uses different large models for different business scenarios, and each access method has its own request and response data structures. Adapting every application to each of them is costly.
Multi-tenancy and authentication: Each large model provider requires its own API key as an access credential, and when services are provided externally, each user's permission to access large models must be controlled. Building authentication and authorization in-house is expensive.
Content security: Content returned by large models may be non-compliant, so a reliable detection service is needed to inspect both the input and the output of large models and keep conversational content safe.
Cost control: Because large model usage is billed by token, token usage must be monitored and observable. Knowing where tokens are consumed is what makes cost awareness and control possible.
Audit and risk control: When anomalies occur, such as excessive token consumption or risky conversational content, an internal audit mechanism is needed to pinpoint the specific requests and callers for risk control.
4. Solutions
To address the aforementioned business pain points of Cathay Property & Casualty Insurance, Alibaba Cloud's cloud-native API gateway offers mature solutions:
1) Unified access for multiple models: The cloud-native API gateway connects to large language models (LLMs) through a unified protocol and supports 15 LLM providers, covering most mainstream large model vendors. Once access is unified through the gateway, users no longer need to handle the differences in request and response data structures between models. Beyond protocol unification, the gateway also provides API key management, which covers API keys for various application platforms as well as those for large models. Callers therefore no longer need to include the large model's API key in the request header.
2) Multi-tenancy and authentication: The cloud-native API gateway provides multiple authentication methods, including JWT, HMAC, and API key. Because the gateway sits in front of the various LLMs, it hides the differences between their API keys, and its authentication features can be used to build a single authentication mechanism across all models for managing different external consumers.
3) Content security: The cloud-native API gateway provides quick integration with Alibaba Cloud content security (Green Network), which performs security checks on requests and responses passing through the gateway. Alibaba Cloud content security has passed evaluation by the China Academy of Information and Communications Technology, meeting capability requirements in four categories: functional requirements, risk control technology requirements, performance requirements, and product safety feature requirements. This provides security assurance for LLM conversational content.
4) Cost control: Unlike traditional APIs, which charge by number of calls, AI scenarios typically charge by the tokens used in each request, so monitoring and observing per-request token usage becomes a necessity. The cloud-native API gateway provides a comprehensive AI observability system across three dimensions: metrics, logs, and traces. With it, users can track token usage per request, per model, per consumer, and along other dimensions, which supports cost awareness and management.
5) Audit and risk control: The cloud-native API gateway offers detailed tracing that supports audit and risk control. For instance, when conversational content is flagged as risky, it is possible to determine which request, which consumer, and which keywords triggered the detection. Based on the audit results, users can respond promptly, for example by throttling a consumer's token usage or revoking its access permissions.
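To make the ideas of unified access and per-consumer token tracking more concrete, here is a minimal Python sketch. It is not the gateway's implementation or API: the provider names (`provider_a`, `provider_b`), their response shapes, and the `Usage` and `TokenLedger` types are all hypothetical, chosen only to show how differently shaped responses could be mapped to one record and then aggregated by consumer and model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Usage:
    """Provider-independent token usage for a single request."""
    model: str
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def normalize_usage(provider: str, response: dict) -> Usage:
    """Map a provider-specific response body to the unified Usage record.

    Both response shapes are hypothetical stand-ins, not real vendor formats.
    """
    if provider == "provider_a":
        usage = response["usage"]
        return Usage(response["model"],
                     usage["prompt_tokens"],
                     usage["completion_tokens"])
    if provider == "provider_b":
        tokens = response["tokens"]
        return Usage(response["model_name"], tokens["in"], tokens["out"])
    raise ValueError(f"unsupported provider: {provider}")


class TokenLedger:
    """Accumulates total tokens per (consumer, model) pair."""

    def __init__(self) -> None:
        self._totals: Dict[Tuple[str, str], int] = defaultdict(int)

    def record(self, consumer: str, usage: Usage) -> None:
        self._totals[(consumer, usage.model)] += usage.total_tokens

    def total_for_consumer(self, consumer: str) -> int:
        return sum(total for (name, _), total in self._totals.items()
                   if name == consumer)

    def consumers_over(self, budget: int) -> List[str]:
        """Return, sorted by name, the consumers whose total exceeds budget."""
        names = {name for name, _ in self._totals}
        return sorted(n for n in names if self.total_for_consumer(n) > budget)
```

A ledger of this kind is what an audit step would consult: `consumers_over(budget)` yields the callers whose consumption looks anomalous, who could then be throttled or have their access revoked.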
5. Technical Advantages
Compared to other gateways, the cloud-native API gateway has the following technical advantages: high performance, high availability, easy extensibility, and strong observability.
In the AI context, the traffic that goes through the gateway has the following three major characteristics, distinguishing it from other business traffic:
Long connections: WebSocket and SSE are common in AI scenarios, so a high proportion of connections are long-lived. Gateway configuration updates must therefore not affect these connections or disrupt the business.
High latency: LLM inference responds far more slowly than ordinary applications, which leaves AI applications exposed to attacks that send many slow requests concurrently. Such attacks cost the attacker little but impose very high overhead on the server.
Large bandwidth: Because LLM context is transmitted back and forth and responses are slow, AI scenarios consume far more bandwidth than ordinary applications. Without good streaming and memory reclamation in the gateway, memory usage can grow rapidly.
To respond to AI traffic, Alibaba Cloud's cloud-native API gateway, based on the Envoy core, has inherent advantages, including:
Lossless hot updates for long connections: Unlike Nginx, where configuration changes require a reload that can cut long connections, Higress, based on Envoy, applies configuration changes as true hot updates without dropping connections.
Security gateway capabilities: The Higress-based security gateway provides multi-dimensional CC (Challenge Collapsar) protection keyed on attributes such as IP and Cookie. For AI scenarios, in addition to limits on queries per second (QPS), it also supports flow control based on token throughput.
Efficient streaming transmission: Higress supports fully streaming forwarding, and its Envoy-based data plane is written in C++, so it needs very little memory in high-bandwidth scenarios. Although memory is cheap compared with GPUs, poor memory control can cause out-of-memory (OOM) failures, leading to significant business disruption and losses.
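Flow control based on token throughput differs from QPS limiting only in the unit being counted. The sketch below illustrates the idea with a classic token bucket whose budget is measured in LLM tokens per consumer. It is an illustration under stated assumptions, not the gateway's plugin: the class name, parameters, and injectable clock are hypothetical, and it assumes the LLM token count of a request is already known or estimated when `allow` is called.

```python
import time
from typing import Callable, Dict, Tuple


class TokenThroughputLimiter:
    """Per-consumer token bucket measured in LLM tokens, not requests."""

    def __init__(self, tokens_per_second: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = tokens_per_second   # refill speed, in LLM tokens per second
        self.burst = burst              # bucket capacity, in LLM tokens
        self.clock = clock              # injectable so tests can control time
        # consumer -> (LLM tokens currently available, time of last refill)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, consumer: str, llm_tokens: int) -> bool:
        """Spend llm_tokens from the consumer's bucket if enough remain."""
        now = self.clock()
        available, last = self._buckets.get(consumer, (self.burst, now))
        # Refill for the elapsed time, never beyond the bucket capacity.
        available = min(self.burst, available + (now - last) * self.rate)
        if llm_tokens <= available:
            self._buckets[consumer] = (available - llm_tokens, now)
            return True
        self._buckets[consumer] = (available, now)
        return False
```

With a limiter like this, one consumer sending a few very large prompts is throttled just as firmly as one sending many small ones, which a pure QPS limit would not achieve.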
As for its own availability, the cloud-native API gateway avoids the availability problems of self-built infrastructure through multi-availability-zone disaster recovery, elastic scaling, and self-healing, and it is backed by a 99.95% SLA.
In terms of the AI ecosystem and extensibility, the cloud-native API gateway uses a Wasm plugin mechanism and offers as many as 15 AI-related plugins in its plugin market, covering large model proxying, sensitive data detection, content security auditing, custom statistics, and token throttling. These plugins enable access to various LLMs and integration with cloud services such as Alibaba Cloud content security, Redis, and vector retrieval services. Combining them flexibly meets Cathay's fundamental needs in large model scenarios and allows different control strategies to be set for different business segments. The plugin market also supports uploading custom plugins, which significantly enhances the gateway's extensibility.
In terms of observability, Alibaba Cloud's cloud-native API gateway integrates with cloud monitoring and log services, offering ready-to-use multi-dimensional dashboards that support business monitoring and fault localization. Users can leverage query analysis capabilities of cloud monitoring/log services to customize dashboards and alerts as needed.
6. Conclusion
With the implementation of Alibaba Cloud's cloud-native API gateway at Cathay, all traffic accessing large models at Cathay is now proxied through the cloud-native API gateway. While consuming nearly 100 million tokens daily, it efficiently filters sensitive information for each request, ensuring comprehensive audits for both the inputs to large models and the content generated by them, greatly reducing the data security risks associated with using large models. Through the AI plugins of the gateway, Cathay Property & Casualty Insurance can track each token's use, including which consumers are using it and in which scenarios, providing strong data support for subsequent analysis and cost control.