
Spring AI in Practice: Error Handling, Retries, and Circuit-Breaker Fallback for Model API Calls

news 2026/5/31 1:25:37

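When you call an external LLM API, network jitter, rate limiting, and server-side 500s are routine. If the code dumps a stack trace on the user at the first error, the user experience is effectively nil. This article starts from basic exception handling and works up through retries and circuit-breaker fallback, so that you end up with a complete line of defense.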

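1. Know the "enemy": common error types

To handle an error well, you first need to know what kind of error it is. The table below groups the common cases: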

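HTTP status / condition	Error type	Handling
401	API key invalid or expired	Alert; do not retry
403	No permission (model not enabled for the account)	Alert; do not retry
429	Rate limited (too many requests)	Wait and retry, or switch to a backup model
500/502/503	Server-side error	Retry 1-3 times
Timeout	Network or model responding too slowly	Retry, or degrade
Insufficient balance	API account quota exhausted	Alert; switch to a backup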

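In the setup used in this article, these HTTP errors surface as subclasses of Spring's HttpClientErrorException (4xx) or HttpServerErrorException (5xx), which makes branching on them straightforward. Note that this depends on the model client and its ResponseErrorHandler: some Spring AI model starters instead throw TransientAiException or NonTransientAiException from their built-in error handling, so check which type your client actually throws before writing catch blocks.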
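To make the table concrete, here is a minimal sketch that maps a status code to the handling strategy listed above. The class name ErrorPolicy and the Action enum are hypothetical names introduced only for illustration; they are not part of Spring AI. Timeouts and insufficient balance are not plain status codes, so this sketch leaves them out.

```java
/** Hypothetical helper: maps an HTTP status code to the strategy from the table above. */
final class ErrorPolicy {

    enum Action { ALERT_NO_RETRY, RETRY_OR_FALLBACK, RETRY, UNKNOWN }

    static Action classify(int status) {
        return switch (status) {
            case 401, 403 -> Action.ALERT_NO_RETRY;    // bad key or no permission: retrying cannot help
            case 429 -> Action.RETRY_OR_FALLBACK;      // rate limited: wait, or switch to a backup model
            case 500, 502, 503 -> Action.RETRY;        // likely transient server-side failure
            default -> Action.UNKNOWN;                 // not covered by the table
        };
    }

    public static void main(String[] args) {
        for (int status : new int[] {401, 429, 503, 404}) {
            System.out.println(status + " -> " + classify(status));
        }
    }
}
```

Keeping the decision in one place like this means the retry layer and the alerting layer can share the same classification instead of each re-reading status codes.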

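2. Basic exception handling: stop swallowing everything in one big catch

This is the most basic layer, and not an elegant one, so treat it as background: handle each exception type separately instead of lumping every case into a single catch (Exception e).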

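    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.*;
    import org.springframework.web.client.HttpClientErrorException;
    import org.springframework.web.client.HttpServerErrorException;

    @RestController
    public class ChatController {

        private static final Logger log = LoggerFactory.getLogger(ChatController.class);

        // Assumes a ChatClient bean is defined elsewhere in the application.
        private final ChatClient chatClient;

        public ChatController(ChatClient chatClient) {
            this.chatClient = chatClient;
        }

        @GetMapping("/chat")
        public ResponseEntity<String> chat(@RequestParam String prompt) {
            try {
                // Fluent ChatClient API, consistent with the later examples.
                String response = chatClient.prompt().user(prompt).call().content();
                return ResponseEntity.ok(response);
            } catch (HttpClientErrorException.Unauthorized e) {
                // 401: alert, do not retry
                log.error("Invalid API key", e);
                return ResponseEntity.status(401).body("Authentication failed, please check the API key");
            } catch (HttpClientErrorException.Forbidden e) {
                // 403: alert, do not retry
                log.error("Access denied", e);
                return ResponseEntity.status(403).body("Not authorized to use this model");
            } catch (HttpClientErrorException.TooManyRequests e) {
                // 429: candidate for retry or switching to a backup model
                log.warn("Rate limit hit");
                return ResponseEntity.status(429).body("Too many requests, please try again later");
            } catch (HttpServerErrorException e) {
                // 5xx: candidate for retry
                log.error("Server-side error", e);
                return ResponseEntity.status(502).body("AI service temporarily unavailable");
            }
        }
    }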

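This only gets you as far as "report the error when it happens". For intermittent errors such as 429 and 5xx, a better approach is to wait briefly and try again.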

3. Retries: giving the request another chance

Option 1: Spring Retry (recommended, declarative retry)

Add the dependencies:

    <dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-aspects</artifactId>
    </dependency>

Add @EnableRetry to the application class:

    @SpringBootApplication
    @EnableRetry
    public class AiApplication {
        // ...
    }

Put @Retryable on the Service method that needs retrying. Once every attempt has failed, the @Recover method provides the fallback:

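    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.retry.annotation.Backoff;
    import org.springframework.retry.annotation.Recover;
    import org.springframework.retry.annotation.Retryable;
    import org.springframework.stereotype.Service;
    import org.springframework.web.client.HttpClientErrorException;
    import org.springframework.web.client.HttpServerErrorException;

    @Service
    public class AiService {

        private static final Logger log = LoggerFactory.getLogger(AiService.class);

        private final ChatClient chatClient;

        public AiService(ChatClient chatClient) {
            this.chatClient = chatClient;
        }

        // retryFor requires Spring Retry 2.x; only 5xx and 429 are retried.
        // Up to 3 attempts in total, waiting 1s and then 2s between them.
        @Retryable(
            retryFor = {HttpServerErrorException.class, HttpClientErrorException.TooManyRequests.class},
            maxAttempts = 3,
            backoff = @Backoff(delay = 1000, multiplier = 2)
        )
        public String callWithRetry(String prompt) {
            return chatClient.prompt().user(prompt).call().content();
        }

        // Called after the last attempt fails; the return type must match callWithRetry.
        @Recover
        public String recover(Exception e, String prompt) {
            log.error("All retries failed, falling back", e);
            return "AI service temporarily unavailable, please try again later";
        }
    }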

The Controller then only needs to call the Service (here injected as aiService):

    @GetMapping("/chat-retry")
    public ResponseEntity<String> chatRetry(@RequestParam String prompt) {
        String result = aiService.callWithRetry(prompt);
        return ResponseEntity.ok(result);
    }
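To see what maxAttempts = 3 with delay = 1000 and multiplier = 2 means in practice, the arithmetic can be sketched in plain Java. This only illustrates the exponential backoff calculation; it is not Spring Retry's internal code, and it ignores any maximum-interval cap or jitter. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: computes the waits between attempts for exponential backoff. */
final class BackoffSchedule {

    /** maxAttempts tries means maxAttempts - 1 waits, each multiplier times the previous one. */
    static List<Long> delays(int maxAttempts, long initialDelayMs, double multiplier) {
        List<Long> waits = new ArrayList<>();
        double delay = initialDelayMs;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            waits.add((long) delay);
            delay *= multiplier;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Same numbers as the @Retryable example: 3 attempts, 1000 ms, multiplier 2
        System.out.println(delays(3, 1000, 2.0)); // prints [1000, 2000]
    }
}
```

So a request that fails all three attempts adds roughly three seconds of waiting on top of the three call durations, which is worth keeping in mind when choosing upstream timeouts.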

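Option 2: manual retry that honors the Retry-After header

Some APIs include a Retry-After header in a 429 response, telling you how many seconds to wait. @Retryable cannot read response headers directly, so a manual loop fits better here: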

    package com.jichi.springaialibaba.service;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.beans.factory.annotation.Qualifier;
    import org.springframework.stereotype.Service;
    import org.springframework.web.client.HttpClientErrorException;
    import org.springframework.web.client.HttpServerErrorException;

    @Service
    public class ManualRetryChatService {

        private static final Logger log = LoggerFactory.getLogger(ManualRetryChatService.class);

        private final ChatClient chatClient;

        public ManualRetryChatService(@Qualifier("primaryChatClient") ChatClient chatClient) {
            this.chatClient = chatClient;
        }

        public String chatWithManualRetry(String message) {
            int maxAttempts = 3;
            long delayMs = 1000;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return chatClient.prompt()
                            .user(message)
                            .call()
                            .content();
                } catch (HttpClientErrorException.TooManyRequests e) {
                    if (attempt == maxAttempts) {
                        throw new RuntimeException("Rate limit exceeded, please try again later", e);
                    }
                    // Prefer the Retry-After response header; fall back to the default wait
                    String retryAfter = e.getResponseHeaders() != null
                            ? e.getResponseHeaders().getFirst("Retry-After")
                            : null;
                    long waitMs = parseRetryAfterMs(retryAfter, delayMs);
                    log.warn("Rate limited, waiting {}ms before retry (attempt {}/{})", waitMs, attempt, maxAttempts);
                    sleep(waitMs);
                } catch (HttpServerErrorException e) {
                    if (attempt == maxAttempts) {
                        throw new RuntimeException("AI service error", e);
                    }
                    log.warn("Server-side error, retrying in {}ms (attempt {}/{})", delayMs, attempt, maxAttempts);
                    sleep(delayMs);
                    delayMs *= 2; // exponential backoff
                } catch (Exception e) {
                    // Anything else is treated as non-retryable
                    throw new RuntimeException("AI call failed", e);
                }
            }
            // Not reached in practice: the last attempt either returns or throws
            return "AI service temporarily unavailable";
        }

        // Retry-After may also be an HTTP date; this method only handles the seconds
        // form and uses the fallback for anything it cannot parse.
        private long parseRetryAfterMs(String retryAfter, long fallbackMs) {
            if (retryAfter == null) {
                return fallbackMs;
            }
            try {
                return Long.parseLong(retryAfter.trim()) * 1000;
            } catch (NumberFormatException e) {
                return fallbackMs;
            }
        }

        private void sleep(long ms) {
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                throw new RuntimeException("Retry interrupted", e);
            }
        }
    }

The corresponding Controller:

    package com.jichi.springaialibaba.controller;

    import com.jichi.springaialibaba.service.ManualRetryChatService;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/manual-retry")
    public class ManualRetryChatController {

        private final ManualRetryChatService manualRetryChatService;

        public ManualRetryChatController(ManualRetryChatService manualRetryChatService) {
            this.manualRetryChatService = manualRetryChatService;
        }

        @GetMapping
        public String chat(@RequestParam String message) {
            return manualRetryChatService.chatWithManualRetry(message);
        }
    }
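One detail about the Retry-After header read in the manual retry loop: the HTTP specification allows it to be either a number of seconds or an HTTP date. The sketch below handles both forms and falls back to a default wait for anything it cannot parse. RetryAfterParser and toWaitMillis are hypothetical names for this example, and the current time is passed in explicitly only to keep the method easy to test.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

/** Hypothetical helper: converts a Retry-After header value into a wait in milliseconds. */
final class RetryAfterParser {

    static long toWaitMillis(String headerValue, long defaultMs, Instant now) {
        if (headerValue == null || headerValue.isBlank()) {
            return defaultMs;
        }
        String value = headerValue.trim();
        try {
            // Form 1: delay in seconds, e.g. "5"
            return Math.max(0, Long.parseLong(value)) * 1000;
        } catch (NumberFormatException notSeconds) {
            // fall through and try the date form
        }
        try {
            // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            Instant retryAt = ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
            return Math.max(0, Duration.between(now, retryAt).toMillis());
        } catch (DateTimeParseException notADate) {
            return defaultMs; // unparseable: use the caller's default wait
        }
    }

    public static void main(String[] args) {
        System.out.println(toWaitMillis("5", 1000, Instant.now()));
        System.out.println(toWaitMillis(null, 1000, Instant.now()));
    }
}
```

In a real service you would probably also cap the result, so that a very large Retry-After value cannot block a request thread for minutes.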

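4. Circuit breaking and fallback: avoiding cascading failures

Retries only help with intermittent problems. If the model API stays down (say, for five minutes straight) and every request still retries three times, the whole system slows to a crawl. This is where a circuit breaker comes in: once the error rate crosses a threshold, calls fail fast without touching the real API, and after a waiting period the breaker automatically probes whether the API has recovered.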
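The behavior just described can be sketched as a small state machine. This is a deliberately simplified illustration, not how Resilience4j is implemented: SimpleCircuitBreaker and all its parameters are hypothetical, the half-open state admits a single trial call, and the clock is passed in as nowMs so the transitions are easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Simplified count-based circuit breaker, for illustration only. */
final class SimpleCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int windowSize;               // number of recent calls to evaluate
    private final double failureRateThreshold;  // e.g. 0.5 means 50%
    private final long openWaitMs;              // how long to stay open before probing
    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;

    SimpleCircuitBreaker(int windowSize, double failureRateThreshold, long openWaitMs) {
        this.windowSize = windowSize;
        this.failureRateThreshold = failureRateThreshold;
        this.openWaitMs = openWaitMs;
    }

    synchronized State state() {
        return state;
    }

    <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMs) {
        if (!tryAcquire(nowMs)) {
            return fallback.get(); // open: fail fast without calling the real API
        }
        try {
            T result = action.get();
            record(false, nowMs);
            return result;
        } catch (RuntimeException e) {
            record(true, nowMs);
            return fallback.get();
        }
    }

    private synchronized boolean tryAcquire(long nowMs) {
        if (state == State.OPEN) {
            if (nowMs - openedAt < openWaitMs) {
                return false;
            }
            state = State.HALF_OPEN; // wait elapsed: let a trial call through
        }
        return true;
    }

    private synchronized void record(boolean failed, long nowMs) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                open(nowMs);          // trial failed: back to open
            } else {
                state = State.CLOSED; // trial succeeded: resume normal traffic
                window.clear();
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > windowSize) {
            window.removeFirst();
        }
        if (window.size() == windowSize) {
            long failures = window.stream().filter(f -> f).count();
            if ((double) failures / windowSize >= failureRateThreshold) {
                open(nowMs);
            }
        }
    }

    private void open(long nowMs) {
        state = State.OPEN;
        openedAt = nowMs;
        window.clear();
    }
}
```

A production-grade breaker adds several things this sketch omits, such as time-based windows, slow-call detection, and multiple permitted half-open calls, which is why a library is the usual choice.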

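Resilience4j is a common choice for this in the Spring ecosystem. Add the dependency (the annotations used here rely on Spring AOP, so spring-boot-starter-aop also needs to be on the classpath):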

    <dependency>
        <groupId>io.github.resilience4j</groupId>
        <artifactId>resilience4j-spring-boot3</artifactId>
        <version>2.3.0</version>
    </dependency>

Configure the circuit breaker and retry:

    resilience4j:
      circuitbreaker:
        instances:
          aiService:
            sliding-window-size: 10                            # evaluate the last 10 calls
            failure-rate-threshold: 50                         # open the circuit when the failure rate reaches 50%
            wait-duration-in-open-state: 30s                   # stay open for 30s before moving to half-open
            permitted-number-of-calls-in-half-open-state: 3    # allow 3 trial calls in half-open
      retry:
        instances:
          aiService:
            max-attempts: 3
            wait-duration: 1s
            retry-exceptions:
              - org.springframework.web.client.HttpServerErrorException

Then annotate the service method:

    package com.jichi.springaialibaba.service;

    import io.github.resilience4j.circuitbreaker.annotation.CircuitBreaker;
    import io.github.resilience4j.retry.annotation.Retry;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.beans.factory.annotation.Qualifier;
    import org.springframework.stereotype.Service;

    @Service
    public class ResilientChatService {

        private static final Logger log = LoggerFactory.getLogger(ResilientChatService.class);

        private final ChatClient primaryChatClient;
        private final ChatClient backupChatClient;

        public ResilientChatService(
                @Qualifier("primaryChatClient") ChatClient primaryChatClient,
                @Qualifier("backupChatClient") ChatClient backupChatClient) {
            this.primaryChatClient = primaryChatClient;
            this.backupChatClient = backupChatClient;
        }

        /**
         * Retry first; every attempt passes through the circuit breaker, and the fallback
         * runs once retries are exhausted or the open breaker rejects the call.
         * With Resilience4j's default aspect order, Retry wraps CircuitBreaker, so the
         * fallback is declared on the Retry annotation. Declared on the CircuitBreaker
         * annotation instead, it would handle the exception before Retry ever saw it,
         * and no retry would happen.
         */
        @Retry(name = "aiService", fallbackMethod = "fallbackChat")
        @CircuitBreaker(name = "aiService")
        public String chat(String message) {
            return primaryChatClient.prompt()
                    .user(message)
                    .call()
                    .content();
        }

        /**
         * Fallback: switch to the backup model when the primary model is unavailable.
         * The signature must match the original method, plus a trailing Throwable parameter.
         */
        public String fallbackChat(String message, Throwable throwable) {
            log.warn("Primary model unavailable ({}), switching to backup model", throwable.getMessage());
            try {
                return backupChatClient.prompt()
                        .user(message)
                        .call()
                        .content();
            } catch (Exception e) {
                log.error("Backup model is unavailable too", e);
                return "AI service temporarily unavailable, please try again later. For urgent requests, please contact support.";
            }
        }
    }

The corresponding Controller:

    package com.jichi.springaialibaba.controller;

    import com.jichi.springaialibaba.service.ResilientChatService;
    import org.springframework.web.bind.annotation.*;

    @RestController
    @RequestMapping("/api/resilient")
    public class ResilientChatController {

        private final ResilientChatService resilientChatService;

        public ResilientChatController(ResilientChatService resilientChatService) {
            this.resilientChatService = resilientChatService;
        }

        @GetMapping
        public String chat(@RequestParam String message) {
            return resilientChatService.chat(message);
        }
    }

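5. Global exception handling: keeping Controllers clean

The Controller in section 2 carried a stack of try-catch blocks, which is repetitive and ugly. With @RestControllerAdvice you can intercept AI-related exceptions in one place and return a uniform error structure.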

    package com.jichi.springaialibaba.exception;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.http.HttpStatus;
    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.ExceptionHandler;
    import org.springframework.web.bind.annotation.RestControllerAdvice;
    import org.springframework.web.client.HttpClientErrorException;
    import org.springframework.web.client.HttpServerErrorException;

    import java.util.concurrent.TimeoutException;

    @RestControllerAdvice
    public class AiExceptionHandler {

        private static final Logger log = LoggerFactory.getLogger(AiExceptionHandler.class);

        // Uniform error body returned to the client
        record ErrorResponse(String code, String message) {}

        @ExceptionHandler(HttpClientErrorException.Unauthorized.class)
        public ResponseEntity<ErrorResponse> handleUnauthorized(HttpClientErrorException.Unauthorized e) {
            log.error("Invalid API key", e);
            return ResponseEntity.status(HttpStatus.UNAUTHORIZED)
                    .body(new ErrorResponse("UNAUTHORIZED", "API key is invalid or has expired"));
        }

        @ExceptionHandler(HttpClientErrorException.Forbidden.class)
        public ResponseEntity<ErrorResponse> handleForbidden(HttpClientErrorException.Forbidden e) {
            log.error("Insufficient permissions", e);
            return ResponseEntity.status(HttpStatus.FORBIDDEN)
                    .body(new ErrorResponse("FORBIDDEN", "Not authorized to access this model"));
        }

        @ExceptionHandler(HttpClientErrorException.TooManyRequests.class)
        public ResponseEntity<ErrorResponse> handleRateLimit(HttpClientErrorException.TooManyRequests e) {
            return ResponseEntity.status(HttpStatus.TOO_MANY_REQUESTS)
                    .body(new ErrorResponse("RATE_LIMIT", "Too many requests, please try again later"));
        }

        @ExceptionHandler(HttpServerErrorException.class)
        public ResponseEntity<ErrorResponse> handleServerError(HttpServerErrorException e) {
            log.error("AI server-side error", e);
            return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                    .body(new ErrorResponse("AI_SERVICE_ERROR", "AI service temporarily unavailable"));
        }

        // Note: with a blocking RestClient/RestTemplate, timeouts usually surface as
        // ResourceAccessException rather than TimeoutException; handle whichever type
        // your client actually throws.
        @ExceptionHandler(TimeoutException.class)
        public ResponseEntity<ErrorResponse> handleTimeout(TimeoutException e) {
            return ResponseEntity.status(HttpStatus.GATEWAY_TIMEOUT)
                    .body(new ErrorResponse("TIMEOUT", "AI response timed out, please retry"));
        }
    }

With this in place, Controllers can focus on business logic while exceptions are handled in one place, and the code ends up much cleaner.