葫芦的运维日志: AWS API Gateway Multi-Product Gateway in Practice

✸ ✸ ✸

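AWS API Gateway Multi-Product Gateway Infrastructure in Practice

When a company runs several product lines, each one needs its own API Gateway, yet the authentication logic, routing strategy, and security controls are nearly identical across them. How do you manage that? Copy-pasting Terraform code means every fix must be repeated in every copy, and the maintenance cost climbs with each product line added.

This article is based on a real production project. It walks through how to combine Terraform modules, Python code generation, and a Go Lambda Authorizer into a reusable multi-product API Gateway infrastructure.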

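1. Architecture Overview

Start with the big picture and how the components relate to each other.

Core design ideas:

Requests pass WAF rate limiting and Resource Policy filtering before reaching each product line's API Gateway
Each product line (Product-A, Product-B) has its own API Gateway instance, isolated from the others
The Go Lambda Authorizer calls the Token Gateway (System Gateway) through a VPC Endpoint to validate tokens
After authentication, requests travel VPC Link -> NLB -> Nginx Ingress Controller to the Services and Pods in the EKS cluster
Terraform modules (gate / system) are shared, and a Python script generates the OpenAPI body.json that drives the route definitions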

2. Project Directory Layout

Here is the overall directory tree, which shows how the code is organized:

gateway-infrastructure/
|-- module/                          # Shared Terraform modules
|   |-- gate/                        # Main gateway module (external)
|   |   |-- api.tf                   # API Gateway + Stage + Deployment
|   |   |-- authorizer.tf            # Go Lambda Authorizer
|   |   |-- outputs.tf
|   |   |-- variables.tf
|   |   +-- src/tokenValidator/      # Go authorizer source
|   |       |-- main.go
|   |       |-- go.mod
|   |       +-- tokenValidator.zip   # Compiled deployment package
|   +-- system/                      # System gateway module (internal)
|       |-- api.tf
|       |-- authorizer.tf            # Python Lambda Authorizer
|       +-- lambda/lambda.zip
|-- product-a/                       # Product A, consumes the modules
|   |-- main.tf                      # Provider + S3 Backend
|   |-- api.tf                       # References module/gate
|   |-- variables.tf
|   |-- environment/
|   |   |-- dev.tfvars.json          # Route configuration for dev
|   |   |-- staging.tfvars.json
|   |   |-- preprod.tfvars.json
|   |   +-- prod.tfvars.json
|   |-- gateway_api/
|   |   |-- body.json                # Generated OpenAPI definition
|   |   +-- policy.json              # Resource policy
|   +-- gateway_system/
|       |-- body.json
|       +-- policy.json
|-- product-b/                       # Product B (same structure)
|-- generate_path.py                 # Route code generator
|-- generate_templates.py            # OpenAPI template definitions
+-- deploy.sh                        # One-step deployment script

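What makes this layout work: the module layer (module/) defines how to build, and the product layer (product-a/) defines what to build. Adding a product line means copying a product directory and editing its environment configuration (plus the product-specific backend key in main.tf); the shared modules need no changes.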

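3. Terraform Module Design

Modules are the skeleton of this project. Two core modules carry different responsibilities:

3.1 Main gateway module (module/gate)

This is the externally exposed API Gateway. It is driven by an OpenAPI body and uses the Go Lambda Authorizer:

# module/gate/api.tf - core resource definitions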
resource "aws_api_gateway_rest_api" "main_gateway" {
  name = "${var.app_group}-gateway-${terraform.workspace}"

  # Key point: the Resource Policy is the first layer of protection
  policy = templatefile("${path.cwd}/gateway_api/policy.json", {
    account_id = data.aws_caller_identity.current.account_id
    env        = terraform.workspace
  })

  # Key point: define every route in the OpenAPI body instead of declaring resource/method one by one
  body = templatefile("${path.cwd}/gateway_api/body.json", {
    vpc_link_id            = var.vpc_link_id,
    env                    = terraform.workspace
    api_name               = "${var.app_group}-gateway-${terraform.workspace}",
    authorizer_uri         = aws_lambda_function.authorizer.invoke_arn
    authorizer_credentials = aws_iam_role.authorizer_execution_role.arn
    authorizer_result_ttl  = var.authorizer_result_ttl
    authorizer_name        = "${var.app_group}-api-authorizer-${terraform.workspace}"
    # Authorizer variant without caching (for paths that need real-time validation)
    non_ttl_authorizer_uri         = aws_lambda_function.non_ttl_authorizer.invoke_arn
    non_ttl_authorizer_credentials = aws_iam_role.authorizer_execution_role.arn
    non_ttl_authorizer_name        = "${var.app_group}-api-non-ttl-authorizer-${terraform.workspace}"
  })

  binary_media_types = ["multipart/form-data"]
  endpoint_configuration {
    types = ["REGIONAL"]
  }
}

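Note the use of the body argument rather than individual aws_api_gateway_resource + aws_api_gateway_method declarations. Once there are dozens or even hundreds of routes, declaring them one by one bloats the Terraform code until it is unmaintainable. With an OpenAPI body, the route definitions live in a single JSON file that a Python script generates.

Deployment and Stage configuration:

# Deployment triggers - a change to any related resource forces a redeployment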
resource "aws_api_gateway_deployment" "main_gateway" {
  rest_api_id = aws_api_gateway_rest_api.main_gateway.id
  triggers = {
    redeployment = sha1(jsonencode(concat([
      aws_api_gateway_rest_api.main_gateway.id,
      aws_api_gateway_rest_api.main_gateway.body,
      aws_api_gateway_rest_api.main_gateway.policy,
      aws_api_gateway_rest_api.main_gateway.binary_media_types
    ], var.dependent_resources)))
  }
  lifecycle {
    create_before_destroy = true
  }
}

# Stage configuration - logging, throttling, X-Ray tracing
resource "aws_api_gateway_stage" "main_gateway_stage" {
  depends_on    = [aws_cloudwatch_log_group.execution_logs]
  deployment_id = aws_api_gateway_deployment.main_gateway.id
  rest_api_id   = aws_api_gateway_rest_api.main_gateway.id
  stage_name    = var.gateway_stage_name

  access_log_settings {
    destination_arn = aws_cloudwatch_log_group.execution_logs.arn
    format          = jsonencode(var.log_format)
  }
  xray_tracing_enabled = true  # enable X-Ray distributed tracing
}

# Global method settings - throttling and monitoring
resource "aws_api_gateway_method_settings" "all" {
  rest_api_id = aws_api_gateway_rest_api.main_gateway.id
  stage_name  = aws_api_gateway_stage.main_gateway_stage.stage_name
  method_path = "*/*"
  settings {
    metrics_enabled        = true
    logging_level          = "INFO"
    data_trace_enabled     = true  # logs full request/response data; consider turning off in prod
    throttling_burst_limit = var.throttling_burst_limit
    throttling_rate_limit  = var.throttling_rate_limit
  }
}

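3.2 System gateway module (module/system)

This gateway handles internal service-to-service traffic. Its structure is similar but lighter:

Uses a Python Lambda Authorizer (the logic is simple and does not need Go's performance)
The Lambda does not need to run inside the VPC (it does not call the internal authentication service)
binary_media_types is not enabled
Shorter log retention (7 days, versus 30 days in production)

3.3 How the product layer consumes the modules

Each product directory is an independent Terraform root module:

# product-a/main.tf - Provider and Backend configuration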
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "> 5.0"
    }
  }
  backend "s3" {
    bucket  = "my-gateway-tf-state"
    key     = "product-a/terraform.tfstate"  # each product has its own state
    region  = "cn-north-1"
    encrypt = true
  }
}

# product-a/api.tf - references the shared modules
module "api-gateway" {
  source                 = "../module/gate"
  app_group              = var.app_group
  gateway_stage_name     = var.gateway_stage_name
  vpc_link_id            = var.vpc_link_id
  vpc_id                 = var.vpc_id
  private_subnet         = var.private_subnet
  throttling_rate_limit  = var.throttling_rate_limit
  throttling_burst_limit = var.throttling_burst_limit
  log_format             = var.log_format
  retention_in_days      = var.retention_in_days
  authorizer_result_ttl  = var.authorizer_result_ttl
}

module "system-gateway" {
  source             = "../module/system"
  app_group          = var.app_group
  gateway_stage_name = var.gateway_stage_name
  vpc_link_id        = var.vpc_link_id
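  # ... other arguments
}

Each product's state lives under a different S3 key, so the states are fully isolated. A change to Product A does not touch Product B's infrastructure.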

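4. OpenAPI Body Code Generation

This is the most distinctive design choice in the project. Routes are not hand-written; a Python script generates them from the environment configuration.

4.1 Four route types

The system defines four route types that cover every business scenario:

Route type	Authentication	TTL cache	Integration	Typical use
`normal_path`	Lambda Authorizer	Yes (configurable, in seconds)	VPC Link -> EKS Ingress	Regular business APIs
`normal_non_ttl_path`	Lambda Authorizer	No (validated on every request)	VPC Link -> EKS Ingress	State-changing endpoints (e.g. account merge)
`white_path`	None	None	VPC Link -> EKS Ingress	OAuth callbacks, public endpoints
`through_path`	None	None	Direct HTTP proxy	Pass-through to external services

Why a non_ttl variant? API Gateway caches authorizer results for the configured TTL, so the same token does not trigger another Lambda invocation while its entry is cached. For most endpoints that is a benefit (lower cost, lower latency). In some cases, though, the user's state can change within the cache window (for example, a token's permissions change after an account merge), and those endpoints need validation on every request.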
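The stale-result problem can be reproduced with a toy model. The sketch below is not API Gateway's implementation; it assumes a cache keyed by token, and `AuthorizerCache`, `validate`, and the injected `clock` are hypothetical names used only for illustration.

```python
import time


class AuthorizerCache:
    """Toy model of an authorizer result cache keyed by token (an assumption)."""

    def __init__(self, ttl_seconds, validate, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.validate = validate   # stands in for the Lambda invocation
        self.clock = clock         # injectable so the demo needs no sleeping
        self.entries = {}          # token -> (expires_at, result)
        self.lambda_calls = 0

    def authorize(self, token):
        now = self.clock()
        hit = self.entries.get(token)
        if self.ttl > 0 and hit is not None and hit[0] > now:
            return hit[1]          # served from cache, "Lambda" not invoked
        self.lambda_calls += 1
        result = self.validate(token)
        if self.ttl > 0:
            self.entries[token] = (now + self.ttl, result)
        return result


# Demo: the user's scope changes 5 seconds after the first request.
scopes = {"tok": "customer"}
fake_now = [0.0]
cached = AuthorizerCache(10, lambda t: scopes[t], clock=lambda: fake_now[0])
live = AuthorizerCache(0, lambda t: scopes[t], clock=lambda: fake_now[0])

first_cached = cached.authorize("tok")
first_live = live.authorize("tok")

scopes["tok"] = "merged"           # e.g. an account merge
fake_now[0] = 5.0
stale = cached.authorize("tok")    # still the old answer
fresh = live.authorize("tok")      # TTL 0 always re-validates

fake_now[0] = 11.0
after_expiry = cached.authorize("tok")
```

With a 10-second TTL the cached variant keeps returning the pre-merge result until the entry expires, which is exactly the window that normal_non_ttl_path routes are meant to close.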

[Figure: traffic flow of the four route types]

4.2 Environment configuration files

Each environment's routes are defined in a JSON configuration file:

{
  "app_group": "product-a",
  "gateway_stage_name": "dev",
  "vpc_link_id": "xxxxxx",
  "vpc_id": "vpc-xxxxxxxxx",
  "private_subnet": ["subnet-aaa", "subnet-bbb"],
  "throttling_rate_limit": 10000,
  "throttling_burst_limit": 10000,
  "authorizer_result_ttl": 10,
  "gateway_api": {
    "normal_path": [
      {"service_path": "/order-service/{proxy+}"},
      {"service_path": "/user-bff-service/{proxy+}"},
      {"service_path": "/payment-service/{proxy+}"}
    ],
    "normal_non_ttl_path": [
      {"service_path": "/user-bff-service/api/v1/account/status"},
      {"service_path": "/user-bff-service/api/v1/account/merge"}
    ],
    "white_path": [
      {"service_path": "/auth-service/api/oauth/callback/{proxy+}"},
      {"service_path": "/auth-service/api/v1/oauth/{proxy+}"},
      {"service_path": "/order-service/api/v1/public/{proxy+}"}
    ],
    "through_path": [
      {
        "pass_through_source_path": "/user-bff-service/api/v1/agreements",
        "pass_through_target_path": "https://internal-api.example.com/platform/api/v1/agreements"
      }
    ]
  }
}

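The benefit of this design: to add a route, a developer adds one line to the JSON file for the relevant environment and never touches Terraform code. Route management becomes configuration management.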
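Because routes are plain configuration, they can be checked before generation. The following is a minimal sketch of such a check, not part of the original project; `check_routes` and `ROUTE_TYPES` are hypothetical names. It flags a path listed under two route types, which matters because the generator merges route types in order, so a path that is also listed under white_path would end up unauthenticated.

```python
ROUTE_TYPES = ("normal_path", "normal_non_ttl_path", "white_path")


def check_routes(gateway_cfg):
    """Return a list of human-readable problems found in one gateway's routes."""
    problems = []
    seen = {}  # path -> route type that first claimed it
    for route_type in ROUTE_TYPES:
        for entry in gateway_cfg.get(route_type, []):
            path = entry.get("service_path", "")
            if not path.startswith("/"):
                problems.append(f"{route_type}: {path!r} must start with '/'")
            if path in seen:
                problems.append(
                    f"{path} appears in both {seen[path]} and {route_type}")
            seen.setdefault(path, route_type)
    for entry in gateway_cfg.get("through_path", []):
        source = entry.get("pass_through_source_path", "")
        target = entry.get("pass_through_target_path", "")
        if source in seen:
            problems.append(
                f"{source} appears in both {seen[source]} and through_path")
        if not target.startswith(("http://", "https://")):
            problems.append(f"through_path: {target!r} is not an absolute URL")
    return problems


good = {
    "normal_path": [{"service_path": "/order-service/{proxy+}"}],
    "white_path": [{"service_path": "/auth-service/api/v1/oauth/{proxy+}"}],
    "through_path": [{
        "pass_through_source_path": "/user-bff-service/api/v1/agreements",
        "pass_through_target_path":
            "https://internal-api.example.com/platform/api/v1/agreements",
    }],
}
bad = {
    "normal_path": [{"service_path": "/order-service/{proxy+}"}],
    "white_path": [{"service_path": "/order-service/{proxy+}"}],
}
good_problems = check_routes(good)
bad_problems = check_routes(bad)
```

A check like this could run as the first step of the deployment script, passing `config["gateway_api"]` from the environment file.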

4.3 Code generator implementation

The Python script reads the environment configuration, applies the OpenAPI templates, and writes the complete body.json:

# generate_path.py - route code generator
import json
import sys

from generate_templates import (
    TOKEN_TEMPLATE, NO_TOKEN_TEMPLATE,
    PASS_THROUGH_TEMPLATE, NON_TTL_TOKEN_TEMPLATE,
    GATEWAY_RESPONSES
)

def generate_ingress_path(service_path, env):
    """Build the EKS Ingress address for a service path."""
    path_parts = service_path.split('/')
    if len(path_parts) <= 2:
        return 'k8s-internal.example.com/{proxy}'
    # Rule: the service name becomes the subdomain prefix
    return (f"{path_parts[1]}-{env}.k8s-internal.example.com/"
            f"{'/'.join(path_parts[2:]).replace('{proxy+}', '{proxy}')}")

def process_paths(paths, template, body_data, env):
    """Apply the template to each path and merge it into the OpenAPI paths."""
    for path_info in paths:
        service_path = path_info['service_path']
        ingress_path = generate_ingress_path(service_path, env)
        new_path = json.loads(
            json.dumps(template)
            .replace('service_path', service_path)
            .replace('ingress_path', ingress_path)
        )
        body_data['paths'].update(new_path)

def main(env, app):
    with open(f'{app}/environment/{env}.tfvars.json') as f:
        config = json.load(f)

    for gate in ['gateway_api', 'gateway_system']:
        # Tolerate a gateway or route type that is absent from the config
        routes = config.get(gate, {})
        with open(f'{app}/{gate}/body.json') as f:
            body = json.load(f)
        body['paths'] = {}  # drop the old routes and regenerate

        # Handle the route types one by one
        process_paths(routes.get('normal_path', []),
                      TOKEN_TEMPLATE, body, env)
        process_paths(routes.get('normal_non_ttl_path', []),
                      NON_TTL_TOKEN_TEMPLATE, body, env)
        process_paths(routes.get('white_path', []),
                      NO_TOKEN_TEMPLATE, body, env)

        # through_path is special (source path -> target path mapping)
        for tp in routes.get('through_path', []):
            new_path = json.loads(
                json.dumps(PASS_THROUGH_TEMPLATE)
                .replace('pass_through_source_path',
                         tp['pass_through_source_path'])
                .replace('pass_through_target_path',
                         tp['pass_through_target_path'])
            )
            body['paths'].update(new_path)

        # Add the shared Gateway Responses (custom 401/403 responses)
        body['x-amazon-apigateway-gateway-responses'] = GATEWAY_RESPONSES

        with open(f'{app}/{gate}/body.json', 'w') as f:
            json.dump(body, f, indent=4)

if __name__ == '__main__':
    # Invoked by deploy.sh as: python generate_path.py <environment> <product-dir>
    main(sys.argv[1], sys.argv[2])

4.4 OpenAPI template design

The templates define the OpenAPI document plus API Gateway extensions for each route type:

# generate_templates.py - template for routes with authentication + TTL cache
TOKEN_TEMPLATE = {
    "service_path": {
        # OPTIONS method - used for CORS preflight, skips authentication
        "options": {
            "parameters": [{"name": "proxy", "in": "path",
                           "required": True, "type": "string"}],
            "x-amazon-apigateway-integration": {
                "connectionId": "${vpc_link_id}",
                "httpMethod": "ANY",
                "uri": "http://ingress_path",
                "connectionType": "VPC_LINK",
                "type": "http_proxy"
            }
        },
        # All other methods - go through the Lambda Authorizer
        "x-amazon-apigateway-any-method": {
            "security": [{"${authorizer_name}": []}],
            "x-amazon-apigateway-integration": {
                "connectionId": "${vpc_link_id}",
                "httpMethod": "ANY",
                "uri": "http://ingress_path",
                "requestParameters": {
                    # Pass the authentication context through to the backend service
                    "integration.request.header.x-user-token":
                        "context.authorizer.x-user-token",
                    "integration.request.header.x-user-id":
                        "context.authorizer.x-user-id",
                    "integration.request.header.x-auth-type":
                        "context.authorizer.x-auth-type",
                    "integration.request.header.x-request-id":
                        "context.requestId"
                },
                "connectionType": "VPC_LINK",
                "timeoutInMillis": 29000,
                "type": "http_proxy"
            }
        }
    }
}

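The placeholders in the template (service_path, ingress_path, ${vpc_link_id}) are replaced in two stages:

Python generation stage: replaces service_path and ingress_path
Terraform templatefile stage: replaces ${vpc_link_id}, ${authorizer_name}, and so on

The two-stage substitution keeps the concerns apart: Python only deals with route mapping, and Terraform only deals with infrastructure references.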
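The two stages can be traced end to end in a few lines. This is a sketch under assumptions: `string.Template` stands in for Terraform's templatefile() only because both use the `${name}` syntax, the template is a trimmed copy of the one above, and the VPC Link id and authorizer name are made-up values.

```python
import json
from string import Template

# Trimmed-down version of TOKEN_TEMPLATE, enough to show both placeholder kinds.
TEMPLATE = {
    "service_path": {
        "x-amazon-apigateway-any-method": {
            "security": [{"${authorizer_name}": []}],
            "x-amazon-apigateway-integration": {
                "connectionId": "${vpc_link_id}",
                "uri": "http://ingress_path",
                "type": "http_proxy",
            },
        }
    }
}


def stage_one(template, service_path, ingress_path):
    """Python stage: fill in route placeholders, leave ${...} untouched."""
    return (json.dumps(template)
            .replace("service_path", service_path)
            .replace("ingress_path", ingress_path))


def stage_two(text, **variables):
    """Stand-in for Terraform templatefile(): fill in the ${...} placeholders."""
    return json.loads(Template(text).substitute(variables))


after_python = stage_one(
    TEMPLATE,
    "/order-service/{proxy+}",
    "order-service-dev.k8s-internal.example.com/{proxy}",
)
final_body = stage_two(
    after_python,
    vpc_link_id="abc123",                           # hypothetical value
    authorizer_name="product-a-api-authorizer-dev"  # hypothetical value
)
```

After the first stage the text still contains `${vpc_link_id}`, which is what gets written to body.json; only the second stage, at apply time, turns it into a concrete id.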

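5. Go Lambda Authorizer with Multi-Token Authentication

This is the most complex component in the system. A single Lambda function handles several token types and picks an authentication strategy from the request headers.

5.1 Authentication flow

The Authorizer reads an identifier field in the request headers to decide which authentication scheme the request uses, then calls the corresponding internal authentication service to validate the token.

5.2 Core code walkthrough

The Lambda Authorizer is implemented in Go and selects the token type from the request headers:

// main.go - Lambda Authorizer entry point
func handler(ctx context.Context, event CustomAuthorizerRequest) (AuthResponse, error) {
    // Lowercase all header names to avoid case mismatches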
    headers := make(map[string]string)
    for k, v := range event.Headers {
        headers[strings.ToLower(k)] = v
    }

    // Extract the Bearer token
    token, err := extractToken(headers["authorization"])
    if err != nil {
        return AuthResponse{}, errors.New("Unauthorized")
    }

    var response AuthResponse

    // Pick the authentication strategy from the channel identifier in the headers
    switch headers["x-auth-channel"] {
    case "wechat-mini":
        response, err = handleWeChatToken(event, token)
    case "mobile-app":
        response, err = handleMobileToken(event, token)
    case "third-party":
        response, err = handleThirdPartyToken(event, token)
    default:
        // Fall back to standard token validation
        response, err = handleDefaultToken(event, token)
    }

    if err != nil {
        return AuthResponse{}, errors.New("Unauthorized")
    }
    return response, nil
}

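Each token type is validated by calling an internal authentication service API over the VPC's private network:

// Environment routing table - the Lambda picks the auth service address from the ENV environment variable
// Each environment maps to a different internal API Gateway endpoint (reached through a VPC Endpoint)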
var authServiceURI = map[string]string{
    "dev":     "xxxxxxx.execute-api.region.amazonaws.com/dev",
    "staging": "xxxxxxx.execute-api.region.amazonaws.com/staging",
    "preprod": "xxxxxxx.execute-api.region.amazonaws.com/preprod",
    "prod":    "xxxxxxx.execute-api.region.amazonaws.com/prod",
}

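// Example: WeChat mini-program token validation
func handleWeChatToken(event CustomAuthorizerRequest, token string) (AuthResponse, error) {
    authUrl := fmt.Sprintf("https://%s%s?token=%s",
        authServiceURI[os.Getenv("ENV")],
        "/auth-service/api/v1/validate-token",
        url.QueryEscape(token))

    result, err := httpGet(authUrl)
    if err != nil {
        return AuthResponse{}, errors.New("Unauthorized")
    }
    // Use comma-ok assertions so a malformed response is rejected
    // instead of panicking inside the authorizer
    if code, ok := result["code"].(string); !ok || code != "0" {
        return AuthResponse{}, errors.New("Unauthorized")
    }
    data, ok := result["data"].(map[string]interface{})
    if !ok {
        return AuthResponse{}, errors.New("Unauthorized")
    }

    // Extract the internal JWT and build the authentication context
    jwt, ok := data["jwt"].(string)
    if !ok {
        return AuthResponse{}, errors.New("Unauthorized")
    }
    auth := map[string]string{
        "x-user-token": token,
        "x-auth-type":  "wechat",
        "x-user-scope": "customer",
        "x-user-id":    jwt,
    }
    return generatePolicy(auth, "Allow", event.MethodArn), nil
}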

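Fields in the authentication context:

x-user-token: the original token, passed through to the backend (which may need it to call other services)
x-auth-type: the authentication type, which the backend uses to tell where the user came from
x-user-scope: the user's permission scope (e.g. customer / admin / internal)
x-user-id: the internal user identifier returned by the authentication service (a JWT or a user ID)

5.3 IAM Policy generation

Once authentication succeeds, the Lambda returns an IAM Policy together with the context. API Gateway passes the context through to the backend service: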

func generatePolicy(auth map[string]string, effect, methodArn string) AuthResponse {
    authResponse := AuthResponse{}
    // Authentication context - gets injected into integration.request.header
    if auth != nil {
        authResponse.Context = auth
    }
    // IAM Policy - controls whether the API call is allowed
    if effect != "" && methodArn != "" {
        authResponse.PolicyDocument = &PolicyDocument{
            Version: "2012-10-17",
            Statement: []PolicyStatement{{
                Sid:      "FirstStatement",
                Action:   "execute-api:Invoke",
                Effect:   effect,
                Resource: methodArn,
            }},
        }
    }
    return authResponse
}

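One design point matters here: the fields in the authentication context (Context) are mapped to HTTP headers by the requestParameters in the OpenAPI template, so the backend service reads the user's identity straight from the headers and does not have to parse the token again.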
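To make the mapping concrete, the sketch below applies the requestParameters from the template excerpt in section 4.4 to the context built by the WeChat handler. It is a simplified model of what API Gateway does, `backend_headers` is a hypothetical helper, and the token, JWT, and request id values are placeholders.

```python
REQUEST_PARAMETERS = {
    "integration.request.header.x-user-token": "context.authorizer.x-user-token",
    "integration.request.header.x-user-id": "context.authorizer.x-user-id",
    "integration.request.header.x-auth-type": "context.authorizer.x-auth-type",
    "integration.request.header.x-request-id": "context.requestId",
}

HEADER_PREFIX = "integration.request.header."
AUTHORIZER_PREFIX = "context.authorizer."


def backend_headers(request_parameters, authorizer_context, request_id):
    """Resolve each mapping entry into the header the backend would receive."""
    headers = {}
    for target, source in request_parameters.items():
        if not target.startswith(HEADER_PREFIX):
            continue  # only header mappings are modelled here
        name = target[len(HEADER_PREFIX):]
        if source.startswith(AUTHORIZER_PREFIX):
            value = authorizer_context.get(source[len(AUTHORIZER_PREFIX):])
        elif source == "context.requestId":
            value = request_id
        else:
            value = None
        if value is not None:
            headers[name] = value
    return headers


context = {
    "x-user-token": "original-token",  # placeholder values
    "x-auth-type": "wechat",
    "x-user-scope": "customer",
    "x-user-id": "internal-jwt",
}
headers = backend_headers(REQUEST_PARAMETERS, context, "req-123")
```

One thing the model makes visible: a context field reaches the backend only if the template maps it. In the excerpt shown, x-user-scope is set by the authorizer but has no requestParameters entry, so a backend that needs it would require one more mapping line.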

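5.4 Why Go rather than Python/Node.js?

Fast cold starts: Go compiles to a native binary, with Lambda cold starts of roughly 100 ms, compared with a typical 300-800 ms for Python/Node.js (actual figures vary with package size and memory setting)
Runtime performance: every request calls the internal authentication service, and Go's HTTP client performs better
Low memory footprint: 128 MB is enough, whereas Python may need 256 MB
Small deployment package: the compiled zip is only a few MB and loads faster

Note that the runtime is provided.al2 (the Amazon Linux 2 custom runtime) rather than the deprecated go1.x. The compiled Go binary is named bootstrap, which is the executable Lambda runs for a custom runtime.

5.5 Dual Authorizer design

Terraform defines two Lambda functions with identical code:

# Authorizer with TTL caching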
resource "aws_lambda_function" "authorizer" {
  function_name = "${var.app_group}-api-authorizer-${terraform.workspace}"
  runtime       = "provided.al2"
  handler       = "main.handler"
  filename      = "${path.module}/src/tokenValidator/tokenValidator.zip"
  memory_size   = 128
  timeout       = 15
  vpc_config {
    subnet_ids         = var.private_subnet
    security_group_ids = [aws_security_group.lambda_sg.id]
  }
}

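# Authorizer without TTL caching (same code, different API Gateway configuration)
resource "aws_lambda_function" "non_ttl_authorizer" {
  function_name = "${var.app_group}-api-non-ttl-authorizer-${terraform.workspace}"
  # ... otherwise identical configuration
}

The two functions run the same code; the difference is the TTL configured on the API Gateway side. The Authorizer with a TTL caches authentication results (for 10 seconds, say), while the one without a TTL invokes the Lambda on every request. Different routes in the OpenAPI body can then reference different Authorizers.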

6. Resource Policy Protection

The Resource Policy is API Gateway's first line of defense: its explicit denies take effect before a request ever reaches the Lambda Authorizer.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "allow-all-traffic",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "execute-api:Invoke",
      "Resource": "arn:aws-cn:execute-api:cn-north-1:${account_id}:*/**"
    },
    {
      "Sid": "deny-list",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "execute-api:Invoke",
      "Resource": [
        "arn:...:*/*/*/actuator",
        "arn:...:*/*/*/actuator/**",
        "arn:...:*/*/*/swagger-ui.html",
        "arn:...:*/*/*/swagger-resources/**",
        "arn:...:*/*/*/api-docs",
        "arn:...:*/*/*/internal/**",
        "arn:...:*/*/service-name/**/admin/**"
      ]
    }
  ]
}

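Deny takes precedence over Allow, so even if someone bypasses the frontend routing and requests /actuator (Spring Boot management endpoints), /swagger-ui.html (API documentation), or /internal/** (internal endpoints) directly, the request is rejected outright with a 403.

The reasoning behind this deny-list:

Management endpoints exposed by frameworks (actuator, swagger): must never be reachable from outside in production
Endpoints for internal service-to-service calls (internal/): should only be reached through the System Gateway
Admin console endpoints (admin/): should go through a separate admin gateway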
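The "explicit deny wins" rule can be illustrated with a small evaluator. This is a simplified model rather than AWS's policy engine: it looks only at Effect and Resource, uses `fnmatchcase` as a stand-in for IAM wildcard matching (in both, `*` also matches `/`), and the account id, API id, and ARNs are made-up examples.

```python
from fnmatch import fnmatchcase

# Hypothetical ARN prefix; the real policy uses the account's own values.
PREFIX = "arn:aws-cn:execute-api:cn-north-1:123456789012:"

STATEMENTS = [
    {"Effect": "Allow", "Resource": PREFIX + "*/**"},
    {"Effect": "Deny", "Resource": [
        PREFIX + "*/*/*/actuator",
        PREFIX + "*/*/*/actuator/**",
        PREFIX + "*/*/*/internal/**",
    ]},
]


def evaluate(statements, resource_arn):
    """Return 'Deny' on any matching Deny, else 'Allow' if an Allow matched."""
    allowed = False
    for statement in statements:
        patterns = statement["Resource"]
        if isinstance(patterns, str):
            patterns = [patterns]
        if any(fnmatchcase(resource_arn, p) for p in patterns):
            if statement["Effect"] == "Deny":
                return "Deny"          # explicit deny always wins
            allowed = True
    return "Allow" if allowed else "ImplicitDeny"


orders = evaluate(
    STATEMENTS, PREFIX + "abc123/dev/GET/order-service/api/v1/orders")
actuator = evaluate(
    STATEMENTS, PREFIX + "abc123/dev/GET/order-service/actuator/health")
internal = evaluate(
    STATEMENTS, PREFIX + "abc123/dev/POST/user-bff-service/internal/sync")
```

The ordering of statements does not matter in this model: the actuator request matches the broad Allow as well, but the matching Deny decides the outcome.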

[Figure: the three layers of security protection]

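7. VPC Link Private Integration

API Gateway connects through a VPC Link to the NLB (Network Load Balancer) in front of the EKS cluster, which gives a private network integration. The request path is: API Gateway -> VPC Link -> NLB -> Nginx Ingress Controller -> Service -> Pod.

Route mapping rule: an API Gateway path is translated into an EKS Ingress address. For example:

/order-service/{proxy+} -> http://order-service-dev.k8s-internal.example.com/{proxy}
/auth-service/api/v1/oauth/{proxy+} -> http://auth-service-dev.k8s-internal.example.com/api/v1/oauth/{proxy}

Note the conversion from {proxy+} (API Gateway's greedy match) to {proxy} (a plain path parameter), which the generate_ingress_path function in generate_path.py performs.

The Lambda Authorizer is also deployed inside the VPC (via vpc_config), so it can call the authentication service over the private network without going through the public internet. VPC attachment used to lengthen cold starts because an ENI had to be attached each time; with current Lambda VPC networking that setup happens when the function is created, and for a call as frequent as authentication the execution environments mostly stay warm anyway.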

8. Gateway Responses: Custom Error Responses

API Gateway's default error responses carry no CORS headers, so the browser reports a cross-origin error instead of the real error. Custom Gateway Responses fix this:

# Gateway Responses definition in generate_templates.py
# Key point: error responses must carry CORS headers too
CORS_HEADERS = {
    "gatewayresponse.header.Access-Control-Allow-Origin":
        "method.request.header.origin",
    "gatewayresponse.header.Access-Control-Allow-Methods":
        "'DELETE,GET,HEAD,OPTIONS,PATCH,POST,PUT'",
    "gatewayresponse.header.Access-Control-Allow-Headers": "'*'",
}

GATEWAY_RESPONSES = {
    "ACCESS_DENIED": {
        "statusCode": 403,
        "responseParameters": CORS_HEADERS,
        "responseTemplates": {
            "application/json": '{"message": "Gateway Forbidden"}'
        }
    },
    "UNAUTHORIZED": {
        "statusCode": 401,
        "responseParameters": CORS_HEADERS,  # same CORS headers as above
        "responseTemplates": {
            "application/json":
                '{"code":401,"data":null,"message":"Please login again"}'
        }
    },
    "MISSING_AUTHENTICATION_TOKEN": {
        "statusCode": 401,
        "responseParameters": CORS_HEADERS,  # same CORS headers as above
        "responseTemplates": {
            "application/json":
                '{"message":"Access token is required"}'
        }
    }
}

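How the three error scenarios differ:

ACCESS_DENIED: authorization failed, for example the Resource Policy rejected the request (a deny-list hit) or the Authorizer returned a Deny policy
UNAUTHORIZED: the Lambda Authorizer could not authenticate the caller, for example it returned Unauthorized or the request lacked the header configured as the identity source
MISSING_AUTHENTICATION_TOKEN: despite the name, typically raised when the request targets a path or method that is not defined on the API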

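9. Multi-Environment Deployment

The deployment script chains all the steps together:

#!/bin/bash
# deploy.sh - one-step deployment script
# Options are set here rather than on the shebang line so that they also
# apply when the script is run as `bash deploy.sh ...`
set -eu

environment=$1  # dev / staging / preprod / prod
appdir=$2       # product-a / product-b

# Step 1: generate the OpenAPI body.json from the environment configuration
python generate_path.py "$environment" "$appdir"

# Step 2: enter the product directory and initialize Terraform
cd "$appdir"
terraform init
terraform fmt

# Step 3: select the Terraform workspace, creating it if needed
if ! terraform workspace select "$environment"; then
  terraform workspace new "$environment"
fi

# Step 4: pick the AWS credentials for the environment
case "$environment" in
  dev|staging|preprod) aws_key="dev_admin" ;;
  prod)                aws_key="prod_admin" ;;
  *) echo "unknown environment: $environment" >&2; exit 1 ;;
esac

# Step 5: run terraform apply
terraform apply \
  -var-file="environment/$environment.tfvars.json" \
  -var-file="../${aws_key}.aws.key"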

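[Figure: the complete deployment pipeline]

Key design points of the deployment flow:

Terraform Workspaces isolate environments: one codebase, with each workspace holding the state of a different environment
Environment configuration and AWS credentials are kept apart: .tfvars.json holds routes and infrastructure parameters, .aws.key holds credentials
Code generation runs before Terraform, so body.json is always up to date
dev/staging/preprod share one set of AWS credentials and prod has its own, so a non-production deployment cannot touch production resources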
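Combining a per-product backend key with workspaces gives every product/environment pair its own state object. The sketch below computes where the S3 backend would store each one, assuming the backend's default workspace_key_prefix of "env:" (the configuration shown earlier does not override it); `state_object_key` is a hypothetical helper.

```python
from itertools import product


def state_object_key(key, workspace, workspace_key_prefix="env:"):
    """S3 object key used by the Terraform S3 backend for a given workspace."""
    if workspace == "default":
        return key  # the default workspace is stored at the configured key
    return f"{workspace_key_prefix}/{workspace}/{key}"


products = ["product-a", "product-b"]
environments = ["dev", "staging", "preprod", "prod"]

# One entry per (product, environment) pair
layout = {
    (p, env): state_object_key(f"{p}/terraform.tfstate", env)
    for p, env in product(products, environments)
}
staging_a = layout[("product-a", "staging")]
```

Eight pairs map to eight distinct object keys in the bucket, which is why a change in one product or environment never rewrites another's state.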

The full command to deploy an environment:

# Deploy the staging environment of Product-A
bash deploy.sh staging product-a

# Deploy the prod environment of Product-B
bash deploy.sh prod product-b

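10. End-to-End Request Path

Putting all the components together, the full lifecycle of a request is:

Client -> WAF (rate limiting) -> Resource Policy (deny-list) -> the product line's API Gateway -> Go Lambda Authorizer (validates the token against the Token Gateway through a VPC Endpoint) -> VPC Link -> NLB -> Nginx Ingress Controller -> Service -> Pod

11. Pitfalls and Best Practices

11.1 OpenAPI body vs declaring resources one by one

Once you have more than about 20 routes, the body-driven approach is strongly recommended. Declaring aws_api_gateway_resource + aws_api_gateway_method + aws_api_gateway_integration individually costs at least 3 resources per route, so 20 routes means 60+ Terraform resources, and the deployment's triggers have to reference the id of every one of them. The maintenance cost is very high.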

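11.2 Lambda Authorizer cold starts in a VPC

A Lambda deployed in a VPC used to attach an ENI (elastic network interface) on every cold start, which could take 5-10 seconds. Since AWS reworked Lambda VPC networking in 2019, the network interfaces are set up when the function is created or its VPC configuration changes, so that penalty no longer lands on each cold start. Ordinary cold starts still add latency to authentication; ways to reduce their impact:

Use Provisioned Concurrency to keep instances warm (at extra cost)
Set a sensible TTL cache to reduce how often the Lambda is invoked
The Go runtime itself cold-starts quickly, much better than Python/Java

11.3 Deployment triggers must cover everything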

triggers = {
  redeployment = sha1(jsonencode(concat([
    aws_api_gateway_rest_api.main_gateway.id,
    aws_api_gateway_rest_api.main_gateway.body,    # redeploy when the body changes
    aws_api_gateway_rest_api.main_gateway.policy,  # and when the policy changes
    aws_api_gateway_rest_api.main_gateway.binary_media_types
  ], var.dependent_resources)))  # external dependencies are included too
}

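Leave any of these out and a configuration change will not produce a new deployment, so the change never takes effect. This is the most common pitfall when managing API Gateway with Terraform.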
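The mechanism is easy to see in isolation. The sketch below mimics the sha1(jsonencode(...)) trigger with Python's hashlib and json; the hash values will not equal the ones Terraform computes (the JSON encodings differ), and the API id, bodies, and policies are placeholder strings. The point is only which changes move the hash.

```python
import hashlib
import json


def redeployment_trigger(*inputs):
    """Hash every input that should force a new deployment when it changes."""
    encoded = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()


api_id = "abc123"                      # placeholder values throughout
body_v1 = '{"paths": {"/a": {}}}'
policy_v1 = '{"Statement": ["allow-all"]}'
policy_v2 = '{"Statement": ["allow-all", "deny-list"]}'
media_types = ["multipart/form-data"]

# Complete trigger: the policy is one of the hashed inputs
complete_before = redeployment_trigger(api_id, body_v1, policy_v1, media_types)
complete_after = redeployment_trigger(api_id, body_v1, policy_v2, media_types)

# Incomplete trigger: the policy was left out
partial_before = redeployment_trigger(api_id, body_v1, media_types)
partial_after = redeployment_trigger(api_id, body_v1, media_types)

policy_change_detected = complete_before != complete_after
policy_change_missed = partial_before == partial_after
```

With the policy left out of the inputs, the hash is identical before and after the policy edit, so Terraform sees no reason to create a new deployment and the stage keeps serving the old one.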

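11.4 CORS handling

The OPTIONS method is handled separately and bypasses the Authorizer. If OPTIONS also required authentication, the browser's CORS preflight request, which carries no credentials, would be rejected and the frontend would report a cross-origin error. That is why the OPTIONS method in the template has no security entry.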
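This rule can be enforced on the generated file. The following is a minimal sketch of a lint over an OpenAPI body, not part of the original project; `preflight_problems` is a hypothetical helper and the sample body is made up. It also flags paths that define a secured ANY method without an explicit OPTIONS, since ANY then handles the preflight too.

```python
ANY_METHOD = "x-amazon-apigateway-any-method"


def preflight_problems(body):
    """List paths whose CORS preflight would hit an authorizer."""
    problems = []
    for path, methods in sorted(body.get("paths", {}).items()):
        options = methods.get("options")
        any_method = methods.get(ANY_METHOD, {})
        if options is None:
            if any_method.get("security"):
                problems.append(
                    f"{path}: OPTIONS falls through to the secured ANY method")
        elif options.get("security"):
            problems.append(f"{path}: OPTIONS has a security requirement")
    return problems


sample_body = {
    "paths": {
        "/order-service/{proxy+}": {
            "options": {},
            ANY_METHOD: {"security": [{"authorizer": []}]},
        },
        "/payment-service/{proxy+}": {
            ANY_METHOD: {"security": [{"authorizer": []}]},
        },
        "/user-bff-service/{proxy+}": {
            "options": {"security": [{"authorizer": []}]},
            ANY_METHOD: {"security": [{"authorizer": []}]},
        },
    }
}
found = preflight_problems(sample_body)
```

In practice the input would be the parsed body.json; running the check right after generation catches a template edit that accidentally puts OPTIONS behind authentication.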

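11.5 When to use through_path

Use through_path when certain paths must be passed through to a completely different backend (for example, a third-party platform's API). It bypasses the VPC Link and proxies directly over HTTP to the target URL. Note that these routes have no authentication at the gateway, so make sure the target service enforces its own.

11.6 State isolation across products

Each product's Terraform state is stored under a different S3 key: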

# product-a
backend "s3" {
  key = "product-a/terraform.tfstate"
}

# product-b
backend "s3" {
  key = "product-b/terraform.tfstate"
}

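This way, running terraform destroy on one product does not affect the other. Be careful, though: if two products share a VPC Link, confirm the dependencies before destroying anything.

12. Summary

The core design principles of this multi-product API Gateway infrastructure:

Module reuse: shared Terraform modules, so a new product plugs in without module changes
Configuration-driven: routes are defined in JSON, and Python generates the OpenAPI body
Layered security: Resource Policy (layer 1) -> Lambda Authorizer (layer 2) -> backend service authorization (layer 3)
Environment isolation: Terraform Workspaces + separate S3 state + separate AWS credentials
Performance: Go Lambda + TTL cache + private VPC networking

The extension paths are clear: to add a product line, copy a product directory; to add a route, edit the JSON configuration; to add an authentication method, add a handler to the Go Authorizer. Each kind of change is confined to the smallest possible scope, which is the mark of well-designed infrastructure.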