Relay API Documentation

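The TokenGateway relay API is compatible with the standard OpenAI format and provides a single entry point to AI services. The platform automatically selects the best available model, supports both streaming and non-streaming responses, and provides error handling and rate limiting.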

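  • OpenAI compatible: 100%
  • Smart routing: automatically selects the best model
  • High availability: automatic failover across multiple upstreams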

Quick start

POST /xlei/v1/completions
curl -X POST http://localhost:8080/xlei/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key" \
  -d '{
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

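API overview

The unified endpoint supports two modes, and the backend detects which one a request uses. The platform selects the best available model automatically, so callers do not specify one.

  • Chat mode: use the messages parameter (recommended)
  • Completion mode: use the prompt parameter
  • Minimal input: callers supply only the message content; the backend configures all other parameters
  • Smart routing: the backend selects the highest-priority upstream model that is available
  • Failover: if the current model fails, the request switches automatically to a backup model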
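
The smart routing and failover behaviour in the list above might be modelled roughly as follows. This is an illustrative sketch, not the gateway's actual implementation: the Upstream type, its fields, and route_request are hypothetical names, and treating a lower priority number as a higher priority is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Upstream:
    """Hypothetical description of one upstream model."""
    name: str
    priority: int                 # assumption: lower number = higher priority
    available: bool
    call: Callable[[dict], str]   # sends the payload, returns the reply text


def route_request(upstreams: List[Upstream], payload: dict) -> str:
    """Try available upstreams in priority order, falling back on failure."""
    candidates = sorted(
        (u for u in upstreams if u.available), key=lambda u: u.priority
    )
    last_error = None
    for upstream in candidates:
        try:
            return upstream.call(payload)
        except Exception as exc:  # any failure switches to the next model
            last_error = exc
    raise RuntimeError("no upstream model available") from last_error
```

When every candidate fails, or none is available, the sketch raises an error, which would correspond to error code 4001 (no upstream model available) in the error code table.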

Authentication

Two authentication methods are supported; X-API-Key is preferred:

# Method 1: X-API-Key (recommended)
curl -X POST http://localhost:8080/xlei/v1/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: sk-your-api-key" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'

# Method 2: Authorization Bearer
curl -X POST http://localhost:8080/xlei/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key" \
  -d '{"messages": [{"role": "user", "content": "Hello"}]}'
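
The same choice between the two header styles could be wrapped in a small Python helper. This is a minimal sketch; build_headers and its use_x_api_key flag are hypothetical names introduced only for illustration.

```python
def build_headers(api_key: str, use_x_api_key: bool = True) -> dict:
    """Return request headers for one of the two documented auth methods."""
    headers = {"Content-Type": "application/json"}
    if use_x_api_key:
        # Method 1: X-API-Key (the recommended method)
        headers["X-API-Key"] = api_key
    else:
        # Method 2: Authorization Bearer
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

Only one of the two headers is sent per request, which avoids any ambiguity about which credential the gateway reads.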

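Rate limiting

The platform applies two rate limits to keep the service stable:

  • IP limit: 500 requests per minute per IP address
  • API key limit: 100 requests per minute per API key

Requests that exceed either limit receive a 429 Too Many Requests error.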
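
A client can react to a 429 response by waiting and retrying. The sketch below shows exponential backoff under stated assumptions: call_with_backoff is a hypothetical helper, send stands for any caller-supplied function that performs the request and returns the HTTP status code, and the delay values are illustrative, since this document does not specify a retry policy.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], int],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns something other than 429 or retries run out."""
    status = 429
    for attempt in range(max_retries + 1):
        status = send()
        if status != 429:
            return status
        if attempt < max_retries:
            # Wait 1s, 2s, 4s, ... before the next attempt
            sleep(base_delay * (2 ** attempt))
    return status
```

At 100 requests per minute per API key, a steadily paced client would send at most one request every 0.6 seconds.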

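Request parameters

Provide exactly one of the following parameters; the backend sets all other parameters automatically:

Parameter   Type     Required         Description
messages    array    One of the two   Array of chat messages; alternative to prompt (recommended)
prompt      string   One of the two   Text to complete; alternative to messages

Format of the messages array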

"messages": [
  {"role": "system", "content": "You are a helpful assistant"},
  {"role": "user", "content": "Hello!"},
  {"role": "assistant", "content": "Hello! How can I help you?"},
  {"role": "user", "content": "Tell me more"}
]

Allowed values for role: system (system prompt), user (user input), assistant (assistant reply)
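
Before sending a request, a client may want to check the messages array locally. The following sketch validates the three allowed roles; validate_messages is a hypothetical helper, and the requirements that the list is non-empty and that content is a string are assumptions not stated in this document.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages) -> list:
    """Return a list of problems found; an empty list means the input looks valid."""
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] must be an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"messages[{i}].role is invalid")
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}].content must be a string")
    return problems
```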

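Backend defaults

The backend sets the following parameters automatically; callers do not need to send them:

  • max_tokens: set to 1000
  • temperature: set to 0.7
  • stream: streaming responses enabled
  • top_p: set to 1.0
  • frequency_penalty: set to 0.0
  • presence_penalty: set to 0.0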
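
To picture what the upstream request might look like once these defaults are applied, consider the sketch below. BACKEND_DEFAULTS mirrors the values listed above; effective_request is a hypothetical helper. Apart from stream: false, which the response format section mentions, this document does not say whether the backend honours client-supplied values, so letting client values take precedence is an assumption.

```python
BACKEND_DEFAULTS = {
    "max_tokens": 1000,
    "temperature": 0.7,
    "stream": True,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}


def effective_request(user_payload: dict) -> dict:
    """Model the request after defaults are merged in.

    Assumption: keys sent by the client override the defaults.
    """
    return {**BACKEND_DEFAULTS, **user_payload}
```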

Request examples

Chat mode (recommended)

{
  "messages": [
    {"role": "user", "content": "Hello!"}
  ]
}

Multi-turn conversation

{
  "messages": [
    {"role": "user", "content": "What is AI?"},
    {"role": "assistant", "content": "AI stands for Artificial Intelligence..."},
    {"role": "user", "content": "Tell me more about it."}
  ]
}

Completion mode

{
  "prompt": "Once upon a time,"
}
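
The request bodies above can be produced by one small builder that enforces the rule of supplying either messages or prompt. This is a sketch: build_payload is a hypothetical name, and rejecting a request that carries both parameters is an assumption, since this document only says to choose one of the two.

```python
import json


def build_payload(messages=None, prompt=None) -> dict:
    """Build a request body for chat mode (messages) or completion mode (prompt)."""
    if (messages is None) == (prompt is None):
        raise ValueError("provide exactly one of messages or prompt")
    if messages is not None:
        return {"messages": messages}
    return {"prompt": prompt}


# Serialized bodies matching the chat and completion examples
chat_body = json.dumps(build_payload(messages=[{"role": "user", "content": "Hello!"}]))
completion_body = json.dumps(build_payload(prompt="Once upon a time,"))
```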

Response format

Streaming response (default)

Returns Server-Sent Events (SSE), suitable for displaying output as it is generated:

data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"!"}}]}
data: [DONE]
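
A client typically reassembles the reply by concatenating the delta.content fragments until it reaches the [DONE] marker. The sketch below does this for lines shaped like the sample above; iter_sse_content is a hypothetical helper that takes already-decoded text lines.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the content fragments carried by SSE data lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and anything that is not a data field
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


sample = [
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"!"}}]}',
    "data: [DONE]",
]
full_text = "".join(iter_sse_content(sample))  # "Hello!"
```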

Non-streaming response

With stream: false, the complete response is returned as a single JSON object:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1714166400,
  "model": "auto-selected",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 15,
    "completion_tokens": 20,
    "total_tokens": 35
  }
}
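
Reading the reply text and token usage out of such a response could look like this. It is a sketch with a hypothetical helper name, summarize_completion, and it assumes the response has at least one entry in choices, as in the sample above.

```python
def summarize_completion(response: dict) -> dict:
    """Extract the reply text, finish reason, and token total from a response."""
    choice = response["choices"][0]
    usage = response.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


sample_response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "created": 1714166400,
    "model": "auto-selected",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 15, "completion_tokens": 20, "total_tokens": 35},
}
summary = summarize_completion(sample_response)
```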

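Error codes

Code   HTTP status   Description
0      200           Success
1001   400           Missing prompt or messages parameter
2001   401           API key is invalid or disabled
2002   402           Insufficient balance
2003   429           Too many requests (rate limited)
1004   403           User account frozen
4001   503           No upstream model available
5000   500           Internal server error

Error response format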

{
  "code": 2002,
  "message": "Insufficient balance",
  "data": null
  "data": null
}
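
On the client side, an error body of this shape can be turned into an exception. The sketch below is illustrative: GatewayError and raise_for_error are hypothetical names, treating a body without a code field as success is an assumption, and marking codes 2003 and 4001 as retryable is a judgment call rather than documented behaviour.

```python
RETRYABLE_CODES = {2003, 4001}  # assumption: rate limiting and missing upstreams are transient


class GatewayError(Exception):
    """Hypothetical exception wrapping the code/message/data error body."""

    def __init__(self, code: int, message: str, http_status=None):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message
        self.http_status = http_status
        self.retryable = code in RETRYABLE_CODES


def raise_for_error(body: dict, http_status=None) -> None:
    """Raise GatewayError when the body carries a non-zero error code."""
    code = body.get("code", 0)  # assumption: no code field means success
    if code != 0:
        raise GatewayError(code, body.get("message", ""), http_status)
```

A caller might combine the retryable flag with a backoff loop and surface every other error, such as 2002 (insufficient balance), to the user immediately.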

SDK examples

Python

import requests
import os

base_url = "http://localhost:8080"
api_key = os.getenv("API_KEY")

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}"
}

data = {"messages": [{"role": "user", "content": "Hello"}]}
response = requests.post(
    f"{base_url}/xlei/v1/completions",
    json=data,
    headers=headers,
    stream=True
)

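# Each SSE event arrives as a line of the form "data: <payload>"
for line in response.iter_lines():
    if not line:
        continue  # skip the blank separator lines
    text = line.decode('utf-8')
    if text.startswith('data: '):
        print(text[6:])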

JavaScript

const response = await fetch('http://localhost:8080/xlei/v1/completions', {
    method: 'POST',
    headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + process.env.API_KEY
    },
    body: JSON.stringify({
        messages: [{ role: 'user', content: 'Hello' }]
    })
});

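const reader = response.body.getReader();
const decoder = new TextDecoder('utf-8');
let buffer = '';
while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep a trailing partial line for the next chunk
    for (const line of lines) {
        if (line.startsWith('data: ')) {
            console.log(line.slice(6));
        }
    }
}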

Java

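import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

// Requires OkHttp 4.x; the class name is illustrative
public class RelayExample {
    public static void main(String[] args) throws IOException {
        OkHttpClient client = new OkHttpClient();
        String jsonBody = "{\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}";

        Request request = new Request.Builder()
            .url("http://localhost:8080/xlei/v1/completions")
            .post(RequestBody.create(jsonBody, MediaType.parse("application/json")))
            .addHeader("Authorization", "Bearer " + System.getenv("API_KEY"))
            .build();

        // Read the SSE stream line by line and print each data payload
        try (Response response = client.newCall(request).execute();
             BufferedReader reader = new BufferedReader(new InputStreamReader(
                 response.body().byteStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("data: ")) {
                    System.out.println(line.substring(6));
                }
            }
        }
    }
}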

Go

package main

import (
    "bufio"
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "strings"
)

func main() {
    baseURL := "http://localhost:8080"
    apiKey := os.Getenv("API_KEY")
    
    body := map[string]interface{}{
        "messages": []interface{}{
            map[string]interface{}{"role": "user", "content": "Hello"},
        },
    }
    
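    jsonBody, err := json.Marshal(body)
    if err != nil {
        panic(err)
    }

    req, err := http.NewRequest("POST",
        baseURL+"/xlei/v1/completions",
        bytes.NewBuffer(jsonBody))
    if err != nil {
        panic(err)
    }
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("Authorization", "Bearer "+apiKey)

    // Check the error before touching resp: it is nil when the request fails
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()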
    
    scanner := bufio.NewScanner(resp.Body)
    for scanner.Scan() {
        line := scanner.Text()
        if strings.HasPrefix(line, "data: ") {
            fmt.Println(line[6:])
        }
    }
}