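Relay API Documentation
The TokenGateway relay API is compatible with the standard OpenAI format and provides a single, unified way to access AI services. The platform automatically selects the best available model, supports both streaming and non-streaming responses, and provides error handling and rate limiting.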
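- 100% OpenAI compatible
- Smart routing: automatically selects the best model
- High availability: automatic failover across multiple upstreams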
Quick Start
POST /xlei/v1/completions
curl -X POST http://localhost:8080/xlei/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-api-key" \
-d '{
"messages": [
{"role": "user", "content": "Hello"}
]
}'
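Interface Overview
The unified endpoint supports two modes, and the backend detects which one a request uses. The platform automatically selects the best available model, so the caller does not specify one.
- Chat mode: use the messages parameter (recommended)
- Completion mode: use the prompt parameter
- Minimal input: the caller only provides the message content; the backend configures all other parameters
- Smart routing: the backend picks the available upstream model with the highest priority
- Failover: if the current model fails, the request is switched to a backup model automatically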
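The two modes above can be sketched as a pair of small request-body builders. This is only an illustration: the helper names `chat_body`, `completion_body`, and `detect_mode` are hypothetical, and `detect_mode` is a client-side guess at the rule the backend applies. The source does not say what happens when both parameters are sent, so this sketch assumes that case should be rejected.

```python
def chat_body(text):
    """Wrap a single user message in a chat-mode request body."""
    return {"messages": [{"role": "user", "content": text}]}


def completion_body(prompt):
    """Build a completion-mode request body."""
    return {"prompt": prompt}


def detect_mode(body):
    """Guess which mode a request body selects (client-side check only)."""
    has_messages = "messages" in body
    has_prompt = "prompt" in body
    if has_messages and has_prompt:
        # Assumption: the source does not say which one wins, so reject.
        raise ValueError("send either messages or prompt, not both")
    if has_messages:
        return "chat"
    if has_prompt:
        return "completion"
    # The gateway documents error code 1001 for this case.
    raise ValueError("missing prompt or messages parameter")
```

Calling `detect_mode(chat_body("Hello"))` before sending a request catches a malformed body locally instead of spending a rate-limited call on it.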
Authentication
Two authentication methods are supported; X-API-Key is preferred:
# Method 1: X-API-Key (recommended)
curl -X POST http://localhost:8080/xlei/v1/completions \
-H "Content-Type: application/json" \
-H "X-API-Key: sk-your-api-key" \
-d '{"messages": [{"role": "user", "content": "Hello"}]}'
# Method 2: Authorization Bearer
curl -X POST http://localhost:8080/xlei/v1/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-api-key" \
-d '{"messages": [{"role": "user", "content": "Hello"}]}'
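For client code, the two header styles shown in the curl commands can be produced by one small helper. This is a minimal sketch; the function name `auth_headers` and its `use_bearer` flag are hypothetical and not part of the gateway.

```python
def auth_headers(api_key, use_bearer=False):
    """Return request headers for one of the two documented auth methods."""
    headers = {"Content-Type": "application/json"}
    if use_bearer:
        # Method 2: standard Authorization header with a Bearer token.
        headers["Authorization"] = f"Bearer {api_key}"
    else:
        # Method 1 (recommended by the docs): dedicated X-API-Key header.
        headers["X-API-Key"] = api_key
    return headers
```

Defaulting to X-API-Key follows the recommendation above, while the Bearer form stays available for OpenAI-style clients that only know how to set an Authorization header.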
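Rate Limiting
The platform applies two rate limits to keep the service stable:
- IP rate limit: 500 requests/minute per IP address
- API key rate limit: 100 requests/minute per API key
Requests beyond either limit receive a 429 Too Many Requests error.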
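One way to stay under the per-key limit is to throttle on the client side before sending. The sketch below is an assumption-laden illustration: the class `SlidingWindowLimiter` is hypothetical, and the source does not say whether the server counts requests in a fixed or a sliding window, so a sliding window is used here as the more conservative choice.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most max_requests per window_seconds."""

    def __init__(self, max_requests=100, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of recorded requests, oldest first

    def wait_time(self):
        """Return seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        # The oldest request must expire before another one fits.
        return self.window_seconds - (now - self._sent[0])

    def record(self):
        """Record that a request was just sent."""
        self._sent.append(self.clock())
```

A caller would check `wait_time()`, sleep for that long if it is non-zero, send the request, and then call `record()`. A 429 response should still be handled, since the IP limit is shared with other clients behind the same address.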
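Request Parameters
Provide one of the following parameters; the backend sets all other parameters automatically:
| Parameter | Type | Required | Description |
|---|---|---|---|
| messages | array | No | Array of chat messages; provide either this or prompt (recommended) |
| prompt | string | No | Completion text; provide either this or messages |
messages array format
"messages": [
{"role": "system", "content": "You are a helpful assistant"},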
{"role": "user", "content": "Hello!"},
{"role": "assistant", "content": "Hello! How can I help you?"},
{"role": "user", "content": "Tell me more"}
]
Allowed role values: system (system prompt), user (user input), assistant (assistant reply)
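The messages structure above can be checked locally before a request is sent. This is a minimal sketch; `validate_messages` is a hypothetical helper, and the rule that `content` must be a string is an assumption drawn from the examples rather than a documented constraint.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Return a list of problems found; an empty list means the shape looks valid."""
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"messages[{i}] must be an object")
            continue
        role = msg.get("role")
        if not isinstance(role, str) or role not in ALLOWED_ROLES:
            problems.append(f"messages[{i}].role must be one of {sorted(ALLOWED_ROLES)}")
        # Assumption: content is always a plain string, as in the examples.
        if not isinstance(msg.get("content"), str):
            problems.append(f"messages[{i}].content must be a string")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a long conversation history at once.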
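Backend Auto-Configuration
The backend sets the following parameters by default, so the caller does not need to send them:
- max_tokens: set to 1000
- temperature: set to 0.7
- stream: streaming is enabled
- top_p: set to 1.0
- frequency_penalty: set to 0.0
- presence_penalty: set to 0.0
Request Examples
Chat mode (recommended)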
{
"messages": [
{"role": "user", "content": "Hello!"}
]
}
Multi-turn conversation
{
"messages": [
{"role": "user", "content": "What is AI?"},
{"role": "assistant", "content": "AI stands for Artificial Intelligence..."},
{"role": "user", "content": "Tell me more about it."}
]
}
Completion mode
{
"prompt": "Once upon a time,"
}
Response Format
Streaming response (default)
The response is returned in SSE (Server-Sent Events) format, which suits real-time display:
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"Hello"}}]}
data: {"id":"chatcmpl-xxx","object":"chat.completion.chunk","created":1714166400,"choices":[{"delta":{"content":"!"}}]}
data: [DONE]
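The stream shown above can be reduced to plain text by decoding each `data:` line and collecting the `delta.content` fragments. This is a minimal sketch that assumes every chunk has the shape shown in the sample; `iter_sse_content` is a hypothetical helper name.

```python
import json


def iter_sse_content(lines):
    """Yield the text fragments carried by a sequence of SSE lines."""
    for raw in lines:
        # Accept both bytes (as HTTP clients usually yield) and str.
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content
```

Joining the fragments, for example with `"".join(iter_sse_content(response.iter_lines()))` on a `requests` streaming response, rebuilds the full reply.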
Non-streaming response
When stream is set to false, the complete JSON is returned:
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1714166400,
"model": "auto-selected",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 15,
"completion_tokens": 20,
"total_tokens": 35
}
}
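Most callers only need the reply text and the token count from this JSON. The sketch below pulls those out; `extract_reply` is a hypothetical helper, and it assumes the first entry in `choices` is the one of interest.

```python
def extract_reply(response):
    """Pull the reply text, finish reason, and token total out of a response dict."""
    choice = response["choices"][0]  # assumption: only the first choice matters
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": response.get("usage", {}).get("total_tokens"),
    }
```

The `usage` lookup uses `.get` so a response without a usage block yields `None` for `total_tokens` instead of raising.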
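Error Codes
| Error code | HTTP status | Description |
|---|---|---|
| 0 | 200 | Success |
| 1001 | 400 | Missing prompt or messages parameter |
| 2001 | 401 | API key is invalid or disabled |
| 2002 | 402 | Insufficient balance |
| 2003 | 429 | Too many requests (rate limited) |
| 1004 | 403 | User account is frozen |
| 4001 | 503 | No upstream model available |
| 5000 | 500 | Internal server error |
Error Response Format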
{
"code": 2002,
"message": "余额不足",
"data": null
}
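An error body of this shape can be turned into an exception that also records whether a retry is worth attempting. This is only a sketch: `GatewayError` and `raise_for_error` are hypothetical names, and treating codes 2003, 4001, and 5000 as retryable is an assumption, not something the source states. A body without a `code` field is assumed to be a normal success response.

```python
# Assumption: rate limiting, missing upstreams, and server errors may be transient.
RETRYABLE_CODES = {2003, 4001, 5000}


class GatewayError(Exception):
    """Raised when a response body carries a non-zero error code."""

    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message
        self.retryable = code in RETRYABLE_CODES


def raise_for_error(body):
    """Raise GatewayError if the parsed JSON body reports a failure."""
    code = body.get("code", 0)  # a missing code is treated as success
    if code != 0:
        raise GatewayError(code, body.get("message", ""))
```

A caller can catch `GatewayError`, back off and retry when `retryable` is true, and surface errors such as 2001 or 2002 to the user immediately.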
SDK Examples
Python
import os

import requests

base_url = "http://localhost:8080"
api_key = os.getenv("API_KEY")  # read the key from the environment
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {api_key}",
}
data = {"messages": [{"role": "user", "content": "Hello"}]}

# stream=True lets requests hand back the SSE lines as they arrive
response = requests.post(
    f"{base_url}/xlei/v1/completions",
    json=data,
    headers=headers,
    stream=True,
)
for line in response.iter_lines():
    # each event line looks like "data: {...}"; print it without the prefix
    if line and line.decode("utf-8").startswith("data: "):
        print(line.decode("utf-8")[6:])
JavaScript
const response = await fetch('http://localhost:8080/xlei/v1/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer ' + process.env.API_KEY
  },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello' }]
  })
});

const reader = response.body.getReader();
const decoder = new TextDecoder('utf-8');
let buffer = '';
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters split across chunks intact
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop(); // keep the trailing partial line for the next chunk
  for (const line of lines) {
    if (line.startsWith('data: ')) {
      console.log(line.slice(6));
    }
  }
}
Java
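import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

public class StreamExample {
    public static void main(String[] args) throws IOException {
        OkHttpClient client = new OkHttpClient();
        String jsonBody = "{\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}";
        Request request = new Request.Builder()
                .url("http://localhost:8080/xlei/v1/completions")
                .post(RequestBody.create(jsonBody, MediaType.parse("application/json")))
                .addHeader("Authorization", "Bearer " + System.getenv("API_KEY"))
                .build();
        try (Response response = client.newCall(request).execute()) {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(response.body().byteStream(), StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("data: ")) {
                    System.out.println(line.substring(6));
                }
            }
        }
    }
}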
Go
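package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"os"
	"strings"
)

func main() {
	baseURL := "http://localhost:8080"
	apiKey := os.Getenv("API_KEY")

	body := map[string]interface{}{
		"messages": []interface{}{
			map[string]interface{}{"role": "user", "content": "Hello"},
		},
	}
	jsonBody, err := json.Marshal(body)
	if err != nil {
		log.Fatalf("marshal request: %v", err)
	}

	req, err := http.NewRequest("POST",
		baseURL+"/xlei/v1/completions",
		bytes.NewBuffer(jsonBody))
	if err != nil {
		log.Fatalf("build request: %v", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatalf("send request: %v", err)
	}
	defer resp.Body.Close()

	// Print the payload of each SSE "data: " line as it arrives.
	scanner := bufio.NewScanner(resp.Body)
	for scanner.Scan() {
		line := scanner.Text()
		if strings.HasPrefix(line, "data: ") {
			fmt.Println(line[6:])
		}
	}
	if err := scanner.Err(); err != nil {
		log.Fatalf("read stream: %v", err)
	}
}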