
Chat Streaming Completion

Chat Stream

POST https://api.quickrouter.ai/v1/chat/completions
Authorization

Add an Authorization parameter to the request header. Its value is the word Bearer followed by a space and your token.

Example: Authorization: Bearer ********************
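
The header rule above can be sketched in Python as follows. This is a minimal illustration: `build_auth_headers` is a hypothetical helper name, not part of any SDK, and the token is a placeholder.

```python
from __future__ import annotations


def build_auth_headers(token: str) -> dict[str, str]:
    """Return request headers with the token placed after the 'Bearer ' prefix."""
    token = token.strip()
    if not token:
        raise ValueError("token must not be empty")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # YOUR_API_KEY is a placeholder; substitute your own token.
    print(build_auth_headers("YOUR_API_KEY"))
```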

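Request parameters

Header parameters
Content-Type string
Required
Example: application/json
Authorization string
Required
Example: Bearer $OPENAI_API_KEY
Body parameters application/json
model string
Required
ID of the model to use. For details on which models work with the chat API, see the model endpoint compatibility table.
messages array [object]
Required
The list of messages in the conversation so far.
role string
Optional
content string
Optional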
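temperature number
Optional
Sampling temperature, between 0 and 2. Higher values such as 0.8 make the output more random; lower values such as 0.2 make it more focused and deterministic. We generally recommend changing this or top_p, but not both.
top_p number
Optional
An alternative to temperature sampling, called nucleus sampling, in which the model considers only the tokens that make up the top_p probability mass. A value of 0.1 means only the tokens comprising the top 10% of probability mass are considered. We generally recommend changing this or temperature, but not both.
n integer
Optional
Defaults to 1. How many chat completion choices to generate for each input message.
stream boolean
Optional
Defaults to false. If set, partial message deltas are sent, as in ChatGPT. Tokens are sent as data-only server-sent events, and the stream is terminated by a data: [DONE] message.
stop string
Optional
Defaults to null. Up to 4 sequences at which the API stops generating further tokens.
max_tokens integer
Optional
Defaults to inf. The maximum number of tokens to generate in the chat completion. The combined length of input tokens and generated tokens is limited by the model's context length.
presence_penalty number
Optional
A number between -2.0 and 2.0. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the model's likelihood of talking about new topics.
frequency_penalty number
Optional
Defaults to 0. A number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.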
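logit_bias object
Optional
Modifies the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens to an associated bias value from -100 to 100.
user string
Optional
A unique identifier representing your end user, which can help monitor and detect abuse.
tools array [object]
Optional
A list of tools the model may call. Currently, only functions are supported as tools.
tool_choice object
Optional
Controls which function, if any, the model calls. none means no function is called; auto means the model chooses automatically.
response_format object
Optional
An object specifying the format the model must output. Setting it to { "type": "json_object" } enables JSON mode.
seed integer
Optional
This feature is in beta. If specified, the system makes a best effort to sample deterministically.
stream_options object
Optional
Options for streaming responses. For example, { "include_usage": true } returns usage information in the final chunk.
Example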
{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
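
As a rough illustration of the ranges listed in the body parameters, the following Python sketch assembles a request body and rejects out-of-range values before anything is sent. `build_chat_payload` is a hypothetical helper, and the checks cover only the limits stated in the parameter list; the server may enforce others.

```python
import json


def build_chat_payload(model, messages, **options):
    """Assemble a request body, checking the ranges stated in the parameter list."""
    if not model or not messages:
        raise ValueError("model and messages are required")
    payload = {"model": model, "messages": list(messages)}

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    for name in ("presence_penalty", "frequency_penalty"):
        value = options.get(name)
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    stop = options.get("stop")
    if isinstance(stop, (list, tuple)) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    for token_id, bias in (options.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            raise ValueError(f"logit_bias for {token_id} must be between -100 and 100")

    # Copy only the options that were actually set, so server defaults still apply.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


if __name__ == "__main__":
    body = build_chat_payload(
        "gpt-5-mini",
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "你好"},
        ],
        max_tokens=1000,
        temperature=1.0,
        stream=True,
        stream_options={"include_usage": True},
    )
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Omitting unset options keeps the request small and lets the documented defaults (for example n = 1 and stream = false) apply.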

Request example code

cURL

curl --location --request POST 'https://api.quickrouter.ai/v1/chat/completions' \
--header 'Accept: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--header 'Content-Type: application/json' \
--data-raw '{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}'

JavaScript (Fetch)

var myHeaders = new Headers();
myHeaders.append("Accept", "application/json");
myHeaders.append("Authorization", "Bearer YOUR_API_KEY");
myHeaders.append("Content-Type", "application/json");

var raw = JSON.stringify({
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
});

var requestOptions = {
   method: 'POST',
   headers: myHeaders,
   body: raw,
   redirect: 'follow'
};

fetch("https://api.quickrouter.ai/v1/chat/completions", requestOptions)
   .then(response => response.text())
   .then(result => console.log(result))
   .catch(error => console.log('error', error));

Java

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class ChatStreamExample {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://api.quickrouter.ai/v1/chat/completions");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Accept", "application/json");
        conn.setRequestProperty("Authorization", "Bearer YOUR_API_KEY");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        // A text block (Java 15+) holds the multi-line JSON body without escaping.
        String jsonInputString = """
                {
                  "model": "gpt-5-mini",
                  "max_tokens": 1000,
                  "messages": [
                    {
                      "role": "system",
                      "content": "You are a helpful assistant."
                    },
                    {
                      "role": "user",
                      "content": "你好"
                    }
                  ],
                  "temperature": 1.0,
                  "stream": true,
                  "stream_options": {
                    "include_usage": true
                  }
                }
                """;
        // Send the body as UTF-8 bytes.
        try (OutputStream os = conn.getOutputStream()) {
            byte[] input = jsonInputString.getBytes(StandardCharsets.UTF_8);
            os.write(input, 0, input.length);
        }
        int responseCode = conn.getResponseCode();
        System.out.println("Response Code: " + responseCode);
    }
}

Swift

import Foundation

let urlString = "https://api.quickrouter.ai/v1/chat/completions"
// Stop if the URL cannot be parsed; fatalError is valid both at top level and inside a function.
guard let url = URL(string: urlString) else { fatalError("Invalid URL") }
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.addValue("application/json", forHTTPHeaderField: "Accept")
request.addValue("Bearer YOUR_API_KEY", forHTTPHeaderField: "Authorization")
request.addValue("application/json", forHTTPHeaderField: "Content-Type")
// A multi-line string literal holds the JSON body without escaping.
let httpBody = """
{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}
"""
request.httpBody = httpBody.data(using: .utf8)

let task = URLSession.shared.dataTask(with: request) { data, response, error in
    if let data = data {
        print(String(data: data, encoding: .utf8)!)
    }
}
task.resume()

Go

package main

import (
    "fmt"
    "io"
    "net/http"
    "strings" // needed for strings.NewReader below
)

func main() {
    body := strings.NewReader(`{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}`)
    req, err := http.NewRequest("POST", "https://api.quickrouter.ai/v1/chat/completions", body)
    if err != nil {
        panic(err)
    }
    req.Header.Set("Accept", "application/json")
    req.Header.Set("Authorization", "Bearer YOUR_API_KEY")
    req.Header.Set("Content-Type", "application/json")
    client := &http.Client{}
    // Check the error before touching resp; resp is nil when the request fails.
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    bodyBytes, err := io.ReadAll(resp.Body)
    if err != nil {
        panic(err)
    }
    fmt.Println(string(bodyBytes))
}

PHP

<?php

$curl = curl_init();
curl_setopt_array($curl, array(
  CURLOPT_URL => 'https://api.quickrouter.ai/v1/chat/completions',
  CURLOPT_RETURNTRANSFER => true,
  CURLOPT_CUSTOMREQUEST => 'POST',
  CURLOPT_POSTFIELDS => '{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}',
  CURLOPT_HTTPHEADER => array(
    "Accept: application/json",
    "Authorization: Bearer YOUR_API_KEY",
    "Content-Type: application/json",
  ),
));
$response = curl_exec($curl);
curl_close($curl);
echo $response;

Python (http.client)

import http.client
import json

conn = http.client.HTTPSConnection("api.quickrouter.ai")
payload = json.dumps({
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
})
headers = {
   'Accept': 'application/json',
   'Authorization': 'Bearer YOUR_API_KEY',
   'Content-Type': 'application/json',
}
conn.request("POST", "/v1/chat/completions", payload, headers)
res = conn.getresponse()
data = res.read()
print(data.decode("utf-8"))

HTTP

POST https://api.quickrouter.ai/v1/chat/completions HTTP/1.1
Accept: application/json
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}

C (libcurl)

#include <curl/curl.h>

int main(void) {
    CURL *hnd = curl_easy_init();
    if (!hnd) return 1;
    curl_easy_setopt(hnd, CURLOPT_CUSTOMREQUEST, "POST");
    curl_easy_setopt(hnd, CURLOPT_URL, "https://api.quickrouter.ai/v1/chat/completions");

    struct curl_slist *headers = NULL;
    headers = curl_slist_append(headers, "Accept: application/json");
    headers = curl_slist_append(headers, "Authorization: Bearer YOUR_API_KEY");
    headers = curl_slist_append(headers, "Content-Type: application/json");
    curl_easy_setopt(hnd, CURLOPT_HTTPHEADER, headers);
    /* C string literals cannot span raw line breaks; adjacent literals are concatenated. */
    curl_easy_setopt(hnd, CURLOPT_POSTFIELDS,
        "{"
        "\"model\": \"gpt-5-mini\","
        "\"max_tokens\": 1000,"
        "\"messages\": ["
        "{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},"
        "{\"role\": \"user\", \"content\": \"你好\"}"
        "],"
        "\"temperature\": 1.0,"
        "\"stream\": true,"
        "\"stream_options\": {\"include_usage\": true}"
        "}");
    /* libcurl writes the response body to stdout by default. */
    CURLcode ret = curl_easy_perform(hnd);
    curl_slist_free_all(headers);
    curl_easy_cleanup(hnd);
    return ret == CURLE_OK ? 0 : 1;
}

C# (RestSharp)

// Written against the older RestSharp API (Method.POST, IRestResponse).
var client = new RestClient("https://api.quickrouter.ai/v1/chat/completions");
var request = new RestRequest(Method.POST);
request.AddHeader("Accept", "application/json");
request.AddHeader("Authorization", "Bearer YOUR_API_KEY");
request.AddHeader("Content-Type", "application/json");
// Inside a verbatim string, double quotes are escaped by doubling them.
request.AddParameter("application/json", @"{
  ""model"": ""gpt-5-mini"",
  ""max_tokens"": 1000,
  ""messages"": [
    {
      ""role"": ""system"",
      ""content"": ""You are a helpful assistant.""
    },
    {
      ""role"": ""user"",
      ""content"": ""你好""
    }
  ],
  ""temperature"": 1.0,
  ""stream"": true,
  ""stream_options"": {
    ""include_usage"": true
  }
}", ParameterType.RequestBody);
IRestResponse response = client.Execute(request);
Console.WriteLine(response.Content);

Objective-C

#import <Foundation/Foundation.h>

NSURL *url = [NSURL URLWithString:@"https://api.quickrouter.ai/v1/chat/completions"];
NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:url];
[request setHTTPMethod:@"POST"];
[request setValue:@"application/json" forHTTPHeaderField:@"Accept"];
[request setValue:@"Bearer YOUR_API_KEY" forHTTPHeaderField:@"Authorization"];
[request setValue:@"application/json" forHTTPHeaderField:@"Content-Type"];
// String literals cannot span raw line breaks; adjacent literals are concatenated.
[request setHTTPBody:[@"{"
  @"\"model\": \"gpt-5-mini\","
  @"\"max_tokens\": 1000,"
  @"\"messages\": ["
  @"{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},"
  @"{\"role\": \"user\", \"content\": \"你好\"}"
  @"],"
  @"\"temperature\": 1.0,"
  @"\"stream\": true,"
  @"\"stream_options\": {\"include_usage\": true}"
  @"}" dataUsingEncoding:NSUTF8StringEncoding]];
NSURLSessionDataTask *task = [[NSURLSession sharedSession] dataTaskWithRequest:request completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
    NSLog(@"%@", [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding]);
}];
[task resume];

Ruby

require "uri"
require "net/http"
require "json"

url = URI("https://api.quickrouter.ai/v1/chat/completions")
http = Net::HTTP.new(url.host, url.port)
http.use_ssl = true
request = Net::HTTP::Post.new(url)
request["Accept"] = "application/json"
request["Authorization"] = "Bearer YOUR_API_KEY"
request["Content-Type"] = "application/json"
request.body = '{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}'
response = http.request(request)
puts response.read_body

OCaml

(* Requires cohttp-lwt-unix and lwt *)
open Lwt.Infix (* provides >>= and >|= *)

let () =
  let url = "https://api.quickrouter.ai/v1/chat/completions" in
  let headers = Cohttp.Header.of_list [
    ("Accept", "application/json");
    ("Authorization", "Bearer YOUR_API_KEY");
    ("Content-Type", "application/json");
  ] in
  (* A quoted string literal {|...|} holds the JSON body without escaping. *)
  let body = Cohttp_lwt.Body.of_string {|{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}|} in
  Lwt_main.run (
    Cohttp_lwt_unix.Client.post ~body ~headers (Uri.of_string url)
    >>= fun (_resp, body) ->
    Cohttp_lwt.Body.to_string body >|= fun s -> print_endline s
  )

Dart

import 'package:http/http.dart' as http;
import 'dart:convert';

var headers = {
  "Accept": "application/json",
  "Authorization": "Bearer YOUR_API_KEY",
  "Content-Type": "application/json",
};
var body = json.encode({
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
});
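// await is only valid inside an async function, so the request runs in main.
Future<void> main() async {
  var response = await http.post(Uri.parse("https://api.quickrouter.ai/v1/chat/completions"), headers: headers, body: body);
  print(response.body);
}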

R (httr)

library(httr)

url <- "https://api.quickrouter.ai/v1/chat/completions"
body <- '{
  "model": "gpt-5-mini",
  "max_tokens": 1000,
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "你好"
    }
  ],
  "temperature": 1.0,
  "stream": true,
  "stream_options": {
    "include_usage": true
  }
}'
# httr exports the request function as POST (upper case).
response <- POST(url, body = body, add_headers("Accept" = "application/json", "Authorization" = "Bearer YOUR_API_KEY", "Content-Type" = "application/json"))
cat(content(response, "text", encoding = "UTF-8"))

Response

Response parameters 🟢 200 OK · application/json
id string
Required
object string
Required
created integer
Required
model string
Required
choices array [object]
Required
usage object
Required
Example
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-3.5-turbo-0613",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nHello there, how may I assist you today?"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    }
}
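
To show how the fields of the example response above fit together, here is a small Python sketch that pulls out the reply text, the finish reason, and the token count. `summarize_completion` is a hypothetical helper, and the sample data simply mirrors the example response.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract the first choice's text, its finish reason, and total token usage."""
    data = json.loads(raw)
    first = data["choices"][0]
    return {
        "text": first["message"]["content"],
        "finish_reason": first["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


# Sample data copied from the example response.
EXAMPLE = json.dumps({
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "gpt-3.5-turbo-0613",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nHello there, how may I assist you today?",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 9, "completion_tokens": 12, "total_tokens": 21},
})

if __name__ == "__main__":
    print(summarize_completion(EXAMPLE))
```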

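Streaming output scenarios for the OpenAI API relay

When an application needs to return content piece by piece, as ChatGPT does, it can use the Chat streaming completion endpoint described on this page. Through the QuickRouter API relay, servers in mainland China can call /v1/chat/completions directly and set stream: true to receive an SSE data stream.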
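
The SSE stream described above can be read with a sketch like the following. It relies only on what this page states: events arrive as data-only lines and the stream ends with data: [DONE]. The chunk shape in the sample (`choices[0].delta.content`) is an assumption based on the common OpenAI-compatible format and is not shown on this page; `iter_sse_chunks` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_sse_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from 'data:' lines until the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # the server signals the end of the stream
        yield json.loads(data)


# Assumed sample stream; real chunks carry more fields (id, model, created, ...).
SAMPLE_LINES = [
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    'data: {"ignored": true}',
]

if __name__ == "__main__":
    for chunk in iter_sse_chunks(SAMPLE_LINES):
        print(chunk["choices"][0]["delta"]["content"], end="", flush=True)
    print()
```

In a real client the lines would come from the HTTP response read incrementally, for example by iterating over the response object and decoding each line as UTF-8, rather than from a list.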

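Scenario recommendations
Chatbot: use stream: true and render each chunk on the frontend as it arrives.
AI writing tool: display text while it is being generated to reduce perceived waiting time.
Agent workflow: show model output, tool-call logs, and the final result in stages.
Backend of an application in mainland China: have the server call the QuickRouter API, then forward the SSE stream to the frontend.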
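
Two of the scenarios above, rendering chunks incrementally and forwarding SSE from a backend, can be sketched in Python as follows. Both helpers are hypothetical, and the chunk shape (`delta.content`, plus a final chunk with an empty `choices` list and a `usage` object when include_usage is set) is an assumption based on the common OpenAI-compatible format.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def collect_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[dict]]:
    """Join the text deltas and keep the usage object if a chunk carries one."""
    parts = []
    usage = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            piece = (choice.get("delta") or {}).get("content")
            if piece:
                parts.append(piece)
        if chunk.get("usage"):
            usage = chunk["usage"]
    return "".join(parts), usage


def relay_events(chunks: Iterable[dict]) -> Iterator[str]:
    """Re-emit each chunk as an SSE event for the frontend, then a [DONE] event."""
    for chunk in chunks:
        yield f"data: {json.dumps(chunk, ensure_ascii=False)}\n\n"
    yield "data: [DONE]\n\n"


# Assumed sample chunks, as a backend might receive them from the stream.
SAMPLE_CHUNKS = [
    {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]},
    {"choices": [{"index": 0, "delta": {"content": "你好"}}]},
    {"choices": [{"index": 0, "delta": {"content": "!"}, "finish_reason": None}]},
    {"choices": [], "usage": {"prompt_tokens": 9, "completion_tokens": 2, "total_tokens": 11}},
]

if __name__ == "__main__":
    text, usage = collect_stream(SAMPLE_CHUNKS)
    print(text, usage)
    for event in relay_events(SAMPLE_CHUNKS):
        print(event, end="")
```

Keeping the relay as a generator lets a web framework write each event to the client as soon as it is produced instead of buffering the whole reply.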