API Reference
Base URL
Every Squad Server exposes its API at:
```
https://<your-squad-id>.syndicai.dev/v1
```

Replace `<your-squad-id>` with the ID shown in your Squad Server dashboard.
Authentication
All API requests require a Bearer token using your API key:
```
Authorization: Bearer your-api-key
```
Generate API keys from your Squad Server dashboard. Each squad member should use their own key for individual usage tracking.
Endpoints
POST /v1/chat/completions
Create a chat completion. This is the primary endpoint for all coding interactions.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model identifier (e.g., `minimax-m2.5`) |
| `messages` | array | Yes | Array of message objects with `role` and `content` |
| `temperature` | number | No | Sampling temperature (0.0–2.0, default: 0.7) |
| `max_tokens` | number | No | Maximum tokens to generate |
| `stream` | boolean | No | Enable streaming responses (default: false) |
| `top_p` | number | No | Nucleus sampling parameter (default: 1.0) |
| `stop` | string or array | No | Stop sequences |
Example request (curl):
```bash
curl -X POST https://squad-abc123.syndicai.dev/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-m2.5",
    "messages": [
      {"role": "system", "content": "You are a senior software engineer."},
      {"role": "user", "content": "Write a TypeScript function that debounces an async function and returns the result of the latest call."}
    ],
    "temperature": 0.3,
    "max_tokens": 2048
  }'
```
Example response:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1741900800,
  "model": "minimax-m2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a TypeScript debounce function for async operations:\n\n```typescript\nfunction debounceAsync<T extends (...args: any[]) => Promise<any>>(\n fn: T,\n delay: number\n): (...args: Parameters<T>) => Promise<ReturnType<T>> {\n let timeoutId: ReturnType<typeof setTimeout> | null = null;\n let latestResolve: ((value: any) => void) | null = null;\n\n return (...args: Parameters<T>) => {\n return new Promise((resolve) => {\n if (timeoutId) clearTimeout(timeoutId);\n latestResolve = resolve;\n\n timeoutId = setTimeout(async () => {\n const result = await fn(...args);\n if (latestResolve === resolve) {\n resolve(result);\n }\n }, delay);\n });\n };\n}\n```"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 186,
    "total_tokens": 228
  }
}
```
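Most clients only need two fields from this payload: the assistant's text and the token accounting. A minimal sketch of extracting them from a parsed response, using a trimmed-down copy of the example above (the `content` string is abbreviated here):

```python
import json

# A trimmed-down chat completion response with the fields clients usually read.
raw = """{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Here's a TypeScript debounce function for async operations: ..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 186, "total_tokens": 228}
}"""

resp = json.loads(raw)
answer = resp["choices"][0]["message"]["content"]   # the generated text
tokens_used = resp["usage"]["total_tokens"]          # billed/generated token count
print(tokens_used)  # → 228
```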
GET /v1/models
List available models on your Squad Server.
Example request:
```bash
curl https://squad-abc123.syndicai.dev/v1/models \
  -H "Authorization: Bearer your-api-key"
```
Example response:
```json
{
  "object": "list",
  "data": [
    {
      "id": "minimax-m2.5",
      "object": "model",
      "owned_by": "syndicai"
    }
  ]
}
```
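Client-side, this list is often flattened to bare model IDs (for example, to pick a model before sending a completion request). A minimal sketch parsing the example payload above with the standard library:

```python
import json

# The example /v1/models payload from above.
models_response = """{
  "object": "list",
  "data": [
    {"id": "minimax-m2.5", "object": "model", "owned_by": "syndicai"}
  ]
}"""

payload = json.loads(models_response)
model_ids = [m["id"] for m in payload["data"]]  # IDs usable in the "model" field
print(model_ids)  # → ['minimax-m2.5']
```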
Streaming
Set "stream": true to receive Server-Sent Events (SSE) as the model generates tokens:
```bash
curl -X POST https://squad-abc123.syndicai.dev/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-m2.5",
    "messages": [{"role": "user", "content": "Explain quicksort in one paragraph"}],
    "stream": true
  }'
```
Each SSE event contains a delta with the incremental content:
```
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Quick"},"index":0}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"sort"},"index":0}]}
...
data: [DONE]
```
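If you are not using an SDK that handles SSE for you, the deltas can be reassembled by hand: read each `data:` line, stop at `[DONE]`, and concatenate the `content` fragments. A minimal sketch (the `accumulate_sse` helper is illustrative, not part of any SDK):

```python
import json

def accumulate_sse(lines):
    """Concatenate delta content from raw SSE 'data:' lines into one string."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments/blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # sentinel marking the end of the stream
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        # Some deltas (e.g. the initial role-only chunk) carry no content.
        parts.append(delta.get("content") or "")
    return "".join(parts)

# The example events from above:
events = [
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Quick"},"index":0}]}',
    'data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"sort"},"index":0}]}',
    'data: [DONE]',
]
print(accumulate_sse(events))  # → Quicksort
```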
Code examples
Python
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://squad-abc123.syndicai.dev/v1",
    api_key="your-api-key",
)

# Non-streaming
response = client.chat.completions.create(
    model="minimax-m2.5",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write tests for a React useDebounce hook"},
    ],
    temperature=0.3,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="minimax-m2.5",
    messages=[{"role": "user", "content": "Explain the visitor pattern"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
TypeScript
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://squad-abc123.syndicai.dev/v1",
  apiKey: "your-api-key",
});

const response = await client.chat.completions.create({
  model: "minimax-m2.5",
  messages: [
    { role: "system", content: "You are a senior TypeScript developer." },
    { role: "user", content: "Refactor this function to use generics" },
  ],
  temperature: 0.3,
});

console.log(response.choices[0].message.content);
```
Rate limiting
syndicAI Squad Servers have no token-based rate limits. Your server processes requests as fast as the GPU allows. The only limit is your tier's daily GPU-hours — once reached, the server auto-stops until the next day.
During active hours, the vLLM engine handles concurrent requests from all squad members efficiently via continuous batching. Typical throughput: 80–120 tokens/second for generation, depending on the model and GPU configuration.
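When the server does push back (a 429 while at capacity, or a 503 while provisioning), the usual client-side pattern is exponential backoff. A minimal sketch of such a retry wrapper; `RetryableError` is a stand-in for whatever exception your HTTP client raises on those statuses, and the helper itself is illustrative, not part of any SDK:

```python
import time

class RetryableError(Exception):
    """Stand-in for a 429 server_busy or 503 server_provisioning response."""

def with_retries(call, max_attempts=5, base_delay=0.5):
    """Invoke `call`, retrying with exponential backoff on RetryableError."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RetryableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...

# Example: a call that fails twice before succeeding.
attempts = {"n": 0}

def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RetryableError("429 server_busy")
    return "ok"

print(with_retries(flaky_call, base_delay=0.01))  # → ok
```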
Error codes
| Status | Code | Description |
|---|---|---|
| 401 | `invalid_api_key` | API key is missing, invalid, or revoked |
| 403 | `squad_not_member` | Your API key is not associated with this squad |
| 404 | `model_not_found` | The requested model is not loaded on this server |
| 429 | `server_busy` | Server is at capacity; retry after a short delay |
| 503 | `server_stopped` | Squad Server is stopped (daily hours exhausted or manually stopped) |
| 503 | `server_provisioning` | Squad Server is still starting up |
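One way to act on these codes is to split them into retryable and fatal groups. The grouping below is an interpretation of the descriptions in the table, not an official SDK constant:

```python
# Transient: back off briefly and retry.
RETRY_SOON = {"server_busy", "server_provisioning"}
# The server is stopped (e.g. daily GPU-hours exhausted); retrying soon won't help.
RETRY_LATER = {"server_stopped"}
# Client-side problems: fix the key, squad membership, or model name instead.
FATAL = {"invalid_api_key", "squad_not_member", "model_not_found"}

def should_retry_soon(code: str) -> bool:
    """True for error codes worth retrying after a short backoff."""
    return code in RETRY_SOON

print(should_retry_soon("server_busy"))      # → True
print(should_retry_soon("invalid_api_key"))  # → False
```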