API Reference

Base URL

Every Squad Server exposes its API at:

https://<your-squad-id>.syndicai.dev/v1

Replace <your-squad-id> with the ID shown in your Squad Server dashboard.

Authentication

All API requests must include your API key as a Bearer token in the Authorization header:

Authorization: Bearer your-api-key

Generate API keys from your Squad Server dashboard. Each squad member should use their own key for individual usage tracking.
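One common pattern for keeping per-member keys out of source code (an assumption about your setup, not a server requirement; the variable name SYNDICAI_API_KEY is hypothetical) is to read each member's key from an environment variable:

```python
import os

# Hypothetical variable name; any name works as long as each member
# exports their own personal key before running the client.
api_key = os.environ.get("SYNDICAI_API_KEY", "your-api-key")
headers = {"Authorization": f"Bearer {api_key}"}
```

The resulting headers dict can be passed to any HTTP client, and usage is then tracked per member because each member's process carries a different key.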

Endpoints

POST /v1/chat/completions

Create a chat completion. This is the primary endpoint for all coding interactions.

Request body:

Field        Type             Required  Description
model        string           Yes       Model identifier (e.g., minimax-m2.5)
messages     array            Yes       Array of message objects with role and content
temperature  number           No        Sampling temperature (0.0–2.0, default: 0.7)
max_tokens   number           No        Maximum tokens to generate
stream       boolean          No        Enable streaming responses (default: false)
top_p        number           No        Nucleus sampling parameter (default: 1.0)
stop         string or array  No        Stop sequences

Example request (curl):

curl -X POST https://squad-abc123.syndicai.dev/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-m2.5",
    "messages": [
      {"role": "system", "content": "You are a senior software engineer."},
      {"role": "user", "content": "Write a TypeScript function that debounces an async function and returns the result of the latest call."}
    ],
    "temperature": 0.3,
    "max_tokens": 2048
  }'

Example response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1741900800,
  "model": "minimax-m2.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Here's a TypeScript debounce function for async operations:\n\n```typescript\nfunction debounceAsync<T extends (...args: any[]) => Promise<any>>(\n  fn: T,\n  delay: number\n): (...args: Parameters<T>) => Promise<ReturnType<T>> {\n  let timeoutId: ReturnType<typeof setTimeout> | null = null;\n  let latestResolve: ((value: any) => void) | null = null;\n\n  return (...args: Parameters<T>) => {\n    return new Promise((resolve) => {\n      if (timeoutId) clearTimeout(timeoutId);\n      latestResolve = resolve;\n\n      timeoutId = setTimeout(async () => {\n        const result = await fn(...args);\n        if (latestResolve === resolve) {\n          resolve(result);\n        }\n      }, delay);\n    });\n  };\n}\n```"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 42,
    "completion_tokens": 186,
    "total_tokens": 228
  }
}
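If you are calling the endpoint without an SDK, the reply and token usage can be pulled straight out of the response JSON. A minimal sketch against the field names shown in the example response above (extract_reply is a hypothetical helper, not part of the API):

```python
def extract_reply(response: dict) -> tuple[str, int]:
    """Return the first choice's message content and the total token
    count from a /v1/chat/completions response body."""
    content = response["choices"][0]["message"]["content"]
    total_tokens = response["usage"]["total_tokens"]
    return content, total_tokens

# Demo with a trimmed-down response body:
body = {
    "choices": [{"index": 0, "message": {"role": "assistant", "content": "Hello!"}}],
    "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
}
text, tokens = extract_reply(body)
print(text, tokens)  # Hello! 7
```

Production code should also check finish_reason: a value of "length" means the reply was cut off by max_tokens.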

GET /v1/models

List available models on your Squad Server.

Example request:

curl https://squad-abc123.syndicai.dev/v1/models \
  -H "Authorization: Bearer your-api-key"

Example response:

{
  "object": "list",
  "data": [
    {
      "id": "minimax-m2.5",
      "object": "model",
      "owned_by": "syndicai"
    }
  ]
}

Streaming

Set "stream": true to receive Server-Sent Events (SSE) as the model generates tokens:

curl -X POST https://squad-abc123.syndicai.dev/v1/chat/completions \
  -H "Authorization: Bearer your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-m2.5",
    "messages": [{"role": "user", "content": "Explain quicksort in one paragraph"}],
    "stream": true
  }'

Each SSE event contains a delta with the incremental content:

data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"Quick"},"index":0}]}
data: {"id":"chatcmpl-abc","choices":[{"delta":{"content":"sort"},"index":0}]}
...
data: [DONE]

Code examples

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://squad-abc123.syndicai.dev/v1",
    api_key="your-api-key",
)

# Non-streaming
response = client.chat.completions.create(
    model="minimax-m2.5",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write tests for a React useDebounce hook"},
    ],
    temperature=0.3,
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="minimax-m2.5",
    messages=[{"role": "user", "content": "Explain the visitor pattern"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://squad-abc123.syndicai.dev/v1",
  apiKey: "your-api-key",
});

const response = await client.chat.completions.create({
  model: "minimax-m2.5",
  messages: [
    { role: "system", content: "You are a senior TypeScript developer." },
    { role: "user", content: "Refactor this function to use generics" },
  ],
  temperature: 0.3,
});

console.log(response.choices[0].message.content);

Rate limiting

syndicAI Squad Servers have no token-based rate limits. Your server processes requests as fast as the GPU allows. The only limit is your tier's daily GPU-hours — once reached, the server auto-stops until the next day.

During active hours, the vLLM engine handles concurrent requests from all squad members efficiently via continuous batching. Typical throughput: 80–120 tokens/second for generation, depending on the model and GPU configuration.

Error codes

Status  Code                 Description
401     invalid_api_key      API key is missing, invalid, or revoked
403     squad_not_member     Your API key is not associated with this squad
404     model_not_found      The requested model is not loaded on this server
429     server_busy          Server is at capacity; retry after a short delay
503     server_stopped       Squad Server is stopped (daily hours exhausted or manually stopped)
503     server_provisioning  Squad Server is still starting up
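The 429 and 503 responses are transient, so clients typically retry them with exponential backoff, while 401/403/404 should fail immediately since retrying cannot fix a bad key or a missing model. A sketch of the retry policy only (the delay schedule, cap, and attempt limit are illustrative choices, not values specified by the server):

```python
RETRYABLE = {429, 503}  # server_busy, server_stopped, server_provisioning

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base * 2^n seconds per attempt, capped."""
    return [min(base * (2 ** n), cap) for n in range(retries)]

def should_retry(status: int, attempt: int, max_attempts: int = 5) -> bool:
    """Retry only transient statuses, and only up to max_attempts."""
    return status in RETRYABLE and attempt < max_attempts

print(backoff_delays(5))  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

In a request loop, sleep for the next delay from the schedule whenever should_retry returns True; adding a little random jitter to each delay avoids squad members retrying in lockstep.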