Xantly
API Reference

Completions (Legacy)

Create text completions using the legacy prompt-based API. This endpoint translates requests into the chat completions pipeline internally, giving you access to the full Xantly routing engine while maintaining backward compatibility.

  • POST /v1/completions
  • Auth: Authorization: Bearer <token>
  • Drop-in compatible with the OpenAI legacy Completions API.

Note: This is a legacy endpoint. For new integrations, use Chat Completions instead — it supports the same models with a richer feature set (tool calling, multimodal input, structured output).


Quick start

curl -sS https://api.xantly.com/v1/completions \
  -H "Authorization: Bearer $XANTLY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "prompt": "The capital of France is"
  }'

Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Model slug or "auto" for intelligent routing. |
| prompt | string or array<string> | No | The prompt text. Multiple strings are joined with newlines. Defaults to the empty string. |
| max_tokens | integer | No | Maximum number of tokens to generate. |
| temperature | number | No | Sampling temperature (0.0 to 2.0). |
| top_p | number | No | Nucleus sampling parameter. |
| n | integer | No | Number of completions to generate. |
| stream | boolean | No | Not supported; returns a validation error. Use /v1/chat/completions for streaming. |
| logprobs | integer | No | Include log probabilities for the most likely tokens. |
| echo | boolean | No | Echo back the prompt in addition to the completion. |
| stop | string or array<string> | No | Stop sequences. |
| presence_penalty | number | No | Penalize new tokens based on their presence in the text so far (-2.0 to 2.0). |
| frequency_penalty | number | No | Penalize new tokens based on their frequency in the text so far (-2.0 to 2.0). |
| best_of | integer | No | Accepted for compatibility. |
| suffix | string | No | Accepted for compatibility. |
| user | string | No | End-user identifier. |
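Per the table above, a prompt supplied as an array of strings is joined with newlines, and an omitted prompt defaults to the empty string. A minimal sketch of that normalization (an illustrative helper, not part of any Xantly SDK; the server performs this itself):

```python
def normalize_prompt(prompt=None):
    """Mirror the documented prompt handling: an array of strings is
    joined with newlines; an omitted prompt defaults to "".

    Illustrative only -- this is how the endpoint interprets the
    "prompt" field, not code you need to run yourself.
    """
    if prompt is None:
        return ""
    if isinstance(prompt, list):
        return "\n".join(prompt)
    return prompt
```

So `["First line", "Second line"]` is treated the same as the single string `"First line\nSecond line"`.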

Response body

{
  "id": "chatcmpl-abc123",
  "object": "text_completion",
  "created": 1741400000,
  "model": "deepseek-chat",
  "choices": [
    {
      "text": " Paris, which is also the largest city in France.",
      "index": 0,
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 7,
    "completion_tokens": 12,
    "total_tokens": 19
  }
}

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique completion identifier. |
| object | string | Always "text_completion". |
| created | integer | Unix epoch timestamp. |
| model | string | Model that generated the completion. |
| choices | array | One choice per completion. |
| choices[].text | string | Generated text. If echo is true, includes the prompt. |
| choices[].index | integer | Choice index (0-based). |
| choices[].logprobs | object or null | Log probabilities, if requested. |
| choices[].finish_reason | string | "stop", "length", etc. |
| usage | object | Token counts for billing. |
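Two things a client typically does with these fields: collect the generated text in choice order, and check finish_reason to see whether the model stopped naturally or was cut off by max_tokens. A hedged sketch (the helper names are ours, not an SDK API):

```python
def extract_texts(response):
    """Return choice texts ordered by their 0-based index."""
    choices = sorted(response["choices"], key=lambda c: c["index"])
    return [c["text"] for c in choices]


def was_truncated(response):
    """True if any choice stopped because it hit the max_tokens limit
    (finish_reason == "length") rather than a natural stop."""
    return any(c["finish_reason"] == "length" for c in response["choices"])
```

With the sample response above, `extract_texts` returns the single Paris continuation and `was_truncated` returns False, since the choice finished with "stop".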

Code examples

Python:

import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XANTLY_API_KEY"],
    base_url="https://api.xantly.com/v1",
)

response = client.completions.create(
    model="auto",
    prompt="Once upon a time in a land far away,",
    max_tokens=50,
)
print(response.choices[0].text)

Node.js:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.XANTLY_API_KEY,
  baseURL: "https://api.xantly.com/v1",
});

const response = await client.completions.create({
  model: "auto",
  prompt: "Once upon a time in a land far away,",
  max_tokens: 50,
});
console.log(response.choices[0].text);

Errors

| HTTP | error.type | Typical trigger |
| --- | --- | --- |
| 400 | invalid_request_error | stream: true (not supported), missing model. |
| 401 | authentication_error | Missing or invalid Bearer token. |
| 402 | billing_error | Token quota or budget exceeded. |
| 429 | rate_limit_error | Rate limit exceeded. |
| 500 | internal_error | Provider error or internal failure. |
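Of the statuses above, 429 and 500 are usually transient, while 400, 401, and 402 indicate a request or account problem that retrying will not fix. A sketch of that client-side triage with exponential backoff (the specific retry policy is our assumption, not Xantly guidance):

```python
RETRYABLE_STATUSES = {429, 500}  # transient: rate limits, provider hiccups


def should_retry(status, attempt, max_attempts=4):
    """Retry only transient statuses, and give up after max_attempts tries."""
    return status in RETRYABLE_STATUSES and attempt < max_attempts


def backoff_delay(attempt, base=0.5, cap=8.0):
    """Exponential backoff: 0.5s, 1s, 2s, ... capped at 8s."""
    return min(cap, base * (2 ** attempt))
```

A 401 or 402 should surface to the caller immediately; sleeping and resending the same request will only produce the same error.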
