API reference

The Zylo API

One OpenAI-compatible endpoint for Claude, GPT, Gemini, DeepSeek and more. This is the full Zylo API reference — authentication, base URL, endpoints, streaming, tool calling, web-extract, rate limits per plan and error codes — with copy-paste examples in Python, JavaScript, Go, Ruby and PHP.

Base URL: https://api.zyloai.net/v1 · Auth: Bearer token · OpenAI-compatible · 40+ models

Authentication

Every Zylo API request is authenticated with your API key.

Create a key at console.zyloai.net — it is issued on the free Basic plan, no credit card. Send it on every request, either as an Authorization: Bearer header (recommended — this is what the OpenAI SDKs send) or as an X-API-Key header. Keep keys server-side; you can rotate or replace a key from the console at any time.

Terminal
# Recommended — Bearer header (works with every OpenAI SDK)
curl https://api.zyloai.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "claude-opus-4.8", "messages": [{"role":"user","content":"Hi"}] }'

# Also accepted — X-API-Key header
curl https://api.zyloai.net/v1/chat/completions \
  -H "X-API-Key: YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-5.5", "messages": [{"role":"user","content":"Hi"}] }'
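The same two header schemes can be assembled in Python before handing them to any HTTP client. This is a minimal sketch: the helper name `auth_headers` and the `scheme` values are hypothetical and not part of any Zylo SDK; only the header names come from the text above.

```python
def auth_headers(api_key: str, scheme: str = "bearer") -> dict:
    """Build request headers for one of the two documented auth schemes."""
    if not api_key:
        raise ValueError("api_key must not be empty")
    headers = {"Content-Type": "application/json"}
    if scheme == "bearer":
        # Recommended form; this is what the OpenAI SDKs send.
        headers["Authorization"] = f"Bearer {api_key}"
    elif scheme == "x-api-key":
        # Alternative form accepted by the API.
        headers["X-API-Key"] = api_key
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return headers


print(auth_headers("YOUR_ZYLO_KEY"))
```

In a real service the key would normally be read from an environment variable or secret store rather than written into source.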

Endpoints

All paths are relative to the base URL https://api.zyloai.net/v1.

| Method | Path | What it does |
| --- | --- | --- |
| POST | /v1/chat/completions | OpenAI-compatible chat completions. Supports stream, tools and multimodal input. The main endpoint for most apps. |
| POST | /generate | Native generation endpoint with attachments and tools, returning a flat message plus usage, latency and credits. |
| POST | /v1/web-extract | Scrape and extract clean text from a URL for RAG. 10 requests/minute per key. |
| POST | /v1/images/generations | Generate or edit images with supported standard and premium models. |
| GET | /v1/models | List the models your key can call on its current plan. |
| GET | /stats | Your current credit balance, request limits and usage history. |
| GET | /validate | Check whether a key is valid and return its plan capabilities. |

A standard 200 OK chat completion returns an OpenAI-shaped object with id, choices[].message, finish_reason and a usage block (prompt_tokens, completion_tokens, total_tokens). See the full documentation for every field.
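To make that response shape concrete, here is a small sketch that pulls the commonly used fields out of a completion. The sample payload is invented for illustration, and placing finish_reason inside each choices[] entry follows the OpenAI convention, which is an assumption here rather than something this page spells out.

```python
import json

# Illustrative payload only: the values are made up; the shape follows the
# fields named above (id, choices[].message, finish_reason, usage).
SAMPLE = json.loads("""
{
  "id": "chatcmpl-example",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello!"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7}
}
""")


def summarize(completion: dict) -> dict:
    """Return the reply text, finish reason and token total of a completion."""
    choice = completion["choices"][0]
    usage = completion.get("usage", {})
    return {
        "id": completion["id"],
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }


print(summarize(SAMPLE))
```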

Core request parameters

Sent in the JSON body of a chat completions or generate request.

| Parameter | Type | Notes |
| --- | --- | --- |
| model | string | Required. A bare model id, e.g. claude-opus-4.8, gpt-5.5, gemini-3.1-pro-preview. |
| messages | array | Required. Conversation history of {role, content} objects. |
| temperature | float | Sampling temperature, 0–1. Default 0.7. |
| max_tokens | integer | Max new tokens to generate. Alias of max_new_tokens. |
| top_p | float | Nucleus sampling, 0–1. Default 1.0. |
| frequency_penalty | float | Reduce repetition, 0–2. Default 0. |
| presence_penalty | float | Encourage new topics, 0–2. Default 0. |
| response_format | string | Set to "json_object" to force valid JSON output. |
| stream | boolean | Stream tokens as server-sent events. Default false. |
| tools | array | Function definitions and/or native tools the model may call. Paid plans only. |
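Checking the documented ranges on the client side avoids a round trip that would end in a 400. The sketch below builds a request body using the defaults and ranges from the table; the function name `build_chat_payload` is hypothetical, and whether the server rejects out-of-range values in exactly this way is an assumption.

```python
def build_chat_payload(model, messages, *, temperature=0.7, top_p=1.0,
                       frequency_penalty=0.0, presence_penalty=0.0,
                       max_tokens=None, stream=False):
    """Validate parameters against the documented ranges and return a JSON-ready dict."""
    def check(name, value, low, high):
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required")
    check("temperature", temperature, 0, 1)
    check("top_p", top_p, 0, 1)
    check("frequency_penalty", frequency_penalty, 0, 2)
    check("presence_penalty", presence_penalty, 0, 2)

    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "stream": stream,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens  # omitted when unset
    return payload


print(build_chat_payload("gpt-5.5", [{"role": "user", "content": "Hi"}], max_tokens=64))
```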

Streaming responses

Set stream: true to receive tokens as server-sent events in the OpenAI delta format, ending with a data: [DONE] line.

Python
# pip install openai
from openai import OpenAI

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

stream = client.chat.completions.create(
    model="claude-opus-4.8",
    messages=[{"role": "user", "content": "Write a haiku about streaming."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
Node.js
// npm install openai
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "YOUR_ZYLO_KEY", baseURL: "https://api.zyloai.net/v1" });

const stream = await client.chat.completions.create({
  model: "claude-opus-4.8",
  messages: [{ role: "user", content: "Write a haiku about streaming." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
Terminal
curl https://api.zyloai.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4.8",
    "messages": [{"role": "user", "content": "Write a haiku about streaming."}],
    "stream": true
  }'
# -> data: {"choices":[{"delta":{"content":"..."}}]}  ... ends with  data: [DONE]
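If you are not using an SDK, the event stream can be parsed by hand. The following sketch reads already-decoded lines, collects delta content, and stops at the [DONE] sentinel. The sample lines are invented, and the function name `iter_sse_text` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_sse_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style 'data:' lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_text(sample)))  # prints Hello
```

With a real HTTP client you would pass its line iterator (decoded to str) in place of `sample`.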

Tool / function calling

Pass a tools array so the model can call your functions, or attach Zylo's native tools. Available on paid plans (not Basic).

Python
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "What's the weather in Lima?"}],
    tools=tools,
)
# resp.choices[0].message.tool_calls -> call get_weather, then send the result back
JSON body
{
  "model": "gemini-3.1-pro-preview",
  "messages": [{"role": "user", "content": "Summarise today's AI news."}],
  "tools": [
    { "type": "function", "function": { "name": "google_search" } },
    { "type": "function", "function": { "name": "url_context" } },
    { "type": "function", "function": { "name": "code_execution" } }
  ]
}
# Native tools share platform-wide limits (see Rate limits below).
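The step "call get_weather, then send the result back" can be sketched as a small dispatcher. It assumes the response follows the OpenAI tool-call shape (each call has an id and a function with a name and JSON-encoded arguments) and that results go back as messages with role "tool". The `get_weather` body is a stub, and the names `REGISTRY` and `run_tool_calls` are hypothetical.

```python
import json


def get_weather(city: str) -> dict:
    # Stub: a real implementation would query a weather service.
    return {"city": city, "forecast": "unknown"}


REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Execute each requested call and return the 'tool' messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        fn = REGISTRY.get(call["function"]["name"])
        args = json.loads(call["function"]["arguments"] or "{}")
        output = fn(**args) if fn else {"error": "unknown tool"}
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # ties the result to the request
            "content": json.dumps(output),
        })
    return results


# Invented example of an assistant turn that requests one tool call.
assistant = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Lima"}'},
    }],
}
follow_up = [
    {"role": "user", "content": "What's the weather in Lima?"},
    assistant,
    *run_tool_calls(assistant),
]
print(follow_up[-1])
```

Sending `follow_up` as the messages of a second chat completion lets the model turn the tool output into a final answer.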

Web-extract for RAG

The /v1/web-extract endpoint pulls clean text from a URL so you can ground answers in real sources — no separate scraping stack. Limited to 10 requests/minute per key.

Python
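# 1) Extract clean content from the web
import requests
from openai import OpenAI

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

extract = requests.post(
    "https://api.zyloai.net/v1/web-extract",
    headers={"Authorization": "Bearer YOUR_ZYLO_KEY"},
    json={"url": "https://example.com/article"},
    timeout=30,
).json()  # parsed JSON response body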

# 2) Ground a normal chat completion in the extracted text
answer = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[
        {"role": "system", "content": "Answer using only the provided sources."},
        {"role": "user", "content": f"Sources:\n{extract}\n\nQuestion: What changed?"},
    ],
)
print(answer.choices[0].message.content)
Building retrieval pipelines? See the deeper walkthrough in Build a RAG pipeline with built-in web-extract.
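Because web-extract is capped at 10 requests/minute per key, a batch job that extracts many URLs benefits from pacing itself instead of running into 429 responses. The sketch below is a client-side sliding-window limiter; the class name `MinuteLimiter` is hypothetical, and how the server actually counts its window is not documented here, so treat the 60-second window as an assumption.

```python
import time
from collections import deque


class MinuteLimiter:
    """Allow at most max_calls within any rolling window of `window` seconds."""

    def __init__(self, max_calls=10, window=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self) -> float:
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()  # drop calls that left the window
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window - (now - self.calls[0])

    def acquire(self, sleep=time.sleep) -> None:
        """Block until a slot is free, then record the call."""
        while (delay := self.wait_time()) > 0:
            sleep(delay)
        self.calls.append(self.clock())


limiter = MinuteLimiter(max_calls=10, window=60.0)
limiter.acquire()  # call this right before each web-extract request
print(limiter.wait_time())
```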

Multimodal / image input

Send images alongside text using the OpenAI content-parts format. Multimodal input requires a paid plan (Basic is text-only).

Python
resp = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "data:image/jpeg;base64,..."}},
        ],
    }],
)
print(resp.choices[0].message.content)
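The data: URL in the request above is the image bytes, base64-encoded, with a MIME type prefix. A minimal sketch of building that content part from raw bytes follows; the helper name `image_part` is hypothetical, and any size or format limits the API applies are not covered.

```python
import base64
import mimetypes


def image_part(data: bytes, filename: str) -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image type from {filename!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }


# Three bytes standing in for a real file read with open(path, "rb").read()
part = image_part(b"\xff\xd8\xff", "photo.jpg")
print(part["image_url"]["url"])
```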

Error codes & retries

The Zylo API uses standard HTTP status codes.

| Status | Meaning | What to do |
| --- | --- | --- |
| 200 | OK | Request completed successfully. |
| 400 | Bad Request | Missing parameters or invalid JSON; fix the body. |
| 401 | Unauthorized | Invalid or missing API key; check the header. |
| 402 | Payment Required | Out of credits; top up to keep using premium models. |
| 403 | Forbidden | Plan limit reached or model not on your plan; upgrade. |
| 429 | Too Many Requests | Rate limit exceeded; back off and retry (see below). |
| 500 | Server Error | Transient internal error; retry with backoff. |

Retry 429 and 5xx responses with exponential backoff. Do not retry 400/401/403 — fix the request instead.

Python
import time
from openai import OpenAI, APIStatusError

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

def chat_with_retry(messages, model="claude-opus-4.8", retries=4):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIStatusError as e:
            # Retry only on rate limits / server errors
            if e.status_code in (429, 500, 502, 503) and attempt < retries - 1:
                time.sleep(2 ** attempt)  # 1s, 2s, 4s (the final attempt re-raises instead of sleeping)
                continue
            raise
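When many workers hit a limit at once, fixed exponential delays make them all retry at the same moment. A common refinement is "full jitter": sleep for a random time between zero and the exponential ceiling. This is a swapped-in technique, not something the Zylo documentation prescribes, and the names `should_retry` and `backoff_delay` as well as the 30-second cap are assumptions.

```python
import random


def should_retry(status: int) -> bool:
    """Retry only rate limits (429) and server errors (5xx)."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random.random) -> float:
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt))."""
    ceiling = min(cap, base * 2 ** attempt)
    return rng() * ceiling


for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 2))
```

Inside a retry loop, `time.sleep(backoff_delay(attempt))` would take the place of the fixed `2 ** attempt` delay.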

SDKs & languages

Because the Zylo API is OpenAI-compatible, any OpenAI SDK or plain HTTP client works — just repoint the base URL. Here it is in Go, Ruby and PHP.

Go
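package main

import (
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    body := []byte(`{"model":"claude-opus-4.8","messages":[{"role":"user","content":"Hello from Zylo!"}]}`)
    req, err := http.NewRequest("POST", "https://api.zyloai.net/v1/chat/completions", bytes.NewReader(body))
    if err != nil {
        log.Fatal(err)
    }
    req.Header.Set("Authorization", "Bearer YOUR_ZYLO_KEY")
    req.Header.Set("Content-Type", "application/json")
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err) // resp is nil on error, so check before touching resp.Body
    }
    defer resp.Body.Close()
    raw, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(resp.Status, string(raw)) // or decode raw into your struct
}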
Ruby
require "net/http"
require "json"

uri = URI("https://api.zyloai.net/v1/chat/completions")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

req = Net::HTTP::Post.new(uri)
req["Authorization"] = "Bearer YOUR_ZYLO_KEY"
req["Content-Type"]  = "application/json"
req.body = {
  model: "claude-opus-4.8",
  messages: [{ role: "user", content: "Hello from Zylo!" }]
}.to_json

res = http.request(req)
puts JSON.parse(res.body).dig("choices", 0, "message", "content")
PHP
<?php
$ch = curl_init("https://api.zyloai.net/v1/chat/completions");
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => [
        "Authorization: Bearer YOUR_ZYLO_KEY",
        "Content-Type: application/json",
    ],
    CURLOPT_POSTFIELDS => json_encode([
        "model" => "claude-opus-4.8",
        "messages" => [["role" => "user", "content" => "Hello from Zylo!"]],
    ]),
]);
$data = json_decode(curl_exec($ch), true);
echo $data["choices"][0]["message"]["content"];
Prefer the official SDK? Python and JavaScript work unchanged — see the quick start for the two-line setup.

Rate limits per plan

Plans set your daily request and token limits; usage is billed separately from prepaid credits at base per-token rates.

| Plan | Requests | Daily tokens | Models & extras |
| --- | --- | --- | --- |
| Basic · $0 | 7.2k/day · 10/min | 200k | Basic models only · text only · no credits |
| Go · $10 | 28.8k/day | 512k | Premium models · web search · $10 credits |
| Pro · $50 | 43.2k/day | 1M | Code execution · $50 credits |
| Mega · $200 | 86.4k/day | 5M | Priority access · $200 credits |
| Enterprise · $400 | Unlimited | Unlimited | Dedicated GPU · $400 credits |

Tool & endpoint limits are enforced separately and platform-wide: web-extract is 10 req/min per key; the native web search and URL context tools share a combined 10 req/min budget; code execution has its own 20 req/min budget. Exceeding any limit returns 429. Full breakdown on the pricing page.
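The daily request caps are easier to reason about as a sustained per-minute rate. The short calculation below divides each cap by the 1,440 minutes in a day; it assumes the allowance is counted over a 24-hour period, which this page does not state explicitly.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440

# Daily request caps from the plan table above.
DAILY_REQUESTS = {"Basic": 7_200, "Go": 28_800, "Pro": 43_200, "Mega": 86_400}


def sustained_per_minute(per_day: int) -> float:
    """Average requests/minute that exactly uses up a daily cap."""
    return per_day / MINUTES_PER_DAY


for plan, per_day in DAILY_REQUESTS.items():
    print(f"{plan}: {sustained_per_minute(per_day):g} requests/minute sustained")

# Basic also has a 10/min burst cap; at that pace the daily cap lasts:
print(DAILY_REQUESTS["Basic"] / 10 / 60, "hours")
```

The averages come out to 5, 20, 30 and 60 requests/minute. On Basic, the 10/min burst limit is twice the sustained average, so a client running flat out uses its whole daily allowance in 12 hours.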

Switching to the Zylo API

You keep your OpenAI-compatible client. Repoint the base URL, use your Zylo key, and pick a model id. Here is the before/after.

From OpenAI

Before · OpenAI
Python
client = OpenAI(
    api_key="OPENAI_KEY",
)
model = "gpt-5.5"
After · Zylo
Python
client = OpenAI(
    api_key="YOUR_ZYLO_KEY",
    base_url="https://api.zyloai.net/v1",
)
model = "gpt-5.5"  # or claude-opus-4.8, gemini-3.1-pro...

From OpenRouter

Before · OpenRouter
Python
client = OpenAI(
    api_key="OPENROUTER_KEY",
    base_url="https://openrouter.ai/api/v1",
)
model = "anthropic/claude-opus-4.8"
After · Zylo
Python
client = OpenAI(
    api_key="YOUR_ZYLO_KEY",
    base_url="https://api.zyloai.net/v1",
)
model = "claude-opus-4.8"  # drop the vendor/ prefix
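If model ids are scattered through an OpenRouter codebase, the prefix change can be applied in one place. This is a sketch only: the helper name `to_zylo_model` is hypothetical, and it assumes the id that remains after dropping the vendor/ prefix exists in the Zylo catalogue, which you should confirm with GET /v1/models.

```python
def to_zylo_model(model_id: str) -> str:
    """Drop a leading 'vendor/' prefix; leave bare ids unchanged."""
    _vendor, sep, bare = model_id.partition("/")
    return bare if sep else model_id


print(to_zylo_model("anthropic/claude-opus-4.8"))  # prints claude-opus-4.8
print(to_zylo_model("gpt-5.5"))                    # prints gpt-5.5
```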

Model changelog

Recent additions to the Zylo API catalogue. The models page is always the live, complete list.

New models appear here as they ship. For the authoritative, real-time list and pricing, query GET /v1/models or browse all models.

Reliability & status

Operational facts you can verify, not marketing numbers.

Live status page

Per-model availability and incident history are published at status.zyloai.net, updated by an automated prober.

Per-response metrics

Every generation returns a real latency value and a usage token count, so you measure throughput from your own traffic — no guessing.
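Turning those two numbers into a throughput figure is a single division. The sketch below takes the completion token count and the latency as plain arguments, because the exact field name and unit of the latency value are not given on this page; seconds are assumed here.

```python
def tokens_per_second(completion_tokens: int, latency_seconds: float) -> float:
    """Generated tokens divided by wall-clock latency (assumed to be in seconds)."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return completion_tokens / latency_seconds


# Invented numbers: 120 generated tokens in 2.0 seconds.
print(tokens_per_second(120, 2.0))  # prints 60.0
```

If the API reports latency in milliseconds, divide by 1,000 before calling the function.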

Zero data retention

Prompts and completions are never stored. Zylo is a passthrough to the providers; only numeric usage is logged for billing.

Throughput by plan

Sustained capacity scales with your plan — up to 5M tokens/day on Mega, and unlimited requests and tokens on Enterprise.

Zylo API FAQ

The questions developers ask most about the Zylo API.

How do I get a Zylo API key?

Sign up at console.zyloai.net and your API key is created on the Basic plan with no credit card. Copy it from the console, send it as an Authorization: Bearer header, and you can call any Basic-tier model immediately. Upgrade to a paid plan to unlock premium models and credits. You can rotate or replace the key from the console at any time.

What is the Zylo API base URL?

The Zylo API base URL is https://api.zyloai.net/v1. Point any OpenAI-compatible SDK at that base URL, use your Zylo API key, and call /chat/completions with a bare model id such as claude-opus-4.8 or gpt-5.5.

Is the Zylo API OpenAI-compatible?

Yes. The Zylo API implements the OpenAI Chat Completions schema, so the official OpenAI SDKs work unchanged — set base_url to https://api.zyloai.net/v1 and use your Zylo key. Request and response shapes, streaming and tool calling all match.

Does the Zylo API support streaming?

Yes. Set "stream": true on a chat completions request and the Zylo API returns tokens as server-sent events in the OpenAI delta format, terminated by a data: [DONE] line. Any OpenAI-compatible streaming client works.

Does the Zylo API support function (tool) calling?

Yes. Pass a tools array of function definitions and the model can return tool calls, exactly like the OpenAI API. Zylo also exposes native tools — web search, URL context and a code-execution sandbox. Tool calling is available on paid plans, not the free Basic plan.

What are the Zylo API rate limits?

Rate limits depend on your plan: Basic allows 7,200 requests/day and 10 requests/minute with 200k daily tokens; Go 28,800/day with 512k tokens; Pro 43,200/day with 1M tokens; Mega 86,400/day with 5M tokens; Enterprise is unlimited. Exceeding a limit returns 429. The web-extract endpoint is limited to 10 requests/minute per key, and native tools share platform-wide limits.

Is the Zylo API free?

The Basic plan is free with no credit card and includes a daily token and request allowance on Basic-tier models. Premium models require a paid plan and prepaid credits; usage is billed at each model's base per-token rate with no markup, and a flat 25% platform fee applies only when you add credits.

Which models does the Zylo API support?

Frontier and cost-efficient models from seven providers — Anthropic (Claude), OpenAI (GPT), Google (Gemini), DeepSeek, Qwen, MiniMax and Moonshot (Kimi). Call GET /v1/models for the live list your key can access, or see the models page for pricing.

How do I switch from OpenAI or OpenRouter to the Zylo API?

Keep your existing OpenAI-compatible client. Change base_url to https://api.zyloai.net/v1 and use your Zylo key. Coming from OpenAI, just pick a Zylo model id; coming from OpenRouter, drop the vendor/ prefix so anthropic/claude-opus-4.8 becomes claude-opus-4.8. See the OpenAI and OpenRouter guides.

What error codes does the Zylo API return?

Standard HTTP status codes: 200 OK, 400 bad request, 401 invalid or missing API key, 402 out of credits, 403 plan limit reached or model restricted, 429 rate limited, and 500 internal error. Retry 429 and 5xx with backoff; fix the request for the others.

Start building on the Zylo API

Free API key, OpenAI-compatible, 40+ models behind one base URL. No card required to start.
