API reference

The Zylo API

One OpenAI-compatible endpoint for Claude, GPT, Gemini, DeepSeek and more. This is the full Zylo API reference — authentication, base URL, endpoints, streaming, tool calling, web-extract, rate limits per plan and error codes — with copy-paste examples in Python, JavaScript, Go, Ruby and PHP.

Base URL: https://api.zyloai.net/v1 · Auth: Bearer token · OpenAI-compatible · 40+ models

Authentication

Every Zylo API request is authenticated with your API key.

Create a key at console.zyloai.net — it is issued on the free Basic plan, no credit card. Send it on every request, either as an Authorization: Bearer header (recommended — this is what the OpenAI SDKs send) or as an X-API-Key header. Keep keys server-side; you can rotate or replace a key from the console at any time.

Terminal

# Recommended — Bearer header (works with every OpenAI SDK)
curl https://api.zyloai.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "claude-opus-4.8", "messages": [{"role":"user","content":"Hi"}] }'

# Also accepted — X-API-Key header
curl https://api.zyloai.net/v1/chat/completions \
  -H "X-API-Key: YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-5.5", "messages": [{"role":"user","content":"Hi"}] }'
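The same two header styles can be built in Python. This is a minimal sketch: `zylo_headers` is a hypothetical helper name, not part of any SDK, and it only assembles the headers described above.

```python
def zylo_headers(api_key: str, style: str = "bearer") -> dict[str, str]:
    """Build request headers for either accepted auth style (hypothetical helper)."""
    if not api_key:
        raise ValueError("API key is empty")
    headers = {"Content-Type": "application/json"}
    if style == "bearer":
        # Recommended form; this is what the OpenAI SDKs send.
        headers["Authorization"] = f"Bearer {api_key}"
    elif style == "x-api-key":
        headers["X-API-Key"] = api_key
    else:
        raise ValueError(f"unknown auth style: {style!r}")
    return headers


print(zylo_headers("YOUR_ZYLO_KEY"))
print(zylo_headers("YOUR_ZYLO_KEY", style="x-api-key"))
```

Pass the returned dict as the headers of whichever HTTP client you use. Load the real key from server-side configuration rather than hard-coding it.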

Endpoints

All paths are relative to the base URL https://api.zyloai.net/v1.

Method	Path	What it does
POST	`/chat/completions`	OpenAI-compatible chat completions. Supports `stream`, `tools` and multimodal input. The main endpoint for most apps.
POST	`/generate`	Native generation endpoint with attachments and tools, returning a flat `message` plus `usage`, `latency` and `credits`.
POST	`/web-extract`	Scrape and extract clean text from a URL for RAG. 10 requests/minute per key.
POST	`/images/generations`	Generate or edit images with supported standard and premium models.
GET	`/models`	List the models your key can call on its current plan.
GET	`/stats`	Your current credit balance, request limits and usage history.
GET	`/validate`	Check whether a key is valid and return its plan capabilities.

A standard 200 OK chat completion returns an OpenAI-shaped object with id, choices[].message, finish_reason and a usage block (prompt_tokens, completion_tokens, total_tokens). See the full documentation for every field.
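As an illustration of that shape, the sketch below reads the commonly used fields from a sample payload. The values are made up, and it assumes `finish_reason` sits inside each choice as in the OpenAI schema; `summarize` is a hypothetical helper name.

```python
import json

# Illustrative payload only: field names follow the description above,
# the values are invented for this example.
SAMPLE = json.loads("""
{
  "id": "chatcmpl-example",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "Hello!"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 9, "completion_tokens": 3, "total_tokens": 12}
}
""")


def summarize(completion: dict) -> dict:
    """Pull out the fields most apps need from a chat completion."""
    choice = completion["choices"][0]
    usage = completion.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
    }


print(summarize(SAMPLE))
```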

Core request parameters

Sent in the JSON body of a chat completions or generate request.

Parameter	Type	Notes
`model`	string	Required. A bare model id, e.g. `claude-opus-4.8`, `gpt-5.5`, `gemini-3.1-pro-preview`.
`messages`	array	Required. Conversation history of `{role, content}` objects.
`temperature`	float	Sampling temperature, 0–1. Default `0.7`.
`max_tokens`	integer	Max new tokens to generate. Alias of `max_new_tokens`.
`top_p`	float	Nucleus sampling, 0–1. Default `1.0`.
`frequency_penalty`	float	Reduce repetition, 0–2. Default `0`.
`presence_penalty`	float	Encourage new topics, 0–2. Default `0`.
`response_format`	string	Set to `"json_object"` to force valid JSON output.
`stream`	boolean	Stream tokens as server-sent events. Default `false`.
`tools`	array	Function definitions and/or native tools the model may call. Paid plans only.
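A request body can be assembled and range-checked on the client before it is sent. The sketch below covers only a few of the parameters in the table, using the defaults and ranges listed there; `build_chat_body` is a hypothetical helper name, and the server remains the authority on what it accepts.

```python
def build_chat_body(model, messages, *, temperature=0.7, top_p=1.0,
                    max_tokens=None, stream=False):
    """Assemble a chat completions body, checking the ranges listed in the table."""
    if not model or not messages:
        raise ValueError("model and messages are required")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    body = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "stream": stream,
    }
    if max_tokens is not None:
        if max_tokens <= 0:
            raise ValueError("max_tokens must be positive")
        body["max_tokens"] = max_tokens  # omitted entirely when not set
    return body


print(build_chat_body("claude-opus-4.8", [{"role": "user", "content": "Hi"}], max_tokens=64))
```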

Streaming responses

Set stream: true to receive tokens as server-sent events in the OpenAI delta format, ending with a data: [DONE] line.

Python

# pip install openai
from openai import OpenAI

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

stream = client.chat.completions.create(
    model="claude-opus-4.8",
    messages=[{"role": "user", "content": "Write a haiku about streaming."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)

Node.js

// npm install openai
import OpenAI from "openai";

const client = new OpenAI({ apiKey: "YOUR_ZYLO_KEY", baseURL: "https://api.zyloai.net/v1" });

const stream = await client.chat.completions.create({
  model: "claude-opus-4.8",
  messages: [{ role: "user", content: "Write a haiku about streaming." }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}

Terminal

curl https://api.zyloai.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4.8",
    "messages": [{"role": "user", "content": "Write a haiku about streaming."}],
    "stream": true
  }'
# -> data: {"choices":[{"delta":{"content":"..."}}]}  ... ends with  data: [DONE]
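If you read the stream without an SDK, each `data:` line has to be parsed by hand. The following is a minimal sketch of that parsing, assuming the OpenAI delta format described above; `iter_deltas` is a hypothetical helper and the sample lines are made up.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines until data: [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        text = choices[0].get("delta", {}).get("content")
        if text:
            yield text


# Made-up sample lines in the shape shown in the curl output above.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints Hello
```

In a real client, `lines` would be the decoded lines of the HTTP response body, read incrementally.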

Tool / function calling

Pass a tools array so the model can call your functions, or attach Zylo's native tools. Available on paid plans (not Basic).

Python

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-5.5",
    messages=[{"role": "user", "content": "What's the weather in Lima?"}],
    tools=tools,
)
# resp.choices[0].message.tool_calls -> call get_weather, then send the result back
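The "send the result back" step can be sketched as follows. It assumes the OpenAI tool-call message shape (an `id`, a function `name`, and `arguments` as a JSON string), which the text above says Zylo matches. `get_weather` is a stand-in, `REGISTRY` and `run_tool_calls` are hypothetical names, and the sample message is made up.

```python
import json


def get_weather(city: str) -> dict:
    # Stand-in implementation; a real app would call a weather service here.
    return {"city": city, "forecast": "sunny"}


REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Execute each requested tool and build the role="tool" reply messages."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        fn = REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(fn(**args)),
        })
    return replies


# Made-up assistant message in the OpenAI tool-call shape.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Lima\"}"},
    }],
}
print(run_tool_calls(assistant_message))
```

Append the assistant message and these replies to `messages`, then call chat completions again so the model can write its final answer.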

JSON body

{
  "model": "gemini-3.1-pro-preview",
  "messages": [{"role": "user", "content": "Summarise today's AI news."}],
  "tools": [
    { "type": "function", "function": { "name": "google_search" } },
    { "type": "function", "function": { "name": "url_context" } },
    { "type": "function", "function": { "name": "code_execution" } }
  ]
}

Native tools share platform-wide limits (see Rate limits per plan below).

Web-extract for RAG

The /v1/web-extract endpoint pulls clean text from a URL so you can ground answers in real sources — no separate scraping stack. Limited to 10 requests/minute per key.

Python

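# 1) Extract clean content from the web
import requests
from openai import OpenAI

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

resp = requests.post(
    "https://api.zyloai.net/v1/web-extract",
    headers={"Authorization": "Bearer YOUR_ZYLO_KEY"},
    json={"url": "https://example.com/article"},
    timeout=30,
)
resp.raise_for_status()  # raise on 4xx/5xx (e.g. 429) instead of parsing an error body
extract = resp.json()

# 2) Ground a normal chat completion in the extracted text
# `extract` is the whole JSON response; once you have checked the response
# shape in the full documentation, interpolate only its text field.
answer = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[
        {"role": "system", "content": "Answer using only the provided sources."},
        {"role": "user", "content": f"Sources:\n{extract}\n\nQuestion: What changed?"},
    ],
)
print(answer.choices[0].message.content)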
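Because web-extract is limited to 10 requests/minute per key, batch jobs may want to pace themselves on the client. Below is a sketch of a sliding-window throttle. `MinuteThrottle` is a hypothetical helper, and whether the server counts a fixed or a sliding window is not stated here, so the sliding window is simply the conservative choice.

```python
import time
from collections import deque


class MinuteThrottle:
    """Client-side sliding-window limiter (hypothetical helper)."""

    def __init__(self, limit=10, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit, self.window = limit, window
        self.clock, self.sleep = clock, sleep
        self.sent = deque()  # timestamps of requests inside the current window

    def wait(self) -> float:
        """Block until a request slot is free; return the seconds slept."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        slept = 0.0
        if len(self.sent) >= self.limit:
            # Sleep until the oldest request in the window expires.
            slept = self.window - (now - self.sent[0])
            self.sleep(slept)
            now = self.clock()
            self.sent.popleft()
        self.sent.append(now)
        return slept


throttle = MinuteThrottle(limit=10, window=60.0)
print(throttle.wait())  # first call never sleeps, prints 0.0
```

Call `throttle.wait()` immediately before each web-extract request. The injectable `clock` and `sleep` arguments exist so the class can be exercised without real delays.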

Building retrieval pipelines? See the deeper walkthrough in Build a RAG pipeline with built-in web-extract.

Multimodal / image input

Send images alongside text using the OpenAI content-parts format. Multimodal input requires a paid plan (Basic is text-only).

Python

resp = client.chat.completions.create(
    model="gemini-3.1-pro-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {"type": "image_url",
             "image_url": {"url": "data:image/jpeg;base64,..."}},
        ],
    }],
)
print(resp.choices[0].message.content)
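The `data:image/jpeg;base64,...` value above is a base64 data URL. One way to build it from raw bytes is sketched below; `image_part` is a hypothetical helper, and the three bytes in the example are just the start of a JPEG header, not a full image.

```python
import base64
import mimetypes


def image_part(filename: str, data: bytes) -> dict:
    """Wrap raw image bytes as an OpenAI-style image_url content part."""
    mime, _ = mimetypes.guess_type(filename)  # MIME type is guessed from the extension
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {filename}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}


part = image_part("photo.jpg", b"\xff\xd8\xff")
print(part["image_url"]["url"])  # prints data:image/jpeg;base64,/9j/
```

In practice `data` would come from reading the file in binary mode, and the returned dict goes into the `content` list next to the text part.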

Error codes & retries

The Zylo API uses standard HTTP status codes.

Status	Meaning	What to do
200	OK	Request completed successfully.
400	Bad Request	Missing parameters or invalid JSON — fix the body.
401	Unauthorized	Invalid or missing API key — check the header.
402	Payment Required	Out of credits — top up to keep using premium models.
403	Forbidden	Plan limit reached or model not on your plan — upgrade.
429	Too Many Requests	Rate limit exceeded — back off and retry (see below).
500	Server Error	Transient internal error — retry with backoff.

Retry 429 and 5xx responses with exponential backoff. Do not retry 400, 401, 402 or 403; fix the request, key, credits or plan instead.

Python

import time
from openai import OpenAI, APIStatusError

client = OpenAI(api_key="YOUR_ZYLO_KEY", base_url="https://api.zyloai.net/v1")

def chat_with_retry(messages, model="claude-opus-4.8", retries=4):
    for attempt in range(retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIStatusError as e:
            # Retry only on rate limits / server errors
            if e.status_code in (429, 500, 502, 503) and attempt < retries - 1:
                time.sleep(2 ** attempt)  # 1s, 2s, 4s (no sleep after the final attempt)
                continue
            raise
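When many clients hit a limit at once, fixed delays can make them all retry at the same moment. A common variation adds random jitter and a cap to the delay. This is an optional addition rather than something the Zylo docs prescribe, and `backoff_delays` and `should_retry` are hypothetical helper names.

```python
import random


def should_retry(status: int) -> bool:
    """Retry rate limits and server errors only, per the table above."""
    return status == 429 or 500 <= status < 600


def backoff_delays(retries: int = 4, base: float = 1.0, cap: float = 30.0,
                   rng=random.random) -> list[float]:
    """Waits between attempts: a random fraction of min(cap, base * 2**n)."""
    return [rng() * min(cap, base * 2 ** n) for n in range(retries - 1)]


print(should_retry(429), should_retry(403))  # prints True False
print(backoff_delays(rng=lambda: 1.0))       # upper bounds: [1.0, 2.0, 4.0]
```

To use it, replace the fixed `time.sleep(2 ** attempt)` with a sleep on the matching entry of `backoff_delays(retries)`.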

SDKs & languages

Because the Zylo API is OpenAI-compatible, any OpenAI SDK or plain HTTP client works — just repoint the base URL. Here it is in Go, Ruby and PHP.

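Go

package main

import (
    "bytes"
    "fmt"
    "io"
    "log"
    "net/http"
)

func main() {
    body := []byte(`{"model":"claude-opus-4.8","messages":[{"role":"user","content":"Hello from Zylo!"}]}`)
    req, err := http.NewRequest("POST", "https://api.zyloai.net/v1/chat/completions", bytes.NewBuffer(body))
    if err != nil {
        log.Fatal(err)
    }
    req.Header.Set("Authorization", "Bearer YOUR_ZYLO_KEY")
    req.Header.Set("Content-Type", "application/json")
    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        log.Fatal(err) // resp is nil on error, so check before touching resp.Body
    }
    defer resp.Body.Close()
    out, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(string(out)) // raw JSON; decode into your own struct as needed
}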

Ruby

require "net/http"
require "json"

uri = URI("https://api.zyloai.net/v1/chat/completions")
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true

req = Net::HTTP::Post.new(uri)
req["Authorization"] = "Bearer YOUR_ZYLO_KEY"
req["Content-Type"]  = "application/json"
req.body = {
  model: "claude-opus-4.8",
  messages: [{ role: "user", content: "Hello from Zylo!" }]
}.to_json

res = http.request(req)
puts JSON.parse(res.body).dig("choices", 0, "message", "content")

PHP

<?php
$ch = curl_init("https://api.zyloai.net/v1/chat/completions");
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER => [
        "Authorization: Bearer YOUR_ZYLO_KEY",
        "Content-Type: application/json",
    ],
    CURLOPT_POSTFIELDS => json_encode([
        "model" => "claude-opus-4.8",
        "messages" => [["role" => "user", "content" => "Hello from Zylo!"]],
    ]),
]);
$data = json_decode(curl_exec($ch), true);
echo $data["choices"][0]["message"]["content"];

Prefer the official SDK? Python and JavaScript work unchanged — see the quick start for the two-line setup.
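For completeness, here is the same plain-HTTP call in Python using only the standard library, for environments where installing the SDK is not an option. This is a sketch: `build_request` and `send` are hypothetical helper names, and only `build_request` runs in the example so that no network call is made.

```python
import json
import urllib.request

BASE_URL = "https://api.zyloai.net/v1"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Prepare (but do not send) a chat completions POST."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the assistant text."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]


req = build_request("YOUR_ZYLO_KEY", "claude-opus-4.8", "Hello from Zylo!")
print(req.get_method(), req.full_url)
```

Calling `send(req)` performs the request; `urlopen` raises `urllib.error.HTTPError` on 4xx and 5xx responses.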

Rate limits per plan

Plans set your daily request and token limits. Usage is billed separately: it is drawn from prepaid credits at base per-token rates.

Plan	Requests	Daily tokens	Models & extras
Basic · $0	7.2k/day · 10/min	200k	Basic models only · text only · no credits
Go · $10	28.8k/day	512k	Premium models · web search · $10 credits
Pro · $50	43.2k/day	1M	Code execution · $50 credits
Mega · $200	86.4k/day	5M	Priority access · $200 credits
Enterprise · $400	Unlimited	Unlimited	Dedicated GPU · $400 credits

Tool & endpoint limits are enforced separately and platform-wide: web-extract is 10 req/min per key; the native web search and URL context tools share a combined 10 req/min budget; code execution has its own 20 req/min budget. Exceeding any limit returns 429. Full breakdown on the pricing page.
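The daily limits in the table can be encoded as data to sanity-check a planned workload. The sketch below reads "k" and "M" as decimal thousands and millions, which is an assumption, and it ignores model access, feature gating and per-minute limits. `PLAN_LIMITS` and `smallest_plan` are hypothetical names.

```python
PLAN_LIMITS = {
    # plan: (requests per day, tokens per day); None means unlimited
    "basic": (7_200, 200_000),
    "go": (28_800, 512_000),
    "pro": (43_200, 1_000_000),
    "mega": (86_400, 5_000_000),
    "enterprise": (None, None),
}


def smallest_plan(requests_per_day: int, tokens_per_day: int) -> str:
    """Return the first plan whose daily limits cover the workload."""
    for plan, (max_requests, max_tokens) in PLAN_LIMITS.items():
        fits_requests = max_requests is None or requests_per_day <= max_requests
        fits_tokens = max_tokens is None or tokens_per_day <= max_tokens
        if fits_requests and fits_tokens:
            return plan
    raise ValueError("no plan fits")  # unreachable while an unlimited plan is listed


print(smallest_plan(5_000, 600_000))  # prints pro
```

A workload of 5,000 requests fits Basic by request count, but 600k daily tokens exceeds both Basic and Go, so the function returns `pro`.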

Switching to the Zylo API

You keep your OpenAI-compatible client. Repoint the base URL, use your Zylo key, and pick a model id. Here is the before/after.

From OpenAI

Before · OpenAI

Python

client = OpenAI(
    api_key="OPENAI_KEY",
)
model = "gpt-5.5"

After · Zylo

Python

client = OpenAI(
    api_key="YOUR_ZYLO_KEY",
    base_url="https://api.zyloai.net/v1",
)
model = "gpt-5.5"  # or claude-opus-4.8, gemini-3.1-pro...

From OpenRouter

Before · OpenRouter

Python

client = OpenAI(
    api_key="OPENROUTER_KEY",
    base_url="https://openrouter.ai/api/v1",
)
model = "anthropic/claude-opus-4.8"

After · Zylo

Python

client = OpenAI(
    api_key="YOUR_ZYLO_KEY",
    base_url="https://api.zyloai.net/v1",
)
model = "claude-opus-4.8"  # drop the vendor/ prefix
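If model ids are stored in configuration, the prefix can be dropped programmatically. This is a small sketch; `to_zylo_model_id` is a hypothetical helper, and whether every OpenRouter id maps to a Zylo id by prefix removal alone is not confirmed here, so check the result against GET /v1/models.

```python
def to_zylo_model_id(model_id: str) -> str:
    """Drop an OpenRouter-style vendor/ prefix; bare ids pass through unchanged."""
    _vendor, sep, rest = model_id.partition("/")
    return rest if sep else model_id


print(to_zylo_model_id("anthropic/claude-opus-4.8"))  # prints claude-opus-4.8
print(to_zylo_model_id("gpt-5.5"))                    # prints gpt-5.5
```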

Model changelog

Recent additions to the Zylo API catalogue. The models page is always the live, complete list.

June 2026: Claude Opus 4.8, Gemini 3.1 Pro and GPT-5.5 now live, all callable by their bare model ids.
Recently added: DeepSeek V4 Pro, Qwen 3.7 Max, MiniMax M2.7 and Kimi K2.6 joined the catalogue.
Platform: Built-in web-extract endpoint and native web search, URL context and code-execution tools available across compatible models.

New models appear here as they ship. For the authoritative, real-time list and pricing, query GET /v1/models or browse all models.

Reliability & status

Operational facts you can verify, not marketing numbers.

Live status page

Per-model availability and incident history are published at status.zyloai.net, updated by an automated prober.

Per-response metrics

Every generation returns a real latency value and a usage token count, so you measure throughput from your own traffic — no guessing.
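Throughput can be derived from those numbers. The sketch below divides completion tokens by client-measured wall time rather than by the API's `latency` field, because that field's unit is not stated here. `tokens_per_second` and `timed` are hypothetical helpers.

```python
import time


def tokens_per_second(completion_tokens: int, elapsed_seconds: float) -> float:
    """Generation throughput for one response."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return completion_tokens / elapsed_seconds


def timed(fn, *args, **kwargs):
    """Run fn and return (result, wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


print(tokens_per_second(120, 4.0))  # prints 30.0
```

Wrap the chat completions call in `timed(...)`, then pass `usage.completion_tokens` and the elapsed time to `tokens_per_second`. Wall time includes network latency, so treat the result as a lower bound on model throughput.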

Zero data retention

Prompts and completions are never stored. Zylo is a passthrough to the providers; only numeric usage is logged for billing.

Throughput by plan

Sustained capacity scales with your plan — up to 5M tokens/day on Mega, and unlimited requests and tokens on Enterprise.

Zylo API FAQ

The questions developers ask most about the Zylo API.

How do I get a Zylo API key?

Sign up at console.zyloai.net and your API key is created on the Basic plan with no credit card. Copy it from the console, send it as an Authorization: Bearer header, and you can call any Basic-tier model immediately. Upgrade to a paid plan to unlock premium models and credits. You can rotate or replace the key from the console at any time.

What is the Zylo API base URL?

The Zylo API base URL is https://api.zyloai.net/v1. Point any OpenAI-compatible SDK at that base URL, use your Zylo API key, and call /chat/completions with a bare model id such as claude-opus-4.8 or gpt-5.5.

Is the Zylo API OpenAI-compatible?

Yes. The Zylo API implements the OpenAI Chat Completions schema, so the official OpenAI SDKs work unchanged — set base_url to https://api.zyloai.net/v1 and use your Zylo key. Request and response shapes, streaming and tool calling all match.

Does the Zylo API support streaming?

Yes. Set "stream": true on a chat completions request and the Zylo API returns tokens as server-sent events in the OpenAI delta format, terminated by a data: [DONE] line. Any OpenAI-compatible streaming client works.

Does the Zylo API support function (tool) calling?

Yes. Pass a tools array of function definitions and the model can return tool calls, exactly like the OpenAI API. Zylo also exposes native tools — web search, URL context and a code-execution sandbox. Tool calling is available on paid plans, not the free Basic plan.

What are the Zylo API rate limits?

Rate limits depend on your plan: Basic allows 7,200 requests/day and 10 requests/minute with 200k daily tokens; Go 28,800/day with 512k tokens; Pro 43,200/day with 1M tokens; Mega 86,400/day with 5M tokens; Enterprise is unlimited. Exceeding a limit returns 429. The web-extract endpoint is limited to 10 requests/minute per key, and native tools share platform-wide limits.

Is the Zylo API free?

The Basic plan is free with no credit card and includes a daily token and request allowance on Basic-tier models. Premium models require a paid plan and prepaid credits; usage is billed at each model's base per-token rate with no markup, and a flat 25% platform fee applies only when you add credits.

Which models does the Zylo API support?

Frontier and cost-efficient models from seven providers — Anthropic (Claude), OpenAI (GPT), Google (Gemini), DeepSeek, Qwen, MiniMax and Moonshot (Kimi). Call GET /v1/models for the live list your key can access, or see the models page for pricing.

How do I switch from OpenAI or OpenRouter to the Zylo API?

Keep your existing OpenAI-compatible client. Change base_url to https://api.zyloai.net/v1 and use your Zylo key. Coming from OpenAI, just pick a Zylo model id; coming from OpenRouter, drop the vendor/ prefix so anthropic/claude-opus-4.8 becomes claude-opus-4.8. See the OpenAI and OpenRouter guides.

What error codes does the Zylo API return?

Standard HTTP status codes: 200 OK, 400 bad request, 401 invalid or missing API key, 402 out of credits, 403 plan limit reached or model restricted, 429 rate limited, and 500 internal error. Retry 429 and 5xx with backoff; fix the request for the others.

Start building on the Zylo API

Free API key, OpenAI-compatible, 40+ models behind one base URL. No card required to start.
