Z.ai

GLM 4.7 Flash

Call GLM 4.7 Flash for fast, low-cost inference at scale — through one OpenAI-compatible endpoint, with local payments, on the Go plan and up.

Model id: glm-4.7-flash
Context: 203K tokens
Plan: Go plan and up
Input: $0.06 / 1M tokens
Output: $0.40 / 1M tokens

Last updated June 5, 2026

Pricing

Per 1M tokens, billed from your credit balance — there is no markup on usage.

Direction    Price / 1M tokens
Input        $0.06
Output       $0.40

How billing works. The rates above are what usage costs against your prepaid credits on a paid plan. There is no per-token markup; Zylo's flat 25% platform fee applies only when you add credits. The free Basic plan instead gives a daily allowance of Basic-tier models (10 requests/min, no card, no credits) and does not include GLM 4.7 Flash, which requires the Go plan or higher. Prices update live from our catalogue.
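
To make the per-token arithmetic concrete, here is a minimal sketch that estimates the usage cost of a request at the rates listed above. The function name and the sample token counts are illustrative assumptions, not part of Zylo's API. The 25% platform fee is deliberately left out, since it applies when credits are added rather than per request.

```python
from decimal import Decimal

# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_USD_PER_MILLION = Decimal("0.06")
OUTPUT_USD_PER_MILLION = Decimal("0.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_usage_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD usage cost at the listed rates.

    Hypothetical helper: it models usage only, not the platform fee
    charged when credits are added.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 500K output tokens.
print(estimate_usage_cost(2_000_000, 500_000))
```

For that sample workload the estimate is $0.12 for input plus $0.20 for output, or $0.32 in total. Decimal is used instead of float so that small per-request amounts add up without rounding drift.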

Quickstart

Already using the OpenAI SDK? Change two lines — base_url and your key — and set the model to glm-4.7-flash.

Terminal
curl https://api.zyloai.net/v1/chat/completions \
  -H "Authorization: Bearer YOUR_ZYLO_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7-flash",
    "messages": [{"role": "user", "content": "Hello from Zylo!"}]
  }'
Python
# pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZYLO_KEY",
    base_url="https://api.zyloai.net/v1",
)

response = client.chat.completions.create(
    model="glm-4.7-flash",
    messages=[{"role": "user", "content": "Hello from Zylo!"}],
)
print(response.choices[0].message.content)
Node.js
// npm install openai
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_ZYLO_KEY",
  baseURL: "https://api.zyloai.net/v1",
});

const response = await client.chat.completions.create({
  model: "glm-4.7-flash",
  messages: [{ role: "user", content: "Hello from Zylo!" }],
});
console.log(response.choices[0].message.content);
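
If installing the openai package is not an option, the same request can be sketched with Python's standard library alone. This mirrors the curl call above; the helper names are hypothetical, and the response shape is assumed to match the OpenAI chat completions format that the SDK examples rely on.

```python
import json
import urllib.request

# Endpoint and model id are taken from the quickstart above.
ZYLO_CHAT_URL = "https://api.zyloai.net/v1/chat/completions"
MODEL_ID = "glm-4.7-flash"


def build_chat_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        ZYLO_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt: str, api_key: str, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's message text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    request = build_chat_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

Calling send_chat("Hello from Zylo!", "YOUR_ZYLO_KEY") performs the network request. Keeping request construction separate from sending makes the headers and payload easy to inspect without spending credits.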

Migrating?

On OpenRouter this model is z-ai/glm-4.7-flash; on Zylo, use the id glm-4.7-flash.

Frequently asked questions

How much does GLM 4.7 Flash cost on Zylo?

GLM 4.7 Flash is billed at its base per-token rate: $0.06 per 1M input tokens and $0.40 per 1M output tokens, deducted from your prepaid credits. There is no markup on usage — Zylo's 25% platform fee applies only when you add credits.

Which plan do I need to use GLM 4.7 Flash?

GLM 4.7 Flash requires the Go plan or higher. The free Basic plan only includes Basic-tier models; paid plans (Go and up) add premium models such as GLM 4.7 Flash and include credits that you spend on usage at the per-token rates listed under Pricing.

What is the context window of GLM 4.7 Flash?

GLM 4.7 Flash supports up to 203K tokens of context through Zylo's OpenAI-compatible API.

Is GLM 4.7 Flash OpenAI-compatible?

Yes. Point any OpenAI SDK at https://api.zyloai.net/v1, use your Zylo API key, and set the model to glm-4.7-flash.

How do I switch GLM 4.7 Flash from OpenRouter to Zylo?

On OpenRouter this model is z-ai/glm-4.7-flash. On Zylo, use the bare id glm-4.7-flash with base URL https://api.zyloai.net/v1 — no vendor prefix.
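
When migrating configuration that stores OpenRouter-style ids, the prefix removal described above can be sketched as a small helper. The function name is hypothetical. This page confirms only the z-ai/glm-4.7-flash to glm-4.7-flash mapping, so treating every model id the same way is an assumption to check against the catalogue.

```python
def to_zylo_model_id(openrouter_id: str) -> str:
    """Drop a leading vendor prefix such as "z-ai/" from a model id.

    Assumption: the bare id is whatever follows the first "/".
    Only the GLM 4.7 Flash mapping is confirmed by this page.
    """
    # partition splits on the first "/" only; ids without one pass through.
    _vendor, separator, bare_id = openrouter_id.partition("/")
    return bare_id if separator else openrouter_id


print(to_zylo_model_id("z-ai/glm-4.7-flash"))  # prints glm-4.7-flash
```

Ids that are already bare are returned unchanged, so the helper is safe to apply to a mixed list during a gradual migration.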

Start building with GLM 4.7 Flash

GLM 4.7 Flash runs on the Go plan or higher — create an account and upgrade to the Go plan to call it.
