Using AI Models

Overview

shallowseek exposes an OpenAI-compatible API. You can use any OpenAI SDK or OpenAI-compatible client directly; just change the Base URL and API Key.

Compatibility Note

Because the platform API follows the OpenAI API format, existing code and third-party clients (such as ChatBox, Cherry Studio, and LobeChat) work without code changes once you replace the API address and Key.

Base URL Configuration

When calling the API, you need to set the correct Base URL. Please use the API address provided by the platform (contact the administrator for the actual address):

Purpose	Base URL
Chat Completions	`https://api.shallowseek.top/`
Image Generation	`https://api.shallowseek.top/`
Audio (TTS/STT)	`https://api.shallowseek.top/`
Embeddings	`https://api.shallowseek.top/`
Rerank	`https://api.shallowseek.top/`

All requests must include the API Key in the HTTP header:

Authorization: Bearer sk-your-api-key-here
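
As a minimal sketch of how the Base URL and the header fit together, the Python helpers below join the base address with an endpoint path (so no doubled slash appears) and build the request headers. The names `endpoint_url`, `auth_headers`, and the `SHALLOWSEEK_API_KEY` environment variable are assumptions made for this illustration and are not defined by the platform.

```python
import os
from typing import Optional

# Replace with the address your administrator provides.
BASE_URL = "https://api.shallowseek.top/"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def auth_headers(api_key: Optional[str] = None) -> dict:
    """Build the headers every request needs.

    Falls back to the SHALLOWSEEK_API_KEY environment variable,
    a hypothetical name chosen only for this sketch.
    """
    key = api_key or os.environ.get("SHALLOWSEEK_API_KEY")
    if not key:
        raise ValueError("No API key supplied")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

Reading the key from the environment keeps it out of source control, and the same two helpers can serve every endpoint listed in the table above.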

Chat Completion

Chat completion is the most commonly used API: send conversation messages to an LLM and receive a response.

curl Example

curl https://api.shallowseek.top/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Explain artificial intelligence in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.shallowseek.top/",
    api_key="sk-your-api-key-here"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Explain artificial intelligence in one sentence."}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)
# Print token usage
print(f"Input: {response.usage.prompt_tokens}, Output: {response.usage.completion_tokens}")

JavaScript (Node.js) Example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.shallowseek.top/",
  apiKey: "sk-your-api-key-here"
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful AI assistant." },
    { role: "user", content: "Explain artificial intelligence in one sentence." }
  ],
  temperature: 0.7,
  max_tokens: 500
});

console.log(response.choices[0].message.content);

JavaScript (Browser fetch) Example

const response = await fetch("https://api.shallowseek.top/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-your-api-key-here"
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [
      { role: "user", content: "Hello, please introduce yourself." }
    ]
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);

Streaming

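Set the `stream` parameter to true (`stream=True` in the Python SDK) to receive the response incrementally, so text can be displayed as it is generated. This example reuses the `client` created in the Python example above: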

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a poem about spring"}],
    stream=True
)

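for chunk in stream:
    # Some chunks carry no choices or an empty delta; skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # final newline once the stream ends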
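
If you also need the complete reply after streaming finishes (for logging or conversation history), the deltas can be accumulated while they are printed. The sketch below is one way to do that; `collect_stream` and `fake_chunk` are hypothetical helper names, and the fake chunks only mimic the `choices[0].delta.content` shape used above so the example runs offline.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(stream: Iterable, echo: bool = True) -> str:
    """Print streamed deltas as they arrive and return the full text."""
    parts = []
    for chunk in stream:
        # Skip chunks that carry no choices at all.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        if text:
            parts.append(text)
            if echo:
                print(text, end="", flush=True)
    return "".join(parts)


def fake_chunk(text):
    """Stand-in for an SDK chunk, used only to demo the helper offline."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [
    fake_chunk("Spring "),
    fake_chunk(None),               # empty delta
    SimpleNamespace(choices=[]),    # chunk without choices
    fake_chunk("arrives."),
]
full_text = collect_stream(demo, echo=False)
```

With a real stream, pass the object returned by `client.chat.completions.create(..., stream=True)` in place of `demo`.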

Image Generation

Generate images using models like DALL-E:

curl https://api.shallowseek.top/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A cat walking on the moon, digital painting style",
    "n": 1,
    "size": "1024x1024"
  }'

Audio (TTS / STT)

Text-to-Speech (TTS):

curl https://api.shallowseek.top/audio/speech \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, this is a text-to-speech test.",
    "voice": "alloy"
  }' \
  --output speech.mp3

Speech-to-Text (STT):

curl https://api.shallowseek.top/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -F file="@audio.mp3" \
  -F model="whisper-1"

Embeddings

Generate text vector embeddings for semantic search, RAG, and other use cases:

curl https://api.shallowseek.top/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Artificial intelligence is changing the world"
  }'
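
To illustrate how embeddings support semantic search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors are toy stand-ins for real embeddings, and `cosine_similarity` and `rank_by_similarity` are names introduced for this example only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for embeddings returned by the API.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]
order = rank_by_similarity(query, docs)
```

In practice you would embed the query and each document with the same model, since vectors from different models are not comparable.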

Rerank

Re-rank search results to improve retrieval quality:

curl https://api.shallowseek.top/rerank \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "jina-reranker-v2-base-multilingual",
    "query": "What is machine learning?",
    "documents": [
      "Deep learning is a subset of machine learning...",
      "The weather is nice today...",
      "Machine learning is a branch of artificial intelligence..."
    ]
  }'
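
The response format is not documented here. As a hedged sketch, the code below assumes a Jina-style response with a `results` list whose entries carry an `index` into the submitted documents and a `relevance_score`; verify this against an actual response before relying on it. The helper name `apply_rerank` and the scores in `sample_response` are made up for illustration.

```python
def apply_rerank(documents, response, top_n=None):
    """Reorder documents using an ASSUMED response shape:
    {"results": [{"index": int, "relevance_score": float}, ...]}
    """
    results = sorted(
        response["results"],
        key=lambda r: r["relevance_score"],
        reverse=True,
    )
    if top_n is not None:
        results = results[:top_n]
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


documents = [
    "Deep learning is a subset of machine learning...",
    "The weather is nice today...",
    "Machine learning is a branch of artificial intelligence...",
]

# Illustrative values only, not real model output.
sample_response = {
    "results": [
        {"index": 0, "relevance_score": 0.62},
        {"index": 2, "relevance_score": 0.91},
        {"index": 1, "relevance_score": 0.03},
    ]
}

top_two = apply_rerank(documents, sample_response, top_n=2)
```

Sorting locally rather than trusting the order of `results` keeps the helper correct even if the server returns entries unsorted.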

Model Naming Convention

The same model typically has two billing methods, distinguished by a suffix in the model name:

Naming Pattern	Example	Billing Method
Without `-c` suffix	`gemini-2.5-pro`	Per-token billing (input + output tokens priced separately)
With `-c` suffix	`gemini-2.5-pro-c`	Per-request billing (fixed charge per call)

Both billing methods call the exact same model — only the charging method differs. Choose the appropriate billing mode based on your usage scenario.
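
The suffix rule above can be expressed as a pair of small helpers, for instance to switch billing modes from configuration. This is only a sketch: `billing_mode` and `with_billing` are hypothetical names, and the code assumes no model name ends in `-c` for any other reason.

```python
PER_REQUEST_SUFFIX = "-c"


def billing_mode(model_name: str) -> str:
    """Return 'per-request' for names ending in -c, else 'per-token'."""
    if model_name.endswith(PER_REQUEST_SUFFIX):
        return "per-request"
    return "per-token"


def with_billing(model_name: str, per_request: bool) -> str:
    """Return the model name variant for the requested billing mode."""
    base = model_name
    if base.endswith(PER_REQUEST_SUFFIX):
        base = base[: -len(PER_REQUEST_SUFFIX)]
    return base + PER_REQUEST_SUFFIX if per_request else base
```

For example, `with_billing("gemini-2.5-pro", per_request=True)` yields `gemini-2.5-pro-c`, which can be passed directly as the `model` parameter.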

Available Models

The platform supports 40+ LLM providers. Here are the main model categories:

Category	Representative Models
OpenAI	gpt-4o, gpt-4.1, gpt-5, o3, o4-mini
Claude	claude-3.5-sonnet, claude-3-opus, claude-3.7-sonnet
Gemini	gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
DeepSeek	deepseek-chat, deepseek-reasoner
Qwen	qwen-plus, qwen-max, qwen-turbo, qwen3-235b-a22b
Image Models	dall-e-3, midjourney, stable-diffusion
Embedding Models	text-embedding-3-small, text-embedding-3-large
Reranking	jina-reranker-v2-base-multilingual

For more model details and the latest pricing, please check the "Pricing" page in the console.

Common Error Handling

HTTP Status	Meaning	Solution
401	Invalid or missing API Key	Check if the Key is correct, expired, or disabled
403	No permission to access the model	Check the Key's model restriction settings
429	Request rate too high or insufficient credits	Reduce request frequency or top up credits
500	Internal server error	Retry later; contact the administrator if the issue persists
503	Service temporarily unavailable	The model may be temporarily unavailable; retry later
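
One way to act on the table above in client code is to map each status to a hint and a retry decision. The sketch below is an assumption about sensible handling, not platform behavior; in particular it treats 429 as retryable even though retrying will not help when the cause is insufficient credits. `classify_error` is a hypothetical helper name.

```python
# Hints taken from the error table; retry policy is this sketch's own choice.
ERROR_HINTS = {
    401: "Check whether the Key is correct, expired, or disabled",
    403: "Check the Key's model restriction settings",
    429: "Reduce request frequency or top up credits",
    500: "Retry later; contact the administrator if the issue persists",
    503: "The model may be temporarily unavailable; retry later",
}

# 429 is retried for rate limits, but a retry cannot fix insufficient credits.
RETRYABLE_STATUSES = {429, 500, 503}


def classify_error(status: int):
    """Return (retryable, hint) for an HTTP status code."""
    hint = ERROR_HINTS.get(status, "Unexpected status; inspect the response body")
    return status in RETRYABLE_STATUSES, hint
```

Authentication and permission errors (401, 403) are reported immediately, because repeating the same request with the same Key cannot succeed.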

Best Practices

Store Keys in environment variables: do not hardcode API Keys in your code
Enable streaming: for long responses, streaming significantly improves the user experience
Set a reasonable max_tokens: avoid consuming excessive credits on overlong outputs
Handle errors and retries: implement an exponential backoff retry strategy for unstable networks
Monitor token usage: regularly check the consumption logs to see where credits are going
Choose the right model: use cheaper small models for simple tasks and top-tier models for complex tasks
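
The retry advice above could be sketched as a generic wrapper with exponential backoff and a little random jitter. Everything here is illustrative: `backoff_delay` and `call_with_retries` are hypothetical names, and the default delays (1 s base, 30 s cap, 5 attempts) are assumptions rather than platform recommendations.

```python
import random
import time


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retries(fn, max_attempts=5, base=1.0, cap=30.0,
                      sleep=time.sleep, is_retryable=lambda exc: True):
    """Call fn(), retrying with exponential backoff on retryable errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors that cannot be retried.
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            delay = backoff_delay(attempt, base, cap)
            # Up to 10% jitter so many clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

A typical use is `call_with_retries(lambda: client.chat.completions.create(...))`, with `is_retryable` restricted to rate-limit and server errors so that authentication failures surface immediately.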