Using AI Models

Overview

shallowseek provides a fully OpenAI-compatible API. Any OpenAI SDK or OpenAI-compatible client works directly once you change the Base URL and API Key.

Compatibility Note

Because the platform API follows the OpenAI API format, existing code and third-party clients (such as ChatBox, Cherry Studio, and LobeChat) work without code changes. Only the API address and Key need to be replaced.

Base URL Configuration

When calling the API, you need to set the correct Base URL. Please use the API address provided by the platform (contact the administrator for the actual address):

Purpose             Base URL
Chat Completions    https://api.shallowseek.top/
Image Generation    https://api.shallowseek.top/
Audio (TTS/STT)     https://api.shallowseek.top/
Embeddings          https://api.shallowseek.top/
Rerank              https://api.shallowseek.top/

All requests must include the API Key in the HTTP header:

Authorization: Bearer sk-your-api-key-here
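
As a minimal sketch of the two settings above, the helper below joins the Base URL with an endpoint path using exactly one slash and builds the Authorization header from an environment variable. The variable name SHALLOWSEEK_API_KEY and both function names are assumptions made for illustration; they are not defined by the platform.

```python
import os

# Replace with the address provided by your administrator
BASE_URL = "https://api.shallowseek.top/"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def auth_headers(env_var: str = "SHALLOWSEEK_API_KEY") -> dict:
    """Build request headers; the environment variable name is a hypothetical choice."""
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Normalizing the slash in one place avoids accidentally producing URLs such as https://api.shallowseek.top//chat/completions when the Base URL already ends in a slash.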

Chat Completion

Chat Completions is the most commonly used API: send conversation messages to an LLM and receive a response.

curl Example

curl https://api.shallowseek.top/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant."},
      {"role": "user", "content": "Explain artificial intelligence in one sentence."}
    ],
    "temperature": 0.7,
    "max_tokens": 500
  }'

Python Example

from openai import OpenAI

client = OpenAI(
    base_url="https://api.shallowseek.top/",
    api_key="sk-your-api-key-here"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant."},
        {"role": "user", "content": "Explain artificial intelligence in one sentence."}
    ],
    temperature=0.7,
    max_tokens=500
)

print(response.choices[0].message.content)
# Print token usage
print(f"Input: {response.usage.prompt_tokens}, Output: {response.usage.completion_tokens}")

JavaScript (Node.js) Example

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.shallowseek.top/",
  apiKey: "sk-your-api-key-here"
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "You are a helpful AI assistant." },
    { role: "user", content: "Explain artificial intelligence in one sentence." }
  ],
  temperature: 0.7,
  max_tokens: 500
});

console.log(response.choices[0].message.content);

JavaScript (Browser fetch) Example

const response = await fetch("https://api.shallowseek.top/chat/completions", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-your-api-key-here"
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [
      { role: "user", content: "Hello, please introduce yourself." }
    ]
  })
});

const data = await response.json();
console.log(data.choices[0].message.content);

Streaming

Set the stream parameter to true (stream=True in the Python SDK) to receive the response in small chunks as it is generated, so the text can be displayed as it arrives:

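# Reuses the client created in the Python example above
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a poem about spring"}],
    stream=True
)

for chunk in stream:
    # Some chunks carry no choices or no text (for example the final one), so guard both
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # finish the output with a newline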
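
When the complete reply is also needed after streaming (for logging or for appending to the conversation history), the fragments can be accumulated while they are printed. This is only a sketch; collect_stream is a hypothetical helper name, and it takes plain text fragments so it does not depend on any SDK type.

```python
from typing import Iterable, Optional


def collect_stream(deltas: Iterable[Optional[str]]) -> str:
    """Print each text fragment as it arrives and return the full reply."""
    parts = []
    for text in deltas:
        if text:  # skip None or empty fragments (role-only or final chunks)
            print(text, end="", flush=True)
            parts.append(text)
    print()  # finish the output with a newline
    return "".join(parts)
```

With the OpenAI Python SDK it could be called as collect_stream(c.choices[0].delta.content for c in stream if c.choices).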

Image Generation

Generate images using models like DALL-E:

curl https://api.shallowseek.top/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "dall-e-3",
    "prompt": "A cat walking on the moon, digital painting style",
    "n": 1,
    "size": "1024x1024"
  }'

Audio (TTS / STT)

Text-to-Speech (TTS):

curl https://api.shallowseek.top/audio/speech \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello, this is a text-to-speech test.",
    "voice": "alloy"
  }' \
  --output speech.mp3

Speech-to-Text (STT):

curl https://api.shallowseek.top/audio/transcriptions \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -F file="@audio.mp3" \
  -F model="whisper-1"

Embeddings

Generate text vector embeddings for semantic search, RAG, and other use cases:

curl https://api.shallowseek.top/embeddings \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "Artificial intelligence is changing the world"
  }'
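
For semantic search, embedding vectors are usually compared with cosine similarity. The function below is a minimal, dependency-free sketch of that comparison; it is not part of the platform API. In the OpenAI response format each vector is expected under data[i].embedding, which is assumed to hold here as well.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)
```

A query is embedded once, each candidate document is embedded once, and the documents are then sorted by their similarity to the query.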

Rerank

Re-rank search results to improve retrieval quality:

curl https://api.shallowseek.top/rerank \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-api-key-here" \
  -d '{
    "model": "jina-reranker-v2-base-multilingual",
    "query": "What is machine learning?",
    "documents": [
      "Deep learning is a subset of machine learning...",
      "The weather is nice today...",
      "Machine learning is a branch of artificial intelligence..."
    ]
  }'
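
The response format of the rerank endpoint is not described here. The sketch below assumes a Jina-style body of the form {"results": [{"index": ..., "relevance_score": ...}]}, where index points back into the submitted documents list; verify the actual field names against a real response before relying on it. order_documents is a hypothetical helper name.

```python
from typing import Any, Dict, List, Tuple


def order_documents(
    documents: List[str], rerank_response: Dict[str, Any]
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs, most relevant first.

    Assumes each result carries "index" and "relevance_score" (an assumption,
    not confirmed by the platform documentation).
    """
    results = sorted(
        rerank_response["results"],
        key=lambda item: item["relevance_score"],
        reverse=True,
    )
    return [(documents[item["index"]], item["relevance_score"]) for item in results]
```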

Model Naming Convention

The same model is typically offered under two billing methods, distinguished by a suffix in the model name:

Naming Pattern       Example             Billing Method
Without -c suffix    gemini-2.5-pro      Per-token billing (input and output tokens priced separately)
With -c suffix       gemini-2.5-pro-c    Per-request billing (fixed charge per call)

Both billing methods call the exact same model — only the charging method differs. Choose the appropriate billing mode based on your usage scenario.
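
The naming rule can be applied in code when switching billing modes. The helpers below are a sketch with hypothetical names; they assume that a trailing -c always marks per-request billing, which would misfire for any model whose base name happens to end in -c.

```python
PER_REQUEST_SUFFIX = "-c"


def billing_mode(model: str) -> str:
    """Report the billing method implied by the model name."""
    return "per-request" if model.endswith(PER_REQUEST_SUFFIX) else "per-token"


def with_billing(model: str, per_request: bool) -> str:
    """Return the model name for the requested billing method."""
    base = model[: -len(PER_REQUEST_SUFFIX)] if model.endswith(PER_REQUEST_SUFFIX) else model
    return base + PER_REQUEST_SUFFIX if per_request else base
```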

Available Models

The platform supports 40+ LLM providers. Here are the main model categories:

Category            Representative Models
OpenAI              gpt-4o, gpt-4.1, gpt-5, o3, o4-mini
Claude              claude-3.5-sonnet, claude-3-opus, claude-3.7-sonnet
Gemini              gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
DeepSeek            deepseek-chat, deepseek-reasoner
Qwen                qwen-plus, qwen-max, qwen-turbo, qwen3-235b-a22b
Image Models        dall-e-3, midjourney, stable-diffusion
Embedding Models    text-embedding-3-small, text-embedding-3-large
Reranking           jina-reranker-v2-base-multilingual

For more model details and the latest pricing, please check the "Pricing" page in the console.

Common Error Handling

HTTP Status    Meaning                                          Solution
401            Invalid or missing API Key                       Check if the Key is correct, expired, or disabled
403            No permission to access the model                Check the Key's model restriction settings
429            Request rate too high or insufficient credits    Reduce request frequency or top up credits
500            Internal server error                            Retry later; contact the administrator if the issue persists
503            Service temporarily unavailable                  The model may be temporarily unavailable; retry later
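
The table can be turned into a small lookup so client code reports consistent advice and knows when a retry is worthwhile. This is a sketch: the names are hypothetical, and which statuses count as retryable is an assumption rather than a platform rule.

```python
# Status codes and advice taken from the error table
ERROR_ADVICE = {
    401: "Check if the Key is correct, expired, or disabled",
    403: "Check the Key's model restriction settings",
    429: "Reduce request frequency or top up credits",
    500: "Retry later; contact the administrator if the issue persists",
    503: "The model may be temporarily unavailable; retry later",
}

# Assumption: only these are worth retrying automatically. A 429 caused by
# insufficient credits will keep failing until the account is topped up.
RETRYABLE_STATUSES = {429, 500, 503}


def advice_for(status: int) -> str:
    """Return the suggested fix for an HTTP status code."""
    return ERROR_ADVICE.get(status, f"Unexpected HTTP status {status}")


def should_retry(status: int) -> bool:
    """Tell whether repeating the request might succeed."""
    return status in RETRYABLE_STATUSES
```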

Best Practices

  • Store Keys in environment variables: do not hardcode API Keys in your code
  • Enable streaming: for long responses, streaming lets users see output as soon as it is generated
  • Set a reasonable max_tokens: this avoids consuming excessive credits
  • Handle errors and retries: implement an exponential backoff retry strategy for unstable networks
  • Monitor token usage: regularly check consumption logs to understand usage
  • Choose the right model: use cheaper small models for simple tasks and top-tier models for complex tasks
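
The retry advice in the list above can be sketched as a generic wrapper with exponential backoff and a little jitter. The function name, the default delays, and the choice to retry on any exception are illustrative assumptions; in real code, pass only the exception types that signal a temporary failure.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), waiting 1s, 2s, 4s, ... (capped at max_delay) between failures."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter keeps many clients from retrying in lockstep
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

A call such as retry_with_backoff(lambda: client.chat.completions.create(...)) would then repeat the request after temporary failures. The sleep parameter exists so the waiting can be replaced in tests.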