Venice AI

Access 75+ Venice AI models through the Cluster Protocol unified API

75+ AI Models: Frontier, open-source, and private

E2EE Private Inference: End-to-end encrypted models

USDC x402 Payments: Pay-per-request on Base

Quick Start

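Make your first Venice AI request through Cluster Protocol with a single HTTP call. The API is OpenAI-compatible, so any existing OpenAI SDK works once you change the base URL to https://api.clusterprotocol.ai/v1. Each request also carries a provider field, set to "venice".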

curl -X POST https://api.clusterprotocol.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "llama-3.3-70b",
    "provider": "venice",
    "messages": [
      {"role": "user", "content": "Hello from Venice AI!"}
    ]
  }'
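The same request can be built from TypeScript. This is a minimal sketch, assuming a runtime with a global fetch (Node 18+ or a browser). The names CLUSTER_BASE_URL and buildChatRequest are illustrative helpers, not part of the API.

```typescript
// Base URL taken from the curl example above.
const CLUSTER_BASE_URL = "https://api.clusterprotocol.ai/v1";

interface ChatMessage {
  role: string;
  content: string;
}

// Builds the URL and fetch options for a chat completion request.
// buildChatRequest is an illustrative helper, not an SDK function.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${CLUSTER_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, provider: "venice", messages }),
    },
  };
}

const quickStart = buildChatRequest("YOUR_API_KEY", "llama-3.3-70b", [
  { role: "user", content: "Hello from Venice AI!" },
]);

// To send it: const res = await fetch(quickStart.url, quickStart.init);
```

Keeping the builder separate from the network call makes the request easy to inspect or log before it is sent.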

Authentication

Two authentication methods are available. Use an API key for standard access, or use x402 for permissionless, pay-per-request access paid in USDC.

Method 1: API Key

Include your API key in the Authorization header.

http

Authorization: Bearer YOUR_API_KEY

Method 2: x402 (No API Key)

Send requests without an API key. The server responds with HTTP 402 and a payment-required header containing payment details. Pay with USDC on Base (chain 8453) and re-submit with the payment proof.

Item	Value
Chat Completion	$0.003 / request
Image Generation	$0.02 / request
Network	Base (Chain 8453)
Currency	USDC
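As a worked example of the per-request prices, the sketch below converts a decimal USDC amount into integer base units and totals a batch of requests. It assumes the standard 6-decimal USDC token. The names PRICE_USDC, toBaseUnits, and estimateCost are illustrative.

```typescript
// Per-request prices listed above, in USDC.
const PRICE_USDC = { chat: "0.003", image: "0.02" };

// USDC uses 6 decimal places, so 1 USDC = 1,000,000 base units.
const USDC_DECIMALS = 6;

// Converts a decimal string such as "0.003" to integer base units by
// padding the fractional digits, which avoids floating-point rounding.
function toBaseUnits(amount: string): number {
  const [whole, fraction = ""] = amount.split(".");
  const padded = (fraction + "0".repeat(USDC_DECIMALS)).slice(0, USDC_DECIMALS);
  return parseInt(whole, 10) * Math.pow(10, USDC_DECIMALS) + parseInt(padded, 10);
}

// Total cost in base units for a mix of chat and image requests.
function estimateCost(chatRequests: number, imageRequests: number): number {
  return (
    toBaseUnits(PRICE_USDC.chat) * chatRequests +
    toBaseUnits(PRICE_USDC.image) * imageRequests
  );
}
```

For instance, estimateCost(100, 5) is 100 × 3,000 + 5 × 20,000 = 400,000 base units, or 0.40 USDC.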

Chat Completions

Generate text responses from Venice AI models. The endpoint follows the OpenAI chat completions format, with an additional provider field.

POST /v1/chat/completions

Parameters

Parameter	Type	Description
model (required)	string	Model ID from the Venice catalog
provider (required)	string	Set to "venice"
messages (required)	array	Array of message objects with role and content
stream	boolean	Enable SSE streaming (default: false)
max_tokens	integer	Maximum tokens to generate
temperature	number	Sampling temperature (0-2, default: 1)
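The parameter table can be mirrored as a TypeScript type with a small client-side check. This is a sketch only: the names ChatCompletionRequest and validateChatRequest are illustrative, and rejecting an empty messages array or a non-positive max_tokens is an assumption rather than documented server behavior.

```typescript
// Mirrors the parameter table above; optional fields are marked with "?".
interface ChatCompletionRequest {
  model: string;
  provider: "venice";
  messages: { role: string; content: string }[];
  stream?: boolean;
  max_tokens?: number;
  temperature?: number;
}

// Returns a list of problems; an empty list means the request looks valid.
function validateChatRequest(req: ChatCompletionRequest): string[] {
  const problems: string[] = [];
  if (req.model.trim() === "") problems.push("model is required");
  // Assumption: an empty messages array is treated as invalid.
  if (req.messages.length === 0) problems.push("messages must not be empty");
  if (req.temperature !== undefined && (req.temperature < 0 || req.temperature > 2)) {
    problems.push("temperature must be between 0 and 2");
  }
  // Assumption: max_tokens must be a positive integer.
  if (req.max_tokens !== undefined && (!Number.isInteger(req.max_tokens) || req.max_tokens < 1)) {
    problems.push("max_tokens must be a positive integer");
  }
  return problems;
}
```

Checking ranges before sending avoids spending a paid request on a body the server would reject with a 400.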

Example Response

json

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "llama-3.3-70b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
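A typed view of this response makes it easier to read the generated text safely. The sketch below is based only on the fields shown in the example. The names ChatCompletionResponse and firstMessageContent are illustrative.

```typescript
// Shape of the example response above.
interface ChatCompletionResponse {
  id: string;
  object: string;
  model: string;
  choices: {
    index: number;
    message: { role: string; content: string };
    finish_reason: string | null;
  }[];
  usage?: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Returns the first choice's text, or null when no choice is present.
function firstMessageContent(res: ChatCompletionResponse): string | null {
  return res.choices[0]?.message.content ?? null;
}

// The example response, reproduced as a typed value.
const sampleResponse: ChatCompletionResponse = {
  id: "chatcmpl-abc123",
  object: "chat.completion",
  model: "llama-3.3-70b",
  choices: [
    {
      index: 0,
      message: { role: "assistant", content: "Hello! How can I help you today?" },
      finish_reason: "stop",
    },
  ],
  usage: { prompt_tokens: 12, completion_tokens: 9, total_tokens: 21 },
};
```

In the example, usage.total_tokens (21) is the sum of prompt_tokens (12) and completion_tokens (9).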

Image Generation

Generate images using Venice AI image models through Cluster Protocol.

POST /v1/images/generations

Parameters

Parameter	Type	Description
model (required)	string	Image model ID
provider (required)	string	Set to "venice"
prompt (required)	string	Text description of the desired image
n	integer	Number of images to generate (1-4)
size	string	Image dimensions (e.g. "1024x1024")

curl -X POST https://api.clusterprotocol.ai/v1/images/generations \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "fluently-xl",
    "provider": "venice",
    "prompt": "A futuristic city skyline at sunset",
    "n": 1,
    "size": "1024x1024"
  }'
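The request body can also be assembled in TypeScript with the documented range for n enforced up front. This is a sketch: buildImageRequest is an illustrative helper, and the WIDTHxHEIGHT check on size is an assumption generalized from the single "1024x1024" example.

```typescript
interface ImageGenerationRequest {
  model: string;
  provider: "venice";
  prompt: string;
  n?: number;
  size?: string;
}

// Builds a request body, rejecting values outside the documented ranges.
// Optional fields are only included when the caller supplies them.
function buildImageRequest(
  model: string,
  prompt: string,
  options: { n?: number; size?: string } = {},
): ImageGenerationRequest {
  const { n, size } = options;
  if (n !== undefined && (!Number.isInteger(n) || n < 1 || n > 4)) {
    throw new RangeError("n must be an integer from 1 to 4");
  }
  // Assumption: sizes follow the WIDTHxHEIGHT pattern of the example.
  if (size !== undefined && !/^\d+x\d+$/.test(size)) {
    throw new RangeError('size must look like "1024x1024"');
  }
  return { model, provider: "venice", prompt, ...options };
}
```

Calling buildImageRequest("fluently-xl", "A futuristic city skyline at sunset", { n: 1, size: "1024x1024" }) yields the same JSON body as the curl example.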

Model Catalog

Venice AI provides access to 75+ models across multiple categories. Set the model parameter to any model ID below.

Frontier (22 models)

claude-opus-4-5, claude-opus-4-6, claude-opus-4-6-fast, claude-opus-4-7, claude-opus-4-7-fast, claude-sonnet-4-5, claude-sonnet-4-6, openai-gpt-52, openai-gpt-52-codex, openai-gpt-53-codex, openai-gpt-54, openai-gpt-54-mini, openai-gpt-54-pro, openai-gpt-55, openai-gpt-55-pro, openai-gpt-4o-2024-11-20, openai-gpt-4o-mini-2024-07-18, gemini-3-1-pro-preview, gemini-3-flash-preview, grok-4-20, grok-4-20-multi-agent, grok-4-3

Open Source (31 models)

llama-3.2-3b, llama-3.3-70b, hermes-3-llama-3.1-405b, deepseek-v3.2, deepseek-v4-flash, deepseek-v4-pro, qwen-3-6-plus, qwen3-235b-a22b-instruct-2507, qwen3-235b-a22b-thinking-2507, qwen3-5-35b-a3b, qwen3-5-397b-a17b, qwen3-5-9b, qwen3-6-27b, qwen3-coder-480b-a35b-instruct-turbo, qwen3-next-80b, qwen3-vl-235b-a22b, mistral-small-2603, mistral-small-3-2-24b-instruct, google-gemma-3-27b-it, google-gemma-4-26b-a4b-it, google-gemma-4-31b-it, nvidia-nemotron-3-nano-30b-a3b, nvidia-nemotron-cascade-2-30b-a3b, minimax-m25, minimax-m27, kimi-k2-5, kimi-k2-6, mercury-2, aion-labs-aion-2-0, arcee-trinity-large-thinking, openai-gpt-oss-120b

E2EE Private (11 models)

e2ee-gemma-3-27b-p, e2ee-glm-4-7-flash-p, e2ee-glm-4-7-p, e2ee-glm-5-1, e2ee-gpt-oss-120b-p, e2ee-gpt-oss-20b-p, e2ee-qwen-2-5-7b-p, e2ee-qwen3-30b-a3b-p, e2ee-qwen3-5-122b-a10b, e2ee-qwen3-vl-30b-a3b-p, e2ee-venice-uncensored-24b-p

Uncensored (4 models)

venice-uncensored-1-2, venice-uncensored-role-play, gemma-4-uncensored, olafangensan-glm-4.7-flash-heretic

GLM Series (7 models)

z-ai-glm-5-turbo, z-ai-glm-5v-turbo, zai-org-glm-4.6, zai-org-glm-4.7, zai-org-glm-4.7-flash, zai-org-glm-5, zai-org-glm-5-1
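A client may want to look up a model's category before sending a request, for example to route sensitive prompts only to E2EE models. The sketch below uses a small excerpt of the catalog. The names ModelCategory, categoryOf, and looksEndToEndEncrypted are illustrative, and the "e2ee-" prefix check is an observation about the IDs listed here, not a documented guarantee.

```typescript
type ModelCategory = "frontier" | "open-source" | "e2ee-private" | "uncensored" | "glm";

// A small excerpt of the catalog above; extend it with the IDs you use.
const MODEL_CATEGORIES = new Map<string, ModelCategory>([
  ["claude-sonnet-4-6", "frontier"],
  ["llama-3.3-70b", "open-source"],
  ["e2ee-gpt-oss-120b-p", "e2ee-private"],
  ["venice-uncensored-1-2", "uncensored"],
  ["zai-org-glm-4.7", "glm"],
]);

// Returns the category for a known model ID, or undefined otherwise.
function categoryOf(modelId: string): ModelCategory | undefined {
  return MODEL_CATEGORIES.get(modelId);
}

// Every E2EE Private ID in the catalog starts with "e2ee-". Treating the
// prefix as the marker is an assumption drawn from that list.
function looksEndToEndEncrypted(modelId: string): boolean {
  return modelId.startsWith("e2ee-");
}
```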

x402 Payment Integration

The x402 protocol enables permissionless API access without pre-registration. Send a request without an API key, receive payment instructions, pay with USDC, and re-submit with proof.

Flow

1. Send request without Authorization header

2. Server responds HTTP 402 with payment-required header (base64 JSON)

3. Decode payment info, execute USDC transfer on Base (chain 8453)

4. Re-submit original request with payment proof in header

Example: Handling 402 Response

typescript

const response = await fetch("https://api.clusterprotocol.ai/v1/chat/completions", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama-3.3-70b",
    provider: "venice",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});

if (response.status === 402) {
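  const paymentHeader = response.headers.get("payment-required");
  if (paymentHeader === null) {
    throw new Error("402 response is missing the payment-required header");
  }
  const paymentInfo = JSON.parse(atob(paymentHeader));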

  // paymentInfo contains:
  // - recipient: USDC recipient address
  // - amount: amount in USDC (e.g. "0.003")
  // - chain: 8453 (Base)
  // - token: USDC contract address

  // Execute payment on Base, then re-submit with proof
}
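The four-step flow can be wrapped in a single helper that pays and re-submits once. This is a sketch under stated assumptions: requestWithX402, decodePaymentInfo, and FetchLike are illustrative names, and both the proof header name and the pay callback are supplied by the caller because this page specifies neither the header that carries the proof nor how the transfer is executed.

```typescript
// Fields listed in the comments of the example above.
interface PaymentInfo {
  recipient: string;
  amount: string;
  chain: number;
  token: string;
}

// Minimal shape of the fetch API used here, so a stub can be passed in tests.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ status: number; headers: { get(name: string): string | null } }>;

// Decodes the base64 JSON carried by the payment-required header and
// refuses anything that is not on Base (chain 8453).
function decodePaymentInfo(headerValue: string): PaymentInfo {
  const info = JSON.parse(atob(headerValue)) as PaymentInfo;
  if (Number(info.chain) !== 8453) {
    throw new Error(`Unexpected chain ${info.chain}; expected Base (8453)`);
  }
  return info;
}

// Sends the request, pays on a 402, then re-submits once with the proof.
// proofHeader and pay are caller-supplied assumptions (see the lead-in).
async function requestWithX402(
  fetchFn: FetchLike,
  url: string,
  body: unknown,
  proofHeader: string,
  pay: (info: PaymentInfo) => Promise<string>,
) {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  const init = { method: "POST", headers, body: JSON.stringify(body) };

  const first = await fetchFn(url, init);
  if (first.status !== 402) return first;

  const headerValue = first.headers.get("payment-required");
  if (headerValue === null) {
    throw new Error("402 response is missing the payment-required header");
  }

  const proof = await pay(decodePaymentInfo(headerValue));
  return fetchFn(url, { ...init, headers: { ...headers, [proofHeader]: proof } });
}
```

With the global fetch, a call might look like requestWithX402(fetch, url, body, proofHeaderName, payWithWallet), where proofHeaderName and payWithWallet are placeholders for values taken from your x402 client or wallet integration.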

Streaming

Enable real-time token streaming via Server-Sent Events (SSE) by setting stream: true. Each chunk contains a delta of the response.

curl -X POST https://api.clusterprotocol.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "llama-3.3-70b",
    "provider": "venice",
    "messages": [{"role": "user", "content": "Write a haiku"}],
    "stream": true
  }'

SSE Format

text

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
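On the client, the stream is consumed by reading each data: line, stopping at [DONE], and concatenating the delta contents. The sketch below does this for text that has already been received. collectDeltas is an illustrative name, and the chunk shape is taken only from the sample events above.

```typescript
// Extracts the text deltas from SSE text in the format shown above.
// Stops at the [DONE] sentinel and skips lines that are not data events.
function collectDeltas(sseText: string): {
  text: string;
  finishReason: string | null;
  done: boolean;
} {
  let text = "";
  let finishReason: string | null = null;
  let done = false;

  for (const line of sseText.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const chunk = JSON.parse(payload) as {
      choices: { delta: { content?: string }; finish_reason: string | null }[];
    };
    for (const choice of chunk.choices) {
      // The final chunk carries an empty delta and a finish_reason.
      text += choice.delta.content ?? "";
      if (choice.finish_reason !== null) finishReason = choice.finish_reason;
    }
  }
  return { text, finishReason, done };
}
```

When reading from a live connection, network chunks can end in the middle of a line, so buffer incoming text and pass only complete lines to a parser like this one.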

Error Handling

The API returns standard HTTP status codes. Error responses include a JSON body with details about what went wrong.

Status	Meaning	Action
400	Bad Request	Check request body and parameters
401	Unauthorized	Verify your API key
402	Payment Required	Complete x402 payment or add funds
404	Not Found	Verify model ID and endpoint path
429	Rate Limited	Back off and retry with exponential delay
500	Server Error	Retry after a brief delay
503	Model Unavailable	The model is temporarily offline; try another
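The actions in the table can be encoded as a small decision function. This is a sketch: planRecovery and backoffDelayMs are illustrative names, and the 500 ms base delay, 30 s cap, and limit of 4 retries are example values rather than documented limits.

```typescript
type Recovery =
  | { action: "retry"; delayMs: number }
  | { action: "switch-model" }
  | { action: "fail" };

// Exponential backoff: baseMs * 2^attempt, capped at capMs.
// attempt is 0 for the first retry.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Maps a status code to the action suggested by the table above.
function planRecovery(status: number, attempt: number, maxRetries = 4): Recovery {
  // 503: the model is temporarily offline, so try another model.
  if (status === 503) return { action: "switch-model" };
  // 429 and 500: retry with a growing delay until the retry budget is spent.
  if ((status === 429 || status === 500) && attempt < maxRetries) {
    return { action: "retry", delayMs: backoffDelayMs(attempt) };
  }
  // 400, 401, 402, 404: fix the request, key, payment, or model ID first.
  return { action: "fail" };
}
```

With these example values the delays for attempts 0 through 3 are 500, 1000, 2000, and 4000 ms.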

Error Response Format

json

{
  "error": {
    "message": "Invalid model ID: unknown-model",
    "type": "invalid_request_error",
    "code": "model_not_found"
  }
}
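Because an error body arrives as untyped JSON, it helps to check its shape before reading fields. The sketch below validates the three fields shown in the example. ApiError and parseApiError are illustrative names.

```typescript
// The error object shown in the example above.
interface ApiError {
  message: string;
  type: string;
  code: string;
}

// Returns the error object if the body matches the shape above, else null.
function parseApiError(body: unknown): ApiError | null {
  if (typeof body !== "object" || body === null) return null;
  const error = (body as { error?: unknown }).error;
  if (typeof error !== "object" || error === null) return null;
  const { message, type, code } = error as Record<string, unknown>;
  if (typeof message !== "string" || typeof type !== "string" || typeof code !== "string") {
    return null;
  }
  return { message, type, code };
}
```

A caller can then branch on the code field, for example treating "model_not_found" as a prompt to pick a different model ID.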
