Get your first response in under 2 minutes. OriginalPoint is OpenAI-compatible — if you've used the OpenAI SDK, you already know how to use this API.
Create a free account at originalpoint.ai/signup. After signup, navigate to Dashboard → API Keys and create your first key. Keys look like: op_xxxxxxxxxxxxxxxxxxxx
```shell
# Python
pip install openai

# Node.js
npm install openai
```
```python
from openai import OpenAI

client = OpenAI(
    api_key="op_your_key_here",
    base_url="https://api.originalpoint.ai/v1"
)

response = client.chat.completions.create(
    model="gpt-5",  # or "auto" for smart routing
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2+2?"}
    ]
)

print(response.choices[0].message.content)  # → "4"
```

Only two settings differ from a stock OpenAI integration: api_key (your OP key) and base_url. All models, parameters, and response formats are identical.
All API requests require a Bearer token in the Authorization header.
```
Authorization: Bearer op_your_key_here
```
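To see the header in context without the SDK, here is a sketch of a raw request assembled with the standard library. The key is a placeholder; the request is only built, not sent:

```python
import json
import urllib.request

API_KEY = "op_your_key_here"  # placeholder; substitute your real op_... key

payload = {
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "What is 2+2?"}],
}

# Build the request object (without sending it) to show the header shape.
req = urllib.request.Request(
    "https://api.originalpoint.ai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

print(req.get_header("Authorization"))  # → Bearer op_your_key_here
```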
You can create, rotate, and revoke keys from Dashboard → API Keys. Best practices:

- Store keys in environment variables or a secrets manager; never commit them to source control.
- Use a separate key per application or environment so one can be revoked without affecting the rest.
- Rotate keys periodically, and revoke any key you suspect has been exposed.
POST /v1/chat/completions
Creates a model response for the given chat conversation. Fully compatible with the OpenAI Chat Completions API.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Required | Model ID (e.g. gpt-5, claude-opus-4.5) or "auto" for smart routing. |
| messages | array | Required | Array of message objects. Each has role (system/user/assistant) and content (string or array for vision). |
| temperature | number | Optional | Sampling temperature 0–2. Higher = more random. Default: 1. Mutually exclusive with top_p. |
| max_tokens | integer | Optional | Maximum tokens to generate. Model-specific maximum applies. If omitted, model uses its default max. |
| stream | boolean | Optional | If true, returns a stream of Server-Sent Events. Each event contains a partial completion chunk. Default: false. |
| top_p | number | Optional | Nucleus sampling: considers tokens comprising the top top_p probability mass. Range: 0–1. Default: 1. |
| n | integer | Optional | Number of completion choices to generate. Each uses additional tokens. Default: 1. |
| stop | string \| array | Optional | Up to 4 sequences where generation stops. The token triggering stop is not included in the output. |
| user | string | Optional | End-user ID for abuse monitoring. Passed through to providers that support it. |
| extra_body.routing | string | Optional | OriginalPoint routing mode: "cost", "latency", or "reliability". Only used when model="auto". |
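To make the parameters concrete, here is a sketch of a request body combining several of them (all values are illustrative, not recommended defaults):

```python
payload = {
    "model": "gpt-5",
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "List three primary colors."},
    ],
    "temperature": 0.2,   # low randomness for a factual task
    "max_tokens": 64,     # cap the completion length
    "stop": ["\n\n"],     # stop at the first blank line (at most 4 sequences)
    "n": 1,               # a single completion choice
    "user": "user-1234",  # end-user ID for abuse monitoring
}

# Pass these as keyword arguments: client.chat.completions.create(**payload)
```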
```json
{
  "id": "chatcmpl-op_abc123",
  "object": "chat.completion",
  "created": 1746123456,
  "model": "gpt-5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "4"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 23,
    "completion_tokens": 1,
    "total_tokens": 24,
    "cost_usd": 0.000061  // OriginalPoint extension field
  }
}
```
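A response body like the sample above can be consumed with plain dict access. A quick sketch pulling out the fields you usually care about (the inline comment is removed so the string parses as strict JSON):

```python
import json

# The sample response body from above, as a raw string.
raw = """
{
  "id": "chatcmpl-op_abc123",
  "object": "chat.completion",
  "created": 1746123456,
  "model": "gpt-5",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant", "content": "4"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 23, "completion_tokens": 1,
            "total_tokens": 24, "cost_usd": 0.000061}
}
"""

data = json.loads(raw)
answer = data["choices"][0]["message"]["content"]
finished = data["choices"][0]["finish_reason"] == "stop"
cost = data["usage"]["cost_usd"]

print(answer, finished, cost)  # → 4 True 6.1e-05
```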
GET /v1/models
Returns a list of all available models with their IDs, providers, and capabilities.
```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-5",
      "object": "model",
      "created": 1746000000,
      "owned_by": "openai",
      "context_length": 131072,
      "pricing": {
        "input_per_million": 2.50,
        "output_per_million": 10.00
      }
    },
    {
      "id": "claude-opus-4.5",
      "object": "model",
      "created": 1746000000,
      "owned_by": "anthropic",
      "context_length": 204800,
      "pricing": {
        "input_per_million": 15.00,
        "output_per_million": 75.00
      }
    }
    // ... 14 more models
  ]
}
```
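The pricing block makes per-request cost estimates straightforward. A small sketch using the two example entries above (rates are USD per million tokens, copied from the sample response):

```python
# Rates from the sample /v1/models response above (USD per million tokens).
PRICING = {
    "gpt-5":           {"input_per_million": 2.50,  "output_per_million": 10.00},
    "claude-opus-4.5": {"input_per_million": 15.00, "output_per_million": 75.00},
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from per-million-token rates."""
    p = PRICING[model]
    return (prompt_tokens * p["input_per_million"]
            + completion_tokens * p["output_per_million"]) / 1_000_000

# 1,000 prompt tokens + 500 completion tokens on gpt-5:
print(f"{estimate_cost('gpt-5', 1000, 500):.4f}")  # → 0.0075
```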
```python
models = client.models.list()
for model in models.data:
    print(model.id, model.owned_by)
```
Set stream=True to receive a Server-Sent Events stream. Tokens are emitted as they're generated — ideal for chat UIs.
```python
stream = client.chat.completions.create(
    model="claude-sonnet-4",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True
)

# Each chunk carries a partial delta; print tokens as they arrive.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```
Use the official openai Python package — it's fully compatible.
```shell
pip install openai
```
```shell
export OPENAI_API_KEY="op_your_key_here"
export OPENAI_BASE_URL="https://api.originalpoint.ai/v1"
```
```python
from openai import OpenAI

# Reads OPENAI_API_KEY and OPENAI_BASE_URL from the environment
client = OpenAI()

resp = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(resp.choices[0].message.content)
```
Use the official openai npm package.
```shell
npm install openai
```
```javascript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: process.env.ORIGINALPOINT_API_KEY,
  baseURL: 'https://api.originalpoint.ai/v1'
});

const response = await client.chat.completions.create({
  model: 'gemini-3-pro',
  messages: [{ role: 'user', content: 'Explain quantum entanglement simply.' }],
  max_tokens: 256
});

console.log(response.choices[0].message.content);
```
Set model="auto" to enable smart routing. The OriginalPoint router selects the optimal model in real-time based on your configured routing mode.
When you set model="auto", the router evaluates all 16 available models against your request's requirements (token budget, context length, modality) and applies the routing strategy you specify.
```python
response = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user", "content": "Summarize this text: ..."}],
    extra_body={
        "routing": "cost"  # Always selects the cheapest capable model
    }
)

# Check which model was actually used
print(f"Routed to: {response.model}")
print(f"Cost: ${response.usage.cost_usd:.6f}")
```
| Mode | Value | Best For |
|---|---|---|
| Cost | cost | High-volume tasks, classification, extraction, summarization where cost matters most |
| Latency | latency | Real-time chat UIs, interactive applications, any UX where response time is critical |
| Reliability | reliability | Production workflows, automated pipelines where 100% success rate matters |
If you set model="auto" without specifying a routing mode, "cost" is used. You can also set a default routing mode per key in the dashboard.
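The routing logic runs server-side, but the idea behind "cost" mode can be illustrated client-side: filter to models whose context window fits the request, then take the cheapest. The candidate list below is illustrative (drawn from the sample pricing earlier), not the actual router:

```python
# Illustrative candidates: (model, context_length, input price per million tokens).
CANDIDATES = [
    ("gpt-5",           131_072, 2.50),
    ("claude-opus-4.5", 204_800, 15.00),
]

def route_by_cost(prompt_tokens: int) -> str:
    """Pick the cheapest model whose context window fits the prompt."""
    capable = [m for m in CANDIDATES if m[1] >= prompt_tokens]
    if not capable:
        raise ValueError("no model has a large enough context window")
    return min(capable, key=lambda m: m[2])[0]

print(route_by_cost(50_000))   # → gpt-5
print(route_by_cost(180_000))  # → claude-opus-4.5
```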
Connect your own provider API keys to route through your credentials and pay provider rates directly.
Paste your OpenAI (sk-...), Anthropic (sk-ant-...), or Google API key. It's encrypted immediately.

OriginalPoint uses standard HTTP status codes. Error bodies follow the OpenAI error format.
| Status | Code | Meaning |
|---|---|---|
| 200 | ok | Request succeeded |
| 400 | invalid_request | Malformed request (missing required field, invalid model ID, etc.) |
| 401 | invalid_api_key | API key missing, invalid, or revoked |
| 402 | quota_exceeded | Monthly spend cap or included token limit reached |
| 429 | rate_limit_exceeded | Too many requests. Check Retry-After header. |
| 500 | internal_error | Unexpected server error. If persistent, check status page. |
| 503 | provider_unavailable | Provider is down. Use "routing": "reliability" to auto-failover. |
| Plan | Requests/min | Tokens/min | Concurrent |
|---|---|---|---|
| Free | 60 | 150K | 5 |
| Pro | 300 | 1M | 25 |
| Enterprise | Custom | Custom | Custom |
Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.