Claude, GPT‑4o, Gemini, and Groq behind one unified API. 9 models. 4 providers. Real-time streaming. Zero re-implementation.
Five focused endpoints. No bloat. No setup friction. Swap providers per request.
From zero to live AI in the time it takes to brew coffee.
Request access, then send your key in the X-API-Key header on every request. Done in 10 seconds.
Set "provider" to "claude", "openai", "gemini", or "groq". Switch any time, per request.
POST to /ask for single-turn questions or /chat for multi-turn sessions. Structured AI responses in milliseconds.
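The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not an official client: the helper name `build_ask_request` and the `PROVIDERS` tuple are hypothetical, while the URL, header, and JSON fields come from the examples on this page. The request is built but not sent, so the sketch runs without a real key.

```python
import json
import urllib.request

BASE = "https://axiom-ai-production-aaec.up.railway.app"
# Provider names as listed on this page.
PROVIDERS = ("claude", "openai", "gemini", "groq")


def build_ask_request(question, provider="claude", api_key="YOUR_KEY"):
    """Build (but do not send) a POST /ask request for the chosen provider.

    Hypothetical helper for illustration; "claude" is used as the default
    because the page describes it as the default provider.
    """
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    body = json.dumps({"question": question, "provider": provider}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/ask",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


# Same question, different provider on each request: only one field changes.
requests_by_provider = {
    name: build_ask_request("What is RAG?", provider=name) for name in PROVIDERS
}
```

To actually send one of these, pass it to `urllib.request.urlopen(...)` with a real key and decode the JSON body; the Python example further down reads the reply from its `"answer"` field.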
Switch between Claude, GPT-4.1, Gemini 2.5, and Groq with a single field — no re-implementation, no SDK swaps.
The default provider. Exceptional reasoning, structured outputs, and RAG pipelines. 200K context window with industry-leading consistency at low latency.
Cost-efficient and industry-standard. Ideal for code generation, JSON mode, and when clients prefer the OpenAI ecosystem.
Google's multimodal flagship. Industry-leading 1M token context window — perfect for huge documents, long conversations, and multimodal workloads.
Hardware-accelerated inference on Groq's custom LPUs. Open-source models running up to 10× faster than GPU providers. Best-in-class token throughput.
Production-ready examples. Copy. Paste. Ship.
# Single-turn Q&A ── Claude ──────────────────────────────────────
curl -X POST https://axiom-ai-production-aaec.up.railway.app/ask \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"question": "What is RAG?", "provider": "claude"}'

# Multi-turn chat ── start a new session ─────────────────────────
curl -X POST https://axiom-ai-production-aaec.up.railway.app/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"message": "Hello!", "provider": "openai"}'

# Custom system prompt ───────────────────────────────────────────
curl -X POST https://axiom-ai-production-aaec.up.railway.app/ask \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_KEY" \
  -d '{"question": "Review my code", "system": "You are a senior Python engineer. Be blunt.", "provider": "claude"}'
# pip install requests
import requests

BASE = "https://axiom-ai-production-aaec.up.railway.app"
HEADERS = {
    "X-API-Key": "YOUR_KEY",
    "Content-Type": "application/json"
}

# Single-turn Q&A
r = requests.post(f"{BASE}/ask", headers=HEADERS, json={
    "question": "What is RAG?",
    "provider": "claude"
})
print(r.json()["answer"])

# Multi-turn chat
r = requests.post(f"{BASE}/chat", headers=HEADERS, json={
    "message": "Hello!",
    "provider": "openai"
})
session_id = r.json()["session_id"]  # save for next message
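The Python example saves `session_id` "for next message" but stops before using it. The sketch below shows one way a follow-up turn might look. It rests on a loud assumption: that /chat accepts the saved ID back in a request field also named `"session_id"`. This page does not confirm that field name, so check the endpoint reference before relying on it. `ChatSession` and `post_chat` are hypothetical names, and the sketch uses the standard library so it runs without extra installs.

```python
import json
import urllib.request

BASE = "https://axiom-ai-production-aaec.up.railway.app"
HEADERS = {"X-API-Key": "YOUR_KEY", "Content-Type": "application/json"}


def post_chat(payload):
    """Send one POST /chat request and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers=HEADERS,
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


class ChatSession:
    """Hypothetical wrapper that carries session_id from one turn to the next."""

    def __init__(self, provider, send=post_chat):
        self.provider = provider
        self.send = send          # injectable, so the flow can be tested offline
        self.session_id = None    # unknown until the first reply arrives

    def say(self, message):
        payload = {"message": message, "provider": self.provider}
        if self.session_id is not None:
            # ASSUMPTION: the request field is named "session_id",
            # mirroring the response field shown in the example above.
            payload["session_id"] = self.session_id
        data = self.send(payload)
        self.session_id = data["session_id"]  # response field from the example above
        return data
```

Usage would be `chat = ChatSession("openai")`, then `chat.say("Hello!")` followed by further `chat.say(...)` calls; the first call starts a session and later calls reuse its ID.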
// Single-turn Q&A
const BASE = 'https://axiom-ai-production-aaec.up.railway.app';

const r = await fetch(`${BASE}/ask`, {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'X-API-Key': 'YOUR_KEY'
  },
  body: JSON.stringify({
    question: 'What is RAG?',
    provider: 'claude'
  })
});

const data = await r.json();
console.log(data.answer); // done!
Explore every endpoint, fire live requests, and integrate in minutes.