Skip to content

AI Runtime

Multi-vendor LLM chat, embeddings, vision, image generation, speech and live sessions.

Overview

Base path: /v1/ai
Python namespace: ai
text
# Python
from infrai import ai

# TypeScript
import { InfraiClient } from "@infrai/sdk";
const client = new InfraiClient({ apiKey: process.env.INFRAI_API_KEY! });
// → client.ai

Methods

ai.chat

POST /v1/ai/chat

Multi-vendor chat completion with quality-based routing, tools, structured output, caching and batch mode.

Python

python
ai.chat(prompt, *, model=None, vendor=None, task="general", prefer="balanced", stream=False, response_format=None, tools=None, cache_strategy="vendor", batch_mode=False, max_cost_multiplier=1.5, timeout_seconds=60, idempotency_key=None)

TypeScript

typescript
client.ai.chat(opts: ChatOptions): Promise<ChatResult>

Parameters

NameTypeRequiredDescription
messagesstring | ChatMessage[]
Required
A prompt string, or a list of role/content messages.
modelstringOptionalExplicit model id; skips task/prefer routing.
vendorstringOptionalPin a specific AI vendor.
task"general" | "reasoning" | "coding" | "long_context"OptionalUse-case axis (general/reasoning/coding/long_context); routes to a model family.
prefer"balanced" | "cheapest" | "smartest"OptionalOptimization axis: balanced (best value) | cheapest | smartest.
temperaturenumberOptionalSampling temperature.
max_tokensnumberOptionalMaximum tokens to generate.
toolsTool[]OptionalTool/function-calling definitions.
response_format{ type: "json_object" | "text" | "json_schema" }OptionalForce structured output (JSON object or schema).
cache_strategy"vendor" | "infrai" | "none"OptionalWhich cache layer to use.
batch_modebooleanOptionalRun as a batch job for a discounted rate.
max_cost_multipliernumberOptionalCap failover cost relative to the primary vendor.
streambooleanOptionalSet true to stream; use the streaming method.
idempotency_keystringOptionalOptional dedup key; identical retries return the same result.

Returns

ChatResult { content, finish_reason, usage, metadata }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

# stream=True asks the gateway to deliver the completion incrementally.
result = ai.chat(
    "Tell me a 50-word story about a curious raccoon.",
    prefer="balanced",
    stream=True,
)
print(result.get("text") or result.get("content"))
print("request_id:", result["_metadata"].get("request_id"))
print("vendor:", result["_metadata"].get("vendor"))

ai.chat

POST /v1/ai/chat

Stream chat tokens as Server-Sent Events; iterate the chunks as they arrive.

Python

python
ai.chat(prompt, *, stream=True)  # iterate chunks

TypeScript

typescript
client.ai.streamChat(opts: ChatOptions): AsyncIterable<ChatStreamChunk>

Parameters

NameTypeRequiredDescription
messagesstring | ChatMessage[]
Required
A prompt string, or a list of role/content messages.
signalAbortSignalOptionalAn AbortSignal to cancel the stream.

Returns

AsyncIterable<ChatStreamChunk { delta, finish_reason, index }>

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

for chunk in ai.chat("Tell me a 50-word story about a curious raccoon.", stream=True):
    print(chunk.delta.content or "", end="", flush=True)

ai.embed

POST /v1/ai/embed

Create text embeddings, single or batched.

Python

python
ai.embed.create(text, *, model=None, vendor=None, dimensions=None, idempotency_key=None)

TypeScript

typescript
client.ai.embed(opts: EmbedOptions): Promise<EmbedResult>

Parameters

NameTypeRequiredDescription
inputstring | string[]
Required
Text or list of texts to embed.
modelstringOptionalExplicit model id; skips task/prefer routing.
dimensionsnumberOptionalTarget embedding dimension count.
idempotency_keystringOptionalOptional dedup key; identical retries return the same result.

Returns

EmbedResult { embeddings, model, usage, metadata }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

# Embed text once for semantic search / RAG, then rank by cosine similarity.
result = ai.embed.create("What animal has clever paws?")
data = result.get("data") or result.get("embeddings") or []
vector = data[0].get("embedding") if data and isinstance(data[0], dict) else data[0]
print("dimensions:", len(vector or []))

ai.image

POST /v1/ai/image

Generate images from a text prompt across image vendors.

Python

python
ai.image.generate(prompt, *, size=None, n=None, model=None, vendor=None, prefer="balanced")

TypeScript

typescript
client.ai.image(opts: ImageOptions): Promise<{ images }>

Parameters

NameTypeRequiredDescription
promptstring
Required
Text description of the desired image.
sizestringOptionalOutput size, e.g. 1024x1024.
nnumberOptionalNumber of images to generate.
prefer"balanced" | "cheapest" | "smartest"OptionalOptimization axis: balanced (best value) | cheapest | smartest.

Returns

{ images: Array<{ url, b64? }> }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

result = ai.image.generate(
    "A watercolor raccoon reading a book under a maple tree",
    size="1024x1024",
    n=1,
    prefer="balanced",
)
images = result.get("images") or result.get("data") or []
print("first url:", images[0].get("url") if images else None)

ai.vision

POST /v1/ai/vision

Reason over one or more images with a prompt.

Python

python
ai.vision(prompt=..., images=[...], model=None, vendor=None)

TypeScript

typescript
client.ai.vision(opts: VisionOptions): Promise<ChatResult>

Parameters

NameTypeRequiredDescription
promptstring
Required
Text description of the desired image.
imagesArray<{ url } | { base64, mime? }>
Required
Image references as URLs or base64.

Returns

ChatResult

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

r = ai.vision(prompt="What's in this image?",
              images=[{"url": "https://example.com/photo.jpg"}])
print(r.get("content") or r.get("text"))

ai.tts

POST /v1/ai/tts

Synthesize speech audio from text.

Python

python
ai.tts(input=..., voice=None, model=None, vendor=None, format=None)

TypeScript

typescript
client.ai.tts(opts: TtsOptions): Promise<{ audio_url, format }>

Parameters

NameTypeRequiredDescription
inputstring
Required
Text to synthesize into speech.
voicestringOptionalVoice id to use.
format"mp3" | "wav" | "ogg" | "pcm"OptionalOutput audio format.

Returns

{ audio_url, format }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

r = ai.tts(input="Hello from Infrai.", voice="alloy", format="mp3")
print("audio url:", r.get("audio_url"))

ai.asr

POST /v1/ai/asr

Transcribe audio to text with optional language hints.

Python

python
ai.asr(audio=..., language=None, model=None, vendor=None)

TypeScript

typescript
client.ai.asr(opts: AsrOptions): Promise<{ text, language, segments? }>

Parameters

NameTypeRequiredDescription
audio{ url } | { base64, mime? }
Required
Audio reference as URL or base64.
languagestringOptionalLanguage hint, e.g. en or zh.

Returns

{ text, language, segments? }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

r = ai.asr(audio={"url": "https://example.com/clip.mp3"}, language="en")
print("transcript:", r.get("text"))

ai.realtime.token

POST /v1/ai/realtime/token

Mint a short-lived token for a WebRTC live session.

Python

python
ai.realtime.token(model=None, voice=None, ttl_seconds=None)

TypeScript

typescript
client.ai.liveToken(opts): Promise<LiveSessionToken>

Parameters

NameTypeRequiredDescription
modelstringOptionalExplicit model id; skips task/prefer routing.
voicestringOptionalVoice id to use.
ttl_secondsnumberOptionalLifetime in seconds before expiry.

Returns

LiveSessionToken { session_id, ice_servers, token, expires_at }

Example

一次性前置(每个范例都假定已完成):

bash
pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...
python
from infrai import ai

tok = ai.realtime.token(ttl_seconds=300)
print("session:", tok.get("session_id"), "| expires:", tok.get("expires_at"))