AI Runtime

Multi-vendor LLM chat, embeddings, vision, image generation, speech and live sessions.

Overview

Base path: /v1/ai

Python namespace: ai

text

# Python
from infrai import ai

# TypeScript
import { InfraiClient } from "@infrai/sdk";
const client = new InfraiClient({ apiKey: process.env.INFRAI_API_KEY! });
// → client.ai

Methods

ai.chat

POST /v1/ai/chat

Multi-vendor chat completion with quality-based routing, tools, structured output, caching and batch mode.

Python

python

ai.chat(prompt, *, model=None, vendor=None, task="general", prefer="balanced", stream=False, response_format=None, tools=None, cache_strategy="vendor", batch_mode=False, max_cost_multiplier=1.5, timeout_seconds=60, idempotency_key=None)

TypeScript

typescript

client.ai.chat(opts: ChatOptions): Promise<ChatResult>

Parameters

Name	Type	Required	Description
`messages`	`string \| ChatMessage[]`	Required	A prompt string, or a list of role/content messages.
`model`	`string`	Optional	Explicit model id; skips task/prefer routing.
`vendor`	`string`	Optional	Pin a specific AI vendor.
`task`	`"general" \| "reasoning" \| "coding" \| "long_context"`	Optional	Use-case axis (general/reasoning/coding/long_context); routes to a model family.
`prefer`	`"balanced" \| "cheapest" \| "smartest"`	Optional	Optimization axis: balanced (best value) \| cheapest \| smartest.
`temperature`	`number`	Optional	Sampling temperature.
`max_tokens`	`number`	Optional	Maximum tokens to generate.
`tools`	`Tool[]`	Optional	Tool/function-calling definitions.
`response_format`	`{ type: "json_object" \| "text" \| "json_schema" }`	Optional	Force structured output (JSON object or schema).
`cache_strategy`	`"vendor" \| "infrai" \| "none"`	Optional	Which cache layer to use.
`batch_mode`	`boolean`	Optional	Run as a batch job for a discounted rate.
`max_cost_multiplier`	`number`	Optional	Cap failover cost relative to the primary vendor.
`stream`	`boolean`	Optional	Set true to stream; use the streaming method.
`idempotency_key`	`string`	Optional	Optional dedup key; identical retries return the same result.

Returns

ChatResult { content, finish_reason, usage, metadata }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

# stream=True asks the gateway to deliver the completion incrementally.
result = ai.chat(
    "Tell me a 50-word story about a curious raccoon.",
    prefer="balanced",
    stream=True,
)
print(result.get("text") or result.get("content"))
print("request_id:", result["_metadata"].get("request_id"))
print("vendor:", result["_metadata"].get("vendor"))

ai.chat

POST /v1/ai/chat

Stream chat tokens as Server-Sent Events; iterate the chunks as they arrive.

Python

python

ai.chat(prompt, *, stream=True)  # iterate chunks

TypeScript

typescript

client.ai.streamChat(opts: ChatOptions): AsyncIterable<ChatStreamChunk>

Parameters

Name	Type	Required	Description
`messages`	`string \| ChatMessage[]`	Required	A prompt string, or a list of role/content messages.
`signal`	`AbortSignal`	Optional	An AbortSignal to cancel the stream.

Returns

AsyncIterable<ChatStreamChunk { delta, finish_reason, index }>

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

for chunk in ai.chat("Tell me a 50-word story about a curious raccoon.", stream=True):
    print(chunk.delta.content or "", end="", flush=True)

ai.embed

POST /v1/ai/embed

Create text embeddings, single or batched.

Python

python

ai.embed.create(text, *, model=None, vendor=None, dimensions=None, idempotency_key=None)

TypeScript

typescript

client.ai.embed(opts: EmbedOptions): Promise<EmbedResult>

Parameters

Name	Type	Required	Description
`input`	`string \| string[]`	Required	Text or list of texts to embed.
`model`	`string`	Optional	Explicit model id; skips task/prefer routing.
`dimensions`	`number`	Optional	Target embedding dimension count.
`idempotency_key`	`string`	Optional	Optional dedup key; identical retries return the same result.

Returns

EmbedResult { embeddings, model, usage, metadata }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

# Embed text once for semantic search / RAG, then rank by cosine similarity.
result = ai.embed.create("What animal has clever paws?")
data = result.get("data") or result.get("embeddings") or []
vector = data[0].get("embedding") if data and isinstance(data[0], dict) else data[0]
print("dimensions:", len(vector or []))

ai.image

POST /v1/ai/image

Generate images from a text prompt across image vendors.

Python

python

ai.image.generate(prompt, *, size=None, n=None, model=None, vendor=None, prefer="balanced")

TypeScript

typescript

client.ai.image(opts: ImageOptions): Promise<{ images }>

Parameters

Name	Type	Required	Description
`prompt`	`string`	Required	Text description of the desired image.
`size`	`string`	Optional	Output size, e.g. 1024x1024.
`n`	`number`	Optional	Number of images to generate.
`prefer`	`"balanced" \| "cheapest" \| "smartest"`	Optional	Optimization axis: balanced (best value) \| cheapest \| smartest.

Returns

{ images: Array<{ url, b64? }> }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

result = ai.image.generate(
    "A watercolor raccoon reading a book under a maple tree",
    size="1024x1024",
    n=1,
    prefer="balanced",
)
images = result.get("images") or result.get("data") or []
print("first url:", images[0].get("url") if images else None)

ai.vision

POST /v1/ai/vision

Reason over one or more images with a prompt.

Python

python

ai.vision(prompt=..., images=[...], model=None, vendor=None)

TypeScript

typescript

client.ai.vision(opts: VisionOptions): Promise<ChatResult>

Parameters

Name	Type	Required	Description
`prompt`	`string`	Required	Text description of the desired image.
`images`	`Array<{ url } \| { base64, mime? }>`	Required	Image references as URLs or base64.

Returns

ChatResult

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

r = ai.vision(prompt="What's in this image?",
              images=[{"url": "https://example.com/photo.jpg"}])
print(r.get("content") or r.get("text"))

ai.tts

POST /v1/ai/tts

Synthesize speech audio from text.

Python

python

ai.tts(input=..., voice=None, model=None, vendor=None, format=None)

TypeScript

typescript

client.ai.tts(opts: TtsOptions): Promise<{ audio_url, format }>

Parameters

Name	Type	Required	Description
`input`	`string`	Required	Text to synthesize into speech.
`voice`	`string`	Optional	Voice id to use.
`format`	`"mp3" \| "wav" \| "ogg" \| "pcm"`	Optional	Output audio format.

Returns

{ audio_url, format }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

r = ai.tts(input="Hello from Infrai.", voice="alloy", format="mp3")
print("audio url:", r.get("audio_url"))

ai.asr

POST /v1/ai/asr

Transcribe audio to text with optional language hints.

Python

python

ai.asr(audio=..., language=None, model=None, vendor=None)

TypeScript

typescript

client.ai.asr(opts: AsrOptions): Promise<{ text, language, segments? }>

Parameters

Name	Type	Required	Description
`audio`	`{ url } \| { base64, mime? }`	Required	Audio reference as URL or base64.
`language`	`string`	Optional	Language hint, e.g. en or zh.

Returns

{ text, language, segments? }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

r = ai.asr(audio={"url": "https://example.com/clip.mp3"}, language="en")
print("transcript:", r.get("text"))

ai.realtime.token

POST /v1/ai/realtime/token

Mint a short-lived token for a WebRTC live session.

Python

python

ai.realtime.token(model=None, voice=None, ttl_seconds=None)

TypeScript

typescript

client.ai.liveToken(opts): Promise<LiveSessionToken>

Parameters

Name	Type	Required	Description
`model`	`string`	Optional	Explicit model id; skips task/prefer routing.
`voice`	`string`	Optional	Voice id to use.
`ttl_seconds`	`number`	Optional	Lifetime in seconds before expiry.

Returns

LiveSessionToken { session_id, ice_servers, token, expires_at }

Example

一次性前置(每个范例都假定已完成):

bash

pip install infrai
# one-time auth (no secret needed): anonymous account + trial, writes ~/.infrai/credentials
python -c "from infrai import infra; infra.activate()"
# returning user instead: export INFRAI_API_KEY=ifr_pk_proj_...

python

from infrai import ai

tok = ai.realtime.token(ttl_seconds=300)
print("session:", tok.get("session_id"), "| expires:", tok.get("expires_at"))