Overview

stream_text provides low-latency streaming text generation. It returns a result whose text_stream async iterator yields tokens in real time as they are generated, with optional callbacks for chunk processing, error handling, and completion.

Basic usage

1. Kick off the request

from ai_sdk import stream_text, openai
model = openai("gpt-4.1-mini")
stream_res = stream_text(
    model=model,
    prompt="Write an epic poem about the sea",
)
2. Consume the async iterator

async for delta in stream_res.text_stream:
    print(delta, end="", flush=True)  # real-time!
Alternatively, call await stream_res.text() once to get the full text.
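Both consumption modes produce the same content; iterating gives you each delta as it arrives, while text() buffers everything first. A library-independent sketch (fake_text_stream is an invented stand-in for stream_res.text_stream, not part of ai_sdk):

```python
import asyncio

async def fake_text_stream():
    # Stand-in for stream_res.text_stream: yields chunks as they "arrive".
    for chunk in ["The sea ", "rises ", "and falls."]:
        yield chunk

async def main():
    # Mode 1: react to each delta as it arrives (lowest latency).
    pieces = []
    async for delta in fake_text_stream():
        pieces.append(delta)

    # Mode 2: buffer everything into one string, like await stream_res.text().
    full_text = "".join([c async for c in fake_text_stream()])
    assert full_text == "".join(pieces)
    return full_text

print(asyncio.run(main()))  # The sea rises and falls.
```

Prefer mode 1 for interactive UIs; mode 2 is simpler when you only need the final string.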

Parameters

• model (LanguageModel, required): Provider instance created via e.g. openai() or anthropic().
• prompt (str, one of prompt/messages): User prompt (plain string).
• system (str, optional): System instruction prepended to the conversation.
• messages (List[AnyMessage], one of prompt/messages): Fine-grained message array providing full control over roles & multimodal parts. Overrides prompt.
• tools (List[Tool], optional): Enable iterative tool-calling (see Tool-calling with streaming below).
• max_steps (int, default 8): Safeguard to abort endless tool loops.
• on_step (Callable[[OnStepFinishResult], None], optional): Callback executed after every model ↔ tool round-trip.
• on_chunk (Callable[[str], None], optional): Callback executed for each text chunk.
• on_error (Callable[[Exception], None], optional): Callback executed on exception.
• on_finish (Callable[[str], None], optional): Callback executed once the stream has finished.
• **kwargs (provider-specific): Forwarded verbatim to the underlying SDK, e.g. temperature=0.2.
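The max_steps safeguard bounds the model ↔ tool loop. ai_sdk's actual loop is internal to the library; this pure-Python sketch (respond, the dict shapes, and run_tool_loop are all invented for illustration) shows the control flow the parameter governs:

```python
def run_tool_loop(respond, max_steps=8):
    """Illustrative control flow for max_steps: keep taking model turns
    until the model stops requesting tools or the step budget runs out."""
    steps = 0
    while steps < max_steps:
        steps += 1
        reply = respond(steps)          # stand-in for one model round-trip
        if not reply.get("tool_call"):  # model produced final text: done
            return reply["text"], steps
    return None, steps                  # budget exhausted: loop aborted

# A scripted "model" that calls a tool twice, then answers.
script = iter([{"tool_call": True}, {"tool_call": True}, {"text": "42"}])
text, steps = run_tool_loop(lambda n: next(script))
print(text, steps)  # 42 3
```

Raise max_steps only if your tools legitimately need many round-trips; the default exists to stop a model that keeps requesting tools forever.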

Return value

stream_text returns a StreamTextResult with:
  • text_stream: Async iterator yielding text chunks
  • text(): Async method to get the complete text
  • usage: Token usage statistics
  • finish_reason: Why the stream ended
  • tool_calls: Tool calls if any were made
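To make the relationship between text_stream and text() concrete, here is a toy stand-in for the result object (ToyStreamResult is invented; field names follow the list above, behavior is simplified and not ai_sdk's implementation):

```python
import asyncio

class ToyStreamResult:
    """Toy stand-in for StreamTextResult: text() simply drains text_stream."""
    def __init__(self, chunks):
        self._chunks = chunks
        self.finish_reason = None

    @property
    def text_stream(self):
        # Recreated on each access here for simplicity; a real network
        # stream can typically only be consumed once.
        async def gen():
            for c in self._chunks:
                yield c
            self.finish_reason = "stop"  # set once the stream is exhausted
        return gen()

    async def text(self):
        # Buffer the whole stream into one string.
        return "".join([c async for c in self.text_stream])

async def main():
    res = ToyStreamResult(["Hello, ", "sea!"])
    full = await res.text()
    print(full, res.finish_reason)  # Hello, sea! stop

asyncio.run(main())
```

The point of the sketch: metadata such as finish_reason and usage is only meaningful after the stream has been fully consumed, whether by iteration or by text().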

Examples

Basic streaming

import asyncio
from ai_sdk import stream_text, openai

async def main():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Write a short story about a robot learning to paint"
    )

    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

    print(f"\n\nTotal tokens: {stream_res.usage.completion_tokens}")

asyncio.run(main())

With callbacks

import asyncio
from ai_sdk import stream_text, openai

async def main():
    model = openai("gpt-4.1-mini")

    def on_chunk(chunk):
        print(f"Chunk: {chunk}")

    def on_error(exc):
        print(f"Error: {exc}")

    def on_finish(full_text):
        print(f"Finished! Total length: {len(full_text)}")

    stream_res = stream_text(
        model=model,
        prompt="Explain quantum computing in simple terms",
        on_chunk=on_chunk,
        on_error=on_error,
        on_finish=on_finish
    )

    # Get the full text at once
    full_text = await stream_res.text()
    print(f"Complete text: {full_text}")

asyncio.run(main())

With system instruction

import asyncio
from ai_sdk import stream_text, openai

async def main():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        system="You are a helpful teacher. Explain complex topics in simple terms.",
        prompt="What is machine learning?"
    )

    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())

With custom parameters

import asyncio
from ai_sdk import stream_text, openai

async def main():
    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="Write a creative poem",
        temperature=0.9,
        max_tokens=200
    )

    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())

Tool-calling with streaming

See the dedicated Tool page for a complete walkthrough.
import asyncio
from ai_sdk import tool, stream_text, openai

async def main():
    add = tool(
        name="add",
        description="Add two integers.",
        parameters={
            "type": "object",
            "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
            "required": ["a", "b"],
        },
        execute=lambda a, b: a + b,
    )

    model = openai("gpt-4.1-mini")
    stream_res = stream_text(
        model=model,
        prompt="What is 15 + 27?",
        tools=[add],
    )

    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())

stream_text is provider-agnostic. Swap openai() for anthropic() or any other supported provider; no other code changes are required.