Stream text asynchronously for low-latency real-time generation with optional callbacks.
`stream_text` is an async iterator for low-latency streaming text generation. It provides real-time access to tokens as they're generated, with optional callbacks for chunk processing, error handling, and completion.
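A minimal runnable sketch of the consumption pattern described below. `StreamTextResult` and `stream_text` here are illustrative stubs that mirror the fields documented on this page; the real implementation talks to a provider.

```python
import asyncio

# Illustrative stub of the documented result object (not the real SDK):
# text_stream yields chunks, text() returns the full text.
class StreamTextResult:
    def __init__(self, chunks):
        self._chunks = list(chunks)

    @property
    def text_stream(self):
        async def gen():
            for chunk in self._chunks:
                yield chunk
        return gen()

    async def text(self):
        return "".join(self._chunks)

def stream_text(model=None, prompt=""):
    # A real implementation would kick off the provider request here.
    return StreamTextResult(["Hello", ", ", "world"])

async def main():
    # 1. Kick off the request
    stream_res = stream_text(model=None, prompt="Say hello")
    # 2. Consume the async iterator
    pieces = []
    async for chunk in stream_res.text_stream:
        pieces.append(chunk)
    # 3. Await text() once to get the full text
    full = await stream_res.text()
    return pieces, full

pieces, full = asyncio.run(main())
```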
1. Kick off the request.
2. Consume the async iterator.
3. `await stream_res.text()` once to get the full text.

Name | Type | Required | Description |
---|---|---|---|
model | LanguageModel | ✓ | Provider instance created via e.g. `openai()` or `anthropic()`. |
prompt | str | one of `prompt`/`messages` | User prompt (plain string). |
system | str | – | System instruction prepended to the conversation. |
messages | List[AnyMessage] | – | Fine-grained message array providing full control over roles and multimodal parts. Overrides `prompt`. |
tools | List[Tool] | – | Enables iterative tool calling (see further below). |
max_steps | int | – (default 8) | Safeguard that aborts endless tool loops. |
on_step | Callable[[OnStepFinishResult], None] | – | Callback executed after every model ↔ tool round trip. |
on_chunk | Callable[[str], None] | – | Callback executed for each text chunk. |
on_error | Callable[[Exception], None] | – | Callback executed on exception. |
on_finish | Callable[[str], None] | – | Callback executed once the stream has finished. |
**kwargs | provider-specific | – | Forwarded verbatim to the underlying SDK, e.g. `temperature=0.2`. |
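A sketch of how the `on_chunk`, `on_error`, and `on_finish` hooks from the table fire, driven by a stubbed chunk source rather than a real provider stream. The callback names follow this page; `run_with_callbacks` is a hypothetical helper for illustration only.

```python
import asyncio

async def run_with_callbacks(chunks, on_chunk, on_error, on_finish):
    # Hypothetical driver: each incoming chunk triggers on_chunk; an
    # exception triggers on_error; on_finish fires once with the full text.
    collected = []
    try:
        for chunk in chunks:          # stands in for the provider stream
            on_chunk(chunk)
            collected.append(chunk)
            await asyncio.sleep(0)    # yield control, as a real stream would
    except Exception as exc:
        on_error(exc)
        raise
    full = "".join(collected)
    on_finish(full)
    return full

seen = []
final = {}
asyncio.run(run_with_callbacks(
    ["a", "b", "c"],
    on_chunk=seen.append,
    on_error=lambda exc: print("error:", exc),
    on_finish=lambda text: final.setdefault("text", text),
))
```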
`stream_text` returns a `StreamTextResult` with:

- `text_stream`: async iterator yielding text chunks
- `text()`: async method to get the complete text
- `usage`: token usage statistics
- `finish_reason`: why the stream ended
- `tool_calls`: tool calls, if any were made

`stream_text` is provider-agnostic. Swap `openai()` for `anthropic()` or any other future implementation; no code changes required.
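The provider-agnostic design can be sketched as follows. Both factories below are hypothetical stand-ins satisfying the same minimal "LanguageModel" interface; the point is that only the `model` argument changes between providers, while the calling code stays identical.

```python
import asyncio

class _EchoModel:
    """Toy model used only to illustrate a shared provider interface."""
    def __init__(self, name):
        self.name = name

    async def stream(self, prompt):
        for word in (f"[{self.name}]", prompt):
            yield word

def openai():      # hypothetical factory, per this page
    return _EchoModel("openai")

def anthropic():   # hypothetical factory, per this page
    return _EchoModel("anthropic")

async def generate(model, prompt):
    # Identical for every provider: only `model` differs.
    return " ".join([chunk async for chunk in model.stream(prompt)])

a = asyncio.run(generate(openai(), "hi"))
b = asyncio.run(generate(anthropic(), "hi"))
```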