Need ultra-low latency output? Use `stream_text`, an async iterator that yields text deltas in real time.
Prepare the script
```python
import asyncio

from ai_sdk import openai, stream_text

model = openai("gpt-4.1-mini")

async def main():
    result = stream_text(
        model=model,
        prompt="Write a short poem about the sea where each line rhymes.",
        on_chunk=lambda d: print(d, end="", flush=True),
    )
    # Alternatively, await the full text:
    full = await result.text()
    print("\n---\nFull text:")
    print(full)

asyncio.run(main())
```
Run it
You’ll see tokens appear immediately instead of buffering the full response.
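Since the result is an async iterator, you can also consume the deltas directly with `async for` instead of a callback. A minimal sketch of that pattern, using a stand-in async generator (`fake_stream` below is hypothetical, not part of ai_sdk) in place of a live model call:

```python
import asyncio

async def fake_stream():
    # Stand-in for a stream_text result: yields text deltas one at a time.
    for delta in ["Waves ", "crash ", "on ", "the ", "shore."]:
        await asyncio.sleep(0)  # simulate a pause between network chunks
        yield delta

async def main():
    chunks = []
    # With a real result object this loop would read: async for delta in result:
    async for delta in fake_stream():
        chunks.append(delta)
        print(delta, end="", flush=True)
    print()
    return "".join(chunks)

text = asyncio.run(main())
```

Each delta prints as soon as it arrives, and joining the chunks reassembles the full text.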
`stream_text` accepts the same arguments as `generate_text`, plus optional callbacks:

- `on_chunk(delta)`: called with each text delta
- `on_error(exc)`: called with any exception raised while streaming
- `on_finish(full_text)`: called once the stream completes
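The callback flow above can be sketched with a stand-in driver loop (`drive_stream` below is illustrative, not the ai_sdk implementation; it only shows when each callback fires):

```python
import asyncio

async def drive_stream(deltas, on_chunk=None, on_error=None, on_finish=None):
    # Hypothetical driver illustrating the callback contract:
    # on_chunk fires per delta, on_error on failure, on_finish with the full text.
    parts = []
    try:
        for delta in deltas:
            await asyncio.sleep(0)  # simulate waiting on the network
            parts.append(delta)
            if on_chunk:
                on_chunk(delta)
    except Exception as exc:
        if on_error:
            on_error(exc)
        raise
    full = "".join(parts)
    if on_finish:
        on_finish(full)
    return full

events = []
full = asyncio.run(drive_stream(
    ["The ", "sea ", "sings."],
    on_chunk=lambda d: events.append(("chunk", d)),
    on_finish=lambda t: events.append(("finish", t)),
))
```

After the run, `events` holds one `("chunk", ...)` entry per delta followed by a single `("finish", ...)` entry with the complete text.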
`stream_object` is similar, but yields objects instead of text.
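The yields-objects idea can be sketched with another stand-in generator (the partial-dict stream below is illustrative only, not the ai_sdk API): each yielded value is a progressively more complete snapshot of the final object.

```python
import asyncio

async def fake_object_stream():
    # Stand-in: yields increasingly complete snapshots of a structured result.
    yield {"title": "The Sea"}
    yield {"title": "The Sea", "lines": ["Waves crash"]}
    yield {"title": "The Sea", "lines": ["Waves crash", "Gulls cry"]}

async def main():
    last = None
    async for obj in fake_object_stream():
        last = obj  # keep the most recent (fullest) snapshot
    return last

final = asyncio.run(main())
```

This mirrors streaming structured output: consumers can render partial objects immediately and keep only the latest snapshot.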