Need ultra-low latency output? Use stream_text - an async iterator that yields text deltas in real time.
1. Prepare the script

Save the following as stream.py:

import asyncio
from ai_sdk import openai, stream_text

model = openai("gpt-4.1-mini")

async def main():
    # on_chunk fires for each text delta as it arrives
    result = stream_text(
        model=model,
        prompt="Write a short poem about the sea where each line rhymes.",
        on_chunk=lambda d: print(d, end="", flush=True),
    )

    # You can also await the full text once streaming completes:
    full = await result.text()
    print("\n---\nFull text:")
    print(full)

asyncio.run(main())

2. Run it

python stream.py
You’ll see tokens appear immediately instead of waiting for the whole response to buffer.
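
The example above pushes deltas through the on_chunk callback. If you prefer to pull deltas yourself, here is a minimal sketch, assuming the result returned by stream_text is directly async-iterable as the intro describes:

import asyncio
from ai_sdk import openai, stream_text

async def main():
    result = stream_text(
        model=openai("gpt-4.1-mini"),
        prompt="Write a short poem about the sea where each line rhymes.",
    )
    # Assumption: the result object is an async iterator over text deltas.
    async for delta in result:
        print(delta, end="", flush=True)

asyncio.run(main())
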
stream_text accepts the same arguments as generate_text, plus optional callbacks (see the sketch after this list):
  • on_chunk(delta) - called for each text delta
  • on_error(exc) - called if an exception is raised while streaming
  • on_finish(full_text) - called once streaming completes
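
For illustration, here is a sketch wiring up all three callbacks; the bodies of on_error and on_finish are placeholders, not prescribed SDK behavior:

import asyncio
from ai_sdk import openai, stream_text

async def main():
    result = stream_text(
        model=openai("gpt-4.1-mini"),
        prompt="Explain streaming in one sentence.",
        on_chunk=lambda delta: print(delta, end="", flush=True),
        on_error=lambda exc: print(f"\nStream error: {exc}"),
        on_finish=lambda full_text: print(f"\nDone ({len(full_text)} chars)"),
    )
    # Awaiting the full text drives the stream to completion.
    await result.text()

asyncio.run(main())
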
stream_object works the same way, but yields structured objects instead of text.
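
A purely hypothetical sketch of stream_object follows; the schema argument and the shape of the yielded partial objects are assumptions, not documented API:

import asyncio
from ai_sdk import openai, stream_object

async def main():
    result = stream_object(
        model=openai("gpt-4.1-mini"),
        prompt="Describe a recipe as JSON with 'name' and 'steps' fields.",
        schema={"name": str, "steps": list},  # hypothetical schema argument
    )
    # Assumption: iterating yields progressively more complete objects.
    async for partial in result:
        print(partial)

asyncio.run(main())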