Overview
The Anthropic provider offers access to Claude models through an OpenAI-compatible API interface. It uses the OpenAI SDK under the hood to communicate with Anthropic’s API, providing seamless integration with the AI SDK.
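Conceptually, the provider is a thin wrapper that points the OpenAI client at Anthropic's OpenAI-compatible endpoint. A minimal sketch of the idea (illustrative only, not the provider's actual source):

import os
from openai import OpenAI

# Roughly what anthropic(...) does internally: reuse the OpenAI client
# against Anthropic's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["ANTHROPIC_API_KEY"],
    base_url="https://api.anthropic.com/v1/",
)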
Quick Start
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(model=model, prompt="Hello, world!")
print(res.text)
Available Models
Claude Models
Model | Description | Max Tokens | Cost (Input / Output) |
---|---|---|---|
claude-4-opus | Most capable (Opus 4) | 200k | $15 / 1M input, $75 / 1M output |
claude-4-sonnet | Balanced performance (Sonnet 4) | 200k | $3 / 1M input, $15 / 1M output |
claude-3.7-sonnet | Sonnet 3.7 (legacy) | 200k | $3 / 1M input, $15 / 1M output |
claude-3.5-sonnet | Sonnet 3.5 (legacy) | 200k | $3 / 1M input, $15 / 1M output |
claude-3.5-haiku | Haiku 3.5 (fast, cost-effective) | 200k | $0.80 / 1M input, $4 / 1M output |
claude-3-opus-20240229 | Opus 3 (legacy) | 200k | $15 / 1M input, $75 / 1M output |
claude-3-haiku-20240307 | Haiku 3 (legacy, fast & efficient) | 200k | $0.25 / 1M input, $1.25 / 1M output |
Basic Usage
Text Generation
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(
    model=model,
    prompt="Write a haiku about artificial intelligence"
)
print(res.text)
Streaming
import asyncio
from ai_sdk import anthropic, stream_text
async def main():
    model = anthropic("claude-3-haiku-20240307")
    stream_res = stream_text(
        model=model,
        prompt="Tell me a story about a robot learning to paint"
    )
    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)

asyncio.run(main())
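If you need the complete response as well as the incremental chunks, you can accumulate the stream yourself; a small sketch using the same text_stream attribute:

async def collect(prompt: str) -> str:
    model = anthropic("claude-3-haiku-20240307")
    stream_res = stream_text(model=model, prompt=prompt)
    chunks = []
    async for chunk in stream_res.text_stream:
        print(chunk, end="", flush=True)  # show progress as it arrives
        chunks.append(chunk)
    return "".join(chunks)  # full text once the stream finishes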
With System Instructions
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(
    model=model,
    system="You are a helpful coding assistant. Always provide clear, concise explanations with code examples.",
    prompt="Explain what recursion is in simple terms"
)
print(res.text)
Advanced Features
Structured Output
Claude models support structured output through the OpenAI compatibility layer:
from ai_sdk import anthropic, generate_object
from pydantic import BaseModel
class Recipe(BaseModel):
    title: str
    ingredients: list[str]
    instructions: list[str]
    prep_time: int

model = anthropic("claude-3-haiku-20240307")
res = generate_object(
    model=model,
    schema=Recipe,
    prompt="Create a recipe for chocolate chip cookies"
)
print(res.object) # Recipe(title='Chocolate Chip Cookies', ...)
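Because the result is a Pydantic model instance, the fields come back typed and validated:

recipe = res.object
print(recipe.title)             # str
print(len(recipe.ingredients))  # list[str], so len() and iteration work
print(recipe.prep_time + 5)     # int, so arithmetic is safe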
Function Calling
Claude models support function calling through the OpenAI compatibility layer:
from ai_sdk import anthropic, generate_text, tool
def calculate_math(expression: str) -> str:
    """Evaluate a mathematical expression."""
    # Note: eval() is unsafe with untrusted input; use a proper
    # expression parser in production code.
    try:
        result = eval(expression)
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}"

calc_tool = tool(
    name="calculate",
    description="Evaluate mathematical expressions",
    parameters={
        "type": "object",
        "properties": {
            "expression": {"type": "string", "description": "Mathematical expression to evaluate"}
        },
        "required": ["expression"]
    },
    execute=calculate_math
)

model = anthropic("claude-3-haiku-20240307")
res = generate_text(
    model=model,
    prompt="What is 15 * 7 + 23?",
    tools=[calc_tool]
)
print(res.text)
Chat-based Conversations
from ai_sdk import anthropic, generate_text
from ai_sdk.types import CoreSystemMessage, CoreUserMessage, CoreAssistantMessage
model = anthropic("claude-3-haiku-20240307")
# Start a conversation
messages = [
    CoreSystemMessage(content="You are a helpful coding assistant."),
    CoreUserMessage(content="What is Python?"),
    CoreAssistantMessage(content="Python is a high-level, interpreted programming language known for its simplicity and readability."),
    CoreUserMessage(content="What are its main features?")
]
res = generate_text(model=model, messages=messages)
print(res.text)
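To continue the conversation, append the assistant's reply and the next user turn to the same list; a sketch reusing the message types above:

# Feed the model's answer back in, then ask a follow-up
messages.append(CoreAssistantMessage(content=res.text))
messages.append(CoreUserMessage(content="Show me a short example of one of those features."))

followup = generate_text(model=model, messages=messages)
print(followup.text)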
Configuration
API Key
Set your Anthropic API key:
export ANTHROPIC_API_KEY="sk-ant-..."
Or pass it directly:
model = anthropic("claude-3-haiku-20240307", api_key="sk-ant-...")
Default Parameters
Configure default parameters for all requests:
model = anthropic(
    "claude-3-haiku-20240307",
    temperature=0.7,
    max_tokens=1000,
    top_p=0.9,
    user="my-app/123"  # For analytics
)
Custom Base URL
For enterprise or custom deployments:
model = anthropic(
    "claude-3-haiku-20240307",
    base_url="https://api.anthropic.com/v1/",
    api_key="sk-ant-..."
)
Parameters
Common Parameters
Parameter | Type | Default | Description |
---|---|---|---|
model | str | - | Model identifier (required) |
api_key | str | None | API key (uses ANTHROPIC_API_KEY env var) |
base_url | str | https://api.anthropic.com/v1/ | API base URL |
Generation Parameters
Parameter | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Controls randomness (0.0 = deterministic) |
max_tokens | int | 8192 | Maximum tokens to generate |
top_p | float | 1.0 | Nucleus sampling parameter |
top_k | int | None | Top-k sampling parameter |
user | str | None | User identifier for analytics |
Note: unlike OpenAI models, Anthropic models automatically set max_tokens=8192 if it is not specified.
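For short-form answers you can cap generation explicitly rather than relying on the 8192 default; a sketch using the same constructor parameters shown under Default Parameters:

# Cap output at 256 tokens for short answers; output tokens are the
# pricier side ($1.25/1M vs $0.25/1M input for Haiku 3).
short_model = anthropic("claude-3-haiku-20240307", max_tokens=256)
res = generate_text(model=short_model, prompt="Define recursion in one sentence.")
print(res.text)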
Model Comparison
Model | Speed | Capability | Cost | Best For |
---|---|---|---|---|
claude-3.5-sonnet | Fast | High | Medium | General use, complex reasoning |
claude-3.5-haiku | Very Fast | Good | Low | Simple tasks, high volume |
claude-3-opus | Slow | Highest | High | Complex analysis, research |
claude-3-sonnet | Medium | High | Medium | Balanced performance |
claude-3-haiku | Fast | Good | Low | Quick responses, cost-sensitive |
Token Limits
All Claude models support up to 200,000 tokens for input and output combined, making them suitable for long documents and conversations.
Error Handling
Rate Limiting
import time
from ai_sdk import anthropic, generate_text
def generate_with_retry(prompt, max_retries=3):
    model = anthropic("claude-3-haiku-20240307")
    for attempt in range(max_retries):
        try:
            return generate_text(model=model, prompt=prompt)
        except Exception as e:
            if "rate_limit" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt
                print(f"Rate limited, waiting {wait_time}s...")
                time.sleep(wait_time)
                continue
            raise

res = generate_with_retry("Hello!")
Token Management
from ai_sdk import anthropic, generate_text
def estimate_tokens(text):
    """Rough token estimation for Claude models."""
    # Claude uses a different tokenizer than GPT models;
    # a rough rule of thumb is 1 token ≈ 3.5 characters.
    return len(text) / 3.5

def truncate_prompt(prompt, max_tokens=150000):
    """Truncate a prompt to fit within token limits."""
    estimated_tokens = estimate_tokens(prompt)
    if estimated_tokens > max_tokens:
        # Leave room for the response
        max_chars = (max_tokens - 10000) * 3.5
        return prompt[:int(max_chars)] + "..."
    return prompt
model = anthropic("claude-3-haiku-20240307")
long_prompt = "A very long prompt..." * 10000
truncated = truncate_prompt(long_prompt)
res = generate_text(model=model, prompt=truncated)
Best Practices
1. Model Selection
Choose the right Claude model for your use case:
# For simple tasks - fast and cheap
model = anthropic("claude-3-haiku-20240307")
# For complex reasoning - balanced performance
model = anthropic("claude-3-sonnet-20240229")
# For research and analysis - most capable
model = anthropic("claude-3-opus-20240229")
2. Cost Optimization
Monitor and optimize costs:
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(model=model, prompt="Hello!")
if res.usage:
    input_cost = res.usage.prompt_tokens * 0.00000025       # $0.25 / 1M tokens
    output_cost = res.usage.completion_tokens * 0.00000125  # $1.25 / 1M tokens
    total_cost = input_cost + output_cost
    print(f"Cost: ${total_cost:.6f}")
3. Claude-Specific Prompting
Claude models respond well to clear, structured prompts:
# Good - clear structure
prompt = """
Please analyze the following Python code and provide:
1. A brief overview of what the code does
2. Any potential issues or improvements
3. A refactored version if needed
Code:
{code}
"""
# Avoid - vague instructions
prompt = "Look at this code and tell me what you think"
4. System Instructions
Claude models are particularly good at following system instructions:
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(
    model=model,
    system="""You are a helpful coding assistant with expertise in Python.
When reviewing code, always:
1. Explain the code's purpose clearly
2. Identify potential issues or improvements
3. Provide specific, actionable suggestions
4. Include code examples when helpful""",
    prompt="Review this function: def add(a, b): return a + b"
)
print(res.text)
5. Long Context Usage
Claude models excel at processing long documents:
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-sonnet-20240229") # Higher token limit
# Process a long document
with open("long_document.txt", "r") as f:
document = f.read()
res = generate_text(
model=model,
prompt=f"""Please summarize the key points from this document:
{document}
Provide a concise summary with the main themes and important details."""
)
print(res.text)
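For inputs that exceed even the 200k window, one pragmatic pattern is chunked summarization: summarize pieces, then summarize the summaries. A sketch (the chunk size is illustrative, chosen to stay well under the window at roughly 3.5 characters per token):

def summarize_long(document: str, chunk_chars: int = 300_000) -> str:
    model = anthropic("claude-3-sonnet-20240229")
    # Summarize each chunk independently...
    partials = []
    for i in range(0, len(document), chunk_chars):
        chunk = document[i:i + chunk_chars]
        res = generate_text(model=model, prompt=f"Summarize the key points:\n\n{chunk}")
        partials.append(res.text)
    # ...then merge the partial summaries into one.
    combined = "\n\n".join(partials)
    final = generate_text(
        model=model,
        prompt=f"Merge these partial summaries into one concise summary:\n\n{combined}",
    )
    return final.text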
Troubleshooting
Common Issues
1. Invalid API Key
   - Check that your API key starts with sk-ant-
   - Ensure you have sufficient credits

2. Model Not Found
   Error: The model `claude-3-haiku-20240307` does not exist
   - Verify the model name spelling
   - Check whether the model is available in your region

3. Rate Limiting
   Error: Rate limit exceeded
   - Implement exponential backoff (see Rate Limiting above)
   - Consider upgrading your plan

4. Token Limit Exceeded
   - Reduce the input length
   - Use a model with higher token limits
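The SDK's exact exception classes aren't listed here, so a defensive pattern is to match on the error message, as in the Rate Limiting example above. A sketch; the string checks are assumptions about message content, not a documented API:

from ai_sdk import anthropic, generate_text

def classify_error(e: Exception) -> str:
    """Best-effort classification based on the error message text."""
    msg = str(e).lower()
    if "rate limit" in msg or "rate_limit" in msg:
        return "rate_limited"   # back off and retry
    if "does not exist" in msg:
        return "bad_model"      # check the model name
    if "api key" in msg or "authentication" in msg:
        return "bad_key"        # check ANTHROPIC_API_KEY
    return "unknown"

try:
    res = generate_text(model=anthropic("claude-3-haiku-20240307"), prompt="Hello!")
except Exception as e:
    print(f"Request failed ({classify_error(e)}): {e}")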
Debug Mode
Enable detailed logging:
import logging
logging.basicConfig(level=logging.DEBUG)
from ai_sdk import anthropic, generate_text
model = anthropic("claude-3-haiku-20240307")
res = generate_text(model=model, prompt="Hello!")
Limitations
Current Limitations
- No Embeddings: Anthropic doesn’t provide embedding models through this interface
- Vision Models: Limited vision support compared to OpenAI
- Function Calling: Uses OpenAI compatibility layer, may have limitations
Workarounds
For embeddings, consider using OpenAI’s embedding models:
from ai_sdk import openai, embed_many
# Use OpenAI for embeddings
embed_model = openai.embedding("text-embedding-3-small")
result = embed_many(model=embed_model, values=["text1", "text2"])
Claude models are particularly good at following instructions and maintaining context. Use clear, structured prompts for best results.