**For AI Agents & Developers:** Use this guide to choose between the OpenAI, Anthropic, and Ollama primitives.
| Priority | Best Choice | Why |
|---|---|---|
| Quality | OpenAIPrimitive (GPT-4) | Best reasoning, most capable |
| Cost | OllamaPrimitive | 100% free, runs locally |
| Privacy | OllamaPrimitive | Data never leaves your machine |
| Speed | OpenAIPrimitive (GPT-4o-mini) | Fastest API response |
| Long Context | AnthropicPrimitive (Claude) | 200K+ token context window |
| Safety | AnthropicPrimitive (Claude) | Best at refusing harmful requests |
| Simplicity | OpenAIPrimitive | Easiest to get started |
| Feature | OpenAI | Anthropic | Ollama |
|---|---|---|---|
| Best Model | GPT-4o | Claude 3.5 Sonnet | Llama 3.2 |
| Cost (1M tokens) | $2.50-$15 | $3-$15 | $0 (free) |
| Free Tier | $5 credit | ❌ No | ✅ Unlimited |
| Setup Difficulty | ⭐ Easy | ⭐ Easy | ⭐⭐⭐ Medium |
| API Latency | ~1-2s | ~1-2s | ~5-10s (local) |
| Context Window | 128K tokens | 200K tokens | 128K tokens |
| Privacy | ⚠️ Cloud | ⚠️ Cloud | ✅ 100% local |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Deployment | ✅ Easy | ✅ Easy | ⚠️ Need GPU |
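The priority table above can be expressed as a plain lookup. This is a sketch for quick reference, not part of the `tta_dev_primitives` API; the mapping simply mirrors the table:

```python
# Map each priority from the table above to the recommended primitive.
# Plain dictionary, not a library API.
RECOMMENDED = {
    "quality": "OpenAIPrimitive (GPT-4)",
    "cost": "OllamaPrimitive",
    "privacy": "OllamaPrimitive",
    "speed": "OpenAIPrimitive (GPT-4o-mini)",
    "long_context": "AnthropicPrimitive (Claude)",
    "safety": "AnthropicPrimitive (Claude)",
    "simplicity": "OpenAIPrimitive",
}

def recommend(priority: str) -> str:
    """Return the recommended primitive for a given priority."""
    return RECOMMENDED[priority.lower()]

print(recommend("privacy"))  # OllamaPrimitive
```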
"""Simple chatbot using OpenAIPrimitive"""
from tta_dev_primitives.integrations import OpenAIPrimitive, OpenAIRequest
from tta_dev_primitives.core.base import WorkflowContext
import asyncio
import os
async def main():
# Create primitive (uses GPT-4o-mini by default)
llm = OpenAIPrimitive(api_key=os.getenv("OPENAI_API_KEY"))
context = WorkflowContext(workflow_id="chatbot")
# Send message
request = OpenAIRequest(
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Explain async/await in Python"}
],
temperature=0.7
)
response = await llm.execute(request, context)
print(f"Assistant: {response.content}")
if __name__ == "__main__":
asyncio.run(main())
**Cost:** ~$0.0001 per request (GPT-4o-mini) · **Speed:** ~1-2 seconds · **Quality:** ⭐⭐⭐⭐⭐
"""Document analysis using AnthropicPrimitive"""
from tta_dev_primitives.integrations import AnthropicPrimitive, AnthropicRequest
from tta_dev_primitives.core.base import WorkflowContext
import asyncio
import os
async def main():
# Create primitive (uses Claude 3.5 Sonnet)
llm = AnthropicPrimitive(api_key=os.getenv("ANTHROPIC_API_KEY"))
context = WorkflowContext(workflow_id="doc-analysis")
# Analyze long document (up to 200K tokens)
with open("long_document.txt") as f:
document = f.read()
request = AnthropicRequest(
messages=[
{"role": "user", "content": f"Summarize this document:\n\n{document}"}
],
system="You are a technical document analyst.",
max_tokens=1000
)
response = await llm.execute(request, context)
print(f"Summary: {response.content}")
if __name__ == "__main__":
asyncio.run(main())
**Cost:** ~$0.003 per request (Claude 3.5 Sonnet) · **Speed:** ~2-3 seconds · **Quality:** ⭐⭐⭐⭐⭐ · **Context:** Up to 200K tokens
"""Private chatbot using OllamaPrimitive"""
from tta_dev_primitives.integrations import OllamaPrimitive, OllamaRequest
from tta_dev_primitives.core.base import WorkflowContext
import asyncio
async def main():
# Create primitive (runs locally, no API key needed)
llm = OllamaPrimitive(model="llama3.2")
context = WorkflowContext(workflow_id="private-chat")
# Send message (data never leaves your machine)
request = OllamaRequest(
messages=[
{"role": "user", "content": "Explain quantum computing"}
],
temperature=0.7
)
response = await llm.execute(request, context)
print(f"Assistant: {response.content}")
if __name__ == "__main__":
asyncio.run(main())
**Cost:** $0 (free) · **Speed:** ~5-10 seconds (depends on GPU) · **Quality:** ⭐⭐⭐⭐ · **Privacy:** ✅ 100% local
| Model | Input Cost (per 1M tokens) |
|---|---|
| GPT-4o-mini | $0.15 |
| GPT-4o | $2.50 |
| GPT-4 Turbo | $10.00 |
| Claude 3.5 Sonnet | $3.00 |
| Claude 3 Opus | $15.00 |
| Ollama (any model) | $0.00 |
| Model | Output Cost (per 1M tokens) |
|---|---|
| GPT-4o-mini | $0.60 |
| GPT-4o | $10.00 |
| GPT-4 Turbo | $30.00 |
| Claude 3.5 Sonnet | $15.00 |
| Claude 3 Opus | $75.00 |
| Ollama (any model) | $0.00 |
Example: 1000 requests with 500 input + 500 output tokens each:
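To make the example concrete, a small helper can compute the totals from the pricing tables above. The function and price dictionary are illustrative (not part of the library); prices are copied directly from the input and output tables:

```python
# Cost of 1000 requests with 500 input + 500 output tokens each,
# using the per-1M-token prices from the tables above.
PRICES = {  # model: (input $/1M tokens, output $/1M tokens)
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
    "gpt-4-turbo": (10.00, 30.00),
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-opus": (15.00, 75.00),
    "ollama": (0.00, 0.00),
}

def total_cost(model: str, requests: int = 1000,
               input_tokens: int = 500, output_tokens: int = 500) -> float:
    """Total dollar cost for a batch of identical requests."""
    in_price, out_price = PRICES[model]
    return requests * (input_tokens * in_price + output_tokens * out_price) / 1_000_000

for model in PRICES:
    print(f"{model}: ${total_cost(model):.2f}")
```

For this workload, GPT-4o-mini comes to about $0.38 total, while Claude 3 Opus reaches $45.00 and Ollama stays at $0.00.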
```python
# Use Ollama for free unlimited testing
llm = OllamaPrimitive(model="llama3.2")

# Use OpenAI for cost-effective production
llm = OpenAIPrimitive(model="gpt-4o-mini")

# Use Claude for complex reasoning
llm = AnthropicPrimitive(model="claude-3-5-sonnet-20241022")
```
Use RouterPrimitive to combine multiple LLMs:
```python
from tta_dev_primitives import RouterPrimitive
from tta_dev_primitives.integrations import (
    OpenAIPrimitive,
    AnthropicPrimitive,
    OllamaPrimitive,
)

# Create router with fallback strategy
router = RouterPrimitive(
    routes={
        "fast": OpenAIPrimitive(model="gpt-4o-mini"),  # Default
        "quality": AnthropicPrimitive(),               # For complex tasks
        "free": OllamaPrimitive(),                     # For development
    },
    default_route="fast",
)

# Route based on task complexity
def select_route(task):
    if task.complexity == "high":
        return "quality"
    elif task.is_development:
        return "free"
    return "fast"
```
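The `select_route` helper assumes a task object with `complexity` and `is_development` attributes. A minimal runnable sketch of that routing logic, with a hypothetical `Task` dataclass standing in for your real task type:

```python
from dataclasses import dataclass

@dataclass
class Task:
    # Hypothetical task shape assumed by select_route; replace with
    # your own task representation.
    complexity: str = "low"      # "low" or "high"
    is_development: bool = False

def select_route(task: Task) -> str:
    if task.complexity == "high":
        return "quality"   # Claude for complex reasoning
    elif task.is_development:
        return "free"      # local Ollama while developing
    return "fast"          # GPT-4o-mini by default

print(select_route(Task(complexity="high")))    # quality
print(select_route(Task(is_development=True)))  # free
print(select_route(Task()))                     # fast
```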
**Source:**

- `packages/tta-dev-primitives/src/tta_dev_primitives/integrations/openai_primitive.py`
- `packages/tta-dev-primitives/src/tta_dev_primitives/integrations/anthropic_primitive.py`
- `packages/tta-dev-primitives/src/tta_dev_primitives/integrations/ollama_primitive.py`
- `PRIMITIVES_CATALOG.md`

**Last Updated:** October 30, 2025 · **For:** AI Agents & Developers (all skill levels) · **Maintained by:** TTA.dev Team