This directory contains practical examples demonstrating how to use the tta-dev-primitives package to build robust AI application workflows.
Phase 3 examples have been updated to align with the current InstrumentedPrimitive patterns and observability changes. The following examples are functional and tested in this branch:
- rag_workflow.py: Basic RAG example (retrieval + LLM generation)
- agentic_rag_workflow.py: Production-grade agentic RAG (routing, grading, hallucination checks)
- cost_tracking_workflow.py: Cost tracking and budget enforcement
- streaming_workflow.py: Token-by-token streaming with metrics and aggregation

multi_agent_workflow.py is being recreated to follow the same pattern and will be available shortly.
rag_workflow.py ✅

Demonstrates a working RAG workflow: vector retrieval (simulated), context augmentation, and LLM generation with caching and fallback.
Features:
Run it:
cd packages/tta-dev-primitives
uv run python examples/rag_workflow.py
agentic_rag_workflow.py ✅

Production-grade agentic RAG implementation based on the NVIDIA agentic pattern. This example demonstrates a 6-stage pipeline with routing, retrieval, document grading, answer generation, answer grading, and hallucination checking.
Features:
Run it:
cd packages/tta-dev-primitives
uv run python examples/agentic_rag_workflow.py
multi_agent_workflow.py

Multi-agent coordination pattern with task decomposition and parallel execution.
Features:
Run it:
cd packages/tta-dev-primitives
uv run python examples/multi_agent_workflow.py
cost_tracking_workflow.py

Cost tracking and budget enforcement with detailed metrics and attribution.
Features:
Run it:
cd packages/tta-dev-primitives
uv run python examples/cost_tracking_workflow.py
streaming_workflow.py

Streaming LLM responses with token-by-token delivery and performance metrics.
Features:
Run it:
cd packages/tta-dev-primitives
uv run python examples/streaming_workflow.py
quick_wins_demo.py

Quick start demonstration showing basic primitive usage and composition.
Topics covered:
Run it:
cd packages/tta-dev-primitives
uv run python examples/quick_wins_demo.py
real_world_workflows.py

Production-ready workflow patterns for common AI application scenarios.
Examples included:
Run it:
cd packages/tta-dev-primitives
uv run python examples/real_world_workflows.py
error_handling_patterns.py

Robust error handling strategies using recovery primitives.
Examples included:
Run it:
cd packages/tta-dev-primitives
uv run python examples/error_handling_patterns.py
apm_example.py

Agent Package Manager (APM) integration showing how to use MCP-compatible package metadata.
Topics covered:
Run it:
cd packages/tta-dev-primitives
uv run python examples/apm_example.py
observability_demo.py ⭐ NEW

Comprehensive observability platform demonstration showcasing production-ready monitoring and metrics.
This demo exercises the TTA.dev observability platform (Phases 1-3) end to end, showing that it is ready for production monitoring of AI workflows.
Topics covered:
InstrumentedPrimitive - no manual instrumentation needed

What the demo does:
Run it:
cd packages/tta-dev-primitives
uv run python examples/observability_demo.py
Sample output:
📊 Metrics for: llm_generation
------------------------------------------------------------
Latency Percentiles:
  p50: 227.90ms
  p90: 463.71ms
  p95: 466.12ms
  p99: 472.14ms
SLO Status: ✅
  Target: 95.0%
  Availability: 95.24%
  Latency Compliance: 100.00%
  Error Budget Remaining: 100.0%
Throughput:
  Total Requests: 21
  RPS: 2.27
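
To make the percentile and availability figures above easier to interpret, here is a minimal sketch of how such numbers can be derived from raw samples. It uses the nearest-rank percentile method, which is an assumption; the demo may compute percentiles differently. The latency values are made up for illustration, and the 20-of-21 split is only an inference from the reported 95.24% availability over 21 requests.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of data at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def availability(successes: int, total: int) -> float:
    """Share of successful requests, as a percentage."""
    return 100.0 * successes / total


# Illustrative latencies in milliseconds (not taken from the demo).
latencies_ms = [100.0, 120.0, 150.0, 200.0, 230.0, 300.0, 350.0, 400.0, 450.0, 470.0]

p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
avail = availability(20, 21)

print(f"p50: {p50:.2f}ms")           # p50: 230.00ms
print(f"p90: {p90:.2f}ms")           # p90: 450.00ms
print(f"Availability: {avail:.2f}%")  # Availability: 95.24%
```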
Next steps after running the demo:
dashboards/grafana/
dashboards/alertmanager/
uv pip install prometheus-client

Sequential:
workflow = step1 >> step2 >> step3
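
For readers curious how `>>` chaining can work, the following is a minimal sketch built on Python's `__rshift__` hook. The `Step` class and its `execute` method are hypothetical stand-ins written for this illustration; they are not the package's classes, and the real primitives may differ in signature and behavior.

```python
import asyncio
from typing import Any, Awaitable, Callable


class Step:
    """Hypothetical stand-in for a composable primitive (not the package's class)."""

    def __init__(self, fn: Callable[[Any], Awaitable[Any]]) -> None:
        self.fn = fn

    async def execute(self, data: Any) -> Any:
        return await self.fn(data)

    def __rshift__(self, other: "Step") -> "Step":
        # a >> b builds a new step that feeds a's output into b.
        async def chained(data: Any) -> Any:
            return await other.execute(await self.execute(data))

        return Step(chained)


async def strip(text: str) -> str:
    return text.strip()


async def upper(text: str) -> str:
    return text.upper()


async def exclaim(text: str) -> str:
    return text + "!"


workflow = Step(strip) >> Step(upper) >> Step(exclaim)
result = asyncio.run(workflow.execute("  hello "))
print(result)  # HELLO!
```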
Parallel:
results = ParallelPrimitive([task1, task2, task3])
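
The idea behind parallel execution can be sketched with plain `asyncio.gather`, which runs coroutines concurrently and returns results in input order. `run_parallel` and the three task functions are hypothetical helpers for this illustration; whether ParallelPrimitive is implemented this way is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable

Task = Callable[[str], Awaitable[Any]]


async def length_of(text: str) -> int:
    await asyncio.sleep(0.01)  # simulate I/O
    return len(text)


async def uppercase(text: str) -> str:
    await asyncio.sleep(0.01)
    return text.upper()


async def reverse(text: str) -> str:
    await asyncio.sleep(0.01)
    return text[::-1]


async def run_parallel(tasks: list[Task], data: str) -> list[Any]:
    # Every task receives the same input; gather preserves the order of tasks.
    return list(await asyncio.gather(*(task(data) for task in tasks)))


parallel_results = asyncio.run(run_parallel([length_of, uppercase, reverse], "abc"))
print(parallel_results)  # [3, 'ABC', 'cba']
```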
Conditional:
conditional = ConditionalPrimitive(
    condition=lambda x, ctx: x["type"] == "important",
    if_true=priority_handler,
    if_false=normal_handler
)
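
As a rough illustration of the branching logic, the sketch below evaluates a condition on the input and context, then dispatches to one of two handlers. `make_conditional` and both handlers are hypothetical names for this example, and the context is modeled as a plain dict, which is an assumption.

```python
from typing import Any, Callable

Handler = Callable[[dict, dict], Any]


def make_conditional(
    condition: Callable[[dict, dict], bool],
    if_true: Handler,
    if_false: Handler,
) -> Handler:
    def run(data: dict, ctx: dict) -> Any:
        # Pick exactly one branch based on the condition's result.
        branch = if_true if condition(data, ctx) else if_false
        return branch(data, ctx)

    return run


def priority_handler(data: dict, ctx: dict) -> str:
    return f"priority:{data['id']}"


def normal_handler(data: dict, ctx: dict) -> str:
    return f"normal:{data['id']}"


dispatch = make_conditional(
    condition=lambda x, ctx: x["type"] == "important",
    if_true=priority_handler,
    if_false=normal_handler,
)

print(dispatch({"type": "important", "id": 1}, {}))  # priority:1
print(dispatch({"type": "routine", "id": 2}, {}))    # normal:2
```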
Retry:
RetryPrimitive(
    primitive=api_call,
    max_attempts=3,
    backoff_factor=2.0
)
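
A minimal sketch of retry with exponential backoff follows. It assumes `backoff_factor` multiplies the delay after each failed attempt; that reading of the parameter, the `initial_delay` argument, and the `retry` helper itself are assumptions for illustration, not the package's documented behavior.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def retry(
    operation: Callable[[], Awaitable[Any]],
    max_attempts: int = 3,
    backoff_factor: float = 2.0,
    initial_delay: float = 0.01,  # hypothetical parameter for this sketch
) -> Any:
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            await asyncio.sleep(delay)
            delay *= backoff_factor  # wait longer before the next attempt


calls = {"count": 0}


async def flaky_api_call() -> str:
    # Fails twice, then succeeds on the third call.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


outcome = asyncio.run(retry(flaky_api_call))
print(outcome, calls["count"])  # ok 3
```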
Fallback:
FallbackPrimitive(
    primary=expensive_service,
    fallback=cheap_service
)
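
The fallback pattern can be sketched in a few lines: try the primary, and on any exception call the fallback with the same input. `with_fallback` and the two services are hypothetical; which exceptions the real FallbackPrimitive catches is not stated here, so catching `Exception` is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable

Service = Callable[[str], Awaitable[Any]]


async def with_fallback(primary: Service, fallback: Service, data: str) -> Any:
    try:
        return await primary(data)
    except Exception:
        # Primary failed: degrade gracefully to the fallback.
        return await fallback(data)


async def expensive_service(prompt: str) -> str:
    raise RuntimeError("service unavailable")


async def cheap_service(prompt: str) -> str:
    return f"cheap answer to: {prompt}"


answer = asyncio.run(with_fallback(expensive_service, cheap_service, "ping"))
print(answer)  # cheap answer to: ping
```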
Timeout:
TimeoutPrimitive(
    primitive=slow_operation,
    timeout_seconds=5.0
)
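
One way to picture a timeout wrapper is `asyncio.wait_for`, which cancels the awaited coroutine and raises a timeout error once the limit passes. `with_timeout` is a hypothetical helper, and the short limit is chosen only so the example finishes quickly; how TimeoutPrimitive reports timeouts is not covered here.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def with_timeout(
    operation: Callable[[], Awaitable[Any]], timeout_seconds: float
) -> Any:
    # wait_for cancels the operation if it exceeds the limit.
    return await asyncio.wait_for(operation(), timeout=timeout_seconds)


async def slow_operation() -> str:
    await asyncio.sleep(1.0)
    return "done"


async def main() -> str:
    try:
        return await with_timeout(slow_operation, timeout_seconds=0.05)
    except asyncio.TimeoutError:
        return "timed out"


status = asyncio.run(main())
print(status)  # timed out
```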
Caching:
CachePrimitive(
    ttl=3600,  # 1 hour
    max_size=1000
)
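
To illustrate what `ttl` and `max_size` typically mean, here is a small sketch of a TTL cache that evicts the oldest-inserted entry when full. `TTLCache` is a hypothetical class for this example; the eviction policy and the injectable clock are assumptions, not a description of CachePrimitive internals.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical cache: entries expire after ttl seconds; oldest entry is evicted when full."""

    def __init__(
        self, ttl: float, max_size: int, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.ttl = ttl
        self.max_size = max_size
        self.clock = clock
        self._entries: "OrderedDict[str, tuple[float, Any]]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        if key in self._entries:
            del self._entries[key]  # re-insert so the key counts as newest
        elif len(self._entries) >= self.max_size:
            self._entries.popitem(last=False)  # evict the oldest-inserted entry
        self._entries[key] = (self.clock() + self.ttl, value)


# A fake clock keeps the example deterministic.
now = [0.0]
cache = TTLCache(ttl=3600, max_size=2, clock=lambda: now[0])
cache.put("a", 1)
fresh = cache.get("a")     # 1 (still within the TTL)
now[0] = 3600.0
expired = cache.get("a")   # None (TTL elapsed)
```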
Routing:
RouterPrimitive(
    routes={
        "fast": fast_model,
        "balanced": balanced_model,
        "quality": quality_model
    }
)
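
Conceptually, routing is a lookup from a tier name to a handler. The sketch below shows that idea with a dictionary and a default tier. `make_router`, the `default` argument, and the model functions are hypothetical; how RouterPrimitive actually selects a route is not described here.

```python
from typing import Callable

Model = Callable[[str], str]


def fast_model(prompt: str) -> str:
    return f"fast:{prompt}"


def balanced_model(prompt: str) -> str:
    return f"balanced:{prompt}"


def quality_model(prompt: str) -> str:
    return f"quality:{prompt}"


def make_router(routes: dict[str, Model], default: str) -> Callable[[str, str], str]:
    def route(tier: str, prompt: str) -> str:
        # Unknown tiers fall back to the default route.
        handler = routes.get(tier, routes[default])
        return handler(prompt)

    return route


router = make_router(
    routes={"fast": fast_model, "balanced": balanced_model, "quality": quality_model},
    default="balanced",
)

print(router("fast", "hi"))     # fast:hi
print(router("unknown", "hi"))  # balanced:hi
```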
LambdaPrimitive for quick prototyping
>> operator or SequentialPrimitive to chain steps
RetryPrimitive, TimeoutPrimitive, FallbackPrimitive
CachePrimitive and RouterPrimitive for cost/performance
WorkflowContext for tracking and observability

workflow = (
    validate_input >>
    CachePrimitive(ttl=1800) >>
    RouterPrimitive(tier="balanced") >>
    process_response >>
    format_output
)
api_workflow = FallbackPrimitive(
    primary=TimeoutPrimitive(
        primitive=RetryPrimitive(
            primitive=api_call,
            max_attempts=3
        ),
        timeout_seconds=5.0
    ),
    fallback=cached_response
)
pipeline = SequentialPrimitive([
    load_data,
    ParallelPrimitive([clean, validate, enrich]),
    transform,
    save_results
])
All examples include inline assertions and output for verification. To run with pytest:
cd packages/tta-dev-primitives
uv run pytest examples/ -v
Have a useful pattern to share? We welcome contributions!
See CONTRIBUTING.md for details.