# API Reference

## Core Function

### run
```python
async def run(
    pipeline: list[Any],
    query: str,
    client: Client
) -> tuple[str, list[dict[str, Any]]]
```
Execute a pipeline against a query.
Parameters:
| Name | Type | Description |
|---|---|---|
| `pipeline` | `list` | List of pipeline steps |
| `query` | `str` | User query to process |
| `client` | `Client` | Async function to call LLMs |
Returns:
A tuple of (result, history):
- `result`: Final response string (empty string if the pipeline produces no output)
- `history`: List of step execution records
History record structure:
```python
{
    "step": str,              # Step class name ("Propose", "Aggregate", etc.)
    "outputs": list[str],     # Responses after this step
    "llm_calls": list[dict],  # Details of each LLM call
    "step_time": float,       # Seconds elapsed
}
```
LLM call structure:
```python
{
    "model": str,       # Model identifier
    "time": float,      # Call duration in seconds
    "in_tokens": int,   # Input tokens
    "out_tokens": int,  # Output tokens
    "error": str,       # Only present if the call failed
}
```
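Both record shapes lend themselves to simple aggregation. A sketch that sums token usage across a hand-built history matching the documented structure (the data itself is hypothetical):

```python
# Hypothetical history data shaped like the documented records.
history = [
    {
        "step": "Propose",
        "outputs": ["draft A", "draft B"],
        "llm_calls": [
            {"model": "m1", "time": 0.4, "in_tokens": 120, "out_tokens": 80},
            {"model": "m2", "time": 0.5, "in_tokens": 120, "out_tokens": 95},
        ],
        "step_time": 0.6,
    },
    {
        "step": "Aggregate",
        "outputs": ["final answer"],
        "llm_calls": [
            {"model": "m1", "time": 0.7, "in_tokens": 300, "out_tokens": 110},
        ],
        "step_time": 0.7,
    },
]

def total_tokens(history: list[dict]) -> tuple[int, int]:
    """Sum input/output tokens across all LLM calls, skipping failed calls
    (per the docs, "error" is only present when a call failed)."""
    in_tok = out_tok = 0
    for record in history:
        for call in record["llm_calls"]:
            if "error" in call:
                continue
            in_tok += call["in_tokens"]
            out_tok += call["out_tokens"]
    return in_tok, out_tok

print(total_tokens(history))  # (540, 285)
```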
## Client Protocol
```python
class Client(Protocol):
    def __call__(
        self,
        *,
        model: str,
        messages: list[Message],
        temp: float,
        max_tokens: int,
    ) -> Awaitable[tuple[str, int, int]]: ...
```
Your client must be an async callable that returns `(response_text, input_tokens, output_tokens)`.
Message type:
```python
class Message(TypedDict):
    role: str     # "system", "user", or "assistant"
    content: str  # Message content
```
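Any async callable with the keyword-only signature above satisfies the protocol. A minimal sketch — `EchoClient` is a hypothetical stub that echoes the last user message instead of calling a real LLM, useful for exercising pipelines offline:

```python
import asyncio
from typing import TypedDict

class Message(TypedDict):
    role: str
    content: str

class EchoClient:
    """Stub Client: echoes the last user message rather than calling an LLM."""

    async def __call__(self, *, model: str, messages: list[Message],
                       temp: float, max_tokens: int) -> tuple[str, int, int]:
        last = messages[-1]["content"]
        # Crude token estimate: whitespace-split word counts.
        in_tokens = sum(len(m["content"].split()) for m in messages)
        out_tokens = len(last.split())
        return f"echo: {last}", in_tokens, out_tokens

text, in_tok, out_tok = asyncio.run(EchoClient()(
    model="stub",
    messages=[{"role": "user", "content": "hello world"}],
    temp=0.7,
    max_tokens=2048,
))
print(text)  # echo: hello world
```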
## LLM Steps

### Propose

Generate initial responses from multiple models in parallel.
Synthesize
class Synthesize(NamedTuple):
agents: list[str]
prompt: str = P_SYNTH
temp: float = 0.7
max_tokens: int = 2048
Each agent synthesizes all previous responses.
Aggregate
class Aggregate(NamedTuple):
agent: str
prompt: str = P_SYNTH
temp: float = 0.7
max_tokens: int = 2048
Single agent combines all responses into one.
Refine
class Refine(NamedTuple):
agents: list[str]
prompt: str = P_REFINE
temp: float = 0.7
max_tokens: int = 2048
Improve each response individually. Agents are cycled if there are fewer agents than responses.
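The cycling rule can be sketched with `itertools.cycle`; this mirrors the documented behavior, not the library's actual implementation:

```python
from itertools import cycle

def assign_agents(agents: list[str], responses: list[str]) -> list[tuple[str, str]]:
    """Pair each response with an agent, cycling agents when there are
    fewer agents than responses."""
    # zip stops at the shorter iterable, so cycle(agents) covers any length.
    return list(zip(cycle(agents), responses))

pairs = assign_agents(["gpt", "claude"], ["r1", "r2", "r3"])
print(pairs)  # [('gpt', 'r1'), ('claude', 'r2'), ('gpt', 'r3')]
```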
### Rank

```python
class Rank(NamedTuple):
    agent: str
    n: int = 3
    prompt: str = P_RANK
    temp: float = 0.7
    max_tokens: int = 2048
```
Select top N responses by quality.
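Per `P_RANK`, the ranking agent replies with 1-based comma-separated indices such as `'3, 1, 5'`. A defensive parser for that reply format might look like this (a sketch, not the library's code):

```python
def select_top(ranker_output: str, responses: list[str], n: int) -> list[str]:
    """Parse a ranker reply like '3, 1, 5' (1-based indices) into the top-n
    responses, ignoring malformed or out-of-range entries."""
    chosen = []
    for token in ranker_output.split(","):
        token = token.strip()
        if token.isdigit():
            i = int(token) - 1            # convert to 0-based index
            if 0 <= i < len(responses):
                chosen.append(responses[i])
    return chosen[:n]

responses = ["ans1", "ans2", "ans3", "ans4", "ans5"]
print(select_top("3, 1, 5", responses, n=3))  # ['ans3', 'ans1', 'ans5']
```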
### Vote
Find consensus or select best answer.
## Transform Steps

### Shuffle

Randomize response order.

### Dropout

Randomly drop responses with the given probability.

### Sample

Take a random subset of N responses.

### Take

Keep the first N responses.

### Filter

Keep responses where `fn(response)` returns `True`.

### Map

Transform each response with `fn(response)`.
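The transform steps are pure list operations, so plain-Python equivalents are easy to sketch. These mirror the documented semantics, not the library's implementations; the seeded `random.Random` is only for reproducibility:

```python
import random

def take(responses: list[str], n: int) -> list[str]:
    return responses[:n]                              # Take: keep first n

def filter_step(responses, fn):
    return [r for r in responses if fn(r)]            # Filter: keep where fn is True

def map_step(responses, fn):
    return [fn(r) for r in responses]                 # Map: transform each response

def shuffle(responses, rng: random.Random):
    out = list(responses)
    rng.shuffle(out)                                  # Shuffle: randomize order
    return out

def dropout(responses, p: float, rng: random.Random):
    return [r for r in responses if rng.random() >= p]  # Dropout: drop with prob p

def sample(responses, n: int, rng: random.Random):
    return rng.sample(responses, min(n, len(responses)))  # Sample: random subset

rs = ["alpha", "beta", "gamma", "delta"]
print(take(rs, 2))                          # ['alpha', 'beta']
print(filter_step(rs, lambda r: len(r) > 4))
print(map_step(rs, str.upper))
print(shuffle(rs, random.Random(0)))
```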
## Constants

- Default temperature: `0.7` (shared by all LLM steps)
- Default max tokens: `2048`
### Prompts
```python
P_SYNTH = """You have been provided with responses from various models to a query. \
Synthesize into a single, high-quality response. \
Critically evaluate—some may be biased or incorrect. \
Do not simply replicate; offer a refined, accurate reply."""

P_REFINE = "Improve this response:\n\n{text}\n\nOriginal query: {query}"

P_VOTE = """These responses answer the same question. \
Identify the consensus view shared by the majority. \
If no clear consensus, select the most accurate answer. \
Return only that answer, restated clearly."""

P_RANK = """Rank these responses by quality for the query: '{query}'
{responses}
Return the top {n} as comma-separated numbers (e.g., '3, 1, 5')."""
```
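The templates are plain `str.format` strings, so filling them is straightforward. A sketch — how candidates are numbered inside `{responses}` is an assumption, chosen to match the `'3, 1, 5'` reply format:

```python
P_REFINE = "Improve this response:\n\n{text}\n\nOriginal query: {query}"
P_RANK = """Rank these responses by quality for the query: '{query}'
{responses}
Return the top {n} as comma-separated numbers (e.g., '3, 1, 5')."""

responses = ["The capital is Paris.", "It is Lyon."]
# 1-based numbering so the ranker's indices map back to list positions.
numbered = "\n".join(f"{i}. {r}" for i, r in enumerate(responses, start=1))

refine_prompt = P_REFINE.format(text=responses[0], query="Capital of France?")
rank_prompt = P_RANK.format(query="Capital of France?", responses=numbered, n=1)
print(rank_prompt)
```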