API Reference

Core Function

`run`

async def run(
    pipeline: list[Any],
    query: str,
    client: Client
) -> tuple[str, list[dict[str, Any]]]

Execute a pipeline against a query.

Parameters:

Name	Type	Description
`pipeline`	`list`	List of pipeline steps
`query`	`str`	User query to process
`client`	`Client`	Async function to call LLMs

Returns:

A tuple of (result, history):

result: Final response string (empty string if pipeline produces no output)
history: List of step execution records

History record structure:

{
    "step": str,              # Step class name ("Propose", "Aggregate", etc.)
    "outputs": list[str],     # Responses after this step
    "llm_calls": list[dict],  # Details of each LLM call
    "step_time": float,       # Seconds elapsed
}

LLM call structure:

{
    "model": str,       # Model identifier
    "time": float,      # Call duration in seconds
    "in_tokens": int,   # Input tokens
    "out_tokens": int,  # Output tokens
    "error": str,       # Only present if call failed
}

Client Protocol

class Client(Protocol):
    def __call__(
        self,
        *,
        model: str,
        messages: list[Message],
        temp: float,
        max_tokens: int,
    ) -> Awaitable[tuple[str, int, int]]: ...

Your client must be an async callable that returns (response_text, input_tokens, output_tokens).

Message type:

class Message(TypedDict):
    role: str      # "system", "user", or "assistant"
    content: str   # Message content

LLM Steps

`Propose`

class Propose(NamedTuple):
    agents: list[str]
    temp: float = 0.7
    max_tokens: int = 2048

Generate initial responses from multiple models in parallel.

`Synthesize`

class Synthesize(NamedTuple):
    agents: list[str]
    prompt: str = P_SYNTH
    temp: float = 0.7
    max_tokens: int = 2048

Each agent synthesizes all previous responses.

`Aggregate`

class Aggregate(NamedTuple):
    agent: str
    prompt: str = P_SYNTH
    temp: float = 0.7
    max_tokens: int = 2048

Single agent combines all responses into one.

`Refine`

class Refine(NamedTuple):
    agents: list[str]
    prompt: str = P_REFINE
    temp: float = 0.7
    max_tokens: int = 2048

Improve each response individually. Agents are cycled if fewer than responses.

`Rank`

class Rank(NamedTuple):
    agent: str
    n: int = 3
    prompt: str = P_RANK
    temp: float = 0.7
    max_tokens: int = 2048

Select top N responses by quality.

`Vote`

class Vote(NamedTuple):
    agent: str
    prompt: str = P_VOTE
    temp: float = 0.7
    max_tokens: int = 2048

Find consensus or select best answer.

Transform Steps

`Shuffle`

class Shuffle(NamedTuple): ...

Randomize response order.

`Dropout`

class Dropout(NamedTuple):
    rate: float  # 0.0 to 1.0

Randomly drop responses with given probability.

`Sample`

class Sample(NamedTuple):
    n: int

Take random subset of N responses.

`Take`

class Take(NamedTuple):
    n: int

Keep first N responses.

`Filter`

class Filter(NamedTuple):
    fn: Callable[[str], bool]

Keep responses where fn(response) returns True.

`Map`

class Map(NamedTuple):
    fn: Callable[[str], str]

Transform each response with fn(response).

Constants

Default Temperature

DEFAULT_TEMP = 0.7

Default Max Tokens

DEFAULT_MAX_TOKENS = 2048

Prompts

P_SYNTH = """You have been provided with responses from various models to a query. \
Synthesize into a single, high-quality response. \
Critically evaluate—some may be biased or incorrect. \
Do not simply replicate; offer a refined, accurate reply."""

P_REFINE = "Improve this response:\n\n{text}\n\nOriginal query: {query}"

P_VOTE = """These responses answer the same question. \
Identify the consensus view shared by the majority. \
If no clear consensus, select the most accurate answer. \
Return only that answer, restated clearly."""

P_RANK = """Rank these responses by quality for the query: '{query}'

{responses}

Return the top {n} as comma-separated numbers (e.g., '3, 1, 5')."""

Type Exports

from mixture_llm import (
    # Core
    run,
    Message,
    Client,

    # LLM steps
    Propose,
    Synthesize,
    Aggregate,
    Refine,
    Rank,
    Vote,

    # Transform steps
    Shuffle,
    Dropout,
    Sample,
    Take,
    Filter,
    Map,

    # Constants
    DEFAULT_TEMP,
    DEFAULT_MAX_TOKENS,
    P_SYNTH,
    P_REFINE,
    P_VOTE,
    P_RANK,
)