Balancer#

class langworks.middleware.balancer.Balancer#

Middleware that distributes queries among other middleware, providing load balancing. Optionally, load balancing may be combined with autoscaling, controlling the rate at which middleware are made available or unavailable.

Fundamentals#

__init__(middleware: Sequence[Middleware], autoscale_threshold: tuple[float, float] = (0, 0))#

Initializes the Balancer.

Parameters#

middleware

Instantiated middleware to which queries may be distributed, giving priority to middleware specified first.

autoscale_threshold

Pair of thresholds specifying the number of queries per middleware at which to scale up (first item) or scale down (second item). By default this is set to (0, 0), configuring the balancer to immediately scale up to use all resources, while never scaling down.
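
For illustration, a minimal construction sketch follows. The middleware instances are hypothetical stand-ins; any instantiated Middleware objects may be passed.

```python
from langworks.middleware.balancer import Balancer

# `endpoint_a` and `endpoint_b` are hypothetical stand-ins for any
# instantiated Middleware objects; this section does not prescribe a
# concrete middleware class.
balancer = Balancer(
    middleware=[endpoint_a, endpoint_b],  # endpoint_a is given priority
    # Assumed reading of the thresholds: scale up once a middleware
    # serves more than 8 queries, scale down once it serves fewer than 2.
    autoscale_threshold=(8, 2),
)
```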

Methods#

exec(query: str = None, role: str = None, guidance: str = None, history: Thread = None, context: dict = None, params: SamplingParams = None) → tuple[Thread, dict[str, Any]]#

Generates a new message, following up on the passed message using the given guidance and sampling parameters.

Parameters#

query

The query to prompt the LLM with, optionally formatted using Langworks’ static DSL.

role

The role of the agent stating this query, usually ‘user’, ‘system’ or ‘assistant’.

guidance

Template for the message to be generated, formatted using Langworks’ dynamic DSL.

history

Conversational history (thread) to prepend to the prompt.

context

Context to reference when filling in the templated parts of the query, guidance and history. If the Langwork or the input also defines a context, the available contexts are merged. When duplicate attributes are observed, the value is copied from the most specific context, i.e. input context over Query context, and Query context over Langwork context.

params

Sampling parameters, wrapped by a SamplingParams object, specifying how the LLM should select subsequent tokens.
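
A hedged usage sketch follows. The import path for SamplingParams and the templated placeholder in the query are assumptions, not prescribed by this section; the exec signature and return type are as documented above.

```python
from langworks.middleware.generic import SamplingParams  # assumed import path

# `balancer` is a Balancer instance as constructed under Fundamentals.
thread, context = balancer.exec(
    query="Summarise the document titled {{ title }}.",  # illustrative static-DSL placeholder
    role="user",
    # Keys given here override duplicates in the Query and Langwork contexts.
    context={"title": "Quarterly report"},
    params=SamplingParams(temperature=0.2),  # assumed parameter name
)
```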