Changelog#

The following changes have been implemented throughout the development of Langworks:

v0.3.1 - 2025-0x-xx#

Fixes#

  • Fixed caching in the LlamaCpp and vLLM middleware, which had been broken since both switched to the chat completions API in v0.2.0.

  • Fixed handling of type hints containing typing.Literal or re.Pattern in cast_json_as_cls(), which assigned the parent object to the type-hinted attribute instead of the attribute’s actual value.
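
The Literal part of this fix can be illustrated with a minimal, hypothetical caster; this is a sketch of the general technique, not Langworks’ actual cast_json_as_cls() implementation:

```python
from typing import Literal, get_args, get_origin

def cast_value(value, hint):
    # Literal (and similarly re.Pattern) hints describe the value itself,
    # so the value must be validated and returned directly, rather than
    # recursed into as if it were a nested object.
    if get_origin(hint) is Literal:
        if value not in get_args(hint):
            raise ValueError(f"{value!r} is not one of {get_args(hint)}")
        return value
    # Fall back to a plain constructor cast for simple types.
    return hint(value)
```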

v0.3.0 - 2025-07-16#

Changes#

  • Implemented a new utility sub-module, util.balancing, providing utilities for load balancing, most notably the BalancedPool class, which serves as a pool of reusable resources such as client connections.

  • Added multi-client support to the existing middleware (LlamaCpp and vLLM), allowing queries to be distributed across multiple endpoints. Optionally, usage of these endpoints may be scaled up or down depending on load, as controlled by the autoscale_threshold argument newly introduced to these middleware’s initializers.

  • Added new middleware Balancer, allowing queries to be distributed across multiple middleware, even when these run on different backends (e.g. llama.cpp and vLLM).

  • Added optional support for JSON auto-repair, using json-repair, to the Dataclass and JSON constraints.
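
The load-balancing additions above can be sketched as a simple round-robin pool. This is an illustrative stand-in under assumed semantics, not the actual BalancedPool API:

```python
import itertools
import threading

class RoundRobinPool:
    """Hypothetical sketch of a pool that hands out reusable resources
    (e.g. client connections) in round-robin order, thread-safely."""

    def __init__(self, resources):
        self._cycle = itertools.cycle(list(resources))
        self._lock = threading.Lock()

    def acquire(self):
        # Each call returns the next resource in the rotation.
        with self._lock:
            return next(self._cycle)
```

A real pool would additionally track per-resource usage, which is what an autoscaling threshold like the one described above would hook into.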

Fixes#

  • Fixed get_dataclass_defaults() so that it no longer drops zero-value defaults.
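
The underlying bug is a common one: filtering defaults with a truthiness check silently discards legitimate zero values. A minimal illustration (hypothetical code, not the actual function):

```python
import dataclasses

@dataclasses.dataclass
class Sampling:
    temperature: float = 0.0   # a legitimate zero-value default
    top_k: int = 40

def get_defaults(cls) -> dict:
    # Compare against dataclasses.MISSING rather than testing truthiness,
    # so that defaults such as 0, 0.0, "" and False are preserved.
    return {
        f.name: f.default
        for f in dataclasses.fields(cls)
        if f.default is not dataclasses.MISSING
    }
```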

v0.2.1 - 2025-06-24#

Fixes#

  • Changed how logit_bias is handled in middleware, preventing requests passed to the inference provider from being rejected over validation errors, as had notably been happening with vLLM since Langworks v0.2.0.

  • Changed the intermediate representation of JSON schemas in the dataclass constraint from string to dict, conforming with how the json constraint stores these schemas. This fixes use of the dataclass constraint with the vLLM middleware.
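
To illustrate the kind of validation issue the logit_bias fix addresses: OpenAI-compatible APIs expect logit_bias to map token IDs (as strings) to values in the range [-100, 100], and out-of-range or mistyped entries can cause an entire request to be rejected. A hypothetical sanitizer, not Langworks’ actual code:

```python
def sanitize_logit_bias(logit_bias: dict) -> dict:
    """Clamp bias values into the [-100, 100] range accepted by
    OpenAI-compatible endpoints, keying by stringified token IDs."""
    return {
        str(token_id): max(-100.0, min(100.0, float(bias)))
        for token_id, bias in logit_bias.items()
    }
```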

Documentation#

  • Updated examples in documentation to better reflect Langworks’ current feature set.

v0.2.0 - 2025-06-22#

Changes#

  • Added new middleware: llama.cpp, available through the middleware.llama_cpp module.

  • Added json_schema_from_cls() to the newly added util.json module, making it easy to generate JSON schemas compliant with the official standard from dataclasses and TypedDict subclasses.

  • Added json_shorthand_schema_from_cls() to the util.json module, providing a utility function to easily generate shorthand JSON schemas from dataclasses and TypedDict subclasses, as may be used when instructing an LLM to output JSON.

  • Added typing-extensions as a dependency, due to json_shorthand_schema_from_dict()’s use of the Doc object.

  • Added cast_json_as_cls() to util.json, providing a utility to easily convert a JSON-like dictionary into an instance of a dataclass or TypedDict subclass, adhering to the type hints these classes provide.

  • Added clean_multiline() to the newly added util.string module, allowing for the easy removal of unwanted newlines and indentation from multiline text blocks (triple-quoted strings).

  • Added a new config module, allowing framework-wide configuration, including the CLEAN_MULTILINE setting, which controls whether the newly introduced clean_multiline() function is automatically applied to queries and guidance passed to Query objects. By default this setting is set to True; previous behaviour may be restored by setting this flag to False.

  • Added a new clean_multiline argument to langworks.Langwork.__init__() and langworks.Query.__init__(), allowing automatic application of clean_multiline() to passed queries, guidance and threads to be overridden locally.

  • Moved module auth to util.auth, and module caching to util.caching.

  • Refactored dsl.constraints, moving each constraint into its own sub-module.

  • Replaced the grammar constraint with gbnf and lark constraints, allowing more precise specification of the kind of grammar being passed.

  • Added new constraint dataclass, which takes a dataclass or TypedDict, derives a JSON schema from this input, and enforces that schema upon the LLM, acquiring a JSON object which the constraint then converts into the class passed, using cast_json_as_cls().

  • Changed the vLLM middleware to make use of the Chat Completions API instead of the regular Completions API, greatly simplifying the middleware’s design and allowing the dependency on transformers to be dropped, leading to faster import times.

  • Updated vLLM-specific sampling parameters, dropping support for beam search-related parameters, adding support for allowed_tokens, bad_words and min_tokens.

  • Moved langworks.middleware.generic.SamplingParams.stop_tokens to langworks.middleware.vllm.SamplingParams.stop_tokens.

  • Made the OpenAI dependency optional; it is now only required when the vLLM middleware is used.
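
Several of the util.json additions above revolve around deriving JSON schemas from classes. As a rough, hypothetical sketch of the idea (not the actual json_schema_from_cls() implementation, which also supports TypedDict subclasses and richer typing constructs):

```python
import dataclasses

# Minimal mapping from Python types to JSON schema type names.
_TYPE_MAP = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_dataclass(cls) -> dict:
    """Derive a basic, standard-style JSON schema from a flat dataclass."""
    return {
        "type": "object",
        "properties": {
            f.name: {"type": _TYPE_MAP[f.type]}
            for f in dataclasses.fields(cls)
        },
        "required": [f.name for f in dataclasses.fields(cls)],
    }

@dataclasses.dataclass
class Person:
    name: str
    age: int
```

Such a schema is what the dataclass constraint described above would enforce upon the LLM, after which a caster in the spirit of cast_json_as_cls() converts the resulting JSON object back into a Person instance.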

v0.1.2 - 2025-06-08#

Changes#

  • Added join and associated arguments to the Langwork initializer, allowing embedded langworks to aggregate data before forwarding this data to the entry node.

  • Bumped required pypeworks version to v0.3.0.

v0.1.1 - 2025-05-29#

Changes#

  • Bumped required pypeworks version to v0.2.0, including integration of new pypeworks features.