Changelog#
The following changes have been implemented throughout the development of Langworks:
v0.3.1 - 2025-0x-xx#
Fixes#
- Fixed caching in the `LlamaCpp` and `vLLM` middleware, which had been broken since both switched to the chat completions API in v0.2.0.
- Fixed the handling of typehints containing `typing.Literal` or `re.Pattern` in `cast_json_as_cls()`, which assigned the parent object to the typehinted attribute instead of the attribute's actual value, as sketched below.
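For illustration, a minimal sketch of the kind of typehint affected by the second fix; the argument order of `cast_json_as_cls()` is an assumption made here, not taken from the API reference.

```python
# Hedged sketch: the argument order (data, cls) of cast_json_as_cls() is an
# assumption; consult the API reference for the actual signature.
from dataclasses import dataclass
from typing import Literal

from langworks.util.json import cast_json_as_cls


@dataclass
class Ticket:
    title: str
    # Before v0.3.1, attributes typehinted with typing.Literal (or re.Pattern)
    # could end up holding the parent object instead of the attribute's value.
    priority: Literal["low", "medium", "high"]


ticket = cast_json_as_cls({"title": "Broken login", "priority": "high"}, Ticket)
print(ticket.priority)  # expected: "high"
```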
v0.3.0 - 2025-07-16#
Changes#
- Implemented a new utility sub-module, `util.balancing`, providing utilities for load balancing, most notably the `BalancedPool` class, which serves as a pool of reusable resources such as client connections.
- Added multi-client support to the existing middleware (`LlamaCpp` and `vLLM`), allowing queries to be distributed across multiple endpoints. Optionally, usage of these endpoints may be scaled up or down depending on load, as controlled by the `autoscale_threshold` argument newly introduced to these middleware's initializers.
- Added new middleware `Balancer`, allowing queries to be distributed across multiple middleware, even if these run on different backends (i.e. llama.cpp and vLLM).
- Added optional support for JSON auto-repair using json-repair to the `Dataclass` and `JSON` constraints, as illustrated below.
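As background for the last item, the sketch below shows what the json-repair package does on its own; it does not demonstrate Langworks' constraint API, which can now optionally invoke this repair step.

```python
# Standalone illustration of the json-repair package that the Dataclass and
# JSON constraints can now optionally use; this is not Langworks' own API.
import json_repair

# Malformed JSON of the kind an LLM may produce, e.g. with a truncated array.
broken = '{"name": "Ada", "tags": ["math", "logic"'

print(json_repair.repair_json(broken))  # e.g. '{"name": "Ada", "tags": ["math", "logic"]}'
print(json_repair.loads(broken))        # e.g. {'name': 'Ada', 'tags': ['math', 'logic']}
```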
Fixes#
- Fixed `get_dataclass_defaults()`, stopping it from dropping zero-value defaults, as illustrated below.
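A brief sketch of the behaviour this fix addresses; both the import path and the return format of `get_dataclass_defaults()` are assumptions made for illustration.

```python
# Hedged sketch: the import path and return format of get_dataclass_defaults()
# are assumptions; the point is only the handling of zero-value defaults.
from dataclasses import dataclass

from langworks.util.json import get_dataclass_defaults  # path assumed


@dataclass
class RetryPolicy:
    attempts: int = 0      # zero-value default, previously dropped
    verbose: bool = False  # likewise falsy, previously dropped
    backoff: float = 1.5


defaults = get_dataclass_defaults(RetryPolicy)
# Since v0.3.0, the defaults for attempts and verbose are expected to be
# included alongside backoff, rather than being silently dropped.
print(defaults)
```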
v0.2.1 - 2025-06-24#
Changes#
- Added `json_dict_schema_from_cls()` to `util.json`, allowing the requested JSON schema to be retrieved as a Python dictionary, as opposed to `json_schema_from_cls()`, which returns the schema as a string; see the comparison below.
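A short comparison of the two functions; it is assumed here that both take the class as their only required argument, which may not match the actual signatures.

```python
# Hedged sketch: both functions are assumed to take the class as their only
# required argument; consult the API reference for the actual signatures.
from dataclasses import dataclass

from langworks.util.json import json_dict_schema_from_cls, json_schema_from_cls


@dataclass
class Person:
    name: str
    age: int


schema_str = json_schema_from_cls(Person)        # JSON schema as a string
schema_dict = json_dict_schema_from_cls(Person)  # the same schema as a dict

assert isinstance(schema_str, str)
assert isinstance(schema_dict, dict)
```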
Fixes#
- Changed how `logit_bias` is handled in middleware, preventing requests passed to the inference provider from being blocked over potential validation errors, as has notably been happening with vLLM since Langworks v0.2.0.
- Changed the intermediate representation of the JSON schema in the `dataclass` constraint from string to dict, in line with how the `json` constraint stores these schemas. This fixes the use of the `dataclass` constraint when employing the vLLM middleware.
Documentation#
- Updated examples in the documentation to better reflect Langworks’ current feature set.
v0.2.0 - 2025-06-22#
Changes#
- Added new middleware for llama.cpp, available through the `middleware.llama_cpp` module.
- Added `json_schema_from_cls()` to the newly added `util.json` module, allowing JSON schemas compliant with the official standard to be generated easily from dataclasses and TypedDict subclasses.
- Added `json_shorthand_schema_from_cls()` to the `util.json` module, providing a utility function to easily generate shorthand JSON schemas from dataclasses and TypedDict subclasses, as may be used when instructing an LLM to output JSON.
- Added typing-extensions as a dependency, due to `json_shorthand_schema_from_dict()`’s use of the `Doc` object.
- Added `cast_json_as_cls()` to `util.json`, providing a utility to easily convert a JSON-like dictionary into an instantiated dataclass or TypedDict subclass, adhering to the type hints provided by these classes.
- Added `clean_multiline()` to the newly added `util.string` module, allowing for the easy removal of unwanted newlines and indentation from multiline text blocks (triple-quoted strings); see the sketch after this list.
- Added a new `config` module, allowing framework-wide configuration, including the setting `CLEAN_MULTILINE`, which controls whether the newly introduced `clean_multiline()` function is automatically applied to queries and guidance passed to Query objects. By default this setting is set to `True`; previous behaviour may be restored by setting this flag to `False`.
- Added a new `clean_multiline` attribute to `langworks.Langwork.__init__()` and `langworks.Query.__init__()`, allowing the automatic application of `clean_multiline()` to the queries, guidance and threads passed to be overridden locally.
- Moved module `auth` to `util.auth`, and module `caching` to `util.caching`.
- Refactored `dsl.constraints`, moving each constraint into its own sub-module.
- Replaced the `grammar` constraint with `gbnf` and `lark` constraints, allowing more precise specification of the kind of grammar being supplied.
- Added a new `dataclass` constraint, which takes a dataclass or TypedDict, derives a JSON schema from this input, and enforces it upon the LLM, acquiring a JSON object that the constraint converts into the passed class using `cast_json_as_cls()`.
- Changed the `vLLM` middleware to make use of the Chat Completions API instead of the regular Completions API, greatly simplifying the design of the middleware, allowing the dependency on transformers to be dropped, and leading to faster import times.
- Updated vLLM-specific sampling parameters, dropping support for beam search-related parameters, and adding support for `allowed_tokens`, `bad_words` and `min_tokens`.
- Moved `langworks.middleware.generic.SamplingParams.stop_tokens` to `langworks.middleware.vllm.SamplingParams.stop_tokens`.
- Made the OpenAI dependency optional, only requiring it if the vLLM middleware is used.
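To make the `clean_multiline()` and `CLEAN_MULTILINE` items more concrete, the sketch below shows the intended effect; the module paths are inferred from the notes above, and the exact whitespace rules, and hence the indicated output, are assumptions for illustration rather than quotes from the reference documentation.

```python
# Hedged sketch of clean_multiline() and the CLEAN_MULTILINE setting; module
# paths, whitespace rules and the indicated output are assumptions.
from langworks import config
from langworks.util.string import clean_multiline

prompt = clean_multiline(
    """
    Summarise the following document in three bullet points.
    Keep each bullet under 20 words.
    """
)
# Indicatively, surrounding blank lines and common indentation are stripped:
# "Summarise the following document in three bullet points.\nKeep each bullet under 20 words."

# Framework-wide, automatic application to queries and guidance is toggled via
# the config module; setting the flag to False restores the previous behaviour.
config.CLEAN_MULTILINE = False
```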
v0.1.2 - 2025-06-08#
Changes#
- Added `join` and associated arguments to the Langwork initializer, allowing embedded langworks to aggregate data before forwarding this data to the entry node.
- Bumped the required pypeworks version to v0.3.0.
v0.1.1 - 2025-05-29#
Changes#
- Bumped the required pypeworks version to v0.2.0, including integration of new pypeworks features.