Changelog
The following changes have been implemented throughout the development of Langworks:
v0.3.1 - 2025-0x-xx
Fixes
- Fixed caching in the LlamaCpp and vLLM middleware, broken since both switched to the chat completions API in v0.2.0.
- Fixed the handling in cast_json_as_cls() of typehints containing typing.Literal or re.Pattern, which assigned the parent object to the type-hinted attribute instead of the attribute's actual value.
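By way of illustration, a minimal sketch of the corrected behaviour follows; the import path mirrors the util.json module introduced in v0.2.0, and the argument order is an assumption rather than the documented signature:

```python
from dataclasses import dataclass
from typing import Literal

# Assumed import path and argument order; cast_json_as_cls() itself is part of
# util.json, as described under v0.2.0 below.
from langworks.util.json import cast_json_as_cls


@dataclass
class Ticket:
    # Before v0.3.1, Literal-typed attributes could end up holding the parent
    # object instead of the attribute's actual value.
    priority: Literal["low", "high"]


ticket = cast_json_as_cls({"priority": "high"}, Ticket)
assert ticket.priority == "high"
```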
v0.3.0 - 2025-07-16
Changes
- Implemented the new utility sub-module util.balancing, providing utilities for load balancing, most notably the BalancedPool class, which serves as a pool of reusable resources such as client connections.
- Added multi-client support to the existing middleware (LlamaCpp and vLLM), allowing queries to be distributed across multiple endpoints. Optionally, these endpoints may be scaled up or down depending on usage, as controlled by the autoscale_threshold argument newly introduced to these middleware's initializers.
- Added the new middleware Balancer, allowing queries to be distributed across multiple middleware, even if these run on different backends (i.e. llama.cpp and vLLM); see the sketch after this list.
- Added optional support for JSON auto-repair, using json-repair, to the Dataclass and JSON constraints.
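A rough sketch of how these pieces fit together; the class names and the autoscale_threshold argument are taken from the entries above, but the remaining constructor parameters and the import path of Balancer are hypothetical, as they are not spelled out in this changelog:

```python
# Class names and the autoscale_threshold argument stem from the changelog
# entries above; the import path of Balancer and the url/middleware parameter
# names are assumptions.
from langworks.middleware.llama_cpp import LlamaCpp
from langworks.middleware.vllm import vLLM
from langworks.middleware.balancer import Balancer  # Hypothetical module path.

# Multi-client support: a single middleware spread over several endpoints,
# scaled up or down depending on usage once the threshold is crossed.
vllm = vLLM(
    url=["http://gpu-1:8000/v1", "http://gpu-2:8000/v1"],  # Hypothetical parameter name.
    autoscale_threshold=4,
)

# Balancer: distributes queries across multiple middleware, even when these
# run on different backends.
balancer = Balancer(
    middleware=[vllm, LlamaCpp(url="http://cpu:8080")],  # Hypothetical parameter name.
)
```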
Fixes
- Fixed get_dataclass_defaults(), stopping it from dropping zero-valued defaults.
v0.2.1 - 2025-06-24
Changes
- Added json_dict_schema_from_cls() to util.json, allowing the requested JSON schema to be retrieved as a Python dictionary, as opposed to json_schema_from_cls(), which returns this schema as a string.
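A minimal sketch of the difference between the two helpers; the single-argument call is an assumption, as the full signatures are not reproduced in this changelog:

```python
from dataclasses import dataclass

# Both helpers live in util.json; the single-argument call is an assumption.
from langworks.util.json import json_dict_schema_from_cls, json_schema_from_cls


@dataclass
class Person:
    name: str
    age: int


schema_as_str: str = json_schema_from_cls(Person)         # Schema serialised as a string.
schema_as_dict: dict = json_dict_schema_from_cls(Person)  # The same schema as a Python dict.
```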
Fixes
- Changed how logit_bias is handled in middleware, preventing requests passed to the inference provider from being blocked by potential validation errors, as had notably been happening with vLLM since Langworks v0.2.0.
- Changed the intermediate representation of the JSON schema in the dataclass constraint from string to dict, conforming with how the json constraint stores these schemas. This fixes the use of the dataclass constraint when employing the vLLM middleware.
Documentation
- Updated examples in documentation to better reflect Langworks’ current feature set.
v0.2.0 - 2025-06-22
Changes
- Added new middleware: llama.cpp, available through the middleware.llama_cpp module.
- Added json_schema_from_cls() to the newly added util.json module, allowing JSON schemas compliant with the official standard to be easily generated from dataclasses and TypedDict subclasses.
- Added json_shorthand_schema_from_cls() to the util.json module, providing a utility function to easily generate shorthand JSON schemas from dataclasses and TypedDict subclasses, as may be used when instructing an LLM to output JSON.
- Added typing-extensions as a dependency, due to json_shorthand_schema_from_dict()’s use of the Doc object.
- Added cast_json_as_cls() to util.json, providing a utility to easily convert a JSON-like dictionary to an instantiated dataclass or TypedDict subclass, adhering to the type hints provided by these classes.
- Added clean_multiline() to the newly added util.string module, allowing for the easy removal of unwanted newlines and indentations from multiline text blocks (triple-quoted strings).
- Added a new config module, allowing framework-wide configurations to be set, including the setting CLEAN_MULTILINE, which controls whether the newly introduced clean_multiline() function is automatically applied to queries and guidance passed to Query objects. By default this setting is set to True; previous behaviour may be restored by setting this flag to False (see the sketch after this list).
- Added a new clean_multiline attribute to langworks.Langwork.__init__() and langworks.Query.__init__(), allowing the automatic application of clean_multiline() to the queries, guidance and threads passed to be overridden locally.
- Moved module auth to util.auth, and module caching to util.caching.
- Refactored dsl.constraints, moving each constraint into its own sub-module.
- Replaced the grammar constraint with the gbnf and lark constraints, making it possible to specify more precisely what kind of grammar is being supplied.
- Added the new constraint dataclass, which takes a dataclass or TypedDict, derives a JSON schema from this input, and enforces it upon the LLM, acquiring a JSON object that the constraint converts into the class passed, using cast_json_as_cls().
- Changed the vLLM middleware to make use of the Chat Completion API instead of the regular Completion API, greatly simplifying the design of the middleware and allowing the dependency on transformers to be dropped, leading to faster import times.
- Updated vLLM-specific sampling parameters, dropping support for beam search-related parameters, and adding support for allowed_tokens, bad_words and min_tokens.
- Moved langworks.middleware.generic.SamplingParams.stop_tokens to langworks.middleware.vllm.SamplingParams.stop_tokens.
- Made the OpenAI dependency optional, only requiring it if the vLLM middleware is used.
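As a quick illustration of the multiline handling introduced here, a sketch follows; treating CLEAN_MULTILINE as a plain module-level attribute of config is an assumption about how the setting is exposed:

```python
# clean_multiline() and the CLEAN_MULTILINE setting are named in the entries
# above; accessing the setting as a module-level attribute is an assumption.
from langworks import config
from langworks.util.string import clean_multiline

# Strip the newlines and indentation that a triple-quoted string picks up from
# the surrounding source code.
prompt = clean_multiline("""
    Summarise the following passage
    in a single sentence.
""")

# Framework-wide setting: queries and guidance passed to Query objects are
# cleaned automatically by default; switching the flag off restores the
# pre-v0.2.0 behaviour.
config.CLEAN_MULTILINE = False
```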
v0.1.2 - 2025-06-08
Changes
- Added join and associated arguments to the Langwork initializer, allowing embedded langworks to aggregate data before forwarding this data to the entry node.
- Bumped the required pypeworks version to v0.3.0.
v0.1.1 - 2025-05-29
Changes
- Bumped the required pypeworks version to v0.2.0, including integration of new pypeworks features.