Release date 2025/12/18
Feature
-
Supported health checks and circuit breaker for the load balancer. Added two new fields
max_failsandfail_timeout. -
Added missing
least-connectionsload balancing algorithm to the AI Proxy Advanced plugin which was missed in the previous release. -
Added support for Gemini live websocket in the ai-proxy-advanced plugin.
-
Added load balance, failover and circuit breaker feature for semantic routing.
Bugfix
-
Fixed balancer retry failures caused by expired DNS entries by preloading DNS for targets.
-
Fixed an issue where the token count for Gemini Vertex embeddings API in native format was incorrect.
-
Fixed an issue where the token count for Gemini Vertex embeddings API in OpenAI format was incorrect.
-
Fixed intermittent 500 responses from the AI Proxy Advanced plugin when using Azure OpenAI
-
Fixed an issue where the native format option did not work correctly for non-openai formats.
-
Fixed an issue where the semantic load balancing with pgvector namespace was not functioning correctly.
-
Fixed an issue when using responses API and background mode.
-
Fixed an issue where Files content analytics extraction is not handled properly for Azure.
-
Fixed an issue where requests to Anthropic Claude models via Azure Foundry were not being processed correctly.
-
Fixed an issue where AWS Bedrock invoke command is not properly proxied.
-
Fixed an issue where the Gemini image generation model responses were not being processed correctly.
-
Fixed missing
idandcreatedfields in certain drivers. -
Fixed an issue where the
max_completion_tokensparameter was not being set correctly for O1 series models (e.g.,o1,o3,o4,gpt-5).