Release date 2025/12/18
Bugfix
-
Fixed an issue where cost savings attributes
ai_proxy_cache_cost_savingswere not properly calculated when cache hits. -
Fixed an issue where Cohere and Huggingface models did not work with the semantic cache because of the polluted request format.
-
Fixed an issue where the plugin did not report time to first token (TTFT) metrics when hit.
-
Fixed an issue where the plugin did not allow /responses api to be used with Azure.