LiteLLM v1.65.0 introduces Model Context Protocol (MCP) tool support, new models, and a range of performance and reliability improvements.
Model Context Protocol (MCP)
This release introduces support for centrally adding MCP servers on LiteLLM. You add the MCP endpoints once, and your developers can then list and call MCP tools through LiteLLM.
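As a rough illustration only, a proxy config.yaml registering an MCP endpoint might look like the sketch below. The mcp_servers block and its field names are assumptions, not the confirmed schema; see the LiteLLM MCP docs for the exact format.

```yaml
# Hypothetical proxy config.yaml sketch. The mcp_servers block and its
# field names are assumptions; check the LiteLLM MCP documentation for
# the exact schema. model_list is the standard proxy model config.
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o

mcp_servers:
  my_tools:
    url: "https://mcp.example.com/sse"   # placeholder MCP endpoint
```

Developers could then discover and call the tools exposed by that server through LiteLLM, without wiring each MCP endpoint into every client.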
Custom Prompt Management
This release allows you to connect LiteLLM to any prompt management service through our custom prompt management hooks. As a proxy admin, all you need to do is implement a get_chat_completion_prompt hook that accepts a prompt_id and prompt_variables and returns a formatted prompt.
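As a minimal sketch: only the hook name, `prompt_id`, and `prompt_variables` are taken from the release notes; the class name, the remaining parameters, and the return shape below are illustrative assumptions, so check the custom prompt management docs for the exact interface.

```python
from typing import List, Optional, Tuple

# Hypothetical prompt manager sketch. Only the hook name, prompt_id, and
# prompt_variables are confirmed by the release notes; everything else
# (class name, extra parameters, return shape) is an assumption.
class MyPromptManager:
    def get_chat_completion_prompt(
        self,
        model: str,
        messages: List[dict],
        non_default_params: dict,
        prompt_id: str,
        prompt_variables: Optional[dict],
    ) -> Tuple[str, List[dict], dict]:
        # Look the prompt template up in your own prompt management service,
        # then substitute the supplied variables.
        templates = {"qa-prompt": "Answer concisely: {question}"}
        template = templates.get(prompt_id, "{question}")
        rendered = template.format(**(prompt_variables or {}))
        formatted_messages = [{"role": "user", "content": rendered}]
        return model, formatted_messages, non_default_params
```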
Categorized Improvements and Fixes
New Models / Updated Models
- Support for Vertex AI gemini-2.0-flash-lite & Google AI Studio gemini-2.0-flash-lite PR (usage example after this list)
- Support for Vertex AI Fine-Tuned LLMs PR
- Nova Canvas image generation support PR
- OpenAI gpt-4o-transcribe support PR
- Added new Vertex AI text embedding model PR
- Updated model prices and context windows PR
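For instance, the new Gemini 2.0 Flash-Lite models can be called through the SDK once provider credentials are configured. This is a minimal sketch using LiteLLM's standard `vertex_ai/` and `gemini/` routing prefixes:

```python
import litellm

# Vertex AI route (requires Google Cloud project/credentials to be configured).
response = litellm.completion(
    model="vertex_ai/gemini-2.0-flash-lite",
    messages=[{"role": "user", "content": "Give me one sentence on MCP."}],
)

# Google AI Studio route (requires a GEMINI_API_KEY environment variable).
response = litellm.completion(
    model="gemini/gemini-2.0-flash-lite",
    messages=[{"role": "user", "content": "Give me one sentence on MCP."}],
)
print(response.choices[0].message.content)
```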
LLM Translation
- OpenAI Web Search Tool Call Support PR
- Vertex AI topLogprobs support PR
- Fixed Vertex AI multimodal embedding translation PR
- Support litellm.api_base for Vertex AI + Gemini across completion, embedding, image_generation PR (example after this list)
- Fixed Mistral chat transformation PR
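A small sketch of the `litellm.api_base` change; the URL below is a placeholder, not a real endpoint:

```python
import litellm

# Point Vertex AI / Gemini calls at a custom base URL, e.g. a regional
# endpoint or an internal gateway. The URL below is a placeholder.
litellm.api_base = "https://my-llm-gateway.example.com"

response = litellm.completion(
    model="gemini/gemini-2.0-flash",
    messages=[{"role": "user", "content": "Hello"}],
)
```

Per the release note, the same override also applies to embedding and image_generation calls.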
Spend Tracking Improvements
- Log 'api_base' on spend logs PR
- Support for Gemini audio token cost tracking PR
- Fixed OpenAI audio input token cost tracking PR
- Added Daily User Spend Aggregate view - allows the UI Usage tab to work with > 1M rows PR
- Connected the UI to the "LiteLLM_DailyUserSpend" spend table PR
UI
- Allowed team admins to add, update, and delete models on the UI PR
- Show API base and model ID on request logs PR
- Allow viewing key info on request logs PR
- Enabled viewing all wildcard models on /model/info PR
- Render supports_web_search on the model hub PR
Logging Integrations
- Fixed StandardLoggingPayload for GCS Pub Sub Logging Integration PR
Performance / Reliability Improvements
- LiteLLM Redis semantic caching implementation PR
- Gracefully handle exceptions when the DB is having an outage PR
- Allow pods to start up and pass /health/readiness when allow_requests_on_db_unavailable: True and the DB is down PR (config sketch after this list)
- Removed the hard-coded final usage chunk in Bedrock streaming usage PR
- Refactored Vertex AI passthrough routes - fixes unpredictable behaviour with auto-setting default_vertex_region on router model add PR
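A minimal config sketch for the readiness change referenced above; placement of the flag under `general_settings` is an assumption based on LiteLLM's proxy config conventions.

```yaml
# Sketch: keep the proxy serving traffic and passing /health/readiness
# while the database is down. Placement under general_settings is an
# assumption based on LiteLLM's proxy config conventions.
general_settings:
  allow_requests_on_db_unavailable: true
```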
General Improvements
- Support for exposing MCP tools on litellm proxy PR
- Support discovering Gemini, Anthropic, and xAI models by calling their /v1/models endpoint PR
- Fixed route check for non-proxy admins on JWT auth PR
- Added baseline Prisma database migrations PR
- Get master key from environment, if not set PR
Documentation
Security
- Bumped next from 14.2.21 to 14.2.25 in the UI dashboard PR