In today's AI world, keeping a close eye on how much you're spending on large language model (LLM) usage isn't just a “nice to have” — it's essential. Whether you’re a platform team managing access or a development group optimizing costs, LiteLLM paired with a PostgreSQL backend can give you clear, granular visibility into your usage patterns and spending. In this post, we'll walk through how to set it up and leverage its built-in tools to make data-driven decisions.
What Is LiteLLM, and Why Use It?
LiteLLM is an LLM gateway/proxy that supports 100+ models from various providers (OpenAI, Azure, Anthropic, etc.).
It offers:
- Unified API access across multiple LLM providers
- Cost tracking by model, user, and API key
- Budgeting and rate limiting via tags
- Observability tools and logging integrations
Because LiteLLM works with many LLM providers, you don't need to build custom spend tracking for each one; you centralize it in the proxy.
Setting Up the Infrastructure
1. Deploy PostgreSQL
Start by deploying a PostgreSQL database. You can use Docker + docker-compose (or your preferred setup) and optionally pgAdmin if you want a UI for exploring data; a minimal compose sketch follows the list below. The database will store:
- Users, teams, and organizations
- Virtual API keys
- Budget configurations
- Detailed usage logs for each API request
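For reference, here is a minimal docker-compose sketch. Service names, ports, and credentials are illustrative placeholders, and the pgAdmin service is optional:

```yaml
# docker-compose.yml (sketch; all credentials are placeholders, change them)
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: llmproxy
      POSTGRES_PASSWORD: dbpassword
      POSTGRES_DB: litellm
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data

  pgadmin:  # optional web UI for exploring the data
    image: dpage/pgadmin4
    environment:
      PGADMIN_DEFAULT_EMAIL: admin@example.com
      PGADMIN_DEFAULT_PASSWORD: adminpassword
    ports:
      - "8080:80"

volumes:
  postgres_data:
```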
2. Configure LiteLLM to Use Postgres
In your litellm_config.yaml, set up the database URL to point to your Postgres instance. For example:
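Here is a minimal sketch of what that could look like. The model entry, keys, and connection string are placeholders; LiteLLM can also pick up the connection string from a DATABASE_URL environment variable instead:

```yaml
# litellm_config.yaml (sketch; replace placeholders with your own values)
model_list:
  - model_name: gpt-4o-mini
    litellm_params:
      model: openai/gpt-4o-mini
      api_key: os.environ/OPENAI_API_KEY  # read from the environment

general_settings:
  master_key: sk-1234  # clients authenticate against the proxy with this key
  database_url: "postgresql://llmproxy:dbpassword@db:5432/litellm"
```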
Make sure that master_key matches what your clients will use to authenticate with the LiteLLM proxy.
3. Run the LiteLLM Proxy
Once configured, start LiteLLM (for example, via Docker). It will proxy all your LLM requests, log them to Postgres, and compute cost per call automatically.
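As a sketch, a single docker run against the config above could look like this (image tag and port follow the LiteLLM docs at the time of writing; LiteLLM also publishes a Postgres-ready litellm-database image, so check the docs for the variant that fits your setup):

```bash
docker run \
  -v $(pwd)/litellm_config.yaml:/app/config.yaml \
  -e OPENAI_API_KEY="sk-..." \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-latest \
  --config /app/config.yaml
```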
4. Send Test Requests
You can send test LLM requests via Python (LiteLLM SDK), curl, or LangChain. For example, using Python:
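A minimal sketch using the OpenAI Python SDK pointed at the proxy; the host, key, and model name are placeholders matching the config sketches above:

```python
import openai

# Point the standard OpenAI client at the LiteLLM proxy (placeholder host/key).
client = openai.OpenAI(
    api_key="sk-1234",                 # the proxy master_key or a virtual key
    base_url="http://localhost:4000",  # the LiteLLM proxy
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # a model_name from litellm_config.yaml
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    # Metadata tags let LiteLLM attribute this call's spend (see below).
    extra_body={"metadata": {"tags": ["chatbot-research", "team:engineering"]}},
)
print(response.choices[0].message.content)
```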
By passing metadata tags in your requests, you can attribute spend to specific teams, projects, or use cases.
Analyzing Spend & Usage
LiteLLM provides APIs for exploring usage:
- Daily spend breakdown: query /user/daily/activity over a date range to get per-day metrics (spend, prompt tokens, completion tokens, number of requests) with a breakdown by model, provider, or API key; see the sketch after this list.
- User-level totals: get a summary of total spend for a user and their associated API keys.
- UI dashboard: LiteLLM includes a built-in web UI (usually at http://<proxy-host>/ui) where you can explore logs and cost by model, user, and tags.
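As a rough sketch, here is how you might pull the daily breakdown with Python. The endpoint path comes from the list above, but the parameter names and response shape are assumptions to verify against your LiteLLM version:

```python
import requests

PROXY_URL = "http://localhost:4000"            # placeholder proxy address
HEADERS = {"Authorization": "Bearer sk-1234"}  # master key or a virtual key

# Ask for per-day spend/token metrics over a date range.
resp = requests.get(
    f"{PROXY_URL}/user/daily/activity",
    headers=HEADERS,
    params={"start_date": "2025-06-01", "end_date": "2025-06-30"},
)
resp.raise_for_status()
data = resp.json()

# Field names below are assumed; print the raw payload once to confirm them.
for day in data.get("results", []):
    print(day.get("date"), day.get("metrics"))
```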
Advanced Options: Tag Budgets & Rate Limits
You can enforce budgets based on tags: for example, limit spending for "marketing" vs. "engineering", or "chatbot-research" vs. "summarization".
This gives you strong cost governance. If you hit a “soft” budget, you can throttle or block requests, or raise alerts.
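As a sketch, per-tag budgets live in the proxy config. The keys below follow LiteLLM's tag-budget documentation, but treat the exact names and values as assumptions to check against your version:

```yaml
# litellm_config.yaml (excerpt, sketch): cap spend per tag
litellm_settings:
  tag_budget_config:
    marketing:
      max_budget: 100        # USD
      budget_duration: 30d   # budget window resets every 30 days
    engineering:
      max_budget: 250
      budget_duration: 30d
```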
Scaling Considerations
If your LiteLLM usage logs grow very large (e.g., 1M+ rows), querying might become slow.
To mitigate:
- Export logs from Postgres to object storage or a data warehouse (S3, GCS, Snowflake, etc.); a sketch follows this list
- Perform analysis in a separate analytics system (Redash, Databricks, etc.)
- Optionally, disable real-time logging in the proxy for long-term production loads
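For the export step, a minimal Python sketch could dump the spend-log table to CSV for warehouse ingestion. The table and column names ("LiteLLM_SpendLogs", "startTime") match LiteLLM's default schema as I understand it, but verify them against your deployed database first:

```python
import csv

import psycopg2  # pip install psycopg2-binary

# Placeholder connection string; point this at your LiteLLM Postgres instance.
conn = psycopg2.connect("postgresql://llmproxy:dbpassword@localhost:5432/litellm")

with conn, conn.cursor() as cur, open("spend_logs.csv", "w", newline="") as f:
    # Quoted identifiers because LiteLLM's schema uses mixed-case names.
    cur.execute(
        'SELECT * FROM "LiteLLM_SpendLogs" WHERE "startTime" >= %s',
        ("2025-06-01",),
    )
    writer = csv.writer(f)
    writer.writerow([col.name for col in cur.description])
    writer.writerows(cur.fetchall())  # for very large tables, prefer COPY
```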
Complementary Tools
- llm-usage-tracker: a Python library that can monkey-patch LLM calls (OpenAI, LiteLLM, Gemini) to automatically compute token usage and cost.
- llm-accounting: a package for tracking LLM usage, costs, tokens, and more, with PostgreSQL or SQLite as the backend.
- PostHog integration: you can send LiteLLM usage events to PostHog for behavioral analytics, combining cost tracking with user journeys.
Why This Setup Is Powerful
- Centralized cost attribution: all LLM calls go through LiteLLM, so you have a single source of truth for spend.
- Granular visibility: metadata tags let you break down usage by project, user, team, or feature.
- Governance & control: you can enforce budgets, rate limits, or other rules per tag or user.
- Open architecture: because you own the Postgres DB, you can export, aggregate, or analyze the data however you want.
- Scalability: LiteLLM's built-in UI is great for smaller workloads, and you can scale to big-data workflows using standard analytics tools.
If you're building or operating LLM-based systems, cost visibility is not just a back-office concern — it's essential for optimizing performance, managing risk, and scaling responsibly. Using LiteLLM with PostgreSQL gives you a self-hosted, transparent, and flexible foundation for tracking LLM usage and spend. Once it's in place, you can monitor, enforce, and optimize — making your AI infrastructure both powerful and cost-conscious.