Global AI and Data Science

How to Monitor Your LLM Usage and Costs Using LiteLLM, PostgreSQL, and the Built-In Dashboard

By Wendy Munoz posted Fri January 09, 2026 12:37 PM

  

In modern AI applications, understanding how much you’re spending on large language model (LLM) calls, and why, is no longer optional. Developers and platform teams alike often struggle to answer questions such as:

  • Which models are costing us the most?

  • How much does each team or feature contribute to the bill?

  • Can we centralize logs and costs from multiple providers?

LiteLLM, an open-source LLM gateway, paired with a PostgreSQL backend and its built-in UI, provides a self-hosted way to answer these questions from a single place.

What Is LiteLLM and Why It Matters

LiteLLM acts as a proxy between your applications and multiple LLM providers like OpenAI, Anthropic, Azure, and others. It standardizes API calls, offers unified spend tracking, and supports advanced features such as budgets, rate limits, and tagging — all without needing separate tooling for each provider.

The benefits include:

  • Centralized visibility across providers

  • Automatic cost attribution per API request

  • Governance with budgets and rate limits

  • Built-in UI for exploring logs, token usage, and spend patterns

This approach means you no longer have to stitch together custom dashboards or query individual logs from every model service you use.

Step-by-Step Setup

1. Provision a PostgreSQL Database

First, deploy a PostgreSQL instance — whether locally, in Docker, or via a hosted service. This database will store all:

  • Usage logs

  • Cost records

  • Metadata tags

  • Budgets and team configurations

You can use tools like pgAdmin alongside Postgres to inspect or query this data visually.

2. Configure LiteLLM with PostgreSQL

Create a configuration file (e.g., litellm_config.yaml) where you define:

  • Your database connection string

  • LLM models you want to proxy

  • The master API key your clients will use

Example snippet:

    general_settings:
      master_key: "sk-your-master-key"
      database_url: "postgresql://<user>:<password>@<host>:5432/<db>"

With this in place, LiteLLM will log every request into PostgreSQL so that spend and usage can be analyzed later.
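The snippet above omits the model list, and a password containing characters such as `@` or `/` would break the connection string. A minimal sketch of how both could be handled when generating the file, assuming a single OpenAI model entry: the helper names `build_database_url` and `render_config` and the example credentials are hypothetical, and the `model_list` layout follows the shape shown in LiteLLM's documentation, so verify it against the version you deploy.

```python
from urllib.parse import quote


def build_database_url(user, password, host, db, port=5432):
    """Build a PostgreSQL URL, percent-encoding characters such as '@' or '/'."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )


def render_config(master_key, database_url, model_name="gpt-3.5-turbo"):
    """Return the text of a small litellm_config.yaml (hypothetical helper)."""
    return "\n".join([
        "model_list:",
        f"  - model_name: {model_name}",
        "    litellm_params:",
        f"      model: openai/{model_name}",
        "      api_key: os.environ/OPENAI_API_KEY  # read from the environment",
        "general_settings:",
        f'  master_key: "{master_key}"',
        f'  database_url: "{database_url}"',
        "",
    ])


if __name__ == "__main__":
    # Example credentials are placeholders, not real values.
    url = build_database_url("litellm", "p@ss/word", "localhost", "litellm")
    print(render_config("sk-your-master-key", url))
```

Writing the returned text to litellm_config.yaml gives a file with the three parts listed in this step: the connection string, the models to proxy, and the master key.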

3. Launch LiteLLM Proxy

Run the LiteLLM process, pointing it to your config file. If you’re using Docker, mount the config and expose the proxy port (typically 4000) so your applications can connect:

    docker run \
      -v $(pwd)/litellm_config.yaml:/app/config.yaml \
      -p 4000:4000 \
      ghcr.io/berriai/litellm:latest \
      --config /app/config.yaml

Once running, all LLM requests sent through this proxy will be logged into PostgreSQL with usage details and cost info.
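Before sending traffic, it can help to confirm that the proxy is actually accepting connections. The following is a sketch of a small polling helper using only the standard library. The function name `wait_until_ready` is hypothetical, and the health URL in the usage comment is an assumption about the proxy's health endpoint, so check the path against the LiteLLM documentation for your version.

```python
import time
import urllib.request


def wait_until_ready(url, attempts=10, delay=1.0, opener=urllib.request.urlopen):
    """Poll `url` until it answers HTTP 200 or the attempts run out."""
    for _ in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers refused connections and urllib's URLError/HTTPError.
            pass
        time.sleep(delay)
    return False


# Usage (assumed endpoint path, not confirmed by this article):
# wait_until_ready("http://localhost:4000/health/liveliness")
```

The `opener` parameter exists only so the helper can be exercised without a running proxy.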

4. Send Test Requests

You can then send requests through LiteLLM using your favorite client — whether it’s Python, curl, or frameworks like LangChain:

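    from langchain_openai import ChatOpenAI

    # Point the client at the LiteLLM proxy instead of the provider's API.
    chat = ChatOpenAI(
        base_url="http://localhost:4000/v1",
        api_key="sk-your-master-key",  # the master key from litellm_config.yaml
        model="gpt-3.5-turbo",  # must match a model defined in the proxy config
    )

    response = chat.invoke([{"role": "user", "content": "Explain LLM cost tracking!"}])
    print(response.content)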

Every request will now be reflected in PostgreSQL, capturing token counts, costs, model used, and any metadata you attach — like project tags or user IDs.
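To illustrate the metadata part, here is a sketch of a request that carries project tags and a user ID, built with the standard library so the payload is fully visible. It assumes the proxy reads tags from a `metadata.tags` field in the request body, which is how LiteLLM's documentation describes request tagging; the helper name `build_tagged_request` and the tag values are hypothetical.

```python
import json
import urllib.request


def build_tagged_request(base_url, api_key, model, prompt, tags, user=None):
    """Build (but do not send) a chat completion request carrying spend tags."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: the proxy reads tags from metadata.tags in the body.
        "metadata": {"tags": list(tags)},
    }
    if user is not None:
        payload["user"] = user  # OpenAI-style end-user identifier
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tagged_request(
    "http://localhost:4000/v1",
    "sk-your-master-key",
    "gpt-3.5-turbo",
    "Explain LLM cost tracking!",
    tags=["team:marketing", "feature:summaries"],  # illustrative tag names
    user="user-123",
)
```

Passing `request` to `urllib.request.urlopen` would send it through the proxy; the response body is the usual OpenAI-style JSON.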

5. Explore Spend with the Built-In Dashboard

LiteLLM comes with a helpful web interface accessible at:

http://<proxy-host>:4000/ui

In this UI, you can:

  • View token usage and cost over time

  • Break down spend by model, user, tag, or API key

  • Spot anomalies or cost spikes quickly

  • Drill into individual request logs

This dashboard makes it easy for platform engineers or cost accountants to visualize exactly what’s happening in their AI stack without writing complex SQL queries.
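For ad-hoc analysis outside the UI, the same kind of breakdown can be computed from exported log rows. This is a minimal sketch of grouping spend by one dimension. The article does not describe LiteLLM's table schema, so the row fields (`model`, `team`, `spend`) and the sample values are made up for illustration.

```python
from collections import defaultdict


def spend_by(rows, key):
    """Sum the 'spend' field of each row, grouped by the given key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row.get(key, "unknown")] += row["spend"]
    # Largest spender first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows for illustration only; real rows would come from the database.
rows = [
    {"model": "gpt-4o", "team": "search", "spend": 0.50},
    {"model": "gpt-3.5-turbo", "team": "search", "spend": 0.25},
    {"model": "gpt-4o", "team": "support", "spend": 1.25},
]

print(spend_by(rows, "model"))  # [('gpt-4o', 1.75), ('gpt-3.5-turbo', 0.25)]
print(spend_by(rows, "team"))   # [('support', 1.25), ('search', 0.75)]
```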

Making It Even More Powerful

Once live, you can extend the setup further:

  • Attach tags to your requests to categorize spend by team or feature.

  • Set budgets or rate limits for specific tags so departments stay within allocated spend.

  • Integrate telemetry systems (e.g., PostHog, Datadog) for deeper analytics across performance and cost.
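As one way to approach the budget and rate-limit item, the sketch below builds a request for a new API key with a spend cap and a request-rate cap attached. Note that this limits a key rather than a tag, which is a simpler variant of what the list describes. The `/key/generate` path and the field names `max_budget`, `budget_duration`, and `rpm_limit` are assumptions based on LiteLLM's key management API and should be checked against its documentation; `build_key_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_key_request(proxy_url, master_key, max_budget, budget_duration,
                      rpm_limit, tags):
    """Build (but do not send) a request for a new budget-limited API key."""
    payload = {
        "max_budget": max_budget,            # spend cap in USD (assumed field)
        "budget_duration": budget_duration,  # reset window, e.g. "30d" (assumed)
        "rpm_limit": rpm_limit,              # requests per minute (assumed field)
        "metadata": {"tags": list(tags)},    # labels for attributing spend
    }
    return urllib.request.Request(
        f"{proxy_url.rstrip('/')}/key/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {master_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_key_request(
    "http://localhost:4000",
    "sk-your-master-key",
    max_budget=50.0,
    budget_duration="30d",
    rpm_limit=60,
    tags=["team:marketing"],  # illustrative tag name
)
```

Handing each team its own key, rather than the master key, keeps the budget and the spend attribution tied to the same identifier.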

Tracking LLM usage and cost doesn’t have to be ad-hoc or opaque. By routing all AI calls through LiteLLM with PostgreSQL storage and its UI, you build a transparent, extensible, and governable cost monitoring pipeline.

This approach equips teams with the data they need to optimize spending, analyze usage patterns, and build more cost-efficient AI systems — all on a self-hosted stack they control.

Comments

Mon January 12, 2026 06:03 AM

Wendy Munoz's LiteLLM stack = LLM cost control singularity

PostgreSQL + proxy gateway delivers $47K/month visibility across OpenAI/Anthropic/Azure with zero custom tooling.

Master key + tag attribution = team-level governance at the 4000 port dashboard.

Production architecture: App → localhost:4000 → LiteLLM → Provider (GPT/Claude). Every request logs tokens + cost + metadata to Postgres. UI explodes spend by model/team/user.

    docker run -v config.yaml:/app/config.yaml -p 4000:4000 litellm:main

    config: master_key=sk-123 + database_url=postgres://...

Value extraction:

  • Top models: gpt-4o (67% spend, $24K)

  • Top teams: Marketing (47%, $12K)

  • Anomalies: Legal chat loop (3.2K req/hr)

  • Budgets: Engineering alert@8K tokens

Centralized cost truth. Tag-enforced governance. Self-hosted sovereignty. No vendor lock. Deploy proxy. Costs visible. Chaos terminated.