The $4,000 Thursday

It starts as a Tuesday. Your agent works fine. Wednesday, you notice it ran longer than usual. By Thursday morning, you have a $4,200 bill on your credit card and no idea which agent did it or why.

You check the vendor dashboard. It shows total spend. It doesn't show which of your 12 agents made 40,000 calls to Claude overnight. It doesn't show that one agent hit a bug and retried every request 30 times. It doesn't show that your cost-per-task is 3x what it was a month ago.

Generic spend dashboards tell you how much. They don't tell you why, who, or when.

Real-time cost tracking for AI agents needs three things working together: granular transaction logging at the wallet level, a fast path to query current spend without polling the vendor API, and alerting that fires before you've already blown the budget.

$0.003 avg cost / API call · 14ms avg latency · 100% transaction coverage

Architecture: the three-layer stack

A real-time cost tracking system has three moving parts:

1. The wallet ledger. Every purchase goes through a wallet. The wallet is the source of truth for who spent what, when, and on which vendor. It records amount_usd, vendor, description, and timestamp atomically — no async, no eventual consistency.

2. The webhook push. When a transaction is recorded, a webhook fires to your endpoint. Your handler writes to a time-series store (or Postgres table) immediately. This is the "real time" part — you're not polling for updates, you're receiving them.

3. The query layer. An Express endpoint reads from the events table and returns aggregates. This feeds a dashboard, an alert check, or a billing report. Queries run against your own data, not the vendor's API.

Why not just use the vendor dashboard? Anthropic and OpenAI dashboards show you total spend across all API keys. If you run 10 agents, they all share the same billing account. You can't tell agent-7's research task from agent-3's summarization job. Trove's per-wallet ledger gives you per-agent cost attribution — every transaction tagged to a specific wallet and agent_id.

Step 1: Provision per-agent wallets

One wallet per agent

Give every agent its own wallet. The wallet is the cost container — everything the agent spends routes through it. This is the foundation of attribution.

Provision per-agent wallet
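    // Create a wallet for one agent and return the wallet record.
    async function createAgentWallet(agentId, budgetLimitUsd) {
      const res = await fetch('https://trove-qjow.polsia.app/api/wallets', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'X-Trove-Key': process.env.TROVE_API_KEY
        },
        body: JSON.stringify({
          agent_id: agentId,
          name: `Wallet: ${agentId}`,
          budget_limit_usd: budgetLimitUsd
        })
      });
      // Fail loudly instead of destructuring an error body.
      if (!res.ok) {
        throw new Error(`Wallet creation failed for ${agentId}: HTTP ${res.status}`);
      }
      const { wallet } = await res.json();
      console.log(`Created wallet ${wallet.id} for ${agentId} (limit: $${budgetLimitUsd})`);
      return wallet;
    }

    // Provision a fleet, keeping each wallet ID for later purchase calls.
    // (Top-level await requires an ES module.)
    const agents = [
      { id: 'research-agent-v4', budget: 200 },
      { id: 'support-agent-v2', budget: 500 },
      { id: 'crawler-agent-v1', budget: 100 },
    ];
    const walletIds = new Map();
    for (const agent of agents) {
      const wallet = await createAgentWallet(agent.id, agent.budget);
      walletIds.set(agent.id, wallet.id);
    }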

Step 2: Route every dollar through the wallet

Wrap the API call in a purchase call

Before your agent calls Anthropic or OpenAI, record the expected cost through the wallet. This creates the transaction record and enforces the budget cap simultaneously.

Record API cost per call
    // vendor is the billing vendor ('anthropic', 'openai', etc.); model is the model name.
    async function agentAPICall(walletId, vendor, model, prompt, maxTokens) {
      // 1. Record the expected cost before calling the model
      const estimatedCost = estimateCost(model, maxTokens);
      const purchaseRes = await fetch(
        `https://trove-qjow.polsia.app/api/wallets/${walletId}/purchase`,
        {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
            'X-Trove-Key': process.env.TROVE_API_KEY
          },
          body: JSON.stringify({
            amount_usd: estimatedCost,
            vendor,
            description: `${model} call, ${maxTokens} tokens`
          })
        }
      );

      // 402: the purchase would exceed the wallet's budget cap
      if (purchaseRes.status === 402) {
        const err = await purchaseRes.json();
        throw new Error(`Budget exhausted: $${err.available} remaining`);
      }
      // Any other failure: do not call the model without a recorded transaction
      if (!purchaseRes.ok) {
        throw new Error(`Purchase failed: HTTP ${purchaseRes.status}`);
      }

      // 2. Call the actual model
      const response = await callModel(model, prompt, maxTokens);
      return response;
    }
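The `estimateCost` helper called above is not defined in this walkthrough. Here is a minimal sketch, assuming a worst-case estimate in which the call emits all `maxTokens` output tokens. The model names and prices in the table are made-up placeholders, not real vendor pricing, and input tokens are ignored.

```javascript
// Hypothetical USD prices per million output tokens.
// Placeholder values: replace with your vendors' published rates.
const OUTPUT_PRICE_PER_MILLION_USD = new Map([
  ['example-large-model', 15.0],
  ['example-small-model', 1.25],
]);

// Upper-bound estimate: assumes the response uses every one of maxTokens.
function estimateCost(model, maxTokens) {
  const pricePerMillion = OUTPUT_PRICE_PER_MILLION_USD.get(model);
  if (pricePerMillion === undefined) {
    // Refuse to guess: an unknown model would otherwise be recorded as $0.
    throw new Error(`No price configured for model "${model}"`);
  }
  if (!Number.isInteger(maxTokens) || maxTokens <= 0) {
    throw new RangeError('maxTokens must be a positive integer');
  }
  return (maxTokens * pricePerMillion) / 1_000_000;
}

console.log(estimateCost('example-large-model', 1000));
```

Because this is an upper bound, the ledger will run somewhat higher than the vendor's invoice. Whether a recorded purchase can be adjusted afterwards to the actual token usage is not covered here.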

Step 3: Store and query real-time spend

Build a live spending table

Your webhook handler writes every transaction to a Postgres table. Index on wallet_id and created_at so queries stay fast even at scale.

Webhook event storage
    -- Migration: create the spending events table
    CREATE TABLE IF NOT EXISTS spending_events (
      id          BIGSERIAL PRIMARY KEY,
      wallet_id   TEXT NOT NULL,
      tx_id       TEXT NOT NULL UNIQUE,
      amount_usd  NUMERIC(10, 4) NOT NULL,
      vendor      TEXT NOT NULL,
      description TEXT,
      created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
    );

    CREATE INDEX IF NOT EXISTS idx_spending_wallet_time
      ON spending_events (wallet_id, created_at DESC);

    CREATE INDEX IF NOT EXISTS idx_spending_vendor
      ON spending_events (vendor, created_at DESC);
Express webhook handler
    app.post('/webhooks/trove', express.raw({ type: 'application/json' }), (req, res) => {
      // Acknowledge immediately, then process out of band.
      res.status(200).json({ ok: true });
      // Log failures instead of leaving an unhandled promise rejection.
      processWebhookAsync(req.body).catch((err) => {
        console.error('Trove webhook processing failed:', err);
      });
    });

    async function processWebhookAsync(rawBody) {
      const event = JSON.parse(rawBody.toString());
      if (event.type !== 'purchase.succeeded') return;

      const { wallet_id, transaction_id, amount_usd, vendor, description } = event.data;

      // Write to Postgres: fast, indexed, queryable.
      // ON CONFLICT makes a redelivered webhook a no-op.
      await pool.query(
        `INSERT INTO spending_events (wallet_id, tx_id, amount_usd, vendor, description)
         VALUES ($1, $2, $3, $4, $5)
         ON CONFLICT (tx_id) DO NOTHING`,
        [wallet_id, transaction_id, amount_usd, vendor, description]
      );

      // Check for anomaly: fire an alert if velocity spikes
      await checkAnomaly(wallet_id);
    }
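The handler above trusts the webhook payload as-is. Here is a sketch of a validation step that could run before the INSERT, assuming the event shape implied by the handler (`type`, plus `data` carrying `wallet_id`, `transaction_id`, `amount_usd`, `vendor`, `description`) and assuming IDs arrive as strings. The name `toSpendingRow` is hypothetical, and the real payload may carry more fields.

```javascript
// Convert a parsed webhook event into INSERT parameters.
// Returns null for event types we ignore; throws on a malformed purchase event.
function toSpendingRow(event) {
  if (!event || event.type !== 'purchase.succeeded') return null;

  const { wallet_id, transaction_id, amount_usd, vendor, description } = event.data ?? {};

  // These columns are TEXT NOT NULL in spending_events.
  for (const [name, value] of Object.entries({ wallet_id, transaction_id, vendor })) {
    if (typeof value !== 'string' || value.length === 0) {
      throw new TypeError(`Webhook field "${name}" must be a non-empty string`);
    }
  }

  // Accept a number or a numeric string; reject null, '', NaN, and negatives.
  const amount =
    typeof amount_usd === 'string' && amount_usd.trim() !== ''
      ? Number(amount_usd)
      : amount_usd;
  if (typeof amount !== 'number' || !Number.isFinite(amount) || amount < 0) {
    throw new TypeError('Webhook field "amount_usd" must be a non-negative number');
  }

  // Parameter order: (wallet_id, tx_id, amount_usd, vendor, description)
  return [wallet_id, transaction_id, amount_usd, vendor, description ?? null];
}
```

The returned array can be passed directly as the parameter list of the INSERT, and a `null` return means the event can be skipped.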

Step 4: Query the live dashboard

Build the queries that answer your questions

With the events table in place, you can answer any cost question in real time. Here are the queries that matter most.

Dashboard endpoint
    app.get('/api/costs/live', async (req, res) => {
      try {
        // Per-agent spend in the last hour
        const agentSpend = await pool.query(
          `SELECT wallet_id,
                  SUM(amount_usd) AS total_spend,
                  COUNT(*) AS transaction_count,
                  ROUND(SUM(amount_usd) / 60.0, 4) AS spend_per_minute
           FROM spending_events
           WHERE created_at > NOW() - INTERVAL '1 hour'
           GROUP BY wallet_id
           ORDER BY total_spend DESC`
        );

        // Per-vendor spend in the last 24h, with each vendor's share of the total
        const vendorSpend = await pool.query(
          `SELECT vendor,
                  SUM(amount_usd) AS total_usd,
                  ROUND(100.0 * SUM(amount_usd)
                        / NULLIF(SUM(SUM(amount_usd)) OVER (), 0), 1) AS pct_of_total
           FROM spending_events
           WHERE created_at > NOW() - INTERVAL '24 hours'
           GROUP BY vendor
           ORDER BY total_usd DESC`
        );

        res.json({ agentSpend: agentSpend.rows, vendorSpend: vendorSpend.rows });
      } catch (err) {
        // Return a clean 500 rather than leaving the request hanging on a query error
        console.error('Live cost query failed:', err);
        res.status(500).json({ error: 'cost query failed' });
      }
    });
Agent Fleet: Live Spend

    Agent               Spend     Rate
    research-agent-v4   $23.41    $0.39/min
    support-agent-v2    $8.17     $0.14/min
    crawler-agent-v1    $1.88     $0.03/min

    Top vendor: anthropic ($19.44) · 83% of spend
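One detail worth knowing when consuming this endpoint, assuming `pool` is a node-postgres `Pool`: by default that driver returns `NUMERIC` and `BIGINT` values as strings, which covers both `SUM(amount_usd)` and `COUNT(*)` here. The JSON therefore carries `"23.4100"` rather than `23.41`. Below is a small sketch of a row normalizer for display purposes, using the column aliases from the per-agent query; `normalizeAgentRow` is a hypothetical helper name.

```javascript
// Convert the string-typed aggregates from the per-agent query into numbers.
// Suitable for dashboards; for invoicing, keep the exact decimal strings.
function normalizeAgentRow(row) {
  return {
    wallet_id: row.wallet_id,
    total_spend: Number(row.total_spend),
    transaction_count: Number(row.transaction_count),
    spend_per_minute: Number(row.spend_per_minute),
  };
}

const exampleRow = normalizeAgentRow({
  wallet_id: 'research-agent-v4',
  total_spend: '23.4100',
  transaction_count: '7803',
  spend_per_minute: '0.3902',
});
console.log(exampleRow);
```

Applied in the endpoint, this would be `agentSpend.rows.map(normalizeAgentRow)` before the rows are passed to `res.json`.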

Detecting a runaway loop

The most expensive AI agent failure is the infinite loop. Your agent calls an API, gets a result, loops back to try again, hits a retry handler, and makes the same call 500 times. At $0.003 per call, that's $1.50 in 10 minutes. It can hit $50 before you notice.
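The arithmetic behind those figures, as a small sketch. The rate of 50 calls per minute is inferred from 500 calls in 10 minutes, and the function names are illustrative.

```javascript
// Total cost of a burst of identical calls.
function loopCostUsd(calls, costPerCallUsd) {
  return calls * costPerCallUsd;
}

// Minutes until a loop running at a steady rate has spent targetUsd.
function minutesUntilSpend(targetUsd, costPerCallUsd, callsPerMinute) {
  return targetUsd / (costPerCallUsd * callsPerMinute);
}

console.log(loopCostUsd(500, 0.003).toFixed(2));            // prints 1.50
console.log(Math.round(minutesUntilSpend(50, 0.003, 50)));  // prints 333
```

At that rate the loop needs about five and a half hours to reach $50, which is roughly one unattended night.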

The tell is spend velocity — transactions per minute. Normal agents have a stable rate. Loops have a sudden spike. Here's how to detect it:

Anomaly detection query
    async function checkAnomaly(walletId) {
      // Current transaction rate (tx/min) over the last 5 minutes
      const current = await pool.query(
        `SELECT COUNT(*)::float / 5 AS current_rate
         FROM spending_events
         WHERE wallet_id = $1
           AND created_at > NOW() - INTERVAL '5 minutes'`,
        [walletId]
      );

      // Baseline: average tx/min for this hour-of-day over the previous 7 days.
      // Bucket by calendar hour so each day contributes its own rate, and
      // exclude the current hour so a loop in progress cannot inflate it.
      // Hours with no transactions produce no bucket and are not averaged in.
      const baseline = await pool.query(
        `SELECT AVG(rate) AS baseline_rate
         FROM (
           SELECT COUNT(*)::float / 60 AS rate
           FROM spending_events
           WHERE wallet_id = $1
             AND created_at >= date_trunc('hour', NOW()) - INTERVAL '7 days'
             AND created_at < date_trunc('hour', NOW())
             AND EXTRACT(hour FROM created_at) = EXTRACT(hour FROM NOW())
           GROUP BY date_trunc('hour', created_at)
         ) sub`,
        [walletId]
      );

      const curRate = current.rows[0].current_rate || 0;
      const baseRate = baseline.rows[0].baseline_rate || 0;

      // Alert if the current rate exceeds 5x baseline (no alert without a baseline)
      if (baseRate > 0 && curRate > baseRate * 5) {
        await sendSlackAlert(
          `Loop detected: wallet ${walletId} at ${curRate.toFixed(2)} tx/min (baseline: ${baseRate.toFixed(2)})`
        );
      }
    }

The baseline adapts to your agent's actual behavior. A research agent that normally runs at 2 tx/min will trigger an alert if it jumps above 10 tx/min. A crawler that normally runs at 20 tx/min won't false-positive on a jump to 30 tx/min. The multiplier (5x) is tunable: raise it if you get too many false alerts, lower it if you're missing real loops.
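The threshold rule can be pulled out into a pure function so that it is easy to unit test and tune. Here is a sketch mirroring the comparison in `checkAnomaly`; the name `isRunaway` and the `multiplier` parameter are illustrative.

```javascript
// True when the current rate exceeds the baseline by more than the multiplier.
// With no baseline (0, null, or undefined) it never alerts, as in checkAnomaly.
function isRunaway(currentRate, baselineRate, multiplier = 5) {
  if (!(baselineRate > 0)) return false;
  return currentRate > baselineRate * multiplier;
}

console.log(isRunaway(12, 2));        // true:  12 > 2 * 5
console.log(isRunaway(30, 20));       // false: 30 <= 20 * 5
console.log(isRunaway(30, 20, 1.2));  // true:  a lower multiplier is more sensitive
```

One consequence of this rule is that a brand-new wallet with no history can never trigger the alert. An absolute tx/min ceiling alongside the relative check is one way to cover that gap.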

Making cost data actionable

Tracking spend is useful. Acting on it is what pays the bills. Here are the three decisions cost data enables:

Right-size agent budgets. After 30 days of real-time data, you know what each agent actually spends. Set the budget limit at about 2x its typical monthly spend (once you have several months of history, 2x the 90th-percentile month). If an agent normally runs $80/month but climbs to $120 when it has a bad week, you want to know at $160, not at $500.

Route tasks to cheaper vendors. If Anthropic accounts for 70% of your spend and your summarization tasks don't need the most capable model, route them to a cheaper provider. You can only make this trade-off when you have per-vendor attribution data.

Bill clients accurately. If you build AI agents for clients and charge them for API costs, you need per-agent spend data to back up your invoices. A webhook-driven ledger gives you the auditable transaction history that makes cost-plus billing credible.
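The budget rule from the first decision above (2x the 90th-percentile monthly spend) can be sketched as follows. This assumes the nearest-rank definition of a percentile, which is one of several in common use, and the function names are illustrative.

```javascript
// Nearest-rank percentile of a non-empty array of numbers (p between 1 and 100).
function percentile(values, p) {
  if (values.length === 0) throw new RangeError('values must not be empty');
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Suggested budget cap: headroom times the 90th-percentile monthly spend.
function suggestBudgetLimit(monthlySpendUsd, headroom = 2) {
  return headroom * percentile(monthlySpendUsd, 90);
}

console.log(suggestBudgetLimit([80]));                                       // prints 160
console.log(suggestBudgetLimit([72, 75, 78, 80, 80, 81, 83, 85, 90, 120])); // prints 180
```

With a single month of history the percentile is just that month's total, which reproduces the $80 to $160 example. In the second call the $120 outlier month sits above the 90th percentile, so it does not drag the cap upward.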

The minimum viable setup. One Postgres table, one webhook handler, one Express endpoint. That's enough to see live spend by agent and detect a runaway loop before it costs $200. Everything else — anomaly thresholds, Slack alerts, monthly rollups — is additive on top of this foundation.

Try it with a sandbox wallet

The playground gives you a sandbox wallet and API key. Make a few purchase calls, hit the balance endpoint, and see the transaction ledger populate in real time.

For the full API reference — balance checks, transaction pagination, budget updates — see the Trove documentation.