Cost-Aware API Management Playbook

Overview

Stop discovering API cost overruns on month-end invoices. This playbook shows you how to set up FlexGate's AI-powered cost tracking, which detects anomalies in real time, analyzes root causes with Claude, and automatically applies optimizations.

What You'll Build:

Per-route, per-client cost tracking
Real-time budget alerts
Claude-powered cost optimization analysis
Automatic caching and rate limiting
Cost trend forecasting

Prerequisites:

FlexGate v2.0+ with AI module installed
Anthropic API key (available from the Anthropic Console)
Basic understanding of API pricing (OpenAI, Anthropic, etc.)
Node.js 18+

Step 1: Define Your API Costs (5 minutes)

Create config/api-costs.ts:

typescript

// Define cost per API call for your routes
export const API_COSTS = {
  // OpenAI routes
  '/api/chat/completions': {
    baseCost: 0.002,      // $0.002 per request (GPT-4)
    costPerToken: 0.00003, // $0.03 per 1K tokens
    model: 'gpt-4'
  },
  
  '/api/embeddings': {
    baseCost: 0.0001,     // $0.0001 per request
    costPerToken: 0.0000001,
    model: 'text-embedding-ada-002'
  },
  
  '/api/completions': {
    baseCost: 0.001,      // $0.001 per request (GPT-3.5)
    costPerToken: 0.000002,
    model: 'gpt-3.5-turbo'
  },
  
  // Anthropic routes
  '/api/claude/messages': {
    baseCost: 0.015,      // $0.015 per request (Claude 3.5 Sonnet)
    costPerToken: 0.000003,
    model: 'claude-3-5-sonnet-20241022'
  },
  
  // Database queries (estimated)
  '/api/analytics': {
    baseCost: 0.0005,     // $0.0005 per complex query
    model: 'postgresql'
  },
  
  // Default for unknown routes
  default: {
    baseCost: 0.0001,
    model: 'unknown'
  }
} as const;

// Daily, monthly, and per-client budgets
export const BUDGETS = {
  daily: {
    total: 50.00,         // $50/day total
    perRoute: {
      '/api/chat/completions': 30.00,
      '/api/claude/messages': 15.00,
      '/api/analytics': 5.00
    }
  },
  
  monthly: {
    total: 1500.00,       // $1500/month total
    perRoute: {
      '/api/chat/completions': 900.00,
      '/api/claude/messages': 450.00,
      '/api/analytics': 150.00
    }
  },
  
  // Per-client budgets (prevent single client from using entire quota)
  perClient: {
    daily: 10.00,         // $10/day per client
    monthly: 300.00       // $300/month per client
  }
};

// Cost alert thresholds
export const COST_THRESHOLDS = {
  warning: 0.75,    // Alert at 75% of budget
  critical: 0.90,   // Critical alert at 90% of budget
  emergency: 1.00   // Emergency at 100% of budget
};

Customize for your use case:

typescript

// Example: Twilio SMS costs
'/api/sms/send': {
  baseCost: 0.0075,  // $0.0075 per SMS
  model: 'twilio'
}

// Example: AWS Lambda invocations
'/api/process': {
  baseCost: 0.0000002,  // $0.20 per 1M requests
  model: 'aws-lambda'
}
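
To see how these entries turn into a per-request charge, the sketch below applies the same formula the Step 2 middleware uses: baseCost plus total tokens times costPerToken. The helper estimateRequestCost and the trimmed-down SAMPLE_COSTS table are illustrative names, not part of FlexGate.

```typescript
// Illustrative only: a trimmed copy of the cost table above.
interface RouteCost {
  baseCost: number;       // flat USD charge per request
  costPerToken?: number;  // USD per token, when the upstream API bills by token
  model: string;
}

const SAMPLE_COSTS: Record<string, RouteCost> = {
  '/api/chat/completions': { baseCost: 0.002, costPerToken: 0.00003, model: 'gpt-4' },
  '/api/analytics': { baseCost: 0.0005, model: 'postgresql' },
  default: { baseCost: 0.0001, model: 'unknown' },
};

// Hypothetical helper: flat charge plus token charge, falling back to `default`.
function estimateRequestCost(route: string, totalTokens = 0): number {
  const config = SAMPLE_COSTS[route] || SAMPLE_COSTS.default;
  return config.baseCost + totalTokens * (config.costPerToken ?? 0);
}

// A 1,500-token chat completion: 0.002 + 1500 * 0.00003
console.log(estimateRequestCost('/api/chat/completions', 1500).toFixed(4)); // 0.0470
// A route with no entry falls back to the default cost
console.log(estimateRequestCost('/api/unlisted').toFixed(4)); // 0.0001
```

At these rates the token charge dominates the flat fee for chat routes, so budgets are more sensitive to prompt and response length than to request count.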

Step 2: Implement Cost Tracking Middleware (10 minutes)

Create middleware/cost-tracking.ts:

typescript

import { Request, Response, NextFunction } from 'express';
import { AIEventEmitter, AIEventType, EventSeverity } from '@flexgate/ai';
import { API_COSTS, BUDGETS, COST_THRESHOLDS } from '../config/api-costs';

// Cost tracking storage (use Redis in production)
interface CostRecord {
  route: string;
  client: string;
  cost: number;
  timestamp: string;
  tokens?: number;
  model?: string;
}

const costRecords: CostRecord[] = [];
const dailyCosts = new Map<string, number>(); // route -> total cost today
const clientCosts = new Map<string, number>(); // client -> total cost today

export const costTracking = (req: Request, res: Response, next: NextFunction) => {
  const startTime = Date.now();
  
  // Capture original json method to intercept response
  const originalJson = res.json;
  
  res.json = function(body: any) {
    // Calculate cost based on route and response
    const route = req.route?.path || req.path;
    const client = req.headers['x-client-id'] as string || 'anonymous';
    
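    // API_COSTS is declared `as const`, so widen it before indexing by an arbitrary string
    const costTable = API_COSTS as Record<string, { baseCost: number; costPerToken?: number; model: string }>;
    const costConfig = costTable[route] || costTable.default;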
    let totalCost = costConfig.baseCost;
    
    // Add token-based cost if available
    if (body?.usage?.total_tokens && costConfig.costPerToken) {
      totalCost += body.usage.total_tokens * costConfig.costPerToken;
    }
    
    // Record cost
    const record: CostRecord = {
      route,
      client,
      cost: totalCost,
      timestamp: new Date().toISOString(),
      tokens: body?.usage?.total_tokens,
      model: costConfig.model
    };
    
    costRecords.push(record);
    if (costRecords.length > 10000) {
      costRecords.shift(); // Keep last 10K records
    }
    
    // Update daily totals
    const routeKey = `${route}`;
    dailyCosts.set(routeKey, (dailyCosts.get(routeKey) || 0) + totalCost);
    clientCosts.set(client, (clientCosts.get(client) || 0) + totalCost);
    
    // Check budgets and emit alerts
    checkBudgets(route, client, totalCost).catch(console.error);
    
    return originalJson.call(this, body);
  };
  
  next();
};

async function checkBudgets(route: string, client: string, cost: number) {
  const aiEvents = (await import('../config/ai-events')).default;
  
  // Check route budget
  const routeDailyCost = dailyCosts.get(route) || 0;
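  // Routes without their own budget fall back to the overall daily total
  const routeBudget = (BUDGETS.daily.perRoute as Record<string, number>)[route] || BUDGETS.daily.total;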
  const routeUsage = routeDailyCost / routeBudget;
  
  if (routeUsage >= COST_THRESHOLDS.warning) {
    const severity = routeUsage >= COST_THRESHOLDS.critical 
      ? EventSeverity.CRITICAL 
      : EventSeverity.WARNING;
    
    await aiEvents.emit({
      type: AIEventType.COST_ALERT,
      severity,
      summary: `Route ${route} at ${Math.round(routeUsage * 100)}% of daily budget ($${routeDailyCost.toFixed(2)}/$${routeBudget.toFixed(2)})`,
      data: {
        metric: 'cost_usd',
        current_value: routeDailyCost,
        threshold: routeBudget * COST_THRESHOLDS.warning,
        window: '1d',
        trend: 'RISING',
        unit: 'usd'
      },
      context: {
        route,
        budget_daily: routeBudget,
        usage_percent: routeUsage * 100,
        top_clients: getTopClientsByRoute(route),
        time_to_monthly_limit: estimateTimeToLimit(routeDailyCost, (BUDGETS.monthly.perRoute as Record<string, number>)[route] || BUDGETS.monthly.total)
      }
    });
  }
  
  // Check client budget
  const clientDailyCost = clientCosts.get(client) || 0;
  const clientBudget = BUDGETS.perClient.daily;
  const clientUsage = clientDailyCost / clientBudget;
  
  if (clientUsage >= COST_THRESHOLDS.warning) {
    await aiEvents.emit({
      type: AIEventType.COST_ALERT,
      severity: clientUsage >= COST_THRESHOLDS.critical 
        ? EventSeverity.CRITICAL 
        : EventSeverity.WARNING,
      summary: `Client ${client} at ${Math.round(clientUsage * 100)}% of daily budget ($${clientDailyCost.toFixed(2)}/$${clientBudget.toFixed(2)})`,
      data: {
        metric: 'cost_usd',
        current_value: clientDailyCost,
        threshold: clientBudget * COST_THRESHOLDS.warning,
        window: '1d',
        trend: 'RISING',
        unit: 'usd'
      },
      context: {
        client,
        budget_daily: clientBudget,
        usage_percent: clientUsage * 100,
        client_routes: getClientRoutes(client)
      }
    });
  }
}

function getTopClientsByRoute(route: string): Array<{ id: string; cost_usd: number; requests: number }> {
  const clientStats = new Map<string, { cost: number; requests: number }>();
  
  costRecords
    .filter(r => r.route === route)
    .forEach(r => {
      const stats = clientStats.get(r.client) || { cost: 0, requests: 0 };
      stats.cost += r.cost;
      stats.requests += 1;
      clientStats.set(r.client, stats);
    });
  
  return Array.from(clientStats.entries())
    .map(([id, stats]) => ({ id, cost_usd: stats.cost, requests: stats.requests }))
    .sort((a, b) => b.cost_usd - a.cost_usd)
    .slice(0, 5);
}

function getClientRoutes(client: string): Array<{ route: string; cost_usd: number }> {
  const routeStats = new Map<string, number>();
  
  costRecords
    .filter(r => r.client === client)
    .forEach(r => {
      routeStats.set(r.route, (routeStats.get(r.route) || 0) + r.cost);
    });
  
  return Array.from(routeStats.entries())
    .map(([route, cost]) => ({ route, cost_usd: cost }))
    .sort((a, b) => b.cost_usd - a.cost_usd)
    .slice(0, 5);
}

function estimateTimeToLimit(dailyCost: number, monthlyLimit: number): string {
  const daysInMonth = 30;
  const projectedMonthlyCost = dailyCost * daysInMonth;
  
  if (projectedMonthlyCost <= monthlyLimit) {
    return 'Within budget';
  }
  
  const daysToLimit = Math.floor(monthlyLimit / dailyCost);
  return `${daysToLimit} days`;
}

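// Reset daily costs when the calendar day changes (server local time).
// Comparing dates avoids missing the reset if a tick never lands exactly on 00:00.
let currentDay = new Date().toDateString();
setInterval(() => {
  const today = new Date().toDateString();
  if (today !== currentDay) {
    currentDay = today;
    dailyCosts.clear();
    clientCosts.clear();
    console.log('[Cost Tracking] Daily costs reset');
  }
}, 60000).unref(); // Check every minute; unref() keeps the timer from holding the process open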

export { costRecords, dailyCosts, clientCosts };
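
As written, checkBudgets emits a COST_ALERT on every request once a route or client passes 75% of its budget, and each alert can trigger a Claude analysis in Step 4. One way to avoid that, sketched below, is to emit only when usage crosses into a higher threshold band. The names AlertLevel, classifyUsage, and shouldEmitAlert are hypothetical; the thresholds mirror COST_THRESHOLDS, including the emergency level that the middleware above does not otherwise use.

```typescript
type AlertLevel = 'none' | 'warning' | 'critical' | 'emergency';

// Mirrors COST_THRESHOLDS from config/api-costs.ts
const THRESHOLDS = { warning: 0.75, critical: 0.9, emergency: 1.0 };
const LEVEL_ORDER: AlertLevel[] = ['none', 'warning', 'critical', 'emergency'];

// Map a usage ratio (spend / budget) to the highest threshold it has reached.
function classifyUsage(usage: number): AlertLevel {
  if (usage >= THRESHOLDS.emergency) return 'emergency';
  if (usage >= THRESHOLDS.critical) return 'critical';
  if (usage >= THRESHOLDS.warning) return 'warning';
  return 'none';
}

// Highest level already alerted on today, keyed by e.g. "route:/api/analytics".
const lastAlerted = new Map<string, AlertLevel>();

// Returns true only the first time a key reaches a given level.
function shouldEmitAlert(key: string, usage: number): boolean {
  const level = classifyUsage(usage);
  const previous = lastAlerted.get(key) ?? 'none';
  if (LEVEL_ORDER.indexOf(level) <= LEVEL_ORDER.indexOf(previous)) return false;
  lastAlerted.set(key, level);
  return true;
}

console.log(shouldEmitAlert('route:/api/analytics', 0.5));  // false (below warning)
console.log(shouldEmitAlert('route:/api/analytics', 0.76)); // true  (first warning)
console.log(shouldEmitAlert('route:/api/analytics', 0.8));  // false (still warning)
console.log(shouldEmitAlert('route:/api/analytics', 0.95)); // true  (escalated to critical)
```

Clearing lastAlerted alongside dailyCosts and clientCosts in the daily reset would re-arm the alerts for the next day.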

Add to your Express app:

typescript

import express from 'express';
import { costTracking } from './middleware/cost-tracking';

const app = express();

// Add cost tracking middleware (before routes)
app.use(costTracking);

// ... your routes

Step 3: Configure Cost Alerts (5 minutes)

Update config/ai-events.ts:

typescript

import { AIEventEmitter, AIEventType, EventSeverity } from '@flexgate/ai';

export const aiEvents = new AIEventEmitter({
  enabled: true,
  
  events: {
    // ... existing events
    
    [AIEventType.COST_ALERT]: {
      enabled: true,
      threshold: 75,  // 75% of budget triggers alert
      window: '1d',
      minSamples: 5,
      severity: EventSeverity.WARNING
    }
  },
  
  // Budget tracking
  budgets: {
    daily: 50.00,
    monthly: 1500.00,
    alertOnBudgetExceeded: true,
    autoThrottle: true  // Automatically rate limit when budget exceeded
  }
});

export default aiEvents;

Step 4: AI-Powered Cost Optimization (5 minutes)

Create services/cost-optimizer.ts:

typescript

import Anthropic from '@anthropic-ai/sdk';
import { PromptTemplateLibrary } from '@flexgate/ai';
import aiEvents from '../config/ai-events';
import type { AIEvent } from '@flexgate/ai';
import { costRecords } from '../middleware/cost-tracking';

const claude = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY!
});

interface OptimizationStrategy {
  strategy: string;
  savings_usd_per_month: number;
  implementation_effort: 'LOW' | 'MEDIUM' | 'HIGH';
  priority: number;
  code_example?: string;
}

// Analyze cost alert with Claude
export async function analyzeAndOptimize(event: AIEvent) {
  try {
    // Build prompt
    const prompt = PromptTemplateLibrary.buildPrompt(event);
    
    console.log(`[Cost Optimizer] Analyzing ${event.context.route || 'overall'} costs`);
    
    // Get Claude analysis
    const response = await claude.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 1024,
      messages: [{
        role: 'user',
        content: prompt
      }]
    });
    
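    // The SDK returns a list of content blocks; only text blocks carry a `text` field
    const firstBlock = response.content[0];
    if (!firstBlock || firstBlock.type !== 'text') {
      throw new Error('Claude returned no text content');
    }
    const analysis = JSON.parse(firstBlock.text);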
    
    console.log(`[Cost Optimizer] Found ${analysis.strategies?.length || 0} optimization strategies`);
    
    // Auto-apply low-effort, high-impact optimizations
    const applied: string[] = [];
    for (const strategy of analysis.strategies || []) {
      if (strategy.implementation_effort === 'LOW' && strategy.priority === 1) {
        const success = await applyOptimization(strategy, event);
        if (success) {
          applied.push(strategy.strategy);
        }
      }
    }
    
    return {
      event,
      analysis,
      appliedOptimizations: applied,
      estimatedSavings: analysis.strategies
        ?.reduce((sum, s) => sum + s.savings_usd_per_month, 0) || 0,
      timestamp: new Date().toISOString()
    };
    
  } catch (error) {
    console.error('[Cost Optimizer] Analysis failed:', error);
    throw error;
  }
}

async function applyOptimization(strategy: OptimizationStrategy, event: AIEvent): Promise<boolean> {
  try {
    console.log(`[Cost Optimizer] Applying: ${strategy.strategy}`);
    
    const name = strategy.strategy.toLowerCase();
    // Route alerts carry context.route; client alerts carry context.client
    const { route, client } = event.context;
    
    // Cache optimization (needs a target route)
    if (route && (name.includes('cache') || name.includes('caching'))) {
      await enableCaching(route, 300); // 5 min TTL
      return true;
    }
    
    // Rate limiting (needs a target client)
    if (client && name.includes('rate limit')) {
      await applyRateLimit(client, 60, 12); // 12 req/min
      return true;
    }
    
    // Request batching (needs a target route)
    if (route && name.includes('batch')) {
      await enableBatching(route);
      return true;
    }
    
    return false;
  } catch (error) {
    console.error(`[Cost Optimizer] Failed to apply ${strategy.strategy}:`, error);
    return false;
  }
}

async function enableCaching(route: string, ttl: number) {
  // Implementation depends on your cache system (Redis, Memcached, etc.)
  console.log(`[Cache] Enabled ${ttl}s cache for ${route}`);
  
  // Example with Redis
  // await redis.setex(`cache:${route}:config`, 86400, JSON.stringify({ ttl }));
}

async function applyRateLimit(client: string, windowSeconds: number, maxRequests: number) {
  console.log(`[Rate Limit] Applied ${maxRequests} req/${windowSeconds}s for ${client}`);
  
  // Example: Store in rate limiter config
  // await rateLimiter.setLimit(client, { windowSeconds, maxRequests });
}

async function enableBatching(route: string) {
  console.log(`[Batching] Enabled request batching for ${route}`);
  
  // Example: Configure batch processor
  // await batchProcessor.enable(route, { maxBatchSize: 10, maxWaitMs: 100 });
}

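// Auto-analyze cost alerts
// notifyCostAlert is implemented in Step 5 (services/cost-notifier.ts)
import { notifyCostAlert } from './cost-notifier';

aiEvents.on('COST_ALERT', async (event) => {
  try {
    const result = await analyzeAndOptimize(event);
    
    console.log('[Cost Alert] Analysis complete:', {
      strategies: result.analysis.strategies?.length,
      applied: result.appliedOptimizations.length,
      estimatedSavings: `$${result.estimatedSavings}/month`
    });
    
    // Notify team
    await notifyCostAlert(event, result.analysis, result.appliedOptimizations);
  } catch (error) {
    // Without this catch, a failed analysis would surface as an unhandled promise rejection
    console.error('[Cost Alert] Handling failed:', error);
  }
});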

export default { analyzeAndOptimize };
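
Calling JSON.parse on the raw reply assumes Claude returns bare JSON. Replies are sometimes wrapped in a Markdown code fence or surrounded by prose, so a more defensive parser can help. The sketch below assumes the analysis object has a strategies array shaped like OptimizationStrategy above. The name parseAnalysis is illustrative, and the real response schema depends on the prompt that PromptTemplateLibrary produces.

```typescript
interface OptimizationStrategy {
  strategy: string;
  savings_usd_per_month: number;
  implementation_effort: 'LOW' | 'MEDIUM' | 'HIGH';
  priority: number;
}

interface CostAnalysis {
  strategies: OptimizationStrategy[];
  [key: string]: unknown; // other fields (root_cause, ...) pass through untouched
}

// Hypothetical helper: parse the text between the first "{" and the last "}",
// then keep only well-formed strategy entries.
function parseAnalysis(reply: string): CostAnalysis {
  const start = reply.indexOf('{');
  const end = reply.lastIndexOf('}');
  if (start === -1 || end <= start) {
    throw new Error('No JSON object found in model reply');
  }
  const parsed = JSON.parse(reply.slice(start, end + 1)) as Record<string, unknown>;
  const raw = Array.isArray(parsed.strategies) ? parsed.strategies : [];

  // Keep only entries with the fields the auto-apply loop relies on.
  const strategies = raw.filter(
    (s: any): s is OptimizationStrategy =>
      typeof s?.strategy === 'string' &&
      typeof s?.savings_usd_per_month === 'number' &&
      ['LOW', 'MEDIUM', 'HIGH'].includes(s?.implementation_effort) &&
      typeof s?.priority === 'number'
  );
  return { ...parsed, strategies };
}

// A reply wrapped in prose and a code fence, with one malformed strategy entry
const fence = '`'.repeat(3);
const reply =
  'Here is the analysis:\n' + fence + 'json\n' +
  '{"root_cause":"cache miss","strategies":[' +
  '{"strategy":"Enable caching","savings_usd_per_month":120,"implementation_effort":"LOW","priority":1},' +
  '{"strategy":"bad entry"}]}\n' + fence;

console.log(parseAnalysis(reply).strategies.length); // 1
```

Dropping malformed entries before the auto-apply loop means a partially valid reply still yields usable strategies instead of failing the whole analysis.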

Step 5: Cost Alert Notifications (5 minutes)

Create services/cost-notifier.ts:

typescript

import { App } from '@slack/bolt';
import type { AIEvent } from '@flexgate/ai';

const slack = new App({
  token: process.env.SLACK_BOT_TOKEN,
  signingSecret: process.env.SLACK_SIGNING_SECRET
});

const FINANCE_CHANNEL = process.env.SLACK_FINANCE_CHANNEL || '#api-costs';

export async function notifyCostAlert(event: AIEvent, analysis: any, appliedOptimizations: string[]) {
  if (!process.env.SLACK_BOT_TOKEN) {
    console.log('[Slack] Not configured, skipping cost notification');
    return;
  }
  
  try {
    const usagePercent = Math.round(event.context.usage_percent || 0);
    const severityEmoji = usagePercent >= 90 ? '🚨' : '⚠️';
    
    await slack.client.chat.postMessage({
      channel: FINANCE_CHANNEL,
      blocks: [
        {
          type: 'header',
          text: {
            type: 'plain_text',
            text: `${severityEmoji} Cost Alert: ${event.context.route || event.context.client}`,
            emoji: true
          }
        },
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*Summary:* ${event.summary}\n*Budget Usage:* ${usagePercent}% ($${event.data.current_value.toFixed(2)}/$${event.context.budget_daily.toFixed(2)})\n*Trend:* ${event.data.trend}`
          }
        },
        {
          type: 'divider'
        },
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*🤖 AI Analysis:*\n\n${formatCostAnalysis(analysis)}`
          }
        },
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*💰 Estimated Savings:* $${calculateTotalSavings(analysis.strategies)}/month`
          }
        },
        ...(appliedOptimizations.length > 0 ? [{
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*✅ Auto-Applied Optimizations:*\n${appliedOptimizations.map(o => `• ${o}`).join('\n')}`
          }
        }] : []),
        {
          type: 'section',
          text: {
            type: 'mrkdwn',
            text: `*🔝 Top Clients:*\n${formatTopClients(event.context.top_clients)}`
          }
        },
        {
          type: 'actions',
          elements: [
            {
              type: 'button',
              text: { type: 'plain_text', text: '📊 View Details' },
              style: 'primary',
              url: `https://your-dashboard.com/costs/${event.id}`
            },
            {
              type: 'button',
              text: { type: 'plain_text', text: '✅ Acknowledge' },
              action_id: 'ack_cost_alert',
              value: event.id
            }
          ]
        }
      ]
    });
    
    console.log(`[Slack] Cost alert sent to ${FINANCE_CHANNEL}`);
    
  } catch (error) {
    console.error('[Slack] Cost notification failed:', error);
  }
}

function formatCostAnalysis(analysis: any): string {
  let text = '';
  
  if (analysis.anomaly_assessment) {
    text += `*Assessment:* ${analysis.anomaly_assessment}\n\n`;
  }
  
  if (analysis.root_cause) {
    text += `*Root Cause:* ${analysis.root_cause}\n\n`;
  }
  
  if (analysis.strategies && analysis.strategies.length > 0) {
    text += '*Top Optimization Strategies:*\n';
    analysis.strategies.slice(0, 3).forEach((s: any, i: number) => {
      text += `${i + 1}. ${s.strategy}\n`;
      text += `   • Savings: $${s.savings_usd_per_month}/month\n`;
      text += `   • Effort: ${s.implementation_effort}\n`;
    });
  }
  
  return text;
}

function formatTopClients(clients: Array<{ id: string; cost_usd: number; requests: number }>): string {
  if (!clients || clients.length === 0) return 'No client data available';
  
  return clients.slice(0, 5).map((c, i) => 
    `${i + 1}. ${c.id}: $${c.cost_usd.toFixed(2)} (${c.requests} requests)`
  ).join('\n');
}

function calculateTotalSavings(strategies: any[]): number {
  if (!strategies) return 0;
  return strategies.reduce((sum, s) => sum + (s.savings_usd_per_month || 0), 0);
}

slack.action('ack_cost_alert', async ({ ack, body }) => {
  await ack();
  
  await slack.client.chat.postMessage({
    channel: FINANCE_CHANNEL,
    thread_ts: (body as any).message.ts,
    text: `✅ Cost alert acknowledged by <@${body.user.id}>`
  });
});

export default slack;
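
Slack limits the text of a section block to 3,000 characters, and a long AI analysis could exceed that and cause postMessage to be rejected. A minimal guard, assuming a hypothetical helper named truncateForSlack, might look like this:

```typescript
const SLACK_SECTION_TEXT_LIMIT = 3000; // Slack's documented maximum for section block text

// Hypothetical helper: cut the text so that it fits, marking the cut for readers.
function truncateForSlack(text: string, limit = SLACK_SECTION_TEXT_LIMIT): string {
  if (text.length <= limit) return text;
  const suffix = '\n_(truncated)_';
  return text.slice(0, limit - suffix.length) + suffix;
}

console.log(truncateForSlack('x'.repeat(5000)).length); // 3000
```

Wrapping the output of formatCostAnalysis in this helper before building the block keeps the notification deliverable even when the analysis is verbose.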

Step 6: Cost Dashboard API (Optional - 5 minutes)

Create routes/cost-api.ts:

typescript

import express from 'express';
import { costRecords, dailyCosts, clientCosts } from '../middleware/cost-tracking';

const router = express.Router();

// Get current costs
router.get('/api/costs/current', (req, res) => {
  const totalDaily = Array.from(dailyCosts.values()).reduce((a, b) => a + b, 0);
  
  res.json({
    daily_total: totalDaily,
    by_route: Object.fromEntries(dailyCosts),
    by_client: Object.fromEntries(clientCosts),
    timestamp: new Date().toISOString()
  });
});

// Get cost trend
router.get('/api/costs/trend', (req, res) => {
  const hours = parseInt(req.query.hours as string, 10) || 24;
  const cutoff = new Date(Date.now() - hours * 60 * 60 * 1000);
  
  // Bucket by UTC date and hour (e.g. "2026-02-15T14") so that windows
  // longer than 24 hours don't merge the same hour from different days
  const recentCosts = costRecords
    .filter(r => new Date(r.timestamp) >= cutoff)
    .reduce((acc, r) => {
      const bucket = r.timestamp.slice(0, 13);
      acc[bucket] = (acc[bucket] || 0) + r.cost;
      return acc;
    }, {} as Record<string, number>);
  
  res.json({
    hours,
    trend: recentCosts,
    total: Object.values(recentCosts).reduce((a, b) => a + b, 0)
  });
});

// Get top spenders
router.get('/api/costs/top-clients', (req, res) => {
  const limit = parseInt(req.query.limit as string) || 10;
  
  const topClients = Array.from(clientCosts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([client, cost]) => ({
      client,
      cost_usd: cost,
      requests: costRecords.filter(r => r.client === client).length
    }));
  
  res.json({ top_clients: topClients });
});

export default router;

Add to your app:

typescript

import costApi from './routes/cost-api';
app.use(costApi);
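
The overview lists cost trend forecasting, and estimateTimeToLimit in Step 2 projects from a single day's spend. A slightly steadier estimate, sketched here, averages month-to-date spend and extrapolates linearly to the end of the month. The helper forecastMonthEnd is hypothetical, and a straight-line projection is an assumption that ignores weekly seasonality.

```typescript
interface Forecast {
  projectedTotal: number;            // expected spend by the end of the month, in USD
  overBudget: boolean;
  daysUntilBudgetHit: number | null; // null when the budget is not expected to be hit this month
}

function forecastMonthEnd(
  spentSoFar: number,
  dayOfMonth: number,
  daysInMonth: number,
  monthlyBudget: number
): Forecast {
  if (dayOfMonth < 1 || dayOfMonth > daysInMonth) {
    throw new RangeError('dayOfMonth must be between 1 and daysInMonth');
  }
  const dailyAverage = spentSoFar / dayOfMonth;
  const projectedTotal = dailyAverage * daysInMonth;
  const overBudget = projectedTotal > monthlyBudget;

  let daysUntilBudgetHit: number | null = null;
  if (overBudget && dailyAverage > 0) {
    const remaining = Math.max(0, monthlyBudget - spentSoFar);
    daysUntilBudgetHit = Math.ceil(remaining / dailyAverage);
  }
  return { projectedTotal, overBudget, daysUntilBudgetHit };
}

// $600 spent by day 10 of a 30-day month against the $1,500 monthly budget
console.log(forecastMonthEnd(600, 10, 30, 1500));
// { projectedTotal: 1800, overBudget: true, daysUntilBudgetHit: 15 }
```

A dashboard endpoint could return this next to the hourly trend, using the sum of persisted month-to-date costs as spentSoFar.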

Testing

Create tests/cost-optimization.test.ts:

typescript

import { AIEventFactory, AIEventType, EventSeverity, TrendDirection } from '@flexgate/ai';
import { analyzeAndOptimize } from '../services/cost-optimizer';

describe('Cost Optimization', () => {
  
  it('should detect cost spike', async () => {
    const event = AIEventFactory.create({
      type: AIEventType.COST_ALERT,
      summary: 'Route /api/chat costs exceeded daily budget by 3.2x',
      severity: EventSeverity.WARNING,
      data: {
        metric: 'cost_usd',
        current_value: 48.50,
        threshold: 15.00,
        window: '1d',
        trend: TrendDirection.RISING,
        unit: 'usd'
      },
      context: {
        route: '/api/chat',
        budget_daily: 15.00,
        usage_percent: 323,
        top_clients: [
          { id: 'acme-corp', cost_usd: 38.20, requests: 43200 }
        ]
      }
    });
    
    expect(event.confidence).toBeGreaterThan(0.8);
    expect(event.data.breach_ratio).toBeGreaterThan(3.0);
  });
  
  it('should analyze and suggest optimizations', async () => {
    const event = AIEventFactory.createSample('COST_ALERT');
    
    const result = await analyzeAndOptimize(event);
    
    expect(result.analysis).toBeDefined();
    expect(result.analysis.strategies).toBeDefined();
    expect(Array.isArray(result.analysis.strategies)).toBe(true);
    expect(result.estimatedSavings).toBeGreaterThan(0);
  }, 15000);
  
});

Run tests:

bash

npm test -- tests/cost-optimization.test.ts

Verification Checklist

✅ Configuration

[ ] API costs defined for all routes
[ ] Daily/monthly budgets set
[ ] Cost thresholds configured
[ ] Environment variables loaded

Test:

bash

# api-costs.ts is TypeScript, so run it through a TypeScript-aware runner such as ts-node
npx ts-node -e "console.log(require('./config/api-costs').API_COSTS)"

✅ Cost Tracking

[ ] Middleware integrated
[ ] Costs being recorded
[ ] Daily totals accumulating
[ ] Client costs tracked

Test:

bash

# Make some API calls
curl -H "x-client-id: test-client" http://localhost:3000/api/chat/completions

# Check costs
curl http://localhost:3000/api/costs/current

✅ AI Analysis

[ ] Cost alerts triggering
[ ] Claude analysis working
[ ] Optimization strategies generated
[ ] Auto-optimizations applying

Test:

bash

npm test -- tests/cost-optimization.test.ts

✅ Notifications

[ ] Slack alerts arriving
[ ] Cost details visible
[ ] Optimization suggestions included

Production Deployment

Environment Variables

bash

# Cost tracking
COST_TRACKING_ENABLED=true
DAILY_BUDGET_USD=50.00
MONTHLY_BUDGET_USD=1500.00

# Slack notifications (quote the channel so the leading # is not read as a comment)
SLACK_FINANCE_CHANNEL="#api-costs"

# Auto-optimization
AUTO_OPTIMIZE_ENABLED=true
AUTO_OPTIMIZE_MAX_PER_DAY=5

Monitoring

typescript

// Track cost metrics
metrics.gauge('costs.daily_total_usd', dailyTotal);
metrics.gauge('costs.budget_usage_percent', usagePercent);
metrics.counter('costs.alerts_triggered');
metrics.counter('costs.optimizations_applied');

Cost Optimization Strategies

Strategy 1: Intelligent Caching

When: Repeated identical requests
Implementation:

typescript

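import { createCache } from 'cache-manager';

// Assumes cache-manager v6+, where createCache() is synchronous, TTLs are in
// milliseconds, and the default store is in-memory. To share the cache across
// instances, pass a Keyv-compatible Redis store through the `stores` option.
const cache = createCache({
  ttl: 300_000 // 5 minutes
});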

app.get('/api/chat/history', async (req, res) => {
  const cacheKey = `history:${req.userId}`;
  const cached = await cache.get(cacheKey);
  
  if (cached) {
    return res.json(cached); // No API cost!
  }
  
  const result = await fetchHistory(req.userId);
  await cache.set(cacheKey, result);
  res.json(result);
});

Savings: 90-98% on cacheable endpoints
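
The saving from caching is roughly the cache hit rate times what the endpoint would otherwise cost, so the 90-98% figure corresponds to hit rates in that range. The sketch below works through that arithmetic for hypothetical traffic numbers; estimateCacheSavings is an illustrative helper.

```typescript
// Hypothetical helper: monthly spend with and without a cache in front of a route.
function estimateCacheSavings(
  requestsPerDay: number,
  costPerRequest: number, // USD charged by the upstream API on a cache miss
  hitRate: number,        // fraction of requests served from cache, 0..1
  days = 30
) {
  if (hitRate < 0 || hitRate > 1) throw new RangeError('hitRate must be between 0 and 1');
  const uncached = requestsPerDay * costPerRequest * days;
  const cached = uncached * (1 - hitRate);
  return { uncached, cached, saved: uncached - cached };
}

// 10,000 requests/day at $0.002 each with a 90% hit rate
const { uncached, cached, saved } = estimateCacheSavings(10_000, 0.002, 0.9);
console.log(uncached.toFixed(2), cached.toFixed(2), saved.toFixed(2)); // 600.00 60.00 540.00
```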

Strategy 2: Request Batching

When: Multiple small requests
Implementation:

typescript

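import type { Request, Response } from 'express';
import OpenAI from 'openai';

const openai = new OpenAI(); // Reads OPENAI_API_KEY from the environment
const MAX_BATCH_SIZE = 10;
const MAX_WAIT_MS = 100;
const pendingRequests: Array<{ req: Request; res: Response }> = [];
let flushTimer: ReturnType<typeof setTimeout> | null = null;

app.post('/api/embeddings', (req, res) => {
  pendingRequests.push({ req, res });
  
  if (pendingRequests.length >= MAX_BATCH_SIZE) {
    void processBatch(); // Batch is full: flush now
  } else if (!flushTimer) {
    // Otherwise flush after a short wait so a partial batch never hangs
    flushTimer = setTimeout(() => void processBatch(), MAX_WAIT_MS);
  }
});

async function processBatch() {
  if (flushTimer) {
    clearTimeout(flushTimer);
    flushTimer = null;
  }
  const batch = pendingRequests.splice(0, MAX_BATCH_SIZE);
  if (batch.length === 0) return;
  
  try {
    // Single API call for the whole batch
    const embeddings = await openai.embeddings.create({
      input: batch.map(b => b.req.body.text),
      model: 'text-embedding-ada-002'
    });
    
    batch.forEach((b, i) => {
      b.res.json({ embedding: embeddings.data[i].embedding });
    });
  } catch (error) {
    // Fail every waiting request rather than leaving connections open
    batch.forEach(b => b.res.status(502).json({ error: 'Embedding request failed' }));
  }
}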

Savings: 50-70% on batchable requests
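
Under the cost model from Step 1, batching spreads the per-request baseCost across the batch while the token charge stays the same, so the saving depends on how much of each request's cost is the flat fee. The sketch below works this out with the /api/embeddings figures. The 400-token request size is an assumed value, and costPerItem is an illustrative helper.

```typescript
// Hypothetical helper: effective cost of one item when `batchSize` items share a request.
function costPerItem(baseCost: number, costPerToken: number, tokens: number, batchSize: number): number {
  if (batchSize < 1) throw new RangeError('batchSize must be at least 1');
  return baseCost / batchSize + tokens * costPerToken;
}

const single = costPerItem(0.0001, 0.0000001, 400, 1);   // 0.0001 + 0.00004
const batched = costPerItem(0.0001, 0.0000001, 400, 10); // 0.00001 + 0.00004
const savingPercent = (1 - batched / single) * 100;
console.log(savingPercent.toFixed(1)); // 64.3
```

Longer inputs shrink the saving, because the token charge that batching cannot reduce makes up more of the total.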

Strategy 3: Model Selection

When: Using expensive models unnecessarily
Implementation:

typescript

function selectModel(complexity: number) {
  if (complexity < 0.3) return 'gpt-3.5-turbo'; // $0.001
  if (complexity < 0.7) return 'gpt-4-turbo';   // $0.005
  return 'gpt-4';                                // $0.015
}

const model = selectModel(calculateComplexity(prompt));

Savings: 40-60% on appropriate model selection
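
The snippet above calls calculateComplexity without defining it. One rough possibility is a heuristic that scores prompts by length and by the presence of reasoning-heavy keywords. The weights and keyword list below are invented for illustration and would need tuning against real traffic. selectModel is repeated so that the example runs on its own.

```typescript
// Illustrative keyword list; not derived from any benchmark.
const REASONING_HINTS = ['analyze', 'prove', 'step by step', 'compare', 'refactor', 'debug'];

// Hypothetical heuristic: returns a score in [0, 1].
function calculateComplexity(prompt: string): number {
  const words = prompt.trim().split(/\s+/).filter(Boolean).length;
  const lengthScore = Math.min(words / 500, 1); // saturates at 500 words
  const lower = prompt.toLowerCase();
  const hits = REASONING_HINTS.filter(h => lower.includes(h)).length;
  const keywordScore = Math.min(hits / 3, 1);   // saturates at 3 hints
  return 0.5 * lengthScore + 0.5 * keywordScore;
}

// Same thresholds as selectModel above
function selectModel(complexity: number): string {
  if (complexity < 0.3) return 'gpt-3.5-turbo';
  if (complexity < 0.7) return 'gpt-4-turbo';
  return 'gpt-4';
}

console.log(selectModel(calculateComplexity('Translate "hello" to French')));
// gpt-3.5-turbo
console.log(selectModel(calculateComplexity('Analyze and compare these two designs step by step')));
// gpt-4-turbo
```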

Troubleshooting

Issue: Costs not tracking

Solutions:

Verify middleware order (must be before routes)
Check route paths match exactly
Ensure response uses .json() method
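
A common cause of path mismatches is that req.route.path holds only the pattern registered on the router, without the mount prefix in req.baseUrl, and that req.path can carry a trailing slash. The sketch below builds a normalized lookup key from those values, passed in as plain strings so it runs on its own. resolveCostKey is a hypothetical helper, not part of FlexGate or Express.

```typescript
// Hypothetical helper: derive the API_COSTS lookup key from values Express exposes
// on the request (req.baseUrl, req.route?.path, and the full request path).
function resolveCostKey(baseUrl: string, routePath: string | undefined, requestPath: string): string {
  const raw = routePath !== undefined ? baseUrl + routePath : requestPath;
  const withoutQuery = raw.split('?')[0];
  const trimmed = withoutQuery.replace(/\/+$/, ''); // drop trailing slashes
  return trimmed === '' ? '/' : trimmed;
}

// Router mounted at /api with a handler registered as /analytics
console.log(resolveCostKey('/api', '/analytics', '/analytics')); // /api/analytics
// No matched route yet: fall back to the request path, minus the trailing slash
console.log(resolveCostKey('', undefined, '/api/analytics/'));   // /api/analytics
```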

Issue: Budget alerts not firing

Solutions:

Check threshold values
Verify enough samples collected
Review AI event configuration

Issue: Optimizations not applying

Solutions:

Check optimization strategy detection
Verify permissions for changes
Review logs for errors

Success Metrics

Track these KPIs:

Daily/monthly cost vs budget
Cost per request trend
Optimization success rate
Savings from auto-optimizations

Expected Results (4 weeks):

30-50% cost reduction
90%+ anomaly detection
<1 hour detection time

Next Steps

Run for 2 weeks - Gather baseline data
Review Claude suggestions - Validate recommendations
Expand to Auto-Recovery - Next Playbook →

Playbook Version: 1.0.0
Last Updated: February 15, 2026
Tested With: FlexGate v2.0.0, Claude 3.5 Sonnet
