
Task Classification

Task Classification uses ML to automatically categorize your AI calls by type and complexity, enabling intelligent model routing and optimization.

How It Works

Ladger analyzes your spans and classifies them based on:

  1. Token patterns (input/output counts)
  2. Latency characteristics
  3. Output structure (JSON vs freeform)
  4. Prompt patterns (system prompts, instructions)
  5. Error rates and retry frequency

Complexity Levels

| Level | Indicators | Typical Models | Cost Profile |
|--------|------------|----------------|--------------|
| Low | Under 500 tokens, under 1s latency, deterministic | GPT-3.5, Claude Haiku, Gemini Flash | $ |
| Medium | 500-2000 tokens, 1-5s latency, some reasoning | GPT-4o-mini, Claude Sonnet | $$ |
| High | Over 2000 tokens, over 5s latency, complex reasoning | GPT-4o, Claude Opus, o1 | $$$ |
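
The thresholds in the table can be read as a simple rule of thumb. The sketch below is not Ladger's ML classifier; it is a hypothetical heuristic that assumes "tokens" means input plus output tokens and that, when the token and latency signals disagree, the higher level wins. The names `SpanStats` and `estimateComplexity` are illustrative only.

```typescript
type Complexity = 'low' | 'medium' | 'high';

// Hypothetical shape: only the two signals the table gives thresholds for.
interface SpanStats {
  totalTokens: number; // assumption: input + output tokens
  latencyMs: number;   // end-to-end latency in milliseconds
}

const ORDER: Complexity[] = ['low', 'medium', 'high'];

// Bucket by token count using the table's thresholds.
function byTokens(totalTokens: number): Complexity {
  if (totalTokens < 500) return 'low';
  if (totalTokens <= 2000) return 'medium';
  return 'high';
}

// Bucket by latency using the table's thresholds (1s and 5s).
function byLatency(latencyMs: number): Complexity {
  if (latencyMs < 1000) return 'low';
  if (latencyMs <= 5000) return 'medium';
  return 'high';
}

// Assumption: when the two signals disagree, take the higher level.
function estimateComplexity(stats: SpanStats): Complexity {
  const t = ORDER.indexOf(byTokens(stats.totalTokens));
  const l = ORDER.indexOf(byLatency(stats.latencyMs));
  return ORDER[Math.max(t, l)];
}
```

A heuristic like this can serve as a sanity check on the dashboard's labels, but it ignores the other signals (output structure, prompt patterns, error rates) that the classifier also uses.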

Dashboard View

┌──────────────────────────────────────────────────────┐
│  TASK COMPLEXITY DISTRIBUTION                        │
├──────────────────────────────────────────────────────┤
│                                                      │
│  Low         ████████████████████████░░░░░░░░  58%   │
│  Complexity  91,234 requests                         │
│                                                      │
│  Medium      ██████████░░░░░░░░░░░░░░░░░░░░░░  28%   │
│  Complexity  44,012 requests                         │
│                                                      │
│  High        █████░░░░░░░░░░░░░░░░░░░░░░░░░░░  14%   │
│  Complexity  21,988 requests                         │
│                                                      │
└──────────────────────────────────────────────────────┘

Task Types

| Type | Description | Examples | Optimization Potential |
|------|-------------|----------|------------------------|
| Code Generation | Writing/modifying code | Function generation, bug fixes | Medium |
| Planning/Reasoning | Multi-step problem solving | Agent orchestration, analysis | Low |
| Q&A/Retrieval | Answering from context | RAG queries, FAQ responses | High |
| Summarization | Condensing information | Document summaries, chat history | High |
| Creative Writing | Open-ended generation | Marketing copy, content | Medium |
| Data Extraction | Structured output from text | JSON parsing, entity extraction | Very High |
| Classification | Categorizing inputs | Sentiment, intent detection | Very High |

Type Distribution

┌──────────────────────────────────────────────────────┐
│  TASK TYPE DISTRIBUTION                              │
├──────────────────────────────────────────────────────┤
│                                                      │
│  Q&A/Retrieval   ████████████████████  35%           │
│  Classification  ████████████░░░░░░░░  25%           │
│  Data Extraction ██████████░░░░░░░░░░  18%           │
│  Summarization   ██████░░░░░░░░░░░░░░  12%           │
│  Code Generation ████░░░░░░░░░░░░░░░░   7%           │
│  Other           ██░░░░░░░░░░░░░░░░░░   3%           │
│                                                      │
└──────────────────────────────────────────────────────┘

Classification Signals

Ladger uses multiple signals to classify tasks:

Token Analysis

// Low complexity indicator
{ inputTokens: 150, outputTokens: 20 } // Simple response
// High complexity indicator
{ inputTokens: 2500, outputTokens: 800 } // Complex reasoning

Output Structure

// Data Extraction (structured output)
span.setAttributes({
  'output.format': 'json',
  'output.schema': 'entity-extraction'
});

// Creative Writing (freeform output)
span.setAttributes({
  'output.format': 'text',
  'output.type': 'marketing-copy'
});

Prompt Patterns

The classifier analyzes system prompts for keywords:

| Pattern | Task Type |
|---------|-----------|
| "classify", "categorize" | Classification |
| "extract", "parse", "JSON" | Data Extraction |
| "summarize", "condense" | Summarization |
| "write code", "function" | Code Generation |
| "answer", "question" | Q&A/Retrieval |
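
To make the keyword table concrete, here is a minimal sketch of keyword matching over a system prompt. It is an illustration only and not how Ladger's classifier is implemented, since the classifier combines this signal with the others described above. The function name `guessTaskType` is hypothetical, and the `summarization` identifier is an assumption modeled on the other task-type identifiers in this page.

```typescript
type TaskType =
  | 'classification'
  | 'data_extraction'
  | 'summarization' // assumed identifier, by analogy with the others
  | 'code_generation'
  | 'qa_retrieval';

// Keywords taken from the table above, lowercased for matching.
const KEYWORDS: Array<[TaskType, string[]]> = [
  ['classification', ['classify', 'categorize']],
  ['data_extraction', ['extract', 'parse', 'json']],
  ['summarization', ['summarize', 'condense']],
  ['code_generation', ['write code', 'function']],
  ['qa_retrieval', ['answer', 'question']],
];

// Returns the task type with the most keyword hits, or undefined if none match.
// Ties are resolved in favor of the earlier entry in KEYWORDS.
function guessTaskType(systemPrompt: string): TaskType | undefined {
  const text = systemPrompt.toLowerCase();
  let best: TaskType | undefined;
  let bestHits = 0;
  for (const [type, words] of KEYWORDS) {
    const hits = words.filter((w) => text.includes(w)).length;
    if (hits > bestHits) {
      best = type;
      bestHits = hits;
    }
  }
  return best;
}
```

Plain substring matching is deliberately naive: a prompt that mentions "function" in passing would count toward Code Generation, which is one reason a keyword signal alone is not enough.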

Improving Classification

Help Ladger classify accurately with span attributes:

span.setAttributes({
  // Task type hints
  'task.type': 'classification',
  'task.complexity': 'low',

  // Input characteristics
  'prompt.has_system': true,
  'prompt.has_examples': false,

  // Output characteristics
  'output.format': 'json',
  'output.deterministic': true,
});

Model Matching

Based on classification, Ladger recommends appropriate models:

CURRENT vs RECOMMENDED MODEL MAPPING
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Task Type: Classification (Low Complexity)
├── Current:     GPT-4o            $0.015/req
├── Recommended: GPT-3.5-turbo     $0.001/req
└── Savings:     93% (~$2,100/month)

Task Type: Q&A/Retrieval (Medium Complexity)
├── Current:     Claude-3-Opus     $0.045/req
├── Recommended: Claude-3-Sonnet   $0.015/req
└── Savings:     67% (~$890/month)

Task Type: Code Generation (High Complexity)
├── Current:     GPT-4o            $0.025/req
├── Recommended: GPT-4o            $0.025/req (No change)
└── Savings:     0% (Quality critical)
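
The savings percentages in the mapping follow directly from the per-request prices. The sketch below reproduces that arithmetic; the function names are hypothetical, and the monthly request volume used in the usage note is a made-up figure chosen for illustration, not a number from Ladger.

```typescript
// Percentage saved per request when switching models, rounded to a whole percent.
function savingsPercent(currentCostPerReq: number, recommendedCostPerReq: number): number {
  if (currentCostPerReq <= 0) return 0;
  const fraction = (currentCostPerReq - recommendedCostPerReq) / currentCostPerReq;
  return Math.round(fraction * 100);
}

// Monthly savings in USD for a given (assumed) request volume.
function monthlySavings(
  currentCostPerReq: number,
  recommendedCostPerReq: number,
  requestsPerMonth: number,
): number {
  return (currentCostPerReq - recommendedCostPerReq) * requestsPerMonth;
}
```

For example, `savingsPercent(0.015, 0.001)` gives 93 and `savingsPercent(0.045, 0.015)` gives 67, matching the first two entries. A saving of about $2,100 per month at $0.014 per request would correspond to roughly 150,000 requests per month.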

Classification API

Query classifications programmatically:

curl -X GET "https://ladger.pages.dev/api/v1/classification/summary" \
  -H "Authorization: Bearer ladger_sk_live_..." \
  -d '{ "flowName": "customer-support" }'

Response:

{
  "taskKindDistribution": [
    { "kind": "qa_retrieval", "count": 45230, "percentage": 35 },
    { "kind": "classification", "count": 32150, "percentage": 25 },
    { "kind": "data_extraction", "count": 23180, "percentage": 18 }
  ],
  "complexityDistribution": [
    { "level": "low", "count": 91234, "percentage": 58 },
    { "level": "medium", "count": 44012, "percentage": 28 },
    { "level": "high", "count": 21988, "percentage": 14 }
  ],
  "optimizationOpportunities": [
    {
      "spanName": "classify-intent",
      "currentModel": "gpt-4o",
      "suggestedModel": "gpt-3.5-turbo",
      "estimatedSavings": 2100,
      "confidence": 0.97
    }
  ]
}
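
If you consume this response from TypeScript, it can help to type it and filter the opportunities before acting on them. The interfaces below are inferred from the sample response only, so treat the field types as assumptions rather than a published schema. The helper `confidentOpportunities` and its default threshold of 0.9 are hypothetical.

```typescript
// Shapes inferred from the sample response above (assumptions, not an official schema).
interface OptimizationOpportunity {
  spanName: string;
  currentModel: string;
  suggestedModel: string;
  estimatedSavings: number; // assumed to be USD per month
  confidence: number;       // 0..1
}

interface ClassificationSummary {
  taskKindDistribution: { kind: string; count: number; percentage: number }[];
  complexityDistribution: { level: string; count: number; percentage: number }[];
  optimizationOpportunities: OptimizationOpportunity[];
}

// Keep only opportunities at or above the confidence threshold,
// largest estimated savings first. The input summary is not mutated.
function confidentOpportunities(
  summary: ClassificationSummary,
  minConfidence = 0.9,
): OptimizationOpportunity[] {
  return summary.optimizationOpportunities
    .filter((o) => o.confidence >= minConfidence)
    .sort((a, b) => b.estimatedSavings - a.estimatedSavings);
}
```

With the sample response, the single `classify-intent` opportunity (confidence 0.97) passes the default threshold.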

Feedback Loop

Improve classification over time:

  1. Review classifications in the dashboard
  2. Correct misclassified spans
  3. The classifier updates based on your feedback

// Explicit classification override
span.setAttributes({
  'task.type': 'code_generation', // Override auto-classification
  'task.complexity': 'high',
  'classification.manual': true,
});

Use Cases

Smart Routing

Route requests to appropriate models based on classification:

// classifyRequest and callModel are assumed to be helpers defined
// elsewhere in your application; they are not shown here.
async function route(request: string) {
  // Quick classify
  const { complexity } = await classifyRequest(request);

  // Route to appropriate model
  const model =
    complexity === 'low' ? 'gpt-3.5-turbo'
    : complexity === 'medium' ? 'gpt-4o-mini'
    : 'gpt-4o';

  return tracer.trace('respond', async (span) => {
    // Record the routing decision on the span
    span.setAttributes({ 'routing.complexity': complexity });
    return await callModel(model, request);
  });
}

Cost Alerts

Alert when high-cost models are used for low-complexity tasks:

⚠️ Potential Over-Provisioning Detected
Flow: customer-support → classify-intent
- Task Complexity: Low (confidence: 94%)
- Current Model: gpt-4o ($0.015/request)
- Suggested Model: gpt-3.5-turbo ($0.001/request)
- Monthly Impact: ~$2,100 savings
[View Details] [Simulate Change] [Dismiss]
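
The condition behind an alert like this can be sketched as a simple predicate. This is a hypothetical illustration, not Ladger's alerting logic: the `SpanClassification` shape, the `PREMIUM_MODELS` list, and the 0.9 confidence threshold are all assumptions.

```typescript
// Hypothetical record describing one classified span.
interface SpanClassification {
  spanName: string;
  complexity: 'low' | 'medium' | 'high';
  confidence: number; // 0..1
  model: string;
}

// Assumption: which models count as "high-cost" is configured by you.
const PREMIUM_MODELS = new Set(['gpt-4o', 'claude-3-opus']);

// Flags a span when a premium model handles a task that was
// confidently classified as low complexity.
function isOverProvisioned(s: SpanClassification, minConfidence = 0.9): boolean {
  return (
    s.complexity === 'low' &&
    s.confidence >= minConfidence &&
    PREMIUM_MODELS.has(s.model)
  );
}
```

Requiring a minimum confidence keeps uncertain classifications from triggering alerts; in the example above, the 94% confidence on `classify-intent` would clear a 0.9 threshold.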

Next Steps