# State & Context Management
## The Challenge
As a conversation grows, the context sent with each request grows with it:

- More messages mean more tokens per request
- Eventually the history hits the model's context window limit (illustrated by the sketch below)
- We need strategies to keep the history within bounds
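As a rough illustration of the problem, the sketch below estimates how a growing history approaches a fixed window; `estimateTokens`, `CONTEXT_WINDOW`, and the 4-characters-per-token rate are all assumptions for this example, not real limits or APIs:

```typescript
const CONTEXT_WINDOW = 32_000 // hypothetical context window, in tokens

// Very crude estimate: roughly 4 characters per token for English text
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Returns true while the conversation still fits in the window
function fitsInContext(messages: string[]): boolean {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m), 0)
  return total <= CONTEXT_WINDOW
}
```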
## Message Structure
```typescript
interface Message {
  role: 'user' | 'model'
  content: string | Part[] // Part covers e.g. text, function calls, and function responses
}

// Shape of a tool result sent back to the model
interface ToolResponse {
  functionResponse: {
    name: string
    response: unknown
  }
}
```
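For illustration, a short history that mixes plain text with a tool result might look like the sketch below; the tool name, values, and the convention of returning the result in a `'user'` turn are assumptions for this example:

```typescript
const history: Message[] = [
  { role: 'user', content: 'What is the weather in Berlin?' },
  {
    // Hypothetical tool result fed back to the model as a functionResponse part
    role: 'user',
    content: [{ functionResponse: { name: 'get_weather', response: { tempC: 18 } } }],
  },
  { role: 'model', content: 'It is currently about 18 °C in Berlin.' },
]
```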
## Context Management Strategies

### 1. Sliding Window

Keep only the most recent N messages, plus the first one:
```typescript
function trimContext(messages: Message[], maxMessages: number): Message[] {
  if (messages.length <= maxMessages) {
    return messages
  }
  // Keep the first message (initial instructions/context) plus the most recent messages
  return [messages[0], ...messages.slice(-maxMessages + 1)]
}
```
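For example, trimming the history down to at most 8 messages keeps the first message and the 7 most recent ones:

```typescript
const trimmed = trimContext(history, 8)
// trimmed.length <= 8; everything between the first and the last 7 messages is dropped
```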
### 2. Summarization

Summarize older messages into a single compact message and keep the most recent turns verbatim:
```typescript
async function compressContext(
  messages: Message[],
  // Summarization client; this interface is an assumption for the sketch
  llm: { summarize(messages: Message[]): Promise<string> },
): Promise<Message[]> {
  // Everything except the last 5 messages gets folded into a summary
  const oldMessages = messages.slice(0, -5)
  const summary = await llm.summarize(oldMessages)
  return [
    { role: 'user', content: `Previous context: ${summary}` },
    ...messages.slice(-5),
  ]
}
```
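A minimal sketch of calling it, with a stand-in summarizer in place of a real model call:

```typescript
// Stand-in summarizer; a real one would call the model
const summarizer = {
  summarize: async (msgs: Message[]) => `The user discussed ${msgs.length} earlier messages.`,
}

const compressed = await compressContext(history, summarizer)
// compressed = [summary message, ...the 5 most recent messages]
```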
### 3. gemini-cli Approach

gemini-cli uses token counting to decide when to trigger compression:
```typescript
class ChatCompressionService {
  constructor(
    // Compress once the history exceeds this many tokens
    private threshold: number,
    // Token counting and summarization are delegated; implementations elided here
    private countTokens: (messages: Message[]) => Promise<number>,
    private summarizeOldMessages: (messages: Message[]) => Promise<Message[]>,
  ) {}

  async compress(messages: Message[]): Promise<Message[]> {
    const tokens = await this.countTokens(messages)
    if (tokens < this.threshold) {
      return messages
    }
    // Find a split point and summarize the early messages
    return this.summarizeOldMessages(messages)
  }
}
```
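As a rough sketch of how such a service might be wired up, where the token limit, the 70% fraction, and both helper implementations are assumptions for illustration rather than gemini-cli's actual configuration:

```typescript
const MODEL_TOKEN_LIMIT = 128_000 // hypothetical model context window
const COMPRESSION_FRACTION = 0.7  // assumed fraction of the window to allow before compressing

const service = new ChatCompressionService(
  MODEL_TOKEN_LIMIT * COMPRESSION_FRACTION,
  // Crude token estimate; a real implementation would call the model's token-counting API
  async (msgs) => Math.ceil(JSON.stringify(msgs).length / 4),
  // Replace older messages with a single summary message, keep the last 5 verbatim
  async (msgs) => [
    { role: 'user', content: `Summary of ${msgs.length - 5} earlier messages...` },
    ...msgs.slice(-5),
  ],
)

// Using the history array from the earlier example
const managed = await service.compress(history)
```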
## Summary

- Context grows as the conversation continues
- Use a sliding window or summarization to keep the history within the context window
- gemini-cli triggers compression based on token counts
## Next
Learn about preventing infinite loops: Loop Detection →