
State & Context Management

The Challenge

As a conversation grows, so does the context sent to the model:

  • Every new turn adds messages, and every message adds tokens
  • The history eventually exceeds the model's context window limit
  • The agent needs a strategy to keep the history within bounds

Message Structure

typescript
// A single turn in the conversation history
interface Message {
  role: 'user' | 'model'
  content: string | Part[]  // Part: multi-part content (text, function call, function response)
}

// Tool result format, sent back to the model after a tool runs
interface ToolResponse {
  functionResponse: {
    name: string        // name of the tool that was called
    response: unknown   // the tool's result payload
  }
}
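
For illustration, a short history mixing plain text turns with a tool result might look like this. The tool name and payload are made up for the example, and Part is assumed to allow functionResponse entries, as in the Gemini API:

typescript
const history: Message[] = [
  { role: 'user', content: 'List the files in the project root.' },
  { role: 'model', content: 'I will call the file listing tool.' },
  {
    role: 'user',
    content: [
      {
        functionResponse: {
          name: 'list_files',                        // hypothetical tool name
          response: { files: ['README.md', 'src'] }  // hypothetical payload
        }
      }
    ]
  }
]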

Context Management Strategies

1. Sliding Window

Keep only recent N messages:

typescript
function trimContext(messages: Message[], maxMessages: number): Message[] {
  if (messages.length <= maxMessages) {
    return messages
  }
  // Keep the first message (system / initial context) plus the most recent turns
  return [messages[0], ...messages.slice(-(maxMessages - 1))]
}
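
In an agent loop, trimming would typically run right before each model call. The window size below is an arbitrary example, and sendToModel stands in for whatever client the agent actually uses:

typescript
const MAX_MESSAGES = 20  // arbitrary example window

async function sendTurn(history: Message[], userInput: string) {
  history.push({ role: 'user', content: userInput })
  // Trim just before the request so the prompt stays bounded
  const prompt = trimContext(history, MAX_MESSAGES)
  const reply = await sendToModel(prompt)  // hypothetical model call
  history.push({ role: 'model', content: reply })
}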

2. Summarization

Summarize older messages:

typescript
// `llm` is assumed to be an injected client exposing a summarize() helper
async function compressContext(messages: Message[]) {
  // Nothing to compress when the history is already short
  if (messages.length <= 5) {
    return messages
  }

  const oldMessages = messages.slice(0, -5)
  const summary = await llm.summarize(oldMessages)

  // Replace the old turns with a single summary message, keep the last 5 verbatim
  return [
    { role: 'user', content: `Previous context: ${summary}` },
    ...messages.slice(-5)
  ]
}
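
The llm.summarize call above is deliberately abstract. One hypothetical shape for it, built on a generic text-generation client rather than any specific SDK, might be:

typescript
// Hypothetical client interface; not tied to a real SDK
interface TextGenerator {
  generate(prompt: string): Promise<string>
}

// Flatten the old turns into a transcript and ask the model for a compact summary
async function summarizeMessages(client: TextGenerator, messages: Message[]): Promise<string> {
  const transcript = messages
    .map(m => `${m.role}: ${typeof m.content === 'string' ? m.content : JSON.stringify(m.content)}`)
    .join('\n')
  return client.generate(
    `Summarize this conversation, preserving decisions, open tasks, and key facts:\n\n${transcript}`
  )
}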

3. gemini-cli Approach

Uses token counting to trigger compression:

typescript
class ChatCompressionService {
  // Compression only runs once the history exceeds this many tokens
  constructor(private readonly threshold: number) {}

  async compress(messages: Message[]) {
    const tokens = await this.countTokens(messages)

    // Below the threshold, leave the history untouched
    if (tokens < this.threshold) {
      return messages
    }

    // Find a split point and summarize the earlier messages
    return this.summarizeOldMessages(messages)
  }
}
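
As a sketch of what summarizeOldMessages might do, the example below keeps roughly the most recent third of the history verbatim and summarizes the rest. The fraction, the summary prompt, and the llm client are illustrative assumptions, not values taken from the gemini-cli source:

typescript
// Illustrative sketch: keep the newest ~1/3 of messages, summarize the rest
const KEEP_FRACTION = 1 / 3  // assumed fraction for the example

async function summarizeOldMessages(messages: Message[]): Promise<Message[]> {
  const keepCount = Math.max(1, Math.floor(messages.length * KEEP_FRACTION))
  const splitIndex = messages.length - keepCount

  const toSummarize = messages.slice(0, splitIndex)
  const toKeep = messages.slice(splitIndex)

  const summary = await llm.summarize(toSummarize)  // same assumed client as above

  return [
    { role: 'user', content: `Summary of earlier conversation: ${summary}` },
    ...toKeep
  ]
}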

Summary

  • Context grows as conversation continues
  • Use sliding window or summarization
  • gemini-cli uses token-based compression

Next

Learn about preventing infinite loops: Loop Detection →
