Reflect on Agent Execution

The Reflect stage runs after the Actuate stage and before the next Sense cycle. It is always optional: omit it entirely if your agent does not need to learn or track metrics.

In the agent builder, the Reflect stage offers two substages:

  • Save memory: Saves recent actions, outcomes, and patterns so they carry into the next cycle. Two config types back this substage: learn (extract insights) and update-state (persist a bounded memory).

  • Capture metrics: Records per-cycle performance statistics. The capture-metrics config type backs this substage.

Where does Reflect read from, and what does it change? Reflect reads the full cycle’s derivedData (analysis, plan, governance, evaluation, and learnings; see Where Reason reads and writes in agent state) along with environmentalData. Only the update-state substage changes memory across cycles: it appends action results to memory.recentActions, new learnings to memory.patterns, and governance violations to memory.constraints. Retention policies keep each of these arrays bounded.

When to Add Reflect

Add the Reflect stage when your agent needs to:

  • Extract patterns from cycle outcomes and improve future decisions

  • Incorporate human feedback into its reasoning over time

  • Keep a bounded history in memory (without limits, memory grows indefinitely)

  • Track performance metrics visible in the dashboard

Deterministic agents with fixed rules generally do not need Reflect.

Learn

The learn substage uses the LLM to analyze the cycle’s outcomes and extract insights for the agent’s memory. These insights are then available in the next cycle’s Reason stage.

reflect:
  substages:
    - type: learn
      name: pattern-learner
      prompt: |
        Review this cycle's outcome and extract learnings.

        Actions taken:
        {{derivedData.execution | json}}

        Evaluation results:
        {{derivedData.evaluation | json}}

        Human feedback received this cycle:
        {{memory.feedbackHistory | json}}

        Existing patterns:
        {{memory.patterns | json}}

        Analyze:
        1. Which actions succeeded? Which failed and why?
        2. Are there new patterns to remember?
        3. Should any existing patterns be updated?
        4. If operators rejected or modified decisions, what can we learn?

        Extract actionable insights to improve future decision-making.
      outputSchema:
        type: object
        properties:
          newPatterns:
            type: array
            items:
              type: object
              properties:
                description:
                  type: string
                type:
                  type: string
                  enum: [success, failure, correlation, operator-override, threshold-drift]
                confidence:
                  type: number
          reinforcedPatterns:
            type: array
            items:
              type: string
          deprecatedPatterns:
            type: array
            items:
              type: string
          insights:
            type: array
            items:
              type: string

The update-state substage controls how the output from learn is stored in memory. Stored patterns are then available as memory.patterns in the next cycle’s Reason prompts.
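As a rough sketch of the consuming side, a Reason prompt could read the stored patterns with the same template syntax used in the learn prompt. This is only a fragment: it assumes Reason substages take a prompt field like learn does, and the prompt wording is illustrative.

```yaml
# Fragment of a Reason-stage substage. The surrounding keys (type, name, and
# so on) depend on the Reason substage you use and are omitted here.
prompt: |
  Patterns learned in earlier cycles:
  {{memory.patterns | json}}

  Take these patterns into account when analyzing the current data.
```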

UpdateState

The update-state substage manages memory retention. Without retention limits, arrays such as memory.recentActions grow with every cycle, which increases the agent’s memory footprint and eventually makes prompts too long.

reflect:
  substages:
    - type: update-state
      name: memory-update
      config:
        retention:
          memory.recentActions:
            policy: max_items
            maxItems: 100 # Keep only the last 100 actions

          memory.patterns:
            policy: ttl
            ttlMs: 604800000 # Discard patterns older than 7 days

          memory.feedbackHistory:
            policy: max_items
            maxItems: 200 # Keep last 200 feedback exchanges

Retention Policies

Policy      Fields      What it does
max_items   maxItems    Keeps only the most recent N items in the array
ttl         ttlMs       Discards items older than the specified duration (milliseconds)
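The two policies can be mixed within one update-state substage. The sketch below bounds memory.constraints, which update-state also appends to, and shows how a ttlMs value is derived. It assumes memory.constraints accepts the same retention entries as the other arrays; both limits are illustrative, not recommended values.

```yaml
reflect:
  substages:
    - type: update-state
      name: memory-update
      config:
        retention:
          # Assumption: memory.constraints accepts the same retention entries
          # as the other memory arrays. The limit of 50 is illustrative.
          memory.constraints:
            policy: max_items
            maxItems: 50

          # ttlMs is in milliseconds: 1 day = 24 * 60 * 60 * 1000 = 86400000.
          # The one-day window is illustrative.
          memory.patterns:
            policy: ttl
            ttlMs: 86400000
```

By the same arithmetic, the 604800000 used in the example above is 7 * 86400000, or seven days.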

When a Long-Term Storage network tool is registered, update-state also writes memory to PostgreSQL, so memory survives agent restarts. See Persistent Memory Across Restarts at the end of this page.

CaptureMetrics

The capture-metrics substage records performance statistics for each cycle. These appear in the agent’s detail view in the dashboard.

reflect:
  substages:
    - type: capture-metrics
      name: cycle-metrics
      config:
        trackCycleDuration: true # Time from Sense start to Reflect end
        trackLLMTokens: true # Total LLM tokens used this cycle
        trackAnomalyCount: true # Number of anomalies detected
        trackAlertCount: true # Number of alerts published

Full Reflect Example

From the Quality Monitor with Feedback example:

reflect:
  substages:
    - type: learn
      name: quality-feedback-learner
      prompt: |
        Review this cycle's quality monitoring outcome and extract learnings.
        Pay special attention to human feedback received during this cycle.

        Actions taken:
        {{derivedData.execution | json}}

        Human feedback received this cycle:
        {{memory.feedbackHistory | json}}

        Existing quality patterns:
        {{memory.qualityPatterns | json}}

        Analyze whether operators approved or rejected actions, identify
        false positives, and extract new patterns for future cycles.
      outputSchema:
        type: object
        properties:
          newPatterns:
            type: array
            items:
              type: object
              properties:
                description:
                  type: string
                type:
                  type: string
                  enum:
                    [quality-rule, false-positive, threshold-drift, operator-override, correlation]
                confidence:
                  type: number
                source:
                  type: string
                  enum: [agent-observation, operator-feedback, outcome-evaluation]
          confidenceAdjustment:
            type: number
            minimum: -0.2
            maximum: 0.2

    - type: update-state
      name: memory-update
      config:
        retention:
          memory.recentActions:
            policy: max_items
            maxItems: 100
          memory.qualityPatterns:
            policy: ttl
            ttlMs: 604800000 # 7 days
          memory.feedbackHistory:
            policy: max_items
            maxItems: 200

    - type: capture-metrics
      name: metrics
      config:
        trackLLMTokens: true
        trackCycleDuration: true
        trackAnomalyCount: true

Memory Keys

Standard memory keys available across all stage prompts:

Key                      Contents
memory.recentActions     Recent actions taken, managed by update-state
memory.patterns          Patterns extracted by learn substages
memory.feedbackHistory   Human feedback responses received via the agent bus

You can define your own memory keys (e.g., memory.qualityPatterns, memory.thresholdHistory). Any key referenced in a prompt or retention policy is created automatically on first write.
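As a sketch of a custom key in use, the fragment below reads memory.thresholdHistory in a learn prompt and bounds it in update-state. The key name comes from the example above; the substage names, prompt wording, output schema, and limit are illustrative assumptions. How values get written into a custom key is not covered in this section, so the fragment shows only the read side and the retention side.

```yaml
reflect:
  substages:
    - type: learn
      name: threshold-learner # hypothetical name
      prompt: |
        Review how thresholds have changed in earlier cycles.

        Threshold history:
        {{memory.thresholdHistory | json}}

        Evaluation results:
        {{derivedData.evaluation | json}}

        Identify any threshold drift worth remembering.
      outputSchema:
        type: object
        properties:
          insights:
            type: array
            items:
              type: string

    - type: update-state
      name: memory-update
      config:
        retention:
          memory.thresholdHistory:
            policy: max_items
            maxItems: 50 # illustrative limit
```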

Persistent Memory Across Restarts

By default, an agent’s memory exists only in the running process. If the agent restarts or you redeploy it, the agent begins fresh with no recollection of previous cycles, patterns, or feedback history. To make memory survive restarts, give the agent persistent long-term storage.

Long-term storage is backed by a PostgreSQL database that you provide and register once as a Long-Term Storage network tool. After you register it, the platform injects it automatically into every agent on that network. You do not need to change your agent definitions. See network tools for how to register one.

When long-term storage is available, the update-state substage writes memory.recentActions, memory.patterns, and memory.constraints through to PostgreSQL each cycle. After a restart, the agent loads its last-written memory and continues from where it left off, rather than starting over.

You can register only one Long-Term Storage tool per network, and all agents on the network share the same PostgreSQL instance. The platform creates the required database tables automatically on first use. You do not manage the schema manually. The platform keeps each agent’s data separate, so agents on the same network do not see one another’s memory.

If the database becomes unreachable, agents keep running. They log a warning and operate with in-memory state for that session. The agents resume writing automatically when the database is available again. A database outage costs you persistence for its duration, not agent availability.