Reflect on Agent Execution

The Reflect stage runs after the Actuate stage and before the Sense stage of the next cycle. The stage is optional: if your agent does not need to learn from the outcomes of previous cycles, limit memory growth, or track metrics, you can omit it.

You configure the Reflect stage in stage 06 of the Agent Builder.
In the template YAML, you define the stage under stages.reflect.substages.

The Reflect stage offers three substages:

Agent Builder Substage	YAML Substage Type	Function
Learn	`learn`	Extracts insights from the outcomes of the cycle.
Save memory	`update-state`	Saves memory for the next cycle and applies retention limits.
Capture metrics	`capture-metrics`	Records per-cycle performance statistics.

For details about what the Reflect stage reads and writes, see Reflect Stage Inputs and Outputs.

When to Add Reflect

Add the Reflect stage when your agent needs to achieve one or more of the following goals:

Extract patterns from cycle outcomes and improve future decisions.
Incorporate human feedback into future reasoning.
Keep a limited history in memory. Without limits, memory grows indefinitely.
Track the performance metrics that appear in the agent detail view of the Control Plane. See Monitor Agent Health.

If your agent runs only fixed deterministic rules, you generally do not need the Reflect stage.

Learn

The learn substage uses the large language model (LLM) to analyze the outcomes of the cycle and extract insights for the agent’s memory. These insights are then available in the next cycle’s Reason stage.

reflect:
  substages:
    - type: learn
      name: pattern-learner
      prompt: |
        Review this cycle's outcome and extract learnings.

        Actions taken:
        {{derivedData.execution | json}}

        Evaluation results:
        {{derivedData.evaluation | json}}

        Human feedback received this cycle:
        {{memory.feedbackHistory | json}}

        Existing patterns:
        {{memory.patterns | json}}

        Analyze:
        1. Which actions succeeded? Which failed and why?
        2. Are there new patterns to remember?
        3. Should any existing patterns be updated?
        4. If operators rejected or modified decisions, what can we learn?

        Extract actionable insights to improve future decision-making.
      outputSchema:
        type: object
        properties:
          newPatterns:
            type: array
            items:
              type: object
              properties:
                description:
                  type: string
                type:
                  type: string
                  enum: [success, failure, correlation, operator-override, threshold-drift]
                confidence:
                  type: number
          reinforcedPatterns:
            type: array
            items:
              type: string
          deprecatedPatterns:
            type: array
            items:
              type: string
          insights:
            type: array
            items:
              type: string

The update-state substage stores the learn output in memory. The Reason prompts of the next cycle read the output as memory.patterns.
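
To make the outputSchema above concrete, the following is a minimal Python sketch of one output that fits it, plus a hand-written shape check. The sample values and the `check_learn_output` helper are hypothetical and are not part of the platform. The check covers only the constraints visible in the schema (the `type` enum, a numeric `confidence`, and string arrays), and how the platform itself validates the output is not described here.

```python
# Allowed values copied from the enum in the outputSchema above.
ALLOWED_TYPES = {"success", "failure", "correlation", "operator-override", "threshold-drift"}

# Hypothetical learn output. Every value here is invented for illustration.
sample_output = {
    "newPatterns": [
        {
            "description": "Operators reject restart actions during shift changes.",
            "type": "operator-override",
            "confidence": 0.7,
        }
    ],
    "reinforcedPatterns": ["Earlier pattern that this cycle confirmed."],
    "deprecatedPatterns": [],
    "insights": ["Ask for approval before restarting equipment."],
}


def check_learn_output(output: dict) -> list[str]:
    """Return the problems found; an empty list means the shape matches."""
    problems = []
    for index, pattern in enumerate(output.get("newPatterns", [])):
        # The schema marks no property as required, so only check what is present.
        if "type" in pattern and pattern["type"] not in ALLOWED_TYPES:
            problems.append(f"newPatterns[{index}].type is not an allowed value")
        if "confidence" in pattern:
            confidence = pattern["confidence"]
            # bool is a subclass of int in Python, but it is not a JSON number.
            if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
                problems.append(f"newPatterns[{index}].confidence is not a number")
    for key in ("reinforcedPatterns", "deprecatedPatterns", "insights"):
        if not all(isinstance(value, str) for value in output.get(key, [])):
            problems.append(f"{key} must contain only strings")
    return problems


print(check_learn_output(sample_output))  # []
```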

Update State

The update-state substage manages memory retention. Without retention limits, arrays such as memory.recentActions grow indefinitely across cycles. Oversized arrays waste resources and make LLM prompts too long.

reflect:
  substages:
    - type: update-state
      name: memory-update
      config:
        retention:
          memory.recentActions:
            policy: max_items
            maxItems: 100 # Keep only the last 100 actions

          memory.patterns:
            policy: ttl
            ttlMs: 604800000 # Discard patterns older than 7 days

          memory.feedbackHistory:
            policy: max_items
            maxItems: 200 # Keep last 200 feedback exchanges
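
Because ttlMs is expressed in milliseconds, the values are easy to mistype. The following Python sketch shows one way to derive them from a readable duration. The `to_ttl_ms` helper is hypothetical and is only a convenience for computing the number that you paste into the YAML.

```python
from datetime import timedelta


def to_ttl_ms(**duration) -> int:
    """Convert a duration such as days=7 into whole milliseconds."""
    return timedelta(**duration) // timedelta(milliseconds=1)


print(to_ttl_ms(days=7))    # 604800000, the value used for memory.patterns above
print(to_ttl_ms(hours=12))  # 43200000
```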

Retention Policies

Policy	Fields	Function
`max_items`	`maxItems`	Keeps only the newest items in the array, up to the configured `maxItems` value.
`ttl`	`ttlMs`	Discards items older than the specified duration in milliseconds.
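
The two policies can be pictured with the following Python sketch. It is an illustration of the described behaviour, not the platform's implementation. It assumes that new items are appended to the end of each array and that each item carries a hypothetical `timestampMs` field; the document does not state how the platform determines the age of an item.

```python
def apply_max_items(items: list, max_items: int) -> list:
    """Keep the newest max_items entries, assuming new items are appended last."""
    return items[-max_items:] if max_items > 0 else []


def apply_ttl(items: list[dict], ttl_ms: int, now_ms: int) -> list[dict]:
    """Keep entries whose (hypothetical) timestampMs is within ttl_ms of now_ms."""
    cutoff = now_ms - ttl_ms
    return [item for item in items if item["timestampMs"] >= cutoff]


actions = [f"action-{n}" for n in range(1, 6)]
print(apply_max_items(actions, 3))  # ['action-3', 'action-4', 'action-5']

patterns = [
    {"description": "old", "timestampMs": 1_000},
    {"description": "fresh", "timestampMs": 9_000},
]
print(apply_ttl(patterns, ttl_ms=5_000, now_ms=10_000))
# [{'description': 'fresh', 'timestampMs': 9000}]
```

With max_items the array has a hard upper bound regardless of how old the items are. With ttl the array size depends on how many items arrive within the time window.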

When you register a Long-Term Storage network tool, update-state also writes memory to PostgreSQL, and memory survives agent restarts. See Persistent Memory Across Restarts.

Capture Metrics

The capture-metrics substage records performance statistics for each cycle. The metrics appear in the agent detail view of the Control Plane. See Monitor Agent Health.

reflect:
  substages:
    - type: capture-metrics
      name: cycle-metrics
      config:
        trackCycleDuration: true # Time from Sense start to Reflect end
        trackLLMTokens: true # Total LLM tokens used this cycle
        trackAnomalyCount: true # Number of anomalies detected
        trackAlertCount: true # Number of alerts published

Full Reflect Example

The following example shows a complete Reflect stage with all three substages:

reflect:
  substages:
    - type: learn
      name: quality-feedback-learner
      prompt: |
        Review this cycle's quality monitoring outcome and extract learnings.
        Pay special attention to human feedback received during this cycle.

        Actions taken:
        {{derivedData.execution | json}}

        Human feedback received this cycle:
        {{memory.feedbackHistory | json}}

        Existing quality patterns:
        {{memory.qualityPatterns | json}}

        Analyze whether operators approved or rejected actions, identify
        false positives, and extract new patterns for future cycles.
      outputSchema:
        type: object
        properties:
          newPatterns:
            type: array
            items:
              type: object
              properties:
                description:
                  type: string
                type:
                  type: string
                  enum:
                    [quality-rule, false-positive, threshold-drift, operator-override, correlation]
                confidence:
                  type: number
                source:
                  type: string
                  enum: [agent-observation, operator-feedback, outcome-evaluation]
          confidenceAdjustment:
            type: number
            minimum: -0.2
            maximum: 0.2

    - type: update-state
      name: memory-update
      config:
        retention:
          memory.recentActions:
            policy: max_items
            maxItems: 100
          memory.qualityPatterns:
            policy: ttl
            ttlMs: 604800000 # 7 days
          memory.feedbackHistory:
            policy: max_items
            maxItems: 200

    - type: capture-metrics
      name: metrics
      config:
        trackLLMTokens: true
        trackCycleDuration: true
        trackAnomalyCount: true

Reflect Stage Inputs and Outputs

The Reflect stage reads the full derivedData of the cycle: analysis, plan, governance, execution, and evaluation (see Reason Stage Inputs and Outputs). The stage also reads environmentalData.

The update-state substage is the only substage that writes to memory. Unlike derivedData, which the platform resets every cycle, memory persists from one cycle to the next. The update-state substage appends action results to memory.recentActions, new learnings to memory.patterns, and governance violations to memory.constraints. The retention policies that you set in the update-state substage limit the size of each property. See Retention Policies.
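
The difference between derivedData and memory can be sketched as a small conceptual model in Python. The `AgentState` class and `run_cycle` function are hypothetical names chosen for this illustration; they only mirror the behaviour described above and do not represent the platform's internals.

```python
from dataclasses import dataclass, field


def _empty_memory() -> dict:
    return {"recentActions": [], "patterns": [], "constraints": []}


@dataclass
class AgentState:
    memory: dict = field(default_factory=_empty_memory)  # persists across cycles
    derived_data: dict = field(default_factory=dict)     # rebuilt every cycle


def run_cycle(state: AgentState, execution: list, learnings: list, violations: list) -> None:
    # derivedData starts fresh: nothing from the previous cycle is carried over.
    state.derived_data = {"execution": execution}
    # The update-state step appends the results of this cycle to memory.
    state.memory["recentActions"].extend(execution)
    state.memory["patterns"].extend(learnings)
    state.memory["constraints"].extend(violations)


state = AgentState()
run_cycle(state, ["open-valve"], ["pattern-a"], [])
run_cycle(state, ["close-valve"], [], ["rate-limit-exceeded"])

print(state.derived_data)             # {'execution': ['close-valve']}
print(state.memory["recentActions"])  # ['open-valve', 'close-valve']
```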

Memory Properties

All stage prompts can read the following standard memory properties:

Key	Contents
`memory.recentActions`	Recent actions taken, managed by `update-state`.
`memory.patterns`	Patterns that `learn` substages extract.
`memory.constraints`	Governance violations that `update-state` appends.
`memory.feedbackHistory`	Human feedback responses received through the agent bus.

You can define your own memory properties (for example, memory.qualityPatterns, memory.thresholdHistory). You can reference a new property in a prompt or retention policy before any value exists. The platform creates the property the first time a substage writes a value to it.

Persistent Memory Across Restarts

By default, the platform stores agent memory only in process memory. A restart or redeployment clears the memory. The agent begins fresh, with no record of previous cycles, patterns, or feedback history. To make memory survive restarts, give the agent persistent long-term storage.

To set up long-term storage, provide a PostgreSQL database and register the database as a Long-Term Storage network tool. For the registration steps, see Provide Tools for Agents in a Network. You only need to register the tool once per network. The platform then makes the storage available to every agent on the network, without changes to your agent definitions.

When long-term storage is available, the update-state substage writes memory.recentActions, memory.patterns, and memory.constraints to PostgreSQL each cycle. After a restart, the agent loads its last-written memory and continues from where it left off, rather than starting over.

You can register only one Long-Term Storage tool per network. All agents on the network share the same PostgreSQL instance. The platform creates the required database tables automatically on first use, so you do not create or change the tables manually. The platform keeps the data of each agent separate, so an agent cannot see the memory of any other agent on the same network.

If the database becomes unreachable, agents keep running. The agents log a warning and operate with in-memory state for that session. When the database is available again, the agents resume writing automatically.
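
The fallback behaviour described above follows a common pattern: attempt the write, and on failure log a warning and continue with the in-memory copy. The following Python sketch illustrates that pattern under stated assumptions. `StorageStub` and `persist_memory` are hypothetical names, and the stub stands in for the PostgreSQL-backed storage; the platform's actual error handling may differ.

```python
import copy
import logging

logger = logging.getLogger("agent.memory")


class StorageStub:
    """Hypothetical stand-in for the long-term storage backend."""

    def __init__(self) -> None:
        self.available = True
        self.saved = None

    def write(self, memory: dict) -> None:
        if not self.available:
            raise ConnectionError("database unreachable")
        self.saved = copy.deepcopy(memory)


def persist_memory(memory: dict, store: StorageStub) -> bool:
    """Try to persist memory; on failure, warn and keep running in memory."""
    try:
        store.write(memory)
        return True
    except ConnectionError as exc:
        logger.warning("Long-term storage unavailable, using in-memory state: %s", exc)
        return False


memory = {"recentActions": ["a1"]}
store = StorageStub()

store.available = False
print(persist_memory(memory, store))  # False: the cycle continues, nothing is persisted

memory["recentActions"].append("a2")
store.available = True
print(persist_memory(memory, store))  # True: writes resume with the current memory
print(store.saved)                    # {'recentActions': ['a1', 'a2']}
```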

Next Steps

Set a Trigger to Run an Agent Cycle: Choose when the agent runs each cycle.
Deploy an Agent: Run your configured agent.