ai-feedback-middleware

Composable performance feedback for AI agents

🎯

Multi-axis polarity, not a single scalar

Every reaction is scored independently on four axes: detection, content, timing, and channel. No more collapsing "user disliked this" into one bit and losing the why. Each axis maps to its own remediation path: detection to trigger rules, content to prompts, timing to scheduling, and channel to delivery routing.
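As a rough illustration of scoring a reaction per axis, the sketch below maps each axis with a negative score to the remediation path named above. The type names, the `positive` / `negative` / `neutral` values, and `remediationPaths` are assumptions made for this example, not the library's actual API.

```typescript
// Hypothetical names throughout; the library's real types may differ.
type Axis = "detection" | "content" | "timing" | "channel";
type Polarity = "positive" | "negative" | "neutral";

// One polarity per axis instead of a single thumbs-up/down bit.
type AxisEvaluation = Record<Axis, Polarity>;

// Remediation path named in the text for each axis.
const REMEDIATION: Record<Axis, string> = {
  detection: "trigger rules",
  content: "prompts",
  timing: "scheduling",
  channel: "delivery routing",
};

// List the remediation paths that a reaction's negative axes point to.
function remediationPaths(evaluation: AxisEvaluation): string[] {
  return (Object.keys(REMEDIATION) as Axis[])
    .filter((axis) => evaluation[axis] === "negative")
    .map((axis) => REMEDIATION[axis]);
}

// Example: the content was fine, but it arrived at a bad time.
const badTiming: AxisEvaluation = {
  detection: "positive",
  content: "positive",
  timing: "negative",
  channel: "neutral",
};
const paths = remediationPaths(badTiming); // ["scheduling"]
```

A single "disliked" bit would not distinguish this case from a badly written draft; the per-axis record points at scheduling alone.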

📚

13 locked actions, past-tense

approved · manually_edited · rejected · regenerated · not_selected_from_list · mute_triggered · silently_accepted · silently_rejected_expired · internally_unobserved_externally_completed · manually_replaced · corrected · cancelled · superseded_by. Consumers can add their own actions, but the core vocabulary is locked so that analytics stay comparable across consumers.
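One way to picture "locked core, extensible by consumers" is a constant list plus a helper that refuses to redefine a core name. The helper `extendActions` and the extra action `forwarded_to_colleague` are hypothetical; the framework's real registry API (for example whatever `DEFAULT_ACTIONS` contains) may look different.

```typescript
// The 13 core actions listed above, as a readonly tuple.
const CORE_ACTIONS = [
  "approved",
  "manually_edited",
  "rejected",
  "regenerated",
  "not_selected_from_list",
  "mute_triggered",
  "silently_accepted",
  "silently_rejected_expired",
  "internally_unobserved_externally_completed",
  "manually_replaced",
  "corrected",
  "cancelled",
  "superseded_by",
] as const;

type CoreAction = (typeof CORE_ACTIONS)[number];

// Hypothetical helper: consumers may add actions but cannot redefine core ones.
function extendActions<T extends string>(
  extra: readonly T[],
): readonly (CoreAction | T)[] {
  const core = new Set<string>(CORE_ACTIONS);
  for (const action of extra) {
    if (core.has(action)) {
      throw new Error(`"${action}" is a locked core action`);
    }
  }
  return [...CORE_ACTIONS, ...extra];
}

// A consumer-specific action added on top of the locked core.
const actions = extendActions(["forwarded_to_colleague"]);
```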

⏰

Silent feedback is captured, not lost

Every governed artifact carries a required expiry deadline (expires_at in the capture call). When the deadline passes, the Lifecycle Worker emits silently_accepted or silently_rejected_expired, according to the artifact type's policy. No more "we never knew if the user agreed."
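The "deadline + policy → action" rule can be sketched as a pure function over tracked artifacts. Everything here (`TrackedArtifact`, `sweepExpired`, the policy map) is an assumed shape for illustration only; the actual Lifecycle Worker and its storage schema are not shown in this document.

```typescript
type SilentAction = "silently_accepted" | "silently_rejected_expired";

// Assumed row shape; only the fields this sketch needs.
interface TrackedArtifact {
  artifact_id: string;
  artifact_type: string;
  expires_at: string; // ISO 8601 timestamp
  state: "waiting" | "terminal";
}

interface EmittedAction {
  artifact_id: string;
  action: SilentAction;
}

// For every waiting artifact whose deadline has passed, emit the silent
// action that its artifact type's policy prescribes.
function sweepExpired(
  artifacts: readonly TrackedArtifact[],
  policyByType: ReadonlyMap<string, SilentAction>,
  now: Date,
): EmittedAction[] {
  const emitted: EmittedAction[] = [];
  for (const artifact of artifacts) {
    if (artifact.state !== "waiting") continue; // user already reacted
    if (Date.parse(artifact.expires_at) > now.getTime()) continue; // not due
    const action = policyByType.get(artifact.artifact_type);
    if (action === undefined) continue; // no policy registered in this sketch
    emitted.push({ artifact_id: artifact.artifact_id, action });
  }
  return emitted;
}

// Example: draft emails that expire unanswered count as rejected.
const policies = new Map<string, SilentAction>([
  ["draft_email", "silently_rejected_expired"],
]);
const due = sweepExpired(
  [
    { artifact_id: "a1", artifact_type: "draft_email", expires_at: "2024-01-01T00:00:00.000Z", state: "waiting" },
    { artifact_id: "a2", artifact_type: "draft_email", expires_at: "2024-01-03T00:00:00.000Z", state: "waiting" },
  ],
  policies,
  new Date("2024-01-02T00:00:00.000Z"),
);
```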

🧬

Inline Layer 4 actionability inference

Threshold rules over recent reactions promote each axis's polarity from continue_to_observe to actionable_positive or actionable_negative. Inference runs in the same transaction as the triggering reaction. Tombstones filter direction-symmetrically.
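A minimal sketch of one such threshold rule for a single axis follows. The `Signal` type, the `tombstoned` flag, and the specific rule (count at or above the threshold and strictly ahead of the opposite count) are assumptions. Dropping tombstoned reactions regardless of their direction is only one possible reading of "direction-symmetric" filtering.

```typescript
type Signal = "positive" | "negative"; // hypothetical per-reaction signal
type Actionability =
  | "continue_to_observe"
  | "actionable_positive"
  | "actionable_negative";

interface RecentReaction {
  reaction_id: string;
  signal: Signal;
  tombstoned: boolean; // true if a later tombstone references this reaction
}

// Promote an axis once enough live reactions agree in one direction.
function inferActionability(
  recent: readonly RecentReaction[],
  threshold: number,
): Actionability {
  // Tombstoned reactions are excluded whether they were positive or negative.
  const live = recent.filter((reaction) => !reaction.tombstoned);
  const positives = live.filter((r) => r.signal === "positive").length;
  const negatives = live.length - positives;

  if (negatives >= threshold && negatives > positives) return "actionable_negative";
  if (positives >= threshold && positives > negatives) return "actionable_positive";
  return "continue_to_observe";
}

// Three live negatives reach a threshold of 3.
const threeNegatives: RecentReaction[] = [
  { reaction_id: "r1", signal: "negative", tombstoned: false },
  { reaction_id: "r2", signal: "negative", tombstoned: false },
  { reaction_id: "r3", signal: "negative", tombstoned: false },
];
const verdict = inferActionability(threeNegatives, 3);
```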

🔌

Provider-agnostic adapters

Postgres for production, in-memory for tests, a Redis pub/sub bus, and an RxJS streams wrapper. The adapter conformance suite is the contract: an adapter you write that passes the suite plugs in like the bundled ones. A SQLite adapter is planned next.
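To illustrate the idea of a conformance suite as the contract, here is a deliberately tiny sketch: a check that any store factory must pass, and an array-backed store that passes it. The `EventStore` interface and the two checks are simplified assumptions. The real port is likely asynchronous and the real suite in `@ai-feedback-middleware/adapter-conformance` covers far more.

```typescript
interface StoredEvent {
  event_id: string;
}

// Simplified, synchronous stand-in for an event store port.
interface EventStore {
  append(event: StoredEvent): void;
  readAll(): readonly StoredEvent[];
}

// Run the same checks against any implementation; return the failures.
function checkEventStore(makeStore: () => EventStore): string[] {
  const failures: string[] = [];

  const store = makeStore();
  store.append({ event_id: "e1" });
  store.append({ event_id: "e2" });
  const ids = store.readAll().map((e) => e.event_id).join(",");
  if (ids !== "e1,e2") {
    failures.push("readAll must return events in append order");
  }

  if (makeStore().readAll().length !== 0) {
    failures.push("each store instance must start empty");
  }
  return failures;
}

// A custom adapter only has to satisfy the interface and pass the checks.
function makeArrayStore(): EventStore {
  const events: StoredEvent[] = [];
  return {
    append: (event) => {
      events.push(event);
    },
    readAll: () => [...events],
  };
}

const report = checkEventStore(makeArrayStore); // [] when conformant
```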

🛡️

Event-sourced + immutable

captured_artifacts and captured_evaluated_reactions are append-only. Tombstones (cancelled / corrected / superseded_by) reference earlier events; they never mutate. Full audit, perfect replay.
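The append-only plus tombstone idea can be sketched in memory as follows. `AppendOnlyLog`, `LogEvent`, and `target_event_id` are hypothetical names for this illustration and do not reflect the actual table schema.

```typescript
type LogEvent =
  | { event_id: string; kind: "reaction"; artifact_id: string; action: string }
  | {
      event_id: string;
      kind: "tombstone";
      action: "cancelled" | "corrected" | "superseded_by";
      target_event_id: string; // the earlier event this tombstone refers to
    };

type ReactionEvent = Extract<LogEvent, { kind: "reaction" }>;

class AppendOnlyLog {
  private readonly events: LogEvent[] = [];

  // The only write operation: there is no update or delete.
  append(event: LogEvent): void {
    this.events.push(Object.freeze({ ...event }));
  }

  // Full audit trail, including tombstoned events and the tombstones.
  audit(): readonly LogEvent[] {
    return [...this.events];
  }

  // Replay: derive the currently effective reactions from the whole history.
  effectiveReactions(): ReactionEvent[] {
    const tombstoned = new Set<string>();
    for (const e of this.events) {
      if (e.kind === "tombstone") tombstoned.add(e.target_event_id);
    }
    return this.events.filter(
      (e): e is ReactionEvent =>
        e.kind === "reaction" && !tombstoned.has(e.event_id),
    );
  }
}

const log = new AppendOnlyLog();
log.append({ event_id: "r1", kind: "reaction", artifact_id: "a1", action: "approved" });
log.append({ event_id: "r2", kind: "reaction", artifact_id: "a2", action: "rejected" });
log.append({ event_id: "t1", kind: "tombstone", action: "cancelled", target_event_id: "r1" });
const effective = log.effectiveReactions();
```

The cancelled reaction is still present in `audit()`; it is only excluded from the derived view.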

60 seconds

1. Install:

pnpm add @ai-feedback-middleware/core @ai-feedback-middleware/in-memory

2. Compose at startup:

import {
  createFeedback,
  DEFAULT_ACTIONS,
  rejectByDefault,
} from "@ai-feedback-middleware/core";
import {
  createInMemoryEventStore,
  createInMemoryProjectionStore,
  createInMemoryTrackedArtifactsStore,
} from "@ai-feedback-middleware/in-memory";

export const feedback = createFeedback({
  eventStore: createInMemoryEventStore(),
  projectionStore: createInMemoryProjectionStore(),
  trackedArtifacts: createInMemoryTrackedArtifactsStore(),
  actions: DEFAULT_ACTIONS,
  artifactTypes: [rejectByDefault("draft_email")],
});

3. Capture an artifact when the agent produces something reviewable:

const { artifact_id } = await feedback.captureArtifact({
  artifact_type: "draft_email",
  artifact_version: 1,
  producer: "secretary-agent",
  task_type: "outbound:warm_intro",
  payload: { recipient: "alice@example.com", body: "..." },
  expires_at: new Date(Date.now() + 24 * 60 * 60 * 1000).toISOString(),
});

4. Record a reaction when the user acts:

await feedback.recordReaction({
  artifact_id,
  action: "approved",
  payload: { actor_id: "babak" },
});

That's it. The framework writes the capture event, the reaction event with per-axis evaluation embedded, optionally runs Layer 4 inference rules in the same transaction, transitions the lifecycle row to its terminal state, and publishes per-axis topics so downstream subscribers (prompt tuning, labeling, alerting) can pick up what they care about.
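To make the publish and subscribe step concrete, the sketch below shows a tiny in-process bus with trailing-wildcard matching. `TinyBus`, `matches`, and the topic names `feedback.reaction.timing` and `feedback.artifact.captured` are assumptions for illustration. The framework's bus port and its real topic names may differ.

```typescript
type Handler = (topic: string, event: unknown) => void;

// "feedback.reaction.*" matches any topic that starts with "feedback.reaction."
function matches(pattern: string, topic: string): boolean {
  if (pattern.endsWith(".*")) {
    return topic.startsWith(pattern.slice(0, -1));
  }
  return pattern === topic;
}

class TinyBus {
  private readonly subscriptions: { pattern: string; handler: Handler }[] = [];

  subscribe(pattern: string, handler: Handler): void {
    this.subscriptions.push({ pattern, handler });
  }

  publish(topic: string, event: unknown): void {
    for (const { pattern, handler } of this.subscriptions) {
      if (matches(pattern, topic)) handler(topic, event);
    }
  }
}

// A downstream subscriber that only cares about reaction topics.
const bus = new TinyBus();
const seen: string[] = [];
bus.subscribe("feedback.reaction.*", (topic) => {
  seen.push(topic);
});

bus.publish("feedback.reaction.timing", { polarity: "negative" });
bus.publish("feedback.artifact.captured", { artifact_id: "a1" });
// seen is now ["feedback.reaction.timing"]
```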

How it relates to other tools

Tool	How `ai-feedback-middleware` relates
Argilla, Label Studio	Annotation queues for humans. Argilla's "Records pending annotation" map to our `tracked_artifacts` rows in `waiting` state. Use them as a queue UI on top of this framework.
Humanloop, LangSmith, Langfuse	LLM observability + offline eval. Their "Score" / "Feedback" rows map to our `captured_evaluated_reactions`. Subscribe to `feedback.reaction.*` and pipe events to whichever tool fits your stack.
Helicone, PromptLayer	Telemetry. Different concern — we trust and ignore them at the feedback boundary.
Temporal, Inngest	Workflow orchestration. Different concern — use them around the framework, not inside. The Lifecycle Worker is narrow: deadline + policy → action emission.

The framework is adjacent, not competitive. Typical adoption: this framework captures and evaluates; downstream subscribers ship events to your existing tools.

What you get

@ai-feedback-middleware/core — interfaces, classifier, ports, 13-action registry, schema upcasters

@ai-feedback-middleware/in-memory — single-process testing adapters

@ai-feedback-middleware/postgres — production adapters with 8 SQL migrations

@ai-feedback-middleware/redis-pubsub — at-most-once pub/sub bus

@ai-feedback-middleware/streams — RxJS wrapper for composable subscribers

@ai-feedback-middleware/reference — reference projections + capture adapters

@ai-feedback-middleware/adapter-conformance — write your own adapters with the same contract
