Best Event Sourcing Patterns for Web Apps
Traditional databases store current state: a user's email, an order's status, an account balance. Event sourcing flips this model—you store the sequence of events that led to current state, then reconstruct state by replaying events. This seems unnecessarily complex until you need audit trails showing exactly what happened and when, or the ability to reconstruct historical state for compliance, or time-travel debugging to understand how your system reached its current broken condition.
This guide covers the event sourcing patterns that work in production web applications. You'll learn when event sourcing solves real problems versus adding unnecessary complexity, how to structure events for replay performance, how to build read models that serve queries efficiently, and how to handle schema evolution when your event structure needs to change. These patterns come from systems processing millions of events in production.
We'll focus on practical implementation: event store design, snapshot strategies to avoid replaying millions of events, projection patterns for building query-optimized views, and the specific challenges that emerge when you need to modify event schemas or fix corrupted event streams.
Why Event Sourcing Exists and When It Matters
Most applications don't need event sourcing. Storing current state in a relational database is simpler, requires less infrastructure, and serves the majority of use cases perfectly well. Event sourcing introduces complexity: you need an event store, projections to build queryable views, and mental overhead to think in terms of events rather than state mutations.
Event sourcing matters when the history of changes is as important as current state. Financial systems need complete audit trails showing every transaction and balance change. E-commerce systems benefit from knowing the full order lifecycle: created, items added, payment attempted, payment succeeded, shipped, delivered. Healthcare systems are legally required to maintain records showing who accessed patient data and when.
The advantage: complete auditability without additional logging infrastructure. Every state change is an event in the event store, creating an append-only audit log automatically. You can reconstruct state at any point in time by replaying events up to that timestamp, enabling powerful debugging capabilities when you need to understand why a customer's order ended up in a broken state.
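As a rough illustration of point-in-time reconstruction, the sketch below replays an in-memory event list up to a cutoff timestamp. The account events (`Deposited`, `Withdrawn`) and the helper names are assumptions made for this example, not part of any particular event store.

```javascript
// Hypothetical in-memory events for one account. Field names follow the
// event shape used in this guide; the event types are made up for the sketch.
const events = [
  { event_type: 'Deposited', payload: { amount: 100 }, occurred_at: '2024-03-01T10:00:00Z' },
  { event_type: 'Withdrawn', payload: { amount: 30 }, occurred_at: '2024-03-05T09:00:00Z' },
  { event_type: 'Deposited', payload: { amount: 50 }, occurred_at: '2024-03-10T16:00:00Z' },
];

// Pure reducer: fold one event into the running balance.
function applyBalanceEvent(balance, event) {
  switch (event.event_type) {
    case 'Deposited': return balance + event.payload.amount;
    case 'Withdrawn': return balance - event.payload.amount;
    default: return balance; // ignore event types this view does not care about
  }
}

// Rebuild state as of a point in time by replaying only the events that
// occurred at or before `asOf`. Assumes `events` is already in order.
function balanceAt(events, asOf) {
  const cutoff = new Date(asOf).getTime();
  return events
    .filter((e) => new Date(e.occurred_at).getTime() <= cutoff)
    .reduce(applyBalanceEvent, 0);
}

console.log(balanceAt(events, '2024-03-06T00:00:00Z')); // 70
console.log(balanceAt(events, '2024-03-31T00:00:00Z')); // 120
```

The same idea applies to any aggregate: swap in a different reducer and initial state, and the cutoff filter stays the same.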
The Core Tradeoff: Write Simplicity vs Read Complexity
Event sourcing makes writes simple and reads complex. Writing is appending an event to the log—no locks, no transactions across multiple tables, just append. Reading requires replaying events to reconstruct state, which is expensive. To make reads fast, you build projections: materialized views that represent current state derived from events.
This inverts the traditional database model. In a relational database, writes are complex (updating multiple tables, maintaining consistency) and reads are simple (SELECT from current state). Event sourcing trades write complexity for read complexity, which only makes sense if your write patterns benefit from this trade.
High-throughput write scenarios benefit from event sourcing because an append touches one new row, while an UPDATE must lock and rewrite existing rows and often coordinate a multi-table transaction. Rates of 10,000+ writes per second on modest hardware are achievable in append-heavy designs, though the actual figure depends on the store, payload size, and durability settings. Appends to different aggregates do not contend with each other.
Event Store Design and Implementation
The event store is the foundation of event sourcing: an append-only log of events organized by aggregate (the entity that events belong to). Each event has an aggregate ID, event type, timestamp, sequence number, and payload containing the event data.
Event Store Schema
A basic event store schema tracks aggregate ID, event sequence, event type, and event payload. The sequence number ensures events can be replayed in order, and the aggregate ID groups all events for a single entity.
CREATE TABLE events (
id BIGSERIAL PRIMARY KEY,
aggregate_id UUID NOT NULL,
aggregate_type VARCHAR(100) NOT NULL,
event_type VARCHAR(100) NOT NULL,
event_version INTEGER NOT NULL DEFAULT 1,
payload JSONB NOT NULL,
metadata JSONB,
sequence_number BIGINT NOT NULL,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT unique_aggregate_sequence
UNIQUE (aggregate_id, sequence_number)
);
-- Per-aggregate replay is served by the index that backs the
-- unique_aggregate_sequence constraint, so no separate index is needed.
-- Additional indexes for type- and time-based scans:
CREATE INDEX idx_events_type_time
ON events(aggregate_type, occurred_at);
CREATE INDEX idx_events_occurred_at
ON events(occurred_at);
The unique constraint on aggregate_id and sequence_number prevents duplicate events and ensures strict ordering. The JSONB payload stores event-specific data flexibly, allowing different event types to have different structures while maintaining a consistent event log schema.
Event Payload Structure
Event payloads should capture the delta—what changed—not full entity state. This keeps events small and makes them precise records of state transitions.
// Good: event captures the change
{
"event_type": "OrderShipped",
"aggregate_id": "order_123",
"payload": {
"tracking_number": "1Z999AA10123456784",
"carrier": "UPS",
"shipped_at": "2024-03-15T10:30:00Z",
"shipped_by_user_id": "user_456"
}
}
// Bad: event captures full entity state
{
"event_type": "OrderUpdated",
"aggregate_id": "order_123",
"payload": {
"id": "order_123",
"customer_id": "customer_789",
"items": [...], // dozens of items
"status": "shipped",
"tracking_number": "1Z999AA10123456784",
// ... entire order object
}
}
Delta events are smaller, easier to understand, and compose better. You can replay them to reconstruct state without carrying redundant information. Full-state events bloat your event store and make events harder to interpret.
Optimistic Concurrency with Sequence Numbers
Sequence numbers enable optimistic concurrency control. When saving an event, you specify the expected sequence number (the last sequence number you read). If another process has written events since you read, the sequence number check fails and you retry.
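// Assumes `db` is a node-postgres style client (parameterized query(),
// result.rows, and error.code / error.constraint on failures).
async function appendEvent(aggregateId, eventType, payload, expectedSequence) {
  try {
    // The new event must take the slot right after the last one we read.
    const result = await db.query(`
      INSERT INTO events (
        aggregate_id,
        aggregate_type,
        event_type,
        payload,
        sequence_number
      )
      VALUES ($1, $2, $3, $4, $5)
      RETURNING id, sequence_number
    `, [aggregateId, 'Order', eventType, payload, expectedSequence + 1]);
    // Return the stored position so callers can wait on projections.
    return { success: true, event: result.rows[0] };
  } catch (error) {
    // 23505 is PostgreSQL's unique_violation error code
    if (error.code === '23505' && error.constraint === 'unique_aggregate_sequence') {
      // Concurrency conflict: another process wrote an event
      return { success: false, reason: 'concurrency_conflict' };
    }
    throw error;
  }
}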
This prevents lost updates without requiring database locks. Two processes can read the same aggregate state and attempt to write events, but only one succeeds. The other detects the conflict, reloads events to get current state, and retries its operation.
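The reload-and-retry loop described above might look like the following sketch. It uses a synchronous in-memory store in place of the database so the conflict can be reproduced deterministically; `tryAppend`, `executeWithRetry`, the `decide` callback, and the `OrderNoteAdded` event are hypothetical names for this example.

```javascript
// In-memory stand-in for the events table, keyed by aggregate ID.
// A real implementation would rely on the database's unique constraint.
const streams = new Map();

function loadEvents(aggregateId) {
  return streams.get(aggregateId) ?? [];
}

// Append succeeds only if nobody has written since `expectedSequence`.
function tryAppend(aggregateId, event, expectedSequence) {
  const stream = loadEvents(aggregateId);
  if (stream.length !== expectedSequence) {
    return { success: false, reason: 'concurrency_conflict' };
  }
  const stored = { ...event, sequence_number: expectedSequence + 1 };
  streams.set(aggregateId, [...stream, stored]);
  return { success: true, sequence_number: stored.sequence_number };
}

// Load, decide, append. On conflict, reload and decide again with fresh state.
// `decide` turns the current history into the next event to write.
function executeWithRetry(aggregateId, decide, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const history = loadEvents(aggregateId);
    const newEvent = decide(history);
    const result = tryAppend(aggregateId, newEvent, history.length);
    if (result.success) return { ...result, attempts: attempt };
  }
  throw new Error(`Gave up after ${maxAttempts} concurrency conflicts`);
}

// Usage: simulate a competing writer that lands an event after we loaded state.
let interfered = false;
const result = executeWithRetry('order_123', (history) => {
  if (!interfered) {
    interfered = true;
    tryAppend('order_123', { event_type: 'OrderCreated' }, history.length);
  }
  return { event_type: 'OrderNoteAdded', payload: { seen: history.length } };
});

console.log(result); // { success: true, sequence_number: 2, attempts: 2 }
```

Capping the number of attempts matters: an aggregate under constant contention would otherwise retry forever, and repeated conflicts are usually a sign that the aggregate boundary is too coarse.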
Building and Maintaining Projections
Projections are derived views built from events. They transform the event log into queryable formats optimized for specific read patterns. Without projections, you'd need to replay events for every query, which is prohibitively expensive.
Projection Patterns
A projection is a process that reads events and updates a read model (a database table, document, or cache). Each event type has a handler that updates the appropriate read model.
// Projection: OrderSummary read model
class OrderSummaryProjection {
async handleOrderCreated(event) {
await db.orders.insert({
order_id: event.aggregate_id,
customer_id: event.payload.customer_id,
status: 'created',
total: event.payload.total,
created_at: event.occurred_at
});
}
async handleOrderShipped(event) {
await db.orders.update(
{ order_id: event.aggregate_id },
{
status: 'shipped',
tracking_number: event.payload.tracking_number,
shipped_at: event.occurred_at
}
);
}
async handleOrderDelivered(event) {
await db.orders.update(
{ order_id: event.aggregate_id },
{
status: 'delivered',
delivered_at: event.occurred_at
}
);
}
}
This projection maintains a denormalized order summary table optimized for list queries (show all orders for a customer, show recent orders). The table structure matches query needs, not event structure.
Projection Rebuild Strategy
Projections can be rebuilt from the event log at any time. This is critical when you need to add a new projection, fix a bug in projection logic, or recover from corruption. Rebuild means replaying all events through the projection handlers.
async function rebuildProjection(projection, fromPosition = 0) {
  // Only clear the read model on a full rebuild; a resumed rebuild keeps
  // the rows written before the last checkpoint. (Or write to a new table.)
  if (fromPosition === 0) {
    await db.orders.truncate();
  }
  // Replay events in store-wide order. `id` is the global position;
  // sequence_number is only ordered within a single aggregate.
  const eventStream = db.events
    .where('id', '>', fromPosition)
    .orderBy('id', 'asc')
    .stream();
  for await (const event of eventStream) {
    const handler = projection[`handle${event.event_type}`];
    if (handler) {
      await handler.call(projection, event);
    }
    // Checkpoint progress for resumability
    if (event.id % 10000 === 0) {
      await saveCheckpoint(projection.name, event.id);
    }
  }
}
Rebuilding projections can take hours for large event stores. Checkpointing progress allows you to resume if the rebuild process crashes. Running rebuilds against a new table instead of truncating the existing table enables zero-downtime projection updates.
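One way to do the final cutover, sketched under the assumption that the read model lives in PostgreSQL and the rebuild wrote to a separate table: rename both tables inside a single transaction so queries switch over atomically. The function name and table names are hypothetical, and `client` is assumed to expose an async `query(sql)` method as node-postgres clients do.

```javascript
// Swap a freshly rebuilt projection table into place.
// Table names are interpolated into SQL, so they must be trusted identifiers,
// never user input.
async function swapProjectionTable(client, liveTable, rebuiltTable) {
  const retiredTable = `${liveTable}_old`;
  await client.query('BEGIN');
  try {
    // PostgreSQL DDL is transactional: readers see either the old or the
    // new table under the live name, never a missing one.
    await client.query(`ALTER TABLE ${liveTable} RENAME TO ${retiredTable}`);
    await client.query(`ALTER TABLE ${rebuiltTable} RENAME TO ${liveTable}`);
    await client.query('COMMIT');
  } catch (error) {
    await client.query('ROLLBACK');
    throw error;
  }
  return retiredTable; // the caller can drop this once nothing reads from it
}
```

Before swapping, the rebuilt table needs to have caught up with events written while the rebuild was running; otherwise the new read model starts out behind the old one.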
Multiple Projections for Different Query Patterns
Different queries need different data structures. Build multiple projections from the same event stream, each optimized for specific queries.
| Projection | Query Pattern | Data Structure |
|---|---|---|
| OrderSummary | List all orders, filter by status | Relational table with indexes |
| CustomerOrderHistory | Show customer's order timeline | Time-series data, partitioned by customer |
| OrderSearchIndex | Full-text search across orders | Elasticsearch document |
| RevenueAnalytics | Aggregate revenue by time period | Pre-calculated aggregates |
Each projection serves a different purpose. The same event stream feeds all projections, but each projection structures data differently to optimize for its query pattern. This is the power of event sourcing: your write model (events) is decoupled from your read models (projections).
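To make the fan-out concrete, here is a small in-memory sketch in which one event stream feeds two read models with different shapes. The `Map`-based read models and the `dispatch` function are assumptions for illustration; in a real system each projection would write to its own table or index.

```javascript
// Two hypothetical in-memory read models fed by the same events.
const orderSummary = new Map(); // order_id -> { status, total }
const revenueByDay = new Map(); // 'YYYY-MM-DD' -> total revenue

const projections = [
  {
    name: 'OrderSummary',
    handlers: {
      OrderCreated: (e) =>
        orderSummary.set(e.aggregate_id, { status: 'created', total: e.payload.total }),
      OrderShipped: (e) => {
        orderSummary.get(e.aggregate_id).status = 'shipped';
      },
    },
  },
  {
    name: 'RevenueAnalytics',
    handlers: {
      // This projection ignores shipping events entirely.
      OrderCreated: (e) => {
        const day = e.occurred_at.slice(0, 10);
        revenueByDay.set(day, (revenueByDay.get(day) ?? 0) + e.payload.total);
      },
    },
  },
];

// Offer every event to every projection; each one picks what it needs.
function dispatch(event) {
  for (const projection of projections) {
    const handler = projection.handlers[event.event_type];
    if (handler) handler(event);
  }
}

const events = [
  { event_type: 'OrderCreated', aggregate_id: 'order_1', payload: { total: 100 }, occurred_at: '2024-03-01T10:00:00Z' },
  { event_type: 'OrderCreated', aggregate_id: 'order_2', payload: { total: 50 }, occurred_at: '2024-03-01T18:30:00Z' },
  { event_type: 'OrderShipped', aggregate_id: 'order_1', payload: {}, occurred_at: '2024-03-02T09:00:00Z' },
];
events.forEach((event) => dispatch(event));

console.log(orderSummary.get('order_1'));    // { status: 'shipped', total: 100 }
console.log(revenueByDay.get('2024-03-01')); // 150
```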
Snapshot Strategy to Avoid Full Replay
Replaying thousands of events to reconstruct aggregate state is expensive. Snapshots solve this: periodically save aggregate state, then replay only events since the last snapshot. This transforms O(n) event replay into O(k) where k is events since last snapshot.
Snapshot Storage
Snapshots store aggregate state at a specific sequence number. Loading an aggregate means loading the latest snapshot and replaying events since that snapshot.
CREATE TABLE snapshots (
id BIGSERIAL PRIMARY KEY,
aggregate_id UUID NOT NULL,
aggregate_type VARCHAR(100) NOT NULL,
sequence_number BIGINT NOT NULL,
state JSONB NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
CONSTRAINT unique_aggregate_snapshot
UNIQUE (aggregate_id, sequence_number)
);
CREATE INDEX idx_snapshots_aggregate
ON snapshots(aggregate_id, sequence_number DESC);
Snapshot Creation Strategy
Take snapshots periodically based on event count. After every 100 or 1000 events, save a snapshot. This balances snapshot storage cost against replay performance.
async function loadAggregate(aggregateId) {
// Load most recent snapshot
const snapshot = await db.snapshots
.where({ aggregate_id: aggregateId })
.orderBy('sequence_number', 'desc')
.first();
let state = snapshot ? snapshot.state : getInitialState();
let fromSequence = snapshot ? snapshot.sequence_number : 0;
// Replay events since snapshot
const events = await db.events
.where({ aggregate_id: aggregateId })
.where('sequence_number', '>', fromSequence)
.orderBy('sequence_number', 'asc')
.all();
for (const event of events) {
state = applyEvent(state, event);
}
// Maybe save a new snapshot
if (events.length > 100) {
await saveSnapshot(aggregateId, state, events[events.length - 1].sequence_number);
}
return { state, sequence: fromSequence + events.length };
}
This pattern loads the latest snapshot, replays events since the snapshot, and saves a new snapshot if many events were replayed. Over time, this keeps replay count bounded while ensuring snapshots stay reasonably current.
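The effect of a snapshot on replay cost can be seen in a small in-memory sketch. The counter aggregate, the `loadFrom` helper, and the `replayed` field are assumptions made for this example; they only demonstrate that a snapshot shortens the replay without changing the resulting state.

```javascript
// Reducer for a hypothetical counter aggregate: state is just a number.
const applyCounterEvent = (state, event) => state + event.payload.delta;

// Rebuild state from an optional snapshot plus the events recorded after it.
function loadFrom(snapshot, allEvents) {
  const startState = snapshot ? snapshot.state : 0;
  const startSeq = snapshot ? snapshot.sequence_number : 0;
  const tail = allEvents.filter((e) => e.sequence_number > startSeq);
  return {
    state: tail.reduce(applyCounterEvent, startState),
    sequence: tail.length ? tail[tail.length - 1].sequence_number : startSeq,
    replayed: tail.length, // how many events had to be applied
  };
}

// 250 events, each adding 1 to the counter.
const events = Array.from({ length: 250 }, (_, i) => ({
  sequence_number: i + 1,
  payload: { delta: 1 },
}));

const full = loadFrom(null, events);
const fast = loadFrom({ state: 200, sequence_number: 200 }, events);

console.log(full); // { state: 250, sequence: 250, replayed: 250 }
console.log(fast); // { state: 250, sequence: 250, replayed: 50 }
```

Both loads reach the same state; only the number of replayed events differs. That equivalence is also a useful invariant to assert in tests whenever the reducer changes, since a stale snapshot built by old reducer logic would break it.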
Event Schema Evolution and Versioning
Event schemas change over time. You add fields, remove fields, rename fields, or restructure event data. Unlike database migrations that update all rows, events are immutable—you can't change historical events. Schema evolution means handling multiple event versions simultaneously.
Event Versioning Pattern
Include an event_version field in every event. When you change the event schema, increment the version and update consumers to handle both old and new versions.
// Version 1: OrderCreated event
{
  "event_type": "OrderCreated",
  "event_version": 1,
  "payload": {
    "order_id": "order_123",
    "customer_email": "customer@example.com",
    "total": 99.99
  }
}
// Version 2: OrderCreated with customer_id instead of email
{
"event_type": "OrderCreated",
"event_version": 2,
"payload": {
"order_id": "order_123",
"customer_id": "customer_456",
"total": 99.99
}
}
Event handlers check the version and process accordingly:
async function handleOrderCreated(event) {
if (event.event_version === 1) {
// Old format: look up customer by email
const customer = await findCustomerByEmail(event.payload.customer_email);
return createOrder(event.payload.order_id, customer.id, event.payload.total);
} else if (event.event_version === 2) {
// New format: customer_id directly available
return createOrder(
event.payload.order_id,
event.payload.customer_id,
event.payload.total
);
}
throw new Error(`Unknown OrderCreated version: ${event.event_version}`);
}
Upcasting Pattern for Event Migration
Upcasting transforms old event versions into the current version at read time. This keeps business logic simple by always working with the latest event format.
async function upcastOrderCreated(event) {
  if (event.event_version === 1) {
    // Transform v1 to v2 format: resolve the email to a customer ID
    const customer = await findCustomerByEmail(event.payload.customer_email);
    return {
      ...event,
      event_version: 2,
      payload: {
        order_id: event.payload.order_id,
        customer_id: customer.id,
        total: event.payload.total
      }
    };
  }
  return event; // Already v2
}
async function handleOrderCreated(event) {
  const currentEvent = await upcastOrderCreated(event);
  // Business logic only handles v2 format
  return createOrder(
    currentEvent.payload.order_id,
    currentEvent.payload.customer_id,
    currentEvent.payload.total
  );
}
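Upcasting centralizes version handling, making event handlers simpler. The cost is that upcasting runs for every old event that is read. Pure field transformations are cheap, but an upcaster that performs a lookup, like the customer lookup above, adds I/O per event, so cache the lookup or precompute the mapping before large replays.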
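When an event type accumulates several versions, one common arrangement is a table of single-step upcasters that are applied in sequence until no further step exists. The sketch below assumes a preloaded email-to-ID map (`ctx.customerIdByEmail`) so the upcasters stay pure, and it invents a v2-to-v3 step (adding a `currency` field) purely to show chaining; neither is part of the schema described in this guide.

```javascript
// One pure upcaster per (event type, version) step.
// The v2 -> v3 step is hypothetical and exists only to demonstrate chaining.
const upcasters = {
  OrderCreated: {
    1: (event, ctx) => ({
      ...event,
      event_version: 2,
      payload: {
        order_id: event.payload.order_id,
        customer_id: ctx.customerIdByEmail[event.payload.customer_email],
        total: event.payload.total,
      },
    }),
    2: (event) => ({
      ...event,
      event_version: 3,
      payload: { ...event.payload, currency: 'USD' },
    }),
  },
};

// Apply steps until the event reaches a version with no registered upcaster.
function upcast(event, ctx) {
  let current = event;
  for (;;) {
    const step = upcasters[current.event_type]?.[current.event_version];
    if (!step) return current;
    current = step(current, ctx);
  }
}

const ctx = { customerIdByEmail: { 'customer@example.com': 'customer_456' } };
const v1 = {
  event_type: 'OrderCreated',
  event_version: 1,
  payload: { order_id: 'order_123', customer_email: 'customer@example.com', total: 99.99 },
};

const latest = upcast(v1, ctx);
console.log(latest.event_version); // 3
console.log(latest.payload);
// { order_id: 'order_123', customer_id: 'customer_456', total: 99.99, currency: 'USD' }
```

Each step only knows about its own version and the next, so adding a v4 means writing one new function and leaving the existing steps alone.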
Handling Eventually Consistent Projections
Projections update asynchronously—there's a delay between when an event is written and when all projections reflect that event. This creates eventual consistency: reads might return stale data briefly after a write. Applications must handle this reality.
Projection Lag Monitoring
Track how far behind each projection is by comparing the last processed event sequence to the latest event in the store. This indicates projection health and helps identify performance bottlenecks.
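async function getProjectionLag(projectionName) {
  // Compare store-wide positions (events.id). The checkpoint's sequence_number
  // is assumed to hold the last processed global position, because an
  // aggregate's own sequence_number is only ordered within that aggregate.
  const latestEvent = await db.events
    .orderBy('id', 'desc')
    .first();
  const checkpoint = await db.projection_checkpoints
    .where({ projection_name: projectionName })
    .first();
  // An empty store or a projection that has never run counts as position 0.
  const latestPosition = latestEvent ? latestEvent.id : 0;
  const processedPosition = checkpoint ? checkpoint.sequence_number : 0;
  const lag = latestPosition - processedPosition;
  const timeLag = checkpoint
    ? Date.now() - new Date(checkpoint.updated_at).getTime()
    : null;
  return { eventLag: lag, timeLag };
}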
If projection lag exceeds acceptable thresholds (for example, 1,000 events or 10 seconds), alert on it. Projection lag indicates either event volume that your projections can't keep up with, or projection bugs causing crashes and restarts.
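A threshold check on those two numbers could be sketched as follows. The `evaluateLag` function and its threshold names are hypothetical; the default values simply mirror the example figures from the paragraph above and should be tuned per projection.

```javascript
// Example thresholds taken from the figures in the text; tune per projection.
const DEFAULT_THRESHOLDS = { maxEventLag: 1000, maxTimeLagMs: 10000 };

// Takes the { eventLag, timeLag } shape and reports which limits are exceeded.
function evaluateLag({ eventLag, timeLag }, thresholds = DEFAULT_THRESHOLDS) {
  const problems = [];
  if (eventLag > thresholds.maxEventLag) {
    problems.push(`behind by ${eventLag} events`);
  }
  // An idle system has an old checkpoint but nothing to process, so only
  // treat time lag as a problem when there is unprocessed work.
  if (eventLag > 0 && timeLag !== null && timeLag > thresholds.maxTimeLagMs) {
    problems.push(`no progress for ${timeLag} ms`);
  }
  return { healthy: problems.length === 0, problems };
}

console.log(evaluateLag({ eventLag: 0, timeLag: 60000 }));
// { healthy: true, problems: [] }
console.log(evaluateLag({ eventLag: 1500, timeLag: 2000 }));
// { healthy: false, problems: [ 'behind by 1500 events' ] }
```

Guarding the time check with `eventLag > 0` avoids false alarms overnight or on quiet systems, where the checkpoint timestamp ages even though the projection is fully caught up.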
Read-Your-Writes Consistency
Users expect to see their own writes immediately. After creating an order, they should see it in their order list. Handle this by waiting for the relevant projection to process the event before returning from the write operation.
const { randomUUID } = require('node:crypto');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function createOrder(customerId, items) {
  const orderId = randomUUID();
  // Write event. A new aggregate has no events yet, so expectedSequence is 0.
  const result = await appendEvent(orderId, 'OrderCreated', {
    customer_id: customerId,
    items: items,
    total: calculateTotal(items)
  }, 0);
  if (!result.success) {
    throw new Error(`Could not create order: ${result.reason}`);
  }
  // Wait for projection to process this event. Assumes appendEvent returns
  // the stored row and that checkpoints track the store-wide position (events.id).
  await waitForProjection('OrderSummary', result.event.id, {
    timeout: 5000
  });
  // Now the order appears in queries
  return { orderId, status: 'created' };
}
async function waitForProjection(projectionName, position, options) {
  const startTime = Date.now();
  while (Date.now() - startTime < options.timeout) {
    const checkpoint = await getProjectionCheckpoint(projectionName);
    if (checkpoint.sequence_number >= position) {
      return; // Projection has processed our event
    }
    await sleep(50); // Poll every 50ms
  }
  throw new Error('Projection processing timeout');
}
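This ensures read-your-writes consistency for the user who created the order, while other users experience eventual consistency. The tradeoff is added write latency: the projection's processing delay plus up to one polling interval (50ms in this sketch). If the wait times out, the event has still been stored, so treat the timeout as "not yet visible" and not as a failed write.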
Event Sourcing Anti-Patterns to Avoid
Common mistakes make event sourcing systems complex and unmaintainable. Recognizing these anti-patterns prevents painful refactoring.
Event Sourcing Everything
The biggest mistake is applying event sourcing to your entire system. Event sourcing adds complexity—only use it where it provides clear benefits (audit requirements, complex state machines, temporal queries). For simple CRUD operations, traditional state storage is better.
Identify bounded contexts that benefit from event sourcing (orders, financial transactions, user actions requiring audit trails) and use traditional storage for everything else (user profiles, configuration, reference data). Mixing approaches is fine—event sourcing is a tactical pattern, not an architectural mandate.
Storing Application State in Events
Events should represent business facts, not application implementation details. Storing UI state, validation results, or intermediate processing steps as events pollutes your event log.
// Bad: storing application state
{
"event_type": "OrderFormValidated",
"payload": {
"validation_errors": [],
"form_state": { ... }
}
}
// Good: storing business facts
{
"event_type": "OrderCreated",
"payload": {
"order_id": "order_123",
"customer_id": "customer_456",
"items": [ ... ]
}
}
Overly Fine-Grained Events
Creating an event for every tiny state change creates event explosion. An order with 20 items doesn't need 20 OrderItemAdded events—a single OrderCreated event with all items is clearer and more efficient.
Find the right granularity: events should represent meaningful business state transitions, not every field change. Group related changes into single events when they happen atomically.
Testing Event-Sourced Systems
Event sourcing makes testing easier in some ways (events are explicit inputs) and harder in others (projections add async complexity). Effective testing strategies cover event application, projection updates, and end-to-end flows.
Testing Event Handlers
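State-transition functions such as applyEvent are pure: given a state and an event, they return a new state without performing I/O. This makes them straightforward to test. Projection handlers write to a read model, so they are tested separately.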
test('OrderCreated event initializes order state', () => {
const initialState = null;
const event = {
event_type: 'OrderCreated',
payload: {
order_id: 'order_123',
customer_id: 'customer_456',
total: 99.99
}
};
const newState = applyEvent(initialState, event);
expect(newState).toEqual({
order_id: 'order_123',
customer_id: 'customer_456',
total: 99.99,
status: 'created',
items: []
});
});
test('OrderShipped event updates order status', () => {
const currentState = {
order_id: 'order_123',
status: 'paid'
};
const event = {
event_type: 'OrderShipped',
payload: {
tracking_number: '1Z999AA10123456784'
}
};
const newState = applyEvent(currentState, event);
expect(newState.status).toBe('shipped');
expect(newState.tracking_number).toBe('1Z999AA10123456784');
});
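The tests above assume an `applyEvent` reducer that this guide does not define. One minimal sketch that would satisfy them is shown below; the exact state shape and the `OrderDelivered` case are assumptions based on the events used elsewhere in this guide.

```javascript
// Pure state-transition function: returns a new state object and never
// mutates the state or event it was given.
function applyEvent(state, event) {
  const { payload } = event;
  switch (event.event_type) {
    case 'OrderCreated':
      return {
        order_id: payload.order_id,
        customer_id: payload.customer_id,
        total: payload.total,
        status: 'created',
        items: payload.items ?? [], // default when the event carries no items
      };
    case 'OrderShipped':
      return { ...state, status: 'shipped', tracking_number: payload.tracking_number };
    case 'OrderDelivered':
      return { ...state, status: 'delivered' };
    default:
      return state; // unknown event types leave state unchanged
  }
}
```

Returning the input state for unknown event types keeps old code working when newer event types appear in the stream, at the cost of silently ignoring them; throwing instead is a reasonable alternative when strictness matters more.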
Testing Projections
Test projections by feeding them a sequence of events and verifying the resulting read model state.
test('OrderSummary projection builds correct read model', async () => {
const projection = new OrderSummaryProjection();
await projection.handleOrderCreated({
aggregate_id: 'order_123',
payload: { customer_id: 'customer_456', total: 99.99 },
occurred_at: new Date('2024-03-01T10:00:00Z')
});
await projection.handleOrderShipped({
aggregate_id: 'order_123',
payload: { tracking_number: '1Z999AA10123456784' },
occurred_at: new Date('2024-03-02T15:30:00Z')
});
const order = await db.orders.findOne({ order_id: 'order_123' });
expect(order.status).toBe('shipped');
expect(order.tracking_number).toBe('1Z999AA10123456784');
expect(order.total).toBe(99.99);
});
FAQ
When should I use event sourcing instead of traditional database storage?
Use event sourcing when you need complete audit trails, ability to reconstruct historical state, or complex state machines where understanding state transitions matters. Financial systems, healthcare applications, and e-commerce orders are good candidates. Don't use event sourcing for simple CRUD operations, configuration data, or when current state is all you need.
How do I handle GDPR right-to-be-forgotten with immutable events?
Store personally identifiable information (PII) outside the event store and reference it by ID in events. When a user requests deletion, remove their PII from the reference store while keeping the events. Alternatively, encrypt PII in events and delete the encryption key, making the data unrecoverable while maintaining event integrity.
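The key-deletion approach (often called crypto-shredding) can be sketched with Node's built-in `crypto` module. The in-memory `keyStore` and the function names are assumptions for this example; a real system would keep keys in a dedicated key management service, outside the event store.

```javascript
const crypto = require('node:crypto');

// Hypothetical key store: one AES-256 key per data subject.
const keyStore = new Map();

// Encrypt a PII value before it goes into an event payload.
function encryptPii(userId, plaintext) {
  if (!keyStore.has(userId)) keyStore.set(userId, crypto.randomBytes(32));
  const iv = crypto.randomBytes(12); // fresh 96-bit nonce per value
  const cipher = crypto.createCipheriv('aes-256-gcm', keyStore.get(userId), iv);
  const data = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return {
    iv: iv.toString('base64'),
    tag: cipher.getAuthTag().toString('base64'),
    data: data.toString('base64'),
  };
}

// Returns the plaintext, or null once the user's key has been erased.
function decryptPii(userId, box) {
  const key = keyStore.get(userId);
  if (!key) return null;
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, Buffer.from(box.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(box.tag, 'base64'));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(box.data, 'base64')),
    decipher.final(),
  ]);
  return plain.toString('utf8');
}

// Right to be forgotten: delete the key, leave the events untouched.
function forgetUser(userId) {
  keyStore.delete(userId);
}

const box = encryptPii('user_456', 'jane@example.com');
console.log(decryptPii('user_456', box)); // jane@example.com
forgetUser('user_456');
console.log(decryptPii('user_456', box)); // null
```

Projections that copied decrypted PII into read models still hold it after the key is gone, so a deletion request also needs to clear or rebuild those read models.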
What's the performance difference between event sourcing and traditional storage?
Writes are typically faster in event sourcing because they are append-only and see little lock contention; 10,000+ writes per second is achievable, depending on hardware and the store. Reads are slower without projections (must replay events) but comparable with projections. Overall system complexity is higher due to projection infrastructure. Choose based on whether write performance and audit requirements justify the complexity.
How often should I take snapshots?
Snapshot frequency depends on aggregate complexity and event volume. Start with snapshots every 100 events. If aggregates are large (many fields, complex state), snapshot more frequently. If events are small and cheap to replay, snapshot less frequently. Monitor aggregate load time and adjust accordingly.
Can I use event sourcing with existing databases like PostgreSQL?
Yes, PostgreSQL works well for event sourcing. Store events in a table with JSONB payloads, use sequence numbers for ordering, and index on aggregate_id. PostgreSQL's ACID guarantees ensure event consistency. Dedicated event stores like EventStoreDB provide additional features but aren't required.
How do I fix bugs in projection logic without replaying millions of events?
Rebuild the projection in a new table while keeping the old projection serving queries. Once the rebuild completes, atomically switch queries to the new projection. This enables zero-downtime projection fixes. For incremental fixes, track which aggregates need reprocessing and replay only their events.
What happens if two processes write conflicting events simultaneously?
Use optimistic concurrency with sequence numbers. Each process checks the expected sequence number when writing events. If another process wrote an event (sequence number advanced), the write fails. The failed process reloads events, recomputes its operation with current state, and retries.
Should I use a dedicated event store like EventStoreDB or build on PostgreSQL?
PostgreSQL works well for most event sourcing use cases and avoids additional infrastructure. Dedicated event stores provide built-in projections, subscriptions, and optimized event streaming, but add operational complexity. Start with PostgreSQL unless you need specific event store features or extremely high event volume.
How do I handle schema changes in events that are already stored?
Use event versioning and upcasting. Add an event_version field to all events. When schema changes, increment the version and write new events with the new schema. Use upcasting functions to transform old event versions to the current version at read time. Never modify existing events in the store.
Can I delete events from the event store?
Generally no—events are immutable facts about what happened. Deleting events breaks auditability and can corrupt projections. If you must remove data (GDPR compliance), either store sensitive data outside events and reference by ID, or use cryptographic erasure (encrypt data and delete the key). For truly erroneous events, append a compensating event instead of deleting.
Conclusion
Event sourcing provides powerful capabilities—complete audit trails, temporal queries, and reliable state machine implementations—at the cost of increased complexity. Use it where these capabilities solve real problems, not as a default architecture. Financial systems, healthcare applications, and complex workflow engines benefit from event sourcing. Simple CRUD applications don't.
Implementation success depends on solid fundamentals: append-only event store with sequence numbers, projections optimized for your query patterns, snapshots to keep replay performant, and event versioning to handle schema evolution. These patterns enable event sourcing systems to scale to millions of events while remaining maintainable.
Start small: identify one bounded context where event sourcing adds value, implement it there, and validate the approach before expanding. Event sourcing is a tactical pattern best applied selectively to parts of your system that benefit most, not a system-wide architectural mandate that adds complexity everywhere.