
Architecture

This document explains the core architecture, components, and design principles behind HarmonyLite. Understanding these concepts will help you better implement, configure, and troubleshoot your HarmonyLite deployment.

TL;DR

HarmonyLite is an AP system (Availability + Partition Tolerance) that uses SQLite triggers to capture changes, NATS JetStream to distribute them, and a last-writer-wins strategy for conflict resolution. Any node can accept writes, and all nodes eventually converge to the same state.

Reading Guide

This is the recommended starting point for understanding HarmonyLite. After this, read Replication for details on how changes propagate, then Snapshots for recovery mechanisms.

Architectural Overview

HarmonyLite implements a leaderless, eventually consistent replication system for SQLite databases. The architecture consists of four main components working together:

  1. Change Data Capture (CDC): Monitors and records database changes
  2. Message Distribution: Publishes and subscribes to change events
  3. Change Application: Applies changes to local databases
  4. State Management: Handles snapshots and recovery

Each of these components is described in detail in the sections below.

Core Components

1. Change Data Capture (CDC)

HarmonyLite uses SQLite triggers to capture all database changes:

  • Triggers: Automatically installed on all tables to detect INSERT, UPDATE, and DELETE operations
  • Change Log Tables: Each monitored table has a corresponding __harmonylite__<table_name>_change_log table
  • Global Change Log: A master table (__harmonylite___change_log_global) tracks the sequence of operations

When a change occurs:

  1. The trigger fires and captures the change details
  2. Information is stored in the change log table
  3. A reference is added to the global change log

Table Structure

Database changes are tracked in specialized change-log tables.
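
A hypothetical sketch (in Go, with the generated DDL shown as string constants) for a table users(id INTEGER PRIMARY KEY, name TEXT). The column set, trigger name, and metadata fields below are assumptions for illustration, not HarmonyLite's exact generated schema:

// Illustrative only: the real generated names and columns may differ.
package cdc

import "database/sql"

// A hypothetical per-table change log (__harmonylite__<table_name>_change_log):
// one val_* column per tracked column, plus metadata about the operation.
const changeLogDDL = `
CREATE TABLE IF NOT EXISTS __harmonylite__users_change_log (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    val_id     INTEGER,            -- captured value of users.id
    val_name   TEXT,               -- captured value of users.name
    type       TEXT,               -- 'insert', 'update', or 'delete'
    created_at INTEGER,            -- capture timestamp
    state      INTEGER DEFAULT 0   -- 0 = pending, 1 = published
);`

// A hypothetical trigger that records INSERTs into the change log;
// similar triggers would cover UPDATE and DELETE.
const insertTriggerDDL = `
CREATE TRIGGER IF NOT EXISTS __harmonylite__users_on_insert
AFTER INSERT ON users
BEGIN
    INSERT INTO __harmonylite__users_change_log (val_id, val_name, type, created_at)
    VALUES (NEW.id, NEW.name, 'insert', strftime('%s','now'));
END;`

// InstallCDC creates the change-log table and trigger on an open SQLite database.
func InstallCDC(db *sql.DB) error {
    for _, ddl := range []string{changeLogDDL, insertTriggerDDL} {
        if _, err := db.Exec(ddl); err != nil {
            return err
        }
    }
    return nil
}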

2. Message Distribution

HarmonyLite uses NATS JetStream for reliable message distribution:

  • Change Detection: Monitors the database for modifications
  • Change Collection: Retrieves pending records from change log tables
  • Hash Calculation: Computes a hash from table name and primary keys
  • Stream Selection: Routes changes to specific streams based on the hash
  • Publishing: Sends changes to NATS JetStream
  • Confirmation: Marks changes as published after acknowledgment

This approach ensures changes to the same row are always handled in order, while allowing parallel processing of changes to different rows.
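
A condensed sketch of this publish path, assuming the nats.go client and the fxamacker/cbor codec; the struct fields and function names are illustrative, not HarmonyLite's actual identifiers:

package publisher

import (
    "github.com/fxamacker/cbor/v2" // assumption: any CBOR codec illustrates the encoding step
    "github.com/nats-io/nats.go"
)

// ChangeRow stands in for one pending row read from a change-log table.
type ChangeRow struct {
    Table  string
    Op     string         // "insert", "update", or "delete"
    Values map[string]any // captured column values
}

// publishChange encodes a captured change and sends it to the JetStream subject
// chosen for its shard (see the Sharding section for how the subject is derived
// from the table name and primary keys).
func publishChange(js nats.JetStreamContext, subject string, c ChangeRow) error {
    payload, err := cbor.Marshal(c)
    if err != nil {
        return err
    }
    if _, err := js.Publish(subject, payload); err != nil {
        return err
    }
    // Only after JetStream acknowledges the publish would the row be marked as
    // published in its change-log table (the "Confirmation" step above).
    return nil
}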

3. Change Application

When a node receives a change message (see the sketch after this list):

  1. It checks whether the change originated locally (to avoid cycles)
  2. It verifies the change hasn't been applied before
  3. It parses the change details (table, operation type, values)
  4. It applies the change to the local database
  5. It records the message sequence for recovery tracking
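
A simplified sketch of steps 3 and 4, assuming CBOR-encoded payloads; the SQL construction is heavily reduced and the names are illustrative (steps 1, 2, and 5 are handled through the Sequence Map described below):

package applier

import (
    "database/sql"
    "fmt"
    "strings"

    "github.com/fxamacker/cbor/v2"
)

// ChangeRow mirrors the publisher-side structure: table, operation, and values.
type ChangeRow struct {
    Table  string
    Op     string
    Values map[string]any
}

// applyMessage decodes one replication message and applies it to the local database.
func applyMessage(db *sql.DB, payload []byte) error {
    var c ChangeRow
    if err := cbor.Unmarshal(payload, &c); err != nil {
        return err
    }
    switch c.Op {
    case "insert", "update":
        cols := make([]string, 0, len(c.Values))
        args := make([]any, 0, len(c.Values))
        for col, val := range c.Values {
            cols = append(cols, col)
            args = append(args, val)
        }
        placeholders := strings.TrimSuffix(strings.Repeat("?,", len(cols)), ",")
        // INSERT OR REPLACE illustrates last-writer-wins: an existing row with
        // the same primary key is simply overwritten by the incoming values.
        query := fmt.Sprintf("INSERT OR REPLACE INTO %s (%s) VALUES (%s)",
            c.Table, strings.Join(cols, ","), placeholders)
        _, err := db.Exec(query, args...)
        return err
    case "delete":
        // Assumes a single `id` primary key column for brevity.
        _, err := db.Exec(fmt.Sprintf("DELETE FROM %s WHERE id = ?", c.Table), c.Values["id"])
        return err
    }
    return fmt.Errorf("unknown operation %q", c.Op)
}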

4. State Management

HarmonyLite maintains system state through:

  • Sequence Map: Tracks the last processed message for each stream
  • Snapshots: Periodic database snapshots for efficient recovery
  • CBOR Serialization: Efficient binary encoding for change records
  • Schema Registry: NATS KV-backed registry for cluster-wide schema visibility

5. Component Design

The internal package structure mirrors the components described above.

Sequence Map & Idempotency

The Sequence Map is the "brain" of HarmonyLite's reliability. It is a local file (default: seq-map.cbor) that maintains the state of consumption for every stream.

Why is it critical?

  1. Idempotency (Exactly-Once Processing):
    • In distributed systems, messages may be delivered more than once.
    • The Sequence Map filters out duplicates by ensuring we only process Sequence > StartSequence.
    • This guarantees that a database change (like an INSERT) is never applied twice, which would corrupt data.
  2. Crash Recovery:
    • If a node restarts, it reads the Sequence Map to know exactly where it left off.
    • It resumes consumption from LastSequence + 1.

How it works

The Sequence Map is a simple Key-Value store serialized in CBOR (Concise Binary Object Representation) for performance and compactness.

  • Key: Stream Name (e.g., harmonylite-changes-1)
  • Value: Last successfully applied Sequence Number (e.g., 1042)
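
A minimal sketch of this mapping, assuming the fxamacker/cbor codec; the type and function names are illustrative:

package seqmap

import (
    "os"

    "github.com/fxamacker/cbor/v2"
)

// SequenceMap maps a stream name (e.g. "harmonylite-changes-1") to the last
// successfully applied sequence number (e.g. 1042).
type SequenceMap map[string]uint64

// Load reads the map from disk; a missing file simply means "start from zero".
func Load(path string) (SequenceMap, error) {
    data, err := os.ReadFile(path)
    if os.IsNotExist(err) {
        return SequenceMap{}, nil
    } else if err != nil {
        return nil, err
    }
    m := SequenceMap{}
    return m, cbor.Unmarshal(data, &m)
}

// ShouldApply filters duplicates: only messages newer than the recorded
// sequence are processed, which keeps change application idempotent.
func (m SequenceMap) ShouldApply(stream string, seq uint64) bool {
    return seq > m[stream]
}

// Commit records progress and persists it, so that after a restart
// consumption resumes from the last applied sequence + 1.
func (m SequenceMap) Commit(path, stream string, seq uint64) error {
    m[stream] = seq
    data, err := cbor.Marshal(m)
    if err != nil {
        return err
    }
    return os.WriteFile(path, data, 0o644)
}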

Key Mechanisms

Leaderless Replication

Unlike leader-follower systems, HarmonyLite operates without a designated leader:

  • Any node can accept writes
  • Changes propagate to all nodes
  • No single point of failure
  • Higher write availability

Eventual Consistency

HarmonyLite prioritizes availability over immediate consistency:

  • Changes eventually reach all nodes
  • Last-writer-wins conflict resolution
  • No global locking mechanism
  • Non-blocking operations

Sharding

Change streams can be sharded to improve performance (see the sketch after this list):

  • Each shard handles a subset of rows
  • Determined by hashing table name and primary keys
  • Enables parallel processing
  • Configurable via replication_log.shards
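
A minimal sketch of shard selection, assuming an FNV-1a hash and the harmonylite-changes-<n> stream naming seen earlier; the actual hash function is an implementation detail and may differ:

package sharding

import (
    "fmt"
    "hash/fnv"
)

// streamFor maps a (table, primary key) pair onto one of `shards` streams.
// All changes to the same row hash to the same stream, so their relative order
// is preserved, while changes to different rows can be processed in parallel.
func streamFor(table, primaryKey string, shards uint64) string {
    h := fnv.New64a()
    h.Write([]byte(table))
    h.Write([]byte(primaryKey))
    return fmt.Sprintf("harmonylite-changes-%d", h.Sum64()%shards+1)
}

// Example: with shards = 4, streamFor("users", "42", 4) always returns the
// same stream name for row 42 of the users table.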

Message Flow

The complete message flow runs from a local write, through the CDC triggers and change-log tables, to the publisher that pushes the change into NATS JetStream, and finally to the subscribers on other nodes, which apply the change and record its sequence number.

Snapshot and Recovery

Snapshot Creation

Snapshots provide efficient node recovery: HarmonyLite periodically saves a snapshot of the database so that a recovering node can restore recent state instead of replaying the entire change history. See Snapshots for details on how snapshots are created and stored.

Node Recovery

When a node starts or needs to catch up, it restores the most recent snapshot if its local state is too far behind, then replays the changes published since that snapshot, resuming from the position recorded in its sequence map. Recovery time is therefore proportional to the number of changes since the last snapshot.

Schema Versioning

HarmonyLite includes schema versioning to handle rolling upgrades where nodes may temporarily have different database schemas.

How It Works

  1. Schema Hash Computation: Each table's schema is hashed using Atlas for introspection
  2. Message Tagging: Every replication message includes the sender's schema hash
  3. Validation: Receivers compare the message hash against their local schema
  4. Pause on Mismatch: If schemas differ, replication pauses (the message is NAK'd with a delay, as sketched below)
  5. Automatic Resume: After local schema upgrade and restart, replication resumes
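
A hypothetical sketch of the validation and pause steps, assuming the sender's hash travels in a NATS message header; the header name and delay are invented for illustration:

package schemacheck

import (
    "time"

    "github.com/nats-io/nats.go"
)

type localSchema struct {
    CurrentHash  string
    PreviousHash string // lets peers keep publishing during a rolling upgrade
}

// validate accepts a message whose schema hash matches either the local current
// or previous hash; otherwise it NAKs the message with a delay so replication
// pauses and the message is redelivered after the local schema is upgraded.
func validate(msg *nats.Msg, s localSchema) (bool, error) {
    senderHash := msg.Header.Get("Harmonylite-Schema-Hash") // illustrative header name
    if senderHash == s.CurrentHash || senderHash == s.PreviousHash {
        return true, nil
    }
    return false, msg.NakWithDelay(30 * time.Second)
}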

Schema Registry

HarmonyLite maintains a NATS KeyValue registry (harmonylite-schema-registry) that tracks each node's schema state:

{
  "node_id": 1,
  "schema_hash": "a1b2c3d4e5f6...",
  "previous_hash": "x9y8z7w6v5u4...",
  "harmonylite_version": "1.2.0",
  "updated_at": "2025-01-20T10:30:00Z"
}

The previous_hash field enables smooth rolling upgrades: a node accepts events matching either its current hash or its previous hash. This prevents unnecessary pauses when upgrading nodes one at a time in a multi-publisher cluster.

This enables operators to monitor schema rollout progress across the cluster.
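
A small monitoring sketch built on this registry, assuming the nats.go client and one KV entry per node (the key layout is an assumption); it prints each node's current schema hash:

package main

import (
    "encoding/json"
    "fmt"
    "log"

    "github.com/nats-io/nats.go"
)

// registryEntry mirrors the JSON document shown above.
type registryEntry struct {
    NodeID             int    `json:"node_id"`
    SchemaHash         string `json:"schema_hash"`
    PreviousHash       string `json:"previous_hash"`
    HarmonyliteVersion string `json:"harmonylite_version"`
    UpdatedAt          string `json:"updated_at"`
}

func main() {
    nc, err := nats.Connect(nats.DefaultURL)
    if err != nil {
        log.Fatal(err)
    }
    defer nc.Close()

    js, err := nc.JetStream()
    if err != nil {
        log.Fatal(err)
    }
    kv, err := js.KeyValue("harmonylite-schema-registry")
    if err != nil {
        log.Fatal(err)
    }

    keys, err := kv.Keys()
    if err != nil {
        log.Fatal(err)
    }
    for _, key := range keys {
        entry, err := kv.Get(key)
        if err != nil {
            continue
        }
        var e registryEntry
        if err := json.Unmarshal(entry.Value(), &e); err != nil {
            continue
        }
        fmt.Printf("node %d: schema %s (harmonylite %s, updated %s)\n",
            e.NodeID, e.SchemaHash, e.HarmonyliteVersion, e.UpdatedAt)
    }
}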

Rolling Upgrade Flow

Nodes are upgraded one at a time; because each node accepts events matching either its current or its previous schema hash, replication keeps flowing while the remaining nodes are upgraded. For detailed design documentation, see Schema Versioning Design.

Understanding Trade-offs

CAP Theorem Positioning

HarmonyLite makes specific trade-offs according to the CAP theorem:

  • Consistency: Eventual (not strong)
  • Availability: High (prioritized)
  • Partition Tolerance: Maintained

This positions HarmonyLite as an AP system (Availability and Partition Tolerance) rather than a CP system.

Suitable Use Cases

HarmonyLite is ideal for:

  • Read-heavy workloads
  • Systems that can tolerate eventual consistency
  • Applications needing high write availability
  • Edge computing and distributed systems

Less Suitable Use Cases

HarmonyLite may not be the best choice for:

  • Strong consistency requirements
  • Complex transactional workloads
  • Financial systems requiring immediate consistency
  • Systems with strict ordering requirements

Performance Characteristics

Scalability

  • Read Scalability: Excellent (horizontal)
  • Write Scalability: Good (limited by conflict resolution)
  • Node Count: Practical up to dozens of nodes

Latency

  • Local Operations: Minimal impact (~1-5ms overhead)
  • Replication Delay: Typically 50-500ms depending on network
  • Recovery Time: Proportional to changes since last snapshot

Resource Usage

  • Memory: Moderate (configurable)
  • CPU: Low to moderate
  • Disk: Additional space for change logs and snapshots
  • Network: Proportional to change volume and compression settings

Next Steps