Replication Mechanism
This document provides a deep dive into HarmonyLite's replication process, explaining how changes propagate between nodes, how conflicts are resolved, and how to optimize the system for different workloads.
Core Principles
HarmonyLite implements a leaderless, eventually consistent replication system with the following design principles:
- Any node can accept writes without coordination
- Changes propagate asynchronously to all nodes
- Last-writer-wins for conflict resolution
- Eventual consistency rather than strong consistency
- Parallel processing through sharding
This approach prioritizes availability and partition tolerance over immediate consistency, making HarmonyLite an "AP" system in terms of the CAP theorem.
Change Data Capture (CDC)
Trigger Mechanism
HarmonyLite captures database changes using SQLite triggers.
The system creates three triggers for each monitored table:
- INSERT Trigger: Captures new row data
- UPDATE Trigger: Captures modified row data
- DELETE Trigger: Captures deleted row IDs
Here's a simplified example of an INSERT trigger:
CREATE TRIGGER __harmonylite__users_change_log_on_insert
AFTER INSERT ON users
BEGIN
    INSERT INTO __harmonylite__users_change_log(
        val_id, val_name, val_email,
        type, created_at, state
    ) VALUES(
        NEW.id, NEW.name, NEW.email,
        'insert',
        CAST((strftime('%s','now') || substr(strftime('%f','now'),4)) AS INT),
        0 -- Pending state
    );
    INSERT INTO __harmonylite___global_change_log (change_table_id, table_name)
    VALUES (
        last_insert_rowid(),
        'users'
    );
END;
Change Log Tables
Two types of tables track changes:
- Per-Table Change Logs: Store specific changes to each table
  - Named __harmonylite__<table_name>_change_log
  - Contain columns for all original table fields (prefixed with val_)
  - Include metadata like operation type, timestamp, and state
- Global Change Log: Tracks the sequence of all changes
  - Named __harmonylite___global_change_log
  - References the specific change log entries
  - Provides a chronological record of all operations
Schema Example
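HarmonyLite generates these tables automatically at startup. The definitions below are an illustrative sketch inferred from the trigger above; the exact column names and types may differ between versions.

-- Illustrative per-table change log for the users table
CREATE TABLE IF NOT EXISTS __harmonylite__users_change_log (
    id         INTEGER PRIMARY KEY AUTOINCREMENT, -- change log entry ID
    val_id     INTEGER,  -- copy of users.id
    val_name   TEXT,     -- copy of users.name
    val_email  TEXT,     -- copy of users.email
    type       TEXT,     -- 'insert', 'update', or 'delete'
    created_at INTEGER,  -- timestamp recorded by the trigger
    state      INTEGER   -- 0 = pending, 1 = published
);

-- Illustrative global change log referencing the per-table entries
CREATE TABLE IF NOT EXISTS __harmonylite___global_change_log (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    change_table_id INTEGER, -- row ID in the per-table change log
    table_name      TEXT     -- e.g. 'users'
);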
Change Detection and Publishing
Polling Mechanism
HarmonyLite periodically checks for changes using a polling mechanism:
- File Monitoring: Watches the SQLite database and WAL files for changes
- Change Detection: When changes are detected, queries the change log tables
- State Filtering: Retrieves only changes in the "pending" state (state = 0)
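Conceptually, the per-table poll is equivalent to a query like the one below; this is an illustrative sketch, not the exact SQL HarmonyLite runs internally.

-- Fetch pending changes for the users table in commit order
SELECT id, val_id, val_name, val_email, type, created_at
FROM __harmonylite__users_change_log
WHERE state = 0 -- pending
ORDER BY id;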
Hash-Based Sharding
For efficient parallel processing, changes are distributed across multiple streams based on a hash:
- Hash Calculation: Computed from table name and primary key values
- Shard Selection: The hash determines which stream receives the change
- Message Distribution: Changes to the same row always go to the same stream
This ensures that changes to the same row are processed in order, while allowing changes to different rows to be processed in parallel.
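The Go sketch below illustrates the idea; the specific hash function, stream naming, and shard numbering are assumptions for the example rather than HarmonyLite's exact implementation.

package main

import (
    "fmt"
    "hash/fnv"
)

// shardFor hashes the table name plus primary-key values so that all
// changes to the same row map to the same stream and stay ordered,
// while changes to different rows spread across shards for parallelism.
func shardFor(tableName string, pkValues []string, shards uint64) uint64 {
    h := fnv.New64a()
    h.Write([]byte(tableName))
    for _, v := range pkValues {
        h.Write([]byte(v))
    }
    return h.Sum64() % shards
}

func main() {
    shard := shardFor("users", []string{"456"}, 4)
    fmt.Printf("row users/456 always maps to shard %d\n", shard)
}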
Change Serialization
Changes are serialized for transmission using CBOR (Concise Binary Object Representation):
{
  "FromNodeId": 1,
  "Payload": {
    "Id": 123,
    "Type": "insert",
    "TableName": "users",
    "Row": {
      "id": 456,
      "name": "John Doe",
      "email": "john@example.com"
    }
  }
}
Optional compression using zstd can be enabled to reduce bandwidth usage.
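As a rough illustration, the Go snippet below serializes a change with CBOR and compresses it with zstd; the library choices (fxamacker/cbor and klauspost/compress) and the struct layout are assumptions for the example, not necessarily what HarmonyLite uses internally.

package main

import (
    "fmt"

    "github.com/fxamacker/cbor/v2"
    "github.com/klauspost/compress/zstd"
)

// ChangeEvent mirrors the logical structure shown above.
type ChangeEvent struct {
    FromNodeId uint64
    Payload    struct {
        Id        int64
        Type      string
        TableName string
        Row       map[string]any
    }
}

func main() {
    var ev ChangeEvent
    ev.FromNodeId = 1
    ev.Payload.Id = 123
    ev.Payload.Type = "insert"
    ev.Payload.TableName = "users"
    ev.Payload.Row = map[string]any{"id": 456, "name": "John Doe", "email": "john@example.com"}

    // Encode the change as CBOR...
    raw, err := cbor.Marshal(ev)
    if err != nil {
        panic(err)
    }

    // ...and optionally compress it with zstd before publishing.
    enc, _ := zstd.NewWriter(nil)
    compressed := enc.EncodeAll(raw, nil)
    fmt.Printf("cbor: %d bytes, zstd: %d bytes\n", len(raw), len(compressed))
}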
NATS JetStream Integration
HarmonyLite uses NATS JetStream for reliable message delivery:
- Stream Creation: Creates persistent streams for each shard
- Publication: Publishes changes to the appropriate stream
- Acknowledgment: Waits for confirmation from NATS
- State Update: Marks changes as published in the database (state = 1)
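The sketch below shows this flow with the nats.go client; the stream name, subject, and limits are placeholders chosen to echo the configuration options covered later, not HarmonyLite's exact values.

package main

import (
    "github.com/nats-io/nats.go"
)

func main() {
    // Connect to a NATS server (URL is a placeholder).
    nc, err := nats.Connect("nats://localhost:4222")
    if err != nil {
        panic(err)
    }
    defer nc.Close()

    js, err := nc.JetStream()
    if err != nil {
        panic(err)
    }

    // Ensure a persistent stream exists for this shard.
    if _, err := js.AddStream(&nats.StreamConfig{
        Name:     "harmonylite-changes-1",
        Subjects: []string{"harmonylite-changes-1"},
        Replicas: 3,
        MaxMsgs:  1024,
    }); err != nil {
        panic(err)
    }

    // Publish a serialized change; the returned ack is the signal to
    // mark the change as published (state = 1) in the local database.
    if _, err := js.Publish("harmonylite-changes-1", []byte("cbor-encoded change")); err != nil {
        panic(err)
    }
}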
Change Application
Message Flow
The complete flow from one node to another: the originating node's triggers record the change in the change log tables, the poller picks up the pending entry, serializes it (CBOR, optionally zstd-compressed), and publishes it to the JetStream shard selected by its hash; every other node consumes the message from that stream and applies it locally, as detailed below.
Processing on the Receiving Node
When a node receives a change message:
- Origin Check: Verifies the change didn't originate from this node (prevents loops)
- Duplicate Check: Ensures the change hasn't been applied already
- Table Existence: Confirms the target table exists
- Operation Parsing: Determines the operation type (INSERT, UPDATE, DELETE)
- SQL Generation: Constructs appropriate SQL statements
- Statement Execution: Applies the change to the local database
- Sequence Tracking: Updates the sequence map for recovery purposes
SQL Statement Generation
HarmonyLite generates SQL statements based on the operation type:
- INSERT: INSERT INTO table (col1, col2) VALUES (?, ?)
- UPDATE: UPDATE table SET col1 = ?, col2 = ? WHERE id = ?
- DELETE: DELETE FROM table WHERE id = ?
These statements are executed using SQLite's native APIs.
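For illustration, the Go sketch below applies a replicated insert with a parameterized statement via database/sql and the mattn/go-sqlite3 driver; the driver choice, column ordering, and use of INSERT OR REPLACE are assumptions for the example rather than HarmonyLite's exact code.

package main

import (
    "database/sql"
    "fmt"
    "sort"
    "strings"

    _ "github.com/mattn/go-sqlite3" // illustrative driver choice
)

// applyInsert builds a parameterized INSERT from a replicated row map
// and executes it against the local database.
func applyInsert(db *sql.DB, table string, row map[string]any) error {
    cols := make([]string, 0, len(row))
    for c := range row {
        cols = append(cols, c)
    }
    sort.Strings(cols) // deterministic column order

    ph := make([]string, len(cols))
    args := make([]any, len(cols))
    for i, c := range cols {
        ph[i] = "?"
        args[i] = row[c]
    }

    query := fmt.Sprintf("INSERT OR REPLACE INTO %s (%s) VALUES (%s)",
        table, strings.Join(cols, ", "), strings.Join(ph, ", "))
    _, err := db.Exec(query, args...)
    return err
}

func main() {
    db, _ := sql.Open("sqlite3", ":memory:")
    defer db.Close()
    db.Exec(`CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)`)

    err := applyInsert(db, "users", map[string]any{
        "id": 456, "name": "John Doe", "email": "john@example.com",
    })
    fmt.Println("applied:", err == nil)
}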
Conflict Resolution
Last-Writer-Wins Strategy
HarmonyLite uses a timestamp-based last-writer-wins (LWW) strategy for conflict resolution:
- Timestamp Comparison: Changes include microsecond-precision timestamps
- Latest Wins: When two nodes modify the same row, the later change prevails
- Node ID Tiebreaker: If timestamps are identical, higher node ID wins
This approach ensures eventual consistency but may result in lost updates in high-conflict scenarios.
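A minimal Go sketch of the decision logic, assuming the timestamp and node ID travel with each change:

package main

import "fmt"

// Change carries the metadata used for last-writer-wins resolution.
type Change struct {
    NodeID    uint64
    Timestamp int64 // timestamp recorded in the change log
}

// wins reports whether incoming should replace current: the later
// timestamp wins, and on an exact tie the higher node ID wins.
func wins(incoming, current Change) bool {
    if incoming.Timestamp != current.Timestamp {
        return incoming.Timestamp > current.Timestamp
    }
    return incoming.NodeID > current.NodeID
}

func main() {
    a := Change{NodeID: 1, Timestamp: 1000}
    b := Change{NodeID: 2, Timestamp: 1050}
    fmt.Println(wins(b, a)) // true: b is later, so it prevails
}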
Conflict Scenarios
Consider this example scenario:
- Node 1 updates a user's email to "john@example.com" at timestamp 1000
- Node 2 updates the same user's email to "john@company.com" at timestamp 1050
- Both changes replicate to all nodes
- The final state will be "john@company.com" (from Node 2) because it has a later timestamp
Multi-Column Conflicts
If different nodes update different columns of the same row:
- Node 1 updates a user's email at timestamp 1000
- Node 2 updates the same user's phone at timestamp 1050
- Column-level merging does not occur
- The entire row from Node 2 replaces the row from Node 1
- The email update from Node 1 is lost
Replication Configuration
Sharding Configuration
[replication_log]
# Number of shards for parallel processing
shards = 4
# Maximum entries per stream before cleanup
max_entries = 1024
# Number of stream replicas for fault tolerance
replicas = 3
# Enable compression for network efficiency
compress = true
Increasing shards improves parallel processing but requires more NATS resources.
Write-Only and Read-Only Nodes
HarmonyLite supports specialized node configurations:
# Write-only node (doesn't apply incoming changes)
publish = true
replicate = false
# Read-only node (doesn't publish changes)
publish = false
replicate = true
These settings enable custom topologies like hub-and-spoke or primary-replica.
Performance Considerations
Optimizing Replication Performance
To improve replication performance:
- Increase Shards for parallel processing:
  [replication_log]
  shards = 8 # More shards for higher throughput
- Enable Compression for bandwidth efficiency:
  [replication_log]
  compress = true
- Tune Cleanup Interval for reduced overhead:
  # Time in milliseconds between change log cleanup runs
  cleanup_interval = 60000 # 1 minute
Transaction Batching
When making multiple changes, use transactions to reduce overhead:
BEGIN TRANSACTION;
INSERT INTO users (name, email) VALUES ('John', 'john@example.com');
INSERT INTO users (name, email) VALUES ('Jane', 'jane@example.com');
COMMIT;
This reduces trigger overhead and improves replication efficiency.
Network Considerations
For optimal performance:
- Low-Latency Connections: Keep network latency below 50ms when possible
- Sufficient Bandwidth: Ensure adequate bandwidth for your change volume
- Reliable Network: Use retry mechanisms for unreliable connections
- Regional Clustering: Group nodes geographically for better performance
Replication Guarantees and Limitations
What HarmonyLite Guarantees
HarmonyLite provides these guarantees:
- Eventual Consistency: All nodes eventually reach the same state
- Change Preservation: Changes are not lost if a node is temporarily offline
- Order Preservation: Changes to the same row are applied in order
- Fault Tolerance: The system continues to function despite node failures
Limitations
Important limitations to be aware of:
- No Strong Consistency: There's a window where nodes have different states
- No Cross-Row Transactions: Transactions spanning multiple rows or tables may not be applied atomically across nodes
- Last-Writer-Wins only: No custom conflict resolution strategies
- No Quorum Enforcement: Updates proceed without waiting for acknowledgment from other nodes
Advanced Replication Patterns
Multi-Region Deployment
For global deployments, configure NATS gateways between regions:
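At the NATS layer this is typically done with gateway connections between per-region clusters; the snippet below is a minimal illustration with placeholder names, ports, and URLs (see the NATS documentation for the full syntax).

# region-a.conf (placeholder values)
server_name: nats-region-a
gateway {
  name: region-a
  port: 7222
  gateways: [
    { name: region-b, urls: ["nats://nats.region-b.example.com:7222"] }
  ]
}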
This setup enables efficient cross-region replication with local write capability.
Hub-and-Spoke Pattern
For edge computing scenarios:
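One way to wire this at the NATS layer is with leaf nodes, where each edge site runs a local server that connects upstream to a central hub, while the publish/replicate settings shown earlier control which nodes originate or apply changes. The snippet below is an illustrative sketch with placeholder values.

# edge-site.conf (placeholder values): the spoke's local NATS server
# connects to the central hub as a leaf node
leafnodes {
  remotes: [
    { url: "nats://hub.example.com:7422" }
  ]
}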
Read Scaling Pattern
For read-heavy workloads:
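A common arrangement pairs one or more writable nodes with a pool of read-only replicas, using the node-role settings shown earlier; the values below are an illustrative sketch.

# Writable node(s): accept application writes and publish changes
publish = true
replicate = true

# Read replicas: apply incoming changes but never publish their own
publish = false
replicate = true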
Conclusion
HarmonyLite's replication mechanism provides a flexible, efficient way to maintain database consistency across distributed nodes. By understanding these internals, you can better optimize, monitor, and troubleshoot your deployment.
For more information on related topics, see:
- Snapshots - How database state is preserved and restored
- NATS Configuration - Details on messaging configuration
- Production Deployment - Best practices for production