- Write Throughput Comparison Between Bigtable and Spanner
Both Bigtable and Spanner scale horizontally: adding nodes increases total throughput roughly linearly. Based on the performance numbers from the sources you shared:
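- Bigtable: up to 10,000 writes per second per node.
- Spanner: up to 3,500 writes per second per node. The 22,500 figure is also a per-node number that depends on configuration; in Google's published guidance it appears to be the per-node read rate for regional configurations rather than a write rate. Adding nodes raises total throughput, not the per-node rate.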
From these numbers, Spanner can reach a higher total write throughput than a given Bigtable cluster, but only by provisioning more nodes. Both systems scale by adding nodes, so total write throughput is roughly the per-node rate multiplied by the node count. At 3,500 writes per second per node against Bigtable’s 10,000, Spanner needs roughly three times as many nodes (10,000 / 3,500 ≈ 2.9) to sustain the same write rate. Bigtable therefore has the higher per-node write capacity, and Spanner surpasses it only if you provision sufficient resources.
Therefore, if write throughput is a critical factor, Spanner can deliver it with enough nodes, and it is most attractive when you need both throughput and consistency. However, Spanner is typically more expensive and more complex to configure for high-throughput workloads than Bigtable.
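To make the node arithmetic concrete, here is a minimal sketch that estimates how many nodes each system would need for a target write rate. It assumes perfectly linear scaling and uses only the per-node write figures quoted above; the 100,000 writes per second target and the `nodes_needed` helper are hypothetical, chosen purely for illustration.

```python
import math

# Per-node write figures quoted above (writes per second per node).
BIGTABLE_WRITES_PER_NODE = 10_000
SPANNER_WRITES_PER_NODE = 3_500


def nodes_needed(target_writes_per_sec: int, writes_per_node: int) -> int:
    """Smallest node count whose combined capacity meets the target.

    Assumes throughput scales linearly with node count, which is an
    idealization; real workloads depend on row size, schema, and hotspots.
    """
    return math.ceil(target_writes_per_sec / writes_per_node)


target = 100_000  # hypothetical workload, not taken from any source
bigtable_nodes = nodes_needed(target, BIGTABLE_WRITES_PER_NODE)
spanner_nodes = nodes_needed(target, SPANNER_WRITES_PER_NODE)
print(f"Bigtable: {bigtable_nodes} nodes, Spanner: {spanner_nodes} nodes")
```

Under these assumptions the same target takes 10 Bigtable nodes or 29 Spanner nodes, which is where the cost difference comes from.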
- Consistency vs. Atomicity in Bigtable
Your understanding is largely correct. Let’s clarify:
- Strong Consistency: Bigtable provides strong consistency for single-row reads and writes. When you operate on a single row, a read that follows a write sees that write immediately.
- Atomicity: Writes are atomic within a single row but not across rows. If you send a batch that touches several rows, each row’s write is applied atomically on its own, but the batch as a whole is not: some rows can succeed while others fail.
- Eventual Consistency: You are correct that Bigtable is only eventually consistent when you replicate data across clusters in different regions. In a replicated, multi-cluster setup, there can be slight delays in synchronization between clusters, so a read from one cluster may briefly miss a write made to another. Within a single cluster, Bigtable maintains strong consistency at the row level.
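The difference between per-row atomicity and batch atomicity can be sketched with a toy in-memory model. This is not the Bigtable client library: `ToyTable`, its methods, and the rule that a `None` value makes a row mutation fail are all hypothetical, invented only to show a batch in which some rows succeed and others do not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for a table with per-row atomic writes."""

    rows: dict = field(default_factory=dict)

    def mutate_row(self, key: str, cells: dict) -> bool:
        """Apply all cells to one row, or none of them."""
        # Hypothetical failure rule: any None value rejects the whole row.
        if any(value is None for value in cells.values()):
            return False  # the row is left untouched
        updated = dict(self.rows.get(key, {}))
        updated.update(cells)
        self.rows[key] = updated  # the row changes in one step
        return True

    def mutate_rows(self, batch: dict) -> dict:
        """Apply each row independently; a failure does not roll back others."""
        return {key: self.mutate_row(key, cells) for key, cells in batch.items()}


table = ToyTable()
results = table.mutate_rows({
    "user#1": {"name": "Ada", "plan": "pro"},
    "user#2": {"name": "Bob", "plan": None},  # fails as a unit
    "user#3": {"name": "Cy", "plan": "free"},
})
print(results)
print(sorted(table.rows))
```

In this model `user#1` and `user#3` are written while `user#2` is rejected in full, so the batch ends up partially applied. A caller that needs all-or-nothing behavior across rows has to check the per-row results and retry or compensate, or use a system with multi-row transactions such as Spanner.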