Migrate Cassandra workloads to Bigtable for a fully managed, cloud-native experience
Migrating your self-managed Cassandra workloads to the cloud can feel daunting. You need the power, scalability, and reliability of a managed service, but also a migration path that minimizes disruption and code rewrites. This year at Google Cloud Next ’25, we announced the Cassandra to Bigtable proxy adapter and client for Java. These new tools not only enable a lift-and-shift path to Bigtable but also pave the way for a zero-downtime migration to Bigtable from Cassandra and Cassandra-compatible databases like ScyllaDB, Amazon Keyspaces, and Azure Cosmos DB.
Minimal code changes, maximum impact
At the heart of this new offering is a seamless developer experience. We provide two ways to streamline a Cassandra to Bigtable migration.
First, the Cassandra to Bigtable proxy adapter runs as a standalone process and serves as a seamless translation layer between your Cassandra-based app and Bigtable. Second, the Cassandra to Bigtable client for Java is a drop-in replacement for your existing Cassandra Java driver and runs in the same process as your application.
This means you can unlock the benefits of Bigtable with minimal modifications to your application code. Simply update your connection configuration and let the adapter handle the rest.
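To illustrate how small that change can be, here is a hypothetical before/after of an application's connection settings. The host names and keyspace are invented for the example; only the contact points change, while the queries and the rest of the configuration stay as they were:

```python
# Hypothetical connection settings before the migration.
before = {
    "contact_points": ["cassandra-node-1.internal", "cassandra-node-2.internal"],
    "port": 9042,
    "keyspace": "zdmbigtable",
}

# After: point the driver at the Cassandra to Bigtable proxy adapter instead.
after = dict(before)
after["contact_points"] = ["bigtable-proxy.internal"]

# Every setting except the contact points is untouched.
unchanged = {k: v for k, v in before.items() if after[k] == v}
```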
A familiar face with a powerful engine under the hood
The Cassandra to Bigtable proxy adapter allows for the continued use of Cassandra Query Language (CQL). This approach enables developers to utilize their existing application code, as the proxy adapter acts as a transparent intermediary, translating your CQL queries into Bigtable’s API and giving you access to the raw performance of its underlying infrastructure.
The managed experience you deserve
While your developers enjoy the continuity of the Cassandra API, you gain all the advantages of a fully managed NoSQL database. Bigtable’s proven track record of high availability, automated backups, and seamless scaling is now at your fingertips.
- Linear scalability: Bigtable scales virtually infinitely, handling petabytes of data and millions of operations per second; simply add more nodes to get more throughput.
- Performance: Bigtable offers single-digit millisecond latency for read and write operations, resulting in significantly improved user experience and application responsiveness.
- Global reach: Leverage Google’s global network. Bigtable instances can span up to 8 regions, letting you distribute your data closer to your users, reducing latency and improving performance.
- High availability: Bigtable delivers a 99.999% availability SLA, ensuring virtually uninterrupted service and maximum uptime.
- Cost-effectiveness: Benefit from Bigtable’s efficient decoupling of storage and compute, which allows rapid adjustment to your traffic while removing the cost of idle resources.
- Beyond CQL: You can run CQL and GoogleSQL side by side to use other Bigtable features like distributed counters, continuous materialized views, and change streams. Your Cassandra workloads can easily integrate with the Google Cloud ecosystem, including services like BigQuery and Vertex AI.
Your zero downtime path to Bigtable
With the Cassandra to Bigtable proxy adapter, we’ve removed the friction, empowering you to quickly and easily migrate your Cassandra workloads to the cloud. This unlocks the power of Bigtable while retaining the familiar development experience your teams already rely on.
This migration strategy leverages a suite of proxy tools to ensure a seamless transition from Cassandra to Bigtable with no application downtime and minimal code changes. The key components are:
- Cassandra to Bigtable proxy adapter: Enables your application to communicate with Bigtable using CQL.
- Zero Downtime Migration (ZDM) proxy: Sits between your application and databases, managing dual writes and traffic routing.
- Cassandra Data Migrator (CDM): Used for the bulk transfer of existing historical data.
At a high level, the migration from Cassandra to Bigtable can be broken down into four steps:
- Connect your Cassandra application to the ZDM proxy.
- Enable dual writes to Cassandra and Bigtable.
- Move data in bulk using CDM.
- After validation, cut over to Bigtable.
When using the proxy adapter with the ZDM proxy tool, the following migration capabilities are supported:
- Dual writes to maintain data availability during migration.
- Asynchronous reads to scale and stress-test your Bigtable instance.
- Automated data verification and reporting to ensure data integrity.
- Data mapping to map fields and data types to meet production standards.
Step 1: Prepare your Bigtable target instance
Before initiating the migration, you need a Bigtable instance ready to receive your data.
Create a Bigtable instance: In your Google Cloud project, set up a new Bigtable instance. Define an instance ID (e.g., zdmbigtable) and a display name, and configure a cluster with the appropriate storage type and node count for your workload.
Note: The actual Bigtable table schema will be created later via the Cassandra-Bigtable proxy adapter.
Step 2: Set up the Cassandra-Bigtable proxy adapter
The proxy adapter translates CQL queries from your application into Bigtable API calls, making Bigtable appear as a Cassandra node.
Deploy the proxy adapter: The Cassandra-Bigtable proxy can be deployed on a virtual machine, or as a container on Kubernetes.
Configure the proxy: The core configuration is done in a config.yaml file. Key parameters include:
- projectId: Your Google Cloud project ID.
- instanceIds: The ID of your target Bigtable instance (e.g., zdmbigtable).
- port: The port the proxy will listen on for CQL connections (typically 9042).
cassandraToBigtableConfigs:
  # [Optional] Global default GCP project ID
  projectId: YOUR_GCP_PROJECT_ID

listeners:
- name: cluster1
  port: PORT_NUMBER
  bigtable:
    # [Optional] Skip if it is the same as the global projectId.
    projectId: YOUR_GCP_PROJECT_ID
    # To use multiple instances, pass the instance IDs comma-separated.
    # Instance IDs should not contain any special characters except underscore (_).
    instanceIds: GCP_BIGTABLE_INSTANCE_ID
Start the proxy: Launch the proxy as its own standalone process.
Define schema in Bigtable: Connect to the Cassandra-Bigtable Proxy using a CQL tool (like cqlsh). Execute CREATE TABLE statements for your Cassandra tables. The proxy will translate these and create the corresponding tables and column families in Bigtable.
-- Example CQL to create a table via the proxy
CREATE KEYSPACE zdmbigtable WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE zdmbigtable.employee (
    name text PRIMARY KEY,
    age bigint,
    code int,
    credited double,
    balance float,
    is_active boolean,
    birth_date timestamp
);
This step ensures your Bigtable schema matches your Cassandra schema, as understood by CQL.
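To build intuition for what the proxy does with such a schema, here is an illustrative sketch of one plausible row mapping: primary key columns joined into a Bigtable row key, with the remaining columns stored as column qualifiers. The actual mapping is internal to the proxy adapter; the function name and the `#` separator below are assumptions for illustration only.

```python
def to_bigtable_mutation(primary_key_cols, row):
    """Split a CQL row dict into a (row_key, {qualifier: value}) pair.

    Illustrative only: joins primary key values into a row key and
    treats every non-key column as a cell in a column family.
    """
    row_key = "#".join(str(row[c]) for c in primary_key_cols)
    cells = {c: v for c, v in row.items() if c not in primary_key_cols}
    return row_key, cells

# A row from the example employee table above.
row_key, cells = to_bigtable_mutation(
    ["name"],
    {"name": "alice", "age": 34, "is_active": True},
)
```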
Step 3: Deploy and configure the Datastax ZDM proxy
The ZDM proxy is the central piece for managing the live migration, enabling dual writes and controlled traffic cutover.
Deployment: The ZDM proxy is deployed using Ansible on dedicated VMs (a “jumphost” for orchestration and one or more “proxy nodes” that handle traffic). See the DataStax ZDM documentation (“Phases of the Zero Downtime Migration process”) for how to deploy the proxy.
Configure ZDM Proxy: The main configuration file (zdm_proxy_cluster_config.yml for the Ansible deployment) needs to know about your origin (Cassandra) and target (Bigtable, via the Cassandra-Bigtable proxy).
##############################
#### ORIGIN CONFIGURATION
##############################
## Origin credentials (leave commented if no auth)
# origin_username: …
# origin_password: …
## Set the following two parameters only if Origin is a self-managed, non-Astra cluster
origin_contact_points: # Replace!
origin_port: 9042
##############################
#### TARGET CONFIGURATION
##############################
## Target credentials (leave commented if no auth)
# target_username: …
# target_password: …
## Set the following two parameters only if Target is a self-managed, non-Astra cluster
target_contact_points: # Replace!
target_port: 9042
Start the ZDM proxy: Run the Ansible playbook from the ansible directory on the jumphost.
ansible-playbook deploy_zdm_proxy.yml -i zdm_ansible_inventory
Step 4: Configure application and initiate dual writes
With the ZDM Proxy running, you can now route application traffic through it.
Update your application configuration: Modify your application’s database connection settings to point to the ZDM Proxy node’s IP address and port (e.g., :9042).
How dual writes work:
- Reads: By default, the ZDM Proxy serves read requests from your origin Cassandra database.
- Writes: All write operations (inserts, updates, deletes) are sent by the ZDM Proxy to both the origin Cassandra database and the target Bigtable instance (via the Cassandra-Bigtable Proxy). This ensures data consistency and allows your application to function without interruption while new data populates Bigtable.
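This routing behavior can be sketched in a few lines, with in-memory dicts standing in for the two databases. The class name and the flag are illustrative labels of our own, not ZDM Proxy configuration values:

```python
class ZdmProxySketch:
    """Toy model of the ZDM Proxy's dual-write and read-routing behavior."""

    def __init__(self):
        self.origin = {}               # stands in for Cassandra
        self.target = {}               # stands in for Bigtable (via the CQL proxy)
        self.read_from_target = False  # flipped during the cutover phase

    def write(self, key, value):
        # Writes always land in both clusters during the migration.
        self.origin[key] = value
        self.target[key] = value

    def read(self, key):
        # Reads are served from the origin until cutover.
        source = self.target if self.read_from_target else self.origin
        return source.get(key)

proxy = ZdmProxySketch()
proxy.write("emp:1", {"name": "alice", "age": 34})
```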
Step 5: Migrate historical data with Cassandra Data Migrator (CDM)
While dual writes handle new and updated data, the CDM tool is used to migrate the bulk of existing historical data.
Deploy CDM: Set up the Cassandra Data Migrator on a VM with Apache Spark installed (Spark is a dependency for CDM).
Configure CDM: Create a cdm.properties file to specify connection details and migration parameters. (CDM has a default template that you can use).
# Origin Cassandra Connection
spark.cdm.connect.origin.host=
spark.cdm.connect.origin.port=9042
# spark.cdm.connect.origin.username=…
# spark.cdm.connect.origin.password=…
# Target Bigtable (via Cassandra-Bigtable Proxy) Connection
spark.cdm.connect.target.host=
spark.cdm.connect.target.port=9042
# spark.cdm.connect.target.username=…
# spark.cdm.connect.target.password=…
# Important for Bigtable compatibility via Proxy
spark.cdm.feature.origin.ttl.automatic=false
spark.cdm.feature.origin.writetime.automatic=false
spark.cdm.feature.target.ttl.automatic=false
spark.cdm.feature.target.writetime.automatic=false
# Specify keyspace and table (can also be passed via command line)
# spark.cdm.schema.keyspace.table=<keyspace_name>.<table_name>
Run the migration job: Execute the CDM tool using spark-submit, pointing to your properties file and the CDM JAR file. This will read data from Cassandra and write it to Bigtable through the Cassandra-Bigtable proxy.
spark-submit --properties-file cdm.properties --class com.datastax.cdm.job.Migrate path/to/cassandra-data-migrator.jar
Verify data migration: Once the CDM job completes, query Bigtable (e.g., via cqlsh connected to the ZDM Proxy or Cassandra-Bigtable Proxy, or using the cbt tool) to ensure all historical data has been transferred correctly. Compare row counts and sample data.
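As a hedged sketch of that verification idea, the check below compares row counts and a deterministic random sample of rows, with plain dicts standing in for the query results from the two databases. The function name and parameters are our own, not part of CDM:

```python
import random

def verify(origin_rows, target_rows, sample_size=100, seed=0):
    """Compare row counts, then spot-check a reproducible sample of rows."""
    if len(origin_rows) != len(target_rows):
        return False, "row count mismatch"
    keys = random.Random(seed).sample(
        sorted(origin_rows), min(sample_size, len(origin_rows))
    )
    for key in keys:
        if target_rows.get(key) != origin_rows[key]:
            return False, f"row {key} differs"
    return True, "ok"

# Example: an identical copy passes verification.
origin = {f"emp:{i}": {"age": i} for i in range(1000)}
ok, detail = verify(origin, dict(origin))
```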
Step 6: The cutover phase
After all historical data is migrated and dual writes have been running, and you’ve thoroughly verified data consistency between Cassandra and Bigtable, you can plan the final cutover.
- Shift reads to target: Reconfigure the ZDM Proxy to serve read traffic primarily from the target (Bigtable). This is a configuration change within the ZDM Proxy, detailed in its documentation. This effectively makes Bigtable the primary database for your application.
- Monitor performance: Closely monitor your application and Bigtable performance after shifting reads.
- Cease dual writes: Once confident that Bigtable is serving all traffic correctly, reconfigure the ZDM Proxy to stop writing to the origin Cassandra cluster. Now, all writes go only to Bigtable.
- Decommission: Eventually, you can decommission your original Cassandra cluster. You may also choose to remove the ZDM Proxy from the architecture and have your application connect directly to the Cassandra-Bigtable Proxy. For applications that can use a native Bigtable client (e.g., Java applications), you might consider transitioning to the Bigtable client for optimized performance and features, potentially removing the Cassandra-Bigtable proxy as well.
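The cutover sequence above boils down to an ordered set of phases that must not be skipped. This small sketch simply encodes that order; the phase labels are our own shorthand, not ZDM Proxy settings:

```python
# Cutover phases in the order described above, from dual writes with reads
# on the origin through to decommissioning the origin cluster.
PHASES = [
    "dual_writes_reads_from_origin",
    "dual_writes_reads_from_target",
    "writes_to_target_only",
    "origin_decommissioned",
]

def advance(current):
    """Return the next phase, refusing to skip ahead or go past the end."""
    i = PHASES.index(current)
    if i + 1 >= len(PHASES):
        raise ValueError("already at the final phase")
    return PHASES[i + 1]
```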
Get started today
To learn more about how Bigtable can take your operation to the next level with a fully managed, cloud-native experience, visit https://cloud.google.com/bigtable. To start your seamless migration today using the Cassandra to Bigtable proxy adapter, read our detailed migration guide.