Introduction
You have locked down your Personally Identifiable Information (PII) in Cloud SQL using client-side encryption. This is a huge win for security and compliance; by encrypting data before it even hits the database, you ensure that the database engine itself never sees the plaintext keys.
But then the request comes in from the data team: “We need to run analytics on that data.”
This creates a classic tension between security and utility. If the database can’t read the data, how can you query it? Do you have to decrypt everything into a temporary location (insecure) or pull it all back to the application layer to process (slow)?
In this post, we will walk through a secure, scalable architecture that solves this problem. We will build a system that allows your application to write encrypted data to Cloud SQL, and enables BigQuery to read and decrypt that data on-the-fly for analytics, all while using the same cryptographic keys managed by Cloud Key Management Service (Cloud KMS).
Understanding the architecture
To make this work, we need two distinct components that share a common “language” (cryptosystem):
-
The Writer (Your application): A Java application that uses the Tink crypto library to encrypt data before inserting it into Cloud SQL.
-
The Reader (The analytics engine): A BigQuery setup that pulls the encrypted data using a Federated Query, passes it to a BigQuery Remote Function (running on Cloud Run), and returns the decrypted text for analysis.
Here is what the high-level flow looks like:
Core concepts: Envelope encryption
Before we look at the code, it is helpful to understand the cryptographic method we are using: Envelope Encryption.
If you encrypted every single row in your database with the same key, and that key was compromised, you would lose everything. Envelope encryption solves this by using two different types of keys:
-
Data Encryption Key (DEK): A key generated locally to encrypt a specific piece of data.
-
Key Encryption Key (KEK): A master key that lives centrally in Cloud KMS. This key is used to encrypt (or “wrap”) the DEK.
The encryption process (Writing) When your app writes data, Tink generates a fresh DEK, encrypts the data, and then asks Cloud KMS to use the KEK to wrap that DEK. The database stores the encrypted data and the wrapped DEK. The KEK never leaves Cloud KMS.
The decryption process (Reading) When BigQuery needs to read the data, it sends the wrapped DEK back to Cloud KMS. Cloud KMS unwraps it, returns the valid DEK to the application (in our case, the Remote Function), and the application uses it to decrypt the PII.
The scenario: Securing an online store
Let’s imagine we are running an online store. We record sales transactions in Cloud SQL, but fields like full_name and email_address are sensitive. Our Information Security team requires client-side encryption using the AES symmetric cipher.
However, the VP of Sales just asked for a report on our top 10 customers by total spend. Here is how we build the solution.
Step 1: Encrypting data in the application
First, we need to configure our Java application to use Tink with Cloud KMS. We are using an encryption scheme called AEAD (Authenticated Encryption with Associated Data). This allows us to bind the encrypted data to a specific context (like a transaction ID), ensuring that the encrypted data can’t be “cut and pasted” into a different row.
Here is the initialization code. Notice we only provide the KMS Key URI and the specific AES template we want to use:
import com.google.crypto.tink.Aead;
import com.google.crypto.tink.KmsClient;
import com.google.crypto.tink.aead.AeadConfig;
import com.google.crypto.tink.aead.AeadKeyTemplates;
import com.google.crypto.tink.aead.KmsEnvelopeAead;
import com.google.crypto.tink.integration.gcpkms.GcpKmsClient;
import java.security.GeneralSecurityException;
public class CloudKmsEnvelopeAead {
public static Aead get(String kmsUri) throws GeneralSecurityException {
AeadConfig.register();
// Create a new KMS Client
KmsClient client = new GcpKmsClient().withDefaultCredentials();
// Create an AEAD primitive using the Cloud KMS key
Aead gcpAead = client.getAead(kmsUri);
// Create an envelope AEAD primitive.
// This key acts as the KEK for the locally generated DEKs.
return new KmsEnvelopeAead(AeadKeyTemplates.AES128_GCM, gcpAead);
}
}
Next, we encrypt the PII (data[1]). We use the customer’s UUID (data[0]) as the “associated data” for the AEAD scheme to ensure integrity.
// Encrypt customer_id with uuid as the associated data
byte encryptedCustomerId = envAead.encrypt(data[1].getBytes(), data[0].getBytes());
salesStmt.setBytes(2, encryptedCustomerId);
// Encrypt full_name
byte encryptedFullName = envAead.encrypt(data[2].getBytes(), data[0].getBytes());
salesStmt.setBytes(3, encryptedFullName);
// Encrypt email_address
byte encryptedEmailAddress = envAead.encrypt(data[3].getBytes(), data[0].getBytes());
salesStmt.setBytes(4, encryptedEmailAddress);
At this point, the data in Cloud SQL is unreadable ciphertext. If you ran a standard SELECT * statement in the database console, you would just see garbled bytes.
Step 2: Decrypting for analytics with BigQuery
Now we need to fulfill the request from the Sales VP. Since BigQuery can’t natively decrypt this custom Tink-encrypted data, we use BigQuery Remote Functions. This feature allows BigQuery to call out to a Cloud Run Function (which contains our decryption logic) during the query execution.
Setting up the environment I have created a GitHub repository here with bash scripts to help you stand up the infrastructure, including the Cloud Run Function code. Once you have deployed the resources from the repo, follow these steps:
-
Get your Endpoint: Navigate to the Cloud Run Functions console and copy the HTTP endpoint URL of your new remote function.
-
Prepare BigQuery: Create a dataset, sales_analysis, in the US multi-region.
Creating the remote function Run the following SQL in BigQuery to “teach” it about your decryption function. Replace the endpoint with the URL you copied earlier.
CREATE FUNCTION sales_analysis.aead_decrypt(x STRING, y STRING)
RETURNS STRING
REMOTE WITH CONNECTION us.remote_fx
OPTIONS (
endpoint = ‘https://REGION-PROJECT_ID.cloudfunctions.net/aead_decrypt’
)
Step 3: Running the federated query
Now for the magic. We will write a query that does three things:
-
Uses EXTERNAL_QUERY to fetch the encrypted rows from Cloud SQL.
-
Passes the encrypted fields (converted to Base64) and the transaction UUID (context) to our remote function.
-
Returns the decrypted plaintext results.
SELECT
sales_analysis.aead_decrypt(TO_BASE64(t1.customer_id), t1.purchase_uuid) AS decrypted_customer_id,
sales_analysis.aead_decrypt(TO_BASE64(t1.full_name), t1.purchase_uuid) AS decrypted_full_name,
sales_analysis.aead_decrypt(TO_BASE64(t1.email_address), t1.purchase_uuid) AS decrypted_email_address,
t1.TotalSpent
FROM
EXTERNAL_QUERY(
'us.crashmeyer-acme-connection',
'''SELECT purchase_uuid, customer_id, full_name, email_address, sum(total_price) as TotalSpent
FROM sales
GROUP BY purchase_uuid, customer_id, full_name, email_address
ORDER BY TotalSpent DESC LIMIT 10'''
) AS t1
The result BigQuery executes the logic, decrypts the fields on the fly using the Cloud Run Function (which accesses the Cloud KMS keys), and presents a standard table of results. The VP of Sales gets their list of top customers, and the data remains encrypted at rest in the operational database.
Conclusion
Client-side encryption often feels like a barrier to data utility, but it doesn’t have to be. By combining BigQuery’s federation capabilities with Remote Functions and Cloud KMS, you can maintain a strict security posture while still empowering your business with valuable insights.
Ready to try it out? The code snippets above are just the highlights. To build this in your own environment, check out the full source code and deployment scripts in the GitHub repository below.
Resources
Have questions about setting up the IAM permissions or configuring Tink? Leave a comment below!


