We recently saw a huge spike in our Archived_wal_log size and to debug the reason we created a clone of our Cloud SQL instance and replicated the recent production transaction. (mostly adding and deleting new rows)
PS: we have point in recovery enabled and set for 7 days so huge Archived_wal_log is expected. we are just trying to narrow down the reason for spike.
In the dashboard we can see that the WAL archiving is triggering as per our expectation:
Our expectation was that every successful WAL archiving will add 16 MB of data towards Archived_wal_log size and we should see a huge bump in this graph.
Questions:
Is that expectation incorrect?
what is the relationship between WAL archiving and Archived_wal_log ?
Is there a way to double check the Archived_wal_log size using gcloud CLI or some other place in GCP console ?
This explains why our production instance show the increase in WAL size on disk and the clones we created this week doesn’t show any change although WAL archiving is happening.
To confirm we ran:
gcloud sql instances describe INSTANCE_NAME
and for production it shows
transactionalLogStorageState: DISK
and for the clone
transactionalLogStorageState: CLOUD_STORAGE
Also, it is good to know that cloud storage doesn’t cost anything so we are planning move our production log storage to cloud storage as well.
*For instances having write-ahead logs stored in Cloud Storage, the logs are stored in the same region as the primary instance. This log storage (up to 35 days for Cloud SQL Enterprise Plus edition and seven days for Cloud SQL Enterprise edition, the maximum length for PITR) generates no additional cost per instance.*
would love a confirmation from any expert here though thanks.
we’ll update this thread once we move the production logs to cloud storage
The decision to move Cloud SQL instance’s WAL storage to Cloud Storage was made several reasons:
Cost Efficiency: As you noted, using Cloud Storage for WAL does not incur additional costs for the storage duration limits set by your Cloud SQL plan (up to 35 days for Enterprise Plus and 7 days for Enterprise editions).
Scalability and Performance: Cloud Storage offers better scalability and potentially improved I/O performance over traditional disk-based storage, which can be crucial for high transaction environments.
Reliability: Storing WAL in Cloud Storage can enhance data durability and availability, reducing risks associated with local disk failures.
Before you proceed with the migration, here are some steps to consider:
Backup and Testing: Ensure that comprehensive backups are in place before starting the migration. Testing the migration in a staging environment would also be advisable to preempt any potential issues.
Monitoring Post-Migration: After migrating to Cloud Storage, closely monitor the performance and storage metrics to ensure everything is functioning as expected without unforeseen costs or performance degradations.