Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2013): Lost connection to MySQL server during query at [1:15]
To resolve this issue, you can follow these steps:
Ensure that you have used valid credentials when creating the connection for Cloud SQL.
Check if the service account associated with the Cloud SQL connection has the Cloud SQL Client (roles/cloudsql.client) role. The service account follows the format service-PROJECT_NUMBER@gcp-sa-bigqueryconnection.iam.gserviceaccount.com.
Verify that the connection details for the Cloud SQL instance are correctly specified in the EXTERNAL_QUERY function.
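If the role turns out to be missing, one quick way to grant it is with gcloud. A minimal sketch, assuming PROJECT_ID and PROJECT_NUMBER are placeholders for your own project:

# Grant the BigQuery connection service account the Cloud SQL Client role.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcp-sa-bigqueryconnection.iam.gserviceaccount.com" \
  --role="roles/cloudsql.client"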
I've already checked all of these, and after I added the Cloud SQL Client role I get this error instead (caching_sha2_password):
Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2059): Authentication plugin 'caching_sha2_password' cannot be loaded: /usr/lib/plugin/caching_sha2_password.so: cannot open shared object file: No such file or directory at [1:15]
By any chance, can we bypass the requirement to have a Public IP enabled on the Cloud SQL instance? Our application team has decided to disable this option, considering it a security risk, and since our pipeline was using EXTERNAL_QUERY to source Cloud SQL data into BigQuery, our pipelines will break going forward.
Appreciate any workarounds or help in this regard.
Certainly! Disabling the Public IP on the Cloud SQL instance is a common security practice, and there are ways to connect to Cloud SQL from BigQuery without using a Public IP. Here are some alternatives:
1. Use Cloud SQL Proxy:
Cloud SQL Proxy provides a secure way to connect to Cloud SQL instances without a Public IP. You can set up a Cloud SQL Proxy in a Compute Engine instance or a GKE cluster within the same VPC as your Cloud SQL instance.
Here's a high-level overview of how to set it up (a rough command sketch follows this list):
Deploy Cloud SQL Proxy on a Compute Engine instance.
Configure the proxy to connect to your Cloud SQL instance.
Update your BigQuery connection to use the proxy's internal IP.
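As a rough sketch of the proxy piece (not a definitive setup: the download version, port, and PROJECT:REGION:INSTANCE connection name below are placeholders you would replace), on the Compute Engine VM you could run something like:

# Download the Cloud SQL Auth Proxy (pick a current release; v2.8.0 is only an example).
curl -o cloud-sql-proxy https://storage.googleapis.com/cloud-sql-connectors/cloud-sql-proxy/v2.8.0/cloud-sql-proxy.linux.amd64
chmod +x cloud-sql-proxy
# Listen on the VM's network interfaces and reach the instance over its private IP.
./cloud-sql-proxy --private-ip --address 0.0.0.0 --port 3306 PROJECT:REGION:INSTANCE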
2. Use Cloud SQL federation with Private IP:
Cloud SQL federation allows you to query Cloud SQL data directly from BigQuery without copying the data into BigQuery. This can be done using Private IP.
Here's how to set it up (an example query follows the list):
Enable Private IP on the Cloud SQL instance.
Create a connection in BigQuery using the Private IP.
Use the EXTERNAL_QUERY function in BigQuery to query the Cloud SQL data.
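For example, a federated query run from the bq CLI might look like the following (project, region, connection, and table names are placeholders):

# Query Cloud SQL through the connection resource; the first argument is project.region.connection_id.
bq query --use_legacy_sql=false \
  'SELECT * FROM EXTERNAL_QUERY("my-project.us.my-cloudsql-connection", "SELECT id, name FROM customers LIMIT 10");'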
If the above options are not suitable, you might consider using Google's Data Transfer Service to move data from Cloud SQL to BigQuery. This can be configured to use Private IP and provides a managed way to transfer data.
Each of these approaches has its own considerations in terms of complexity, performance, and security. It's essential to evaluate them in the context of your specific requirements and infrastructure.
Hey, I am facing this error
Invalid table-valued function EXTERNAL_QUERY Failed to connect to MySQL database. Error: MysqlErrorCode(2059): Authentication plugin 'mysql_clear_password' cannot be loaded: /usr/lib/plugin/mysql_clear_password.so: cannot open shared object file: No such file or directory at [1:15]
I have added the flag as you mentioned above, but I am still facing the same error. Are there any other possible problems that might be causing it?
The error indicates a mismatch between the authentication plugin your MySQL client expects and the one your Cloud SQL instance uses.
Potential Causes and Troubleshooting Steps:
Authentication Plugin Mismatch
MySQL 8.0 and beyond: Ensure your MySQL client libraries and tools support caching_sha2_password or mysql_native_password. Update them if necessary.
Server Configuration: If needed, adjust the default_authentication_plugin flag on your Cloud SQL instance, but proceed cautiously and research the implications first, as this is not always the best first step (a command sketch follows this troubleshooting list).
Permissions and Connectivity
Service Account Permissions: Verify your service account has the "Cloud SQL Client" role.
Firewall Rules: Ensure network and firewall rules allow communication between BigQuery and Cloud SQL. Check both Public IP and Private IP configuration (including VPC peering).
Propagation Delay and Restart
Wait for Changes: Allow time for configuration changes to take effect across your instance.
Consider a Restart: Restarting your Cloud SQL instance can sometimes force updated settings to fully propagate.
SSL Configuration
Enforce SSL: For security reasons, configure your connection to use SSL, especially if your current setup requires the less secure mysql_clear_password.
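If you do decide to adjust the server-side plugin mentioned under Authentication Plugin Mismatch, here is a hedged sketch of the two usual approaches; INSTANCE_NAME, PRIVATE_IP, and the user names are placeholders, and changing a database flag may restart the instance:

# Option A: set the instance-wide default plugin (supported flag on Cloud SQL for MySQL 8.0).
gcloud sql instances patch INSTANCE_NAME \
  --database-flags=default_authentication_plugin=mysql_native_password

# Option B: switch only the user that BigQuery connects with, from any MySQL client.
mysql -h PRIVATE_IP -u admin_user -p \
  -e "ALTER USER 'bq_user'@'%' IDENTIFIED WITH mysql_native_password BY 'new_password';"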
Additional Tips:
Gather Details: Provide your MySQL client version, Cloud SQL version, and connection method when seeking further assistance.
Documentation: Refer to the official Google Cloud documentation for in-depth guidance.
Support: If still facing issues, contact Google Cloud Support for personalized help.
In the above post from 08-03-2023, you mention several times entering the Cloud SQL instance's Private IP when creating the connection.
This seems to indicate an external connection can be made using an IP (and presumably a port), but I have only found references to using a Connection ID in the console, bq CLI, and REST API. I feel like I must have overlooked something. Could you point me in the right direction?
You are correct in noting that when creating a connection in BigQuery to access external data sources like Cloud SQL, you typically use a Connection ID rather than directly specifying an IP address and port in your queries. Let me clarify:
Connecting BigQuery to Cloud SQL Using Private IP
Securely access your Cloud SQL data from BigQuery using Private IP connections. Here's a step-by-step guide:
1. Enable Private IP on Cloud SQL:
In the Google Cloud Console, go to your Cloud SQL instance settings.
Ensure "Private IP" is enabled.
If needed, configure a VPC network for your instance.
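If you prefer the command line, the equivalent looks roughly like this (network, range, project, and instance names are placeholders; the private services access steps are only needed the first time you enable Private IP in a VPC):

# One-time setup: reserve an address range and peer it with Google's service network.
gcloud compute addresses create google-managed-services-range \
  --global --purpose=VPC_PEERING --prefix-length=16 --network=default
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-range --network=default
# Attach the instance to the VPC and drop its public IP.
gcloud sql instances patch INSTANCE_NAME \
  --network=projects/PROJECT_ID/global/networks/default --no-assign-ip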
2. Create a BigQuery Connection Resource:
In the BigQuery UI, navigate to your project's "Connections."
Click "+ CREATE CONNECTION" and choose "Cloud SQL connection."
Fill in these details:
Connection ID: A unique, descriptive name (e.g., "my-project-to-cloudsql").
Database type: MySQL or PostgreSQL (match your Cloud SQL instance).
Connection info: Enter your Cloud SQL instance's Private IP and database name.
Credentials: Provide the database username and password.
3. Use the Connection in BigQuery Queries:
Use the EXTERNAL_QUERY function, referencing your Connection ID:
SELECT * FROM EXTERNAL_QUERY("project_id.region.connection_id", "SELECT * FROM your_table");
Replace placeholders with your actual values.
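Since you asked about the bq CLI: the same connection resource can also be created there. A minimal sketch (project, location, instance connection name, database, and credentials are all placeholders); note that the CLI identifies the instance by its connection name in PROJECT:REGION:INSTANCE form rather than by an IP address:

# Create a Cloud SQL connection resource that EXTERNAL_QUERY can reference.
bq mk --connection \
  --connection_type='CLOUD_SQL' \
  --properties='{"instanceId":"my-project:us-central1:my-instance","database":"mydb","type":"MYSQL"}' \
  --connection_credential='{"username":"bq_user","password":"example-password"}' \
  --project_id=my-project --location=us \
  my_cloudsql_connection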
Security Considerations:
VPC Network: Ensure traffic is allowed between BigQuery and your Cloud SQL instance. If they're in different VPCs, set up VPC Network Peering.
IAM Permissions: Grant the BigQuery service account the roles/cloudsql.client role on your Cloud SQL instance.
Benefits of Using Private IP:
Enhanced Security: Data stays within Google's internal network, reducing exposure.
Improved Performance: Potential for lower latency and faster queries.
Thank you for responding. This feels so close. I feel like I'm not looking in quite the right spot for 2. Create a BigQuery Connection Resource, because what I see doesn't quite match up with your explanation. I'll include an image below, but here is how I got there:
Choose BigQuery from the "hamburger" menu next to Google Cloud. BigQuery Studio is the default.
Open the project where I am interested in adding the connection (dataanalytics-dev-330120 in the picture)
Tap + ADD next to Explorer (this is where I seem to be going off your script)
Tap Connections to external data sources
This is what I see after that. I see I can name this Connection ID, but I only see a Cloud SQL connection name field below, with no place to add the private IP (which would be the private IP of a Compute Engine instance running cloud_sql_proxy in my case).
Perhaps I don't have some service enabled on my project, or misread your instructions.
Thanks for providing the screenshot. The steps you've taken have been in BigQuery Studio, which is a newer interface that differs slightly from the classic BigQuery UI. While the overall concepts remain the same, the exact navigation and options can vary.
Scenario 1: Using BigQuery Studio
1. Enable Private IP on Cloud SQL:
Navigate to your Cloud SQL instance settings in the Google Cloud Console.
Ensure that "Private IP" is enabled, which may require configuring a VPC network if not already set up.
2. Create BigQuery Connection Resource:
In BigQuery Studio:
Click "ADD DATA" instead of "+ ADD" next to Explorer.
Select "Google Cloud" followed by "Cloud SQL".
Provide the following details:
Connection ID: A unique name for your connection.
Cloud SQL Instance Name: Select your instance from the dropdown list.
Database: Name of the database within your Cloud SQL instance.
Credentials: Input the username and password.
Scenario 2: Using Cloud SQL Proxy
1. Enable Private IP on Cloud SQL:
Same steps as in Scenario 1.
2. Configure Cloud SQL Proxy:
Deploy the Cloud SQL Proxy on a Compute Engine instance within the same VPC or a connected network environment.
Configure the proxy to use the Private IP of the Cloud SQL instance.
Note the port number on which the proxy is listening (default ports are typically 3306 for MySQL, 5432 for PostgreSQL).
3. Connect Applications to Cloud SQL Proxy:
Ensure that applications connecting to Cloud SQL through the proxy are on the same network or connected via VPC peering.
Direct connection settings in BigQuery Studio to the Cloud SQL instance do not apply here; instead, focus on application-level configurations that use the proxy setup.
Important Notes
VPC Peering: If your Compute Engine instance (running the Cloud SQL Proxy) and your BigQuery environment are in different VPCs, you'll need to establish VPC Network Peering.
IAM Permissions: The BigQuery service account should be granted the roles/cloudsql.client role on the Cloud SQL instance to facilitate interactions.
Sigh, you must be thinking that I am rather dense, or the interface for BigQuery Studio has changed again. Perhaps your directions work for a brand new BigQuery Studio, but I do not find an ADD DATA link that then allows me to select Google Cloud followed by Cloud SQL. I've been looking at some of the release docs, and they would lead me to believe that I should find a +ADD DATA button on the Welcome Page. Mine, unfortunately, does not match the one in the docs. I have tried the links under Add your own data (see my screenshot again), but those add individual tables to a dataset.
Sorry to keep bugging you about this, but it really sounds like it should allow what I am trying to achieve.