Unable to connect to Google MySQL using JDBC Connector from Amazon EMR with SSL certificates

I already had the server-ca.pem, client-cert.pem and client-key.pem files. But I was having trouble making jdbc calls with the three pem files.

So I followed the MysSQL documentation and created the truststore file from server-ca.pem file for server authentication, the below is the documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.1 Setting up Server Authentication

Then I created keystore file from client-cert.pem and client-key.pem files as per the MySQL documentation:

MySQL :: MySQL Connector/J 8.1 Developer Guide :: 6.9.2 Setting up Client Authentication

Finally once the truststore and keystore files are ready we need copy the files to both driver and executors because we are working on distributed systems like EMR. So to copy files to both driver and executors, I had copied the truststore and keystore files to a S3 bucket and launched another EMR with a bootstrap script below to copy to a same path in both executors and driver. The bootstrap script is as below:

#!/bin/bash

aws s3 cp s3://test-bucket-us-east-2/truststore /home/hadoop
sudo chmod +x /home/hadoop/truststore

aws s3 cp s3://test-bucket-us-east-2/keystore /home/hadoop
sudo chmod +x /home/hadoop/keystore

Finally once the files are ready in the EMR I started a pyspark shell with the below command:

pyspark --jars mysql-connector-j-8.1.0.jar ## You can download the MySQL Connector/J jar file from this location https://downloads.mysql.com/archives/c-j/

I used this MySQL Connector/J - mysql-connector-j-8.1.0.jar

Finally in the pyspark shell I ran the below spark JDBC connection command to extract data from a MySQL table hosted on GCP:

*df = spark.read *
*.format(“jdbc”) *
*.option(“driver”,“com.mysql.cj.jdbc.Driver”) *
*.option(“url”, “jdbc:mysql://:/<db_name>”) *
*.option(“dbtable”, “table_name”) *
*.option(“user”, “user_id”) *
*.option(“password”, “pwd”) *
.option(“sslMode”, “VERIFY_CA”) \ # This needs to be changed basing on your requirement. This determines what type of SSL authentication happens, so this is important
*.option(“trustCertificateKeyStoreUrl”, “file:/path/to/truststore”) *
*.option(“trustCertificateKeyStorePassword”, “truststore_pwd”) *
*.option(“clientCertificateKeyStoreUrl”, “file:/path/to/keystore”) *
*.option(“clientCertificateKeyStorePassword”, “keystore_pwd”) *
.load()

1 Like