I have been having an issue with Dataproc Metastore since 03:30 UTC.
Looking in Logs Explorer, it started after Metastore ran the script hive-schema-3.1.0.cloudspanner.sql.
Error message:
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.cloudspanner.sql
...
Error: FAILED_PRECONDITION: Operation with name "projects/xxx/instances/dpms-7bef6b94-a914-4ea8-b44/databases/hive/operations/rfea6af8e_6a40_422a_bd1a_8d98607d54ed" failed with status = GrpcStatusCode{transportCode=FAILED_PRECONDITION} and message = Duplicate name in schema: VERSION. (state=,code=9)
org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
Underlying cause: java.io.IOException : Schema script failed, errorcode 2
Use --verbose for detailed stacktrace.
*** schemaTool failed ***
Before that, there was also a failed schema validation in the logs:
/opt/hive/bin/schematool -dbType cloudspanner -validate
...
Validating sequence number for SEQUENCE_TABLE
NEXT_VAL for MPartitionColumnStatistics in SEQUENCE_TABLE < max(CS_ID) in PART_COL_STATS
Failed in bit-reversal sequence number validation for SEQUENCE_TABLE.
...
Since then, my Dataproc Metastore can’t start and I can’t connect to it.
Has anyone else run into this issue? Please help me resolve it.
The error message “Duplicate name in schema: VERSION” indicates that the VERSION table or column already exists in the Cloud Spanner database. This can happen if the Hive Metastore schema has already been initialized for the database.
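If you export the log text and want to see which object names the init script reported as duplicates, a small helper like the one below can pull them out. This is only a sketch: the function name and the abbreviated sample string are my own, and it assumes the message format matches the one quoted above.

```python
import re

# Matches the Cloud Spanner message format seen in the log above (assumed stable).
DUPLICATE_NAME = re.compile(r"Duplicate name in schema: (\w+)")


def duplicate_schema_names(log_text):
    """Return the object names reported as duplicates, in first-seen order."""
    seen = []
    for name in DUPLICATE_NAME.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen


# Abbreviated copy of the error line from the question (hypothetical sample input).
sample = (
    "Error: FAILED_PRECONDITION: ... message = "
    "Duplicate name in schema: VERSION. (state=,code=9)"
)
print(duplicate_schema_names(sample))
```

If only VERSION shows up, that points to the init script having been re-run against a database that was already initialized, rather than to several unrelated conflicts.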
To resolve this issue, you can try the following:
Backup First: Before making any changes, ensure you have a backup of your Cloud Spanner database.
If you’re sure that the VERSION table is the cause of the issue and you want to delete it, run the following Cloud Spanner DDL statement against the hive database (Spanner table names are not prefixed with the database name):
DROP TABLE VERSION;
After ensuring the database is in a clean state, you can re-run the Hive Metastore schema initialization with schematool, for example via its -initSchema option.
Thank you for your recommendation. I think I should contact GCP Support, because this is a managed service and I don’t know how to access Cloud Spanner or a command shell for it.