[SUPPORT] unable to sync metadata to hive metastore #13057
Labels
hive
Issues related to hive
hudistreamer
Issues related to Hudi streamer (formerly deltastreamer)
meta-sync
Describe the problem you faced
I am trying to store table metadata in the Hive metastore using the Spark command below. I followed the config as shown here and ran the following command:
spark-submit --class org.apache.hudi.utilities.streamer.HoodieStreamer $HUDI_UTILITIES_BUNDLE \
  --table-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonKafkaSource \
  --source-ordering-field ts \
  --target-base-path hdfs://localhost:9000/user/hive/warehouse/stock_ticks_cow_2 \
  --target-table stock_ticks_cow_2 \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --hoodie-conf hoodie.streamer.schemaprovider.registry.url=http://localhost:8081/subjects/stock_ticks-value/versions/latest \
  --hoodie-conf hoodie.streamer.source.kafka.topic=stock_ticks \
  --hoodie-conf hoodie.datasource.write.recordkey.field=key \
  --hoodie-conf hoodie.datasource.write.partitionpath.field=date \
  --hoodie-conf schema.registry.url=http://localhost:8081 \
  --hoodie-conf auto.offset.reset=earliest \
  --hoodie-conf bootstrap.servers=localhost:9092 \
  --hoodie-conf hoodie.upsert.shuffle.parallelism=2 \
  --hoodie-conf hoodie.insert.shuffle.parallelism=2 \
  --hoodie-conf hoodie.delete.shuffle.parallelism=2 \
  --hoodie-conf hoodie.bulkinsert.shuffle.parallelism=2 \
  --hoodie-conf hoodie.datasource.hive_sync.mode=hms \
  --hoodie-conf hoodie.datasource.hive_sync.enable=true \
  --hoodie-conf hoodie.datasource.hive_sync.metastore.uris=thrift://localhost:9083 \
  --hoodie-conf hoodie.datasource.hive_sync.table=stock_ticks_cow_2 \
  --hoodie-conf hoodie.datasource.meta.sync.enable=true \
  --hoodie-conf hoodie.datasource.hive_sync.batch_num=10 \
  --props file:///dev/null
Spark writes the table to HDFS as intended, but I don't see the table metadata in Hive when I check through beeline. Please let me know if I am missing any required configuration or if I have misunderstood the purpose of this configuration.
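For easier review, the Hive sync settings from the command above can be read in isolation. The fragment below is only a sketch: it regroups the six sync-related `--hoodie-conf` values as a properties file that could be passed through `--props` instead of `file:///dev/null`. The file name `hive-sync.properties` is hypothetical, and no option has been added or changed relative to the command.

```properties
# hive-sync.properties (hypothetical file name, not part of the original setup)
# Hive sync settings copied unchanged from the spark-submit command.
hoodie.datasource.hive_sync.mode=hms
hoodie.datasource.hive_sync.enable=true
hoodie.datasource.hive_sync.metastore.uris=thrift://localhost:9083
hoodie.datasource.hive_sync.table=stock_ticks_cow_2
hoodie.datasource.meta.sync.enable=true
hoodie.datasource.hive_sync.batch_num=10
```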
To Reproduce
Steps to reproduce the behavior:
stock_ticks topic
show tables;
Expected behavior
I expected the table metadata to be synced to Hive after running the Spark command with the Hive sync configuration.
Environment Description
Hudi version : 0.15
Spark version : 3.5.5
Hive version : 2.3.9
Hadoop version : 3.4.1
Storage (HDFS/S3/GCS..) : HDFS
Running on Docker? (yes/no) : No
Stacktrace