As of January 1, 2020, this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information, see Python 2 support on Google Cloud.

Python Client for Dataproc Metastore


Dataproc Metastore is a fully managed, highly available, autoscaled, autohealing, OSS-native metastore service that greatly simplifies technical metadata management. The service is based on Apache Hive metastore and serves as a critical component of enterprise data lakes.

Quick Start

To use this library, first complete the following steps:

  1. Select or create a Cloud Platform project.

  2. Enable billing for your project.

  3. Enable the Dataproc Metastore Service.

  4. Set up Authentication.


Install this library in a virtualenv using pip. virtualenv is a tool to create isolated Python environments. The basic problem it addresses is one of dependencies and versions, and indirectly permissions.

With virtualenv, it’s possible to install this library without needing system install permissions, and without clashing with the installed system dependencies.


Mac/Linux

pip install virtualenv
virtualenv <your-env>
source <your-env>/bin/activate
<your-env>/bin/pip install google-cloud-dataproc-metastore


Windows

pip install virtualenv
virtualenv <your-env>
<your-env>\Scripts\pip.exe install google-cloud-dataproc-metastore
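
Once the library is installed and authentication is configured, a first call might look like the sketch below. It assumes the v1 client is importable as google.cloud.metastore_v1 and that the project has services in the chosen location; the helper names location_path and list_service_names are illustrative and not part of the library.

```python
def location_path(project_id, location_id):
    """Build the parent resource name for a project and location."""
    return f"projects/{project_id}/locations/{location_id}"


def list_service_names(project_id, location_id):
    """Return the resource names of the Metastore services in one location."""
    # Imported lazily so this module still loads where the library is absent.
    from google.cloud import metastore_v1

    # Uses Application Default Credentials from the authentication step.
    client = metastore_v1.DataprocMetastoreClient()
    parent = location_path(project_id, location_id)
    return [service.name for service in client.list_services(parent=parent)]
```

Calling list_service_names("my-project", "us-central1") with your own project ID and location (both values here are placeholders) should return one resource name per service.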

Next Steps


Because this client uses the grpc library, it is safe to share instances across threads. In multiprocessing scenarios, the best practice is to create client instances after the invocation of os.fork() by multiprocessing.pool.Pool or multiprocessing.Process.
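
One way to follow the multiprocessing advice is to build the client in a pool initializer, which runs inside each worker process rather than in the parent. The sketch below is a generic pattern, not code from this library: init_worker, describe_worker, and run_in_pool are hypothetical names, and client_factory stands for any picklable zero-argument callable that returns a client.

```python
import multiprocessing
import os

# One client per process; intentionally left empty in the parent.
_client = None


def init_worker(client_factory):
    """Pool initializer: runs inside each worker, after the process is created."""
    global _client
    _client = client_factory()


def describe_worker(_item):
    """Example task: report the worker's PID and the type of client it holds."""
    return os.getpid(), type(_client).__name__


def run_in_pool(client_factory, items, processes=2):
    """Map items over a pool whose workers each build their own client."""
    with multiprocessing.Pool(
        processes=processes,
        initializer=init_worker,
        initargs=(client_factory,),
    ) as pool:
        return pool.map(describe_worker, items)
```

With the library installed and credentials available, passing metastore_v1.DataprocMetastoreClient as client_factory (from inside an if __name__ == "__main__": guard) would give each worker its own client instead of one inherited from the parent.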

This package includes clients for multiple versions of the Dataproc Metastore API. By default, you will get v1, the latest version.
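
To pin a specific API version rather than rely on the default, the versioned module can be imported explicitly. The sketch below assumes the three versions listed in the references are published as google.cloud.metastore_v1, google.cloud.metastore_v1beta, and google.cloud.metastore_v1alpha; the helpers module_name_for and load_api_module are illustrative only.

```python
import importlib

# Assumed module names for the API versions shipped in this package.
API_MODULES = {
    "v1": "google.cloud.metastore_v1",
    "v1beta": "google.cloud.metastore_v1beta",
    "v1alpha": "google.cloud.metastore_v1alpha",
}


def module_name_for(version="v1"):
    """Return the import path for an API version, rejecting unknown ones."""
    try:
        return API_MODULES[version]
    except KeyError:
        raise ValueError(f"unknown API version: {version!r}") from None


def load_api_module(version="v1"):
    """Import the versioned module; requires the library to be installed."""
    return importlib.import_module(module_name_for(version))
```

For example, load_api_module("v1beta").DataprocMetastoreClient would refer to the v1beta client class, assuming that module exposes a client under the same name as v1.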

v1 API Reference

v1beta API Reference

v1alpha API Reference


For a list of all google-cloud-dataproc-metastore releases: