As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

IPython Magics for BigQuery

IPython Magics

To use these magics, you must first register them. Run the %load_ext magic in a Jupyter notebook cell.

%load_ext google.cloud.bigquery

This makes the %%bigquery magic available.

%%bigquery

IPython cell magic to run a query and display the result as a DataFrame

%%bigquery [<destination_var>] [--project <project>] [--use_legacy_sql]
           [--verbose] [--params <params>]
<query>

Parameters:

  • <destination_var> (Optional[line argument]):

    variable to store the query results. The results are not displayed if this parameter is used. If an error occurs during the query execution, the corresponding QueryJob instance (if available) is stored in the variable instead.

  • --destination_table (Optional[line argument]):

    A dataset and table to store the query results. If table does not exists, it will be created. If table already exists, its data will be overwritten. Variable should be in a format <dataset_id>.<table_id>.

  • --project <project> (Optional[line argument]):

    Project to use for running the query. Defaults to the context project.

  • --use_bqstorage_api (Optional[line argument]):

    [Deprecated] Not used anymore, as BigQuery Storage API is used by default.

  • --use_rest_api (Optional[line argument]):

    Use the BigQuery REST API instead of the Storage API.

  • --use_legacy_sql (Optional[line argument]):

    Runs the query using Legacy SQL syntax. Defaults to Standard SQL if this argument not used.

  • --verbose (Optional[line argument]):

    If this flag is used, information including the query job ID and the amount of time for the query to complete will not be cleared after the query is finished. By default, this information will be displayed but will be cleared after the query is finished.

  • --params <params> (Optional[line argument]):

    If present, the argument following the --params flag must be either:

    • str - A JSON string representation of a dictionary in the format {"param_name": "param_value"} (ex. {"num": 17}). Use of the parameter in the query should be indicated with @param_name. See In[5] in the Examples section below.

    • dict reference - A reference to a dict in the format {"param_name": "param_value"}, where the value types must be JSON serializable. The variable reference is indicated by a $ before the variable name (ex. $my_dict_var). See In[6] and In[7] in the Examples section below.

  • <query> (required, cell argument):

    SQL query to run. If the query does not contain any whitespace (aside from leading and trailing whitespace), it is assumed to represent a fully-qualified table ID, and the latter’s data will be fetched.

Returns:

A pandas.DataFrame with the query results.

Note

All queries run using this magic will run using the context credentials.

Examples:

The following examples can be run in an IPython notebook after loading the bigquery IPython extension (see In[1]) and setting up Application Default Credentials.

In [1]: %load_ext google.cloud.bigquery

In [2]: %%bigquery
   ...: SELECT name, SUM(number) as count
   ...: FROM `bigquery-public-data.usa_names.usa_1910_current`
   ...: GROUP BY name
   ...: ORDER BY count DESC
   ...: LIMIT 3

Out[2]:       name    count
   ...: -------------------
   ...: 0    James  4987296
   ...: 1     John  4866302
   ...: 2   Robert  4738204

In [3]: %%bigquery df --project my-alternate-project --verbose
   ...: SELECT name, SUM(number) as count
   ...: FROM `bigquery-public-data.usa_names.usa_1910_current`
   ...: WHERE gender = 'F'
   ...: GROUP BY name
   ...: ORDER BY count DESC
   ...: LIMIT 3
Executing query with job ID: bf633912-af2c-4780-b568-5d868058632b
Query executing: 2.61s
Query complete after 2.92s

In [4]: df

Out[4]:          name    count
   ...: ----------------------
   ...: 0        Mary  3736239
   ...: 1    Patricia  1568495
   ...: 2   Elizabeth  1519946

In [5]: %%bigquery --params {"num": 17}
   ...: SELECT @num AS num

Out[5]:     num
   ...: -------
   ...: 0    17

In [6]: params = {"num": 17}

In [7]: %%bigquery --params $params
   ...: SELECT @num AS num

Out[7]:     num
   ...: -------
   ...: 0    17
class google.cloud.bigquery.magics.magics.Context[source]

Storage for objects to be used throughout an IPython notebook session.

A Context object is initialized when the magics module is imported, and can be found at google.cloud.bigquery.magics.context.

property bigquery_client_options

client options to be used through IPython magics.

Note::

The client options do not need to be explicitly defined if no special network connections are required. Normally you would be using the https://bigquery.googleapis.com/ end point.

Example

Manually setting the endpoint:

>>> from google.cloud.bigquery import magics
>>> client_options = {}
>>> client_options['api_endpoint'] = "https://some.special.url"
>>> magics.context.bigquery_client_options = client_options
Type

google.api_core.client_options.ClientOptions

property bqstorage_client_options

client options to be used through IPython magics for the storage client.

Note::

The client options do not need to be explicitly defined if no special network connections are required. Normally you would be using the https://bigquerystorage.googleapis.com/ end point.

Example

Manually setting the endpoint:

>>> from google.cloud.bigquery import magics
>>> client_options = {}
>>> client_options['api_endpoint'] = "https://some.special.url"
>>> magics.context.bqstorage_client_options = client_options
Type

google.api_core.client_options.ClientOptions

property credentials

Credentials to use for queries performed through IPython magics.

Note

These credentials do not need to be explicitly defined if you are using Application Default Credentials. If you are not using Application Default Credentials, manually construct a google.auth.credentials.Credentials object and set it as the context credentials as demonstrated in the example below. See auth docs for more information on obtaining credentials.

Example

Manually setting the context credentials:

>>> from google.cloud.bigquery import magics
>>> from google.oauth2 import service_account
>>> credentials = (service_account
...     .Credentials.from_service_account_file(
...         '/path/to/key.json'))
>>> magics.context.credentials = credentials
Type

google.auth.credentials.Credentials

property default_query_job_config

Default job configuration for queries.

The context’s QueryJobConfig is used for queries. Some properties can be overridden with arguments to the magics.

Example

Manually setting the default value for maximum_bytes_billed to 100 MB:

>>> from google.cloud.bigquery import magics
>>> magics.context.default_query_job_config.maximum_bytes_billed = 100000000
Type

google.cloud.bigquery.job.QueryJobConfig

property progress_bar_type

Default progress bar type to use to display progress bar while executing queries through IPython magics.

Note::

Install the tqdm package to use this feature.

Example

Manually setting the progress_bar_type:

>>> from google.cloud.bigquery import magics
>>> magics.context.progress_bar_type = "tqdm"
Type

str

property project

Default project to use for queries performed through IPython magics.

Note

The project does not need to be explicitly defined if you have an environment default project set. If you do not have a default project set in your environment, manually assign the project as demonstrated in the example below.

Example

Manually setting the context project:

>>> from google.cloud.bigquery import magics
>>> magics.context.project = 'my-project'
Type

str