IPython Magics for BigQuery¶
IPython Magics
-
%%bigquery
IPython cell magic to run a query and display the result as a DataFrame
%%bigquery [<destination_var>] [--project <project>] [--use_legacy_sql] [--verbose] [--params <params>] <query>
Parameters:
<destination_var>
(Optional[line argument]):variable to store the query results. The results are not displayed if this parameter is used. If an error occurs during the query execution, the corresponding
QueryJob
instance (if available) is stored in the variable instead.
--destination_table
(Optional[line argument]):A dataset and table to store the query results. If table does not exists, it will be created. If table already exists, its data will be overwritten. Variable should be in a format <dataset_id>.<table_id>.
--project <project>
(Optional[line argument]):Project to use for running the query. Defaults to the context
project
.
--use_bqstorage_api
(Optional[line argument]):[Deprecated] Not used anymore, as BigQuery Storage API is used by default.
--use_rest_api
(Optional[line argument]):Use the BigQuery REST API instead of the Storage API.
--use_legacy_sql
(Optional[line argument]):Runs the query using Legacy SQL syntax. Defaults to Standard SQL if this argument not used.
--verbose
(Optional[line argument]):If this flag is used, information including the query job ID and the amount of time for the query to complete will not be cleared after the query is finished. By default, this information will be displayed but will be cleared after the query is finished.
--params <params>
(Optional[line argument]):If present, the argument following the
--params
flag must be either:str
- A JSON string representation of a dictionary in the format{"param_name": "param_value"}
(ex.{"num": 17}
). Use of the parameter in the query should be indicated with@param_name
. SeeIn[5]
in the Examples section below.dict
reference - A reference to adict
in the format{"param_name": "param_value"}
, where the value types must be JSON serializable. The variable reference is indicated by a$
before the variable name (ex.$my_dict_var
). SeeIn[6]
andIn[7]
in the Examples section below.
Note
Due to the way IPython argument parser works, negative numbers in dictionaries are incorrectly “recognized” as additional arguments, resulting in an error (“unrecognized arguments”). To get around this, pass such dictionary as a JSON string variable.
<query>
(required, cell argument):SQL query to run. If the query does not contain any whitespace (aside from leading and trailing whitespace), it is assumed to represent a fully-qualified table ID, and the latter’s data will be fetched.
- Returns:
A
pandas.DataFrame
with the query results.
Note
All queries run using this magic will run using the context
credentials
.- Examples:
The following examples can be run in an IPython notebook after loading the bigquery IPython extension (see
In[1]
) and setting up Application Default Credentials.
In [1]: %load_ext google.cloud.bigquery In [2]: %%bigquery ...: SELECT name, SUM(number) as count ...: FROM `bigquery-public-data.usa_names.usa_1910_current` ...: GROUP BY name ...: ORDER BY count DESC ...: LIMIT 3 Out[2]: name count ...: ------------------- ...: 0 James 4987296 ...: 1 John 4866302 ...: 2 Robert 4738204 In [3]: %%bigquery df --project my-alternate-project --verbose ...: SELECT name, SUM(number) as count ...: FROM `bigquery-public-data.usa_names.usa_1910_current` ...: WHERE gender = 'F' ...: GROUP BY name ...: ORDER BY count DESC ...: LIMIT 3 Executing query with job ID: bf633912-af2c-4780-b568-5d868058632b Query executing: 2.61s Query complete after 2.92s In [4]: df Out[4]: name count ...: ---------------------- ...: 0 Mary 3736239 ...: 1 Patricia 1568495 ...: 2 Elizabeth 1519946 In [5]: %%bigquery --params {"num": 17} ...: SELECT @num AS num Out[5]: num ...: ------- ...: 0 17 In [6]: params = {"num": 17} In [7]: %%bigquery --params $params ...: SELECT @num AS num Out[7]: num ...: ------- ...: 0 17
-
class
google.cloud.bigquery.magics.
Context
[source]¶ Storage for objects to be used throughout an IPython notebook session.
A Context object is initialized when the
magics
module is imported, and can be found atgoogle.cloud.bigquery.magics.context
.-
property
credentials
¶ Credentials to use for queries performed through IPython magics
Note
These credentials do not need to be explicitly defined if you are using Application Default Credentials. If you are not using Application Default Credentials, manually construct a
google.auth.credentials.Credentials
object and set it as the context credentials as demonstrated in the example below. See auth docs for more information on obtaining credentials.Example
Manually setting the context credentials:
>>> from google.cloud.bigquery import magics >>> from google.oauth2 import service_account >>> credentials = (service_account ... .Credentials.from_service_account_file( ... '/path/to/key.json')) >>> magics.context.credentials = credentials
-
property
default_query_job_config
¶ Default job configuration for queries.
The context’s
QueryJobConfig
is used for queries. Some properties can be overridden with arguments to the magics.Example
Manually setting the default value for
maximum_bytes_billed
to 100 MB:>>> from google.cloud.bigquery import magics >>> magics.context.default_query_job_config.maximum_bytes_billed = 100000000
-
property
project
¶ Default project to use for queries performed through IPython magics
Note
The project does not need to be explicitly defined if you have an environment default project set. If you do not have a default project set in your environment, manually assign the project as demonstrated in the example below.
Example
Manually setting the context project:
>>> from google.cloud.bigquery import magics >>> magics.context.project = 'my-project'
- Type
-
property