As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

google.cloud.bigquery.job.QueryJob

class google.cloud.bigquery.job.QueryJob(job_id, query, client, job_config=None)[source]

Asynchronous job: query tables.

__init__(job_id, query, client, job_config=None)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(job_id, query, client[, job_config])

Initialize self.

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

cancel([client, retry, timeout])

API call: cancel job via a POST request

cancelled()

Check if the job has been cancelled.

done([retry, timeout, reload])

Checks if the job is complete.

exception([timeout])

Get the exception from the operation, blocking if necessary.

exists([client, retry, timeout])

API call: test for the existence of the job via a GET request

from_api_repr(resource, client)

Factory: construct a job given its API representation

reload([client, retry, timeout])

API call: refresh job properties via a GET request.

result([page_size, max_results, retry, …])

Start the job and wait for it to complete and get the result.

running()

True if the operation is currently running.

set_exception(exception)

Set the Future’s exception.

set_result(result)

Set the Future’s result.

to_api_repr()

Generate a resource for _begin().

to_arrow([progress_bar_type, …])

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

to_dataframe([bqstorage_client, dtypes, …])

Return a pandas DataFrame from a QueryJob

to_geodataframe([bqstorage_client, dtypes, …])

Return a GeoPandas GeoDataFrame from a QueryJob

Attributes

allow_large_results

See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.

bi_engine_stats

billing_tier

Return billing tier from job statistics, if present.

cache_hit

Return whether or not query results were served from cache.

clustering_fields

See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.

configuration

The configuration for this query job.

connection_properties

See google.cloud.bigquery.job.QueryJobConfig.connection_properties.

create_disposition

See google.cloud.bigquery.job.QueryJobConfig.create_disposition.

create_session

See google.cloud.bigquery.job.QueryJobConfig.create_session.

created

Datetime at which the job was created.

ddl_operation_performed

Return the DDL operation performed.

ddl_target_routine

Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.

ddl_target_table

Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.

default_dataset

See google.cloud.bigquery.job.QueryJobConfig.default_dataset.

destination

See google.cloud.bigquery.job.QueryJobConfig.destination.

destination_encryption_configuration

Custom encryption configuration for the destination table.

dml_stats

dry_run

See google.cloud.bigquery.job.QueryJobConfig.dry_run.

ended

Datetime at which the job finished.

error_result

Error information about the job as a whole.

errors

Information about individual errors generated by the job.

estimated_bytes_processed

Return the estimated number of bytes processed by the query.

etag

ETag for the job resource.

flatten_results

See google.cloud.bigquery.job.QueryJobConfig.flatten_results.

job_id

ID of the job.

job_type

Type of job.

labels

Labels for the job.

location

Location where the job runs.

maximum_billing_tier

See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.

maximum_bytes_billed

See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.

num_child_jobs

The number of child jobs executed.

num_dml_affected_rows

Return the number of DML rows affected by the job.

parent_job_id

Return the ID of the parent job.

path

URL path for the job’s APIs.

priority

See google.cloud.bigquery.job.QueryJobConfig.priority.

project

Project bound to the job.

query

The query text used in this query job.

query_id

[Preview] ID of a completed query.

query_parameters

See google.cloud.bigquery.job.QueryJobConfig.query_parameters.

query_plan

Return query plan from job statistics, if present.

range_partitioning

See google.cloud.bigquery.job.QueryJobConfig.range_partitioning.

referenced_tables

Return referenced tables from job statistics, if present.

reservation_usage

Job resource usage breakdown by reservation.

schema

The schema of the results.

schema_update_options

See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.

script_statistics

Statistics for a child job of a script.

search_stats

Returns a SearchStats object.

self_link

URL for the job resource.

session_info

[Preview] Information of the session if this job is part of one.

slot_millis

Slot-milliseconds used by this query job.

started

Datetime at which the job was started.

state

Status of the job.

statement_type

Return statement type from job statistics, if present.

table_definitions

See google.cloud.bigquery.job.QueryJobConfig.table_definitions.

time_partitioning

See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.

timeline

Return the query execution timeline from job statistics.

total_bytes_billed

Return total bytes billed from job statistics, if present.

total_bytes_processed

Return total bytes processed from job statistics, if present.

transaction_info

Information of the multi-statement transaction if this job is part of one.

udf_resources

See google.cloud.bigquery.job.QueryJobConfig.udf_resources.

undeclared_query_parameters

Return undeclared query parameters from job statistics, if present.

use_legacy_sql

See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.

use_query_cache

See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.

user_email

E-mail address of user who submitted the job.

write_disposition

See google.cloud.bigquery.job.QueryJobConfig.write_disposition.

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameters

fn (Callable[Future]) – The callback to execute when the operation is complete.
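For instance, a completion callback can inspect the finished job's error state. A minimal sketch — the callback works with any job-like object exposing error_result, as a real QueryJob does; the query text is illustrative:

```python
# Sketch: a done-callback that reports a finished job's outcome.
# Works with any object exposing `error_result`, as QueryJob does.

def report_outcome(job):
    """Return a short status string for a completed job."""
    if getattr(job, "error_result", None):
        return "failed: {}".format(job.error_result.get("message", "unknown"))
    return "succeeded"

# With a real client (shown for context, not runnable standalone):
#   query_job = client.query("SELECT 1")
#   query_job.add_done_callback(lambda job: print(report_outcome(job)))
```

Note that the callback runs on the background polling thread, so it should stay short and avoid blocking work.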

property allow_large_results

See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.

property billing_tier

Return billing tier from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.billing_tier

Returns

Billing tier used by the job, or None if job is not yet complete.

Return type

Optional[int]

property cache_hit

Return whether or not query results were served from cache.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit

Returns

whether the query results were returned from cache, or None if job is not yet complete.

Return type

Optional[bool]

cancel(client=None, retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool

API call: cancel job via a POST request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry

Returns

Boolean indicating that the cancel request was sent.

Return type

bool

cancelled()

Check if the job has been cancelled.

This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.

Returns

False

Return type

bool

property clustering_fields

See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.

property configuration: google.cloud.bigquery.job.query.QueryJobConfig

The configuration for this query job.

property connection_properties: List[google.cloud.bigquery.query.ConnectionProperty]

See google.cloud.bigquery.job.QueryJobConfig.connection_properties.

New in version 2.29.0.

property create_disposition

See google.cloud.bigquery.job.QueryJobConfig.create_disposition.

property create_session: Optional[bool]

See google.cloud.bigquery.job.QueryJobConfig.create_session.

New in version 2.29.0.

property created

Datetime at which the job was created.

Returns

the creation time (None until set from the server).

Return type

Optional[datetime.datetime]

property ddl_operation_performed

Return the DDL operation performed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_operation_performed

Type

Optional[str]

property ddl_target_routine

Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_routine

Type

Optional[google.cloud.bigquery.routine.RoutineReference]

property ddl_target_table

Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_table

Type

Optional[google.cloud.bigquery.table.TableReference]

property default_dataset

See google.cloud.bigquery.job.QueryJobConfig.default_dataset.

property destination

See google.cloud.bigquery.job.QueryJobConfig.destination.

property destination_encryption_configuration

Custom encryption configuration for the destination table.

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See google.cloud.bigquery.job.QueryJobConfig.destination_encryption_configuration.

Type

google.cloud.bigquery.encryption_configuration.EncryptionConfiguration

done(retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, reload: bool = True) → bool

Checks if the job is complete.

Parameters
  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC. If the job state is DONE, retrying is aborted early, as the job will not change anymore.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

  • reload (Optional[bool]) – If True, make an API call to refresh the job state of unfinished jobs before checking. Default True.

Returns

True if the job is complete, False otherwise.

Return type

bool
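A manual polling loop built on done() might look like the sketch below. In practice result() performs this waiting for you; the interval and cap here are arbitrary choices, and the job argument may be any object exposing done() and state, as a real QueryJob does:

```python
import time

def wait_until_done(job, poll_interval=1.0, max_wait=300.0):
    """Poll job.done() until the job completes or max_wait seconds elapse."""
    waited = 0.0
    while not job.done():
        if waited >= max_wait:
            raise TimeoutError("job did not complete within %.0f s" % max_wait)
        time.sleep(poll_interval)
        waited += poll_interval
    return job.state
```

Each done() call with the default reload=True issues an API request, so very small intervals trade API quota for latency.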

property dry_run

See google.cloud.bigquery.job.QueryJobConfig.dry_run.

property ended

Datetime at which the job finished.

Returns

the end time (None until set from the server).

Return type

Optional[datetime.datetime]

property error_result

Error information about the job as a whole.

Returns

the error information (None until set from the server).

Return type

Optional[Mapping]

property errors

Information about individual errors generated by the job.

Returns

the error information (None until set from the server).

Return type

Optional[List[Mapping]]

property estimated_bytes_processed

Return the estimated number of bytes processed by the query.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.estimated_bytes_processed

Returns

estimated number of bytes processed by the query, or None if job is not yet complete.

Return type

Optional[int]

property etag

ETag for the job resource.

Returns

the ETag (None until set from the server).

Return type

Optional[str]

exception(timeout=<object object>)

Get the exception from the operation, blocking if necessary.

See the documentation for the result() method for details on how this method operates, as both result and this method rely on the exact same polling logic. The only difference is that this method does not accept retry and polling arguments but relies on the default ones instead.

Parameters
  • timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.

Returns

The operation’s error.

Return type

Optional[google.api_core.GoogleAPICallError]

exists(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool

API call: test for the existence of the job via a GET request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

Returns

Boolean indicating existence of the job.

Return type

bool

property flatten_results

See google.cloud.bigquery.job.QueryJobConfig.flatten_results.

classmethod from_api_repr(resource: dict, client: Client) → QueryJob[source]

Factory: construct a job given its API representation

Parameters
  • resource (Dict) – dataset job representation returned from the API

  • client (google.cloud.bigquery.client.Client) – Client which holds credentials and project configuration for the dataset.

Returns

Job parsed from resource.

Return type

google.cloud.bigquery.job.QueryJob

property job_id

ID of the job.

Type

str

property job_type

Type of job.

Returns

one of ‘load’, ‘copy’, ‘extract’, ‘query’.

Return type

str

property labels

Labels for the job.

Type

Dict[str, str]

property location

Location where the job runs.

Type

str

property maximum_billing_tier

See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.

property maximum_bytes_billed

See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.

property num_child_jobs

The number of child jobs executed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs

Return type

int

property num_dml_affected_rows: Optional[int]

Return the number of DML rows affected by the job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.num_dml_affected_rows

Returns

number of DML rows affected by the job, or None if job is not yet complete.

Return type

Optional[int]

property parent_job_id

Return the ID of the parent job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id

Returns

parent job id.

Return type

Optional[str]

property path

URL path for the job’s APIs.

Returns

the path based on project and job ID.

Return type

str

property priority

See google.cloud.bigquery.job.QueryJobConfig.priority.

property project

Project bound to the job.

Returns

the project (derived from the client).

Return type

str

property query

The query text used in this query job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.query

Type

str

property query_id: Optional[str]

[Preview] ID of a completed query.

This ID is auto-generated and not guaranteed to be populated.

property query_parameters

See google.cloud.bigquery.job.QueryJobConfig.query_parameters.

property query_plan

Return query plan from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.query_plan

Returns

mappings describing the query plan, or an empty list if the query has not yet completed.

Return type

List[google.cloud.bigquery.job.QueryPlanEntry]

property range_partitioning

See google.cloud.bigquery.job.QueryJobConfig.range_partitioning.

property referenced_tables

Return referenced tables from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.referenced_tables

Returns

mappings describing the tables referenced by the query, or an empty list if the query has not yet completed.

Return type

List[Dict]

reload(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)

API call: refresh job properties via a GET request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

property reservation_usage

Job resource usage breakdown by reservation.

Returns

Reservation usage stats. Can be empty if not set from the server.

Return type

List[google.cloud.bigquery.job.ReservationUsage]

result(page_size: Optional[int] = None, max_results: Optional[int] = None, retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, start_index: Optional[int] = None, job_retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>) → Union[RowIterator, google.cloud.bigquery.table._EmptyRowIterator][source]

Start the job and wait for it to complete and get the result.

Parameters
  • page_size (Optional[int]) – The maximum number of rows in each page of results from this request. Non-positive values are ignored.

  • max_results (Optional[int]) – The maximum total number of rows from this request.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the call that retrieves rows. This only applies to making RPC calls. It isn’t used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is DONE, retrying is aborted early even if the results are not available, as this will not change anymore.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.

  • start_index (Optional[int]) – The zero-based index of the starting row to read.

  • job_retry (Optional[google.api_core.retry.Retry]) –

    How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing None disables job retry.

    Not all jobs can be retried. If job_id was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if non-None non-default job_retry is also provided.

Returns

Iterator of row data (Row instances). During each page, the iterator will have the total_rows attribute set, which counts the total number of rows in the result set (this is distinct from the total number of rows in the current page: iterator.page.num_items).

If the query is a special query that produces no results, e.g. a DDL query, an _EmptyRowIterator instance is returned.

Return type

google.cloud.bigquery.table.RowIterator

Raises
  • google.cloud.exceptions.GoogleAPICallError – If the job failed and retries aren’t successful.

  • concurrent.futures.TimeoutError – If the job did not complete in the given timeout.

  • TypeError – If a non-None and non-default job_retry is provided and the job is not retryable.
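The common end-to-end flow is: submit the query, then call result() to block until completion and iterate rows. A duck-typed sketch — the client may be any object with a query() method (with the real library, a google.cloud.bigquery.Client), and the SQL text is illustrative:

```python
def run_query(client, sql, timeout=60.0, page_size=None):
    """Submit `sql`, wait for the job to finish, and materialize rows as dicts."""
    job = client.query(sql)                             # starts the job
    rows = job.result(timeout=timeout, page_size=page_size)  # blocks until done
    return [dict(row) for row in rows]

# With a real client:
#   rows = run_query(client, "SELECT name, score FROM `proj.ds.scores`")
```

Because result() raises on failure, errors surface here as exceptions rather than requiring a separate check of error_result.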

running()

True if the operation is currently running.

property schema: Optional[List[google.cloud.bigquery.schema.SchemaField]]

The schema of the results.

Present only for successful dry run of non-legacy SQL queries.

property schema_update_options

See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.

property script_statistics: Optional[google.cloud.bigquery.job.base.ScriptStatistics]

Statistics for a child job of a script.

property search_stats: Optional[google.cloud.bigquery.job.query.SearchStats]

Returns a SearchStats object.

property self_link

URL for the job resource.

Returns

the URL (None until set from the server).

Return type

Optional[str]

property session_info: Optional[google.cloud.bigquery.job.base.SessionInfo]

[Preview] Information of the session if this job is part of one.

New in version 2.29.0.

set_exception(exception)

Set the Future’s exception.

set_result(result)

Set the Future’s result.

property slot_millis

Slot-milliseconds used by this query job.

Type

Union[int, None]

property started

Datetime at which the job was started.

Returns

the start time (None until set from the server).

Return type

Optional[datetime.datetime]

property state

Status of the job.

Returns

the state (None until set from the server).

Return type

Optional[str]

property statement_type

Return statement type from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.statement_type

Returns

type of statement used by the job, or None if job is not yet complete.

Return type

Optional[str]

property table_definitions

See google.cloud.bigquery.job.QueryJobConfig.table_definitions.

property time_partitioning

See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.

property timeline

Return the query execution timeline from job statistics.

Type

List(TimelineEntry)

to_api_repr()[source]

Generate a resource for _begin().

to_arrow(progress_bar_type: Optional[str] = None, bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None) → pyarrow.Table[source]

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

Parameters
  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    Possible values of progress_bar_type include:

    None

    No progress bar.

    'tqdm'

    Use the tqdm.tqdm() function to print a progress bar to sys.stdout.

    'tqdm_notebook'

    Use the tqdm.notebook.tqdm() function to display a progress bar as a Jupyter notebook widget.

    'tqdm_gui'

    Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.

  • bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –

    A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

    This method requires google-cloud-bigquery-storage library.

    Reading from a specific partition or snapshot is not currently supported by this method.

  • create_bqstorage_client (Optional[bool]) –

    If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information.

    This argument does nothing if bqstorage_client is supplied.

    New in version 1.24.0.

  • max_results (Optional[int]) –

    Maximum number of rows to include in the result. No limit by default.

    New in version 2.21.0.

Returns

A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

pyarrow.Table

Raises

ValueError – If the pyarrow library cannot be imported.

New in version 1.17.0.
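A typical call, hedged as a sketch: the helper below caps the download and skips the Storage API for small reads. With the real library it needs pyarrow installed; the function itself is duck-typed over any job exposing to_arrow():

```python
def arrow_preview(query_job, n=1000):
    """Fetch at most `n` rows of a finished query as a pyarrow.Table."""
    return query_job.to_arrow(
        max_results=n,
        create_bqstorage_client=False,  # plain REST download for small previews
    )
```

For full-table downloads, leaving create_bqstorage_client at its default of True is usually faster, at the cost of the (billable) Storage API.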

to_dataframe(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Optional[Dict[str, Any]] = None, progress_bar_type: Optional[str] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_as_object: bool = False, bool_dtype: Optional[Any] = <DefaultPandasDTypes.BOOL_DTYPE: <object object>>, int_dtype: Optional[Any] = <DefaultPandasDTypes.INT_DTYPE: <object object>>, float_dtype: Optional[Any] = None, string_dtype: Optional[Any] = None, date_dtype: Optional[Any] = <DefaultPandasDTypes.DATE_DTYPE: <object object>>, datetime_dtype: Optional[Any] = None, time_dtype: Optional[Any] = <DefaultPandasDTypes.TIME_DTYPE: <object object>>, timestamp_dtype: Optional[Any] = None) → pandas.DataFrame[source]

Return a pandas DataFrame from a QueryJob

Parameters
  • bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –

    A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

    This method requires the fastavro and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

  • dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.

  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    See to_dataframe() for details.

    New in version 1.11.0.

  • create_bqstorage_client (Optional[bool]) –

    If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information.

    This argument does nothing if bqstorage_client is supplied.

    New in version 1.24.0.

  • max_results (Optional[int]) –

    Maximum number of rows to include in the result. No limit by default.

    New in version 2.21.0.

  • geography_as_object (Optional[bool]) –

    If True, convert GEOGRAPHY data to shapely geometry objects. If False (default), don’t cast geography data to shapely geometry objects.

    New in version 2.24.0.

  • bool_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.BooleanDtype()) to convert BigQuery Boolean type, instead of relying on the default pandas.BooleanDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("bool"). BigQuery Boolean type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type

    New in version 3.8.0.

  • int_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.Int64Dtype()) to convert BigQuery Integer types, instead of relying on the default pandas.Int64Dtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("int64"). A list of BigQuery Integer types can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types

    New in version 3.8.0.

  • float_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.Float32Dtype()) to convert BigQuery Float type, instead of relying on the default numpy.dtype("float64"). If you explicitly set the value to None, then the data type will be numpy.dtype("float64"). BigQuery Float type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types

    New in version 3.8.0.

  • string_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.StringDtype()) to convert BigQuery String type, instead of relying on the default numpy.dtype("object"). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery String type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type

    New in version 3.8.0.

  • date_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.date32())) to convert BigQuery Date type, instead of relying on the default db_dtypes.DateDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Date type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type

    New in version 3.10.0.

  • datetime_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us"))) to convert BigQuery Datetime type, instead of relying on the default numpy.dtype("datetime64[ns]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Datetime type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type

    New in version 3.10.0.

  • time_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.time64("us"))) to convert BigQuery Time type, instead of relying on the default db_dtypes.TimeDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery Time type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type

    New in version 3.10.0.

  • timestamp_dtype (Optional[pandas.Series.dtype, None]) –

    If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us", tz="UTC"))) to convert BigQuery Timestamp type, instead of relying on the default numpy.dtype("datetime64[ns, UTC]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns, UTC]") or object if out of bound. BigQuery Timestamp type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type

    New in version 3.10.0.

Returns

A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

pandas.DataFrame

Raises

ValueError – If the pandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported. Also if geography_as_object is True, but the shapely library cannot be imported.
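The dtype parameters above let you pin down how BigQuery types land in pandas. A sketch of assembling such a call — the column name and dtype choices are illustrative, and running it against a real job requires pandas (and db-dtypes) installed:

```python
def dataframe_options():
    """Keyword arguments selecting explicit dtypes for to_dataframe()."""
    return {
        "dtypes": {"score": "float32"},    # per-column override
        "create_bqstorage_client": False,  # plain REST download
        "progress_bar_type": None,         # no tqdm progress bar
    }

# With a real job:
#   df = query_job.to_dataframe(**dataframe_options())
```

Collecting the keyword arguments in one place like this keeps repeated to_dataframe() calls consistent across a codebase.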

to_geodataframe(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Optional[Dict[str, Any]] = None, progress_bar_type: Optional[str] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_column: Optional[str] = None) → geopandas.GeoDataFrame[source]

Return a GeoPandas GeoDataFrame from a QueryJob

Parameters
  • bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –

    A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

    This method requires the fastavro and google-cloud-bigquery-storage libraries.

    Reading from a specific partition or snapshot is not currently supported by this method.

  • dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.

  • progress_bar_type (Optional[str]) –

    If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

    See to_dataframe() for details.

    New in version 1.11.0.

  • create_bqstorage_client (Optional[bool]) –

    If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information.

    This argument does nothing if bqstorage_client is supplied.

    New in version 1.24.0.

  • max_results (Optional[int]) –

    Maximum number of rows to include in the result. No limit by default.

    New in version 2.21.0.

  • geography_column (Optional[str]) – If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there’s only one GEOGRAPHY column.

Returns

A geopandas.GeoDataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

geopandas.GeoDataFrame

Raises

ValueError – If the geopandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported.

New in version 2.24.0.

property total_bytes_billed

Return total bytes billed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.total_bytes_billed

Returns

Total bytes billed for the job, or None if job is not yet complete.

Return type

Optional[int]

property total_bytes_processed

Return total bytes processed from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.total_bytes_processed

Returns

Total bytes processed by the job, or None if job is not yet complete.

Return type

Optional[int]

property transaction_info: Optional[google.cloud.bigquery.job.base.TransactionInfo]

Information of the multi-statement transaction if this job is part of one.

Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the google.cloud.bigquery.client.Client.list_jobs() method with the parent_job parameter to iterate over child jobs.

New in version 2.24.0.

property udf_resources

See google.cloud.bigquery.job.QueryJobConfig.udf_resources.

property undeclared_query_parameters

Return undeclared query parameters from job statistics, if present.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.undeclared_query_parameters

Returns

Undeclared parameters, or an empty list if the query has not yet completed.

Return type

List[Union[ google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter ]]

property use_legacy_sql

See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.

property use_query_cache

See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.

property user_email

E-mail address of user who submitted the job.

Returns

the e-mail address (None until set from the server).

Return type

Optional[str]

property write_disposition

See google.cloud.bigquery.job.QueryJobConfig.write_disposition.