google.cloud.bigquery.job.QueryJob¶
- class google.cloud.bigquery.job.QueryJob(job_id, query, client, job_config=None)[source]¶
Asynchronous job: query tables.
- Parameters
job_id (str) – the job’s ID, within the project belonging to client.
query (str) – SQL query string.
client (google.cloud.bigquery.client.Client) – A client which holds credentials and project configuration for the dataset (which requires a project).
job_config (Optional[google.cloud.bigquery.job.QueryJobConfig]) – Extra configuration options for the query job.
- __init__(job_id, query, client, job_config=None)[source]¶
Initialize self. See help(type(self)) for accurate signature.
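In practice you rarely construct a QueryJob directly; Client.query() builds and submits one for you. A minimal sketch of the typical lifecycle (duck-typed so it can run without credentials; with the real library, client would be a google.cloud.bigquery.Client):

```python
# Sketch of the typical QueryJob lifecycle. `client` is assumed to expose
# the google.cloud.bigquery.Client.query() interface; with the real
# library you would pass a bigquery.Client instance.

def run_query(client, sql):
    """Submit `sql` and return the result rows as a list of dicts."""
    job = client.query(sql)  # Client.query() returns a QueryJob
    rows = job.result()      # result() blocks until the job completes
    return [dict(row) for row in rows]
```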
Methods
__init__(job_id, query, client[, job_config]) – Initialize self.
add_done_callback(fn) – Add a callback to be executed when the operation is complete.
cancel([client, retry, timeout]) – API call: cancel job via a POST request.
cancelled() – Check if the job has been cancelled.
done([retry, timeout, reload]) – Checks if the job is complete.
exception([timeout]) – Get the exception from the operation, blocking if necessary.
exists([client, retry, timeout]) – API call: test for the existence of the job via a GET request.
from_api_repr(resource, client) – Factory: construct a job given its API representation.
reload([client, retry, timeout]) – API call: refresh job properties via a GET request.
result([page_size, max_results, retry, …]) – Start the job and wait for it to complete and get the result.
running() – True if the operation is currently running.
set_exception(exception) – Set the Future’s exception.
set_result(result) – Set the Future’s result.
to_api_repr() – Generate a resource for _begin().
to_arrow([progress_bar_type, …]) – [Beta] Create a pyarrow.Table by loading all pages of a table or query.
to_dataframe([bqstorage_client, dtypes, …]) – Return a pandas DataFrame from a QueryJob.
to_geodataframe([bqstorage_client, dtypes, …]) – Return a GeoPandas GeoDataFrame from a QueryJob.
Attributes
allow_large_results – See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.
bi_engine_stats
billing_tier – Return billing tier from job statistics, if present.
cache_hit – Return whether or not query results were served from cache.
clustering_fields – See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.
configuration – The configuration for this query job.
connection_properties – See google.cloud.bigquery.job.QueryJobConfig.connection_properties.
create_disposition – See google.cloud.bigquery.job.QueryJobConfig.create_disposition.
create_session – See google.cloud.bigquery.job.QueryJobConfig.create_session.
created – Datetime at which the job was created.
ddl_operation_performed – Return the DDL operation performed.
ddl_target_routine – Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.
ddl_target_table – Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.
default_dataset – See google.cloud.bigquery.job.QueryJobConfig.default_dataset.
destination_encryption_configuration – Custom encryption configuration for the destination table.
dml_stats
ended – Datetime at which the job finished.
error_result – Error information about the job as a whole.
errors – Information about individual errors generated by the job.
estimated_bytes_processed – Return the estimated number of bytes processed by the query.
etag – ETag for the job resource.
flatten_results – See google.cloud.bigquery.job.QueryJobConfig.flatten_results.
job_id – ID of the job.
job_type – Type of job.
labels – Labels for the job.
location – Location where the job runs.
maximum_billing_tier – See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.
maximum_bytes_billed – See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.
num_child_jobs – The number of child jobs executed.
num_dml_affected_rows – Return the number of DML rows affected by the job.
parent_job_id – Return the ID of the parent job.
path – URL path for the job’s APIs.
project – Project bound to the job.
query – The query text used in this query job.
query_id – [Preview] ID of a completed query.
query_parameters – See google.cloud.bigquery.job.QueryJobConfig.query_parameters.
query_plan – Return query plan from job statistics, if present.
range_partitioning – See google.cloud.bigquery.job.QueryJobConfig.range_partitioning.
referenced_tables – Return referenced tables from job statistics, if present.
reservation_usage – Job resource usage breakdown by reservation.
schema – The schema of the results.
schema_update_options – See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.
script_statistics – Statistics for a child job of a script.
search_stats – Returns a SearchStats object.
self_link – URL for the job resource.
session_info – [Preview] Information of the session if this job is part of one.
slot_millis – Slot-milliseconds used by this query job.
started – Datetime at which the job was started.
state – Status of the job.
statement_type – Return statement type from job statistics, if present.
table_definitions – See google.cloud.bigquery.job.QueryJobConfig.table_definitions.
time_partitioning – See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.
timeline – Return the query execution timeline from job statistics.
total_bytes_billed – Return total bytes billed from job statistics, if present.
total_bytes_processed – Return total bytes processed from job statistics, if present.
transaction_info – Information of the multi-statement transaction if this job is part of one.
undeclared_query_parameters – Return undeclared query parameters from job statistics, if present.
use_legacy_sql – See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.
use_query_cache – See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.
user_email – E-mail address of user who submitted the job.
write_disposition – See google.cloud.bigquery.job.QueryJobConfig.write_disposition.
.- add_done_callback(fn)¶
Add a callback to be executed when the operation is complete.
If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.
- Parameters
fn (Callable[Future]) – The callback to execute when the operation is complete.
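For example, a callback can inspect the finished job via exception(); by the time the callback fires, the job is complete, so that call no longer blocks. The helper below is hypothetical, not part of the library:

```python
def on_done(job):
    """Hypothetical done-callback: report success or failure of a job."""
    err = job.exception()  # non-blocking here: the job is already complete
    return "finished" if err is None else f"failed: {err}"

# With a real job: job.add_done_callback(on_done)
```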
- property allow_large_results¶
See google.cloud.bigquery.job.QueryJobConfig.allow_large_results.
- property billing_tier¶
Return billing tier from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.billing_tier
- Returns
Billing tier used by the job, or None if job is not yet complete.
- Return type
Optional[int]
- property cache_hit¶
Return whether or not query results were served from cache.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.cache_hit
- Returns
whether the query results were returned from cache, or None if job is not yet complete.
- Return type
Optional[bool]
- cancel(client=None, retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool¶
API call: cancel job via a POST request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating that the cancel request was sent.
- Return type
bool
- cancelled()¶
Check if the job has been cancelled.
This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.
- Returns
False
- Return type
bool
- property clustering_fields¶
See google.cloud.bigquery.job.QueryJobConfig.clustering_fields.
- property configuration: google.cloud.bigquery.job.query.QueryJobConfig¶
The configuration for this query job.
- property connection_properties: List[google.cloud.bigquery.query.ConnectionProperty]¶
See google.cloud.bigquery.job.QueryJobConfig.connection_properties.
New in version 2.29.0.
- property create_disposition¶
See google.cloud.bigquery.job.QueryJobConfig.create_disposition.
- property create_session: Optional[bool]¶
See google.cloud.bigquery.job.QueryJobConfig.create_session.
New in version 2.29.0.
- property created¶
Datetime at which the job was created.
- Returns
the creation time (None until set from the server).
- Return type
Optional[datetime.datetime]
- property ddl_target_routine¶
Return the DDL target routine, present for CREATE/DROP FUNCTION/PROCEDURE queries.
- Type
Optional[google.cloud.bigquery.routine.RoutineReference]
- property ddl_target_table¶
Return the DDL target table, present for CREATE/DROP TABLE/VIEW queries.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.ddl_target_table
- Type
Optional[google.cloud.bigquery.table.TableReference]
- property default_dataset¶
See google.cloud.bigquery.job.QueryJobConfig.default_dataset.
- property destination¶
- property destination_encryption_configuration¶
Custom encryption configuration for the destination table.
Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.
See google.cloud.bigquery.job.QueryJobConfig.destination_encryption_configuration.
- done(retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, reload: bool = True) → bool¶
Checks if the job is complete.
- Parameters
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC. If the job state is DONE, retrying is aborted early, as the job will not change anymore.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
reload (Optional[bool]) – If True, make an API call to refresh the job state of unfinished jobs before checking. Default True.
- Returns
True if the job is complete, False otherwise.
- Return type
bool
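Since done() only checks (and optionally refreshes) state, pairing it with a sleep gives a simple polling loop. A sketch (add_done_callback() or result() are usually more convenient):

```python
import time

def wait_until_done(job, interval=1.0, max_checks=60):
    """Poll job.done() until the job completes or max_checks is exhausted.

    Returns True if the job finished within the budget, False otherwise.
    """
    for _ in range(max_checks):
        if job.done():  # reload=True by default: refreshes state via the API
            return True
        time.sleep(interval)
    return False
```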
- property dry_run¶
- property ended¶
Datetime at which the job finished.
- Returns
the end time (None until set from the server).
- Return type
Optional[datetime.datetime]
- property error_result¶
Error information about the job as a whole.
- Returns
the error information (None until set from the server).
- Return type
Optional[Mapping]
- property errors¶
Information about individual errors generated by the job.
- Returns
the error information (None until set from the server).
- Return type
Optional[List[Mapping]]
- property estimated_bytes_processed¶
Return the estimated number of bytes processed by the query.
- Returns
the estimated number of bytes processed by the query, or None if job is not yet complete.
- Return type
Optional[int]
- property etag¶
ETag for the job resource.
- Returns
the ETag (None until set from the server).
- Return type
Optional[str]
- exception(timeout=<object object>)¶
Get the exception from the operation, blocking if necessary.
See the documentation for the result() method for details on how this method operates, as both result and this method rely on the exact same polling logic. The only difference is that this method does not accept retry and polling arguments but relies on the default ones instead.
- Parameters
timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.
- Returns
The operation’s error.
- Return type
Optional[google.api_core.GoogleAPICallError]
- exists(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool¶
API call: test for the existence of the job via a GET request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating existence of the job.
- Return type
bool
- property flatten_results¶
See google.cloud.bigquery.job.QueryJobConfig.flatten_results.
- classmethod from_api_repr(resource: dict, client: Client) → QueryJob[source]¶
Factory: construct a job given its API representation
- Parameters
resource (Dict) – dataset job representation returned from the API
client (google.cloud.bigquery.client.Client) – Client which holds credentials and project configuration for the dataset.
- Returns
Job parsed from resource.
- Return type
QueryJob
- property maximum_billing_tier¶
See google.cloud.bigquery.job.QueryJobConfig.maximum_billing_tier.
- property maximum_bytes_billed¶
See google.cloud.bigquery.job.QueryJobConfig.maximum_bytes_billed.
- property num_child_jobs¶
The number of child jobs executed.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs
- Returns
int
- property num_dml_affected_rows: Optional[int]¶
Return the number of DML rows affected by the job.
- Returns
number of DML rows affected by the job, or None if job is not yet complete.
- Return type
Optional[int]
- property parent_job_id¶
Return the ID of the parent job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id
- Returns
parent job id.
- Return type
Optional[str]
- property path¶
URL path for the job’s APIs.
- Returns
the path based on project and job ID.
- Return type
str
- property priority¶
- property project¶
Project bound to the job.
- Returns
the project (derived from the client).
- Return type
str
- property query¶
The query text used in this query job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationQuery.FIELDS.query
- Type
str
- property query_id: Optional[str]¶
[Preview] ID of a completed query.
This ID is auto-generated and not guaranteed to be populated.
- property query_parameters¶
See google.cloud.bigquery.job.QueryJobConfig.query_parameters.
- property query_plan¶
Return query plan from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.query_plan
- Returns
mappings describing the query plan, or an empty list if the query has not yet completed.
- Return type
List[google.cloud.bigquery.job.QueryPlanEntry]
- property range_partitioning¶
See google.cloud.bigquery.job.QueryJobConfig.range_partitioning.
- property referenced_tables¶
Return referenced tables from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.referenced_tables
- Returns
mappings describing the referenced tables, or an empty list if the query has not yet completed.
- Return type
List[Dict]
- reload(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)¶
API call: refresh job properties via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- property reservation_usage¶
Job resource usage breakdown by reservation.
- Returns
Reservation usage stats. Can be empty if not set from the server.
- Return type
List[ReservationUsage]
- result(page_size: Optional[int] = None, max_results: Optional[int] = None, retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, start_index: Optional[int] = None, job_retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>) → Union[RowIterator, google.cloud.bigquery.table._EmptyRowIterator][source]¶
Start the job and wait for it to complete and get the result.
- Parameters
page_size (Optional[int]) – The maximum number of rows in each page of results from this request. Non-positive values are ignored.
max_results (Optional[int]) – The maximum total number of rows from this request.
retry (Optional[google.api_core.retry.Retry]) – How to retry the call that retrieves rows. This only applies to making RPC calls. It isn’t used to retry failed jobs. This has a reasonable default that should only be overridden with care. If the job state is DONE, retrying is aborted early even if the results are not available, as this will not change anymore.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.
start_index (Optional[int]) – The zero-based index of the starting row to read.
job_retry (Optional[google.api_core.retry.Retry]) – How to retry failed jobs. The default retries rate-limit-exceeded errors. Passing None disables job retry. Not all jobs can be retried. If job_id was provided to the query that created this job, then the job returned by the query will not be retryable, and an exception will be raised if a non-None, non-default job_retry is also provided.
- Returns
Iterator of Row data. During each page, the iterator will have the total_rows attribute set, which counts the total number of rows in the result set (this is distinct from the total number of rows in the current page: iterator.page.num_items). If the query is a special query that produces no results, e.g. a DDL query, an _EmptyRowIterator instance is returned.
- Return type
Union[RowIterator, google.cloud.bigquery.table._EmptyRowIterator]
- Raises
google.cloud.exceptions.GoogleAPICallError – If the job failed and retries aren’t successful.
concurrent.futures.TimeoutError – If the job did not complete in the given timeout.
TypeError – If a non-None and non-default job_retry is provided and the job is not retryable.
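For example, paging through results while tracking overall progress via total_rows. A sketch, assuming `job` is a QueryJob and the iterator follows the google-api-core page-iterator interface (pages with a num_items attribute):

```python
def count_rows_by_page(job, page_size=1000):
    """Consume results page by page; return (total_rows, per-page counts)."""
    iterator = job.result(page_size=page_size)
    # total_rows counts the whole result set, not just the current page.
    per_page = [page.num_items for page in iterator.pages]
    return iterator.total_rows, per_page
```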
- running()¶
True if the operation is currently running.
- property schema: Optional[List[google.cloud.bigquery.schema.SchemaField]]¶
The schema of the results.
Present only for successful dry run of non-legacy SQL queries.
- property schema_update_options¶
See google.cloud.bigquery.job.QueryJobConfig.schema_update_options.
- property script_statistics: Optional[google.cloud.bigquery.job.base.ScriptStatistics]¶
Statistics for a child job of a script.
- property search_stats: Optional[google.cloud.bigquery.job.query.SearchStats]¶
Returns a SearchStats object.
- property self_link¶
URL for the job resource.
- Returns
the URL (None until set from the server).
- Return type
Optional[str]
- property session_info: Optional[google.cloud.bigquery.job.base.SessionInfo]¶
[Preview] Information of the session if this job is part of one.
New in version 2.29.0.
- set_exception(exception)¶
Set the Future’s exception.
- set_result(result)¶
Set the Future’s result.
- property started¶
Datetime at which the job was started.
- Returns
the start time (None until set from the server).
- Return type
Optional[datetime.datetime]
- property state¶
Status of the job.
- Returns
the state (None until set from the server).
- Return type
Optional[str]
- property statement_type¶
Return statement type from job statistics, if present.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics2.FIELDS.statement_type
- Returns
type of statement used by the job, or None if job is not yet complete.
- Return type
Optional[str]
- property table_definitions¶
See google.cloud.bigquery.job.QueryJobConfig.table_definitions.
- property time_partitioning¶
See google.cloud.bigquery.job.QueryJobConfig.time_partitioning.
- property timeline¶
Return the query execution timeline from job statistics.
- Type
List(TimelineEntry)
- to_arrow(progress_bar_type: Optional[str] = None, bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None) → pyarrow.Table[source]¶
[Beta] Create a pyarrow.Table by loading all pages of a table or query.
- Parameters
progress_bar_type (Optional[str]) – If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature. Possible values of progress_bar_type include:
None – No progress bar.
'tqdm' – Use the tqdm.tqdm() function to print a progress bar to sys.stdout.
'tqdm_notebook' – Use the tqdm.notebook.tqdm() function to display a progress bar as a Jupyter notebook widget.
'tqdm_gui' – Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the google-cloud-bigquery-storage library. Reading from a specific partition or snapshot is not currently supported by this method.
create_bqstorage_client (Optional[bool]) – If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. New in version 1.24.0.
max_results (Optional[int]) – Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.
- Returns
A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.
- Return type
pyarrow.Table
- Raises
ValueError – If the pyarrow library cannot be imported.
New in version 1.17.0.
- to_dataframe(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Optional[Dict[str, Any]] = None, progress_bar_type: Optional[str] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_as_object: bool = False, bool_dtype: Optional[Any] = <DefaultPandasDTypes.BOOL_DTYPE: <object object>>, int_dtype: Optional[Any] = <DefaultPandasDTypes.INT_DTYPE: <object object>>, float_dtype: Optional[Any] = None, string_dtype: Optional[Any] = None, date_dtype: Optional[Any] = <DefaultPandasDTypes.DATE_DTYPE: <object object>>, datetime_dtype: Optional[Any] = None, time_dtype: Optional[Any] = <DefaultPandasDTypes.TIME_DTYPE: <object object>>, timestamp_dtype: Optional[Any] = None) → pandas.DataFrame[source]¶
Return a pandas DataFrame from a QueryJob.
- Parameters
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary of column names and pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
progress_bar_type (Optional[str]) – If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature. See to_dataframe() for details. New in version 1.11.0.
create_bqstorage_client (Optional[bool]) – If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. New in version 1.24.0.
max_results (Optional[int]) – Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.
geography_as_object (Optional[bool]) – If True, convert GEOGRAPHY data to shapely geometry objects. If False (default), don’t cast geography data to shapely geometry objects. New in version 2.24.0.
bool_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.BooleanDtype()) to convert BigQuery Boolean type, instead of relying on the default pandas.BooleanDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("bool"). BigQuery Boolean type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#boolean_type New in version 3.8.0.
int_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.Int64Dtype()) to convert BigQuery Integer types, instead of relying on the default pandas.Int64Dtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("int64"). A list of BigQuery Integer types can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#integer_types New in version 3.8.0.
float_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.Float32Dtype()) to convert BigQuery Float type, instead of relying on the default numpy.dtype("float64"). If you explicitly set the value to None, then the data type will be numpy.dtype("float64"). BigQuery Float type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#floating_point_types New in version 3.8.0.
string_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.StringDtype()) to convert BigQuery String type, instead of relying on the default numpy.dtype("object"). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery String type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#string_type New in version 3.8.0.
date_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.date32())) to convert BigQuery Date type, instead of relying on the default db_dtypes.DateDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Date type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#date_type New in version 3.10.0.
datetime_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us"))) to convert BigQuery Datetime type, instead of relying on the default numpy.dtype("datetime64[ns]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns]") or object if out of bound. BigQuery Datetime type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#datetime_type New in version 3.10.0.
time_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.time64("us"))) to convert BigQuery Time type, instead of relying on the default db_dtypes.TimeDtype(). If you explicitly set the value to None, then the data type will be numpy.dtype("object"). BigQuery Time type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#time_type New in version 3.10.0.
timestamp_dtype (Optional[pandas.Series.dtype, None]) – If set, indicate a pandas ExtensionDtype (e.g. pandas.ArrowDtype(pyarrow.timestamp("us", tz="UTC"))) to convert BigQuery Timestamp type, instead of relying on the default numpy.dtype("datetime64[ns, UTC]"). If you explicitly set the value to None, then the data type will be numpy.dtype("datetime64[ns, UTC]") or object if out of bound. BigQuery Timestamp type can be found at: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#timestamp_type New in version 3.10.0.
- Returns
A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.
- Return type
pandas.DataFrame
- Raises
ValueError – If the pandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported. Also if geography_as_object is True, but the shapely library cannot be imported.
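A sketch of the dtype overrides, wrapped in a helper so it can be exercised without a live job. The column name "amount" is illustrative, and the real call requires pandas to be installed:

```python
def to_typed_dataframe(job):
    """Sketch: combine a per-column dtypes override with a per-type override
    when calling QueryJob.to_dataframe(). "amount" is a hypothetical column.
    """
    return job.to_dataframe(
        dtypes={"amount": "float32"},  # per-column override for "amount"
        bool_dtype=None,               # fall back to numpy.dtype("bool")
    )
```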
- to_geodataframe(bqstorage_client: Optional[bigquery_storage.BigQueryReadClient] = None, dtypes: Optional[Dict[str, Any]] = None, progress_bar_type: Optional[str] = None, create_bqstorage_client: bool = True, max_results: Optional[int] = None, geography_column: Optional[str] = None) → geopandas.GeoDataFrame[source]¶
Return a GeoPandas GeoDataFrame from a QueryJob.
- Parameters
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API. This method requires the fastavro and google-cloud-bigquery-storage libraries. Reading from a specific partition or snapshot is not currently supported by this method.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary of column names and pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
progress_bar_type (Optional[str]) – If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature. See to_dataframe() for details. New in version 1.11.0.
create_bqstorage_client (Optional[bool]) – If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied. New in version 1.24.0.
max_results (Optional[int]) – Maximum number of rows to include in the result. No limit by default. New in version 2.21.0.
geography_column (Optional[str]) – If there is more than one GEOGRAPHY column, identifies which one to use to construct a GeoPandas GeoDataFrame. This option can be omitted if there’s only one GEOGRAPHY column.
- Returns
A geopandas.GeoDataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.
- Return type
geopandas.GeoDataFrame
- Raises
ValueError – If the geopandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported.
New in version 2.24.0.
- property total_bytes_billed¶
Return total bytes billed from job statistics, if present.
- Returns
Total bytes billed by the job, or None if job is not yet complete.
- Return type
Optional[int]
- property total_bytes_processed¶
Return total bytes processed from job statistics, if present.
- Returns
Total bytes processed by the job, or None if job is not yet complete.
- Return type
Optional[int]
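These statistics make it easy to estimate on-demand cost after a job completes. A hedged sketch; the $6.25/TiB rate is illustrative only, so consult current BigQuery on-demand pricing:

```python
def estimate_cost_usd(job, usd_per_tib=6.25):
    """Estimate on-demand cost from QueryJob.total_bytes_billed.

    Returns None while the job is incomplete. The default rate is an
    assumption for illustration; check current BigQuery pricing.
    """
    billed = job.total_bytes_billed
    if billed is None:
        return None
    return billed / 2**40 * usd_per_tib  # bytes -> TiB -> USD
```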
- property transaction_info: Optional[google.cloud.bigquery.job.base.TransactionInfo]¶
Information of the multi-statement transaction if this job is part of one.
Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the google.cloud.bigquery.client.Client.list_jobs() method with the parent_job parameter to iterate over child jobs.
New in version 2.24.0.
- property udf_resources¶
- property undeclared_query_parameters¶
Return undeclared query parameters from job statistics, if present.
- Returns
Undeclared parameters, or an empty list if the query has not yet completed.
- Return type
List[Union[google.cloud.bigquery.query.ArrayQueryParameter, google.cloud.bigquery.query.ScalarQueryParameter, google.cloud.bigquery.query.StructQueryParameter]]
- property use_legacy_sql¶
See google.cloud.bigquery.job.QueryJobConfig.use_legacy_sql.
- property use_query_cache¶
See google.cloud.bigquery.job.QueryJobConfig.use_query_cache.
- property user_email¶
E-mail address of user who submitted the job.
- Returns
the e-mail address (None until set from the server).
- Return type
Optional[str]
- property write_disposition¶
See google.cloud.bigquery.job.QueryJobConfig.write_disposition.