google.cloud.bigquery.job.ExtractJob
-
class google.cloud.bigquery.job.ExtractJob(job_id, source, destination_uris, client, job_config=None)
Asynchronous job: extract data from a table into Cloud Storage.
- Parameters
job_id (str) – the job’s ID.
source (Union[google.cloud.bigquery.table.TableReference, google.cloud.bigquery.model.ModelReference]) – Table or model from which data is to be extracted.
destination_uris (List[str]) – URIs describing where the extracted data will be written in Cloud Storage, using the format gs://<bucket_name>/<object_name_or_glob>.
client (google.cloud.bigquery.client.Client) – A client which holds credentials and project configuration.
job_config (Optional[google.cloud.bigquery.job.ExtractJobConfig]) – Extra configuration options for the extract job.
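In practice an ExtractJob is usually obtained from Client.extract_table rather than constructed directly. The following is a minimal sketch under that assumption. The helper name export_table_to_gcs is hypothetical and not part of the library; it relies only on Client.extract_table() and ExtractJob.result().

```python
def export_table_to_gcs(client, table_id, destination_uri, location=None):
    """Start an extract job and block until it finishes.

    `client` is expected to behave like google.cloud.bigquery.Client.
    This helper is illustrative only (hypothetical, not a library function).
    """
    # Cheap sanity check against the documented URI format.
    if not destination_uri.startswith("gs://"):
        raise ValueError(
            "destination must look like gs://<bucket_name>/<object_name_or_glob>"
        )

    # extract_table() returns an ExtractJob that has already been started.
    extract_job = client.extract_table(table_id, destination_uri, location=location)

    # result() waits for completion and returns the job instance itself.
    return extract_job.result()
```

With a real client this might be called as export_table_to_gcs(bigquery.Client(), "my-project.my_dataset.my_table", "gs://my-bucket/export-*.csv"), where the table ID and bucket are placeholders.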
-
__init__(job_id, source, destination_uris, client, job_config=None)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(job_id, source, destination_uris, …) – Initialize self.
add_done_callback(fn) – Add a callback to be executed when the operation is complete.
cancel([client, retry, timeout]) – API call: cancel job via a POST request.
cancelled() – Check if the job has been cancelled.
done([retry, timeout, reload]) – Checks if the job is complete.
exception([timeout]) – Get the exception from the operation, blocking if necessary.
exists([client, retry, timeout]) – API call: test for the existence of the job via a GET request.
from_api_repr(resource, client) – Factory: construct a job given its API representation.
reload([client, retry, timeout]) – API call: refresh job properties via a GET request.
result([retry, timeout]) – Start the job and wait for it to complete and get the result.
running() – True if the operation is currently running.
set_exception(exception) – Set the Future’s exception.
set_result(result) – Set the Future’s result.
to_api_repr() – Generate a resource for _begin().
Attributes
created – Datetime at which the job was created.
destination_format – See google.cloud.bigquery.job.ExtractJobConfig.destination_format.
destination_uri_file_counts – Return file counts from job statistics, if present.
destination_uris – URIs describing where the extracted data will be written in Cloud Storage, using the format gs://<bucket_name>/<object_name_or_glob>.
ended – Datetime at which the job finished.
error_result – Error information about the job as a whole.
errors – Information about individual errors generated by the job.
etag – ETag for the job resource.
field_delimiter – See google.cloud.bigquery.job.ExtractJobConfig.field_delimiter.
job_id – ID of the job.
job_type – Type of job.
labels – Labels for the job.
location – Location where the job runs.
num_child_jobs – The number of child jobs executed.
parent_job_id – Return the ID of the parent job.
path – URL path for the job’s APIs.
print_header – See google.cloud.bigquery.job.ExtractJobConfig.print_header.
project – Project bound to the job.
script_statistics
self_link – URL for the job resource.
source – Table or model from which data is to be extracted.
started – Datetime at which the job was started.
state – Status of the job.
user_email – E-mail address of user who submitted the job.
-
add_done_callback(fn)
Add a callback to be executed when the operation is complete.
If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.
- Parameters
fn (Callable[Future]) – The callback to execute when the operation is complete.
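The callback pattern can be tried without a BigQuery project by using the standard library's concurrent.futures.Future as a stand-in, since the job methods documented here mirror a Future interface. This is only a sketch of the calling convention: stand_in is not a real job, and with a real ExtractJob the callback would instead be triggered by the background polling thread described above.

```python
from concurrent.futures import Future

completed = []


def on_done(future):
    # The callback receives the finished future (for a real job, the job object).
    completed.append(future.result())


stand_in = Future()  # stand-in for an ExtractJob; not a real job
stand_in.add_done_callback(on_done)

# Completing the future fires every registered callback.
stand_in.set_result("extract finished")

print(completed)  # prints ['extract finished']
```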
-
cancel(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)
API call: cancel job via a POST request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating that the cancel request was sent.
- Return type
bool
-
cancelled()
Check if the job has been cancelled.
This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.
- Returns
False
- Return type
bool
-
property compression
See google.cloud.bigquery.job.ExtractJobConfig.compression.
-
property created
Datetime at which the job was created.
- Returns
the creation time (None until set from the server).
- Return type
Optional[datetime.datetime]
-
property destination_format
See google.cloud.bigquery.job.ExtractJobConfig.destination_format.
-
property destination_uri_file_counts
Return file counts from job statistics, if present.
- Returns
A list of integer counts, each representing the number of files per destination URI or URI pattern specified in the extract configuration. These values will be in the same order as the URIs specified in the ‘destinationUris’ field. Returns None if job is not yet complete.
- Return type
List[int]
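Because the counts come back in the same order as the destination URIs, the two lists can be paired up directly. A small sketch, assuming a job-like object that exposes destination_uris and destination_uri_file_counts; the helper name files_per_destination is hypothetical.

```python
def files_per_destination(job):
    """Map each destination URI to the number of files written for it.

    Returns an empty dict while the counts are still None
    (for example, before the job has completed).
    """
    counts = job.destination_uri_file_counts
    if counts is None:
        return {}
    # Counts are reported in the same order as the destination URIs.
    return dict(zip(job.destination_uris, counts))
```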
-
property destination_uris
URIs describing where the extracted data will be written in Cloud Storage, using the format gs://<bucket_name>/<object_name_or_glob>.
- Type
List[str]
-
done(retry=<google.api_core.retry.Retry object>, timeout=None, reload=True)
Checks if the job is complete.
- Parameters
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
reload (Optional[bool]) – If True, make an API call to refresh the job state of unfinished jobs before checking. Default True.
- Returns
True if the job is complete, False otherwise.
- Return type
bool
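When blocking in result() is not desirable, done() can be polled instead. The loop below is one possible sketch, not the library's own waiting logic; the helper name wait_until_done and its poll_interval and max_wait parameters are hypothetical.

```python
import time


def wait_until_done(job, poll_interval=1.0, max_wait=60.0):
    """Poll job.done() until it returns True or max_wait seconds elapse.

    Returns True if the job completed, False if the wait timed out.
    """
    deadline = time.monotonic() + max_wait
    while True:
        # With the default reload=True, each call refreshes the job state.
        if job.done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```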
-
property ended
Datetime at which the job finished.
- Returns
the end time (None until set from the server).
- Return type
Optional[datetime.datetime]
-
property error_result
Error information about the job as a whole.
- Returns
the error information (None until set from the server).
- Return type
Optional[Mapping]
-
property errors
Information about individual errors generated by the job.
- Returns
the error information (None until set from the server).
- Return type
Optional[List[Mapping]]
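Both error_result and errors are plain mappings, so they can be turned into log lines with ordinary dictionary access. The sketch below assumes the mappings carry "reason" and "message" keys, as in the BigQuery REST error format; that key layout is an assumption here, and the helper name describe_failure is hypothetical.

```python
def describe_failure(job):
    """Return human-readable error lines for a job-like object."""
    lines = []
    if job.error_result is not None:
        # Assumed key: "message" (falls back to a generic text if absent).
        message = job.error_result.get("message", "unknown error")
        lines.append("job failed: {}".format(message))
    # errors may be None until the server reports something.
    for err in job.errors or []:
        lines.append(
            "{}: {}".format(err.get("reason", "unknown"), err.get("message", ""))
        )
    return lines
```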
-
property etag
ETag for the job resource.
- Returns
the ETag (None until set from the server).
- Return type
Optional[str]
-
exception(timeout=None)
Get the exception from the operation, blocking if necessary.
- Parameters
timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.
- Returns
The operation’s error.
- Return type
Optional[google.api_core.GoogleAPICallError]
-
exists(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)
API call: test for the existence of the job via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating existence of the job.
- Return type
bool
-
property field_delimiter
See google.cloud.bigquery.job.ExtractJobConfig.field_delimiter.
-
classmethod from_api_repr(resource, client)
Factory: construct a job given its API representation.
- Parameters
resource (Dict) – Job representation returned from the API.
client (google.cloud.bigquery.client.Client) – Client which holds credentials and project configuration for the dataset.
- Returns
Job parsed from resource.
- Return type
google.cloud.bigquery.job.ExtractJob
-
property num_child_jobs
The number of child jobs executed.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs
- Return type
int
-
property parent_job_id
Return the ID of the parent job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id
- Returns
parent job id.
- Return type
Optional[str]
-
property path
URL path for the job’s APIs.
- Returns
the path based on project and job ID.
- Return type
str
-
property print_header
See google.cloud.bigquery.job.ExtractJobConfig.print_header.
-
property project
Project bound to the job.
- Returns
the project (derived from the client).
- Return type
str
-
reload(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)
API call: refresh job properties via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
-
result(retry=<google.api_core.retry.Retry object>, timeout=None)
Start the job and wait for it to complete and get the result.
- Parameters
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.
- Returns
This instance.
- Return type
_AsyncJob
- Raises
google.cloud.exceptions.GoogleAPICallError – if the job failed.
concurrent.futures.TimeoutError – if the job did not complete in the given timeout.
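The two documented failure modes of result() can be separated with an ordinary try/except. A minimal sketch, assuming a job-like object; the helper name wait_for_extract is hypothetical, and the broad Exception clause stands in for GoogleAPICallError so the snippet does not depend on the client library being installed.

```python
import concurrent.futures


def wait_for_extract(job, timeout=None):
    """Wait on job.result() and report the outcome as (finished, error)."""
    try:
        job.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # The job did not complete in time; it may still be running.
        return False, None
    except Exception as exc:  # stand-in for GoogleAPICallError
        # The job finished but failed.
        return True, exc
    return True, None
```

The timeout clause has to come first, because concurrent.futures.TimeoutError is itself a subclass of Exception.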
-
running()
True if the operation is currently running.
-
property self_link
URL for the job resource.
- Returns
the URL (None until set from the server).
- Return type
Optional[str]
-
set_exception(exception)
Set the Future’s exception.
-
set_result(result)
Set the Future’s result.
-
property source
Table or model from which data is to be extracted.
-
property started
Datetime at which the job was started.
- Returns
the start time (None until set from the server).
- Return type
Optional[datetime.datetime]
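Since started and ended are both datetime values once the server has set them, the run time of a job is a simple subtraction. A short sketch, assuming a job-like object; the helper name job_duration is hypothetical.

```python
def job_duration(job):
    """Return ended - started as a timedelta, or None if either is unset."""
    if job.started is None or job.ended is None:
        # Timestamps stay None until they are set from the server.
        return None
    return job.ended - job.started
```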