google.cloud.bigquery.job.LoadJob¶
- class google.cloud.bigquery.job.LoadJob(job_id, source_uris, destination, client, job_config=None)[source]¶
Asynchronous job for loading data into a table.
Can load from Google Cloud Storage URIs or from a file.
- Parameters
job_id (str) – the job’s ID
source_uris (Optional[Sequence[str]]) – URIs of one or more data files to be loaded. See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.source_uris for supported URI formats. Pass None for jobs that load from a file.
destination (google.cloud.bigquery.table.TableReference) – reference to table into which data is to be loaded.
client (google.cloud.bigquery.client.Client) – A client which holds credentials and project configuration for the dataset (which requires a project).
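A load job is usually obtained from a client rather than constructed directly. The sketch below assumes `client` behaves like google.cloud.bigquery.client.Client and uses its load_table_from_uri() method; the helper name `load_from_uris` and the bucket, project, and table names in the comments are placeholders, not part of the library.

```python
def load_from_uris(client, source_uris, table_id, job_config=None, timeout=None):
    """Start a load job from Cloud Storage URIs and block until it finishes.

    `client` is assumed to behave like google.cloud.bigquery.client.Client.
    """
    # load_table_from_uri() creates the LoadJob and starts it.
    job = client.load_table_from_uri(source_uris, table_id, job_config=job_config)
    # result() waits for completion and returns the same job instance.
    job.result(timeout=timeout)
    return job


# Typical use (needs google-cloud-bigquery and credentials; names are placeholders):
#
#   from google.cloud import bigquery
#
#   client = bigquery.Client()
#   config = bigquery.LoadJobConfig(
#       source_format=bigquery.SourceFormat.CSV,
#       skip_leading_rows=1,
#       autodetect=True,
#   )
#   job = load_from_uris(
#       client, ["gs://my-bucket/data-*.csv"], "my-project.my_dataset.my_table", config
#   )
#   print(job.output_rows)
```

Passing the client in keeps the helper easy to exercise with a stub object in tests.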
- __init__(job_id, source_uris, destination, client, job_config=None)[source]¶
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(job_id, source_uris, destination, …) – Initialize self.
add_done_callback(fn) – Add a callback to be executed when the operation is complete.
cancel([client, retry, timeout]) – API call: cancel job via a POST request.
cancelled() – Check if the job has been cancelled.
done([retry, timeout, reload]) – Checks if the job is complete.
exception([timeout]) – Get the exception from the operation, blocking if necessary.
exists([client, retry, timeout]) – API call: test for the existence of the job via a GET request.
from_api_repr(resource, client) – Factory: construct a job given its API representation.
reload([client, retry, timeout]) – API call: refresh job properties via a GET request.
result([retry, timeout]) – Start the job and wait for it to complete and get the result.
running() – True if the operation is currently running.
set_exception(exception) – Set the Future’s exception.
set_result(result) – Set the Future’s result.
to_api_repr() – Generate a resource for _begin().
Attributes
allow_jagged_rows – See google.cloud.bigquery.job.LoadJobConfig.allow_jagged_rows.
allow_quoted_newlines – See google.cloud.bigquery.job.LoadJobConfig.allow_quoted_newlines.
clustering_fields – See google.cloud.bigquery.job.LoadJobConfig.clustering_fields.
configuration – The configuration for this load job.
connection_properties – See google.cloud.bigquery.job.LoadJobConfig.connection_properties.
create_disposition – See google.cloud.bigquery.job.LoadJobConfig.create_disposition.
created – Datetime at which the job was created.
destination – Table where loaded rows are written.
destination_encryption_configuration – Custom encryption configuration for the destination table.
destination_table_description – Optional[str] description given to destination table.
destination_table_friendly_name – Optional[str] name given to destination table.
ended – Datetime at which the job finished.
error_result – Error information about the job as a whole.
errors – Information about individual errors generated by the job.
etag – ETag for the job resource.
field_delimiter – See google.cloud.bigquery.job.LoadJobConfig.field_delimiter.
ignore_unknown_values – See google.cloud.bigquery.job.LoadJobConfig.ignore_unknown_values.
input_file_bytes – Count of bytes loaded from source files.
input_files – Count of source files.
job_id – ID of the job.
job_type – Type of job.
labels – Labels for the job.
location – Location where the job runs.
max_bad_records – See google.cloud.bigquery.job.LoadJobConfig.max_bad_records.
num_child_jobs – The number of child jobs executed.
output_bytes – Count of bytes saved to destination table.
output_rows – Count of rows saved to destination table.
parent_job_id – Return the ID of the parent job.
path – URL path for the job’s APIs.
project – Project bound to the job.
quote_character – See google.cloud.bigquery.job.LoadJobConfig.quote_character.
range_partitioning – See google.cloud.bigquery.job.LoadJobConfig.range_partitioning.
reference_file_schema_uri – See google.cloud.bigquery.job.LoadJobConfig.reference_file_schema_uri.
reservation_usage – Job resource usage breakdown by reservation.
schema_update_options – See google.cloud.bigquery.job.LoadJobConfig.schema_update_options.
script_statistics – Statistics for a child job of a script.
self_link – URL for the job resource.
session_info – [Preview] Information of the session if this job is part of one.
skip_leading_rows – See google.cloud.bigquery.job.LoadJobConfig.skip_leading_rows.
source_uris – URIs of data files to be loaded.
started – Datetime at which the job was started.
state – Status of the job.
time_partitioning – See google.cloud.bigquery.job.LoadJobConfig.time_partitioning.
transaction_info – Information of the multi-statement transaction if this job is part of one.
use_avro_logical_types – See google.cloud.bigquery.job.LoadJobConfig.use_avro_logical_types.
user_email – E-mail address of user who submitted the job.
write_disposition – See google.cloud.bigquery.job.LoadJobConfig.write_disposition.
- add_done_callback(fn)¶
Add a callback to be executed when the operation is complete.
If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.
- Parameters
fn (Callable[Future]) – The callback to execute when the operation is complete.
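As a rough illustration of the callback contract, the sketch below registers a function that receives the finished job and records whether it succeeded. The helper name `collect_outcome` and the `outcomes` list are hypothetical; the only calls made on the job are add_done_callback() and the job_id and error_result attributes.

```python
def collect_outcome(job, outcomes):
    """Register a callback that appends (job_id, status) to `outcomes` on completion."""

    def _on_done(finished_job):
        # The callback is invoked with the future itself, i.e. the job.
        status = "failed" if finished_job.error_result else "succeeded"
        outcomes.append((finished_job.job_id, status))

    job.add_done_callback(_on_done)
```

Because the callback may run on a helper thread, it is safest to keep it short and limited to thread-safe operations such as appending to a list.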
- property allow_jagged_rows¶
See google.cloud.bigquery.job.LoadJobConfig.allow_jagged_rows.
- property allow_quoted_newlines¶
See google.cloud.bigquery.job.LoadJobConfig.allow_quoted_newlines.
- property autodetect¶
- cancel(client=None, retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool¶
API call: cancel job via a POST request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating that the cancel request was sent.
- Return type
bool
- cancelled()¶
Check if the job has been cancelled.
This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.
- Returns
False
- Return type
bool
- property clustering_fields¶
See google.cloud.bigquery.job.LoadJobConfig.clustering_fields.
- property configuration: google.cloud.bigquery.job.load.LoadJobConfig¶
The configuration for this load job.
- property connection_properties: List[google.cloud.bigquery.query.ConnectionProperty]¶
See google.cloud.bigquery.job.LoadJobConfig.connection_properties.
New in version 3.7.0.
- property create_disposition¶
See google.cloud.bigquery.job.LoadJobConfig.create_disposition.
- property create_session: Optional[bool]¶
See google.cloud.bigquery.job.LoadJobConfig.create_session.
New in version 3.7.0.
- property created¶
Datetime at which the job was created.
- Returns
the creation time (None until set from the server).
- Return type
Optional[datetime.datetime]
- property destination¶
table where loaded rows are written
- property destination_encryption_configuration¶
Custom encryption configuration for the destination table.
Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.
See google.cloud.bigquery.job.LoadJobConfig.destination_encryption_configuration.
- property destination_table_description¶
Optional[str] description given to destination table.
- property destination_table_friendly_name¶
Optional[str] name given to destination table.
- done(retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None, reload: bool = True) → bool¶
Checks if the job is complete.
- Parameters
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC. If the job state is DONE, retrying is aborted early, as the job will not change anymore.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
reload (Optional[bool]) – If True, make an API call to refresh the job state of unfinished jobs before checking. Default True.
- Returns
True if the job is complete, False otherwise.
- Return type
bool
- property encoding¶
- property ended¶
Datetime at which the job finished.
- Returns
the end time (None until set from the server).
- Return type
Optional[datetime.datetime]
- property error_result¶
Error information about the job as a whole.
- Returns
the error information (None until set from the server).
- Return type
Optional[Mapping]
- property errors¶
Information about individual errors generated by the job.
- Returns
the error information (None until set from the server).
- Return type
Optional[List[Mapping]]
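The two error properties can be flattened into readable lines for logging. This is a minimal sketch, assuming each mapping follows the BigQuery error structure with `reason`, `message`, and optional `location` keys; the helper names are illustrative only.

```python
def _format_error(err):
    """Render one error mapping as '[reason] message (at location)'."""
    text = "[{}] {}".format(err.get("reason", "unknown"), err.get("message", ""))
    if err.get("location"):
        text += " (at {})".format(err["location"])
    return text


def describe_errors(job):
    """Return printable lines for job.error_result and job.errors."""
    lines = []
    # error_result is None until set from the server, and stays None on success.
    if job.error_result:
        lines.append("fatal: " + _format_error(job.error_result))
    # errors may also be None, so fall back to an empty list.
    for err in job.errors or []:
        lines.append("detail: " + _format_error(err))
    return lines
```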
- property etag¶
ETag for the job resource.
- Returns
the ETag (None until set from the server).
- Return type
Optional[str]
- exception(timeout=<object object>)¶
Get the exception from the operation, blocking if necessary.
See the documentation for the result() method for details on how this method operates, as both result and this method rely on the exact same polling logic. The only difference is that this method does not accept retry and polling arguments but relies on the default ones instead.
- Parameters
timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.
- Returns
The operation’s error.
- Return type
Optional[google.api_core.GoogleAPICallError]
- exists(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → bool¶
API call: test for the existence of the job via a GET request
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- Returns
Boolean indicating existence of the job.
- Return type
bool
- property field_delimiter¶
See google.cloud.bigquery.job.LoadJobConfig.field_delimiter.
- classmethod from_api_repr(resource: dict, client) → google.cloud.bigquery.job.load.LoadJob[source]¶
Factory: construct a job given its API representation
Note
This method assumes that the project found in the resource matches the client’s project.
- Parameters
resource (Dict) – dataset job representation returned from the API
client (google.cloud.bigquery.client.Client) – Client which holds credentials and project configuration for the dataset.
- Returns
Job parsed from resource.
- Return type
google.cloud.bigquery.job.load.LoadJob
- property ignore_unknown_values¶
See google.cloud.bigquery.job.LoadJobConfig.ignore_unknown_values.
- property input_file_bytes¶
Count of bytes loaded from source files.
- Returns
the count (None until set from the server).
- Return type
Optional[int]
- Raises
ValueError – for invalid value types.
- property input_files¶
Count of source files.
- Returns
the count (None until set from the server).
- Return type
Optional[int]
- property max_bad_records¶
See google.cloud.bigquery.job.LoadJobConfig.max_bad_records.
- property null_marker¶
- property num_child_jobs¶
The number of child jobs executed.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs
- Returns
int
- property output_bytes¶
Count of bytes saved to destination table.
- Returns
the count (None until set from the server).
- Return type
Optional[int]
- property output_rows¶
Count of rows saved to destination table.
- Returns
the count (None until set from the server).
- Return type
Optional[int]
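The four load counters (input_files, input_file_bytes, output_bytes, output_rows) are all None until the server reports them. A small sketch of collecting whichever values are available, using a hypothetical helper name:

```python
def load_statistics(job):
    """Return the load counters that the server has populated so far."""
    stats = {
        "input_files": job.input_files,
        "input_file_bytes": job.input_file_bytes,
        "output_rows": job.output_rows,
        "output_bytes": job.output_bytes,
    }
    # Drop counters that are still None so callers do not mistake them for zero.
    return {name: value for name, value in stats.items() if value is not None}
```

Filtering on `is not None` rather than truthiness keeps a genuine count of 0 in the result.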
- property parent_job_id¶
Return the ID of the parent job.
See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id
- Returns
parent job id.
- Return type
Optional[str]
- property path¶
URL path for the job’s APIs.
- Returns
the path based on project and job ID.
- Return type
str
- property project¶
Project bound to the job.
- Returns
the project (derived from the client).
- Return type
str
- property quote_character¶
See google.cloud.bigquery.job.LoadJobConfig.quote_character.
- property range_partitioning¶
See google.cloud.bigquery.job.LoadJobConfig.range_partitioning.
- property reference_file_schema_uri¶
See google.cloud.bigquery.job.LoadJobConfig.reference_file_schema_uri.
- reload(client=None, retry: google.api_core.retry.Retry = <google.api_core.retry.Retry object>, timeout: Optional[float] = None)¶
API call: refresh job properties via a GET request.
See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get
- Parameters
client (Optional[google.cloud.bigquery.client.Client]) – the client to use. If not passed, falls back to the client stored on the current dataset.
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.
- property reservation_usage¶
Job resource usage breakdown by reservation.
- Returns
Reservation usage stats. Can be empty if not set from the server.
- Return type
List[google.cloud.bigquery.job.ReservationUsage]
- result(retry: Optional[google.api_core.retry.Retry] = <google.api_core.retry.Retry object>, timeout: Optional[float] = None) → google.cloud.bigquery.job.base._AsyncJob¶
Start the job and wait for it to complete and get the result.
- Parameters
retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC. If the job state is DONE, retrying is aborted early, as the job will not change anymore.
timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.
- Returns
This instance.
- Return type
_AsyncJob
- Raises
google.cloud.exceptions.GoogleAPICallError – if the job failed.
concurrent.futures.TimeoutError – if the job did not complete in the given timeout.
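One way to use result() with a bounded wait is to treat the timeout case separately from a failed job. The sketch below relies only on the Raises entries listed above; the helper name `wait_or_give_up` is illustrative.

```python
import concurrent.futures


def wait_or_give_up(job, timeout):
    """Return True if the job completed within `timeout`, False if the wait timed out.

    A failed job still raises its error from result(); that exception is
    deliberately left to propagate to the caller.
    """
    try:
        job.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        return False
    return True
```

A False return only means the wait ended; the job itself may still be running on the server and can be checked again later with done().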
- running()¶
True if the operation is currently running.
- property schema¶
- property schema_update_options¶
See google.cloud.bigquery.job.LoadJobConfig.schema_update_options.
- property script_statistics: Optional[google.cloud.bigquery.job.base.ScriptStatistics]¶
Statistics for a child job of a script.
- property self_link¶
URL for the job resource.
- Returns
the URL (None until set from the server).
- Return type
Optional[str]
- property session_info: Optional[google.cloud.bigquery.job.base.SessionInfo]¶
[Preview] Information of the session if this job is part of one.
New in version 2.29.0.
- set_exception(exception)¶
Set the Future’s exception.
- set_result(result)¶
Set the Future’s result.
- property skip_leading_rows¶
See google.cloud.bigquery.job.LoadJobConfig.skip_leading_rows.
- property source_format¶
- property source_uris¶
URIs of data files to be loaded. See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.source_uris for supported URI formats. None for jobs that load from a file.
- Type
Optional[Sequence[str]]
- property started¶
Datetime at which the job was started.
- Returns
the start time (None until set from the server).
- Return type
Optional[datetime.datetime]
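The created, started, and ended timestamps can be combined to estimate how long a job waited and how long it ran. A minimal sketch, assuming all three are datetime values (or None when not yet set); the helper name `job_timings` is hypothetical.

```python
def job_timings(job):
    """Return (queue_time, run_time) as timedeltas, or None where data is missing."""
    queue_time = None
    run_time = None
    # Time between submission and the moment the job began executing.
    if job.created is not None and job.started is not None:
        queue_time = job.started - job.created
    # Time spent executing, available only once the job has finished.
    if job.started is not None and job.ended is not None:
        run_time = job.ended - job.started
    return queue_time, run_time
```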
- property state¶
Status of the job.
- Returns
the state (None until set from the server).
- Return type
Optional[str]
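Since state only says how far the job has progressed, it is usually read together with error_result. The sketch below assumes the state strings used by the BigQuery REST API ("PENDING", "RUNNING", "DONE"); the helper name and the returned labels are illustrative.

```python
def classify_job(job):
    """Map state and error_result to a coarse, human-readable status."""
    if job.state is None:
        # Nothing has been loaded from the server yet.
        return "unknown"
    if job.state != "DONE":
        # "PENDING" or "RUNNING".
        return "in progress"
    # A DONE job succeeded only if no overall error was recorded.
    return "failed" if job.error_result else "succeeded"
```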
- property time_partitioning¶
See google.cloud.bigquery.job.LoadJobConfig.time_partitioning.
- property transaction_info: Optional[google.cloud.bigquery.job.base.TransactionInfo]¶
Information of the multi-statement transaction if this job is part of one.
Since a scripting query job can execute multiple transactions, this property is only expected on child jobs. Use the google.cloud.bigquery.client.Client.list_jobs() method with the parent_job parameter to iterate over child jobs.
New in version 2.24.0.
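Iterating over child jobs as described above might look like the following sketch. It assumes `client` behaves like google.cloud.bigquery.client.Client and that a populated transaction_info exposes a `transaction_id` attribute; the helper name `child_transactions` is hypothetical.

```python
def child_transactions(client, parent_job):
    """Return (job_id, transaction_id or None) for every child of `parent_job`."""
    pairs = []
    # parent_job may be a job object or a job ID string.
    for child in client.list_jobs(parent_job=parent_job):
        info = child.transaction_info
        # transaction_info is None when the child is not part of a transaction.
        pairs.append((child.job_id, info.transaction_id if info else None))
    return pairs
```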
- property use_avro_logical_types¶
See google.cloud.bigquery.job.LoadJobConfig.use_avro_logical_types.
- property user_email¶
E-mail address of user who submitted the job.
- Returns
the e-mail address (None until set from the server).
- Return type
Optional[str]
- property write_disposition¶
See google.cloud.bigquery.job.LoadJobConfig.write_disposition.