As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

google.cloud.bigquery.job.LoadJob

class google.cloud.bigquery.job.LoadJob(job_id, source_uris, destination, client, job_config=None)

Asynchronous job for loading data into a table.

Can load from Google Cloud Storage URIs or from a file.
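A load job is usually obtained from a client rather than by calling the constructor directly. The sketch below assumes the client method google.cloud.bigquery.Client.load_table_from_uri, which is not described on this page; the helper name load_from_gcs, the URI, and the table ID are hypothetical placeholders. A stand-in client is used so the snippet runs without credentials.

```python
from unittest.mock import Mock


def load_from_gcs(client, uri, table_id, job_config=None):
    """Start a load job from a Cloud Storage URI and block until it finishes.

    `client` is assumed to be a google.cloud.bigquery.Client, or any object
    exposing the same load_table_from_uri method.
    """
    load_job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    # result() waits for completion and returns the job instance itself.
    return load_job.result()


# Stand-in client for illustration only; replace with a real bigquery.Client().
fake_job = Mock(output_rows=3)
fake_job.result.return_value = fake_job
fake_client = Mock()
fake_client.load_table_from_uri.return_value = fake_job

finished = load_from_gcs(
    fake_client, "gs://example-bucket/data.csv", "proj.dataset.table"
)
print(finished.output_rows)
```

With a real client, the returned object would be the LoadJob documented here, so the attributes listed below can be read from it once result() returns.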

Parameters
__init__(job_id, source_uris, destination, client, job_config=None)

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(job_id, source_uris, destination, …)

Initialize self.

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

cancel([client, retry, timeout])

API call: cancel job via a POST request

cancelled()

Check if the job has been cancelled.

done([retry, timeout])

Refresh the job and check if it is complete.

exception([timeout])

Get the exception from the operation, blocking if necessary.

exists([client, retry, timeout])

API call: test for the existence of the job via a GET request

from_api_repr(resource, client)

Factory: construct a job given its API representation

reload([client, retry, timeout])

API call: refresh job properties via a GET request.

result([retry, timeout])

Start the job and wait for it to complete and get the result.

running()

True if the operation is currently running.

set_exception(exception)

Set the Future’s exception.

set_result(result)

Set the Future’s result.

to_api_repr()

Generate a resource for _begin().

Attributes

allow_jagged_rows

See google.cloud.bigquery.job.LoadJobConfig.allow_jagged_rows.

allow_quoted_newlines

See google.cloud.bigquery.job.LoadJobConfig.allow_quoted_newlines.

autodetect

See google.cloud.bigquery.job.LoadJobConfig.autodetect.

clustering_fields

See google.cloud.bigquery.job.LoadJobConfig.clustering_fields.

create_disposition

See google.cloud.bigquery.job.LoadJobConfig.create_disposition.

created

Datetime at which the job was created.

destination

Table where loaded rows are written.

destination_encryption_configuration

Custom encryption configuration for the destination table.

destination_table_description

Optional[str] description given to destination table.

destination_table_friendly_name

Optional[str] name given to destination table.

encoding

See google.cloud.bigquery.job.LoadJobConfig.encoding.

ended

Datetime at which the job finished.

error_result

Error information about the job as a whole.

errors

Information about individual errors generated by the job.

etag

ETag for the job resource.

field_delimiter

See google.cloud.bigquery.job.LoadJobConfig.field_delimiter.

ignore_unknown_values

See google.cloud.bigquery.job.LoadJobConfig.ignore_unknown_values.

input_file_bytes

Count of bytes loaded from source files.

input_files

Count of source files.

job_id

ID of the job.

job_type

Type of job.

labels

Labels for the job.

location

Location where the job runs.

max_bad_records

See google.cloud.bigquery.job.LoadJobConfig.max_bad_records.

null_marker

See google.cloud.bigquery.job.LoadJobConfig.null_marker.

num_child_jobs

The number of child jobs executed.

output_bytes

Count of bytes saved to destination table.

output_rows

Count of rows saved to destination table.

parent_job_id

Return the ID of the parent job.

path

URL path for the job’s APIs.

project

Project bound to the job.

quote_character

See google.cloud.bigquery.job.LoadJobConfig.quote_character.

range_partitioning

See google.cloud.bigquery.job.LoadJobConfig.range_partitioning.

schema

See google.cloud.bigquery.job.LoadJobConfig.schema.

schema_update_options

See google.cloud.bigquery.job.LoadJobConfig.schema_update_options.

script_statistics

self_link

URL for the job resource.

skip_leading_rows

See google.cloud.bigquery.job.LoadJobConfig.skip_leading_rows.

source_format

See google.cloud.bigquery.job.LoadJobConfig.source_format.

started

Datetime at which the job was started.

state

Status of the job.

time_partitioning

See google.cloud.bigquery.job.LoadJobConfig.time_partitioning.

use_avro_logical_types

See google.cloud.bigquery.job.LoadJobConfig.use_avro_logical_types.

user_email

E-mail address of user who submitted the job.

write_disposition

See google.cloud.bigquery.job.LoadJobConfig.write_disposition.

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameters

fn (Callable[Future]) – The callback to execute when the operation is complete.
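A minimal sketch of registering a callback follows. The helper name report_when_done is hypothetical, and the standard library's concurrent.futures.Future is used as a stand-in because it offers the same add_done_callback and exception methods; a real LoadJob would poll the server in a background thread instead.

```python
from concurrent.futures import Future


def report_when_done(job, sink):
    """Register a callback that records the job's outcome in `sink` (a list)."""

    def _on_done(future):
        # The callback receives the completed future (for a LoadJob, the job).
        error = future.exception()
        sink.append(("failed: %s" % error) if error else "succeeded")

    job.add_done_callback(_on_done)


# Stand-in for a LoadJob, for illustration only.
stand_in = Future()
outcomes = []
report_when_done(stand_in, outcomes)
stand_in.set_result(None)  # completing the future triggers the callback
print(outcomes)  # ['succeeded']
```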

property allow_jagged_rows

See google.cloud.bigquery.job.LoadJobConfig.allow_jagged_rows.

property allow_quoted_newlines

See google.cloud.bigquery.job.LoadJobConfig.allow_quoted_newlines.

property autodetect

See google.cloud.bigquery.job.LoadJobConfig.autodetect.

cancel(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)

API call: cancel job via a POST request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/cancel

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

Returns

Boolean indicating that the cancel request was sent.

Return type

bool
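Because the return value only says that the cancel request was sent, a caller may want to refresh the job afterwards and look at its state. The sketch below shows one way to do that; cancel_and_check is a hypothetical helper, and the job is a stand-in object rather than a real LoadJob.

```python
from unittest.mock import Mock


def cancel_and_check(job):
    """Send a cancel request, refresh the job, and report what happened.

    Returns a (request_sent, state) tuple.
    """
    sent = job.cancel()
    # Refresh the job's properties so that `state` reflects the server's view.
    job.reload()
    return sent, job.state


# Stand-in job for illustration only.
fake_job = Mock(state="DONE")
fake_job.cancel.return_value = True
outcome = cancel_and_check(fake_job)
print(outcome)  # (True, 'DONE')
```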

cancelled()

Check if the job has been cancelled.

This always returns False. It’s not possible to check if a job was cancelled in the API. This method is here to satisfy the interface for google.api_core.future.Future.

Returns

False

Return type

bool

property clustering_fields

See google.cloud.bigquery.job.LoadJobConfig.clustering_fields.

property create_disposition

See google.cloud.bigquery.job.LoadJobConfig.create_disposition.

property created

Datetime at which the job was created.

Returns

the creation time (None until set from the server).

Return type

Optional[datetime.datetime]

property destination

Table where loaded rows are written.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.destination_table

Type

google.cloud.bigquery.table.TableReference

property destination_encryption_configuration

Custom encryption configuration for the destination table.

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See google.cloud.bigquery.job.LoadJobConfig.destination_encryption_configuration.

Type

google.cloud.bigquery.encryption_configuration.EncryptionConfiguration

property destination_table_description

Optional[str] description given to destination table.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#DestinationTableProperties.FIELDS.description

property destination_table_friendly_name

Optional[str] name given to destination table.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#DestinationTableProperties.FIELDS.friendly_name

done(retry=<google.api_core.retry.Retry object>, timeout=None)

Refresh the job and check if it is complete.

Parameters
  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

Returns

True if the job is complete, False otherwise.

Return type

bool
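When blocking in result() is not desirable, done() can be polled instead. The following is a sketch under that assumption; wait_until_done, poll_seconds, and max_polls are hypothetical names, and the job is a stand-in that reports completion on the third poll.

```python
import time
from unittest.mock import Mock


def wait_until_done(job, poll_seconds=1.0, max_polls=60):
    """Poll job.done() until it returns True or max_polls is exhausted.

    Returns True if the job completed, False if polling gave up first.
    """
    for _ in range(max_polls):
        if job.done():
            return True
        time.sleep(poll_seconds)
    return False


# Stand-in job for illustration only.
fake_job = Mock()
fake_job.done.side_effect = [False, False, True]
completed = wait_until_done(fake_job, poll_seconds=0.0)
print(completed, fake_job.done.call_count)  # True 3
```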

property encoding

See google.cloud.bigquery.job.LoadJobConfig.encoding.

property ended

Datetime at which the job finished.

Returns

the end time (None until set from the server).

Return type

Optional[datetime.datetime]

property error_result

Error information about the job as a whole.

Returns

the error information (None until set from the server).

Return type

Optional[Mapping]

property errors

Information about individual errors generated by the job.

Returns

the error information (None until set from the server).

Return type

Optional[List[Mapping]]
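The two error properties can be combined into a short report. This sketch assumes each mapping carries "reason" and "message" keys, as in the BigQuery REST API's error objects; that shape is not stated on this page, and describe_failure is a hypothetical helper. The sample values are made up for illustration.

```python
from types import SimpleNamespace


def describe_failure(job):
    """Return human-readable lines built from error_result and errors."""
    lines = []
    if job.error_result:
        message = job.error_result.get("message", "unknown error")
        lines.append("job failed: %s" % message)
    # errors may be None until the server has reported any.
    for err in job.errors or []:
        lines.append("- [%s] %s" % (err.get("reason", "?"), err.get("message", "")))
    return lines


# Stand-in job with made-up error data, for illustration only.
failed_job = SimpleNamespace(
    error_result={"reason": "invalid", "message": "Too many errors"},
    errors=[{"reason": "invalid", "message": "Row 7: bad value"}],
)
report = describe_failure(failed_job)
print("\n".join(report))
```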

property etag

ETag for the job resource.

Returns

the ETag (None until set from the server).

Return type

Optional[str]

exception(timeout=None)

Get the exception from the operation, blocking if necessary.

Parameters

timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.

Returns

The operation’s error.

Return type

Optional[google.api_core.GoogleAPICallError]

exists(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)

API call: test for the existence of the job via a GET request

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

Returns

Boolean indicating existence of the job.

Return type

bool
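One possible use of exists() is to guard a refresh so that a missing job does not trigger a failing request. The sketch below is illustrative only; refresh_if_exists is a hypothetical helper and the jobs are stand-in objects.

```python
from unittest.mock import Mock


def refresh_if_exists(job):
    """Reload the job only if the server knows about it.

    Returns the refreshed state, or None when the job does not exist.
    """
    if not job.exists():
        return None
    job.reload()
    return job.state


# Stand-in jobs for illustration only.
missing = Mock()
missing.exists.return_value = False
present = Mock(state="RUNNING")
present.exists.return_value = True

states = (refresh_if_exists(missing), refresh_if_exists(present))
print(states)  # (None, 'RUNNING')
```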

property field_delimiter

See google.cloud.bigquery.job.LoadJobConfig.field_delimiter.

classmethod from_api_repr(resource, client)

Factory: construct a job given its API representation

Parameters
  • resource (Dict) – Job representation returned from the API.

  • client (google.cloud.bigquery.client.Client) – Client which holds credentials and project configuration for the dataset.

Returns

Job parsed from resource.

Return type

google.cloud.bigquery.job.LoadJob

property ignore_unknown_values

See google.cloud.bigquery.job.LoadJobConfig.ignore_unknown_values.

property input_file_bytes

Count of bytes loaded from source files.

Returns

the count (None until set from the server).

Return type

Optional[int]

Raises

ValueError – for invalid value types.

property input_files

Count of source files.

Returns

the count (None until set from the server).

Return type

Optional[int]

property job_id

ID of the job.

Type

str

property job_type

Type of job.

Returns

one of ‘load’, ‘copy’, ‘extract’, ‘query’.

Return type

str

property labels

Labels for the job.

Type

Dict[str, str]

property location

Location where the job runs.

Type

str

property max_bad_records

See google.cloud.bigquery.job.LoadJobConfig.max_bad_records.

property null_marker

See google.cloud.bigquery.job.LoadJobConfig.null_marker.

property num_child_jobs

The number of child jobs executed.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.num_child_jobs

Return type

int

property output_bytes

Count of bytes saved to destination table.

Returns

the count (None until set from the server).

Return type

Optional[int]

property output_rows

Count of rows saved to destination table.

Returns

the count (None until set from the server).

Return type

Optional[int]
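The four load statistics (input_files, input_file_bytes, output_bytes, output_rows) are each None until the server sets them, so a summary needs to tolerate missing values. A minimal sketch follows, with a hypothetical helper name and a stand-in job holding made-up numbers.

```python
from types import SimpleNamespace


def format_load_summary(job):
    """Format the load statistics, showing 'n/a' for values not yet set."""

    def show(value):
        return "n/a" if value is None else str(value)

    return "files=%s in_bytes=%s out_bytes=%s rows=%s" % (
        show(job.input_files),
        show(job.input_file_bytes),
        show(job.output_bytes),
        show(job.output_rows),
    )


# Stand-in job for illustration only.
partial_job = SimpleNamespace(
    input_files=2, input_file_bytes=2048, output_bytes=None, output_rows=100
)
summary = format_load_summary(partial_job)
print(summary)  # files=2 in_bytes=2048 out_bytes=n/a rows=100
```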

property parent_job_id

Return the ID of the parent job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobStatistics.FIELDS.parent_job_id

Returns

parent job id.

Return type

Optional[str]

property path

URL path for the job’s APIs.

Returns

the path based on project and job ID.

Return type

str

property project

Project bound to the job.

Returns

the project (derived from the client).

Return type

str

property quote_character

See google.cloud.bigquery.job.LoadJobConfig.quote_character.

property range_partitioning

See google.cloud.bigquery.job.LoadJobConfig.range_partitioning.

reload(client=None, retry=<google.api_core.retry.Retry object>, timeout=None)

API call: refresh job properties via a GET request.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/get

Parameters
  • client (Optional[google.cloud.bigquery.client.Client]) – The client to use. If not passed, falls back to the client stored on the current job.

  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry.

result(retry=<google.api_core.retry.Retry object>, timeout=None)

Start the job and wait for it to complete and get the result.

Parameters
  • retry (Optional[google.api_core.retry.Retry]) – How to retry the RPC.

  • timeout (Optional[float]) – The number of seconds to wait for the underlying HTTP transport before using retry. If multiple requests are made under the hood, timeout applies to each individual request.

Returns

This instance.

Return type

_AsyncJob

Raises
running()

True if the operation is currently running.

property schema

See google.cloud.bigquery.job.LoadJobConfig.schema.

property schema_update_options

See google.cloud.bigquery.job.LoadJobConfig.schema_update_options.

property self_link

URL for the job resource.

Returns

the URL (None until set from the server).

Return type

Optional[str]

set_exception(exception)

Set the Future’s exception.

set_result(result)

Set the Future’s result.

property skip_leading_rows

See google.cloud.bigquery.job.LoadJobConfig.skip_leading_rows.

property source_format

See google.cloud.bigquery.job.LoadJobConfig.source_format.

property started

Datetime at which the job was started.

Returns

the start time (None until set from the server).

Return type

Optional[datetime.datetime]
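Since started and ended are both Optional[datetime.datetime], a job's run time can be derived by subtraction once both are set. This is a sketch with a hypothetical helper name and a stand-in job; the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def run_time(job):
    """Return ended - started as a timedelta, or None if either is unset."""
    if job.started is None or job.ended is None:
        return None
    return job.ended - job.started


# Stand-in job for illustration only.
start = datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
finished_job = SimpleNamespace(started=start, ended=start + timedelta(seconds=90))
elapsed = run_time(finished_job)
print(elapsed)  # 0:01:30
```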

property state

Status of the job.

Returns

the state (None until set from the server).

Return type

Optional[str]

property time_partitioning

See google.cloud.bigquery.job.LoadJobConfig.time_partitioning.

to_api_repr()

Generate a resource for _begin().

property use_avro_logical_types

See google.cloud.bigquery.job.LoadJobConfig.use_avro_logical_types.

property user_email

E-mail address of user who submitted the job.

Returns

the e-mail address (None until set from the server).

Return type

Optional[str]

property write_disposition

See google.cloud.bigquery.job.LoadJobConfig.write_disposition.