As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

google.cloud.bigquery.job.LoadJobConfig

class google.cloud.bigquery.job.LoadJobConfig(**kwargs)

Configuration options for load jobs.

Set properties on the constructed configuration by using the property name as the name of a keyword argument. Values which are unset or None use the BigQuery REST API default values. See the BigQuery REST API reference documentation for a list of default values.

Required options differ based on the source_format value. For example, the BigQuery API’s default value for source_format is "CSV". When loading a CSV file, either schema must be set or autodetect must be set to True.

__init__(**kwargs) → None

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(**kwargs)

Initialize self.

from_api_repr(resource)

Factory: construct a job configuration given its API representation.

to_api_repr()

Build an API representation of the job config.

Attributes

allow_jagged_rows

Allow missing trailing optional columns (CSV only).

allow_quoted_newlines

Allow quoted data containing newline characters (CSV only).

autodetect

Automatically infer the schema from a sample of the data.

clustering_fields

Fields defining clustering for the table

connection_properties

Connection properties.

create_disposition

Specifies behavior for creating tables.

create_session

[Preview] If True, creates a new session, where session_info will contain a random, server-generated session id.

decimal_target_types

Possible SQL data types to which the source decimal values are converted.

destination_encryption_configuration

Custom encryption configuration for the destination table.

destination_table_description

Description of the destination table.

destination_table_friendly_name

Name given to destination table.

encoding

The character encoding of the data.

field_delimiter

The separator for fields in a CSV file.

hive_partitioning

[Beta] When set, it configures hive partitioning support.

ignore_unknown_values

Ignore extra values not represented in the table schema.

job_timeout_ms

Job timeout in milliseconds (optional).

json_extension

The extension to use for writing JSON data to BigQuery.

labels

Labels for the job.

max_bad_records

Number of invalid rows to ignore.

null_marker

Represents a null value (CSV only).

parquet_options

Additional properties to set if sourceFormat is set to PARQUET.

preserve_ascii_control_characters

Preserves the embedded ASCII control characters when sourceFormat is set to CSV.

projection_fields

If google.cloud.bigquery.job.LoadJobConfig.source_format is set to “DATASTORE_BACKUP”, indicates which entity properties to load into BigQuery from a Cloud Datastore backup.

quote_character

Character used to quote data sections (CSV only).

range_partitioning

Optional[google.cloud.bigquery.table.RangePartitioning]: Configures range-based partitioning for destination table.

reference_file_schema_uri

Optional[str]: When creating an external table, the user can provide a reference file with the table schema.

schema

Schema of the destination table.

schema_update_options

Specifies updates to the destination table schema to allow as a side effect of the load job.

skip_leading_rows

Number of rows to skip when reading data (CSV only).

source_format

File format of the data.

time_partitioning

Specifies time-based partitioning for the destination table.

use_avro_logical_types

For loads of Avro data, governs whether Avro logical types are converted to their corresponding BigQuery types (e.g. TIMESTAMP) rather than raw types (e.g. INTEGER).

write_disposition

Action that occurs if the destination table already exists.

__setattr__(name, value)

Overridden so that an error is raised when an unknown property is set.

property allow_jagged_rows

Allow missing trailing optional columns (CSV only).

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.allow_jagged_rows

Type

Optional[bool]

property allow_quoted_newlines

Allow quoted data containing newline characters (CSV only).

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.allow_quoted_newlines

Type

Optional[bool]

property autodetect

Automatically infer the schema from a sample of the data.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.autodetect

Type

Optional[bool]

property clustering_fields

Fields defining clustering for the table (defaults to None).

Clustering fields are immutable after table creation.

Note

BigQuery supports clustering for both partitioned and non-partitioned tables.

Type

Optional[List[str]]

property connection_properties: List[google.cloud.bigquery.query.ConnectionProperty]

Connection properties.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.connection_properties

New in version 3.7.0.

property create_disposition

Specifies behavior for creating tables.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.create_disposition

Type

Optional[google.cloud.bigquery.job.CreateDisposition]

property create_session: Optional[bool]

[Preview] If True, creates a new session, where session_info will contain a random, server-generated session id.

If False, runs load job with an existing session_id passed in connection_properties, otherwise runs load job in non-session mode.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.create_session

New in version 3.7.0.

property decimal_target_types: Optional[FrozenSet[str]]

Possible SQL data types to which the source decimal values are converted.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.decimal_target_types

New in version 2.21.0.

property destination_encryption_configuration

Custom encryption configuration for the destination table.

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.destination_encryption_configuration

Type

Optional[google.cloud.bigquery.encryption_configuration.EncryptionConfiguration]

property destination_table_description

Description of the destination table.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#DestinationTableProperties.FIELDS.description

Type

Optional[str]

property destination_table_friendly_name

Name given to destination table.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#DestinationTableProperties.FIELDS.friendly_name

Type

Optional[str]

property encoding

The character encoding of the data.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.encoding

Type

Optional[google.cloud.bigquery.job.Encoding]

property field_delimiter

The separator for fields in a CSV file.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.field_delimiter

Type

Optional[str]

classmethod from_api_repr(resource: dict) → google.cloud.bigquery.job.base._JobConfig

Factory: construct a job configuration given its API representation.

Parameters

resource (Dict) – A job configuration in the same representation as is returned from the API.

Returns

Configuration parsed from resource.

Return type

google.cloud.bigquery.job._JobConfig

property hive_partitioning

[Beta] When set, it configures hive partitioning support.

Note

Experimental. This feature is experimental and might change or have limited support.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.hive_partitioning_options

Type

Optional[HivePartitioningOptions]

property ignore_unknown_values

Ignore extra values not represented in the table schema.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.ignore_unknown_values

Type

Optional[bool]

property job_timeout_ms

Job timeout in milliseconds (optional). If this time limit is exceeded, BigQuery might attempt to stop the job.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfiguration.FIELDS.job_timeout_ms

For example, pass the value to the constructor, job_config = bigquery.LoadJobConfig(job_timeout_ms=5000), or assign it afterwards, job_config.job_timeout_ms = 5000.

Raises

ValueError – If value type is invalid.

property json_extension

The extension to use for writing JSON data to BigQuery. Only supports GeoJSON currently.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.json_extension

Type

Optional[str]

property labels

Labels for the job.

This property always returns a dict. Once a job has been created on the server, its labels can no longer be modified.

Raises

ValueError – If value type is invalid.

Type

Dict[str, str]

property max_bad_records

Number of invalid rows to ignore.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.max_bad_records

Type

Optional[int]

property null_marker

Represents a null value (CSV only).

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.null_marker

Type

Optional[str]

property parquet_options

Additional properties to set if sourceFormat is set to PARQUET.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.parquet_options

Type

Optional[google.cloud.bigquery.format_options.ParquetOptions]

property preserve_ascii_control_characters

Preserves the embedded ASCII control characters when sourceFormat is set to CSV.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.preserve_ascii_control_characters

Type

Optional[bool]

property projection_fields: Optional[List[str]]

If google.cloud.bigquery.job.LoadJobConfig.source_format is set to “DATASTORE_BACKUP”, indicates which entity properties to load into BigQuery from a Cloud Datastore backup.

Property names are case sensitive and must be top-level properties. If no properties are specified, BigQuery loads all properties. If any named property isn’t found in the Cloud Datastore backup, an invalid error is returned in the job result.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.projection_fields

Type

Optional[List[str]]

property quote_character

Character used to quote data sections (CSV only).

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.quote

Type

Optional[str]

property range_partitioning

Optional[google.cloud.bigquery.table.RangePartitioning]: Configures range-based partitioning for destination table.

Note

Beta. The integer range partitioning feature is in a pre-release state and might change or have limited support.

Specify at most one of time_partitioning or range_partitioning.

Raises

ValueError – If the value is not RangePartitioning or None.

property reference_file_schema_uri

Optional[str]: When creating an external table, the user can provide a reference file with the table schema. This is enabled for the following formats: AVRO, PARQUET, ORC.

property schema

Schema of the destination table.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.schema

Type

Optional[Sequence[Union[SchemaField, Mapping[str, Any]]]]

property schema_update_options

Specifies updates to the destination table schema to allow as a side effect of the load job.

Type

Optional[List[google.cloud.bigquery.job.SchemaUpdateOption]]

property skip_leading_rows

Number of rows to skip when reading data (CSV only).

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.skip_leading_rows

Type

Optional[int]

property source_format

File format of the data.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.source_format

Type

Optional[google.cloud.bigquery.job.SourceFormat]

property time_partitioning

Specifies time-based partitioning for the destination table.

Specify at most one of time_partitioning or range_partitioning.

Type

Optional[google.cloud.bigquery.table.TimePartitioning]

to_api_repr() → dict

Build an API representation of the job config.

Returns

A dictionary in the format used by the BigQuery API.

Return type

Dict

property use_avro_logical_types

For loads of Avro data, governs whether Avro logical types are converted to their corresponding BigQuery types (e.g. TIMESTAMP) rather than raw types (e.g. INTEGER).

Type

Optional[bool]

property write_disposition

Action that occurs if the destination table already exists.

See: https://cloud.google.com/bigquery/docs/reference/rest/v2/Job#JobConfigurationLoad.FIELDS.write_disposition

Type

Optional[google.cloud.bigquery.job.WriteDisposition]