Class: Google::Cloud::Bigquery::LoadJob::Updater

Inherits:
Google::Cloud::Bigquery::LoadJob
Defined in:
lib/google/cloud/bigquery/load_job.rb

Overview

Yielded to a block to accumulate changes for a patch request.

Methods inherited from Google::Cloud::Bigquery::LoadJob

#allow_jagged_rows?, #autodetect?, #backup?, #clustering?, #clustering_fields, #csv?, #delimiter, #destination, #encryption, #ignore_unknown_values?, #input_file_bytes, #input_files, #iso8859_1?, #json?, #max_bad_records, #null_marker, #output_bytes, #output_rows, #quote, #quoted_newlines?, #range_partitioning?, #range_partitioning_end, #range_partitioning_field, #range_partitioning_interval, #range_partitioning_start, #schema_update_options, #skip_leading_rows, #sources, #time_partitioning?, #time_partitioning_expiration, #time_partitioning_field, #time_partitioning_require_filter?, #time_partitioning_type, #utf8?

Methods inherited from Job

#configuration, #created_at, #done?, #ended_at, #error, #errors, #failed?, #job_id, #labels, #location, #num_child_jobs, #parent_job_id, #pending?, #project_id, #running?, #script_statistics, #started_at, #state, #statistics, #status, #user_email

Instance Attribute Details

#updates ⇒ Object (readonly)

A list of attributes that were updated.



# File 'lib/google/cloud/bigquery/load_job.rb', line 535

def updates
  @updates
end

Instance Method Details

#autodetect=(val) ⇒ Object

Allows BigQuery to autodetect the schema.
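
Examples:

A minimal sketch; the dataset, table, and source URI are placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.autodetect = true # let BigQuery infer the schema and options
end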

Parameters:

  • val (Boolean)

    Indicates if BigQuery should automatically infer the options and schema for CSV and JSON sources. The default value is false.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1134

def autodetect= val
  @gapi.configuration.load.update! autodetect: val
end

#boolean(name, description: nil, mode: :nullable) ⇒ Object

Adds a boolean field to the schema.

See Schema#boolean.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.boolean "active", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 759

def boolean name, description: nil, mode: :nullable
  schema.boolean name, description: description, mode: mode
end

#bytes(name, description: nil, mode: :nullable) ⇒ Object

Adds a bytes field to the schema.

See Schema#bytes.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.bytes "avatar", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 787

def bytes name, description: nil, mode: :nullable
  schema.bytes name, description: description, mode: mode
end

#cancel ⇒ Object



# File 'lib/google/cloud/bigquery/load_job.rb', line 1670

def cancel
  raise "not implemented in #{self.class}"
end

#check_for_mutated_schema!Object

Make sure any schema changes are saved.



# File 'lib/google/cloud/bigquery/load_job.rb', line 944

def check_for_mutated_schema!
  return if @schema.nil?
  return unless @schema.changed?
  @gapi.configuration.load.schema = @schema.to_gapi
  patch_gapi! :schema
end

#clustering_fields=(fields) ⇒ Object

Sets one or more fields on which the destination table should be clustered. Must be specified with time-based partitioning; the data in the table will be first partitioned and subsequently clustered.

Only top-level, non-repeated, simple-type fields are supported. When you cluster a table using multiple columns, the order of columns you specify is important. The order of the specified columns determines the sort order of the data.

See Google::Cloud::Bigquery::LoadJob#clustering_fields.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.time_partitioning_type  = "DAY"
  job.time_partitioning_field = "dob"
  job.schema do |schema|
    schema.timestamp "dob", mode: :required
    schema.string "first_name", mode: :required
    schema.string "last_name", mode: :required
  end
  job.clustering_fields = ["last_name", "first_name"]
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • fields (Array<String>)

    The clustering fields. Only top-level, non-repeated, simple-type fields are supported.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1665

def clustering_fields= fields
  @gapi.configuration.load.clustering ||= Google::Apis::BigqueryV2::Clustering.new
  @gapi.configuration.load.clustering.fields = fields
end

#create=(new_create) ⇒ Object

Sets the create disposition.

This specifies whether the job is allowed to create new tables. The default value is needed.

The following values are supported:

  • needed - Create the table if it does not exist.
  • never - The table must already exist. A 'notFound' error is raised if the table does not exist.
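
Examples:

A brief sketch, following the pattern of the other examples here; all names are placeholders:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.create = "never" # raise notFound unless the table already exists
end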

Parameters:

  • new_create (String)

    The new create disposition.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1020

def create= new_create
  @gapi.configuration.load.update! create_disposition: Convert.create_disposition(new_create)
end

#date(name, description: nil, mode: :nullable) ⇒ Object

Adds a date field to the schema.

See Schema#date.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.date "birthday", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 899

def date name, description: nil, mode: :nullable
  schema.date name, description: description, mode: mode
end

#datetime(name, description: nil, mode: :nullable) ⇒ Object

Adds a datetime field to the schema.

See Schema#datetime.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.datetime "target_end", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 871

def datetime name, description: nil, mode: :nullable
  schema.datetime name, description: description, mode: mode
end

#delimiter=(val) ⇒ Object

Sets the separator for fields in a CSV file.
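
Examples:

A minimal sketch with placeholder names, loading tab-separated data:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.delimiter = "\t" # split fields on tabs instead of commas
end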

Parameters:

  • val (String)

Specifies the separator for fields in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a comma (,).



# File 'lib/google/cloud/bigquery/load_job.rb', line 1161

def delimiter= val
  @gapi.configuration.load.update! field_delimiter: val
end

#encoding=(val) ⇒ Object

Sets the character encoding of the data.
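
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.encoding = "ISO-8859-1" # the source file is not UTF-8
end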

Parameters:

  • val (String)

    The character encoding of the data. The supported values are UTF-8 or ISO-8859-1. The default value is UTF-8.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1147

def encoding= val
  @gapi.configuration.load.update! encoding: val
end

#encryption=(val) ⇒ Object

Sets the encryption configuration of the destination table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

key_name = "projects/a/locations/b/keyRings/c/cryptoKeys/d"
encrypt_config = bigquery.encryption kms_key: key_name
job = dataset.load_job "my_table", "gs://abc/file" do |job|
  job.encryption = encrypt_config
end

Parameters:

  • val (Google::Cloud::Bigquery::EncryptionConfiguration)

    Custom encryption configuration (e.g., Cloud KMS keys).



# File 'lib/google/cloud/bigquery/load_job.rb', line 1299

def encryption= val
  @gapi.configuration.load.update! destination_encryption_configuration: val.to_gapi
end

#float(name, description: nil, mode: :nullable) ⇒ Object

Adds a floating-point number field to the schema.

See Schema#float.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.float "price", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 701

def float name, description: nil, mode: :nullable
  schema.float name, description: description, mode: mode
end

#format=(new_format) ⇒ Object

Sets the source file format. The default value is csv.

The following values are supported:

  • csv - CSV
  • json - Newline-delimited JSON
  • avro - Avro
  • orc - ORC
  • parquet - Parquet
  • datastore_backup - Cloud Datastore backup
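
Examples:

A brief sketch with placeholder names; the format string follows the values listed above:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.format = "json" # the source file is newline-delimited JSON
end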

Parameters:

  • new_format (String)

    The new source format.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1000

def format= new_format
  @gapi.configuration.load.update! source_format: Convert.source_format(new_format)
end

#ignore_unknown=(val) ⇒ Object

Allows unknown columns to be ignored.
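
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.ignore_unknown = true # silently drop values not in the schema
end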

Parameters:

  • val (Boolean)

    Indicates if BigQuery should allow extra values that are not represented in the table schema. If true, the extra values are ignored. If false, records with extra columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false.

    The format property determines what BigQuery treats as an extra value:

    • CSV: Trailing columns
    • JSON: Named values that don't match any column names


# File 'lib/google/cloud/bigquery/load_job.rb', line 1183

def ignore_unknown= val
  @gapi.configuration.load.update! ignore_unknown_values: val
end

#integer(name, description: nil, mode: :nullable) ⇒ Object

Adds an integer field to the schema.

See Schema#integer.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.integer "age", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 673

def integer name, description: nil, mode: :nullable
  schema.integer name, description: description, mode: mode
end

#jagged_rows=(val) ⇒ Object

Sets flag for allowing jagged rows.

Accept rows that are missing trailing optional columns. The missing values are treated as nulls. If false, records with missing trailing columns are treated as bad records, and if there are too many bad records, an invalid error is returned in the job result. The default value is false. Only applicable to CSV, ignored for other formats.
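
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.jagged_rows = true # missing trailing columns become nulls
end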

Parameters:

  • val (Boolean)

    Accept rows that are missing trailing optional columns.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1108

def jagged_rows= val
  @gapi.configuration.load.update! allow_jagged_rows: val
end

#labels=(val) ⇒ Object

Sets the labels to use for the load job.
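
Examples:

A brief sketch; the label keys and values are hypothetical:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.labels = { "env" => "production", "team" => "analytics" }
end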

Parameters:

  • val (Hash)

    A hash of user-provided labels associated with the job. You can use these to organize and group your jobs.

    The labels applied to a resource must meet the following requirements:

    • Each resource can have multiple labels, up to a maximum of 64.
    • Each label must be a key-value pair.
    • Keys have a minimum length of 1 character and a maximum length of 63 characters, and cannot be empty. Values can be empty, and have a maximum length of 63 characters.
    • Keys and values can contain only lowercase letters, numeric characters, underscores, and dashes. All characters must use UTF-8 encoding, and international characters are allowed.
    • The key portion of a label must be unique. However, you can use the same key with multiple resources.
    • Keys must start with a lowercase letter or international character.


# File 'lib/google/cloud/bigquery/load_job.rb', line 1325

def labels= val
  @gapi.configuration.update! labels: val
end

#location=(value) ⇒ Object

Sets the geographic location where the job should run. Required except for US and EU.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.schema do |s|
    s.string "first_name", mode: :required
    s.record "cities_lived", mode: :repeated do |r|
      r.string "place", mode: :required
      r.integer "number_of_years", mode: :required
    end
  end
  j.location = "EU"
end

Parameters:

  • value (String)

    A geographic location, such as "US", "EU" or "asia-northeast1". Required except for US and EU.



# File 'lib/google/cloud/bigquery/load_job.rb', line 975

def location= value
  @gapi.job_reference.location = value
  return unless value.nil?

  # Treat assigning value of nil the same as unsetting the value.
  unset = @gapi.job_reference.instance_variables.include? :@location
  @gapi.job_reference.remove_instance_variable :@location if unset
end

#max_bad_records=(val) ⇒ Object

Sets the maximum number of bad records that can be ignored.
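
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.max_bad_records = 10 # tolerate up to ten bad records
end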

Parameters:

  • val (Integer)

    The maximum number of bad records that BigQuery can ignore when running the job. If the number of bad records exceeds this value, an invalid error is returned in the job result. The default value is 0, which requires that all records are valid.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1198

def max_bad_records= val
  @gapi.configuration.load.update! max_bad_records: val
end

#null_marker=(val) ⇒ Object

Sets the string that represents a null value in a CSV file.
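
Examples:

A brief sketch with placeholder names, treating \N as the null marker:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.null_marker = "\\N" # interpret \N in the CSV as NULL
end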

Parameters:

  • val (String)

    Specifies a string that represents a null value in a CSV file. For example, if you specify \N, BigQuery interprets \N as a null value when loading a CSV file. The default value is the empty string. If you set this property to a custom value, BigQuery throws an error if an empty string is present for all data types except for STRING and BYTE. For STRING and BYTE columns, BigQuery interprets the empty string as an empty value.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1216

def null_marker= val
  @gapi.configuration.load.update! null_marker: val
end

#numeric(name, description: nil, mode: :nullable) ⇒ Object

Adds a numeric field to the schema. NUMERIC is a fixed-precision numeric type with 38 decimal digits, 9 of which follow the decimal point.

See Schema#numeric.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.numeric "total_cost", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 731

def numeric name, description: nil, mode: :nullable
  schema.numeric name, description: description, mode: mode
end

#projection_fields=(new_fields) ⇒ Object

Sets the projection fields.

If the format option is set to datastore_backup, indicates which entity properties to load from a Cloud Datastore backup. Property names are case sensitive and must be top-level properties. If not set, BigQuery loads all properties. If any named property isn't found in the Cloud Datastore backup, an invalid error is returned.
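
Examples:

A brief sketch; the backup URI and property names are hypothetical, and the format must be datastore_backup:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/backup" do |j|
  j.format = "datastore_backup"
  j.projection_fields = ["name", "age"] # load only these properties
end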

Parameters:

  • new_fields (Array<String>)

    The new projection fields.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1059

def projection_fields= new_fields
  if new_fields.nil?
    @gapi.configuration.load.update! projection_fields: nil
  else
    @gapi.configuration.load.update! projection_fields: Array(new_fields)
  end
end

#quote=(val) ⇒ Object

Sets the character to use to quote string values in CSVs.
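
Examples:

A minimal sketch with placeholder names, for data quoted with single quotes:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.quote = "'" # the data quotes string sections with single quotes
end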

Parameters:

  • val (String)

    The value that is used to quote data sections in a CSV file. BigQuery converts the string to ISO-8859-1 encoding, and then uses the first byte of the encoded string to split the data in its raw, binary state. The default value is a double-quote ". If your data does not contain quoted sections, set the property value to an empty string. If your data contains quoted newline characters, you must also set the allowQuotedNewlines property to true.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1234

def quote= val
  @gapi.configuration.load.update! quote: val
end

#quoted_newlines=(val) ⇒ Object

Allows quoted data sections to contain newline characters in CSV.
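
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.quoted_newlines = true # quoted sections may contain newlines
end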

Parameters:

  • val (Boolean)

    Indicates if BigQuery should allow quoted data sections that contain newline characters in a CSV file. The default value is false.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1121

def quoted_newlines= val
  @gapi.configuration.load.update! allow_quoted_newlines: val
end

#range_partitioning_end=(range_end) ⇒ Object

Sets the end of range partitioning, exclusive, for the destination table. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_field=.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.schema do |schema|
    schema.integer "my_table_id", mode: :required
    schema.string "my_table_data", mode: :required
  end
  job.range_partitioning_field = "my_table_id"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • range_end (Integer)

    The end of range partitioning, exclusive.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1488

def range_partitioning_end= range_end
  @gapi.configuration.load.range_partitioning ||= Google::Apis::BigqueryV2::RangePartitioning.new(
    range: Google::Apis::BigqueryV2::RangePartitioning::Range.new
  )
  @gapi.configuration.load.range_partitioning.range.end = range_end
end

#range_partitioning_field=(field) ⇒ Object

Sets the field on which to range partition the table. See Creating and using integer range partitioned tables.

See #range_partitioning_start=, #range_partitioning_interval= and #range_partitioning_end=.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.schema do |schema|
    schema.integer "my_table_id", mode: :required
    schema.string "my_table_data", mode: :required
  end
  job.range_partitioning_field = "my_table_id"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • field (String)

    The range partition field. The destination table is partitioned by this field. The field must be a top-level NULLABLE/REQUIRED field. The only supported type is INTEGER/INT64.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1365

def range_partitioning_field= field
  @gapi.configuration.load.range_partitioning ||= Google::Apis::BigqueryV2::RangePartitioning.new(
    range: Google::Apis::BigqueryV2::RangePartitioning::Range.new
  )
  @gapi.configuration.load.range_partitioning.field = field
end

#range_partitioning_interval=(range_interval) ⇒ Object

Sets the width of each interval for data in range partitions. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_field=, #range_partitioning_start= and #range_partitioning_end=.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.schema do |schema|
    schema.integer "my_table_id", mode: :required
    schema.string "my_table_data", mode: :required
  end
  job.range_partitioning_field = "my_table_id"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • range_interval (Integer)

    The width of each interval for data in partitions.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1447

def range_partitioning_interval= range_interval
  @gapi.configuration.load.range_partitioning ||= Google::Apis::BigqueryV2::RangePartitioning.new(
    range: Google::Apis::BigqueryV2::RangePartitioning::Range.new
  )
  @gapi.configuration.load.range_partitioning.range.interval = range_interval
end

#range_partitioning_start=(range_start) ⇒ Object

Sets the start of range partitioning, inclusive, for the destination table. See Creating and using integer range partitioned tables.

You can only set range partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

See #range_partitioning_field=, #range_partitioning_interval= and #range_partitioning_end=.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.schema do |schema|
    schema.integer "my_table_id", mode: :required
    schema.string "my_table_data", mode: :required
  end
  job.range_partitioning_field = "my_table_id"
  job.range_partitioning_start = 0
  job.range_partitioning_interval = 10
  job.range_partitioning_end = 100
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • range_start (Integer)

    The start of range partitioning, inclusive.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1406

def range_partitioning_start= range_start
  @gapi.configuration.load.range_partitioning ||= Google::Apis::BigqueryV2::RangePartitioning.new(
    range: Google::Apis::BigqueryV2::RangePartitioning::Range.new
  )
  @gapi.configuration.load.range_partitioning.range.start = range_start
end

#record(name, description: nil, mode: nil) {|nested_schema| ... } ⇒ Object

Adds a record field to the schema. A block must be passed describing the nested fields of the record. For more information about nested and repeated records, see Loading denormalized, nested, and repeated data.

See Schema#record.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: nil)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

Yields:

  • (nested_schema)

    a block for setting the nested schema

Yield Parameters:

  • nested_schema (Schema)

    the object accepting the nested schema



# File 'lib/google/cloud/bigquery/load_job.rb', line 938

def record name, description: nil, mode: nil, &block
  schema.record name, description: description, mode: mode, &block
end

#reload! ⇒ Object Also known as: refresh!



# File 'lib/google/cloud/bigquery/load_job.rb', line 1678

def reload!
  raise "not implemented in #{self.class}"
end

#rerun! ⇒ Object



# File 'lib/google/cloud/bigquery/load_job.rb', line 1674

def rerun!
  raise "not implemented in #{self.class}"
end

#schema(replace: false) {|schema| ... } ⇒ Google::Cloud::Bigquery::Schema

Returns the table's schema. This method can also be used to set, replace, or add to the schema by passing a block. See Schema for available methods.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.schema do |s|
    s.string "first_name", mode: :required
    s.record "cities_lived", mode: :repeated do |r|
      r.string "place", mode: :required
      r.integer "number_of_years", mode: :required
    end
  end
end

Parameters:

  • replace (Boolean) (defaults to: false)

    Whether to replace the existing schema with the new schema. If true, the fields will replace the existing schema. If false, the fields will be added to the existing schema. When a table already contains data, schema changes must be additive. Thus, the default value is false.

Yields:

  • (schema)

    a block for setting the schema

Yield Parameters:

  • schema (Schema)

    the object accepting the schema

Returns:

  • (Google::Cloud::Bigquery::Schema)

    The schema for the destination table.

# File 'lib/google/cloud/bigquery/load_job.rb', line 577

def schema replace: false
  # Same as Table#schema, but not frozen
  # TODO: make sure to call ensure_full_data! on Dataset#update
  @schema ||= Schema.from_gapi @gapi.configuration.load.schema
  if block_given?
    @schema = Schema.from_gapi if replace
    yield @schema
    check_for_mutated_schema!
  end
  # Do not freeze on updater, allow modifications
  @schema
end

#schema=(new_schema) ⇒ Object

Sets the schema of the destination table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
schema = bigquery.schema do |s|
  s.string "first_name", mode: :required
  s.record "cities_lived", mode: :repeated do |nested_schema|
    nested_schema.string "place", mode: :required
    nested_schema.integer "number_of_years", mode: :required
  end
end
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.schema = schema
end

Parameters:

  • new_schema (Google::Cloud::Bigquery::Schema)

    The schema for the destination table. Optional. The schema can be omitted if the destination table already exists, or if you're loading data from a source that includes a schema, such as Avro or a Google Cloud Datastore backup.



# File 'lib/google/cloud/bigquery/load_job.rb', line 617

def schema= new_schema
  @schema = new_schema
end

#schema_update_options=(new_options) ⇒ Object

Sets the schema update options, which allow the schema of the destination table to be updated as a side effect of the load job if a schema is autodetected or supplied in the job configuration. Schema update options are supported in two cases: when write disposition is WRITE_APPEND; and when write disposition is WRITE_TRUNCATE and the destination table is a partition of a table, specified by partition decorators. For normal tables, WRITE_TRUNCATE will always overwrite the schema. One or more of the following values can be specified:

  • ALLOW_FIELD_ADDITION: allow adding a nullable field to the schema.
  • ALLOW_FIELD_RELAXATION: allow relaxing a required field in the original schema to nullable.
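
Examples:

A brief sketch with placeholder names, appending data while allowing a new nullable field:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.write = "append"
  j.schema_update_options = ["ALLOW_FIELD_ADDITION"]
end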

Parameters:

  • new_options (Array<String>)

    The new schema update options.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1258

def schema_update_options= new_options
  if new_options.nil?
    @gapi.configuration.load.update! schema_update_options: nil
  else
    @gapi.configuration.load.update! schema_update_options: Array(new_options)
  end
end

#skip_leading=(val) ⇒ Object

Sets the number of leading rows to skip in the file.
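
Examples:

A minimal sketch with placeholder names, skipping one header row:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.skip_leading = 1 # the CSV file has a single header row
end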

Parameters:

  • val (Integer)

    The number of rows at the top of a CSV file that BigQuery will skip when loading the data. The default value is 0. This property is useful if you have header rows in the file that should be skipped.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1276

def skip_leading= val
  @gapi.configuration.load.update! skip_leading_rows: val
end

#source_uris=(new_uris) ⇒ Object

Sets the source URIs to load.

The fully-qualified URIs that point to your data in Google Cloud.

  • For Google Cloud Storage URIs: Each URI can contain one '*' wildcard character and it must come after the 'bucket' name. Size limits related to load jobs apply to external data sources.
  • For Google Cloud Bigtable URIs: Exactly one URI can be specified and it must be a fully specified and valid HTTPS URL for a Google Cloud Bigtable table.
  • For Google Cloud Datastore backups: Exactly one URI can be specified. Also, the '*' wildcard character is not allowed.
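
Examples:

A brief sketch; the bucket and wildcard pattern are placeholders:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.source_uris = ["gs://my-bucket/data-*.csv"] # one '*' wildcard is allowed for GCS
end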

Parameters:

  • new_uris (Array<String>)

    The new source URIs to load.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1085

def source_uris= new_uris
  if new_uris.nil?
    @gapi.configuration.load.update! source_uris: nil
  else
    @gapi.configuration.load.update! source_uris: Array(new_uris)
  end
end

#string(name, description: nil, mode: :nullable) ⇒ Object

Adds a string field to the schema.

See Schema#string.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.string "first_name", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 645

def string name, description: nil, mode: :nullable
  schema.string name, description: description, mode: mode
end

#time(name, description: nil, mode: :nullable) ⇒ Object

Adds a time field to the schema.

See Schema#time.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.time "duration", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 843

def time name, description: nil, mode: :nullable
  schema.time name, description: description, mode: mode
end

#time_partitioning_expiration=(expiration) ⇒ Object

Sets the time partition expiration for the destination table. See Partitioned Tables.

The destination table must also be time partitioned. See #time_partitioning_type=.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.time_partitioning_type = "DAY"
  job.time_partitioning_expiration = 86_400
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • expiration (Integer)

    An expiration time, in seconds, for data in time partitions.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1599

def time_partitioning_expiration= expiration
  @gapi.configuration.load.time_partitioning ||= Google::Apis::BigqueryV2::TimePartitioning.new
  @gapi.configuration.load.time_partitioning.update! expiration_ms: expiration * 1000
end

#time_partitioning_field=(field) ⇒ Object

Sets the field on which to time partition the destination table. If not set, the destination table is time partitioned by pseudo column _PARTITIONTIME; if set, the table is time partitioned by this field. See Partitioned Tables.

The destination table must also be time partitioned. See #time_partitioning_type=.

You can only set the time partitioning field while creating a table. BigQuery does not allow you to change partitioning on an existing table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.time_partitioning_type  = "DAY"
  job.time_partitioning_field = "dob"
  job.schema do |schema|
    schema.timestamp "dob", mode: :required
  end
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • field (String)

    The time partition field. The field must be a top-level TIMESTAMP or DATE field. Its mode must be NULLABLE or REQUIRED.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1566

def time_partitioning_field= field
  @gapi.configuration.load.time_partitioning ||= Google::Apis::BigqueryV2::TimePartitioning.new
  @gapi.configuration.load.time_partitioning.update! field: field
end

#time_partitioning_require_filter=(val) ⇒ Object

If set to true, queries over the destination table will be required to specify a time partition filter that can be used for partition elimination. See Partitioned Tables.

Parameters:

  • val (Boolean)

    Indicates if queries over the destination table will require a time partition filter. The default value is false.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1615

def time_partitioning_require_filter= val
  @gapi.configuration.load.time_partitioning ||= Google::Apis::BigqueryV2::TimePartitioning.new
  @gapi.configuration.load.time_partitioning.update! require_partition_filter: val
end

#time_partitioning_type=(type) ⇒ Object

Sets the time partitioning for the destination table. See Partitioned Tables.

You can only set time partitioning when creating a table. BigQuery does not allow you to change partitioning on an existing table.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv"
load_job = dataset.load_job "my_new_table", gs_url do |job|
  job.time_partitioning_type = "DAY"
end

load_job.wait_until_done!
load_job.done? #=> true

Parameters:

  • type (String)

    The time partition type. The supported types are DAY, HOUR, MONTH, and YEAR, which will generate one partition per day, hour, month, and year, respectively.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1523

def time_partitioning_type= type
  @gapi.configuration.load.time_partitioning ||= Google::Apis::BigqueryV2::TimePartitioning.new
  @gapi.configuration.load.time_partitioning.update! type: type
end

#timestamp(name, description: nil, mode: :nullable) ⇒ Object

Adds a timestamp field to the schema.

See Schema#timestamp.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |schema|
  schema.timestamp "creation_date", mode: :required
end

Parameters:

  • name (String)

    The field name. The name must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.



# File 'lib/google/cloud/bigquery/load_job.rb', line 815

def timestamp name, description: nil, mode: :nullable
  schema.timestamp name, description: description, mode: mode
end

#wait_until_done! ⇒ Object



# File 'lib/google/cloud/bigquery/load_job.rb', line 1683

def wait_until_done!
  raise "not implemented in #{self.class}"
end

#write=(new_write) ⇒ Object

Sets the write disposition.

This specifies how to handle data already present in the table. The default value is append.

The following values are supported:

  • truncate - BigQuery overwrites the table data.
  • append - BigQuery appends the data to the table.
  • empty - An error will be returned if the table already contains data.
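
Examples:

A minimal sketch with placeholder names:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
job = dataset.load_job "my_table", "gs://abc/file" do |j|
  j.write = "truncate" # replace any existing table data
end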

Parameters:

  • new_write (String)

    The new write disposition.



# File 'lib/google/cloud/bigquery/load_job.rb', line 1041

def write= new_write
  @gapi.configuration.load.update! write_disposition: Convert.write_disposition(new_write)
end