Class: Google::Cloud::Bigquery::Schema

Inherits:
Object
  • Object
show all
Defined in:
lib/google/cloud/bigquery/schema.rb,
lib/google/cloud/bigquery/schema/field.rb

Overview

Table Schema

A builder for BigQuery table schemas, passed to block arguments to Dataset#create_table and Table#schema. Supports nested and repeated fields via a nested block.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

See Also:

Defined Under Namespace

Classes: Field

Class Method Summary collapse

Instance Method Summary collapse

Class Method Details

.dump(schema, destination) ⇒ Schema

Write a schema as JSON to a file.

The JSON schema file is the same as for the bq CLI.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
schema = Google::Cloud::Bigquery::Schema.dump(
  table.schema,
  "schema.json"
)

Parameters:

  • schema (Schema)

    A Google::Cloud::Bigquery::Schema.

  • destination (IO, String)

    An IO to which to write the schema, or a String containing the filename to write to.

Returns:

  • (Schema)

    The schema so that commands are chainable.



105
106
107
# File 'lib/google/cloud/bigquery/schema.rb', line 105

def dump schema, destination
  schema.dump destination
end

.load(source) ⇒ Schema

Load a schema from a JSON file.

The JSON schema file is the same as for the bq CLI consisting of an array of JSON objects containing the following:

  • name: The column name
  • type: The column's data type
  • description: (Optional) The column's description
  • mode: (Optional) The column's mode (if unspecified, mode defaults to NULLABLE)
  • fields: If type is RECORD, an array of objects defining child fields with these properties

Examples:

require "google/cloud/bigquery"

schema = Google::Cloud::Bigquery::Schema.load(
  File.read("schema.json")
)

Parameters:

  • source (IO, String, Array<Hash>)

    An IO containing the JSON schema, a String containing the JSON schema, or an Array of Hashes containing the schema details.

Returns:



77
78
79
# File 'lib/google/cloud/bigquery/schema.rb', line 77

def load source
  new.load source
end

Instance Method Details

#bignumeric(name, description: nil, mode: :nullable, policy_tags: nil, precision: nil, scale: nil) ⇒ Object

Adds a bignumeric number field to the schema. BIGNUMERIC is a decimal type with fixed precision and scale. Precision is the number of digits that the number contains. Scale is how many of these digits appear after the decimal point. It supports:

Precision: 76.76 (the 77th digit is partial) Scale: 38 Min: -5.7896044618658097711785492504343953926634992332820282019728792003956564819968E+38 Max: 5.7896044618658097711785492504343953926634992332820282019728792003956564819967E+38

This type can represent decimal fractions exactly, and is suitable for financial calculations.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.

  • precision (Integer) (defaults to: nil)

    The precision (maximum number of total digits) for the field. Acceptable values for precision must be: 1 ≤ (precision - scale) ≤ 38. Values for scale must be: 0 ≤ scale ≤ 38. If the scale value is set, the precision value must be set as well.

  • scale (Integer) (defaults to: nil)

    The scale (maximum number of digits in the fractional part) for the field. Acceptable values for precision must be: 1 ≤ (precision - scale) ≤ 38. Values for scale must be: 0 ≤ scale ≤ 38. If the scale value is set, the precision value must be set as well.



437
438
439
440
441
442
443
444
445
# File 'lib/google/cloud/bigquery/schema.rb', line 437

def bignumeric name, description: nil, mode: :nullable, policy_tags: nil, precision: nil, scale: nil
  add_field name,
            :bignumeric,
            description: description,
            mode: mode,
            policy_tags: policy_tags,
            precision: precision,
            scale: scale
end

#boolean(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a boolean field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



463
464
465
# File 'lib/google/cloud/bigquery/schema.rb', line 463

def boolean name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :boolean, description: description, mode: mode, policy_tags: policy_tags
end

#bytes(name, description: nil, mode: :nullable, policy_tags: nil, max_length: nil) ⇒ Object

Adds a bytes field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.

  • max_length (Integer) (defaults to: nil)

    The maximum the maximum number of bytes in the field.



485
486
487
# File 'lib/google/cloud/bigquery/schema.rb', line 485

def bytes name, description: nil, mode: :nullable, policy_tags: nil, max_length: nil
  add_field name, :bytes, description: description, mode: mode, policy_tags: policy_tags, max_length: max_length
end

#date(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a date field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



565
566
567
# File 'lib/google/cloud/bigquery/schema.rb', line 565

def date name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :date, description: description, mode: mode, policy_tags: policy_tags
end

#datetime(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a datetime field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



545
546
547
# File 'lib/google/cloud/bigquery/schema.rb', line 545

def datetime name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :datetime, description: description, mode: mode, policy_tags: policy_tags
end

#dump(destination) ⇒ Schema

Write the schema as JSON to a file.

The JSON schema file is the same as for the bq CLI.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"
table.schema.dump "schema.json"

Parameters:

  • destination (IO, String)

    An IO to which to write the schema, or a String containing the filename to write to.

Returns:

  • (Schema)

    The schema so that commands are chainable.



275
276
277
278
279
280
281
282
283
284
# File 'lib/google/cloud/bigquery/schema.rb', line 275

def dump destination
  if destination.respond_to?(:rewind) && destination.respond_to?(:write)
    destination.rewind
    destination.write JSON.dump(fields.map(&:to_hash))
  else
    File.write String(destination), JSON.dump(fields.map(&:to_hash))
  end

  self
end

#empty?Boolean

Whether the schema has no fields defined.

Returns:

  • (Boolean)

    true when there are no fields, false otherwise.



205
206
207
# File 'lib/google/cloud/bigquery/schema.rb', line 205

def empty?
  fields.empty?
end

#field(name) {|f| ... } ⇒ Field

Retrieve a field by name.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

field = table.schema.field "name"
field.required? #=> true

Yields:

  • (f)

Returns:

  • (Field)

    A field object.



193
194
195
196
197
198
# File 'lib/google/cloud/bigquery/schema.rb', line 193

def field name
  f = fields.find { |fld| fld.name == name.to_s }
  return nil if f.nil?
  yield f if block_given?
  f
end

#fieldsArray<Field>

The fields of the table schema.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

schema = table.schema

schema.fields.each do |field|
  puts field.name
end

Returns:

  • (Array<Field>)

    An array of field objects.



127
128
129
130
131
132
133
# File 'lib/google/cloud/bigquery/schema.rb', line 127

def fields
  if frozen?
    Array(@gapi.fields).map { |f| Field.from_gapi(f).freeze }.freeze
  else
    Array(@gapi.fields).map { |f| Field.from_gapi f }
  end
end

#float(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a floating-point number field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



349
350
351
# File 'lib/google/cloud/bigquery/schema.rb', line 349

def float name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :float, description: description, mode: mode, policy_tags: policy_tags
end

#geography(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a geography field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.

See Also:



587
588
589
# File 'lib/google/cloud/bigquery/schema.rb', line 587

def geography name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :geography, description: description, mode: mode, policy_tags: policy_tags
end

#headersArray<Symbol>

The names of the fields as symbols.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

schema = table.schema

schema.headers.each do |header|
  puts header
end

Returns:

  • (Array<Symbol>)

    An array of column names.



153
154
155
# File 'lib/google/cloud/bigquery/schema.rb', line 153

def headers
  fields.map(&:name).map(&:to_sym)
end

#integer(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds an integer field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



329
330
331
# File 'lib/google/cloud/bigquery/schema.rb', line 329

def integer name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :integer, description: description, mode: mode, policy_tags: policy_tags
end

#load(source) ⇒ Schema

Load the schema from a JSON file.

The JSON schema file is the same as for the bq CLI consisting of an array of JSON objects containing the following:

  • name: The column name
  • type: The column's data type
  • description: (Optional) The column's description
  • mode: (Optional) The column's mode (if unspecified, mode defaults to NULLABLE)
  • fields: If type is RECORD, an array of objects defining child fields with these properties

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table" do |t|
  t.schema.load File.read("path/to/schema.json")
end

Parameters:

  • source (IO, String, Array<Hash>)

    An IO containing the JSON schema, a String containing the JSON schema, or an Array of Hashes containing the schema details.

Returns:

  • (Schema)

    The schema so that commands are chainable.



239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
# File 'lib/google/cloud/bigquery/schema.rb', line 239

def load source
  if source.respond_to?(:rewind) && source.respond_to?(:read)
    source.rewind
    schema_json = String source.read
  elsif source.is_a? Array
    schema_json = JSON.dump source
  else
    schema_json = String source
  end

  schema_json = %({"fields":#{schema_json}})

  @gapi = Google::Apis::BigqueryV2::TableSchema.from_json schema_json

  self
end

#numeric(name, description: nil, mode: :nullable, policy_tags: nil, precision: nil, scale: nil) ⇒ Object

Adds a numeric number field to the schema. NUMERIC is a decimal type with fixed precision and scale. Precision is the number of digits that the number contains. Scale is how many of these digits appear after the decimal point. It supports:

Precision: 38 Scale: 9 Min: -9.9999999999999999999999999999999999999E+28 Max: 9.9999999999999999999999999999999999999E+28

This type can represent decimal fractions exactly, and is suitable for financial calculations.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.

  • precision (Integer) (defaults to: nil)

    The precision (maximum number of total digits) for the field. Acceptable values for precision must be: 1 ≤ (precision - scale) ≤ 29. Values for scale must be: 0 ≤ scale ≤ 9. If the scale value is set, the precision value must be set as well.

  • scale (Integer) (defaults to: nil)

    The scale (maximum number of digits in the fractional part) for the field. Acceptable values for precision must be: 1 ≤ (precision - scale) ≤ 29. Values for scale must be: 0 ≤ scale ≤ 9. If the scale value is set, the precision value must be set as well.



390
391
392
393
394
395
396
397
398
# File 'lib/google/cloud/bigquery/schema.rb', line 390

def numeric name, description: nil, mode: :nullable, policy_tags: nil, precision: nil, scale: nil
  add_field name,
            :numeric,
            description: description,
            mode: mode,
            policy_tags: policy_tags,
            precision: precision,
            scale: scale
end

#param_typesHash

The types of the fields, using the same format as the optional query parameter types.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

schema = table.schema

schema.param_types

Returns:

  • (Hash)

    A hash with column names as keys, and types as values.



174
175
176
# File 'lib/google/cloud/bigquery/schema.rb', line 174

def param_types
  Hash[fields.map { |field| [field.name.to_sym, field.param_type] }]
end

#record(name, description: nil, mode: nil) {|field| ... } ⇒ Object

Adds a record field to the schema. A block must be passed describing the nested fields of the record. For more information about nested and repeated records, see Loading denormalized, nested, and repeated data .

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.create_table "my_table"

table.schema do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |cities_lived|
    cities_lived.string "place", mode: :required
    cities_lived.integer "number_of_years", mode: :required
  end
end

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: nil)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

Yields:

  • (field)

    a block for setting the nested record's schema

Yield Parameters:

  • field (Field)

    the object accepting the nested schema

Raises:

  • (ArgumentError)


625
626
627
628
629
630
631
632
# File 'lib/google/cloud/bigquery/schema.rb', line 625

def record name, description: nil, mode: nil
  # TODO: do we need to raise if no block was given?
  raise ArgumentError, "a block is required" unless block_given?

  nested_field = add_field name, :record, description: description, mode: mode
  yield nested_field
  nested_field
end

#string(name, description: nil, mode: :nullable, policy_tags: nil, max_length: nil) ⇒ Object

Adds a string field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.

  • max_length (Integer) (defaults to: nil)

    The maximum UTF-8 length of strings allowed in the field.



304
305
306
307
308
309
310
311
# File 'lib/google/cloud/bigquery/schema.rb', line 304

def string name, description: nil, mode: :nullable, policy_tags: nil, max_length: nil
  add_field name,
            :string,
            description: description,
            mode: mode,
            policy_tags: policy_tags,
            max_length: max_length
end

#time(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a time field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



525
526
527
# File 'lib/google/cloud/bigquery/schema.rb', line 525

def time name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :time, description: description, mode: mode, policy_tags: policy_tags
end

#timestamp(name, description: nil, mode: :nullable, policy_tags: nil) ⇒ Object

Adds a timestamp field to the schema.

Parameters:

  • name (String)

    The field name. The name must contain only letters ([A-Za-z]), numbers ([0-9]), or underscores (_), and must start with a letter or underscore. The maximum length is 128 characters.

  • description (String) (defaults to: nil)

    A description of the field.

  • mode (Symbol) (defaults to: :nullable)

    The field's mode. The possible values are :nullable, :required, and :repeated. The default value is :nullable.

  • policy_tags (Array<String>, String) (defaults to: nil)

    The policy tag list or single policy tag for the field. Policy tag identifiers are of the form projects/*/locations/*/taxonomies/*/policyTags/*. At most 1 policy tag is currently allowed.



505
506
507
# File 'lib/google/cloud/bigquery/schema.rb', line 505

def timestamp name, description: nil, mode: :nullable, policy_tags: nil
  add_field name, :timestamp, description: description, mode: mode, policy_tags: policy_tags
end