Class: Google::Cloud::Bigquery::Model

Inherits:
Object
Defined in:
lib/google/cloud/bigquery/model.rb,
lib/google/cloud/bigquery/model/list.rb

Overview

Model

A model in BigQuery ML represents what an ML system has learned from the training data.

The following types of models are supported by BigQuery ML:

  • Linear regression for forecasting; for example, the sales of an item on a given day. Labels are real-valued (they cannot be +/- infinity or NaN).
  • Binary logistic regression for classification; for example, determining whether a customer will make a purchase. Labels must only have two possible values.
  • Multiclass logistic regression for classification. These models can be used to predict multiple possible values such as whether an input is "low-value," "medium-value," or "high-value." Labels can have up to 50 unique values. In BigQuery ML, multiclass logistic regression training uses a multinomial classifier with a cross entropy loss function.
  • K-means clustering for data segmentation (beta); for example, identifying customer segments. K-means is an unsupervised learning technique, so model training requires neither labels nor split data for training or evaluation.

In BigQuery ML, a model can be used with data from multiple BigQuery datasets for training and for prediction.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

model = dataset.model "my_model"


Defined Under Namespace

Classes: List


Instance Method Details

#created_at ⇒ Time?

The time when this model was created.

Returns:

  • (Time, nil)

    The creation time, or nil if the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 241

def created_at
  return nil if reference?
  Convert.millis_to_time @gapi_json[:creationTime]
end
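
The API reports timestamps such as the creation time in milliseconds since the Unix epoch, which the method converts to a Time. A minimal sketch of that kind of conversion, assuming a hypothetical helper named millis_to_time_sketch (it is an illustration only, not the library's Convert.millis_to_time):

```ruby
# Hypothetical helper for illustration; not the library's Convert.millis_to_time.
def millis_to_time_sketch millis
  return nil if millis.nil?
  # Assumption: the value may arrive as a numeric string, so coerce it first.
  ms = Integer(millis)
  # Time.at takes whole seconds plus a sub-second part in the given unit.
  Time.at(ms / 1000, ms % 1000, :millisecond).utc
end

created = millis_to_time_sketch "1500000000123"
puts created.to_i
```

A nil input yields nil, which mirrors the way the accessor returns nil when no timestamp is available.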

#dataset_id ⇒ String

The ID of the Dataset containing this model.

Returns:

  • (String)

    The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.



# File 'lib/google/cloud/bigquery/model.rb', line 108

def dataset_id
  return @reference.dataset_id if reference?
  @gapi_json[:modelReference][:datasetId]
end
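
The ID rule stated above (letters, numbers, and underscores only, at most 1,024 characters) can be checked locally before calling the service. A minimal sketch, assuming a hypothetical valid_id_sketch? helper; the library is not shown performing this check itself, and the same rule is documented for #model_id:

```ruby
# Hypothetical validator for illustration; the service remains the final authority.
ID_PATTERN = /\A[A-Za-z0-9_]{1,1024}\z/

def valid_id_sketch? id
  id.is_a?(String) && ID_PATTERN.match?(id)
end

puts valid_id_sketch?("my_dataset")
puts valid_id_sketch?("my-dataset")
```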

#delete ⇒ Boolean

Permanently deletes the model.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

model.delete

Returns:

  • (Boolean)

    Returns true if the model was deleted.



# File 'lib/google/cloud/bigquery/model.rb', line 501

def delete
  ensure_service!
  service.delete_model dataset_id, model_id
  # Set flag for #exists?
  @exists = false
  true
end

#description ⇒ String?

A user-friendly description of the model.

Returns:

  • (String, nil)

    The description, or nil if the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 211

def description
  return nil if reference?
  ensure_full_data!
  @gapi_json[:description]
end

#description=(new_description) ⇒ Object

Updates the user-friendly description of the model.

If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameters:

  • new_description (String)

    The new user-friendly description.



# File 'lib/google/cloud/bigquery/model.rb', line 228

def description= new_description
  ensure_full_data!
  patch_gapi! description: new_description
end

#encryption ⇒ EncryptionConfiguration?

The EncryptionConfiguration object that represents the custom encryption method used to protect this model. If not set, Dataset#default_encryption is used.

Present only if this model is using custom encryption.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

encrypt_config = model.encryption

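Returns:

  • (EncryptionConfiguration, nil)

    The custom encryption configuration, or nil if not present or the object is a reference (see #reference?).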



# File 'lib/google/cloud/bigquery/model.rb', line 394

def encryption
  return nil if reference?
  return nil if @gapi_json[:encryptionConfiguration].nil?
  # We have to create a gapic object from the hash because that is what
  # EncryptionConfiguration is expecting.
  json_cmek = @gapi_json[:encryptionConfiguration].to_json
  gapi_cmek = Google::Apis::BigqueryV2::EncryptionConfiguration.from_json json_cmek
  EncryptionConfiguration.from_gapi(gapi_cmek).freeze
end

#encryption=(value) ⇒ Object

Set the EncryptionConfiguration object that represents the custom encryption method used to protect this model. If not set, Dataset#default_encryption is used.

Present only if this model is using custom encryption.

If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

key_name = "projects/a/locations/b/keyRings/c/cryptoKeys/d"
encrypt_config = bigquery.encryption kms_key: key_name

model.encryption = encrypt_config

Parameters:

  • value (EncryptionConfiguration)

    The custom encryption configuration to apply to this model.



# File 'lib/google/cloud/bigquery/model.rb', line 434

def encryption= value
  ensure_full_data!
  # We have to create a hash from the gapic object's JSON because that
  # is what Model is expecting.
  json_cmek = JSON.parse value.to_gapi.to_json, symbolize_names: true
  patch_gapi! encryptionConfiguration: json_cmek
end
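
The setter above turns an object into a symbol-keyed hash by serializing it to JSON and parsing the result back. A minimal sketch of that round trip using only the standard library; FakeGapi is a hypothetical stand-in for the real gapi object, and the kmsKeyName field is assumed here for illustration:

```ruby
require "json"

# Hypothetical stand-in for a gapi object that can serialize itself to JSON.
FakeGapi = Struct.new :kms_key_name do
  def to_json(*args)
    { "kmsKeyName" => kms_key_name }.to_json(*args)
  end
end

gapi = FakeGapi.new "projects/a/locations/b/keyRings/c/cryptoKeys/d"

# symbolize_names: true yields symbol keys, matching a symbol-keyed resource hash.
hash = JSON.parse gapi.to_json, symbolize_names: true
puts hash.inspect
```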

#etag ⇒ String?

The ETag hash of the model.

Returns:

  • (String, nil)

    The ETag hash, or nil if the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 197

def etag
  return nil if reference?
  ensure_full_data!
  @gapi_json[:etag]
end

#exists?(force: false) ⇒ Boolean

Determines whether the model exists in the BigQuery service. The result is cached locally. To refresh state, set force to true.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model", skip_lookup: true
model.exists? #=> true

Parameters:

  • force (Boolean) (defaults to: false)

    Force the latest resource representation to be retrieved from the BigQuery service when true. Otherwise the return value of this method will be memoized to reduce the number of API calls made to the BigQuery service. The default is false.

Returns:

  • (Boolean)

    true when the model exists in the BigQuery service, false otherwise.



# File 'lib/google/cloud/bigquery/model.rb', line 559

def exists? force: false
  return resource_exists? if force
  # If we have a value, return it
  return @exists unless @exists.nil?
  # Always true if we have a gapi_json object
  return true if resource?
  resource_exists?
end
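
The decision order in the listing above (forced lookup, then the memoized value, then a held resource, then a lookup) can be traced with a small stand-in. This is a sketch under assumptions: ExistsSketch is hypothetical, the service call is faked, and the lookup is assumed to cache its answer.

```ruby
# Hypothetical stand-in mirroring the decision order of #exists?.
class ExistsSketch
  attr_reader :lookups

  def initialize loaded:, remote_exists:
    @loaded = loaded               # true when a resource representation is held
    @remote_exists = remote_exists # what the faked service would answer
    @exists = nil                  # memoized answer, nil until known
    @lookups = 0
  end

  def exists? force: false
    return lookup if force
    return @exists unless @exists.nil?
    return true if @loaded
    lookup
  end

  private

  # Pretends to call the service and caches the answer (assumption).
  def lookup
    @lookups += 1
    @exists = @remote_exists
  end
end

ref = ExistsSketch.new loaded: false, remote_exists: true
ref.exists?               # first call performs a lookup
ref.exists?               # second call is served from the memoized value
ref.exists?(force: true)  # force bypasses the memoized value
puts ref.lookups
```

Only the first and the forced call reach the faked service, which is the saving the force parameter description refers to.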

#expires_at ⇒ Time?

The time when this model expires. If not present, the model will persist indefinitely. Expired models will be deleted and their storage reclaimed.

Returns:

  • (Time, nil)

    The expiration time, or nil if not present or the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 269

def expires_at
  return nil if reference?
  ensure_full_data!
  Convert.millis_to_time @gapi_json[:expirationTime]
end

#expires_at=(new_expires_at) ⇒ Object

Updates the time when this model expires.

If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameters:

  • new_expires_at (Time)

    The new time when this model expires.



# File 'lib/google/cloud/bigquery/model.rb', line 286

def expires_at= new_expires_at
  ensure_full_data!
  new_expires_millis = Convert.time_to_millis new_expires_at
  patch_gapi! expirationTime: new_expires_millis
end
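
The setter converts the new expiration time to milliseconds before patching the model. A minimal sketch of such a conversion, assuming the argument is a Time; time_to_millis_sketch is a hypothetical helper, not the library's Convert.time_to_millis:

```ruby
# Hypothetical helper for illustration; not the library's Convert.time_to_millis.
def time_to_millis_sketch time
  return nil if time.nil?
  # Whole seconds scaled to milliseconds, plus the sub-second part.
  (time.to_i * 1000) + (time.nsec / 1_000_000)
end

one_week = 7 * 24 * 60 * 60
expires = Time.at(1_500_000_000) + one_week
puts time_to_millis_sketch(expires)
```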

#feature_columns ⇒ Array<StandardSql::Field>

The input feature columns that were used to train this model.

Returns:

  • (Array<StandardSql::Field>)



# File 'lib/google/cloud/bigquery/model.rb', line 449

def feature_columns
  ensure_full_data!
  Array(@gapi_json[:featureColumns]).map do |field_gapi_json|
    field_gapi = Google::Apis::BigqueryV2::StandardSqlField.from_json field_gapi_json.to_json
    StandardSql::Field.from_gapi field_gapi
  end
end

#label_columns ⇒ Array<StandardSql::Field>

The label columns that were used to train this model. The model's output columns use these names with a "predicted_" prefix.

Returns:

  • (Array<StandardSql::Field>)



# File 'lib/google/cloud/bigquery/model.rb', line 465

def label_columns
  ensure_full_data!
  Array(@gapi_json[:labelColumns]).map do |field_gapi_json|
    field_gapi = Google::Apis::BigqueryV2::StandardSqlField.from_json field_gapi_json.to_json
    StandardSql::Field.from_gapi field_gapi
  end
end
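
The "predicted_" naming can be illustrated with plain strings. This is a sketch only: the label names below are hypothetical, and real values would come from the fields returned by #label_columns.

```ruby
# Hypothetical label column names used purely for illustration.
label_names = ["purchased", "revenue"]

# Derive the corresponding output column names.
predicted_names = label_names.map { |name| "predicted_#{name}" }
puts predicted_names.inspect
```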

#labels ⇒ Hash<String, String>?

A hash of user-provided labels associated with this model. Labels are used to organize and group models. See Using Labels.

The returned hash is frozen and changes are not allowed. Use #labels= to replace the entire hash.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

labels = model.labels

Returns:

  • (Hash<String, String>, nil)

    A hash containing key/value pairs.



# File 'lib/google/cloud/bigquery/model.rb', line 327

def labels
  return nil if reference?
  m = @gapi_json[:labels]
  m = m.to_h if m.respond_to? :to_h
  m.dup.freeze
end
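
Because the method returns a frozen copy, in-place changes raise an error and the usual pattern is to build a new hash and assign it as a whole. A minimal sketch with a plain hash standing in for the stored labels (no service call is made):

```ruby
# Plain hash standing in for the model's stored labels (hypothetical data).
stored = { "env" => "production" }
labels = stored.dup.freeze

begin
  labels["team"] = "ml"
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end

# merge returns a new, unfrozen hash that could be passed to #labels=.
new_labels = labels.merge("team" => "ml")
puts new_labels.inspect
```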

#labels=(new_labels) ⇒ Object

Updates the hash of user-provided labels associated with this model. Labels are used to organize and group models. See Using Labels.

If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

model.labels = { "env" => "production" }

Parameters:

  • new_labels (Hash<String, String>)

    A hash containing key/value pairs.

    • Label keys and values can be no longer than 63 characters.
    • Label keys and values can contain only lowercase letters, numbers, underscores, hyphens, and international characters.
    • Label keys and values cannot exceed 128 bytes in size.
    • Label keys must begin with a letter.
    • Label keys must be unique within a model.
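
The rules above can be checked locally before assigning labels. A minimal sketch, assuming a hypothetical label_errors_sketch helper and reading "international characters" as Unicode lowercase and caseless letters; the service is the final authority, and the library is not shown doing this validation. Key uniqueness is already guaranteed by using a Hash.

```ruby
# Hypothetical client-side check; the character classes are an interpretation
# of "lowercase letters, numbers, underscores, hyphens, international characters".
LABEL_CHARS = /\A[\p{Ll}\p{Lo}\p{N}_-]*\z/
KEY_START   = /\A[\p{Ll}\p{Lo}]/

def label_errors_sketch labels
  errors = []
  labels.each do |key, value|
    [key, value].each do |text|
      errors << "#{text.inspect} is longer than 63 characters" if text.length > 63
      errors << "#{text.inspect} exceeds 128 bytes" if text.bytesize > 128
      errors << "#{text.inspect} has disallowed characters" unless LABEL_CHARS.match?(text)
    end
    errors << "key #{key.inspect} must begin with a letter" unless KEY_START.match?(key)
  end
  errors
end

puts label_errors_sketch({ "env" => "production" }).inspect
puts label_errors_sketch({ "1env" => "production" }).inspect
```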


# File 'lib/google/cloud/bigquery/model.rb', line 364

def labels= new_labels
  ensure_full_data!
  patch_gapi! labels: new_labels
end

#location ⇒ String?

The geographic location where the model should reside. Possible values include EU and US. The default value is US.

Returns:

  • (String, nil)

    The location code.



# File 'lib/google/cloud/bigquery/model.rb', line 300

def location
  return nil if reference?
  ensure_full_data!
  @gapi_json[:location]
end

#model_id ⇒ String

A unique ID for this model.

Returns:

  • (String)

    The ID must contain only letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length is 1,024 characters.



# File 'lib/google/cloud/bigquery/model.rb', line 95

def model_id
  return @reference.model_id if reference?
  @gapi_json[:modelReference][:modelId]
end

#model_type ⇒ String?

Type of the model resource. Expected to be one of the following:

  • LINEAR_REGRESSION - Linear regression model.
  • LOGISTIC_REGRESSION - Logistic regression based classification model.
  • KMEANS - K-means clustering model (beta).
  • TENSORFLOW - An imported TensorFlow model (beta).

Returns:

  • (String, nil)

    The model type, or nil if the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 154

def model_type
  return nil if reference?
  @gapi_json[:modelType]
end

#modified_at ⇒ Time?

The time when this model was last modified.

Returns:

  • (Time, nil)

    The last modified time, or nil if not present or the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 254

def modified_at
  return nil if reference?
  Convert.millis_to_time @gapi_json[:lastModifiedTime]
end

#name ⇒ String?

The name of the model.

Returns:

  • (String, nil)

    The friendly name, or nil if the object is a reference (see #reference?).



# File 'lib/google/cloud/bigquery/model.rb', line 167

def name
  return nil if reference?
  ensure_full_data!
  @gapi_json[:friendlyName]
end

#name=(new_name) ⇒ Object

Updates the name of the model.

If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.

Parameters:

  • new_name (String)

    The new friendly name.



# File 'lib/google/cloud/bigquery/model.rb', line 184

def name= new_name
  ensure_full_data!
  patch_gapi! friendlyName: new_name
end

#project_id ⇒ String

The ID of the Project containing this model.

Returns:

  • (String)

    The project ID.



# File 'lib/google/cloud/bigquery/model.rb', line 120

def project_id
  return @reference.project_id if reference?
  @gapi_json[:modelReference][:projectId]
end

#reference? ⇒ Boolean

Whether the model was created without retrieving the resource representation from the BigQuery service.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model", skip_lookup: true

model.reference? #=> true
model.reload!
model.reference? #=> false

Returns:

  • (Boolean)

    true when the model is just a local reference object, false otherwise.



# File 'lib/google/cloud/bigquery/model.rb', line 587

def reference?
  @gapi_json.nil?
end

#reload! ⇒ Google::Cloud::Bigquery::Model

Also known as: refresh!

Reloads the model with current data from the BigQuery service.

Examples:

Skip retrieving the model from the service, then load it:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model", skip_lookup: true

model.reference? #=> true
model.reload!
model.resource? #=> true

Returns:

  • (Google::Cloud::Bigquery::Model)

    Returns the reloaded model.



# File 'lib/google/cloud/bigquery/model.rb', line 529

def reload!
  ensure_service!
  @gapi_json = service.get_model dataset_id, model_id
  @reference = nil
  @exists = nil
  self
end

#resource? ⇒ Boolean

Whether the model was created with a resource representation from the BigQuery service.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model", skip_lookup: true

model.resource? #=> false
model.reload!
model.resource? #=> true

Returns:

  • (Boolean)

    true when the model was created with a resource representation, false otherwise.



# File 'lib/google/cloud/bigquery/model.rb', line 610

def resource?
  !@gapi_json.nil?
end

#resource_full? ⇒ Boolean

Whether the model was created with a full resource representation from the BigQuery service.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.model "my_model"

model.resource_full? #=> true

Returns:

  • (Boolean)

    true when the model was created with a full resource representation, false otherwise.



# File 'lib/google/cloud/bigquery/model.rb', line 659

def resource_full?
  resource? && @gapi_json.key?(:friendlyName)
end

#resource_partial? ⇒ Boolean

Whether the model was created with a partial resource representation from the BigQuery service by retrieval through Dataset#models. See Models: list response for the contents of the partial representation. Accessing any attribute outside of the partial representation will result in loading the full representation.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

dataset = bigquery.dataset "my_dataset"
model = dataset.models.first

model.resource_partial? #=> true
model.description # Loads the full resource.
model.resource_partial? #=> false

Returns:

  • (Boolean)

    true when the model was created with a partial resource representation, false otherwise.



# File 'lib/google/cloud/bigquery/model.rb', line 638

def resource_partial?
  resource? && !resource_full?
end
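
Taken together, #reference?, #resource?, #resource_full?, and #resource_partial? describe three states: a local reference, a partial resource, and a full resource. A minimal sketch with a hypothetical StateSketch class, where a plain hash plays the role of the stored resource data and nil means "reference only":

```ruby
# Hypothetical stand-in mirroring the four predicates shown in the listings.
class StateSketch
  def initialize data = nil
    @data = data
  end

  def reference?;        @data.nil?; end
  def resource?;         !@data.nil?; end
  # As in the listing, "full" hinges on the :friendlyName key being present.
  def resource_full?;    resource? && @data.key?(:friendlyName); end
  def resource_partial?; resource? && !resource_full?; end

  def state
    return :reference if reference?
    resource_full? ? :full : :partial
  end
end

puts StateSketch.new.state
puts StateSketch.new({ modelType: "KMEANS" }).state
puts StateSketch.new({ modelType: "KMEANS", friendlyName: nil }).state
```

Note that the key only has to be present; a nil friendly name still counts as a full representation in this sketch, as it does in the listing for #resource_full?.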

#training_runs ⇒ Array<Google::Cloud::Bigquery::Model::TrainingRun>

Information for all training runs in increasing order of startTime.

Returns:

  • (Array<Google::Cloud::Bigquery::Model::TrainingRun>)


# File 'lib/google/cloud/bigquery/model.rb', line 480

def training_runs
  ensure_full_data!
  Array @gapi_json[:trainingRuns]
end