Class: Google::Cloud::Bigquery::Model
- Inherits: Object
  - Object
    - Google::Cloud::Bigquery::Model
- Defined in: lib/google/cloud/bigquery/model.rb, lib/google/cloud/bigquery/model/list.rb
Overview
Model
A model in BigQuery ML represents what an ML system has learned from the training data.
The following types of models are supported by BigQuery ML:
- Linear regression for forecasting; for example, the sales of an item on a given day. Labels are real-valued (they cannot be +/- infinity or NaN).
- Binary logistic regression for classification; for example, determining whether a customer will make a purchase. Labels must only have two possible values.
- Multiclass logistic regression for classification. These models can be used to predict multiple possible values such as whether an input is "low-value," "medium-value," or "high-value." Labels can have up to 50 unique values. In BigQuery ML, multiclass logistic regression training uses a multinomial classifier with a cross entropy loss function.
- K-means clustering for data segmentation (beta); for example, identifying customer segments. K-means is an unsupervised learning technique, so model training requires neither labels nor a split of the data into training and evaluation sets.
In BigQuery ML, a model can be used with data from multiple BigQuery datasets for training and for prediction.
Defined Under Namespace
Classes: List
Attributes
-
#created_at ⇒ Time?
The time when this model was created.
-
#dataset_id ⇒ String
The ID of the Dataset containing this model.
-
#description ⇒ String?
A user-friendly description of the model.
-
#description=(new_description) ⇒ Object
Updates the user-friendly description of the model.
-
#encryption ⇒ EncryptionConfiguration?
The EncryptionConfiguration object that represents the custom encryption method used to protect this model.
-
#encryption=(value) ⇒ Object
Set the EncryptionConfiguration object that represents the custom encryption method used to protect this model.
-
#etag ⇒ String?
The ETag hash of the model.
-
#expires_at ⇒ Time?
The time when this model expires.
-
#expires_at=(new_expires_at) ⇒ Object
Updates the time when this model expires.
-
#feature_columns ⇒ Array<StandardSql::Field>
The input feature columns that were used to train this model.
-
#label_columns ⇒ Array<StandardSql::Field>
The label columns that were used to train this model.
-
#labels ⇒ Hash<String, String>?
A hash of user-provided labels associated with this model.
-
#labels=(new_labels) ⇒ Object
Updates the hash of user-provided labels associated with this model.
-
#location ⇒ String?
The geographic location where the model should reside.
-
#model_id ⇒ String
A unique ID for this model.
-
#model_type ⇒ String?
Type of the model resource.
-
#modified_at ⇒ Time?
The date when this model was last modified.
-
#name ⇒ String?
The name of the model.
-
#name=(new_name) ⇒ Object
Updates the name of the model.
-
#project_id ⇒ String
The ID of the Project containing this model.
-
#training_runs ⇒ Array<Google::Cloud::Bigquery::Model::TrainingRun>
Information for all training runs in increasing order of startTime.
Data
-
#extract(extract_url, format: nil) {|job| ... } ⇒ Boolean
Exports the model to Google Cloud Storage using a synchronous method that blocks for a response.
-
#extract_job(extract_url, format: nil, job_id: nil, prefix: nil, labels: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::ExtractJob
Exports the model to Google Cloud Storage asynchronously, immediately returning an ExtractJob that can be used to track the progress of the export job.
Lifecycle
-
#delete ⇒ Boolean
Permanently deletes the model.
-
#exists?(force: false) ⇒ Boolean
Determines whether the model exists in the BigQuery service.
-
#reference? ⇒ Boolean
Whether the model was created without retrieving the resource representation from the BigQuery service.
-
#reload! ⇒ Google::Cloud::Bigquery::Model
(also: #refresh!)
Reloads the model with current data from the BigQuery service.
-
#resource? ⇒ Boolean
Whether the model was created with a resource representation from the BigQuery service.
-
#resource_full? ⇒ Boolean
Whether the model was created with a full resource representation from the BigQuery service.
-
#resource_partial? ⇒ Boolean
Whether the model was created with a partial resource representation from the BigQuery service by retrieval through Dataset#models.
Instance Method Details
#created_at ⇒ Time?
The time when this model was created.
    # File 'lib/google/cloud/bigquery/model.rb', line 241
    def created_at
      return nil if reference?
      Convert.millis_to_time @gapi_json[:creationTime]
    end
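The millisecond-to-Time conversion used here can be sketched roughly as follows. This is a minimal stand-in, not the library's own `Convert.millis_to_time`; the helper name `millis_to_time_sketch` and the sample timestamp are assumptions made for illustration.

```ruby
# Hypothetical stand-in for the millisecond-to-Time conversion. The real
# helper lives in the library's Convert module and may differ in detail.
def millis_to_time_sketch millis
  return nil if millis.nil?
  # Integer() accepts both Integer and numeric String input, which helps
  # because JSON APIs often encode 64-bit integers as strings.
  Time.at Rational(Integer(millis), 1000)
end

# Sample value (an assumption), given as a string of epoch milliseconds.
created = millis_to_time_sketch "1500000000250"
puts created.utc.strftime("%Y-%m-%d %H:%M:%S.%L")
```

Using a Rational keeps the sub-second part exact instead of passing it through a Float.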
#dataset_id ⇒ String
The ID of the Dataset containing this model.
    # File 'lib/google/cloud/bigquery/model.rb', line 108
    def dataset_id
      return @reference.dataset_id if reference?
      @gapi_json[:modelReference][:datasetId]
    end
#delete ⇒ Boolean
Permanently deletes the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 646
    def delete
      ensure_service!
      service.delete_model dataset_id, model_id
      # Set flag for #exists?
      @exists = false
      true
    end
#description ⇒ String?
A user-friendly description of the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 211
    def description
      return nil if reference?
      ensure_full_data!
      @gapi_json[:description]
    end
#description=(new_description) ⇒ Object
Updates the user-friendly description of the model.
If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
    # File 'lib/google/cloud/bigquery/model.rb', line 228
    def description= new_description
      ensure_full_data!
      patch_gapi! description: new_description
    end
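The listing above does not show how the service enforces ETag-based optimistic concurrency control. As a generic illustration of the idea only, here is a minimal in-memory sketch: `FakeModelService`, its `get` and `patch` methods, and the `"v1"`-style ETag values are all hypothetical and are not part of the library or the BigQuery API.

```ruby
# Hypothetical in-memory service that rejects patches carrying a stale ETag.
class FakeModelService
  StaleEtagError = Class.new StandardError

  def initialize
    @resource = { etag: "v1", description: "initial" }
  end

  # Returns a copy of the full resource, including its current ETag.
  def get
    @resource.dup
  end

  # Applies changes only if the caller's ETag matches the stored one.
  def patch changes, etag:
    raise StaleEtagError, "resource changed since it was read" unless etag == @resource[:etag]
    version = @resource[:etag].delete("v").to_i + 1
    @resource = @resource.merge(changes).merge(etag: "v#{version}")
    @resource.dup
  end
end

service = FakeModelService.new
current = service.get # fetch the full representation first
updated = service.patch({ description: "Sales forecast" }, etag: current[:etag])
puts updated[:description]

begin
  # Reusing the old ETag now fails because the resource has moved on.
  service.patch({ description: "Overwrite" }, etag: current[:etag])
rescue FakeModelService::StaleEtagError => e
  puts "rejected: #{e.message}"
end
```

Reading the full resource before patching is what gives the writer a current ETag to send along, which is why the setter loads the full representation first.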
#encryption ⇒ EncryptionConfiguration?
The EncryptionConfiguration object that represents the custom encryption method used to protect this model. If not set, Dataset#default_encryption is used.
Present only if this model is using custom encryption.
    # File 'lib/google/cloud/bigquery/model.rb', line 399
    def encryption
      return nil if reference?
      return nil if @gapi_json[:encryptionConfiguration].nil?
      # We have to create a gapic object from the hash because that is what
      # EncryptionConfiguration is expecting.
      json_cmek = @gapi_json[:encryptionConfiguration].to_json
      gapi_cmek = Google::Apis::BigqueryV2::EncryptionConfiguration.from_json json_cmek
      EncryptionConfiguration.from_gapi(gapi_cmek).freeze
    end
#encryption=(value) ⇒ Object
Set the EncryptionConfiguration object that represents the custom encryption method used to protect this model. If not set, Dataset#default_encryption is used.
Present only if this model is using custom encryption.
If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
    # File 'lib/google/cloud/bigquery/model.rb', line 439
    def encryption= value
      ensure_full_data!
      # We have to create a hash from the gapic object's JSON because that
      # is what Model is expecting.
      json_cmek = JSON.parse value.to_gapi.to_json, symbolize_names: true
      patch_gapi! encryptionConfiguration: json_cmek
    end
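The JSON round trip in the setter turns an API object into a plain Hash with Symbol keys. A small sketch of that step in isolation, using only the standard library: `FakeEncryptionGapi` is a hypothetical stand-in for the API object, and the key path is a sample value.

```ruby
require "json"

# Hypothetical stand-in for an API object that serializes to camelCase JSON.
FakeEncryptionGapi = Struct.new(:kms_key_name) do
  def to_json(*args)
    { "kmsKeyName" => kms_key_name }.to_json(*args)
  end
end

# Sample key path (an assumption, not a real key).
gapi = FakeEncryptionGapi.new "projects/a/locations/b/keyRings/c/cryptoKeys/d"

# Round-trip through JSON to obtain a plain Hash with Symbol keys.
json_cmek = JSON.parse gapi.to_json, symbolize_names: true
puts json_cmek.inspect
```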
#etag ⇒ String?
The ETag hash of the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 197
    def etag
      return nil if reference?
      ensure_full_data!
      @gapi_json[:etag]
    end
#exists?(force: false) ⇒ Boolean
Determines whether the model exists in the BigQuery service. The result is cached locally. To refresh state, set force to true.
    # File 'lib/google/cloud/bigquery/model.rb', line 704
    def exists? force: false
      return resource_exists? if force
      # If we have a value, return it
      return @exists unless @exists.nil?
      # Always true if we have a gapi_json object
      return true if resource?
      resource_exists?
    end
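The caching rule can be illustrated with a small stand-in that counts how often the service would be contacted. `CachedExistence` and its `remote_lookups` counter are hypothetical, and the sketch omits the shortcut for models that already hold a resource representation.

```ruby
# Hypothetical stand-in that mimics the caching rules of #exists?.
# remote_lookups counts how often the (fake) service is asked.
class CachedExistence
  attr_reader :remote_lookups

  def initialize present_remotely
    @present_remotely = present_remotely
    @exists = nil
    @remote_lookups = 0
  end

  def exists? force: false
    return lookup! if force
    # Reuse the cached answer when there is one, even if it is false.
    return @exists unless @exists.nil?
    lookup!
  end

  private

  # Simulates a round trip to the service and caches the answer.
  def lookup!
    @remote_lookups += 1
    @exists = @present_remotely
  end
end

model = CachedExistence.new true
model.exists?             # first call asks the service
model.exists?             # answered from the cache
model.exists? force: true # bypasses the cache
puts model.remote_lookups
```

Checking `@exists.nil?` rather than truthiness matters here: a cached `false` is a valid answer and should not trigger another lookup.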
#expires_at ⇒ Time?
The time when this model expires. If not present, the model will persist indefinitely. Expired models will be deleted and their storage reclaimed.
    # File 'lib/google/cloud/bigquery/model.rb', line 269
    def expires_at
      return nil if reference?
      ensure_full_data!
      Convert.millis_to_time @gapi_json[:expirationTime]
    end
#expires_at=(new_expires_at) ⇒ Object
Updates the time when this model expires.
If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
    # File 'lib/google/cloud/bigquery/model.rb', line 286
    def expires_at= new_expires_at
      ensure_full_data!
      new_expires_millis = Convert.time_to_millis new_expires_at
      patch_gapi! expirationTime: new_expires_millis
    end
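The setter sends the expiration as epoch milliseconds. A rough sketch of that conversion, assuming simple rounding to the nearest millisecond: `time_to_millis_sketch` is a hypothetical helper, not the library's `Convert.time_to_millis`, and the base timestamp is a sample value.

```ruby
# Hypothetical stand-in for converting a Time to epoch milliseconds.
def time_to_millis_sketch time
  return nil if time.nil?
  # Time#to_r is exact, so the multiplication does not lose precision.
  (time.to_r * 1000).round
end

one_week = 7 * 24 * 60 * 60
base = Time.at 1_500_000_000 # sample starting point (an assumption)
puts time_to_millis_sketch(base + one_week)
```

On a real model, the equivalent assignment would look like `model.expires_at = Time.now + one_week`.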
#extract(extract_url, format: nil) {|job| ... } ⇒ Boolean
Exports the model to Google Cloud Storage using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #extract_job.
The geographic location for the job ("US", "EU", etc.) can be set via ExtractJob::Updater#location= in a block passed to this method. If the model is a full resource representation (see #resource_full?), the location of the job will automatically be set to the location of the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 623
    def extract extract_url, format: nil, &block
      job = extract_job extract_url, format: format, &block
      job.wait_until_done!
      ensure_job_succeeded! job
      true
    end
#extract_job(extract_url, format: nil, job_id: nil, prefix: nil, labels: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::ExtractJob
Exports the model to Google Cloud Storage asynchronously, immediately returning an ExtractJob that can be used to track the progress of the export job. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling Job#wait_until_done!. See also #extract.
The geographic location for the job ("US", "EU", etc.) can be set via ExtractJob::Updater#location= in a block passed to this method. If the model is a full resource representation (see #resource_full?), the location of the job will automatically be set to the location of the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 569
    def extract_job extract_url, format: nil, job_id: nil, prefix: nil, labels: nil
      ensure_service!
      options = { format: format, job_id: job_id, prefix: prefix, labels: labels }
      updater = ExtractJob::Updater.from_options service, model_ref, extract_url, options
      updater.location = location if location # may be model reference

      yield updater if block_given?

      job_gapi = updater.to_gapi
      gapi = service.extract_table job_gapi
      Job.from_gapi gapi, service
    end
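The polling pattern described for this method (repeatedly calling `reload!` and `done?`) can be sketched without a live service. `FakeExtractJob` and `poll_until_done` are hypothetical names introduced for illustration; only the `reload!` and `done?` method names come from the text above.

```ruby
# Hypothetical job stand-in: reports done after a fixed number of reloads.
class FakeExtractJob
  attr_reader :reloads

  def initialize reloads_needed
    @reloads_needed = reloads_needed
    @reloads = 0
  end

  # Simulates refreshing the job state from the service.
  def reload!
    @reloads += 1
    self
  end

  def done?
    @reloads >= @reloads_needed
  end
end

# Polls until the job reports completion, pausing between checks.
def poll_until_done job, interval: 0.0
  until job.done?
    sleep interval
    job.reload!
  end
  job
end

job = poll_until_done FakeExtractJob.new(3)
puts "done after #{job.reloads} reloads"
```

With a real job, a nonzero `interval` avoids hammering the service; blocking with `wait_until_done!` is the simpler choice when the caller has nothing else to do in the meantime.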
#feature_columns ⇒ Array<StandardSql::Field>
The input feature columns that were used to train this model.
    # File 'lib/google/cloud/bigquery/model.rb', line 454
    def feature_columns
      ensure_full_data!
      Array(@gapi_json[:featureColumns]).map do |field_gapi_json|
        field_gapi = Google::Apis::BigqueryV2::StandardSqlField.from_json field_gapi_json.to_json
        StandardSql::Field.from_gapi field_gapi
      end
    end
#label_columns ⇒ Array<StandardSql::Field>
The label columns that were used to train this model. The output of the model will have a "predicted_" prefix to these columns.
    # File 'lib/google/cloud/bigquery/model.rb', line 470
    def label_columns
      ensure_full_data!
      Array(@gapi_json[:labelColumns]).map do |field_gapi_json|
        field_gapi = Google::Apis::BigqueryV2::StandardSqlField.from_json field_gapi_json.to_json
        StandardSql::Field.from_gapi field_gapi
      end
    end
#labels ⇒ Hash<String, String>?
A hash of user-provided labels associated with this model. Labels are used to organize and group models. See Using Labels.
The returned hash is frozen and changes are not allowed. Use #labels= to replace the entire hash.
    # File 'lib/google/cloud/bigquery/model.rb', line 327
    def labels
      return nil if reference?
      m = @gapi_json[:labels]
      m = m.to_h if m.respond_to? :to_h
      m.dup.freeze
    end
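Because the returned hash is frozen, changing a label means building a new hash and assigning it as a whole. A minimal sketch with plain Ruby hashes, where the label keys and values are sample data:

```ruby
# Sample data standing in for what #labels might return.
current = { "env" => "staging" }.dup.freeze

begin
  current["team"] = "analytics" # in-place mutation is rejected
rescue FrozenError => e
  puts "cannot modify: #{e.class}"
end

# Build a new hash instead and assign it as a whole,
# e.g. `model.labels = replacement` on a real model.
replacement = current.merge "env" => "production", "team" => "analytics"
puts replacement.inspect
```

`Hash#merge` returns a new, unfrozen hash, so the frozen original stays untouched.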
#labels=(new_labels) ⇒ Object
Updates the hash of user-provided labels associated with this model. Labels are used to organize and group models. See Using Labels.
If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
    # File 'lib/google/cloud/bigquery/model.rb', line 369
    def labels= new_labels
      ensure_full_data!
      patch_gapi! labels: new_labels
    end
#location ⇒ String?
The geographic location where the model should reside. Possible values include EU and US. The default value is US.
    # File 'lib/google/cloud/bigquery/model.rb', line 300
    def location
      return nil if reference?
      ensure_full_data!
      @gapi_json[:location]
    end
#model_id ⇒ String
A unique ID for this model.
    # File 'lib/google/cloud/bigquery/model.rb', line 95
    def model_id
      return @reference.model_id if reference?
      @gapi_json[:modelReference][:modelId]
    end
#model_type ⇒ String?
Type of the model resource. Expected to be one of the following:
- LINEAR_REGRESSION - Linear regression model.
- LOGISTIC_REGRESSION - Logistic regression based classification model.
- KMEANS - K-means clustering model (beta).
- TENSORFLOW - An imported TensorFlow model (beta).
    # File 'lib/google/cloud/bigquery/model.rb', line 154
    def model_type
      return nil if reference?
      @gapi_json[:modelType]
    end
#modified_at ⇒ Time?
The date when this model was last modified.
    # File 'lib/google/cloud/bigquery/model.rb', line 254
    def modified_at
      return nil if reference?
      Convert.millis_to_time @gapi_json[:lastModifiedTime]
    end
#name ⇒ String?
The name of the model.
    # File 'lib/google/cloud/bigquery/model.rb', line 167
    def name
      return nil if reference?
      ensure_full_data!
      @gapi_json[:friendlyName]
    end
#name=(new_name) ⇒ Object
Updates the name of the model.
If the model is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
    # File 'lib/google/cloud/bigquery/model.rb', line 184
    def name= new_name
      ensure_full_data!
      patch_gapi! friendlyName: new_name
    end
#project_id ⇒ String
The ID of the Project containing this model.
    # File 'lib/google/cloud/bigquery/model.rb', line 120
    def project_id
      return @reference.project_id if reference?
      @gapi_json[:modelReference][:projectId]
    end
#reference? ⇒ Boolean
Whether the model was created without retrieving the resource representation from the BigQuery service.
    # File 'lib/google/cloud/bigquery/model.rb', line 732
    def reference?
      @gapi_json.nil?
    end
#reload! ⇒ Google::Cloud::Bigquery::Model
(also: #refresh!)
Reloads the model with current data from the BigQuery service.
    # File 'lib/google/cloud/bigquery/model.rb', line 674
    def reload!
      ensure_service!
      @gapi_json = service.get_model dataset_id, model_id
      @reference = nil
      @exists = nil
      self
    end
#resource? ⇒ Boolean
Whether the model was created with a resource representation from the BigQuery service.
    # File 'lib/google/cloud/bigquery/model.rb', line 755
    def resource?
      !@gapi_json.nil?
    end
#resource_full? ⇒ Boolean
Whether the model was created with a full resource representation from the BigQuery service.
    # File 'lib/google/cloud/bigquery/model.rb', line 804
    def resource_full?
      resource? && @gapi_json.key?(:friendlyName)
    end
#resource_partial? ⇒ Boolean
Whether the model was created with a partial resource representation from the BigQuery service by retrieval through Dataset#models. See Models: list response for the contents of the partial representation. Accessing any attribute outside of the partial representation will result in loading the full representation.
    # File 'lib/google/cloud/bigquery/model.rb', line 783
    def resource_partial?
      resource? && !resource_full?
    end
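Taken together, the predicates #reference?, #resource?, #resource_full? and #resource_partial? describe three mutually exclusive states. The sketch below mirrors their logic on a bare hash: `ModelState` and its `state` method are hypothetical, and the sample hashes are assumptions about what a representation might contain.

```ruby
# Hypothetical stand-in mirroring the predicates. gapi_json is nil for a
# reference and a Hash for a partial or full resource.
ModelState = Struct.new(:gapi_json) do
  def reference?
    gapi_json.nil?
  end

  def resource?
    !gapi_json.nil?
  end

  # Full means the :friendlyName key is present, even if its value is nil.
  def resource_full?
    resource? && gapi_json.key?(:friendlyName)
  end

  def resource_partial?
    resource? && !resource_full?
  end

  # Collapses the predicates into a single symbol for display.
  def state
    return :reference if reference?
    resource_full? ? :full : :partial
  end
end

ref = { modelReference: { modelId: "my_model" } } # sample partial data
puts ModelState.new(nil).state
puts ModelState.new(ref).state
puts ModelState.new(ref.merge(friendlyName: nil)).state
```

Note that the full/partial test depends on the presence of the key rather than its value, so a model without a friendly name can still count as full.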
#training_runs ⇒ Array<Google::Cloud::Bigquery::Model::TrainingRun>
Information for all training runs in increasing order of startTime.
    # File 'lib/google/cloud/bigquery/model.rb', line 485
    def training_runs
      ensure_full_data!
      Array @gapi_json[:trainingRuns]
    end