Class: Google::Cloud::Bigquery::Dataset
Inherits: Object
Defined in: lib/google/cloud/bigquery/dataset.rb,
lib/google/cloud/bigquery/dataset/list.rb,
lib/google/cloud/bigquery/dataset/access.rb
Overview
Dataset
Represents a Dataset. A dataset is a grouping mechanism that holds zero or more tables. Datasets are the lowest level unit of access control; you cannot control access at the table level. A dataset is contained within a specific project.
Direct Known Subclasses: Updater
Defined Under Namespace
Classes: Access, List, Updater
Attributes
-
#access {|access| ... } ⇒ Google::Cloud::Bigquery::Dataset::Access
Retrieves the access rules for a Dataset.
-
#api_url ⇒ String?
A URL that can be used to access the dataset using the REST API.
-
#created_at ⇒ Time?
The time when this dataset was created.
-
#dataset_id ⇒ String
A unique ID for this dataset, without the project name.
-
#default_encryption ⇒ EncryptionConfiguration?
The EncryptionConfiguration object that represents the default encryption method for all tables and models in the dataset.
-
#default_encryption=(value) ⇒ Object
Set the EncryptionConfiguration object that represents the default encryption method for all tables and models in the dataset.
-
#default_expiration ⇒ Integer?
The default lifetime of all tables in the dataset, in milliseconds.
-
#default_expiration=(new_default_expiration) ⇒ Object
Updates the default lifetime of all tables in the dataset, in milliseconds.
-
#description ⇒ String?
A user-friendly description of the dataset.
-
#description=(new_description) ⇒ Object
Updates the user-friendly description of the dataset.
-
#etag ⇒ String?
The ETag hash of the dataset.
-
#labels ⇒ Hash<String, String>?
A hash of user-provided labels associated with this dataset.
-
#labels=(labels) ⇒ Object
Updates the hash of user-provided labels associated with this dataset.
-
#location ⇒ String?
The geographic location where the dataset should reside.
-
#modified_at ⇒ Time?
The date when this dataset or any of its tables was last modified.
-
#name ⇒ String?
A descriptive name for the dataset.
-
#name=(new_name) ⇒ Object
Updates the descriptive name for the dataset.
-
#project_id ⇒ String
The ID of the project containing this dataset.
Lifecycle
-
#delete(force: nil) ⇒ Boolean
Permanently deletes the dataset.
Table
-
#create_table(table_id, name: nil, description: nil) {|table| ... } ⇒ Google::Cloud::Bigquery::Table
Creates a new table.
-
#create_view(table_id, query, name: nil, description: nil, standard_sql: nil, legacy_sql: nil, udfs: nil) ⇒ Google::Cloud::Bigquery::Table
Creates a new view table, which is a virtual table defined by the given SQL query.
-
#table(table_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Table?
Retrieves an existing table by ID.
-
#tables(token: nil, max: nil) ⇒ Array<Google::Cloud::Bigquery::Table>
Retrieves the list of tables belonging to the dataset.
Model
-
#model(model_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Model?
Retrieves an existing model by ID.
-
#models(token: nil, max: nil) ⇒ Array<Google::Cloud::Bigquery::Model>
Retrieves the list of models belonging to the dataset.
Routine
-
#create_routine(routine_id) {|routine| ... } ⇒ Google::Cloud::Bigquery::Routine
Creates a new routine.
-
#routine(routine_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Routine?
Retrieves an existing routine by ID.
-
#routines(token: nil, max: nil, filter: nil) ⇒ Array<Google::Cloud::Bigquery::Routine>
Retrieves the list of routines belonging to the dataset.
Data
-
#exists?(force: false) ⇒ Boolean
Determines whether the dataset exists in the BigQuery service.
-
#external(url, format: nil) {|ext| ... } ⇒ External::DataSource
Creates a new External::DataSource (or subclass) object that represents the external data source that can be queried from directly, even though the data is not stored in BigQuery.
-
#insert(table_id, rows, insert_ids: nil, skip_invalid: nil, ignore_unknown: nil, autocreate: nil) {|table| ... } ⇒ Google::Cloud::Bigquery::InsertResponse
Inserts data into the given table for near-immediate querying, without the need to complete a load operation before the data can appear in query results.
-
#insert_async(table_id, skip_invalid: nil, ignore_unknown: nil, max_bytes: 10_000_000, max_rows: 500, interval: 10, threads: 4) {|response| ... } ⇒ Table::AsyncInserter
Create an asynchronous inserter object used to insert rows in batches.
-
#load(table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, schema: nil, autodetect: nil, null_marker: nil) {|updater| ... } ⇒ Boolean
Loads data into the provided destination table using a synchronous method that blocks for a response.
-
#load_job(table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, schema: nil, job_id: nil, prefix: nil, labels: nil, autodetect: nil, null_marker: nil, dryrun: nil) {|updater| ... } ⇒ Google::Cloud::Bigquery::LoadJob
Loads data into the provided destination table using an asynchronous method.
-
#query(query, params: nil, types: nil, external: nil, max: nil, cache: true, standard_sql: nil, legacy_sql: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::Data
Queries data and waits for the results.
-
#query_job(query, params: nil, types: nil, external: nil, priority: "INTERACTIVE", cache: true, table: nil, create: nil, write: nil, dryrun: nil, standard_sql: nil, legacy_sql: nil, large_results: nil, flatten: nil, maximum_billing_tier: nil, maximum_bytes_billed: nil, job_id: nil, prefix: nil, labels: nil, udfs: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::QueryJob
Queries data by creating a query job.
-
#reference? ⇒ Boolean
Whether the dataset was created without retrieving the resource representation from the BigQuery service.
-
#reload! ⇒ Google::Cloud::Bigquery::Dataset
(also: #refresh!)
Reloads the dataset with current data from the BigQuery service.
-
#resource? ⇒ Boolean
Whether the dataset was created with a resource representation from the BigQuery service.
-
#resource_full? ⇒ Boolean
Whether the dataset was created with a full resource representation from the BigQuery service.
-
#resource_partial? ⇒ Boolean
Whether the dataset was created with a partial resource representation from the BigQuery service by retrieval through Project#datasets.
Instance Method Details
#access {|access| ... } ⇒ Google::Cloud::Bigquery::Dataset::Access
Retrieves the access rules for a Dataset. The rules can be updated when passing a block, see Access for all the methods available.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 455

def access
  ensure_full_data!
  reload! unless resource_full?
  access_builder = Access.from_gapi @gapi
  if block_given?
    yield access_builder
    if access_builder.changed?
      @gapi.update! access: access_builder.to_gapi
      patch_gapi! :access
    end
  end
  access_builder.freeze
end
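Example (a hedged usage sketch; the dataset name and email addresses are placeholders, not taken from this page):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

# Inspect the current rules.
access = dataset.access
access.writer_user? "writer@example.com" #=> false

# Pass a block to update the rules.
dataset.access do |acl|
  acl.add_owner_group "owners@example.com"
  acl.remove_writer_user "writer@example.com"
end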
#api_url ⇒ String?
A URL that can be used to access the dataset using the REST API.
# File 'lib/google/cloud/bigquery/dataset.rb', line 157

def api_url
  return nil if reference?
  ensure_full_data!
  @gapi.self_link
end
#create_routine(routine_id) {|routine| ... } ⇒ Google::Cloud::Bigquery::Routine
Creates a new routine. The following attributes may be set in the yielded block: Routine::Updater#routine_type=, Routine::Updater#language=, Routine::Updater#arguments=, Routine::Updater#return_type=, Routine::Updater#imported_libraries=, Routine::Updater#body=, and Routine::Updater#description=.
# File 'lib/google/cloud/bigquery/dataset.rb', line 939

def create_routine routine_id
  ensure_service!
  new_tb = Google::Apis::BigqueryV2::Routine.new(
    routine_reference: Google::Apis::BigqueryV2::RoutineReference.new(
      project_id: project_id, dataset_id: dataset_id, routine_id: routine_id
    )
  )
  updater = Routine::Updater.new new_tb

  yield updater if block_given?

  gapi = service.insert_routine dataset_id, updater.to_gapi
  Routine.from_gapi gapi, service
end
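Example (a hedged sketch of the block-based attributes listed above; the routine name, body, and description are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

routine = dataset.create_routine "my_routine" do |r|
  r.routine_type = "SCALAR_FUNCTION"
  r.language = "SQL"
  r.arguments = [
    Google::Cloud::Bigquery::Argument.new(name: "x", data_type: "INT64")
  ]
  r.body = "x * 3"
  r.description = "A placeholder description"
end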
#create_table(table_id, name: nil, description: nil) {|table| ... } ⇒ Google::Cloud::Bigquery::Table
Creates a new table. If you are adapting existing code that was written for the REST API, you can pass the table's schema as a hash (see the example below).
# File 'lib/google/cloud/bigquery/dataset.rb', line 601

def create_table table_id, name: nil, description: nil
  ensure_service!
  new_tb = Google::Apis::BigqueryV2::Table.new(
    table_reference: Google::Apis::BigqueryV2::TableReference.new(
      project_id: project_id, dataset_id: dataset_id, table_id: table_id
    )
  )
  updater = Table::Updater.new(new_tb).tap do |tb|
    tb.name = name unless name.nil?
    tb.description = description unless description.nil?
  end

  yield updater if block_given?

  gapi = service.insert_table dataset_id, updater.to_gapi
  Table.from_gapi gapi, service
end
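Example (a hedged sketch using the block-based schema DSL; the dataset, table, and field names are placeholders). A plain schema hash, as mentioned above, may likewise be assigned inside the block (for example, t.schema = my_schema_hash):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.create_table "my_table" do |t|
  t.name = "My Table"
  t.description = "A description of my table."
  t.schema do |s|
    s.string "first_name", mode: :required
    s.record "cities_lived", mode: :repeated do |r|
      r.string "place", mode: :required
      r.integer "number_of_years", mode: :required
    end
  end
end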
#create_view(table_id, query, name: nil, description: nil, standard_sql: nil, legacy_sql: nil, udfs: nil) ⇒ Google::Cloud::Bigquery::Table
Creates a new view table, which is a virtual table defined by the given SQL query.
BigQuery's views are logical views, not materialized views, which means that the query that defines the view is re-executed every time the view is queried. Queries are billed according to the total amount of data in all table fields referenced directly or indirectly by the top-level query. (See Table#view? and Table#query.)
# File 'lib/google/cloud/bigquery/dataset.rb', line 684

def create_view table_id, query, name: nil, description: nil, standard_sql: nil, legacy_sql: nil, udfs: nil
  use_legacy_sql = Convert.resolve_legacy_sql standard_sql, legacy_sql
  new_view_opts = {
    table_reference: Google::Apis::BigqueryV2::TableReference.new(
      project_id: project_id, dataset_id: dataset_id, table_id: table_id
    ),
    friendly_name: name,
    description: description,
    view: Google::Apis::BigqueryV2::ViewDefinition.new(
      query: query,
      use_legacy_sql: use_legacy_sql,
      user_defined_function_resources: udfs_gapi(udfs)
    )
  }.delete_if { |_, v| v.nil? }
  new_view = Google::Apis::BigqueryV2::Table.new new_view_opts

  gapi = service.insert_table dataset_id, new_view
  Table.from_gapi gapi, service
end
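Example (a hedged sketch; the view name, query, and referenced table are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

view = dataset.create_view "my_view",
                           "SELECT name, age FROM `my_project.my_dataset.users`",
                           name: "My View", description: "This is my view"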
#created_at ⇒ Time?
The time when this dataset was created.
# File 'lib/google/cloud/bigquery/dataset.rb', line 240

def created_at
  return nil if reference?
  ensure_full_data!
  Convert.millis_to_time @gapi.creation_time
end
#dataset_id ⇒ String
A unique ID for this dataset, without the project name.
# File 'lib/google/cloud/bigquery/dataset.rb', line 77

def dataset_id
  return reference.dataset_id if reference?
  @gapi.dataset_reference.dataset_id
end
#default_encryption ⇒ EncryptionConfiguration?
The EncryptionConfiguration object that represents the default encryption method for all tables and models in the dataset. Once this property is set, all newly-created partitioned tables and models in the dataset will have their encryption set to this value, unless the table creation request (or query) overrides it.
Present only if this dataset is using custom default encryption.
# File 'lib/google/cloud/bigquery/dataset.rb', line 373

def default_encryption
  return nil if reference?
  ensure_full_data!
  return nil if @gapi.default_encryption_configuration.nil?
  EncryptionConfiguration.from_gapi(@gapi.default_encryption_configuration).freeze
end
#default_encryption=(value) ⇒ Object
Set the EncryptionConfiguration object that represents the default encryption method for all tables and models in the dataset. Once this property is set, all newly-created partitioned tables and models in the dataset will have their encryption set to this value, unless the table creation request (or query) overrides it.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 409

def default_encryption= value
  ensure_full_data!
  @gapi.default_encryption_configuration = value.to_gapi
  patch_gapi! :default_encryption_configuration
end
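Example (a hedged sketch; the Cloud KMS key name is a placeholder):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

key_name = "projects/a/locations/b/keyRings/c/cryptoKeys/d" # placeholder key
encrypt_config = bigquery.encryption kms_key: key_name
dataset.default_encryption = encrypt_config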
#default_expiration ⇒ Integer?
The default lifetime of all tables in the dataset, in milliseconds.
# File 'lib/google/cloud/bigquery/dataset.rb', line 203

def default_expiration
  return nil if reference?
  ensure_full_data!
  begin
    Integer @gapi.default_table_expiration_ms
  rescue StandardError
    nil
  end
end
#default_expiration=(new_default_expiration) ⇒ Object
Updates the default lifetime of all tables in the dataset, in milliseconds.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 226

def default_expiration= new_default_expiration
  reload! unless resource_full?
  @gapi.update! default_table_expiration_ms: new_default_expiration
  patch_gapi! :default_table_expiration_ms
end
#delete(force: nil) ⇒ Boolean
Permanently deletes the dataset. The dataset must be empty before it can be deleted unless the force option is set to true.
# File 'lib/google/cloud/bigquery/dataset.rb', line 489

def delete force: nil
  ensure_service!
  service.delete_dataset dataset_id, force
  # Set flag for #exists?
  @exists = false
  true
end
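Example (a hedged sketch; the dataset name is a placeholder):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

dataset.delete force: true #=> true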
#description ⇒ String?
A user-friendly description of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 171

def description
  return nil if reference?
  ensure_full_data!
  @gapi.description
end
#description=(new_description) ⇒ Object
Updates the user-friendly description of the dataset.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 188

def description= new_description
  reload! unless resource_full?
  @gapi.update! description: new_description
  patch_gapi! :description
end
#etag ⇒ String?
The ETag hash of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 143

def etag
  return nil if reference?
  ensure_full_data!
  @gapi.etag
end
#exists?(force: false) ⇒ Boolean
Determines whether the dataset exists in the BigQuery service. The result is cached locally. To refresh state, set force to true.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2204

def exists? force: false
  return gapi_exists? if force
  # If we have a memoized value, return it
  return @exists unless @exists.nil?
  # Always true if we have a gapi object
  return true if resource?
  gapi_exists?
end
#external(url, format: nil) {|ext| ... } ⇒ External::DataSource
Creates a new External::DataSource (or subclass) object that represents the external data source that can be queried from directly, even though the data is not stored in BigQuery. Instead of loading or streaming the data, this object references the external data source.
# File 'lib/google/cloud/bigquery/dataset.rb', line 1668

def external url, format: nil
  ext = External.from_urls url, format
  yield ext if block_given?
  ext
end
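Example (a hedged sketch querying an external CSV source; the bucket URI and table alias are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

csv_url = "gs://bucket/path/to/data.csv" # placeholder URI
csv_table = dataset.external csv_url do |csv|
  csv.autodetect = true
  csv.skip_leading_rows = 1
end

data = dataset.query "SELECT * FROM my_ext_table",
                     external: { my_ext_table: csv_table }
data.each do |row|
  puts row[:name]
end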
#insert(table_id, rows, insert_ids: nil, skip_invalid: nil, ignore_unknown: nil, autocreate: nil) {|table| ... } ⇒ Google::Cloud::Bigquery::InsertResponse
Inserts data into the given table for near-immediate querying, without the need to complete a load operation before the data can appear in query results.
Because BigQuery's streaming API is designed for high insertion rates, modifications to the underlying table metadata are eventually consistent when interacting with the streaming system. In most cases metadata changes are propagated within minutes, but during this period API responses may reflect the inconsistent state of the table.
The value :skip can be provided to skip the generation of IDs for all rows, or to skip the generation of an ID for a specific row in the array.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2413

def insert table_id, rows, insert_ids: nil, skip_invalid: nil, ignore_unknown: nil, autocreate: nil, &block
  rows = [rows] if rows.is_a? Hash
  raise ArgumentError, "No rows provided" if rows.empty?

  insert_ids = Array.new(rows.count) { :skip } if insert_ids == :skip
  insert_ids = Array insert_ids
  if insert_ids.count.positive? && insert_ids.count != rows.count
    raise ArgumentError, "insert_ids must be the same size as rows"
  end

  if autocreate
    insert_data_with_autocreate table_id, rows, skip_invalid: skip_invalid, ignore_unknown: ignore_unknown,
                                                insert_ids: insert_ids, &block
  else
    insert_data table_id, rows, skip_invalid: skip_invalid, ignore_unknown: ignore_unknown,
                                insert_ids: insert_ids
  end
end
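Example (a hedged sketch; the table name and row contents are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

rows = [
  { "first_name" => "Alice", "age" => 21 },
  { "first_name" => "Bob",   "age" => 22 }
]
response = dataset.insert "my_table", rows
puts response.insert_errors.inspect unless response.success?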
#insert_async(table_id, skip_invalid: nil, ignore_unknown: nil, max_bytes: 10_000_000, max_rows: 500, interval: 10, threads: 4) {|response| ... } ⇒ Table::AsyncInserter
Create an asynchronous inserter object used to insert rows in batches.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2479

def insert_async table_id, skip_invalid: nil, ignore_unknown: nil, max_bytes: 10_000_000, max_rows: 500,
                 interval: 10, threads: 4, &block
  ensure_service!

  # Get table, don't use Dataset#table which handles NotFoundError
  gapi = service.get_table dataset_id, table_id
  table = Table.from_gapi gapi, service
  # Get the AsyncInserter from the table
  table.insert_async skip_invalid: skip_invalid, ignore_unknown: ignore_unknown, max_bytes: max_bytes,
                     max_rows: max_rows, interval: interval, threads: threads, &block
end
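Example (a hedged sketch; log_error and log_insert are hypothetical application callbacks, and the table name and row are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

inserter = dataset.insert_async "my_table" do |result|
  if result.error?
    log_error result.error # hypothetical handler
  else
    log_insert "inserted #{result.insert_count} rows with #{result.error_count} errors" # hypothetical handler
  end
end

inserter.insert [{ "first_name" => "Alice", "age" => 21 }]

inserter.stop.wait!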
#labels ⇒ Hash<String, String>?
A hash of user-provided labels associated with this dataset. Labels are used to organize and group datasets. See Using Labels.
The returned hash is frozen and changes are not allowed. Use #labels= to replace the entire hash.
# File 'lib/google/cloud/bigquery/dataset.rb', line 297

def labels
  return nil if reference?
  m = @gapi.labels
  m = m.to_h if m.respond_to? :to_h
  m.dup.freeze
end
#labels=(labels) ⇒ Object
Updates the hash of user-provided labels associated with this dataset. Labels are used to organize and group datasets. See Using Labels.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 340

def labels= labels
  reload! unless resource_full?
  @gapi.labels = labels
  patch_gapi! :labels
end
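Example (a hedged sketch; the label key and value are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

dataset.labels = { "department" => "shipping" }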
#load(table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, schema: nil, autodetect: nil, null_marker: nil) {|updater| ... } ⇒ Boolean
Loads data into the provided destination table using a synchronous method that blocks for a response. Timeouts and transient errors are generally handled as needed to complete the job. See also #load_job.
For the source of the data, you can pass a google-cloud storage file path or a google-cloud-storage File instance. Or, you can upload a file directly. See Loading Data with a POST Request.
The geographic location for the job ("US", "EU", etc.) can be set via LoadJob::Updater#location= in a block passed to this method. If the dataset is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2146

def load table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil,
         quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil,
         quote: nil, skip_leading: nil, schema: nil, autodetect: nil, null_marker: nil, &block
  job = load_job table_id, files, format: format, create: create, write: write,
                 projection_fields: projection_fields, jagged_rows: jagged_rows,
                 quoted_newlines: quoted_newlines, encoding: encoding, delimiter: delimiter,
                 ignore_unknown: ignore_unknown, max_bad_records: max_bad_records, quote: quote,
                 skip_leading: skip_leading, schema: schema, autodetect: autodetect,
                 null_marker: null_marker, &block

  job.wait_until_done!
  ensure_job_succeeded! job
  true
end
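Example (a hedged sketch loading from a Cloud Storage URI and defining the destination schema in the block; the bucket, table, and field names are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv" # placeholder URI
dataset.load "my_new_table", gs_url do |schema|
  schema.string "first_name", mode: :required
  schema.record "cities_lived", mode: :repeated do |nested|
    nested.string "place", mode: :required
    nested.integer "number_of_years", mode: :required
  end
end #=> true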
#load_job(table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil, quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil, quote: nil, skip_leading: nil, schema: nil, job_id: nil, prefix: nil, labels: nil, autodetect: nil, null_marker: nil, dryrun: nil) {|updater| ... } ⇒ Google::Cloud::Bigquery::LoadJob
Loads data into the provided destination table using an asynchronous method. In this method, a LoadJob is immediately returned. The caller may poll the service by repeatedly calling Job#reload! and Job#done? to detect when the job is done, or simply block until the job is done by calling Job#wait_until_done!. See also #load.
For the source of the data, you can pass a google-cloud storage file path or a google-cloud-storage File instance. Or, you can upload a file directly. See Loading Data with a POST Request.
The geographic location for the job ("US", "EU", etc.) can be set via LoadJob::Updater#location= in a block passed to this method. If the dataset is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 1918

def load_job table_id, files, format: nil, create: nil, write: nil, projection_fields: nil, jagged_rows: nil,
             quoted_newlines: nil, encoding: nil, delimiter: nil, ignore_unknown: nil, max_bad_records: nil,
             quote: nil, skip_leading: nil, schema: nil, job_id: nil, prefix: nil, labels: nil,
             autodetect: nil, null_marker: nil, dryrun: nil
  ensure_service!

  updater = load_job_updater table_id, format: format, create: create, write: write,
                             projection_fields: projection_fields, jagged_rows: jagged_rows,
                             quoted_newlines: quoted_newlines, encoding: encoding, delimiter: delimiter,
                             ignore_unknown: ignore_unknown, max_bad_records: max_bad_records, quote: quote,
                             skip_leading: skip_leading, dryrun: dryrun, schema: schema, job_id: job_id,
                             prefix: prefix, labels: labels, autodetect: autodetect, null_marker: null_marker

  yield updater if block_given?

  load_local_or_uri files, updater
end
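Example (a hedged sketch; the bucket, table, and field names are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

gs_url = "gs://my-bucket/file-name.csv" # placeholder URI
load_job = dataset.load_job "my_new_table", gs_url do |schema|
  schema.string "first_name", mode: :required
  schema.integer "age", mode: :required
end

load_job.wait_until_done!
load_job.done? #=> true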
#location ⇒ String?
The geographic location where the dataset should reside. Possible values include EU and US. The default value is US.
# File 'lib/google/cloud/bigquery/dataset.rb', line 269

def location
  return nil if reference?
  ensure_full_data!
  @gapi.location
end
#model(model_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Model?
Retrieves an existing model by ID.
# File 'lib/google/cloud/bigquery/dataset.rb', line 820

def model model_id, skip_lookup: nil
  ensure_service!
  return Model.new_reference project_id, dataset_id, model_id, service if skip_lookup
  gapi = service.get_model dataset_id, model_id
  Model.from_gapi_json gapi, service
rescue Google::Cloud::NotFoundError
  nil
end
#models(token: nil, max: nil) ⇒ Array<Google::Cloud::Bigquery::Model>
Retrieves the list of models belonging to the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 863

def models token: nil, max: nil
  ensure_service!
  gapi = service.list_models dataset_id, token: token, max: max
  Model::List.from_gapi gapi, service, dataset_id, max
end
#modified_at ⇒ Time?
The date when this dataset or any of its tables was last modified.
# File 'lib/google/cloud/bigquery/dataset.rb', line 254

def modified_at
  return nil if reference?
  ensure_full_data!
  Convert.millis_to_time @gapi.last_modified_time
end
#name ⇒ String?
A descriptive name for the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 112

def name
  return nil if reference?
  @gapi.friendly_name
end
#name=(new_name) ⇒ Object
Updates the descriptive name for the dataset.
If the dataset is not a full resource representation (see #resource_full?), the full representation will be retrieved before the update to comply with ETag-based optimistic concurrency control.
# File 'lib/google/cloud/bigquery/dataset.rb', line 129

def name= new_name
  reload! unless resource_full?
  @gapi.update! friendly_name: new_name
  patch_gapi! :friendly_name
end
#project_id ⇒ String
The ID of the project containing this dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 89

def project_id
  return reference.project_id if reference?
  @gapi.dataset_reference.project_id
end
#query(query, params: nil, types: nil, external: nil, max: nil, cache: true, standard_sql: nil, legacy_sql: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::Data
Queries data and waits for the results. In this method, a QueryJob is created and its results are saved to a temporary table, then read from the table. Timeouts and transient errors are generally handled as needed to complete the query. When used for executing DDL/DML statements, this method does not return row data.
Sets the current dataset as the default dataset in the query. Useful for using unqualified table names.
The geographic location for the job ("US", "EU", etc.) can be set via QueryJob::Updater#location= in a block passed to this method. If the dataset is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 1611

def query query, params: nil, types: nil, external: nil, max: nil, cache: true, standard_sql: nil,
          legacy_sql: nil, &block
  job = query_job query, params: params, types: types, external: external, cache: cache,
                  standard_sql: standard_sql, legacy_sql: legacy_sql, &block
  job.wait_until_done!
  ensure_job_succeeded! job

  job.data max: max
end
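Example (a hedged sketch using a positional query parameter; the table and column names are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

# Because the dataset is set as the default dataset, the table
# name in the query can be unqualified.
data = dataset.query "SELECT name FROM my_table WHERE id = ?",
                     params: [1]

data.each do |row|
  puts row[:name]
end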
#query_job(query, params: nil, types: nil, external: nil, priority: "INTERACTIVE", cache: true, table: nil, create: nil, write: nil, dryrun: nil, standard_sql: nil, legacy_sql: nil, large_results: nil, flatten: nil, maximum_billing_tier: nil, maximum_bytes_billed: nil, job_id: nil, prefix: nil, labels: nil, udfs: nil) {|job| ... } ⇒ Google::Cloud::Bigquery::QueryJob
Queries data by creating a query job.
Sets the current dataset as the default dataset in the query. Useful for using unqualified table names.
The geographic location for the job ("US", "EU", etc.) can be set via QueryJob::Updater#location= in a block passed to this method. If the dataset is a full resource representation (see #resource_full?), the location of the job will be automatically set to the location of the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 1352

def query_job query, params: nil, types: nil, external: nil, priority: "INTERACTIVE", cache: true,
              table: nil, create: nil, write: nil, dryrun: nil, standard_sql: nil, legacy_sql: nil,
              large_results: nil, flatten: nil, maximum_billing_tier: nil, maximum_bytes_billed: nil,
              job_id: nil, prefix: nil, labels: nil, udfs: nil
  ensure_service!
  options = { params: params, types: types, external: external, priority: priority, cache: cache,
              table: table, create: create, write: write, dryrun: dryrun, standard_sql: standard_sql,
              legacy_sql: legacy_sql, large_results: large_results, flatten: flatten,
              maximum_billing_tier: maximum_billing_tier, maximum_bytes_billed: maximum_bytes_billed,
              job_id: job_id, prefix: prefix, labels: labels, udfs: udfs }

  updater = QueryJob::Updater.from_options service, query, options
  updater.dataset = self
  updater.location = location if location # may be dataset reference

  yield updater if block_given?

  gapi = service.query_job updater.to_gapi
  Job.from_gapi gapi, service
end
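Example (a hedged sketch; the query, label, and table names are placeholders):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

job = dataset.query_job "SELECT name FROM my_table" do |updater|
  updater.labels = { "process" => "my_process" }
end

job.wait_until_done!
if job.failed?
  puts job.error
else
  job.data.each { |row| puts row[:name] }
end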
#reference? ⇒ Boolean
Whether the dataset was created without retrieving the resource representation from the BigQuery service.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2231

def reference?
  @gapi.nil?
end
#reload! ⇒ Google::Cloud::Bigquery::Dataset Also known as: refresh!
Reloads the dataset with current data from the BigQuery service.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2175

def reload!
  ensure_service!
  @gapi = service.get_dataset dataset_id
  @reference = nil
  @exists = nil
  self
end
#resource? ⇒ Boolean
Whether the dataset was created with a resource representation from the BigQuery service.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2253

def resource?
  !@gapi.nil?
end
#resource_full? ⇒ Boolean
Whether the dataset was created with a full resource representation from the BigQuery service.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2300

def resource_full?
  @gapi.is_a? Google::Apis::BigqueryV2::Dataset
end
#resource_partial? ⇒ Boolean
Whether the dataset was created with a partial resource representation from the BigQuery service by retrieval through Project#datasets. See Datasets: list response for the contents of the partial representation. Accessing any attribute outside of the partial representation will result in loading the full representation.
# File 'lib/google/cloud/bigquery/dataset.rb', line 2280

def resource_partial?
  @gapi.is_a? Google::Apis::BigqueryV2::DatasetList::Dataset
end
#routine(routine_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Routine?
Retrieves an existing routine by ID.
# File 'lib/google/cloud/bigquery/dataset.rb', line 986

def routine routine_id, skip_lookup: nil
  ensure_service!
  return Routine.new_reference project_id, dataset_id, routine_id, service if skip_lookup
  gapi = service.get_routine dataset_id, routine_id
  Routine.from_gapi gapi, service
rescue Google::Cloud::NotFoundError
  nil
end
#routines(token: nil, max: nil, filter: nil) ⇒ Array<Google::Cloud::Bigquery::Routine>
Retrieves the list of routines belonging to the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 1031

def routines token: nil, max: nil, filter: nil
  ensure_service!
  gapi = service.list_routines dataset_id, token: token, max: max, filter: filter
  Routine::List.from_gapi gapi, service, dataset_id, max, filter: filter
end
#table(table_id, skip_lookup: nil) ⇒ Google::Cloud::Bigquery::Table?
Retrieves an existing table by ID.
# File 'lib/google/cloud/bigquery/dataset.rb', line 739

def table table_id, skip_lookup: nil
  ensure_service!
  return Table.new_reference project_id, dataset_id, table_id, service if skip_lookup
  gapi = service.get_table dataset_id, table_id
  Table.from_gapi gapi, service
rescue Google::Cloud::NotFoundError
  nil
end
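Example (a hedged sketch; the table name is a placeholder):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

table = dataset.table "my_table"
puts table.name

# Skip the service call and work with a local reference only:
table_ref = dataset.table "my_table", skip_lookup: true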
#tables(token: nil, max: nil) ⇒ Array<Google::Cloud::Bigquery::Table>
Retrieves the list of tables belonging to the dataset.
# File 'lib/google/cloud/bigquery/dataset.rb', line 782

def tables token: nil, max: nil
  ensure_service!
  gapi = service.list_tables dataset_id, token: token, max: max
  Table::List.from_gapi gapi, service, dataset_id, max
end
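Example (a hedged sketch; the dataset name is a placeholder):

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"

# Retrieve all tables, requesting additional pages as needed.
dataset.tables.all do |table|
  puts table.table_id
end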