Class: Google::Cloud::Bigquery::Job

Inherits:
Object
Defined in:
lib/google/cloud/bigquery/job.rb,
lib/google/cloud/bigquery/job/list.rb

Overview

Job

Represents a generic Job that may be performed on a Table.

The subclasses of Job represent the specific BigQuery job types: CopyJob, ExtractJob, LoadJob, and QueryJob.

A job instance is created when you call Project#query_job, Dataset#query_job, Table#copy_job, Table#extract_job, or Table#load_job.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

job = bigquery.query_job "SELECT COUNT(word) as count FROM " \
                         "`bigquery-public-data.samples.shakespeare`"

job.wait_until_done!

if job.failed?
  puts job.error
else
  puts job.data.first
end

Direct Known Subclasses

CopyJob, ExtractJob, LoadJob, QueryJob

Defined Under Namespace

Classes: List, ScriptStackFrame, ScriptStatistics

Instance Method Details

#cancel ⇒ Object

Cancels the job.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.cancel


# File 'lib/google/cloud/bigquery/job.rb', line 368

def cancel
  ensure_service!
  resp = service.cancel_job job_id, location: location
  @gapi = resp.job
  true
end

#configuration ⇒ Object (also known as: config)

The configuration for the job. Returns a hash.

# File 'lib/google/cloud/bigquery/job.rb', line 271

def configuration
  JSON.parse @gapi.configuration.to_json
end

#created_at ⇒ Time?

The time when the job was created.

Returns:

  • (Time, nil)

    The creation time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 175

def created_at
  Convert.millis_to_time @gapi.statistics.creation_time
end
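
The BigQuery job statistics carry timestamps as milliseconds since the Unix epoch, and the listing above hands that value to an internal conversion helper. The following is only a sketch of what such a conversion could look like; millis_to_time_sketch is a hypothetical name, not the library's own method.

```ruby
# Hypothetical stand-in for the library's internal conversion helper.
# Assumes the input is an Integer count of milliseconds since the Unix
# epoch, or nil when the service has not set the field yet.
def millis_to_time_sketch millis
  return nil if millis.nil?
  # Rational keeps the millisecond fraction exact.
  Time.at Rational(millis, 1000)
end

created = millis_to_time_sketch 1_500_000_000_123
puts created.utc.strftime("%Y-%m-%d %H:%M:%S.%L")
puts millis_to_time_sketch(nil).inspect
```

Returning nil for a missing value matches the documented (Time, nil) return type.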

#done? ⇒ Boolean

Checks if the job's state is DONE. When true, the job has stopped running. However, a DONE state does not mean that the job completed successfully. Use #failed? to detect if an error occurred or if the job was successful.

Returns:

  • (Boolean)

    true when DONE, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 155

def done?
  return false if state.nil?
  "done".casecmp(state).zero?
end

#ended_at ⇒ Time?

The time when the job ended. This field is present when the job's state is DONE.

Returns:

  • (Time, nil)

    The end time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 196

def ended_at
  Convert.millis_to_time @gapi.statistics.end_time
end
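
Because #started_at and #ended_at can both be nil, any elapsed-time calculation needs a guard. A minimal sketch, assuming a hypothetical helper job_duration_seconds and a stand-in struct in place of a real Job:

```ruby
# Hypothetical helper: elapsed run time in seconds, or nil while either
# timestamp is still missing. `job` only needs #started_at and #ended_at.
def job_duration_seconds job
  return nil if job.started_at.nil? || job.ended_at.nil?
  job.ended_at - job.started_at
end

# Stand-in object for illustration; a real Job would come from the service.
TimedJobStub = Struct.new(:started_at, :ended_at)

finished = TimedJobStub.new Time.at(100), Time.at(142.5)
pending  = TimedJobStub.new nil, nil

puts job_duration_seconds(finished)
puts job_duration_seconds(pending).inspect
```

Subtracting one Time from another yields a Float number of seconds.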

#error ⇒ Hash?

The last error for the job, if any errors have occurred. Returns a hash.

Returns:

  • (Hash, nil)

    Returns a hash containing reason and message keys:

    { "reason"=>"notFound", "message"=>"Not found: Table bigquery-public-data:samples.BAD_ID" }

# File 'lib/google/cloud/bigquery/job.rb', line 314

def error
  status["errorResult"]
end
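
The error hash uses string keys, so it can be rendered for logs with ordinary hash lookups. A small sketch, assuming a hypothetical helper format_job_error and the "reason" and "message" keys shown in the example above:

```ruby
# Hypothetical helper that renders the hash returned by Job#error.
# Assumes string keys "reason" and "message", as in the documented example.
def format_job_error error
  return "no error" if error.nil?
  "#{error['reason']}: #{error['message']}"
end

error = {
  "reason"  => "notFound",
  "message" => "Not found: Table bigquery-public-data:samples.BAD_ID"
}

puts format_job_error(error)
puts format_job_error(nil)
```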

#errors ⇒ Array<Hash>?

The errors for the job, if any errors have occurred. Returns an array of hash objects, which is empty when the job status lists no errors. See #error.

Returns:

  • (Array<Hash>, nil)

    Returns an array of hashes containing reason and message keys:

    { "reason"=>"notFound", "message"=>"Not found: Table bigquery-public-data:samples.BAD_ID" }



# File 'lib/google/cloud/bigquery/job.rb', line 330

def errors
  Array status["errors"]
end

#failed? ⇒ Boolean

Checks if an error is present. Use #error to access the error object.

Returns:

  • (Boolean)

    true when there is an error, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 166

def failed?
  !error.nil?
end
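
Since DONE alone does not indicate success, callers typically combine #done? with #failed?. The sketch below folds the two predicates into a single outcome; job_outcome and OutcomeJobStub are hypothetical names used only for illustration.

```ruby
# Hypothetical helper: folds #done? and #failed? into one symbol.
def job_outcome job
  return :in_progress unless job.done?
  job.failed? ? :failed : :succeeded
end

# Stand-in with the same predicates as Job, for illustration only.
OutcomeJobStub = Struct.new(:state, :error) do
  def done?
    !state.nil? && "done".casecmp(state).zero?
  end

  def failed?
    !error.nil?
  end
end

puts job_outcome(OutcomeJobStub.new("RUNNING", nil))
puts job_outcome(OutcomeJobStub.new("DONE", nil))
puts job_outcome(OutcomeJobStub.new("DONE", { "reason" => "notFound" }))
```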

#job_id ⇒ String

The ID of the job.

Returns:

  • (String)

    The ID must contain only letters (a-z, A-Z), numbers (0-9), underscores (_), or dashes (-). The maximum length is 1,024 characters.



# File 'lib/google/cloud/bigquery/job.rb', line 81

def job_id
  @gapi.job_reference.job_id
end
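
The documented ID constraints can be checked client-side before a job is submitted. A sketch under the assumption that an empty ID is also invalid; JOB_ID_PATTERN and valid_job_id? are hypothetical names, not part of the library.

```ruby
# Hypothetical validator built from the documented constraints:
# letters, digits, underscores, dashes; at most 1,024 characters.
# Requiring at least one character is an assumption of this sketch.
JOB_ID_PATTERN = /\A[A-Za-z0-9_-]{1,1024}\z/

def valid_job_id? job_id
  JOB_ID_PATTERN.match? job_id.to_s
end

puts valid_job_id?("my_job-2024")
puts valid_job_id?("bad id!")
puts valid_job_id?("a" * 1025)
```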

#labels ⇒ Hash

A hash of user-provided labels associated with this job. Labels can be provided when the job is created, and used to organize and group jobs.

The returned hash is frozen and cannot be modified. Use CopyJob::Updater#labels=, ExtractJob::Updater#labels=, LoadJob::Updater#labels=, or QueryJob::Updater#labels= to replace the entire hash.

Returns:

  • (Hash)

    The job labels.



# File 'lib/google/cloud/bigquery/job.rb', line 347

def labels
  m = @gapi.configuration.labels
  m = m.to_h if m.respond_to? :to_h
  m.dup.freeze
end
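
To illustrate what a frozen return value means in practice, the sketch below uses a plain hash as a stand-in for the result of #labels. Writing to it raises an error, while Hash#merge produces a separate, modifiable copy.

```ruby
# Plain hash standing in for the value returned by Job#labels.
labels = { "env" => "test" }.dup.freeze

begin
  labels["team"] = "data"
rescue StandardError => e
  # FrozenError on Ruby 2.5+, RuntimeError on older versions.
  puts "cannot modify: #{e.class}"
end

# Hash#merge returns a new, unfrozen hash and leaves the original untouched.
updated = labels.merge("team" => "data")
puts updated.frozen?
puts labels.key?("team")
```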

#location ⇒ String

The geographic location where the job runs.

Returns:

  • (String)

    A geographic location, such as "US", "EU" or "asia-northeast1".



# File 'lib/google/cloud/bigquery/job.rb', line 101

def location
  @gapi.job_reference.location
end

#num_child_jobs ⇒ Integer

The number of child jobs executed.

Returns:

  • (Integer)

    The number of child jobs executed.



# File 'lib/google/cloud/bigquery/job.rb', line 205

def num_child_jobs
  @gapi.statistics.num_child_jobs || 0
end

#parent_job_id ⇒ String?

If this is a child job, the ID of the parent job.

Returns:

  • (String, nil)

    The ID of the parent job, or nil if not a child job.



# File 'lib/google/cloud/bigquery/job.rb', line 214

def parent_job_id
  @gapi.statistics.parent_job_id
end

#pending? ⇒ Boolean

Checks if the job's state is PENDING.

Returns:

  • (Boolean)

    true when PENDING, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 142

def pending?
  return false if state.nil?
  "pending".casecmp(state).zero?
end

#project_id ⇒ String

The ID of the project containing the job.

Returns:

  • (String)

    The project ID.



# File 'lib/google/cloud/bigquery/job.rb', line 90

def project_id
  @gapi.job_reference.project_id
end

#reload! ⇒ Object (also known as: refresh!)

Reloads the job with current data from the BigQuery service.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.done?
job.reload!
job.done? #=> true


# File 'lib/google/cloud/bigquery/job.rb', line 414

def reload!
  ensure_service!
  gapi = service.get_job job_id, location: location
  @gapi = gapi
end
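
When an unbounded wait is not acceptable, #reload! and #done? can be combined into a polling loop with a deadline. This is a sketch only: wait_with_timeout, its timeout and interval parameters, and CountdownJobStub are all hypothetical and not part of the library.

```ruby
# Hypothetical helper: polls with #reload! until the job is done or the
# deadline passes. Returns true if the job finished, false on timeout.
# `timeout` and `interval` are illustrative parameters, in seconds.
def wait_with_timeout job, timeout: 300, interval: 5
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until job.done?
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
    job.reload!
  end
  true
end

# Stand-in job that reports done after a fixed number of reloads.
class CountdownJobStub
  def initialize reloads_needed
    @remaining = reloads_needed
  end

  def done?
    @remaining <= 0
  end

  def reload!
    @remaining -= 1
  end
end

puts wait_with_timeout(CountdownJobStub.new(3), timeout: 1, interval: 0)
```

A monotonic clock is used for the deadline so that system clock adjustments do not shorten or extend the wait.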

#rerun! ⇒ Object

Creates a new job with the current configuration.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

query = "SELECT COUNT(word) as count FROM " \
        "`bigquery-public-data.samples.shakespeare`"

job = bigquery.query_job query

job.wait_until_done!
job.rerun!


# File 'lib/google/cloud/bigquery/job.rb', line 391

def rerun!
  ensure_service!
  gapi = service.insert_job @gapi.configuration, location: location
  Job.from_gapi gapi, service
end

#running? ⇒ Boolean

Checks if the job's state is RUNNING.

Returns:

  • (Boolean)

    true when RUNNING, false otherwise.



# File 'lib/google/cloud/bigquery/job.rb', line 132

def running?
  return false if state.nil?
  "running".casecmp(state).zero?
end

#script_statistics ⇒ Google::Cloud::Bigquery::Job::ScriptStatistics?

The statistics including stack frames for a child job of a script.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new

multi_statement_sql = <<~SQL
  -- Declare a variable to hold names as an array.
  DECLARE top_names ARRAY<STRING>;
  -- Build an array of the top 100 names from the year 2017.
  SET top_names = (
  SELECT ARRAY_AGG(name ORDER BY number DESC LIMIT 100)
  FROM `bigquery-public-data.usa_names.usa_1910_current`
  WHERE year = 2017
  );
  -- Which names appear as words in Shakespeare's plays?
  SELECT
  name AS shakespeare_name
  FROM UNNEST(top_names) AS name
  WHERE name IN (
  SELECT word
  FROM `bigquery-public-data.samples.shakespeare`
  );
SQL

job = bigquery.query_job multi_statement_sql

job.wait_until_done!

child_jobs = bigquery.jobs parent_job: job

child_jobs.each do |child_job|
  script_statistics = child_job.script_statistics
  puts script_statistics.evaluation_kind
  script_statistics.stack_frames.each do |stack_frame|
    puts stack_frame.text
  end
end

Returns:

  • (Google::Cloud::Bigquery::Job::ScriptStatistics, nil)

    The script statistics, or nil when the job has none.

# File 'lib/google/cloud/bigquery/job.rb', line 262

def script_statistics
  ScriptStatistics.from_gapi @gapi.statistics.script_statistics if @gapi.statistics.script_statistics
end

#started_at ⇒ Time?

The time when the job was started. This field is present after the job's state changes from PENDING to either RUNNING or DONE.

Returns:

  • (Time, nil)

    The start time from the job statistics.



# File 'lib/google/cloud/bigquery/job.rb', line 186

def started_at
  Convert.millis_to_time @gapi.statistics.start_time
end

#state ⇒ String

The current state of the job. A DONE state does not mean that the job completed successfully. Use #failed? to discover if an error occurred or if the job was successful.

Returns:

  • (String)

    The state code. The possible values are PENDING, RUNNING, and DONE.



# File 'lib/google/cloud/bigquery/job.rb', line 122

def state
  return nil if @gapi.status.nil?
  @gapi.status.state
end
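
The #pending?, #running?, and #done? predicates compare this state string case-insensitively and treat a missing state as false. The sketch below isolates that comparison; state_is? is a hypothetical helper name.

```ruby
# String#casecmp returns 0 when two strings match ignoring ASCII case.
# A nil state never matches, mirroring the predicates in the listings.
def state_is? state, expected
  return false if state.nil?
  expected.casecmp(state).zero?
end

puts state_is?("DONE", "done")
puts state_is?("Running", "running")
puts state_is?("PENDING", "done")
puts state_is?(nil, "pending")
```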

#statistics ⇒ Hash (also known as: stats)

The statistics for the job. Returns a hash.

Returns:

  • (Hash)

    The job statistics.

# File 'lib/google/cloud/bigquery/job.rb', line 284

def statistics
  JSON.parse @gapi.statistics.to_json
end

#status ⇒ Hash

The job's status. Returns a hash. The values contained in the hash are also exposed by #state, #error, and #errors.

Returns:

  • (Hash)

    The job status.



# File 'lib/google/cloud/bigquery/job.rb', line 295

def status
  JSON.parse @gapi.status.to_json
end
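
The hash-returning methods (#configuration, #statistics, #status) serialize the underlying object to JSON and parse it back, which is why the resulting hashes use string keys such as "errorResult". A sketch of that round trip, with a hypothetical StatusStub in place of the real API object:

```ruby
require "json"

# Stand-in for an API object; the real status object comes from the service.
StatusStub = Struct.new(:state, :error_result) do
  def to_json(*args)
    { state: state, errorResult: error_result }.to_json(*args)
  end
end

stub = StatusStub.new("DONE", { reason: "notFound", message: "Not found" })
status = JSON.parse(stub.to_json)

puts status["state"]
puts status["errorResult"]["reason"]
# Symbol keys do not survive the round trip.
puts status.key?(:state)
```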

#user_email ⇒ String

The email address of the user who ran the job.

Returns:

  • (String)

    The email address.



# File 'lib/google/cloud/bigquery/job.rb', line 110

def user_email
  @gapi.user_email
end

#wait_until_done! ⇒ Object

Refreshes the job until the job is DONE. The delay between refreshes starts at 5 seconds and grows quadratically with the number of retries (retries**2 + 5), up to a maximum of 60 seconds.

Examples:

require "google/cloud/bigquery"

bigquery = Google::Cloud::Bigquery.new
dataset = bigquery.dataset "my_dataset"
table = dataset.table "my_table"

extract_job = table.extract_job "gs://my-bucket/file-name.json",
                                format: "json"
extract_job.wait_until_done!
extract_job.done? #=> true


# File 'lib/google/cloud/bigquery/job.rb', line 438

def wait_until_done!
  backoff = lambda do |retries|
    delay = [retries**2 + 5, 60].min # Maximum delay is 60
    sleep delay
  end
  retries = 0
  until done?
    backoff.call retries
    retries += 1
    reload!
  end
end
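
To see how the wait between refreshes grows, the delay formula from the listing above can be evaluated on its own without sleeping. backoff_delay is a hypothetical helper name that reuses the same expression.

```ruby
# Same formula as in the wait_until_done! listing, without the sleep.
def backoff_delay retries
  [retries**2 + 5, 60].min
end

delays = (0..9).map { |retries| backoff_delay(retries) }
puts delays.inspect
# => [5, 6, 9, 14, 21, 30, 41, 54, 60, 60]
puts delays.sum
# => 300
```

The cap is reached on the ninth refresh (retries = 8), after which each wait is a flat 60 seconds.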