BigQueryStorageClient

BigQuery storage API.

The BigQuery storage API can be used to read data stored in BigQuery.

Constructor

new BigQueryStorageClient(optionsopt)

Construct an instance of BigQueryStorageClient.

Parameters:
Name Type Attributes Description
options object <optional>

The configuration object. The options accepted by the constructor are described in detail in this document. The common options are:

Properties
Name Type Attributes Description
credentials object <optional>

Credentials object.

Properties
Name Type Attributes Description
client_email string <optional>
private_key string <optional>
email string <optional>

Account email address. Required when using a .pem or .p12 keyFilename.

keyFilename string <optional>

Full path to a .json, .pem, or .p12 key downloaded from the Google Developers Console. If you provide a path to a JSON file, the projectId option below is not necessary. NOTE: .pem and .p12 require you to specify options.email as well.

port number <optional>

The port on which to connect to the remote host.

projectId string <optional>

The project ID from the Google Developers Console, e.g. 'grape-spaceship-123'. The environment variable GCLOUD_PROJECT is also checked for your project ID. If your app is running in an environment that supports Application Default Credentials, your project ID will be detected automatically.

apiEndpoint string <optional>

The domain name of the API remote host.

clientConfig gax.ClientConfig <optional>

Client configuration override. Follows the structure of gapicConfig.

fallback boolean <optional>

Use HTTP fallback mode. In fallback mode, a special browser-compatible transport implementation is used instead of gRPC transport. In browser context (if the window object is defined) the fallback mode is enabled automatically; set options.fallback to false if you need to override this behavior.
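
The option rules above can be sketched as a small helper. This is an illustration only: buildClientOptions and its settings argument are hypothetical names that are not part of the library, and the commented-out construction assumes the class has already been imported from wherever your installation exposes it.

```javascript
// Hypothetical helper (not part of the library): assembles a constructor
// options object that follows the rules described above.
function buildClientOptions(settings = {}) {
  const options = {};
  if (settings.keyFilename) {
    options.keyFilename = settings.keyFilename;
    // .pem and .p12 keys require the account email as well.
    if (/\.(pem|p12)$/i.test(settings.keyFilename)) {
      if (!settings.email) {
        throw new Error('options.email is required with a .pem or .p12 key');
      }
      options.email = settings.email;
    }
  }
  // With a JSON key file the projectId option is not necessary, so it is
  // only set when the caller supplies one explicitly.
  if (settings.projectId) {
    options.projectId = settings.projectId;
  }
  return options;
}

const clientOptions = buildClientOptions({keyFilename: '/path/to/key.json'});
// Assumed usage: const client = new BigQueryStorageClient(clientOptions);
```

Failing early on a missing email keeps the error close to the configuration mistake instead of surfacing later during authentication.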

Members

apiEndpoint

The DNS address for this API service. Same as servicePath; exists for compatibility reasons.

port

The port for this API service.

scopes

The scopes needed to make gRPC calls for every method defined in this service.

servicePath

The DNS address for this API service.

Methods

batchCreateReadSessionStreams(request, optionsopt) → {Promise}

Creates additional streams for a ReadSession. This API can be used to dynamically adjust the parallelism of a batch processing task upwards by adding additional workers.

Parameters:
Name Type Attributes Description
request Object

The request object that will be sent.

Properties
Name Type Description
session google.cloud.bigquery.storage.v1beta1.ReadSession

Required. Must be a non-expired session obtained from a call to CreateReadSession. Only the name field needs to be set.

requestedStreams number

Required. Number of new streams requested. Must be positive. The number of added streams may be less than this; see CreateReadSessionRequest for more information.

options object <optional>

Call options. See CallOptions for more details.

Returns:
Type Description
Promise
Example
const [response] = await client.batchCreateReadSessionStreams(request);
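
A minimal sketch of building the request for this call, assuming client is an already constructed BigQueryStorageClient. The helper name addStreams is hypothetical, and the shape of the response is not inspected here because this page does not describe it.

```javascript
// Hypothetical helper: asks for `count` extra streams on an existing session.
// `client` is assumed to be a BigQueryStorageClient instance.
async function addStreams(client, sessionName, count) {
  if (!Number.isInteger(count) || count <= 0) {
    throw new RangeError('requestedStreams must be a positive integer');
  }
  const request = {
    session: {name: sessionName}, // only the name field needs to be set
    requestedStreams: count,
  };
  const [response] = await client.batchCreateReadSessionStreams(request);
  // The service may add fewer streams than requested.
  return response;
}
```

Callers should size their worker pool from the streams actually returned, not from the requested count.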

close() → {Promise}

Terminate the gRPC channel and close the client.

The client will no longer be usable and all future behavior is undefined.

Returns:
Type Description
Promise

A promise that resolves when the client is closed.

createReadSession(request, optionsopt) → {Promise}

Creates a new read session. A read session divides the contents of a BigQuery table into one or more streams, which can then be used to read data from the table. The read session also specifies properties of the data to be read, such as a list of columns or a push-down filter describing the rows to be returned.

A particular row can be read by at most one stream. When the caller has reached the end of each stream in the session, then all the data in the table has been read.

Read sessions automatically expire 24 hours after they are created and do not require manual clean-up by the caller.

Parameters:
Name Type Attributes Description
request Object

The request object that will be sent.

Properties
Name Type Description
tableReference google.cloud.bigquery.storage.v1beta1.TableReference

Required. Reference to the table to read.

parent string

Required. String of the form projects/{project_id} indicating the project this ReadSession is associated with. This is the project that will be billed for usage.

tableModifiers google.cloud.bigquery.storage.v1beta1.TableModifiers

Any modifiers to the Table (e.g. snapshot timestamp).

requestedStreams number

Initial number of streams. If unset or 0, the service chooses a number of streams that produces reasonable throughput. Must be non-negative. The number of streams may be lower than the requested number, depending on the amount of parallelism that is reasonable for the table and the maximum amount of parallelism allowed by the system.

Streams must be read starting from offset 0.

readOptions google.cloud.bigquery.storage.v1beta1.TableReadOptions

Read options for this session (e.g. column selection, filters).

format google.cloud.bigquery.storage.v1beta1.DataFormat

Data output format. Currently defaults to Avro.

shardingStrategy google.cloud.bigquery.storage.v1beta1.ShardingStrategy

The strategy to use for distributing data among multiple streams. Currently defaults to liquid sharding.

options object <optional>

Call options. See CallOptions for more details.

Returns:
Type Description
Promise
  • The promise which resolves to an array. The first element of the array is an object representing ReadSession. Please see the documentation for more details and examples.
Example
const [response] = await client.createReadSession(request);
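
The request fields listed above can be assembled as in the following sketch. buildReadSessionRequest is a hypothetical helper, and the sub-field names inside tableReference and readOptions (projectId, datasetId, tableId, selectedFields) are assumptions based on the v1beta1 message definitions, since this page does not list them.

```javascript
// Hypothetical helper: builds a createReadSession request from plain values.
// The tableReference and readOptions sub-field names are assumptions.
function buildReadSessionRequest({billingProject, table, columns = [], streams = 0}) {
  if (!Number.isInteger(streams) || streams < 0) {
    throw new RangeError('requestedStreams must be a non-negative integer');
  }
  const request = {
    parent: `projects/${billingProject}`, // the project billed for usage
    tableReference: {
      projectId: table.projectId,
      datasetId: table.datasetId,
      tableId: table.tableId,
    },
    requestedStreams: streams, // 0 lets the service pick a value
  };
  if (columns.length > 0) {
    request.readOptions = {selectedFields: columns}; // column selection
  }
  return request;
}

const sessionRequest = buildReadSessionRequest({
  billingProject: 'grape-spaceship-123',
  table: {projectId: 'grape-spaceship-123', datasetId: 'my_dataset', tableId: 'my_table'},
  columns: ['name', 'age'],
});
// Assumed usage: const [session] = await client.createReadSession(sessionRequest);
```

The format and shardingStrategy fields are left unset so that the documented defaults apply.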

finalizeStream(request, optionsopt) → {Promise}

Triggers the graceful termination of a single stream in a ReadSession. This API can be used to dynamically adjust the parallelism of a batch processing task downwards without losing data.

This API does not delete the stream; it remains visible in the ReadSession, and any data processed by the stream is not released to other streams. However, no additional data will be assigned to the stream once this call completes. Callers must continue reading data on the stream until the end of the stream is reached so that data which has already been assigned to the stream will be processed.

This method will return an error if there are no other live streams in the Session, or if SplitReadStream() has been called on the given Stream.

Parameters:
Name Type Attributes Description
request Object

The request object that will be sent.

Properties
Name Type Description
stream google.cloud.bigquery.storage.v1beta1.Stream

Required. Stream to finalize.

options object <optional>

Call options. See CallOptions for more details.

Returns:
Type Description
Promise
  • The promise which resolves to an array. The first element of the array is an object representing Empty. Please see the documentation for more details and examples.
Example
const [response] = await client.finalizeStream(request);
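
As a sketch of scaling a job downwards, the hypothetical helper below finalizes surplus streams while always leaving at least one stream live, since the call fails when no other live stream remains. It assumes liveStreams is an array of Stream objects that the caller obtained from the session.

```javascript
// Hypothetical helper: finalizes every stream beyond `targetCount`.
// `client` is assumed to be a BigQueryStorageClient instance.
async function scaleDown(client, liveStreams, targetCount) {
  if (!Number.isInteger(targetCount) || targetCount < 1) {
    throw new RangeError('at least one stream must stay live');
  }
  const keep = liveStreams.slice(0, targetCount);
  const surplus = liveStreams.slice(targetCount);
  for (const stream of surplus) {
    await client.finalizeStream({stream});
  }
  // Finalized streams still hold already-assigned data and must be read
  // to the end by the caller.
  return {keep, finalized: surplus};
}
```

The streams returned in finalized are not finished: workers reading them should keep going until each stream ends.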

getProjectId() → {Promise}

Return the project ID used by this class.

Returns:
Type Description
Promise

A promise that resolves to a string containing the project ID.
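
One possible use is deriving the parent value for createReadSession from the detected project. The helper name defaultParent is hypothetical, and the sketch assumes that projectPath produces the projects/{project_id} form that parent expects.

```javascript
// Hypothetical helper: builds the `parent` string from the project ID that
// the client resolved. `client` is assumed to be a BigQueryStorageClient.
async function defaultParent(client) {
  const projectId = await client.getProjectId();
  return client.projectPath(projectId);
}
```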

initialize() → {Promise}

Initialize the client. Performs asynchronous operations (such as authentication) and prepares the client. This function will be called automatically when any class method is called for the first time, but if you need to initialize it before calling an actual method, feel free to call initialize() directly.

You can await this method to make sure the client is initialized.

Returns:
Type Description
Promise

A promise that resolves to an authenticated service stub.
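
A minimal lifecycle sketch that combines initialize() and close(): initialize eagerly so that authentication problems surface before any work starts, and close the client even when the work fails. The helper name withClient is hypothetical.

```javascript
// Hypothetical helper: runs `work` against an initialized client and always
// closes the client afterwards. `client` is assumed to be a
// BigQueryStorageClient instance.
async function withClient(client, work) {
  await client.initialize();
  try {
    return await work(client);
  } finally {
    // After close() the client must not be used again.
    await client.close();
  }
}
```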

matchLocationFromReadSessionName(readSessionName) → {string}

Parse the location from ReadSession resource.

Parameters:
Name Type Description
readSessionName string

A fully-qualified path representing ReadSession resource.

Returns:
Type Description
string

A string representing the location.

matchLocationFromStreamName(streamName) → {string}

Parse the location from Stream resource.

Parameters:
Name Type Description
streamName string

A fully-qualified path representing Stream resource.

Returns:
Type Description
string

A string representing the location.

matchProjectFromProjectName(projectName) → {string}

Parse the project from Project resource.

Parameters:
Name Type Description
projectName string

A fully-qualified path representing Project resource.

Returns:
Type Description
string

A string representing the project.

matchProjectFromReadSessionName(readSessionName) → {string}

Parse the project from ReadSession resource.

Parameters:
Name Type Description
readSessionName string

A fully-qualified path representing ReadSession resource.

Returns:
Type Description
string

A string representing the project.

matchProjectFromStreamName(streamName) → {string}

Parse the project from Stream resource.

Parameters:
Name Type Description
streamName string

A fully-qualified path representing Stream resource.

Returns:
Type Description
string

A string representing the project.

matchSessionFromReadSessionName(readSessionName) → {string}

Parse the session from ReadSession resource.

Parameters:
Name Type Description
readSessionName string

A fully-qualified path representing ReadSession resource.

Returns:
Type Description
string

A string representing the session.

matchStreamFromStreamName(streamName) → {string}

Parse the stream from Stream resource.

Parameters:
Name Type Description
streamName string

A fully-qualified path representing Stream resource.

Returns:
Type Description
string

A string representing the stream.
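
To illustrate what the match helpers extract, the stand-in below parses a Stream resource name with a regular expression. It is not the library's implementation, and the template projects/{project}/locations/{location}/streams/{stream} is an assumption about the v1beta1 resource name format.

```javascript
// Stand-in for illustration only. Assumed template:
// projects/{project}/locations/{location}/streams/{stream}
const STREAM_NAME = /^projects\/([^/]+)\/locations\/([^/]+)\/streams\/([^/]+)$/;

function parseStreamName(streamName) {
  const match = STREAM_NAME.exec(streamName);
  if (!match) {
    throw new Error(`not a fully-qualified Stream name: ${streamName}`);
  }
  const [, project, location, stream] = match;
  return {project, location, stream};
}

const parts = parseStreamName('projects/grape-spaceship-123/locations/us/streams/abc');
// parts.project, parts.location and parts.stream correspond to what
// matchProjectFromStreamName, matchLocationFromStreamName and
// matchStreamFromStreamName are documented to return.
```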

projectPath(project) → {string}

Return a fully-qualified project resource name string.

Parameters:
Name Type Description
project string
Returns:
Type Description
string

Resource name string.

readRows(request, optionsopt) → {Stream}

Reads rows from the table in the format prescribed by the read session. Each response contains one or more table rows, up to a maximum of 10 MiB per response; read requests which attempt to read individual rows larger than this will fail.

Each request also returns a set of stream statistics reflecting the estimated total number of rows in the read stream. This number is computed based on the total table size and the number of active streams in the read session, and may change as other streams continue to read data.

Parameters:
Name Type Attributes Description
request Object

The request object that will be sent.

Properties
Name Type Description
readPosition google.cloud.bigquery.storage.v1beta1.StreamPosition

Required. Identifier of the position in the stream to start reading from. The offset requested must be less than the last row read from ReadRows. The result of requesting a larger offset is undefined.

options object <optional>

Call options. See CallOptions for more details.

Returns:
Type Description
Stream

An object stream which emits ReadRowsResponse on 'data' event. Please see the documentation for more details and examples.

Example
const stream = client.readRows(request);
stream.on('data', (response) => { /* handle one ReadRowsResponse */ });
stream.on('end', () => { /* the stream has been read to the end */ });
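
A slightly fuller sketch wraps the event handlers in a promise and adds an 'error' handler. The helper name countRows is hypothetical, and the stream, offset and rowCount field names are assumptions based on the v1beta1 StreamPosition and ReadRowsResponse messages. rowCount is converted with Number() because 64-bit integers may arrive as strings.

```javascript
// Hypothetical helper: reads a stream to the end and totals the rows seen.
// `client` is assumed to be a BigQueryStorageClient instance.
function countRows(client, stream, offset = 0) {
  return new Promise((resolve, reject) => {
    let rows = 0;
    client
      .readRows({readPosition: {stream, offset}})
      .on('data', (response) => {
        rows += Number(response.rowCount || 0);
      })
      .on('error', reject)
      .on('end', () => resolve(rows));
  });
}
```

Keeping a running total like this also gives a caller the number of rows it has already consumed, which is the information needed to choose an offset when a read has to be restarted.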

readSessionPath(project, location, session) → {string}

Return a fully-qualified readSession resource name string.

Parameters:
Name Type Description
project string
location string
session string
Returns:
Type Description
string

Resource name string.

splitReadStream(request, optionsopt) → {Promise}

Splits a given read stream into two Streams. These streams are referred to as the primary and the residual of the split. The original stream can still be read from in the same manner as before. Both of the returned streams can also be read from, and the total rows returned by both child streams will be the same as the rows read from the original stream.

Moreover, the two child streams will be allocated back to back in the original Stream. Concretely, it is guaranteed that for streams Original, Primary, and Residual, Original[0-j] = Primary[0-j] and Original[j-n] = Residual[0-m] once the streams have been read to completion.

This method is guaranteed to be idempotent.
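
The back-to-back guarantee can be pictured with plain arrays standing in for the rows of each stream. This is an illustration of the invariant only; the split point j is chosen by the server and the value used here is made up.

```javascript
// Illustration only: arrays stand in for the rows of each stream.
const originalRows = ['r0', 'r1', 'r2', 'r3', 'r4'];
const j = 2; // hypothetical split point picked by the server

const primaryRows = originalRows.slice(0, j); // first part of Original
const residualRows = originalRows.slice(j); // remaining part of Original

// Reading Primary and then Residual yields exactly the rows of Original.
const recombined = primaryRows.concat(residualRows);
const sameRows = recombined.every((row, i) => row === originalRows[i]) &&
  recombined.length === originalRows.length;
```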

Parameters:
Name Type Attributes Description
request Object

The request object that will be sent.

Properties
Name Type Description
originalStream google.cloud.bigquery.storage.v1beta1.Stream

Required. Stream to split.

fraction number

A value in the range (0.0, 1.0) that specifies the fractional point at which the original stream should be split. The actual split point is evaluated on pre-filtered rows, so if a filter is provided, then there is no guarantee that the division of the rows between the new child streams will be proportional to this fractional value. Additionally, because the server-side unit for assigning data is collections of rows, this fraction will always map to a data storage boundary on the server side.

options object <optional>

Call options. See CallOptions for more details.

Returns:
Type Description
Promise
  • The promise which resolves to an array. The first element of the array is an object representing SplitReadStreamResponse. Please see the documentation for more details and examples.
Example
const [response] = await client.splitReadStream(request);
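
A minimal sketch of a split call with client-side validation of fraction. The helper name splitStream is hypothetical, and the response field names primaryStream and remainderStream are assumptions based on the v1beta1 SplitReadStreamResponse message, which this page does not describe.

```javascript
// Hypothetical helper: splits a stream and returns the two child streams.
// `client` is assumed to be a BigQueryStorageClient instance.
async function splitStream(client, originalStream, fraction = 0.5) {
  // The documented range (0.0, 1.0) is open, so both ends are rejected.
  if (!(fraction > 0 && fraction < 1)) {
    throw new RangeError('fraction must be strictly between 0.0 and 1.0');
  }
  const [response] = await client.splitReadStream({originalStream, fraction});
  return {
    primary: response.primaryStream, // assumed field name
    residual: response.remainderStream, // assumed field name
  };
}
```

Because the split point snaps to a storage boundary and is evaluated before filtering, the two children should not be expected to contain rows in exactly the requested proportion.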

streamPath(project, location, stream) → {string}

Return a fully-qualified stream resource name string.

Parameters:
Name Type Description
project string
location string
stream string
Returns:
Type Description
string

Resource name string.
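
For orientation, the stand-ins below show the kind of string the path helpers are expected to produce. They are not the library's implementation; the templates are assumptions about the v1beta1 resource name formats, and real code should call client.readSessionPath and client.streamPath instead.

```javascript
// Stand-ins for illustration only; the templates are assumptions.
function sketchReadSessionPath(project, location, session) {
  return `projects/${project}/locations/${location}/sessions/${session}`;
}

function sketchStreamPath(project, location, stream) {
  return `projects/${project}/locations/${location}/streams/${stream}`;
}

const sessionName = sketchReadSessionPath('grape-spaceship-123', 'us', 'session-1');
const streamName = sketchStreamPath('grape-spaceship-123', 'us', 'stream-1');
```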