As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

BigQuery Storage v1beta2 API Library

Parent client for calling the Cloud BigQuery Storage API.

This is the base from which all interactions with the API occur.

class google.cloud.bigquery_storage_v1beta2.client.BigQueryReadClient(*, credentials: typing.Optional[google.auth.credentials.Credentials] = None, transport: typing.Optional[typing.Union[str, google.cloud.bigquery_storage_v1beta2.services.big_query_read.transports.base.BigQueryReadTransport, typing.Callable[[...], google.cloud.bigquery_storage_v1beta2.services.big_query_read.transports.base.BigQueryReadTransport]]] = None, client_options: typing.Optional[typing.Union[google.api_core.client_options.ClientOptions, dict]] = None, client_info: google.api_core.gapic_v1.client_info.ClientInfo = <google.api_core.gapic_v1.client_info.ClientInfo object>)[source]

Client for interacting with BigQuery Storage API.

The BigQuery storage API can be used to read data stored in BigQuery.

Instantiates the big query read client.

Parameters
  • credentials (Optional[google.auth.credentials.Credentials]) – The authorization credentials to attach to requests. These credentials identify the application to the service; if none are specified, the client will attempt to ascertain the credentials from the environment.

  • transport (Optional[Union[str,BigQueryReadTransport,Callable[..., BigQueryReadTransport]]]) – The transport to use, or a Callable that constructs and returns a new transport. If a Callable is given, it will be called with the same set of initialization arguments as used in the BigQueryReadTransport constructor. If set to None, a transport is chosen automatically.

  • client_options (Optional[Union[google.api_core.client_options.ClientOptions, dict]]) –

    Custom options for the client. A brief configuration sketch is shown below, after the constructor's Raises entry.

    1. The api_endpoint property can be used to override the default endpoint provided by the client when transport is not explicitly provided. Only if this property is not set and transport was not explicitly provided, the endpoint is determined by the GOOGLE_API_USE_MTLS_ENDPOINT environment variable, which can have one of the following values: “always” (always use the default mTLS endpoint), “never” (always use the default regular endpoint), and “auto” (auto-switch to the default mTLS endpoint if a client certificate is present; this is the default value).

    2. If the GOOGLE_API_USE_CLIENT_CERTIFICATE environment variable is “true”, then the client_cert_source property can be used to provide a client certificate for mTLS transport. If not provided, the default SSL client certificate will be used if present. If GOOGLE_API_USE_CLIENT_CERTIFICATE is “false” or not set, no client certificate will be used.

    3. The universe_domain property can be used to override the default “googleapis.com” universe. Note that the api_endpoint property still takes precedence; and universe_domain is currently not supported for mTLS.

  • client_info (google.api_core.gapic_v1.client_info.ClientInfo) – The client info used to send a user-agent string along with API requests. If None, then default info will be used. Generally, you only need to set this if you’re developing your own client library.

Raises

google.auth.exceptions.MutualTLSChannelError – If mutual TLS transport creation failed for any reason.
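
A hedged configuration sketch of client_options, as referenced above (the endpoint value is illustrative; most applications do not need to set client_options at all):

from google.api_core.client_options import ClientOptions
from google.cloud import bigquery_storage_v1beta2

# Override the API endpoint explicitly instead of relying on environment variables.
options = ClientOptions(api_endpoint="bigquerystorage.googleapis.com")
client = bigquery_storage_v1beta2.BigQueryReadClient(client_options=options)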

__exit__(type, value, traceback)[source]

Releases underlying transport’s resources.

Warning

ONLY use as a context manager if the transport is NOT shared with other clients! Exiting the with block will CLOSE the transport and may cause errors in other clients!

property api_endpoint

Return the API endpoint used by the client instance.

Returns

The API endpoint used by the client instance.

Return type

str

static common_billing_account_path(billing_account: str) str[source]

Returns a fully-qualified billing_account string.

static common_folder_path(folder: str) str[source]

Returns a fully-qualified folder string.

static common_location_path(project: str, location: str) str[source]

Returns a fully-qualified location string.

static common_organization_path(organization: str) str[source]

Returns a fully-qualified organization string.

static common_project_path(project: str) str[source]

Returns a fully-qualified project string.

create_read_session(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.CreateReadSessionRequest, dict]] = None, *, parent: Optional[str] = None, read_session: Optional[google.cloud.bigquery_storage_v1beta2.types.stream.ReadSession] = None, max_stream_count: Optional[int] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.stream.ReadSession[source]

Creates a new read session. A read session divides the contents of a BigQuery table into one or more streams, which can then be used to read data from the table. The read session also specifies properties of the data to be read, such as a list of columns or a push-down filter describing the rows to be returned.

A particular row can be read by at most one stream. When the caller has reached the end of each stream in the session, then all the data in the table has been read.

Data is assigned to each stream such that roughly the same number of rows can be read from each stream. Because the server-side unit for assigning data is collections of rows, the API does not guarantee that each stream will return the same number of rows. Additionally, the limits are enforced based on the number of pre-filtered rows, so some filters can lead to lopsided assignments.

Read sessions automatically expire 6 hours after they are created and do not require manual clean-up by the caller.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_create_read_session():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryReadClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.CreateReadSessionRequest(
        parent="parent_value",
    )

    # Make the request
    response = client.create_read_session(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.CreateReadSessionRequest, dict]) – The request object. Request message for CreateReadSession.

  • parent (str) –

    Required. The request project that owns the session, in the form of projects/{project_id}.

    This corresponds to the parent field on the request instance; if request is provided, this should not be set.

  • read_session (google.cloud.bigquery_storage_v1beta2.types.ReadSession) – Required. Session to be created. This corresponds to the read_session field on the request instance; if request is provided, this should not be set.

  • max_stream_count (int) –

    Max initial number of streams. If unset or zero, the server will provide a value of streams so as to produce reasonable throughput. Must be non-negative. The number of streams may be lower than the requested number, depending on the amount of parallelism that is reasonable for the table. An error will be returned if the maximum count is greater than the current system max limit of 1,000.

    Streams must be read starting from offset 0.

    This corresponds to the max_stream_count field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Information about the ReadSession.

Return type

google.cloud.bigquery_storage_v1beta2.types.ReadSession
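
To illustrate the column list and push-down row filter mentioned in the description above, here is a hedged sketch; the project, dataset, table, and filter values are placeholders, and it assumes the v1beta2 TableReadOptions fields selected_fields and row_restriction:

from google.cloud import bigquery_storage_v1beta2
from google.cloud.bigquery_storage_v1beta2 import types

def sample_create_filtered_read_session():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryReadClient()

    # Describe the slice of the table to read: two columns and a row filter
    requested_session = types.ReadSession(
        table="projects/your-project/datasets/your_dataset/tables/your_table",
        data_format=types.DataFormat.AVRO,
        read_options=types.ReadSession.TableReadOptions(
            selected_fields=["col_a", "col_b"],
            row_restriction="col_a > 0",
        ),
    )

    # Make the request
    session = client.create_read_session(
        parent="projects/your-billing-project",
        read_session=requested_session,
        max_stream_count=1,
    )

    # Handle the response
    print(session)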

classmethod from_service_account_file(filename: str, *args, **kwargs)[source]

Creates an instance of this client using the provided credentials file.

Parameters
  • filename (str) – The path to the service account private key json file.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryReadClient
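
A minimal usage sketch (the key file path is a placeholder):

from google.cloud import bigquery_storage_v1beta2

client = bigquery_storage_v1beta2.BigQueryReadClient.from_service_account_file(
    "path/to/service-account.json"
)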

classmethod from_service_account_info(info: dict, *args, **kwargs)[source]

Creates an instance of this client using the provided credentials info.

Parameters
  • info (dict) – The service account private key info.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryReadClient

classmethod from_service_account_json(filename: str, *args, **kwargs)

Creates an instance of this client using the provided credentials file.

Parameters
  • filename (str) – The path to the service account private key json file.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryReadClient

classmethod get_mtls_endpoint_and_cert_source(client_options: Optional[google.api_core.client_options.ClientOptions] = None)[source]

Deprecated. Return the API endpoint and client cert source for mutual TLS.

The client cert source is determined in the following order: (1) if GOOGLE_API_USE_CLIENT_CERTIFICATE environment variable is not “true”, the client cert source is None. (2) if client_options.client_cert_source is provided, use the provided one; if the default client cert source exists, use the default one; otherwise the client cert source is None.

The API endpoint is determined in the following order: (1) if client_options.api_endpoint is provided, use the provided one. (2) if the GOOGLE_API_USE_MTLS_ENDPOINT environment variable is “always”, use the default mTLS endpoint; if the environment variable is “never”, use the default API endpoint; otherwise, if a client cert source exists, use the default mTLS endpoint, otherwise use the default API endpoint.

More details can be found at https://google.aip.dev/auth/4114.

Parameters

client_options (google.api_core.client_options.ClientOptions) – Custom options for the client. Only the api_endpoint and client_cert_source properties may be used in this method.

Returns

The API endpoint and the client cert source to use.

Return type

Tuple[str, Callable[[], Tuple[bytes, bytes]]]

Raises

google.auth.exceptions.MutualTLSChannelError – If any errors happen.

static parse_common_billing_account_path(path: str) Dict[str, str][source]

Parse a billing_account path into its component segments.

static parse_common_folder_path(path: str) Dict[str, str][source]

Parse a folder path into its component segments.

static parse_common_location_path(path: str) Dict[str, str][source]

Parse a location path into its component segments.

static parse_common_organization_path(path: str) Dict[str, str][source]

Parse an organization path into its component segments.

static parse_common_project_path(path: str) Dict[str, str][source]

Parse a project path into its component segments.

static parse_read_session_path(path: str) Dict[str, str][source]

Parses a read_session path into its component segments.

static parse_read_stream_path(path: str) Dict[str, str][source]

Parses a read_stream path into its component segments.

static parse_table_path(path: str) Dict[str, str][source]

Parses a table path into its component segments.

read_rows(name, offset=0, retry=_MethodDefault._DEFAULT_VALUE, timeout=_MethodDefault._DEFAULT_VALUE, metadata=(), retry_delay_callback=None)[source]

Reads rows from the table in the format prescribed by the read session. Each response contains one or more table rows, up to a maximum of 10 MiB per response; read requests which attempt to read individual rows larger than this will fail.

Each request also returns a set of stream statistics reflecting the estimated total number of rows in the read stream. This number is computed based on the total table size and the number of active streams in the read session, and may change as other streams continue to read data.

Example

>>> from google.cloud import bigquery_storage
>>>
>>> client = bigquery_storage.BigQueryReadClient()
>>>
>>> # TODO: Initialize ``table``:
>>> table = "projects/{}/datasets/{}/tables/{}".format(
...     'your-data-project-id',
...     'your_dataset_id',
...     'your_table_id',
... )
>>>
>>> # TODO: Initialize `parent`:
>>> parent = 'projects/your-billing-project-id'
>>>
>>> requested_session = bigquery_storage.types.ReadSession(
...     table=table,
...     data_format=bigquery_storage.types.DataFormat.AVRO,
... )
>>> session = client.create_read_session(
...     parent=parent, read_session=requested_session
... )
>>>
>>> stream = session.streams[0]  # TODO: Also read any other streams.
>>> read_rows_stream = client.read_rows(stream.name)
>>>
>>> for element in read_rows_stream.rows(session):
...     # process element
...     pass
Parameters
  • name (str) – Required. Name of the stream to start reading from, of the form projects/{project_id}/locations/{location}/sessions/{session_id}/streams/{stream_id}

  • offset (Optional[int]) – The starting offset from which to begin reading rows in the stream. The offset requested must be less than the last row read from ReadRows. Requesting a larger offset is undefined.

  • retry (Optional[google.api_core.retry.Retry]) – A retry object used to retry requests. If None is specified, requests will not be retried.

  • timeout (Optional[float]) – The amount of time, in seconds, to wait for the request to complete. Note that if retry is specified, the timeout applies to each individual attempt.

  • metadata (Optional[Sequence[Tuple[str, str]]]) – Additional metadata that is provided to the method.

  • retry_delay_callback (Optional[Callable[[float], None]]) – If the client receives a retryable error that asks the client to delay its next attempt and retry_delay_callback is not None, BigQueryReadClient will call retry_delay_callback with the delay duration (in seconds) before it starts sleeping until the next attempt.

Returns

An iterable of ReadRowsResponse.

Return type

ReadRowsStream

static read_session_path(project: str, location: str, session: str) str[source]

Returns a fully-qualified read_session string.

static read_stream_path(project: str, location: str, session: str, stream: str) str[source]

Returns a fully-qualified read_stream string.

split_read_stream(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.SplitReadStreamRequest, dict]] = None, *, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.storage.SplitReadStreamResponse[source]

Splits a given ReadStream into two ReadStream objects. These ReadStream objects are referred to as the primary and the residual streams of the split. The original ReadStream can still be read from in the same manner as before. Both of the returned ReadStream objects can also be read from, and the rows returned by both child streams will be the same as the rows read from the original stream.

Moreover, the two child streams will be allocated back-to-back in the original ReadStream. Concretely, it is guaranteed that for streams original, primary, and residual, original[0-j] = primary[0-j] and original[j-n] = residual[0-m] once the streams have been read to completion.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_split_read_stream():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryReadClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.SplitReadStreamRequest(
        name="name_value",
    )

    # Make the request
    response = client.split_read_stream(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.SplitReadStreamRequest, dict]) – The request object. Request message for SplitReadStream.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Response message for SplitReadStream.

Return type

google.cloud.bigquery_storage_v1beta2.types.SplitReadStreamResponse

static table_path(project: str, dataset: str, table: str) str[source]

Returns a fully-qualified table string.
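
For example (the identifiers are placeholders), the helper builds the standard table resource name:

from google.cloud import bigquery_storage_v1beta2

path = bigquery_storage_v1beta2.BigQueryReadClient.table_path(
    "my-project", "my_dataset", "my_table"
)
# path == "projects/my-project/datasets/my_dataset/tables/my_table"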

property transport: google.cloud.bigquery_storage_v1beta2.services.big_query_read.transports.base.BigQueryReadTransport

Returns the transport used by the client instance.

Returns

The transport used by the client instance.

Return type

BigQueryReadTransport

property universe_domain: str

Return the universe domain used by the client instance.

Returns

The universe domain used by the client instance.

Return type

str

class google.cloud.bigquery_storage_v1beta2.client.BigQueryWriteClient(*, credentials: typing.Optional[google.auth.credentials.Credentials] = None, transport: typing.Optional[typing.Union[str, google.cloud.bigquery_storage_v1beta2.services.big_query_write.transports.base.BigQueryWriteTransport, typing.Callable[[...], google.cloud.bigquery_storage_v1beta2.services.big_query_write.transports.base.BigQueryWriteTransport]]] = None, client_options: typing.Optional[typing.Union[google.api_core.client_options.ClientOptions, dict]] = None, client_info: google.api_core.gapic_v1.client_info.ClientInfo = <google.api_core.gapic_v1.client_info.ClientInfo object>)[source]

BigQuery Write API.

The Write API can be used to write data to BigQuery.

The google.cloud.bigquery.storage.v1 API should be used instead of the v1beta2 API for BigQueryWrite operations.

Instantiates the big query write client.

Parameters
  • credentials (Optional[google.auth.credentials.Credentials]) – The authorization credentials to attach to requests. These credentials identify the application to the service; if none are specified, the client will attempt to ascertain the credentials from the environment.

  • transport (Optional[Union[str,BigQueryWriteTransport,Callable[..., BigQueryWriteTransport]]]) – The transport to use, or a Callable that constructs and returns a new transport. If a Callable is given, it will be called with the same set of initialization arguments as used in the BigQueryWriteTransport constructor. If set to None, a transport is chosen automatically.

  • client_options (Optional[Union[google.api_core.client_options.ClientOptions, dict]]) –

    Custom options for the client.

    1. The api_endpoint property can be used to override the default endpoint provided by the client when transport is not explicitly provided. Only if this property is not set and transport was not explicitly provided, the endpoint is determined by the GOOGLE_API_USE_MTLS_ENDPOINT environment variable, which can have one of the following values: “always” (always use the default mTLS endpoint), “never” (always use the default regular endpoint), and “auto” (auto-switch to the default mTLS endpoint if a client certificate is present; this is the default value).

    2. If the GOOGLE_API_USE_CLIENT_CERTIFICATE environment variable is “true”, then the client_cert_source property can be used to provide a client certificate for mTLS transport. If not provided, the default SSL client certificate will be used if present. If GOOGLE_API_USE_CLIENT_CERTIFICATE is “false” or not set, no client certificate will be used.

    3. The universe_domain property can be used to override the default “googleapis.com” universe. Note that the api_endpoint property still takes precedence; and universe_domain is currently not supported for mTLS.

  • client_info (google.api_core.gapic_v1.client_info.ClientInfo) – The client info used to send a user-agent string along with API requests. If None, then default info will be used. Generally, you only need to set this if you’re developing your own client library.

Raises

google.auth.exceptions.MutualTLSChannelError – If mutual TLS transport creation failed for any reason.

__exit__(type, value, traceback)[source]

Releases underlying transport’s resources.

Warning

ONLY use as a context manager if the transport is NOT shared with other clients! Exiting the with block will CLOSE the transport and may cause errors in other clients!

property api_endpoint

Return the API endpoint used by the client instance.

Returns

The API endpoint used by the client instance.

Return type

str

append_rows(requests: Optional[Iterator[google.cloud.bigquery_storage_v1beta2.types.storage.AppendRowsRequest]] = None, *, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) Iterable[google.cloud.bigquery_storage_v1beta2.types.storage.AppendRowsResponse][source]

Appends data to the given stream.

If offset is specified, the offset is checked against the end of stream. The server returns OUT_OF_RANGE in AppendRowsResponse if an attempt is made to append to an offset beyond the current end of the stream, or ALREADY_EXISTS if the user provides an offset that has already been written to. The user can retry with an adjusted offset within the same RPC stream. If offset is not specified, the append happens at the end of the stream.

The response contains the offset at which the append happened. Responses are received in the same order in which requests are sent. There will be one response for each successful request. If the offset is not set in the response, it means the append did not happen due to an error. If one request fails, all subsequent requests will also fail until a successful request is made again.

If the stream is of PENDING type, data will only be available for read operations after the stream is committed.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_append_rows():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.AppendRowsRequest(
        write_stream="write_stream_value",
    )

    # This method expects an iterator which contains
    # 'bigquery_storage_v1beta2.AppendRowsRequest' objects
    # Here we create a generator that yields a single `request` for
    # demonstrative purposes.
    requests = [request]

    def request_generator():
        for request in requests:
            yield request

    # Make the request
    stream = client.append_rows(requests=request_generator())

    # Handle the response
    for response in stream:
        print(response)
Parameters
  • requests (Iterator[google.cloud.bigquery_storage_v1beta2.types.AppendRowsRequest]) – The request object iterator. Request message for AppendRows.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Response message for AppendRows.

Return type

Iterable[google.cloud.bigquery_storage_v1beta2.types.AppendRowsResponse]

batch_commit_write_streams(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.BatchCommitWriteStreamsRequest, dict]] = None, *, parent: Optional[str] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.storage.BatchCommitWriteStreamsResponse[source]

Atomically commits a group of PENDING streams that belong to the same parent table. Streams must be finalized before commit and cannot be committed multiple times. Once a stream is committed, data in the stream becomes available for read operations.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_batch_commit_write_streams():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.BatchCommitWriteStreamsRequest(
        parent="parent_value",
        write_streams=['write_streams_value1', 'write_streams_value2'],
    )

    # Make the request
    response = client.batch_commit_write_streams(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.BatchCommitWriteStreamsRequest, dict]) – The request object. Request message for BatchCommitWriteStreams.

  • parent (str) –

    Required. Parent table that all the streams should belong to, in the form of projects/{project}/datasets/{dataset}/tables/{table}.

    This corresponds to the parent field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Response message for BatchCommitWriteStreams.

Return type

google.cloud.bigquery_storage_v1beta2.types.BatchCommitWriteStreamsResponse

static common_billing_account_path(billing_account: str) str[source]

Returns a fully-qualified billing_account string.

static common_folder_path(folder: str) str[source]

Returns a fully-qualified folder string.

static common_location_path(project: str, location: str) str[source]

Returns a fully-qualified location string.

static common_organization_path(organization: str) str[source]

Returns a fully-qualified organization string.

static common_project_path(project: str) str[source]

Returns a fully-qualified project string.

create_write_stream(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.CreateWriteStreamRequest, dict]] = None, *, parent: Optional[str] = None, write_stream: Optional[google.cloud.bigquery_storage_v1beta2.types.stream.WriteStream] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.stream.WriteStream[source]

Creates a write stream to the given table. Additionally, every table has a special COMMITTED stream named ‘_default’ to which data can be written. This stream doesn’t need to be created using CreateWriteStream. It is a stream that can be used simultaneously by any number of clients. Data written to this stream is considered committed as soon as an acknowledgement is received.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_create_write_stream():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.CreateWriteStreamRequest(
        parent="parent_value",
    )

    # Make the request
    response = client.create_write_stream(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.CreateWriteStreamRequest, dict]) – The request object. Request message for CreateWriteStream.

  • parent (str) –

    Required. Reference to the table to which the stream belongs, in the format of projects/{project}/datasets/{dataset}/tables/{table}.

    This corresponds to the parent field on the request instance; if request is provided, this should not be set.

  • write_stream (google.cloud.bigquery_storage_v1beta2.types.WriteStream) – Required. Stream to be created. This corresponds to the write_stream field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Information about a single stream that gets data inside the storage system.

Return type

google.cloud.bigquery_storage_v1beta2.types.WriteStream

finalize_write_stream(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.FinalizeWriteStreamRequest, dict]] = None, *, name: Optional[str] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.storage.FinalizeWriteStreamResponse[source]

Finalize a write stream so that no new data can be appended to the stream. Finalize is not supported on the ‘_default’ stream.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_finalize_write_stream():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.FinalizeWriteStreamRequest(
        name="name_value",
    )

    # Make the request
    response = client.finalize_write_stream(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.FinalizeWriteStreamRequest, dict]) – The request object. Request message for invoking FinalizeWriteStream.

  • name (str) –

    Required. Name of the stream to finalize, in the form of projects/{project}/datasets/{dataset}/tables/{table}/streams/{stream}.

    This corresponds to the name field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Response message for FinalizeWriteStream.

Return type

google.cloud.bigquery_storage_v1beta2.types.FinalizeWriteStreamResponse

flush_rows(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.FlushRowsRequest, dict]] = None, *, write_stream: Optional[str] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.storage.FlushRowsResponse[source]

Flushes rows to a BUFFERED stream. If users are appending rows to a BUFFERED stream, the flush operation is required in order for the rows to become available for reading. A flush operation flushes up to any previously flushed offset in a BUFFERED stream, to the offset specified in the request. Flush is not supported on the _default stream, since it is not BUFFERED.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_flush_rows():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.FlushRowsRequest(
        write_stream="write_stream_value",
    )

    # Make the request
    response = client.flush_rows(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.FlushRowsRequest, dict]) – The request object. Request message for FlushRows.

  • write_stream (str) –

    Required. The stream that is the target of the flush operation.

    This corresponds to the write_stream field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Response message for FlushRows.

Return type

google.cloud.bigquery_storage_v1beta2.types.FlushRowsResponse

classmethod from_service_account_file(filename: str, *args, **kwargs)[source]

Creates an instance of this client using the provided credentials file.

Parameters
  • filename (str) – The path to the service account private key json file.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryWriteClient

classmethod from_service_account_info(info: dict, *args, **kwargs)[source]

Creates an instance of this client using the provided credentials info.

Parameters
  • info (dict) – The service account private key info.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryWriteClient

classmethod from_service_account_json(filename: str, *args, **kwargs)

Creates an instance of this client using the provided credentials file.

Parameters
  • filename (str) – The path to the service account private key json file.

  • args – Additional arguments to pass to the constructor.

  • kwargs – Additional arguments to pass to the constructor.

Returns

The constructed client.

Return type

BigQueryWriteClient

classmethod get_mtls_endpoint_and_cert_source(client_options: Optional[google.api_core.client_options.ClientOptions] = None)[source]

Deprecated. Return the API endpoint and client cert source for mutual TLS.

The client cert source is determined in the following order: (1) if GOOGLE_API_USE_CLIENT_CERTIFICATE environment variable is not “true”, the client cert source is None. (2) if client_options.client_cert_source is provided, use the provided one; if the default client cert source exists, use the default one; otherwise the client cert source is None.

The API endpoint is determined in the following order: (1) if client_options.api_endpoint is provided, use the provided one. (2) if the GOOGLE_API_USE_MTLS_ENDPOINT environment variable is “always”, use the default mTLS endpoint; if the environment variable is “never”, use the default API endpoint; otherwise, if a client cert source exists, use the default mTLS endpoint, otherwise use the default API endpoint.

More details can be found at https://google.aip.dev/auth/4114.

Parameters

client_options (google.api_core.client_options.ClientOptions) – Custom options for the client. Only the api_endpoint and client_cert_source properties may be used in this method.

Returns

The API endpoint and the client cert source to use.

Return type

Tuple[str, Callable[[], Tuple[bytes, bytes]]]

Raises

google.auth.exceptions.MutualTLSChannelError – If any errors happen.

get_write_stream(request: Optional[Union[google.cloud.bigquery_storage_v1beta2.types.storage.GetWriteStreamRequest, dict]] = None, *, name: Optional[str] = None, retry: Optional[Union[google.api_core.retry.retry_unary.Retry, google.api_core.gapic_v1.method._MethodDefault]] = _MethodDefault._DEFAULT_VALUE, timeout: Union[float, object] = _MethodDefault._DEFAULT_VALUE, metadata: Sequence[Tuple[str, str]] = ()) google.cloud.bigquery_storage_v1beta2.types.stream.WriteStream[source]

Gets a write stream.

# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import bigquery_storage_v1beta2

def sample_get_write_stream():
    # Create a client
    client = bigquery_storage_v1beta2.BigQueryWriteClient()

    # Initialize request argument(s)
    request = bigquery_storage_v1beta2.GetWriteStreamRequest(
        name="name_value",
    )

    # Make the request
    response = client.get_write_stream(request=request)

    # Handle the response
    print(response)
Parameters
  • request (Union[google.cloud.bigquery_storage_v1beta2.types.GetWriteStreamRequest, dict]) – The request object. Request message for GetWriteStream.

  • name (str) –

    Required. Name of the stream to get, in the form of projects/{project}/datasets/{dataset}/tables/{table}/streams/{stream}.

    This corresponds to the name field on the request instance; if request is provided, this should not be set.

  • retry (google.api_core.retry.Retry) – Designation of what errors, if any, should be retried.

  • timeout (float) – The timeout for this request.

  • metadata (Sequence[Tuple[str, str]]) – Strings which should be sent along with the request as metadata.

Returns

Information about a single stream that gets data inside the storage system.

Return type

google.cloud.bigquery_storage_v1beta2.types.WriteStream

static parse_common_billing_account_path(path: str) Dict[str, str][source]

Parse a billing_account path into its component segments.

static parse_common_folder_path(path: str) Dict[str, str][source]

Parse a folder path into its component segments.

static parse_common_location_path(path: str) Dict[str, str][source]

Parse a location path into its component segments.

static parse_common_organization_path(path: str) Dict[str, str][source]

Parse an organization path into its component segments.

static parse_common_project_path(path: str) Dict[str, str][source]

Parse a project path into its component segments.

static parse_table_path(path: str) Dict[str, str][source]

Parses a table path into its component segments.

static parse_write_stream_path(path: str) Dict[str, str][source]

Parses a write_stream path into its component segments.

static table_path(project: str, dataset: str, table: str) str[source]

Returns a fully-qualified table string.

property transport: google.cloud.bigquery_storage_v1beta2.services.big_query_write.transports.base.BigQueryWriteTransport

Returns the transport used by the client instance.

Returns

The transport used by the client instance.

Return type

BigQueryWriteTransport

property universe_domain: str

Return the universe domain used by the client instance.

Returns

The universe domain used by the client instance.

Return type

str

static write_stream_path(project: str, dataset: str, table: str, stream: str) str[source]

Returns a fully-qualified write_stream string.

class google.cloud.bigquery_storage_v1beta2.writer.AppendRowsFuture(manager: google.cloud.bigquery_storage_v1beta2.writer.AppendRowsStream)[source]

Encapsulation of the asynchronous execution of an action.

This object is returned from long-running BigQuery Storage API calls, and is the interface to determine the status of those calls.

This object should not be created directly, but is returned by other methods in this library.

add_done_callback(fn)

Add a callback to be executed when the operation is complete.

If the operation is not already complete, this will start a helper thread to poll for the status of the operation in the background.

Parameters

fn (Callable[Future]) – The callback to execute when the operation is complete.
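
A hedged usage sketch, assuming future is an AppendRowsFuture returned by AppendRowsStream.send():

def on_append_done(done_future):
    # Runs on a background thread once the append completes (or fails).
    print(done_future.result())

future.add_done_callback(on_append_done)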

cancel()[source]

Stops pulling messages and shuts down the background thread consuming messages.

The method does not block, it just triggers the shutdown and returns immediately. To block until the background stream is terminated, call result() after cancelling the future.

cancelled()[source]
Returns

True if the write stream has been cancelled.

Return type

bool

done(retry: Optional[google.api_core.retry.retry_unary.Retry] = None) bool[source]

Check the status of the future.

Parameters

retry – Not used. Included for compatibility with the base class. Future status is updated by a background thread.

Returns

True if the request has finished, otherwise False.

exception(timeout=<object object>)

Get the exception from the operation, blocking if necessary.

See the documentation for the result() method for details on how this method operates, as both result and this method rely on the exact same polling logic. The only difference is that this method does not accept retry and polling arguments but relies on the default ones instead.

Parameters

timeout (int) – How long to wait for the operation to complete. If None, wait indefinitely.

Returns

The operation’s error.

Return type

Optional[google.api_core.GoogleAPICallError]

result(timeout=<object object>, retry=None, polling=None)

Get the result of the operation.

This method will poll for operation status periodically, blocking if necessary. If you just want to make sure that this method does not block for more than X seconds and you do not care about the nitty-gritty of how this method operates, just call it with result(timeout=X). The other parameters are for advanced use only.

Every call to this method is controlled by the following three parameters, each of which has a specific, distinct role, even though all three may look very similar: timeout, retry and polling. In most cases users do not need to specify any custom values for any of these parameters and may simply rely on default ones instead.

If you choose to specify custom parameters, please make sure you’ve read the documentation below carefully.

First, please check the google.api_core.retry.Retry class documentation for the proper definition of the timeout and deadline terms and for the definition of the three different types of timeouts. This class operates in terms of Retry Timeout and Polling Timeout. It does not allow customizing the RPC timeout, and the user is expected to rely on the default behavior for it.

The roles of each argument of this method are as follows:

timeout (int): (Optional) The Polling Timeout as defined in google.api_core.retry.Retry. If the operation does not complete within this timeout an exception will be thrown. This parameter affects neither Retry Timeout nor RPC Timeout.

retry (google.api_core.retry.Retry): (Optional) How to retry the polling RPC. The retry.timeout property of this parameter is the Retry Timeout as defined in google.api_core.retry.Retry. This parameter defines ONLY how the polling RPC call is retried (i.e. what to do if the RPC we used for polling returned an error). It does NOT define how the polling is done (i.e. how frequently and for how long to call the polling RPC); use the polling parameter for that. If a polling RPC throws an error and retrying it fails, the whole future fails with the corresponding exception. If you want to tune which server response error codes are not fatal for operation polling, use this parameter to control that (retry.predicate in particular).

polling (google.api_core.retry.Retry): (Optional) How often and for how long to call the polling RPC periodically (i.e. what to do if a polling RPC returned successfully but its returned result indicates that the long-running operation is not completed yet, so we need to check it again at some point in the future). This parameter does NOT define how to retry each individual polling RPC in case of an error; use the retry parameter for that. The polling.timeout of this parameter is the Polling Timeout as defined in google.api_core.retry.Retry.

For each of the arguments, there are also default values in place, which will be used if a user does not specify their own. The default values for the three parameters are not to be confused with the default values for the corresponding arguments in this method (those serve as “not set” markers for the resolution logic).

If timeout is provided (i.e. timeout is not _DEFAULT_VALUE; note that the None value means “infinite timeout”), it will be used to control the actual Polling Timeout. Otherwise, the polling.timeout value will be used instead (see below for how the polling config itself gets resolved). In other words, this parameter effectively overrides the polling.timeout value if specified. This is done to preserve backward compatibility.

If retry is provided (i.e. retry is not None) it will be used to control retry behavior for the polling RPC and the retry.timeout will determine the Retry Timeout. If not provided, the polling RPC will be called with whichever default retry config was specified for the polling RPC at the moment of the construction of the polling RPC’s client. For example, if the polling RPC is operations_client.get_operation(), the retry parameter will be controlling its retry behavior (not polling behavior) and, if not specified, that specific method (operations_client.get_operation()) will be retried according to the default retry config provided during creation of operations_client client instead. This argument exists mainly for backward compatibility; users are very unlikely to ever need to set this parameter explicitly.

If polling is provided (i.e. polling is not None), it will be used to control the overall polling behavior, and polling.timeout will control the Polling Timeout unless it is overridden by the timeout parameter as described above. If not provided, the polling parameter specified during construction of this future (the polling argument in the constructor) will be used instead. Note: since the timeout argument may override the polling.timeout value, this parameter should be viewed as coupled with the timeout parameter as described above.

Parameters
  • timeout (int) – (Optional) How long (in seconds) to wait for the operation to complete. If None, wait indefinitely.

  • retry (google.api_core.retry.Retry) – (Optional) How to retry the polling RPC. This defines ONLY how the polling RPC call is retried (i.e. what to do if the RPC we used for polling returned an error). It does NOT define how the polling is done (i.e. how frequently and for how long to call the polling RPC).

  • polling (google.api_core.retry.Retry) – (Optional) How often and for how long to call polling RPC periodically. This parameter does NOT define how to retry each individual polling RPC call (use the retry parameter for that).

Returns

The Operation’s result.

Return type

google.protobuf.Message

Raises

google.api_core.GoogleAPICallError – If the operation errors or if the timeout is reached before the operation completes.
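
For example, to wait at most 60 seconds for the operation to finish (the timeout value is illustrative, and future is assumed to be an AppendRowsFuture):

response = future.result(timeout=60)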

running()

True if the operation is currently running.

set_exception(exception)[source]

Set the result of the future as being the given exception.

Do not use this method, it should only be used internally by the library and its unit tests.

set_result(result)[source]

Set the return value of work associated with the future.

Do not use this method, it should only be used internally by the library and its unit tests.

class google.cloud.bigquery_storage_v1beta2.writer.AppendRowsStream(client: google.cloud.bigquery_storage_v1beta2.services.big_query_write.client.BigQueryWriteClient, initial_request_template: google.cloud.bigquery_storage_v1beta2.types.storage.AppendRowsRequest, metadata: Sequence[Tuple[str, str]] = ())[source]

A manager object which can append rows to a stream.

Construct a stream manager.

add_close_callback(callback: Callable)[source]

Schedules a callable to run when the manager closes.

Parameters

callback (Callable) – The method to call.

close(reason: Optional[Exception] = None)[source]

Stop consuming messages and shutdown all helper threads.

This method is idempotent. Additional calls will have no effect.

Parameters

reason – The reason to close this. If None, this is considered an “intentional” shutdown. This is passed to the callbacks specified via add_close_callback().

property is_active: bool

True if this manager is actively streaming.

Note that False does not indicate the manager is completely shut down, just that it has stopped receiving new messages.

Type

bool

send(request: google.cloud.bigquery_storage_v1beta2.types.storage.AppendRowsRequest) google.cloud.bigquery_storage_v1beta2.writer.AppendRowsFuture[source]

Send an append rows request to the open stream.

Parameters

request – The request to add to the stream.

Returns

A future, which can be used to process the response when it arrives.
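
A hedged end-to-end sketch of the manager; the stream name is a placeholder and the proto rows payload is omitted (a real request must carry a writer schema and serialized protobuf rows):

from google.cloud import bigquery_storage_v1beta2
from google.cloud.bigquery_storage_v1beta2 import types, writer

client = bigquery_storage_v1beta2.BigQueryWriteClient()

# Template used for requests sent on the connection; it typically names the
# target write stream and carries the writer schema.
initial_request = types.AppendRowsRequest(
    write_stream="projects/p/datasets/d/tables/t/streams/_default",
)

manager = writer.AppendRowsStream(client, initial_request)

# Each send() returns an AppendRowsFuture; result() blocks until the server
# acknowledges (or rejects) the append.
request = types.AppendRowsRequest()  # proto_rows payload omitted in this sketch
future = manager.send(request)
print(future.result())

manager.close()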