As of January 1, 2020 this library no longer supports Python 2 on the latest released version. Library versions released prior to that date will continue to be available. For more information please visit Python 2 support on Google Cloud.

google.cloud.bigquery.dataset.Dataset

class google.cloud.bigquery.dataset.Dataset(dataset_ref)

Datasets are containers for tables.

See https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets#resource-dataset

Parameters

dataset_ref (Union[google.cloud.bigquery.dataset.DatasetReference, str]) – A pointer to a dataset. If dataset_ref is a string, it must include both the project ID and the dataset ID, separated by a period (.).

__init__(dataset_ref) → None

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(dataset_ref)

Initialize self.

from_api_repr(resource)

Factory: construct a dataset given its API representation.

from_string(full_dataset_id)

Construct a dataset from fully-qualified dataset ID.

model(model_id)

Constructs a ModelReference.

routine(routine_id)

Constructs a RoutineReference.

table(table_id)

Constructs a TableReference.

to_api_repr()

Construct the API resource representation of this dataset.

Attributes

access_entries

Dataset’s access entries.

created

Datetime at which the dataset was created (None until set from the server).

dataset_id

Dataset ID.

default_encryption_configuration

Custom encryption configuration for all tables in the dataset.

default_partition_expiration_ms

The default partition expiration for all partitioned tables in the dataset, in milliseconds.

default_rounding_mode

defaultRoundingMode of the dataset as set by the user (defaults to None).

default_table_expiration_ms

Default expiration time for tables in the dataset (defaults to None).

description

Description of the dataset as set by the user (defaults to None).

etag

ETag for the dataset resource (None until set from the server).

friendly_name

Title of the dataset as set by the user (defaults to None).

full_dataset_id

ID for the dataset resource (None until set from the server).

is_case_insensitive

True if the dataset and its table names are case-insensitive, otherwise False.

labels

Labels for the dataset.

location

Location in which the dataset is hosted as set by the user (defaults to None).

max_time_travel_hours

Defines the time travel window in hours.

modified

Datetime at which the dataset was last modified (None until set from the server).

path

URL path for the dataset based on project and dataset ID.

project

Project ID of the project bound to the dataset.

reference

A reference to this dataset.

self_link

URL for the dataset resource (None until set from the server).

storage_billing_model

StorageBillingModel of the dataset as set by the user (defaults to None).

property access_entries

Dataset’s access entries.

role augments the entity type and must be present unless the entity type is view or routine.

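Raises

TypeError – If the assigned value is not a sequence.

ValueError – If any item in the sequence is not an AccessEntry.

Type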

List[google.cloud.bigquery.dataset.AccessEntry]

property created

Datetime at which the dataset was created (None until set from the server).

Type

Union[datetime.datetime, None]

property dataset_id

Dataset ID.

Type

str

property default_encryption_configuration

Custom encryption configuration for all tables in the dataset.

Custom encryption configuration (e.g., Cloud KMS keys) or None if using default encryption.

See protecting data with Cloud KMS keys in the BigQuery documentation.

Type

google.cloud.bigquery.encryption_configuration.EncryptionConfiguration

property default_partition_expiration_ms

The default partition expiration for all partitioned tables in the dataset, in milliseconds.

Once this property is set, all newly created partitioned tables in the dataset will have a time_partitioning.expiration_ms property set to this value. Changing the value affects only new tables, not existing ones. The storage in a partition will have an expiration time of its partition time plus this value.

Setting this property overrides the use of default_table_expiration_ms for partitioned tables: only one of default_table_expiration_ms and default_partition_expiration_ms will be used for any new partitioned table. If you provide an explicit time_partitioning.expiration_ms when creating or updating a partitioned table, that value takes precedence over the default partition expiration time indicated by this property.

Type

Optional[int]
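
The two rules above (an explicit table setting takes precedence over the dataset default, and partition storage expires at the partition time plus the expiration) can be sketched in plain Python. The helper names effective_partition_expiration_ms and partition_expires_at are hypothetical and not part of the library; the sketch only restates the documented behavior and does not call BigQuery.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_partition_expiration_ms(
    table_expiration_ms: Optional[int],
    dataset_default_ms: Optional[int],
) -> Optional[int]:
    """Return the partition expiration that applies to a new partitioned table.

    Hypothetical helper: an explicit time_partitioning.expiration_ms on the
    table takes precedence over the dataset's default_partition_expiration_ms.
    """
    if table_expiration_ms is not None:
        return table_expiration_ms
    return dataset_default_ms


def partition_expires_at(partition_time: datetime, expiration_ms: int) -> datetime:
    """Storage in a partition expires at its partition time plus the expiration."""
    return partition_time + timedelta(milliseconds=expiration_ms)


seven_days_ms = 7 * 24 * 60 * 60 * 1000  # 604_800_000
chosen = effective_partition_expiration_ms(None, seven_days_ms)
expires = partition_expires_at(datetime(2024, 1, 1, tzinfo=timezone.utc), chosen)
print(expires.isoformat())  # 2024-01-08T00:00:00+00:00
```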

property default_rounding_mode

defaultRoundingMode of the dataset as set by the user (defaults to None).

Set the value to one of 'ROUND_HALF_AWAY_FROM_ZERO', 'ROUND_HALF_EVEN', or 'ROUNDING_MODE_UNSPECIFIED'.

See default rounding mode in REST API docs and updating the default rounding mode guide.

Raises

ValueError – for invalid value types.

Type

Union[str, None]
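
A caller may want to check a value before assigning it. The following is a minimal sketch of such a pre-check, using only the three values listed above. The names ROUNDING_MODES and check_rounding_mode are hypothetical, and the library's own setter may validate differently.

```python
from typing import Optional

# Values listed in the property description above.
ROUNDING_MODES = frozenset(
    {"ROUND_HALF_AWAY_FROM_ZERO", "ROUND_HALF_EVEN", "ROUNDING_MODE_UNSPECIFIED"}
)


def check_rounding_mode(value: Optional[str]) -> Optional[str]:
    """Hypothetical pre-check: accept None or one of the listed strings."""
    if value is not None and not isinstance(value, str):
        raise ValueError("default_rounding_mode must be a str or None")
    if value is not None and value not in ROUNDING_MODES:
        raise ValueError(f"unsupported rounding mode: {value!r}")
    return value


print(check_rounding_mode("ROUND_HALF_EVEN"))  # ROUND_HALF_EVEN
```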

property default_table_expiration_ms

Default expiration time for tables in the dataset (defaults to None).

Raises

ValueError – For invalid value types.

Type

Union[int, None]

property description

Description of the dataset as set by the user (defaults to None).

Raises

ValueError – for invalid value types.

Type

Optional[str]

property etag

ETag for the dataset resource (None until set from the server).

Type

Union[str, None]

property friendly_name

Title of the dataset as set by the user (defaults to None).

Raises

ValueError – for invalid value types.

Type

Union[str, None]

classmethod from_api_repr(resource: dict) → google.cloud.bigquery.dataset.Dataset

Factory: construct a dataset given its API representation.

Parameters

resource (Dict[str, object]) – Dataset resource representation returned from the API.

Returns

Dataset parsed from resource.

Return type

google.cloud.bigquery.dataset.Dataset

classmethod from_string(full_dataset_id: str) → google.cloud.bigquery.dataset.Dataset

Construct a dataset from fully-qualified dataset ID.

Parameters

full_dataset_id (str) – A fully-qualified dataset ID in standard SQL format. Must include both the project ID and the dataset ID, separated by a period (.).

Returns

Dataset parsed from full_dataset_id.

Return type

Dataset

Examples

>>> Dataset.from_string('my-project-id.some_dataset')
Dataset(DatasetReference('my-project-id', 'some_dataset'))

Raises

ValueError – If full_dataset_id is not a fully-qualified dataset ID in standard SQL format.

property full_dataset_id

ID for the dataset resource (None until set from the server).

In the format project_id:dataset_id.

Type

Union[str, None]
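
Note that this colon-separated format differs from the period-separated form that from_string() accepts. A minimal sketch of converting between the two follows. The helper split_full_dataset_id is hypothetical, and splitting on the last colon is an assumption made here so that a project ID that itself contains a colon stays intact.

```python
from typing import Tuple


def split_full_dataset_id(full_dataset_id: str) -> Tuple[str, str]:
    """Split 'project_id:dataset_id' into (project_id, dataset_id).

    Hypothetical helper. The split is on the last colon because, as an
    assumption here, a project ID may itself contain a colon.
    """
    project_id, sep, dataset_id = full_dataset_id.rpartition(":")
    if not sep or not project_id or not dataset_id:
        raise ValueError(f"expected 'project_id:dataset_id', got {full_dataset_id!r}")
    return project_id, dataset_id


project_id, dataset_id = split_full_dataset_id("my-project-id:some_dataset")
dotted = f"{project_id}.{dataset_id}"  # period-separated form
print(dotted)  # my-project-id.some_dataset
```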

property is_case_insensitive

True if the dataset and its table names are case-insensitive, otherwise False. By default, this is False, which means the dataset and its table names are case-sensitive. This field does not affect routine references.

Raises

ValueError – for invalid value types.

Type

Optional[bool]

property labels

Labels for the dataset.

This property always returns a dict. To change a dataset’s labels, modify the dict, then call google.cloud.bigquery.client.Client.update_dataset(). To delete a label, set its value to None before updating.

Raises

ValueError – for invalid value types.

Type

Dict[str, str]
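
The edit-then-update workflow described above can be sketched as follows. The helper stage_label_changes is hypothetical, and a plain dict stands in for dataset.labels so the example runs without credentials. The final update_dataset() call is shown only as a comment.

```python
from typing import Dict, Iterable, Optional


def stage_label_changes(
    labels: Dict[str, Optional[str]],
    updates: Dict[str, str],
    removals: Iterable[str] = (),
) -> Dict[str, Optional[str]]:
    """Edit a labels dict in place: set new values, mark removals with None."""
    labels.update(updates)
    for key in removals:
        labels[key] = None  # None marks the label for deletion on update
    return labels


# Stand-in for dataset.labels; a real Dataset would come from the client.
labels = {"env": "dev", "team": "data"}
stage_label_changes(labels, {"env": "prod"}, removals=["team"])
print(labels)  # {'env': 'prod', 'team': None}

# With a real client (not run here), the edit would then be sent with:
# client.update_dataset(dataset, ["labels"])
```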

property location

Location in which the dataset is hosted as set by the user (defaults to None).

Raises

ValueError – for invalid value types.

Type

Union[str, None]

property max_time_travel_hours

Defines the time travel window in hours. The value can range from 48 to 168 hours (2 to 7 days) and must be a multiple of 24 hours (48, 72, 96, 120, 144, 168). The default value is 168 hours if this is not set.

Type

Optional[int]
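
The constraint on the window can be expressed as a one-line check. The function is_valid_time_travel_window is a hypothetical helper that restates the rule above; it is not part of the library.

```python
def is_valid_time_travel_window(hours: int) -> bool:
    """True if hours is a multiple of 24 between 48 and 168, inclusive."""
    return 48 <= hours <= 168 and hours % 24 == 0


print([h for h in range(0, 200) if is_valid_time_travel_window(h)])
# [48, 72, 96, 120, 144, 168]
```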

model(model_id)

Constructs a ModelReference.

Parameters

model_id (str) – the ID of the model.

Returns

A ModelReference for a model in this dataset.

Return type

google.cloud.bigquery.model.ModelReference

property modified

Datetime at which the dataset was last modified (None until set from the server).

Type

Union[datetime.datetime, None]

property path

URL path for the dataset based on project and dataset ID.

Type

str
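
As an illustration, the following sketch builds such a path, assuming it follows the REST resource layout /projects/<project>/datasets/<dataset_id>. The function dataset_path is hypothetical and stands in for the property, which needs no arguments on a real Dataset.

```python
def dataset_path(project: str, dataset_id: str) -> str:
    """Build a dataset URL path, assuming the REST resource layout."""
    return f"/projects/{project}/datasets/{dataset_id}"


print(dataset_path("my-project-id", "some_dataset"))
# /projects/my-project-id/datasets/some_dataset
```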

property project

Project ID of the project bound to the dataset.

Type

str

property reference

A reference to this dataset.

Type

google.cloud.bigquery.dataset.DatasetReference

routine(routine_id)

Constructs a RoutineReference.

Parameters

routine_id (str) – the ID of the routine.

Returns

A RoutineReference for a routine in this dataset.

Return type

google.cloud.bigquery.routine.RoutineReference

property self_link

URL for the dataset resource (None until set from the server).

Type

Union[str, None]

property storage_billing_model

StorageBillingModel of the dataset as set by the user (defaults to None).

Set the value to one of 'LOGICAL', 'PHYSICAL', or 'STORAGE_BILLING_MODEL_UNSPECIFIED'. This change takes 24 hours to take effect and you must wait 14 days before you can change the storage billing model again.

See storage billing model in REST API docs and updating the storage billing model guide.

Raises

ValueError – for invalid value types.

Type

Union[str, None]
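
The timing constraints above can be turned into concrete dates with a short calculation. This sketch assumes both the 24-hour delay and the 14-day wait are counted from the moment of the change, which the text does not state explicitly. The name billing_model_timeline is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EFFECT_DELAY = timedelta(hours=24)    # change takes 24 hours to take effect
CHANGE_COOLDOWN = timedelta(days=14)  # wait before the next change


def billing_model_timeline(changed_at: datetime) -> dict:
    """Assumption: both intervals are measured from the time of the change."""
    return {
        "takes_effect_at": changed_at + EFFECT_DELAY,
        "next_change_allowed_at": changed_at + CHANGE_COOLDOWN,
    }


timeline = billing_model_timeline(datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc))
print(timeline["takes_effect_at"].isoformat())         # 2024-03-02T12:00:00+00:00
print(timeline["next_change_allowed_at"].isoformat())  # 2024-03-15T12:00:00+00:00
```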

table(table_id: str) → google.cloud.bigquery.table.TableReference

Constructs a TableReference.

Parameters

table_id (str) – The ID of the table.

Returns

A table reference for a table in this dataset.

Return type

google.cloud.bigquery.table.TableReference

to_api_repr() → dict

Construct the API resource representation of this dataset.

Returns

The dataset represented as an API resource.

Return type

Dict[str, object]