As of January 1, 2020, the latest released version of this library no longer supports Python 2. Library versions released prior to that date will continue to be available. For more information, please visit Python 2 support on Google Cloud.

google.cloud.bigquery.table.RowIterator

class google.cloud.bigquery.table.RowIterator(client, api_request, path, schema, page_token=None, max_results=None, page_size=None, extra_params=None, table=None, selected_fields=None, total_rows=None, first_page_response=None)

A class for iterating through HTTP/JSON API row list responses.

Parameters

client (google.cloud.bigquery.Client) – The API client.
api_request (Callable[google.cloud._http.JSONConnection.api_request]) – The function to use to make API requests.
path (str) – The method path to query for the list of items.
schema (Sequence[Union[ SchemaField, Mapping[str, Any] ]]) – The table’s schema. If any item is a mapping, its content must be compatible with from_api_repr().
page_token (str) – A token identifying a page in a result set to start fetching results from.
max_results (Optional[int]) – The maximum number of results to fetch.
page_size (Optional[int]) – The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.
extra_params (Optional[Dict[str, object]]) – Extra query string parameters for the API call.
table (Optional[Union[ google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference, ]]) – The table which these rows belong to, or a reference to it. Used to call the BigQuery Storage API to fetch rows.
selected_fields (Optional[Sequence[google.cloud.bigquery.schema.SchemaField]]) – A subset of columns to select from this table.
total_rows (Optional[int]) – Total number of rows in the table.
first_page_response (Optional[dict]) – API response for the first page of results. If supplied, this response is used when the first page is requested.
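
In application code a RowIterator is normally obtained from a client call rather than constructed directly. The sketch below assumes Client.list_rows() as the entry point and forwards max_results and page_size to it. The helper name fetch_rows_as_dicts and its signature are illustrative and not part of the library.

```python
def fetch_rows_as_dicts(client, table_id, max_results=None, page_size=None):
    """Return the rows of ``table_id`` as plain dictionaries.

    ``client`` is expected to be a google.cloud.bigquery.Client. The helper
    name and its signature are illustrative, not part of the library.
    """
    # list_rows() returns a RowIterator; iterating it fetches result pages
    # as they are needed.
    row_iterator = client.list_rows(
        table_id, max_results=max_results, page_size=page_size
    )
    # Each item is a Row, which exposes items() like a mapping.
    return [dict(row.items()) for row in row_iterator]
```

With credentials configured, a call might look like fetch_rows_as_dicts(bigquery.Client(), "my-project.my_dataset.my_table", max_results=10), where the table ID is a placeholder.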

__init__(client, api_request, path, schema, page_token=None, max_results=None, page_size=None, extra_params=None, table=None, selected_fields=None, total_rows=None, first_page_response=None)

Initialize self. See help(type(self)) for accurate signature.

Methods

`__init__`(client, api_request, path, schema)	Initialize self.
`to_arrow`([progress_bar_type, …])	[Beta] Create a pyarrow.Table by loading all pages of a table or query.
`to_dataframe`([bqstorage_client, dtypes, …])	Create a pandas DataFrame by loading all pages of a query.
`to_dataframe_iterable`([bqstorage_client, dtypes])	Create an iterable of pandas DataFrames, to process the table as a stream.

Attributes

`pages`	Iterator of pages in the response.
`schema`	The subset of columns to be read from the table.
`total_rows`	The total number of rows in the table.

__iter__()

Iterator for each item returned.

Returns: A generator of items from the API.
Return type: types.GeneratorType[Any]
Raises: ValueError – If the iterator has already been started.
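
Because iteration can only be started once, rows that are needed more than once have to be collected on the first pass. A minimal sketch of that pattern, using a hypothetical helper name:

```python
def materialize_rows(row_iterator):
    """Consume a RowIterator once and return its rows as a list."""
    # Starting a second iteration over the same RowIterator raises
    # ValueError, so collect the rows here and reuse the returned list.
    return list(row_iterator)
```

The returned list can then be traversed as many times as needed. For large results this holds every row in memory, so it is only a reasonable choice when the result set is small.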

property pages

Iterator of pages in the response.

Returns

A generator of page instances.

Return type

types.GeneratorType[google.api_core.page_iterator.Page]

Raises

ValueError – If the iterator has already been started.
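
Page-level iteration is useful when rows should be handled in batches. The sketch below assumes each page is a google.api_core.page_iterator.Page, which reports its size through num_items. The helper name rows_per_page is illustrative.

```python
def rows_per_page(row_iterator):
    """Return the number of rows in each page of a RowIterator."""
    counts = []
    # ``pages`` yields Page objects; each page is itself iterable over rows
    # and exposes the number of rows it holds as ``num_items``.
    for page in row_iterator.pages:
        counts.append(page.num_items)
    return counts
```

Like row-level iteration, reading pages consumes the iterator, so pages and __iter__() cannot both be used on the same instance.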

property schema

The subset of columns to be read from the table.

Type: List[google.cloud.bigquery.schema.SchemaField]
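
The schema can be inspected before any rows are read, for example to list the columns a result will contain. A small sketch, assuming each entry is a SchemaField with name, field_type, and mode attributes; the helper name is hypothetical.

```python
def describe_columns(row_iterator):
    """Return (name, type, mode) tuples for the columns of a RowIterator."""
    # Reading ``schema`` does not start iteration over the rows.
    return [
        (field.name, field.field_type, field.mode)
        for field in row_iterator.schema
    ]
```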

to_arrow(progress_bar_type=None, bqstorage_client=None, create_bqstorage_client=True)

[Beta] Create a pyarrow.Table by loading all pages of a table or query.

Parameters

progress_bar_type (Optional[str]) –
If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

Possible values of progress_bar_type include:

None
No progress bar.

'tqdm'
Use the tqdm.tqdm() function to print a progress bar to sys.stderr.

'tqdm_notebook'
Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.

'tqdm_gui'
Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.

This method requires the pyarrow and google-cloud-bigquery-storage libraries.

This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots) use the Storage API directly.
create_bqstorage_client (Optional[bool]) –
If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information.

This argument does nothing if bqstorage_client is supplied.

New in version 1.24.0.

Returns

A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

pyarrow.Table

Raises

ValueError – If the pyarrow library cannot be imported.

New in version 1.17.0.
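
The parameters above can be combined as in the following sketch. It passes create_bqstorage_client=False so that no BigQuery Storage API client is created implicitly, and it enables the tqdm progress bar on request. The helper name download_as_arrow is illustrative, and the pyarrow package (plus tqdm, when a progress bar is requested) is assumed to be installed.

```python
def download_as_arrow(row_iterator, show_progress=False):
    """Load all pages of a RowIterator into a pyarrow.Table."""
    # None disables the progress bar; "tqdm" prints one to sys.stderr.
    progress_bar_type = "tqdm" if show_progress else None
    return row_iterator.to_arrow(
        progress_bar_type=progress_bar_type,
        # Do not create a BigQuery Storage API client implicitly.
        create_bqstorage_client=False,
    )
```

To use the Storage API instead, a caller could pass an existing BigQueryReadClient through bqstorage_client, in which case create_bqstorage_client has no effect.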

to_dataframe(bqstorage_client=None, dtypes=None, progress_bar_type=None, create_bqstorage_client=True, date_as_object=True)

Create a pandas DataFrame by loading all pages of a query.

Parameters

bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery.

This method requires the pyarrow and google-cloud-bigquery-storage libraries.

This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots) use the Storage API directly.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the specified column. For all other columns, the default pandas behavior is used.
progress_bar_type (Optional[str]) –
If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature.

Possible values of progress_bar_type include:

None
No progress bar.

'tqdm'
Use the tqdm.tqdm() function to print a progress bar to sys.stderr.

'tqdm_notebook'
Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.

'tqdm_gui'
Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.

New in version 1.11.0.
create_bqstorage_client (Optional[bool]) –
If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information.

This argument does nothing if bqstorage_client is supplied.

New in version 1.24.0.
date_as_object (Optional[bool]) –
If True (default), cast dates to objects. If False, convert to datetime64[ns] dtype.

New in version 1.26.0.

Returns

A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.

Return type

pandas.DataFrame

Raises

ValueError – If the pandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported.
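
As an illustration of the dtypes parameter, the sketch below builds the column-to-dtype mapping from a list of column names and leaves every other column to the default pandas behavior. The helper name and the choice of "float32" are assumptions made for the example; pandas must be installed for the call to succeed.

```python
def download_as_dataframe(row_iterator, float32_columns=()):
    """Load all pages of a RowIterator into a pandas DataFrame.

    Columns named in ``float32_columns`` are built with the "float32"
    dtype; all other columns use the default pandas behavior.
    """
    # Map each selected column name to the dtype used for its series.
    dtypes = {name: "float32" for name in float32_columns}
    return row_iterator.to_dataframe(dtypes=dtypes)
```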

to_dataframe_iterable(bqstorage_client=None, dtypes=None)

Create an iterable of pandas DataFrames, to process the table as a stream.

Parameters

bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) –
A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery.

This method requires the pyarrow and google-cloud-bigquery-storage libraries.

This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots) use the Storage API directly.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the specified column. For all other columns, the default pandas behavior is used.

Returns

A generator of pandas DataFrames.

Return type

Iterable[pandas.DataFrame]

Raises

ValueError – If the pandas library cannot be imported.
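
Streaming is useful when the full result does not fit in memory. The following sketch processes one DataFrame at a time and keeps only a running total; the helper name count_rows_streaming is illustrative.

```python
def count_rows_streaming(row_iterator):
    """Count rows by streaming DataFrames instead of loading them all."""
    total = 0
    # Each item is a DataFrame holding one chunk of the result; only one
    # chunk is referenced at a time, so earlier chunks can be freed.
    for frame in row_iterator.to_dataframe_iterable():
        total += len(frame)
    return total
```

The same loop shape applies to any per-chunk work, such as writing each DataFrame to a file before moving on to the next one.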

property total_rows

The total number of rows in the table.

Type: int
