google.cloud.bigquery.table.RowIterator
-
class google.cloud.bigquery.table.RowIterator(client, api_request, path, schema, page_token=None, max_results=None, page_size=None, extra_params=None, table=None, selected_fields=None, total_rows=None, first_page_response=None)
A class for iterating through HTTP/JSON API row list responses.
- Parameters
client (google.cloud.bigquery.Client) – The API client.
api_request (Callable[google.cloud._http.JSONConnection.api_request]) – The function to use to make API requests.
path (str) – The method path to query for the list of items.
schema (Sequence[Union[SchemaField, Mapping[str, Any]]]) – The table’s schema. If any item is a mapping, its content must be compatible with from_api_repr().
page_token (str) – A token identifying a page in a result set to start fetching results from.
max_results (Optional[int]) – The maximum number of results to fetch.
page_size (Optional[int]) – The maximum number of rows in each page of results from this request. Non-positive values are ignored. Defaults to a sensible value set by the API.
extra_params (Optional[Dict[str, object]]) – Extra query string parameters for the API call.
table (Optional[Union[google.cloud.bigquery.table.Table, google.cloud.bigquery.table.TableReference]]) – The table which these rows belong to, or a reference to it. Used to call the BigQuery Storage API to fetch rows.
selected_fields (Optional[Sequence[google.cloud.bigquery.schema.SchemaField]]) – A subset of columns to select from this table.
total_rows (Optional[int]) – Total number of rows in the table.
first_page_response (Optional[dict]) – API response for the first page of results. If supplied, it is used when the first page is requested.
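In practice a RowIterator is rarely constructed by hand; it is what Client.list_rows() and QueryJob.result() return. The following is a minimal sketch of obtaining one, assuming an already configured google.cloud.bigquery.Client. The helper name preview_rows and the table ID in the usage comment are hypothetical.

```python
def preview_rows(client, table_id, limit=10, page_size=None):
    """Return up to `limit` rows of a table as a list.

    `client` is assumed to be a google.cloud.bigquery.Client. The keyword
    arguments are forwarded to Client.list_rows(), which builds the
    RowIterator and hands max_results and page_size on to it.
    """
    row_iterator = client.list_rows(
        table_id, max_results=limit, page_size=page_size
    )
    # Iterating the RowIterator fetches pages lazily and yields Row objects.
    return list(row_iterator)


# Hypothetical usage (requires credentials and network access):
# from google.cloud import bigquery
# rows = preview_rows(bigquery.Client(), "my-project.my_dataset.my_table")
```

Leaving page_size as None lets the API choose the page size, as described for the page_size parameter above.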
-
__init__(client, api_request, path, schema, page_token=None, max_results=None, page_size=None, extra_params=None, table=None, selected_fields=None, total_rows=None, first_page_response=None)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(client, api_request, path, schema) – Initialize self.
to_arrow([progress_bar_type, …]) – [Beta] Create a pyarrow.Table by loading all pages of a table or query.
to_dataframe([bqstorage_client, dtypes, …]) – Create a pandas DataFrame by loading all pages of a query.
to_dataframe_iterable([bqstorage_client, dtypes]) – Create an iterable of pandas DataFrames, to process the table as a stream.
Attributes
pages – Iterator of pages in the response.
schema – The subset of columns to be read from the table.
total_rows – The total number of rows in the table.
-
__iter__()
Iterator for each item returned.
- Returns
A generator of items from the API.
- Return type
types.GeneratorType[Any]
- Raises
ValueError – If the iterator has already been started.
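Because iteration can be started only once, a caller that needs the rows more than once has to keep them. A minimal sketch, assuming each item is a google.cloud.bigquery.table.Row (which exposes items() like a mapping); the helper name rows_as_dicts is hypothetical.

```python
def rows_as_dicts(row_iterator):
    """Consume a RowIterator once and return its rows as plain dicts."""
    # Each item is assumed to be a google.cloud.bigquery.table.Row, which
    # exposes items() yielding (column name, value) pairs.
    return [dict(row.items()) for row in row_iterator]


# Hypothetical usage:
# records = rows_as_dicts(client.query("SELECT 1 AS n").result())
# Iterating the same RowIterator a second time would raise ValueError,
# so reuse `records` instead.
```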
-
property pages
Iterator of pages in the response.
- Returns
A generator of page instances.
- Return type
types.GeneratorType[google.api_core.page_iterator.Page]
- Raises
ValueError – If the iterator has already been started.
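Walking the result set page by page is useful when work should be batched per API response. A minimal sketch, assuming each page is a google.api_core.page_iterator.Page and therefore iterable; the helper name rows_per_page is hypothetical.

```python
def rows_per_page(row_iterator):
    """Return the number of rows in each page, in fetch order."""
    counts = []
    for page in row_iterator.pages:
        # Each page is itself iterable and yields the rows it holds.
        counts.append(sum(1 for _row in page))
    return counts
```

As with __iter__, pages can be consumed only once, so this helper should not be combined with a plain for loop over the same iterator.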
-
property schema
The subset of columns to be read from the table.
- Type
-
to_arrow(progress_bar_type=None, bqstorage_client=None, create_bqstorage_client=True)
[Beta] Create a pyarrow.Table by loading all pages of a table or query.
- Parameters
progress_bar_type (Optional[str]) – If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature. Possible values of progress_bar_type include:
None – No progress bar.
'tqdm' – Use the tqdm.tqdm() function to print a progress bar to sys.stderr.
'tqdm_notebook' – Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.
'tqdm_gui' – Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery. This API is a billable API.
This method requires the pyarrow and google-cloud-bigquery-storage libraries.
This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots), use the Storage API directly.
create_bqstorage_client (Optional[bool]) – If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied.
New in version 1.24.0.
- Returns
A pyarrow.Table populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.
- Return type
pyarrow.Table
- Raises
ValueError – If the pyarrow library cannot be imported.
New in version 1.17.0.
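The parameters above combine as in the following minimal sketch, which assumes pyarrow is installed and that tqdm is available when a progress bar is requested. The helper name download_as_arrow and its two flags are hypothetical conveniences, not part of the library.

```python
def download_as_arrow(row_iterator, show_progress=False, use_storage_api=True):
    """Load all pages into a pyarrow.Table via RowIterator.to_arrow()."""
    return row_iterator.to_arrow(
        # 'tqdm' prints the bar to sys.stderr; None disables it.
        progress_bar_type="tqdm" if show_progress else None,
        # When True, a default BigQuery Storage API client is created;
        # that API is billable, so pass False to stay on the REST API.
        create_bqstorage_client=use_storage_api,
    )


# Hypothetical usage:
# table = download_as_arrow(client.list_rows("my-project.my_dataset.my_table"))
```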
-
to_dataframe(bqstorage_client=None, dtypes=None, progress_bar_type=None, create_bqstorage_client=True, date_as_object=True)
Create a pandas DataFrame by loading all pages of a query.
- Parameters
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery.
This method requires the pyarrow and google-cloud-bigquery-storage libraries.
This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots), use the Storage API directly.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
progress_bar_type (Optional[str]) – If set, use the tqdm library to display a progress bar while the data downloads. Install the tqdm package to use this feature. Possible values of progress_bar_type include:
None – No progress bar.
'tqdm' – Use the tqdm.tqdm() function to print a progress bar to sys.stderr.
'tqdm_notebook' – Use the tqdm.tqdm_notebook() function to display a progress bar as a Jupyter notebook widget.
'tqdm_gui' – Use the tqdm.tqdm_gui() function to display a progress bar as a graphical dialog box.
New in version 1.11.0.
create_bqstorage_client (Optional[bool]) – If True (default), create a BigQuery Storage API client using the default API settings. The BigQuery Storage API is a faster way to fetch rows from BigQuery. See the bqstorage_client parameter for more information. This argument does nothing if bqstorage_client is supplied.
New in version 1.24.0.
date_as_object (Optional[bool]) – If True (default), cast dates to objects. If False, convert to datetime64[ns] dtype.
New in version 1.26.0.
- Returns
A DataFrame populated with row data and column headers from the query results. The column headers are derived from the destination table’s schema.
- Return type
pandas.DataFrame
- Raises
ValueError – If the pandas library cannot be imported, or the google.cloud.bigquery_storage_v1 module is required but cannot be imported.
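The dtypes and date_as_object parameters can be combined as in this minimal sketch, which assumes pandas is installed. The helper name download_as_dataframe, its float32_columns argument, and the column name in the usage comment are hypothetical.

```python
def download_as_dataframe(row_iterator, float32_columns=()):
    """Load all pages into a pandas DataFrame via RowIterator.to_dataframe()."""
    # Map each named column to an explicit dtype; columns not listed here
    # keep the default pandas behavior.
    dtypes = {name: "float32" for name in float32_columns}
    return row_iterator.to_dataframe(
        dtypes=dtypes,
        # False asks for datetime64[ns] columns instead of date objects.
        date_as_object=False,
    )


# Hypothetical usage:
# df = download_as_dataframe(query_job.result(), float32_columns=["score"])
```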
-
to_dataframe_iterable(bqstorage_client=None, dtypes=None)
Create an iterable of pandas DataFrames, to process the table as a stream.
- Parameters
bqstorage_client (Optional[google.cloud.bigquery_storage_v1.BigQueryReadClient]) – A BigQuery Storage API client. If supplied, use the faster BigQuery Storage API to fetch rows from BigQuery.
This method requires the pyarrow and google-cloud-bigquery-storage libraries.
This method only exposes a subset of the capabilities of the BigQuery Storage API. For full access to all features (projections, filters, snapshots), use the Storage API directly.
dtypes (Optional[Map[str, Union[str, pandas.Series.dtype]]]) – A dictionary mapping column names to pandas dtypes. The provided dtype is used when constructing the series for the column specified. Otherwise, the default pandas behavior is used.
- Returns
A generator of DataFrame objects.
- Return type
Iterable[pandas.DataFrame]
- Raises
ValueError – If the pandas library cannot be imported.
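Streaming keeps only one chunk in memory at a time, which matters for tables too large for a single DataFrame. A minimal sketch, assuming pandas is installed; the helper name count_rows_streaming is hypothetical, and counting rows stands in for whatever per-chunk processing is actually needed.

```python
def count_rows_streaming(row_iterator):
    """Count rows by consuming to_dataframe_iterable() one frame at a time."""
    total = 0
    for frame in row_iterator.to_dataframe_iterable():
        # Each frame holds one chunk of the result; it can be discarded
        # once processed, so memory use stays bounded by the chunk size.
        total += len(frame)
    return total
```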