Changelog¶
0.24.0 (2024-10-14)¶
⚠ BREAKING CHANGES¶
to_gbq loads naive (no timezone) columns to BigQuery DATETIME instead of TIMESTAMP (#814)
to_gbq loads object column containing bool values to BOOLEAN instead of STRING (#814)
to_gbq loads object column containing dictionary values to STRUCT instead of STRING (#814)
to_gbq loads uint8 columns to BigQuery INT64 instead of STRING (#814)
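The DATETIME versus TIMESTAMP entry above turns on whether a value carries a timezone. The sketch below uses only the standard library to show that distinction; the helper name is_naive is hypothetical and is not part of pandas-gbq. The assumption, based on the entry above, is that values made timezone-aware before upload are the ones that still map to TIMESTAMP.

```python
from datetime import datetime, timezone


def is_naive(value: datetime) -> bool:
    """Return True when the datetime carries no usable UTC offset."""
    # Hypothetical helper for illustration; this follows the definition of
    # "naive" used in the Python datetime documentation.
    return value.tzinfo is None or value.tzinfo.utcoffset(value) is None


naive = datetime(2024, 10, 14, 12, 0, 0)       # no timezone attached
aware = naive.replace(tzinfo=timezone.utc)     # same wall time, marked as UTC

print(is_naive(naive))  # True
print(is_naive(aware))  # False
```

Attaching an explicit timezone before calling to_gbq is one way to keep the intent of a column unambiguous across this release boundary.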
Features¶
Bug Fixes¶
to_gbq loads uint8 columns to BigQuery INT64 instead of STRING (#814) (107bb40)
to_gbq loads naive (no timezone) columns to BigQuery DATETIME instead of TIMESTAMP (#814) (107bb40)
to_gbq loads object column containing bool values to BOOLEAN instead of STRING (#814) (107bb40)
to_gbq loads object column containing dictionary values to STRUCT instead of STRING (#814) (107bb40)
Dependencies¶
0.23.2 (2024-09-20)¶
Bug Fixes¶
Documentation¶
0.23.1 (2024-06-07)¶
Bug Fixes¶
Documentation¶
0.23.0 (2024-05-20)¶
Features¶
0.22.0 (2024-03-05)¶
Features¶
Bug Fixes¶
0.21.0 (2024-01-25)¶
Features¶
Bug Fixes¶
0.20.0 (2023-12-10)¶
Features¶
Bug Fixes¶
Documentation¶
0.19.2 (2023-05-10)¶
Bug Fixes¶
Documentation¶
0.19.1 (2023-01-25)¶
Documentation¶
Updates the user instructions re OAuth (0c2b716)
0.19.0 (2023-01-11)¶
Features¶
0.18.1 (2022-11-28)¶
Dependencies¶
0.18.0 (2022-11-19)¶
Features¶
0.17.9 (2022-09-27)¶
Bug Fixes¶
0.17.8 (2022-08-09)¶
Bug Fixes¶
0.17.7 (2022-07-11)¶
Bug Fixes¶
0.17.6 (2022-06-03)¶
Documentation¶
0.17.5 (2022-05-09)¶
Bug Fixes¶
0.17.4 (2022-03-14)¶
Bug Fixes¶
0.17.3 (2022-03-05)¶
Bug Fixes¶
0.17.2 (2022-03-02)¶
Dependencies¶
0.17.1 (2022-02-24)¶
Bug Fixes¶
Documentation¶
0.17.0 (2022-01-19)¶
⚠ BREAKING CHANGES¶
Features¶
Bug Fixes¶
read_gbq supports extreme DATETIME values such as 0001-01-01 00:00:00 (#444) (d120f8f)
to_gbq allows strings for DATE and floats for NUMERIC with api_method="load_parquet" (#423) (2180836)
allow extreme DATE values such as datetime.date(1, 1, 1) in load_gbq (#442) (e13abaf)
avoid iteritems deprecation in pandas prerelease (#469) (7379cdc)
Miscellaneous Chores¶
0.16.0 (2021-11-08)¶
Features¶
Miscellaneous Chores¶
Documentation¶
0.15.0 / 2021-03-30¶
Features¶
Bug fixes¶
Dependencies¶
0.14.1 / 2020-11-10¶
Bug fixes¶
0.14.0 / 2020-10-05¶
Add dtypes argument to read_gbq. Use this argument to override the default dtype for a particular column in the query results. For example, this can be used to select nullable integer columns as the Int64 nullable integer pandas extension type. (#242, #332)
from pandas_gbq import gbq

df = gbq.read_gbq(
    "SELECT CAST(NULL AS INT64) AS null_integer",
    dtypes={"null_integer": "Int64"},
)
Dependency updates¶
Internal changes¶
Update tests to run against Python 3.8. (#331)
0.13.3 / 2020-09-30¶
0.13.2 / 2020-05-14¶
Fix "Provided Schema does not match Table" error when the existing table contains required fields. (#315)
0.13.1 / 2020-02-13¶
Fix AttributeError when using the BQ Storage API to download empty results. (#299)
0.13.0 / 2019-12-12¶
Raise NotImplementedError when the deprecated private_key argument is used. (#301)
0.12.0 / 2019-11-25¶
New features¶
Add max_results argument to pandas_gbq.read_gbq(). Use this argument to limit the number of rows in the results DataFrame. Set max_results to 0 to ignore query outputs, such as for DML or DDL queries. (#102)
Add progress_bar_type argument to pandas_gbq.read_gbq(). Use this argument to display a progress bar when downloading data. (#182)
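As a rough sketch of the two arguments described above, the functions below wrap read_gbq. The function names run_ddl and preview, and the optional read_gbq parameter (which lets the sketch be exercised without a BigQuery connection), are illustrative assumptions and not part of the library. The "tqdm" value assumes the optional tqdm package is installed.

```python
def run_ddl(sql, project_id, read_gbq=None):
    """Run a DDL or DML statement without downloading any result rows."""
    if read_gbq is None:
        import pandas_gbq  # imported lazily so the sketch loads without the package

        read_gbq = pandas_gbq.read_gbq
    # max_results=0 tells read_gbq to ignore the query output.
    return read_gbq(sql, project_id=project_id, max_results=0)


def preview(sql, project_id, rows=100, read_gbq=None):
    """Download at most `rows` rows and show a progress bar while doing so."""
    if read_gbq is None:
        import pandas_gbq

        read_gbq = pandas_gbq.read_gbq
    return read_gbq(
        sql,
        project_id=project_id,
        max_results=rows,
        progress_bar_type="tqdm",
    )
```

A call such as preview("SELECT * FROM my_dataset.my_table", "my-project", rows=10) would then fetch only the first ten rows; the dataset, table, and project names here are placeholders.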
Bug fixes¶
Fix resource leak with use_bqstorage_api by closing BigQuery Storage API client after use. (#294)
Dependency updates¶
Update the minimum version of google-cloud-bigquery to 1.11.1. (#296)
Documentation¶
Add code samples to introduction and refactor howto guides. (#239)
0.11.0 / 2019-07-29¶
Breaking Change: Python 2 support has been dropped. This is to align with the pandas package which dropped Python 2 support at the end of 2019. (#268)
Enhancements¶
Ensure table_schema argument is not modified in place. (#278)
Implementation changes¶
Use object dtype for STRING, ARRAY, and STRUCT columns when there are zero rows. (#285)
Internal changes¶
0.10.0 / 2019-04-05¶
Breaking Change: Default SQL dialect is now standard. Use pandas_gbq.context.dialect to override the default value. (#195, #245)
Documentation¶
Document BigQuery data type to pandas dtype conversion for read_gbq. (#269)
Dependency updates¶
Internal changes¶
Update the authentication credentials. Note: You may need to set reauth=True in order to update your credentials to the most recent version. This is required to use new functionality such as the BigQuery Storage API. (#267)
Use to_dataframe() from google-cloud-bigquery in the read_gbq() function. (#247)
Enhancements¶
Fix a bug where pandas-gbq could not upload an empty DataFrame. (#237)
Allow table_schema in to_gbq to contain only a subset of columns, with the rest being populated using the DataFrame dtypes (#218) (contributed by @johnpaton)
Read project_id in to_gbq from provided credentials if available (contributed by @daureg)
read_gbq uses the timezone-aware DatetimeTZDtype(unit='ns', tz='UTC') dtype for BigQuery TIMESTAMP columns. (#269)
Add use_bqstorage_api to read_gbq. The BigQuery Storage API can be used to download large query results (>125 MB) more quickly. If the BQ Storage API can’t be used, the BigQuery API is used instead. (#133, #270)
0.9.0 / 2019-01-11¶
0.8.0 / 2018-11-12¶
Breaking changes¶
Deprecate private_key parameter to pandas_gbq.read_gbq and pandas_gbq.to_gbq in favor of new credentials argument. Instead, create a credentials object using google.oauth2.service_account.Credentials.from_service_account_info or google.oauth2.service_account.Credentials.from_service_account_file. See the authentication how-to guide for examples. (#161, #231)
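The migration away from private_key described above might look roughly like the following. The function name read_with_service_account and its parameters are hypothetical; the google-auth and pandas-gbq calls are the ones named in the entry. Imports are deferred so the sketch can be loaded without either package installed.

```python
def read_with_service_account(sql, project_id, key_path):
    """Query BigQuery using a service account key file instead of private_key."""
    # Deferred imports: both packages are needed only when the function runs.
    from google.oauth2 import service_account
    import pandas_gbq

    # Build a credentials object from the JSON key file on disk.
    credentials = service_account.Credentials.from_service_account_file(key_path)

    # Pass the object through the credentials argument.
    return pandas_gbq.read_gbq(sql, project_id=project_id, credentials=credentials)
```

When the key material is already loaded as a dictionary, from_service_account_info plays the same role as from_service_account_file.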
Enhancements¶
Internal changes¶
0.7.0 / 2018-10-19¶
int columns which contain NULL are now cast to float, rather than object type. (#174)
DATE, DATETIME and TIMESTAMP columns are now parsed as pandas’ timestamp objects (#224)
Add pandas_gbq.Context to cache credentials in-memory, across calls to read_gbq and to_gbq. (#198, #208)
Fast queries now do not log above DEBUG level. (#204) With BigQuery’s release of clustering, querying smaller samples of data is now faster and cheaper.
Don’t load credentials from disk if reauth is True. (#212) This fixes a bug where pandas-gbq could not refresh credentials if the cached credentials were invalid, revoked, or expired, even when reauth=True.
Catch RefreshError when trying credentials. (#226)
Internal changes¶
0.6.1 / 2018-09-11¶
Improved read_gbq performance and memory consumption by delegating DataFrame construction to the Pandas library, radically reducing the number of loops that execute in python (#128)
Reduced verbosity of logging from read_gbq, particularly for short queries. (#201)
Avoid SELECT 1 query when running to_gbq. (#202)
0.6.0 / 2018-08-15¶
0.5.0 / 2018-06-15¶
Project ID parameter is optional in read_gbq and to_gbq when it can be inferred from the environment. Note: you must still pass in a project ID when using user-based authentication. (#103)
Progress bar added for to_gbq, through an optional library tqdm as dependency. (#162)
Add location parameter to read_gbq and to_gbq so that pandas-gbq can work with datasets in the Tokyo region. (#177)
Documentation¶
Internal changes¶
0.4.1 / 2018-04-05¶
Only show verbose deprecation warning if Pandas version does not populate it. (#157)
0.4.0 / 2018-04-03¶
Fix bug in read_gbq when building a dataframe with integer columns on Windows. Explicitly use 64bit integers when converting from BQ types. (#119)
Fix bug in read_gbq when querying for an array of floats (#123)
Fix bug in read_gbq with configuration argument. Updates read_gbq to account for breaking change in the way google-cloud-python version 0.32.0+ handles query configuration API representation. (#152)
Fix bug in to_gbq where seconds were discarded in timestamp columns. (#148)
Fix bug in to_gbq when supplying a user-defined schema (#150)
Deprecate the verbose parameter in read_gbq and to_gbq. Messages use the logging module instead of printing progress directly to standard output. (#12)
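Since progress messages go through the logging module rather than verbose, they can be surfaced by configuring a logger. A minimal sketch, assuming the library's loggers live under the "pandas_gbq" name:

```python
import logging
import sys

# Assumption: pandas-gbq emits its messages under the "pandas_gbq" logger name.
logger = logging.getLogger("pandas_gbq")
logger.setLevel(logging.DEBUG)

# Send the messages to standard output, where verbose output used to appear.
handler = logging.StreamHandler(stream=sys.stdout)
logger.addHandler(handler)
```

Raising the level to logging.WARNING would silence routine progress messages again.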
0.3.1 / 2018-02-13¶
Fix an issue where Unicode couldn’t be uploaded in Python 2 (#106)
Add support for a passed schema in to_gbq instead of inferring the schema from the passed DataFrame with DataFrame.dtypes (#46, https://github.com/googleapis/python-bigquery-pandas/issues/46)
Fix an issue where a dataframe containing both integer and floating point columns could not be uploaded with to_gbq (#116)
to_gbq now uses to_csv to avoid manually looping over rows in a dataframe (should result in faster table uploads) (#96)
0.3.0 / 2018-01-03¶
Use the google-cloud-bigquery library for API calls. The google-cloud-bigquery package is a new dependency, and dependencies on google-api-python-client and httplib2 are removed. See the installation guide for more details. (#93)
Structs and arrays are now named properly (#23) and BigQuery functions like array_agg no longer run into errors during type conversion (#22).
to_gbq now uses a load job instead of the streaming API. Remove StreamingInsertError class, as it is no longer used by to_gbq. (#7, #75)
0.2.1 / 2017-11-27¶
read_gbq now raises QueryTimeout if the request exceeds the query.timeoutMs value specified in the BigQuery configuration. (#76)
Environment variable PANDAS_GBQ_CREDENTIALS_FILE can now be used to override the default location where the BigQuery user account credentials are stored. (#86)
BigQuery user account credentials are now stored in an application-specific hidden user folder on the operating system. (#41)
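The environment variable described above could be set from Python before any credentials are requested. This is a sketch for the release described here; the path is a hypothetical example, and later releases may handle the credentials cache differently.

```python
import os

# Hypothetical location chosen for illustration only.
credentials_path = os.path.join(
    os.path.expanduser("~"), ".config", "my_app", "bigquery_credentials.json"
)

# Set the variable before pandas-gbq looks up user account credentials.
os.environ["PANDAS_GBQ_CREDENTIALS_FILE"] = credentials_path
```

Setting the variable in the shell or service configuration has the same effect and avoids ordering concerns inside the program.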
0.2.0 / 2017-07-24¶
Drop support for Python 3.4 (#40)
The dataframe passed to .to_gbq(...., if_exists='append') needs to contain only a subset of the fields in the BigQuery schema. (#24, https://github.com/googleapis/python-bigquery-pandas/issues/24)
Use the google-auth library for authentication because oauth2client is deprecated. (#39)
read_gbq now has an auth_local_webserver boolean argument for controlling whether to use web server or console flow when getting user credentials. Replaces --noauth_local_webserver command line argument. (#35)
read_gbq now displays the BigQuery Job ID and standard price in verbose output. (#70 and #71)
0.1.6 / 2017-05-03¶
All gbq errors will simply be subclasses of ValueError and no longer inherit from the deprecated PandasError.
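One practical consequence of the entry above is that a single except ValueError clause can catch errors raised by the library. The sketch below illustrates the pattern with stand-ins: read_or_none, FakeGbqError, and failing_reader are hypothetical names used so the example runs without BigQuery access.

```python
def read_or_none(sql, project_id, read_gbq):
    """Call a read_gbq-style function and return None on a ValueError."""
    try:
        return read_gbq(sql, project_id=project_id)
    except ValueError as exc:
        # Subclasses of ValueError are caught here as well.
        print(f"BigQuery request failed: {exc}")
        return None


class FakeGbqError(ValueError):
    """Stand-in for a library error that subclasses ValueError."""


def failing_reader(sql, project_id):
    # Stand-in for pandas_gbq.read_gbq that always fails.
    raise FakeGbqError("table not found")


result = read_or_none("SELECT 1", "my-project", failing_reader)
print(result)  # None
```

In real use, pandas_gbq.read_gbq would be passed in place of failing_reader.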
0.1.4 / 2017-03-17¶
InvalidIndexColumn will be raised instead of InvalidColumnOrder in read_gbq when the index column specified does not exist in the BigQuery schema. (#6)
0.1.3 / 2017-03-04¶
Fix a bug with appending to a BigQuery table where fields have modes (NULLABLE, REQUIRED, REPEATED) specified. These modes were compared against the remote schema, and writing a table via to_gbq would previously raise. (#13)
0.1.2 / 2017-02-23¶
Initial release of transferred code from pandas
Includes patches since the 0.19.2 release on pandas with the following:
read_gbq now allows query configuration preferences pandas-GH#14742
read_gbq now stores INTEGER columns as dtype=object if they contain NULL values. Otherwise they are stored as int64. This prevents precision loss for integers greater than 2**53. Furthermore, FLOAT columns with values above 10**4 are no longer cast to int64, which also caused precision loss pandas-GH#14064, and pandas-GH#14305
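The 2**53 threshold mentioned above comes from the 53-bit significand of a 64-bit float. The following standard-library sketch shows why integers beyond that point cannot safely pass through a float column; the variable name LIMIT is only for illustration.

```python
LIMIT = 2**53  # largest range in which every integer is exactly representable

# Below the limit, an int survives a round trip through float unchanged.
print(int(float(LIMIT - 1)) == LIMIT - 1)  # True

# Above it, neighbouring integers collapse onto the same float value.
print(float(LIMIT) == float(LIMIT + 1))    # True
print(int(float(LIMIT + 1)))               # 9007199254740992
```

Keeping such values in an object column, as the entry describes, avoids this silent rounding.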