These are the changes in pandas 1.0.2. See Release notes for a full changelog including other versions of pandas.
Groupby
Fixed regression in groupby(..).agg() which was failing on frames with MultiIndex columns and a custom function (GH31777)
Fixed regression in groupby(..).rolling(..).apply() (RollingGroupby) where the raw parameter was ignored (GH31754)
Fixed regression in rolling(..).corr() when using a time offset (GH31789)
Fixed regression in groupby(..).nunique() which was modifying the original values if NaN values were present (GH31950)
Fixed regression in DataFrame.groupby raising a ValueError from an internal operation (GH31802)
Fixed regression in groupby(..).agg() calling a user-provided function an extra time on an empty input (GH31760)
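As an illustration of the groupby(..).nunique() entry above (GH31950), the following minimal sketch checks that counting unique values leaves the input frame untouched. The column names and data are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: group "a" contains a NaN, group "b" does not.
df = pd.DataFrame(
    {"key": ["a", "a", "b", "b"], "val": [1.0, np.nan, 2.0, 3.0]}
)
before = df.copy()

# nunique() ignores NaN by default, so "a" has 1 unique value and "b" has 2.
counts = df.groupby("key")["val"].nunique()
print(counts)

# The original frame should be unchanged by the aggregation.
print(df.equals(before))
```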
I/O
Fixed regression in read_csv() in which the encoding option was not recognized with certain file-like objects (GH31819)
Fixed regression in DataFrame.to_excel() when the columns keyword argument is passed (GH31677)
Fixed regression in ExcelFile where the stream passed into the function was closed by the destructor. (GH31467)
Fixed regression where read_pickle() raised a UnicodeDecodeError when reading a py27 pickle with MultiIndex column (GH31988).
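A rough sketch of the read_csv() entry above (GH31819): passing encoding together with a file-like object. The notes do not say which file-like objects were affected, so io.BytesIO is only an assumed stand-in here, and the sample data is invented.

```python
import io

import pandas as pd

# Hypothetical Latin-1 encoded CSV held in an in-memory binary buffer.
raw = "name,value\ncafé,1\n".encode("latin-1")
buffer = io.BytesIO(raw)

# The encoding option tells read_csv how to decode the bytes from the buffer.
df = pd.read_csv(buffer, encoding="latin-1")
print(df)
```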
Reindexing/alignment
Fixed regression in Series.align() when other is a DataFrame and method is not None (GH31785)
Fixed regression in DataFrame.reindex() and Series.reindex() when reindexing with a (tz-aware) index and method=nearest (GH26683)
Fixed regression where DataFrame.reindex_like() on a DataFrame subclass raised an AssertionError (GH31925)
Fixed regression in DataFrame arithmetic operations with mis-matched columns (GH31623)
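To make the reindex() entry above (GH26683) concrete, here is a small sketch of reindexing a frame onto a tz-aware index with method="nearest". The dates, offsets, and column name are illustrative assumptions.

```python
import pandas as pd

# Three daily, tz-aware timestamps with one value each.
idx = pd.date_range("2020-01-01", periods=3, freq="D", tz="UTC")
df = pd.DataFrame({"x": [1, 2, 3]}, index=idx)

# Target timestamps that fall between the existing ones:
# 10:00 on Jan 1 is closest to Jan 1, 20:00 on Jan 2 is closest to Jan 3.
target = pd.DatetimeIndex(
    [idx[0] + pd.Timedelta(hours=10), idx[1] + pd.Timedelta(hours=20)]
)

nearest = df.reindex(target, method="nearest")
print(nearest)
```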
Other
Fixed regression in joining on DatetimeIndex or TimedeltaIndex to preserve freq in simple cases (GH32166)
Fixed regression in Series.shift() with datetime64 dtype when passing an integer fill_value (GH32591)
Fixed regression in the repr of an object-dtype Index with bools and missing values (GH32146)
Previously, indexing with a nullable Boolean array containing NA raised a ValueError. This is now permitted, with NA treated as False. (GH31503)
In [1]: s = pd.Series([1, 2, 3, 4])

In [2]: mask = pd.array([True, True, False, None], dtype="boolean")

In [3]: s
Out[3]:
0    1
1    2
2    3
3    4
Length: 4, dtype: int64

In [4]: mask
Out[4]:
<BooleanArray>
[True, True, False, <NA>]
Length: 4, dtype: boolean
pandas 1.0.0-1.0.1
>>> s[mask]
Traceback (most recent call last):
...
ValueError: cannot mask with array containing NA / NaN values
pandas 1.0.2
In [5]: s[mask]
Out[5]:
0    1
1    2
Length: 2, dtype: int64
Datetimelike
Bug in Series.astype() not copying for tz-naive and tz-aware datetime64 dtype (GH32490)
Bug where to_datetime() would raise when passed pd.NA (GH32213)
Improved error message when subtracting two Timestamp objects results in an out-of-bounds Timedelta (GH31774)
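The to_datetime() entry above (GH32213) can be sketched as follows: pd.NA is parsed as a missing timestamp rather than raising. The date string is an arbitrary example value.

```python
import pandas as pd

# A scalar pd.NA is converted to the missing-timestamp marker NaT.
scalar = pd.to_datetime(pd.NA)
print(scalar)

# In a list, pd.NA becomes NaT while the other entries are parsed normally.
parsed = pd.to_datetime(["2020-03-12", pd.NA])
print(parsed)
```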
Categorical
Fixed bug where Categorical.from_codes() improperly raised a ValueError when passed nullable integer codes. (GH31779)
Fixed bug where Categorical() constructor would raise a TypeError when given a numpy array containing pd.NA. (GH31927)
Bug in Categorical that would ignore or crash when calling Series.replace() with a list-like to_replace (GH31720)
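A short sketch of the first two Categorical entries above (GH31779, GH31927), assuming small made-up inputs: building a Categorical from nullable integer codes, and from a numpy object array that contains pd.NA.

```python
import numpy as np
import pandas as pd

# Codes stored in a nullable Int64 array (no missing codes in this example).
codes = pd.array([0, 1, 0], dtype="Int64")
cat = pd.Categorical.from_codes(codes, categories=["a", "b"])
print(cat)

# A numpy object array containing pd.NA; the NA entry becomes a missing
# category value, which is stored internally as code -1.
values = np.array(["x", pd.NA, "y"], dtype=object)
cat_na = pd.Categorical(values)
print(cat_na)
```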
I/O
Using pd.NA with DataFrame.to_json() now correctly outputs a null value instead of an empty object (GH31615)
Bug in pandas.json_normalize() when a value in the meta path is not iterable (GH31507)
Fixed pickling of pandas.NA. Previously a new object was returned, which broke computations relying on NA being a singleton (GH31847)
Fixed bug in parquet roundtrip with nullable unsigned integer dtypes (GH31896).
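The to_json() and pickling entries above (GH31615, GH31847) can be checked with a sketch like the one below. The frame contents are invented for illustration.

```python
import json
import pickle

import pandas as pd

# Round-tripping pd.NA through pickle should give back the same singleton.
restored = pickle.loads(pickle.dumps(pd.NA))
is_singleton = restored is pd.NA
print(is_singleton)

# pd.NA in a frame is serialized as JSON null.
payload = pd.DataFrame({"a": [1, pd.NA]}).to_json()
decoded = json.loads(payload)
print(payload)
```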
Experimental dtypes
Fixed bug in DataFrame.convert_dtypes() for columns that were already using the "string" dtype (GH31731).
Fixed bug in DataFrame.convert_dtypes() for series with a mix of integers and strings (GH32117)
Fixed bug in DataFrame.convert_dtypes() where BooleanDtype columns were converted to Int64 (GH32287)
Fixed bug in setting values using a slice indexer with string dtype (GH31772)
Fixed bug where pandas.core.groupby.GroupBy.first() and pandas.core.groupby.GroupBy.last() would raise a TypeError when groups contained pd.NA in a column of object dtype (GH32123)
Fixed bug where DataFrameGroupBy.mean(), DataFrameGroupBy.median(), DataFrameGroupBy.var(), and DataFrameGroupBy.std() would raise a TypeError on Int64 dtype columns (GH32219)
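Two of the entries above lend themselves to a quick check: group-wise means on an Int64 column (GH32219) and convert_dtypes() keeping a BooleanDtype column as boolean (GH32287). The sketch below uses a made-up frame; the exact result dtype of the mean may differ between pandas versions, so only the values are printed.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "key": ["a", "a", "b"],
        "val": pd.array([1, 3, 5], dtype="Int64"),
        "flag": pd.array([True, None, False], dtype="boolean"),
    }
)

# Mean of a nullable-integer column per group: (1 + 3) / 2 for "a", 5 for "b".
means = df.groupby("key")["val"].mean()
print(means)

# convert_dtypes() should leave the boolean column as nullable boolean.
converted = df.convert_dtypes()
print(converted.dtypes)
```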
Strings
Using pd.NA with Series.str.repeat() now correctly outputs a null value instead of raising an error for vector inputs (GH31632)
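A minimal sketch of the Series.str.repeat() entry above (GH31632), assuming a "string" dtype Series with one missing value and a list of repeat counts (the vector input):

```python
import pandas as pd

s = pd.Series(["a", pd.NA], dtype="string")

# One repeat count per element; the missing element stays missing.
repeated = s.str.repeat([3, 4])
print(repeated)
```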
Rolling
Fixed rolling operations with variable window (defined by time duration) on decreasing time index (GH32385).
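For the rolling entry above (GH32385), this sketch applies a time-based window to a Series whose index decreases over time. The timestamps, values, and the 5-second window are illustrative assumptions, not taken from the release notes.

```python
import pandas as pd

# Timestamps in strictly decreasing order.
index = pd.DatetimeIndex(
    [
        "2019-01-01 09:00:30",
        "2019-01-01 09:00:27",
        "2019-01-01 09:00:20",
        "2019-01-01 09:00:18",
        "2019-01-01 09:00:10",
    ]
)
readings = pd.Series([3, 4, 4, 5, 6], index=index)

# Variable-size window defined by a duration rather than a row count.
result = readings.rolling("5s").min()
print(result)
```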
A total of 25 people contributed patches to this release. People with a “+” by their names contributed a patch for the first time.
Anna Daglis +
Daniel Saxton
Irv Lustig
Jan Škoda
Joris Van den Bossche
Justin Zheng
Kaiqi Dong
Kendall Masse
Marco Gorelli
Matthew Roeschke
MeeseeksMachine
MomIsBestFriend
Pandas Development Team
Pedro Reys +
Prakhar Pandey
Robert de Vries +
Rushabh Vasani
Simon Hawkins
Stijn Van Hoey
Terji Petersen
Tom Augspurger
William Ayd
alimcmaster1
gfyoung
jbrockmendel