Upgrading from PySpark 3.2 to 3.3
In Spark 3.3, the `pyspark.pandas.sql` method follows [the standard Python string formatter](https://docs.python.org/3/library/string.html#format-string-syntax). To restore the previous behavior, set the `PYSPARK_PANDAS_SQL_LEGACY` environment variable to `1`.

In Spark 3.3, the `drop` method of pandas API on Spark DataFrame supports dropping rows by `index`, and drops by index instead of by column by default.

In Spark 3.3, PySpark raises the minimum required pandas version from 0.23.2 to 1.0.5.
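As a rough illustration of the `pyspark.pandas.sql` formatting change noted above, the sketch below uses only the Python standard library to show what the standard format-string syntax looks like for a query template. It does not call `pyspark.pandas.sql`; the query text, the table name `events`, and the placeholder names `tbl`, `threshold`, and `col` are made up for the example.

```python
from string import Formatter

# Hypothetical query template; the placeholder names are illustrative only.
query = "SELECT * FROM {tbl} WHERE value > {threshold}"

# List the named fields that the standard formatter finds in the template.
fields = [name for _, name, _, _ in Formatter().parse(query) if name]

# Plain str.format substitution, used here only to show the placeholder syntax.
rendered = query.format(tbl="events", threshold=10)

# In the standard syntax, doubled braces produce literal braces.
escaped = "SELECT '{{}}' AS braces, {col} FROM t".format(col="id")

print(fields)
print(rendered)
print(escaped)
```

Queries written for the earlier behavior may need their placeholders reviewed against this syntax, in particular any literal braces, which the standard formatter expects to be doubled.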
In Spark 3.3, the `repr` return values of SQL DataTypes have been changed to yield an object with the same value when passed to `eval`.
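The round-trip property described above can be sketched without Spark. The classes `ToyIntegerType` and `ToyArrayType` and the helper `round_trips` are hypothetical stand-ins, not PySpark classes; they only demonstrate what it means for `repr` to produce a string that `eval` turns back into an equal object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyIntegerType:
    """Hypothetical stand-in for a data type with no parameters."""


@dataclass(frozen=True)
class ToyArrayType:
    """Hypothetical stand-in for a data type that wraps another type."""
    element_type: object
    contains_null: bool = True


def round_trips(obj):
    # True when evaluating repr(obj) rebuilds an object equal to obj.
    return eval(repr(obj)) == obj


arr = ToyArrayType(ToyIntegerType(), contains_null=False)
print(repr(arr))
print(round_trips(arr))
```

With real SQL DataTypes, the same check would need the relevant type classes imported into the namespace where `eval` runs, since `eval` resolves the class names in the string at call time.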