pyspark.sql.functions.max_by
pyspark.sql.functions.max_by(col: ColumnOrName, ord: ColumnOrName) → pyspark.sql.column.Column
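- Returns the value of col associated with the maximum value of ord.
- New in version 3.3.0.
- Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- col: Column or str
- the column whose value is returned.
- ord: Column or str
- the column to be maximized.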
- Returns
- Column
- value associated with the maximum value of ord. 
 
- Examples
>>> from pyspark.sql.functions import max_by
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(max_by("year", "earnings")).show()
+------+----------------------+
|course|max_by(year, earnings)|
+------+----------------------+
|  Java|                  2013|
|dotNET|                  2013|
+------+----------------------+
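To make the result above easier to check by hand, here is a minimal plain-Python sketch of the same per-group selection. It is not how Spark implements max_by; the helper name `max_by_local` and its parameters are hypothetical, and the sketch makes no attempt to model how Spark treats nulls or ties in ord.

```python
def max_by_local(rows, group_key, col, ord_key):
    """Hypothetical helper: for each group, return row[col] taken from the
    row whose row[ord_key] is largest. Illustration only, not Spark's code."""
    best = {}  # group value -> row with the largest ord_key seen so far
    for row in rows:
        group = row[group_key]
        if group not in best or row[ord_key] > best[group][ord_key]:
            best[group] = row
    return {group: row[col] for group, row in best.items()}


# Same data as the DataFrame in the example above.
rows = [
    {"course": "Java", "year": 2012, "earnings": 20000},
    {"course": "dotNET", "year": 2012, "earnings": 5000},
    {"course": "dotNET", "year": 2013, "earnings": 48000},
    {"course": "Java", "year": 2013, "earnings": 30000},
]

result = max_by_local(rows, "course", "year", "earnings")
print(result)  # {'Java': 2013, 'dotNET': 2013}
```

For each course, the year comes from the row with the highest earnings, which is 2013 in both groups and matches the table printed by show().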