BisectingKMeansSummary

class pyspark.ml.clustering.BisectingKMeansSummary(java_obj: Optional[JavaObject] = None)

Bisecting KMeans clustering results for a given model.

New in version 2.1.0.

Attributes

`cluster`	DataFrame of the predicted cluster index for each training data point.
`clusterSizes`	Size of (number of data points in) each cluster.
`featuresCol`	Name for column of features in predictions.
`k`	The number of clusters the model was trained with.
`numIter`	Number of iterations.
`predictionCol`	Name for column of predicted clusters in predictions.
`predictions`	DataFrame produced by the model’s transform method.
`trainingCost`	Sum of squared distances to the nearest centroid for all points in the training dataset.

Attributes Documentation

cluster: DataFrame of the predicted cluster index for each training data point.

New in version 2.1.0.

clusterSizes: Size of (number of data points in) each cluster.

New in version 2.1.0.
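
As a rough illustration of what `clusterSizes` reports, the sketch below counts how many points fall into each cluster index. It is plain Python, not the Spark implementation; the helper name `cluster_sizes` and the sample `assignments` list are hypothetical, and it assumes cluster indices run from 0 to k - 1.

```python
from collections import Counter


def cluster_sizes(assignments, k):
    """Count the points assigned to each cluster index 0..k-1.

    Hypothetical helper for illustration only; it mirrors the idea of
    `clusterSizes`, not Spark's actual code.
    """
    counts = Counter(assignments)
    # Clusters that received no points are reported with size 0.
    return [counts.get(i, 0) for i in range(k)]


# Assumed sample data: predicted cluster index for four training points.
assignments = [0, 0, 1, 1]
sizes = cluster_sizes(assignments, k=2)
print(sizes)
```

The result is positional: the entry at position i is the size of cluster i, so the list always has k entries.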

featuresCol: Name for column of features in predictions.

New in version 2.1.0.

k: The number of clusters the model was trained with.

New in version 2.1.0.

numIter: Number of iterations.

New in version 2.4.0.

predictionCol: Name for column of predicted clusters in predictions.

New in version 2.1.0.

predictions: DataFrame produced by the model’s transform method.

New in version 2.1.0.

trainingCost: Sum of squared distances to the nearest centroid for all points in the training dataset. This is equivalent to the `inertia_` attribute in scikit-learn.

New in version 3.0.0.
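
The definition of `trainingCost` can be sketched in plain Python as follows. This is a minimal illustration, not Spark's implementation: the function name `training_cost` and the sample points and centers are assumptions, and it assumes squared Euclidean distance.

```python
def training_cost(points, centers):
    """Sum, over all points, of the squared Euclidean distance to the nearest center.

    Hypothetical helper for illustration only.
    """
    total = 0.0
    for point in points:
        # Squared distance from this point to each center; keep the smallest.
        total += min(
            sum((a - b) ** 2 for a, b in zip(point, center))
            for center in centers
        )
    return total


# Assumed sample data: four 2-D points and two cluster centers.
points = [(0.0, 0.0), (1.0, 1.0), (9.0, 8.0), (8.0, 9.0)]
centers = [(0.5, 0.5), (8.5, 8.5)]
cost = training_cost(points, centers)
print(cost)
```

A lower value means the points lie closer to their assigned centers. The value is only comparable between models fit on the same data, and it generally decreases as k grows.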
