pyspark.pandas.DataFrame.plot.scatter
- plot.scatter(x, y, **kwds)
- Create a scatter plot with varying marker point size and color.
- The coordinates of each point are defined by two DataFrame columns, and filled circles are used to represent each point. This kind of plot is useful for seeing complex correlations between two variables. Points could be, for instance, natural 2D coordinates such as longitude and latitude on a map or, in general, any pair of metrics that can be plotted against each other.
- Parameters
- x : int or str
- The column name or column position to be used as horizontal coordinates for each point. 
- y : int or str
- The column name or column position to be used as vertical coordinates for each point. 
- s : scalar or array_like, optional
- (matplotlib-only). 
- c : str, int or array_like, optional
- (matplotlib-only). 
- **kwds : optional
- Keyword arguments to pass on to pyspark.pandas.DataFrame.plot().
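The parameters above can be illustrated with a minimal sketch. The helper names `scatter_kwargs` and `draw_scatter` and the sample values (`s=40`, `c="species"`) are hypothetical and chosen only for illustration. The sketch also assumes that `backend="matplotlib"` is forwarded through `**kwds` to `pyspark.pandas.DataFrame.plot()`.

```python
ROWS = [[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1], [6.4, 3.2, 1], [5.9, 3.0, 2]]
COLUMNS = ["length", "width", "species"]


def scatter_kwargs(backend="plotly"):
    """Build keyword arguments for df.plot.scatter (hypothetical helper)."""
    kwargs = {"x": "length", "y": "width"}
    if backend == "matplotlib":
        # s and c are documented as matplotlib-only, so add them only here.
        # Assumption: backend is forwarded to DataFrame.plot() through **kwds.
        kwargs.update(s=40, c="species", backend="matplotlib")
    return kwargs


def draw_scatter(backend="plotly"):
    """Create the sample DataFrame and plot it (needs a working Spark setup)."""
    # Imported lazily so scatter_kwargs stays usable without pyspark installed.
    import pyspark.pandas as ps

    df = ps.DataFrame(ROWS, columns=COLUMNS)
    return df.plot.scatter(**scatter_kwargs(backend))
```

Keeping the backend-specific arguments in one place avoids passing `s` or `c` to a backend that may not accept them.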
 
- Returns
- plotly.graph_objs.Figure
- Return a custom object when backend != plotly. Return an ndarray when subplots=True (matplotlib only).
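Because the return type depends on the backend, code that consumes the result may need to branch on it. The following is a rough sketch rather than part of the library: `save_plot` is a hypothetical helper, and it assumes that the non-plotly result is a matplotlib Axes (or an ndarray of Axes when subplots=True).

```python
def save_plot(result, stem):
    """Save the object returned by plot.scatter and return the file name."""
    if hasattr(result, "write_html"):
        # plotly.graph_objs.Figure: write an interactive HTML file.
        path = f"{stem}.html"
        result.write_html(path)
        return path
    # Assumption: any other result is a matplotlib Axes, or an ndarray of
    # Axes when subplots=True; an ndarray exposes .flat, an Axes does not.
    axes = result.flat[0] if hasattr(result, "flat") else result
    path = f"{stem}.png"
    axes.figure.savefig(path)
    return path
```

For example, `save_plot(df.plot.scatter(x='length', y='width'), 'scatter')` would write `scatter.html` when the plotly backend is in use.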
 
- See also
- plotly.express.scatter
- Scatter plot using multiple input data formats (plotly). 
- matplotlib.pyplot.scatter
- Scatter plot using multiple input data formats (matplotlib). 
- Examples
- Let’s see how to draw a scatter plot using coordinates from the values in a DataFrame’s columns.
>>> import pyspark.pandas as ps
>>> df = ps.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1],
...                    [6.4, 3.2, 1], [5.9, 3.0, 2]],
...                   columns=['length', 'width', 'species'])
>>> df.plot.scatter(x='length', y='width')
- And now with a dark scheme:
>>> df = ps.DataFrame([[5.1, 3.5, 0], [4.9, 3.0, 0], [7.0, 3.2, 1],
...                    [6.4, 3.2, 1], [5.9, 3.0, 2]],
...                   columns=['length', 'width', 'species'])
>>> fig = df.plot.scatter(x='length', y='width')
>>> fig.update_layout(template="plotly_dark")