viz

plot_clustering_results(lst, title, metric, output, errorbar=None, annotation=None)

Plot goodness-of-fit indicators produced by a clustering model. The resulting plot is saved to the location given by the output argument.

Parameters

lst : List of values to plot.

title : Title of the plot.

metric : Metric name.

output : Output filename.

errorbar : List of values to plot as errorbar (CI, SD, etc.). Defaults to None.

annotation : Annotation to add directly on the plot. Defaults to None.
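As an illustration of how these parameters could fit together, here is a minimal sketch of a comparable plot built directly on matplotlib. It is not the library's implementation: the name `sketch_metric_plot`, the use of the position in `lst` as the x-axis, and the placement of the annotation are all assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def sketch_metric_plot(lst, title, metric, output, errorbar=None, annotation=None):
    """Hypothetical stand-in for plot_clustering_results (assumed behavior)."""
    fig, ax = plt.subplots(figsize=(6, 4))
    # Assumption: values are plotted against their 1-based position in lst.
    x = list(range(1, len(lst) + 1))
    # yerr=None simply draws the line and markers without error bars.
    ax.errorbar(x, lst, yerr=errorbar, marker="o", capsize=3)
    ax.set_xticks(x)
    ax.set_xlabel("Model index")
    ax.set_ylabel(metric)
    ax.set_title(title)
    if annotation is not None:
        # Assumption: the annotation goes in the upper-left corner of the axes.
        ax.annotate(annotation, xy=(0.02, 0.95), xycoords="axes fraction", va="top")
    fig.savefig(output)
    plt.close(fig)
    return x
```

Calling `sketch_metric_plot([0.61, 0.55, 0.48], "Silhouette score", "Silhouette", "silhouette.png", errorbar=[0.02, 0.03, 0.05])` would write one point per value, each with its own error bar.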

plot_dendrogram(X, output, title='Dendrograms', annotation=None)

Plot a dendrogram showing the hierarchical clustering of the data. Useful for visually determining an appropriate number of clusters. Adapted from: https://towardsdatascience.com/cheat-sheet-to-implementing-7-methods-for-selecting-optimal-number-of-clusters-in-python-898241e1d6ad

Parameters

X : Data on which clustering will be performed.

output : Output filename and path.

title : Title for the plot. Defaults to ‘Dendrograms’.

annotation : Annotation to add directly on the plot. Defaults to None.
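For readers who want to see what such a plot involves, the sketch below draws a dendrogram with SciPy. It is only an approximation of the idea: the helper name `sketch_dendrogram` and the choice of Ward linkage on Euclidean distances are assumptions, since the linkage method used by the function is not stated here.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy.cluster.hierarchy import dendrogram, linkage


def sketch_dendrogram(X, output, title="Dendrograms", annotation=None):
    """Hypothetical stand-in for plot_dendrogram (assumed behavior)."""
    # Assumption: Ward linkage, which merges the pair of clusters that
    # gives the smallest increase in within-cluster variance.
    Z = linkage(np.asarray(X, dtype=float), method="ward")
    fig, ax = plt.subplots(figsize=(8, 5))
    dendrogram(Z, ax=ax, no_labels=True)
    ax.set_title(title)
    ax.set_ylabel("Merge distance")
    if annotation is not None:
        ax.annotate(annotation, xy=(0.02, 0.95), xycoords="axes fraction", va="top")
    fig.savefig(output)
    plt.close(fig)
    return Z  # linkage matrix with one row per merge
```

Tall vertical gaps between successive merges suggest natural cut points: a horizontal line drawn through the largest gap crosses as many branches as there are candidate clusters.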

plot_parallel_plot(X, labels, output, mean_values=False, cmap='magma', title='Parallel Coordinates plot.')

Plot a parallel coordinates plot to visualize differences between clusters. Useful for highlighting notable differences between clusters and interpreting them. Adapted from: https://towardsdatascience.com/the-art-of-effective-visualization-of-multi-dimensional-data-6c7202990c57

Parameters

X : Input dataset of shape (S, F).

labels : Array of hard membership values of shape (S,).

output : Filename of the png file.

mean_values : If True, plots the mean value of each feature for each cluster. Defaults to False.

cmap : Colormap to use for the plot. Defaults to ‘magma’. See https://matplotlib.org/stable/tutorials/colors/colormaps.html

title : Title of the plot. Defaults to ‘Parallel Coordinates plot.’
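The effect of mean_values can be illustrated with pandas' built-in `parallel_coordinates` helper. This is a sketch under assumptions, not the function's actual code: `sketch_parallel_plot` and the `"cluster"` column name are hypothetical, and the real function may scale or style the data differently.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import parallel_coordinates


def sketch_parallel_plot(X, labels, output, mean_values=False, cmap="magma",
                         title="Parallel Coordinates plot."):
    """Hypothetical stand-in for plot_parallel_plot (assumed behavior)."""
    df = pd.DataFrame(X).copy()
    df.columns = [str(c) for c in df.columns]
    # Attach the hard cluster membership of each sample (positional assignment).
    df["cluster"] = np.asarray(labels)
    if mean_values:
        # Collapse to one row per cluster holding the per-feature means.
        df = df.groupby("cluster", as_index=False).mean()
    fig, ax = plt.subplots(figsize=(8, 5))
    # One polyline per row, colored by its cluster.
    parallel_coordinates(df, "cluster", colormap=cmap, ax=ax)
    ax.set_title(title)
    fig.savefig(output)
    plt.close(fig)
    return df
```

With mean_values=False every sample contributes a line, which shows within-cluster spread; with mean_values=True only one line per cluster remains, which is easier to read when S is large.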

radar_plot(X, labels, output, frame='circle', title='Radar plot', cmap='magma')

Plot a radar plot of all features in the original dataset, stratified by cluster. A t-test comparing cluster means within each feature is also computed, and the result is annotated directly on the plot. With a large number of clusters, the significance annotations clutter the plot; this will be fixed in a future version.

Parameters

X : Input dataset of shape (S, F).

labels : Array of hard membership values of shape (S,).

output : Filename of the png file.

frame : Shape of the radar plot. Defaults to ‘circle’. Choices are ‘circle’ or ‘polygon’.

title : Title of the plot. Defaults to ‘Radar plot’.

cmap : Colormap to use for the plot. Defaults to ‘magma’. See https://matplotlib.org/stable/tutorials/colors/colormaps.html
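The two ingredients described above, per-cluster feature means on a polar axis and t-tests between clusters within each feature, can be sketched as follows. Everything here is illustrative: `sketch_radar_plot` is a hypothetical name, the frame option and on-plot significance annotations are omitted, generic feature names (`F0`, `F1`, ...) are used, and the standard independent-samples t-test from SciPy is an assumption about which test is applied.

```python
from itertools import combinations

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import ttest_ind


def sketch_radar_plot(X, labels, output, title="Radar plot", cmap="magma"):
    """Hypothetical stand-in for radar_plot (assumed behavior, no annotations)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels).tolist()
    n_features = X.shape[1]
    # One spoke per feature; repeat the first angle to close each outline.
    angles = np.linspace(0, 2 * np.pi, n_features, endpoint=False)
    closed = np.append(angles, angles[0])
    colors = plt.get_cmap(cmap)(np.linspace(0.1, 0.9, len(clusters)))

    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    for c, color in zip(clusters, colors):
        means = X[labels == c].mean(axis=0)
        values = np.append(means, means[0])
        ax.plot(closed, values, color=color, label=str(c))
        ax.fill(closed, values, color=color, alpha=0.15)
    ax.set_xticks(angles)
    ax.set_xticklabels([f"F{i}" for i in range(n_features)])
    ax.set_title(title)
    ax.legend(title="Cluster", loc="upper right")
    fig.savefig(output)
    plt.close(fig)

    # Pairwise t-tests: one p-value per feature for each pair of clusters.
    pvalues = {}
    for a, b in combinations(clusters, 2):
        pvalues[(a, b)] = ttest_ind(X[labels == a], X[labels == b], axis=0).pvalue
    return pvalues
```

The number of cluster pairs grows quadratically with the number of clusters, which is consistent with the note above about annotations crowding the plot when many clusters are shown.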

sort_int_labels_legend(ax, title=None)

Automatically reorder the labels of a matplotlib legend numerically, keeping each label paired with its matching handle.

Parameters

ax : Axes object.

title : Title of the legend. Defaults to None.

Returns

ax.legend : Axes legend object
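The reordering matters because labels are strings, so a plain sort would place "10" before "2". A minimal sketch of the idea, assuming every label can be parsed as an integer (the name `sketch_sort_int_labels_legend` is hypothetical and this is not necessarily how the function is written):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def sketch_sort_int_labels_legend(ax, title=None):
    """Hypothetical stand-in for sort_int_labels_legend (assumed behavior)."""
    handles, labels = ax.get_legend_handles_labels()
    # Sort positions by the integer value of each label so that
    # handles and labels stay paired after reordering.
    order = sorted(range(len(labels)), key=lambda i: int(labels[i]))
    return ax.legend([handles[i] for i in order],
                     [labels[i] for i in order],
                     title=title)


# Usage: artists added out of order end up in numeric order in the legend.
fig, ax = plt.subplots()
for name in ["10", "2", "1"]:
    ax.plot([0, 1], [0, int(name)], label=name)
legend = sketch_sort_int_labels_legend(ax, title="Cluster")
sorted_labels = [text.get_text() for text in legend.get_texts()]
plt.close(fig)
```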