simCluster can process any file that meets the Platform File Specifications. Unlike simClassify, ID is not a mandatory field for simCluster.
simCluster delivers results in several formats:
Download - Results can be downloaded as a CSV file. Each row holds the cluster ID in the first field, the object's distance from the center of its cluster in the second (on a 0 - 1 scale, where lower is closer), the object ID in the third, and the object's variables in the remaining fields.
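As a rough illustration of the field order described above, the following Python sketch reads a downloaded file into typed rows. The function name `parse_cluster_rows`, the `has_header` switch, and the sample values are assumptions made for this example; check an actual export to see whether it includes a header row.

```python
import csv
import io
from typing import NamedTuple


class ClusterRow(NamedTuple):
    cluster_id: str
    distance: float        # 0 to 1 scale, lower means closer to the cluster center
    object_id: str
    variables: list[str]   # every field after the third


def parse_cluster_rows(text: str, has_header: bool = False) -> list[ClusterRow]:
    """Parse CSV text laid out as: cluster ID, distance, object ID, variables..."""
    reader = csv.reader(io.StringIO(text))
    if has_header:
        next(reader, None)  # assumption: skip one header row when the caller says so
    rows = []
    for record in reader:
        if not record:
            continue  # ignore blank lines
        cluster_id, distance, object_id, *variables = record
        rows.append(ClusterRow(cluster_id, float(distance), object_id, variables))
    return rows


# Hypothetical sample data, not taken from a real export.
sample = "c1,0.12,obj-1,red,4\nc1,0.40,obj-2,red,5\nc2,0.05,obj-3,blue,9\n"
rows = parse_cluster_rows(sample)

# The object with the smallest distance is the one closest to its cluster center.
closest = min(rows, key=lambda row: row.distance)
```

Because the variables occupy every field after the third, the sketch collects them into a list rather than assuming a fixed number of columns.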
Graphs - simMachines provides two forms of graphs for visualizing clusters, Circular Graphs and Sunburst Graphs. Both views expose the clusters of factors behind a classification. Each cell of the graph is a factor that is important for differentiating that cluster from those around it. The innermost segments are the classifications. As you go further from the center the graph splits into more granular clusters, with the outermost ring being the most precise clustering possible given the parameters selected.
These graphs convey information in several ways. First, their layers display the hierarchical structure of the relationships between factors in the data. Second, the width of a cluster is determined by the count of objects in the cluster. Third, the color of a cluster represents the frequency with which the factor appears in the data (change the Scale setting to have color represent the weight of the factor in the cluster instead).
In either viewing method you can view the factors of a cluster by hovering your mouse over a cell. You can click on a cluster to take a detailed look at its member factors and objects.
The Circular view is simCluster’s default visualization. It focuses on presenting information closer to the center of the graph. You can return to this view by selecting “Home” from the Visualization Menu. There are two modes for this graph, explained below.
The Zoomable Sunburst Graph is a view that focuses on the factors on the outside of the graph. It can be viewed by selecting the “Sunburst Graph” option from the Visualization Menu.
Online View - Results can also be viewed through the ML Studio platform by selecting “All Clusters” from the Visualization Menu. It delivers results as a list, with the Cluster ID in the first field, the Size of Cluster (count of objects in cluster) in the second, and provides an option to see the elements of the cluster (both the factors and the component data objects) by selecting the “View” icon for the cluster.
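The same cluster ID and size listing can be approximated offline from the downloaded CSV. The sketch below is only an illustration, assuming the cluster ID is the first field and that the file has no header row; `cluster_sizes` is a hypothetical helper, not part of the platform.

```python
import csv
import io
from collections import Counter


def cluster_sizes(text: str) -> list[tuple[str, int]]:
    """Return (cluster ID, object count) pairs, largest cluster first."""
    reader = csv.reader(io.StringIO(text))
    # Assumption: the first field of every non-empty row is the cluster ID.
    counts = Counter(record[0] for record in reader if record)
    # Sort by descending size, then by cluster ID for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample data in the download layout.
sample = "c1,0.12,obj-1,red\nc1,0.40,obj-2,red\nc2,0.05,obj-3,blue\n"
sizes = cluster_sizes(sample)
```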
Circular Graph Modes - The main visualization of simCluster has two basic modes with different functionalities: “View” and “Compare”.
“View” mode allows you to select a single cluster or subcluster (the path of cells up to the point clicked). In addition to showing more details, it offers four actions once a cell has been selected:
- Cluster labeling: In the input field showing the cluster’s current Identifier, you can enter an alphanumeric label to make the cluster easier to locate later. To save the label, click the Save Icon (the first icon from the left). Cluster labels are included in the downloaded output. Because the same label can be applied to more than one cluster, you can sort the downloaded export file by label to effectively combine those clusters.
- Cluster displaying: For modes that require more than one cluster visualization, the Eye Icon marks a cluster as a main cluster in the visualization. It works like a radio button.
- Cluster deselection: Clicking the X icon deselects a cluster, allowing another cluster to be selected in its place.
- Cluster download: By clicking the Download Icon, you are presented with a selection of all available files from a single cluster to be downloaded, instead of a full output download like the one described in Download, above.
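The idea of combining clusters through a shared label, mentioned under Cluster labeling, can be sketched in Python. This article does not say which field of the export holds the label, so the example takes the labels as a separate, hypothetical mapping from cluster ID to label; `combine_by_label` and the sample values are likewise assumptions for illustration.

```python
from collections import defaultdict


def combine_by_label(rows, labels):
    """Group object IDs by cluster label.

    rows:   iterable of (cluster_id, object_id) pairs
    labels: dict mapping cluster_id to a user-assigned label (assumed input)
    Clusters without a label keep their own cluster ID as the group key.
    """
    combined = defaultdict(list)
    for cluster_id, object_id in rows:
        key = labels.get(cluster_id, cluster_id)
        combined[key].append(object_id)
    return dict(combined)


# Hypothetical data: clusters c1 and c3 were given the same label.
rows = [("c1", "obj-1"), ("c2", "obj-2"), ("c3", "obj-3")]
labels = {"c1": "high-value", "c3": "high-value"}
merged = combine_by_label(rows, labels)
```

Here the objects from c1 and c3 end up in one group, which mirrors the effect of sorting the export file by label.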