Interpreting Multinomial Fold Experiments

The results of a multinomial fold experiment are shown in a table of model metrics, similar to grid results.

The Fold Experiments List on the Fold Experiments page displays the Fold Experiments that have been run. You can download results or use the Visualization option to examine detailed metrics.


The Visualization link takes you to a page with the result details, either showing metrics for binary classifiers:


or a Visualization link for multinomial classifiers:


Selecting “Visualization” brings you to the “Fold Visualization” page.

Results are split by the engine’s confidence in each prediction, at intervals of 0.10. P0 is the score over predictions with confidence of at least 0.00 — that is, all predictions. This ranges up to P90, the score over predictions with confidence of at least 0.90.

At the top of the page you will be given a score for the model at the selected confidence level. The score is the model’s correct classifications divided by the total number of queries.
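As a minimal sketch of how such per-threshold scores could be computed (every name and value below is invented for illustration — this is not the product’s code; at P0 the denominator is all queries as described above, while at higher thresholds this sketch divides by the queries that remain at that confidence, which is an assumption):

```python
# Illustrative only: a per-confidence-level score.
# `predictions` pairs each query's (predicted class, confidence, actual class).
predictions = [
    ("THEFT/OTHER", 0.95, "THEFT/OTHER"),
    ("THEFT F/AUTO", 0.62, "THEFT/OTHER"),
    ("THEFT/OTHER", 0.41, "THEFT F/AUTO"),
    ("THEFT F/AUTO", 0.88, "THEFT F/AUTO"),
]

def score_at(preds, threshold):
    """Correct classifications divided by the number of queries whose
    prediction confidence is at least `threshold` (assumed denominator)."""
    kept = [(p, a) for p, conf, a in preds if conf >= threshold]
    if not kept:
        return None
    return sum(p == a for p, a in kept) / len(kept)

for level in range(0, 100, 10):  # P0, P10, ..., P90
    print(f"P{level}: {score_at(predictions, level / 100)}")
```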

The first visualization is a confusion matrix, which shows the accuracy of the model for each class at various confidence levels. You can see the score in each cell of the matrix by hovering your mouse over it. The cells are also color-coded by accuracy, with darker blues being more accurate and lighter yellows being less accurate.



Accuracy scores are generated for each class, presented with predicted values as columns and actual values as rows. In the grid above, the top-left cell holds predictions of class “THEFT/OTHER” that were actually class “THEFT/OTHER”, with an accuracy score of 0.5961. To its right are predictions of “THEFT F/AUTO” where the actual class was “THEFT/OTHER”, with an accuracy score of 0.2645. To the right of that are predictions of “ASSAULT W/DANGEROUS WEAPON” where the actual class was “THEFT/OTHER”, with an accuracy score of 0.0043, and so on. The next row covers the actual class “THEFT F/AUTO”: its first cell holds predictions of “THEFT/OTHER”, with an accuracy score of 0.3155, followed by correct predictions of “THEFT F/AUTO”, with an accuracy score of 0.5757, and so on.
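As an illustration of how row-normalized scores like these arise, the matrix can be derived from (actual, predicted) pairs; the class names and counts below are invented for this example and are not the product’s data:

```python
from collections import Counter

# Each pair records one prediction: (actual class, predicted class).
pairs = [
    ("THEFT/OTHER", "THEFT/OTHER"),
    ("THEFT/OTHER", "THEFT F/AUTO"),
    ("THEFT/OTHER", "THEFT/OTHER"),
    ("THEFT F/AUTO", "THEFT F/AUTO"),
    ("THEFT F/AUTO", "THEFT/OTHER"),
]

counts = Counter(pairs)
classes = sorted({a for a, _ in pairs} | {p for _, p in pairs})

# Each cell = count(actual=row, predicted=col) / count(actual=row),
# so the scores in every row sum to 1.0.
matrix = {}
for actual in classes:
    row_total = sum(counts[(actual, p)] for p in classes)
    for predicted in classes:
        matrix[(actual, predicted)] = counts[(actual, predicted)] / row_total
```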


The confusion matrix also provides information on how much of the dataset is covered by the matrix, visible just to the right of it. Two values are shown: the Object Processed Percentage (OPP), the percentage of classes that have been processed, and the Object Count Percent on Dataset (OCD), the percentage of objects that have been processed. Both scores are cumulative. The confusion matrix above has nine classes. The first class (THEFT/OTHER) has an OPP of 1/9, or 11.11%, and an OCD of 31.45%. The next class (THEFT F/AUTO) adds another 11.11% to the OPP, giving 22.22%; the OCD increases by 26.31%, giving an OCD of 57.76%. So these first two classes account for the majority of the objects in the dataset. By the last row/class, all of the classes and all of the objects have been processed, so both values are 100%.
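The cumulative arithmetic can be sketched as follows; the per-class object counts are invented so that the first two rows reproduce the 31.45% and 57.76% figures described above:

```python
# Invented per-class object counts (9 classes, 10,000 objects total),
# in the same row order as the matrix.
class_counts = [3145, 2631, 1200, 900, 700, 500, 400, 300, 224]

total_objects = sum(class_counts)
n_classes = len(class_counts)

rows = []
ocd = 0.0
for i, count in enumerate(class_counts, start=1):
    opp = i / n_classes * 100            # cumulative share of classes
    ocd += count / total_objects * 100   # cumulative share of objects
    rows.append((round(opp, 2), round(ocd, 2)))
```

By the final row, both cumulative values reach 100%.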

Below the confusion matrix you can see a chart with the engine’s score at each confidence level and the percentage of objects which have predictions at that confidence.

Downloaded results will consist of a CSV file with the model’s average score and object insertion percentage at each confidence level.
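The exact export format isn’t specified here, but a file of that general shape could be written as follows (the column names and values are assumptions for illustration, not the product’s actual export):

```python
import csv

# Made-up per-confidence-level results: (level, average score, coverage %).
rows = [
    {"confidence": lvl / 100, "score": score, "coverage_pct": cov}
    for lvl, score, cov in [(0, 0.62, 100.0), (10, 0.64, 97.2), (90, 0.91, 12.5)]
]

with open("fold_results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["confidence", "score", "coverage_pct"])
    writer.writeheader()
    writer.writerows(rows)
```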

The bottom of the page has a table with the data from the dataset and the predicted class. When the predicted class and actual class match, the classes are shown in blue; otherwise, they are shown in orange.
