Thresholding, a feature of simClassify and simClassify+ predictors, sets a cutoff that splits the resulting confidence values into a true or false category for binomial predictions. Thresholds apply only to the confidence level of the positive class of a prediction (i.e. the class with the lower number of predictions). If the confidence value for the positive class exceeds the threshold value, the result is considered true and the positive class is predicted. If the confidence value is less than the threshold, the result is false and the negative class is predicted.
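The rule above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function name and arguments are hypothetical, and treating a confidence exactly equal to the threshold as negative is an assumption, since the text only covers the "exceeds" and "less than" cases.

```python
def predict_positive(positive_confidence: float, threshold: float) -> bool:
    """Return True when the positive class should be predicted.

    Assumption: a confidence exactly equal to the threshold counts as
    negative; the behavior at equality is not specified in this article.
    """
    return positive_confidence > threshold


# A confidence of 0.62 against a threshold of 0.5 predicts the positive class.
print(predict_positive(0.62, 0.5))
# A confidence of 0.41 against the same threshold predicts the negative class.
print(predict_positive(0.41, 0.5))
```

Raising the threshold makes positive predictions rarer, and lowering it makes them more frequent.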
In the web user interface, this parameter can be set through the threshold input box (highlighted with a blue box) as shown in the following image:
The simClassify+ model creation page displays the threshold input box in a similar manner. However, for both simClassify and simClassify+, when creating a grid, an additional field is added to Advanced Options because the thresholding works differently in grids.
In a grid, after each experiment is run, the distribution of training probabilities is partitioned into as many sections as the value of this new field. The partition boundaries are placed so that the sections form uniform quantiles of the training probabilities. Each boundary is effectively a threshold, and every one of them is tested to produce its own result metrics. Because the boundaries are derived from each experiment's own probability distribution, the confidence levels at which the thresholds fall will generally differ between experiments. This automates the search for a good threshold for making predictions.
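As a rough sketch of this idea, the Python below derives thresholds from the quantiles of a set of training confidences and scores each one. It uses the standard library's statistics.quantiles. The function names are hypothetical, and returning exactly k cut points for a field value of k is an assumption, because the article does not state how sections map to cut points in the product.

```python
from statistics import quantiles


def quantile_thresholds(train_confidences, k):
    """Return k candidate thresholds at uniform quantiles.

    Assumption: k thresholds means k interior cut points, which split
    the training confidences into k + 1 equally populated sections.
    """
    return quantiles(train_confidences, n=k + 1, method="inclusive")


def accuracy_at(threshold, confidences, labels):
    """Fraction of rows where 'confidence > threshold' matches the label."""
    correct = sum((c > threshold) == y for c, y in zip(confidences, labels))
    return correct / len(labels)


# Toy data: positive-class confidences and the true labels.
confidences = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [False] * 5 + [True] * 4

thresholds = quantile_thresholds(confidences, 3)
for t in thresholds:
    print(t, accuracy_at(t, confidences, labels))
```

Because the cut points follow the data, a different experiment with a different confidence distribution would produce different thresholds from the same field value.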
Furthermore, the “Threshold” field in the grid creation tab is used to explicitly set threshold values to test. These explicit values are combined, as a union, with the thresholds generated from the “Number of Thresholds” field, and every threshold in the combined set is used to determine experiment accuracy. In the above example, there will be 102 distinct thresholds set for the grid experiments (0.3 and 0.5, plus 100 determined from the probability distribution).
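The union of the two sources can be illustrated with a short Python sketch. The function name is hypothetical and the derived values below are placeholders rather than real quantiles. Note that a count such as 102 assumes none of the explicit values coincides with a derived one, since a union keeps duplicates only once.

```python
def combined_thresholds(explicit, derived):
    """Return the sorted union of explicit and distribution-derived thresholds."""
    return sorted(set(explicit) | set(derived))


explicit = [0.3, 0.5]
# Placeholder for 100 thresholds derived from a probability distribution.
derived = [i / 101 for i in range(1, 101)]

all_thresholds = combined_thresholds(explicit, derived)
print(len(all_thresholds))
```

If an explicit value happened to match a derived threshold exactly, the combined set would contain fewer entries than the simple sum of the two lists.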
Additionally, the Grid Results page contains not only the Grid Table tab, which displays the results for each experiment, but also a new Grid Result Table. This table displays information for each tested threshold per experiment in the grid, including an accuracy column that shows how accurate each tested threshold is for making predictions:
You can specify the precision of the numbers displayed in this table on the Admin page.