simClassify Model Specifications

Model Specifications modify a model’s behavior and access to server resources.

simClassify accepts the following Model Specifications:

Pivots

The number of primary search points in the engine. Increasing this value may improve query speed at the cost of longer training time. (Range 256 to 1024, default 256)

Probability

Minimum accepted probability that the distance between the result and the query will be within the Accepted Error range. Any result with a lower probability is discarded. (Range 0 to 1, default 0.95)

Accepted Error

Maximum accepted difference in distance between returned objects and the query object. (Minimum 1, default 1.2)
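
The ranges documented for Pivots, Probability, and Accepted Error can be checked before a model is submitted. The following is a minimal Python sketch and is not part of simClassify: the function name validate_specs and the dictionary keys are hypothetical, and only the ranges and defaults are taken from the descriptions above.

```python
# Hypothetical helper: key names are illustrative, not simClassify identifiers.
# Defaults and ranges are the ones documented for these three specifications.
DEFAULTS = {"pivots": 256, "probability": 0.95, "accepted_error": 1.2}


def validate_specs(specs):
    """Merge specs over the documented defaults and check the documented ranges."""
    unknown = set(specs) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown specification(s): {sorted(unknown)}")

    merged = {**DEFAULTS, **specs}

    if not 256 <= merged["pivots"] <= 1024:
        raise ValueError("pivots must be between 256 and 1024")
    if not 0 <= merged["probability"] <= 1:
        raise ValueError("probability must be between 0 and 1")
    if merged["accepted_error"] < 1:
        raise ValueError("accepted_error must be at least 1")

    return merged
```

Calling validate_specs({"pivots": 512}) returns the defaults with only Pivots overridden, while an out-of-range value raises ValueError.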

Bins

Specifies the number of ranges to be used in calculating the similarity of REAL columns. (Default 10)
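
To illustrate what dividing a REAL column into ranges can look like, here is a short Python sketch. It assumes equal-width ranges between a known minimum and maximum; this article does not say how simClassify draws its range boundaries, so the function bin_index and its parameters are hypothetical.

```python
def bin_index(value, lo, hi, bins=10):
    """Return the 0-based index of the equal-width range that contains value.

    Assumption: ranges are equal-width over [lo, hi]. Values outside the
    interval are clamped to the first or last range.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if bins < 1:
        raise ValueError("bins must be at least 1")

    clamped = min(max(value, lo), hi)
    width = (hi - lo) / bins
    # The maximum value would otherwise land in a non-existent extra range.
    return min(int((clamped - lo) / width), bins - 1)
```

Under this assumption, two REAL values that fall in the same range are treated as more alike than values in distant ranges, so a larger Bins value gives a finer-grained comparison.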

simSearch K

The number of results, k, returned by the nearest neighbor search. (Default 10)

Top columns

The number of columns to consider for each prediction. Note that columns with strings, such as Multi_English, can be divided into multiple columns for this purpose. (Default 20)

Length

The total number of classes to consider for each prediction. (Default 2)

Dense Mode

Sets the distance function, which affects how factors are weighted. (Default SMART; also accepts DEFAULT, MARQ3, EXCEEDS)

Energy Weight

Useful if one classification is expected to be a significantly larger proportion of the results. Accepts boolean values. (Default TRUE)

Classifier K

The number of nearest neighbors (CK) used in making the classification. (Default Auto Detect)
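
The role of this value can be shown with a generic k-nearest-neighbor vote. The Python sketch below is illustrative only: it assumes Euclidean distance and a simple majority vote, neither of which is confirmed as simClassify's method, and the function knn_classify is hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, examples, k):
    """Return the most common label among the k examples nearest to query.

    examples is a list of (vector, label) pairs. Euclidean distance and
    unweighted majority voting are assumptions made for this sketch.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not examples:
        raise ValueError("examples must not be empty")

    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

A small k follows the closest neighbors only, while a large k moves the result toward whichever class is most common overall.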

Threshold

The confidence level above which a class is considered an acceptable prediction. Non-default values are useful for unbalanced class distributions.
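
As a rough illustration of why the threshold matters for unbalanced classes, the Python sketch below keeps only the classes whose confidence is above a given threshold. The function acceptable_classes and the example labels are hypothetical, and the confidence values are assumed to lie between 0 and 1.

```python
def acceptable_classes(confidences, threshold):
    """Return the labels whose confidence is strictly above threshold.

    confidences maps each class label to a confidence value, assumed to be
    in the range 0 to 1. Labels are returned from highest to lowest confidence.
    """
    kept = [(label, conf) for label, conf in confidences.items() if conf > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [label for label, _ in kept]
```

For example, with confidences of 0.7 for "ok" and 0.3 for "fraud", a threshold of 0.5 accepts only "ok", while a threshold of 0.2 also accepts the rarer "fraud" class.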


simClassify’s parameters can be tuned using grids, folds, and Auto Tune. Please see Finding Optimal Parameter Values for Classification for details.
