Data Type Specifications tell the model what form the data in each column takes, so that it knows how to compare values properly. Unlike in other models, the class column is set to “CLASS_ITEM_SET”. This column must be duplicated in the data set, with the duplicate's column type set to “ITEM_SET”. The item set is what tells the model each individual’s historical preferences. More detail on the item set format comes in later sections.
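As a rough illustration of the duplication requirement, the sketch below copies a class column into a second column and assigns a type to every column. The column names (`user_id`, `purchases`, `purchases_history`, `age`) and the dictionary layout are hypothetical and are not the product's actual configuration format; only the type names come from this article.

```python
# Minimal sketch: duplicate the class column so one copy can be typed
# CLASS_ITEM_SET and the other ITEM_SET.
# All column names here are hypothetical examples.

def duplicate_class_column(rows, class_col, copy_col):
    """Return new rows with the value of class_col copied verbatim into copy_col."""
    return [{**row, copy_col: row[class_col]} for row in rows]


rows = [
    {"user_id": "u1", "purchases": "apple:2;pear:1", "age": "34"},
    {"user_id": "u2", "purchases": "kiwi:5", "age": "27"},
]

prepared = duplicate_class_column(rows, "purchases", "purchases_history")

# One type per column; the duplicate is typed ITEM_SET.
column_types = {
    "user_id": "ID",
    "purchases": "CLASS_ITEM_SET",
    "purchases_history": "ITEM_SET",
    "age": "REAL",
}
```

Building new rows rather than modifying the originals keeps the source data untouched, which makes it easier to check that the two columns are identical.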
Column type | Description
ID | A mandatory field that uniquely identifies each object.
CLASS_ITEM_SET | A mandatory field that specifies the field to be classified. Uses the item set format: a series of values with weights, formatted as item1:weight1;item2:weight2;item3:weight3.
ITEM_SET | A duplicate of the column of type CLASS_ITEM_SET.
REAL | Numerical values.
NOMINAL | Values that do not bear a quantitative relationship with each other (i.e., strings and numbers which represent non-numerical information).
MULTI_PLAIN | Multiple NOMINAL values separated by spaces. Not language-specific.
MULTI_ENGLISH | Multiple NOMINAL values separated by spaces. The text is in English.
MULTI_SPANISH | Multiple NOMINAL values separated by spaces. The text is in Spanish.
MULTI_JAPANESE | Multiple NOMINAL values separated by spaces. The text is in Japanese.
IGNORE | The column is ignored by the program.
NULL_INDICATOR | Identifies the presence or absence of non-numerical data, assigning different weights to cells with data versus cells without data.
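The item set format used by CLASS_ITEM_SET and ITEM_SET (item1:weight1;item2:weight2;item3:weight3) can be read with a few lines of code. The following is a minimal sketch, not the product's own parser: the function name `parse_item_set` is hypothetical, and it assumes that weights are numeric and that item names contain no semicolons.

```python
# Minimal sketch of reading an item set cell such as "apple:2;pear:0.5".
# Assumptions (not confirmed by this article): weights are numeric, and
# item names do not contain ";".

def parse_item_set(cell):
    """Convert 'item1:weight1;item2:weight2' into a dict of item -> weight."""
    items = {}
    for pair in cell.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty cells and trailing semicolons
        # Split on the last colon so the weight is always the final part.
        name, _, weight = pair.rpartition(":")
        items[name.strip()] = float(weight)
    return items


preferences = parse_item_set("apple:2;pear:0.5")
```

A pair with a missing or non-numeric weight raises a `ValueError` under these assumptions, which can help catch malformed cells before the data is loaded.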