Bias Detection


A model can learn socially harmful biases and other undesirable correlations when its training data is biased or otherwise unreliable. These biases do not reflect the desired outputs of the model, which for moral, legal, or regulatory reasons should perform equally well for people who belong to a protected class and for those who do not.

Semi-supervised clustering provides a practical way to evaluate a machine learning model for potentially harmful bias. ML Studio provides statistical tests (One-Way ANOVA for REAL and META_REAL, Chi Square Goodness of Fit for NOMINAL and META_NOMINAL) to evaluate whether there is a statistically significant difference in the distribution of a feature within and between groups; in this case, between the selected clusters and the full data set. This capability allows a model to be tested to confirm that it performs equally well for all people whom it processes. A model is “fair” when it has similar precision for members of a group and for those not in that group.

Adding Bias Detection to a Model

Bias Detection works for NOMINAL and REAL features used in the model as well as those features with a Metadata spec type (META_NOMINAL and META_REAL).

Bias Detection requires the Specification Analyzer to have been completed on the training data set.

Accessing Bias Detection

Bias Detection is available through the Cluster Details page in either Cluster View Mode or Compare Mode.

Selected_Cluster_View.png

Navigate to Cluster Details, and select the BIAS tab. Here you will see the Bias Statistics. The column you want to see Bias Statistics for can be selected from the searchable dropdown. All NOMINAL and REAL columns used in the model are available, as well as any features set as META_NOMINAL and META_REAL.

Bias_Search_.png

Reading and Interpreting Bias Test Results

ML Studio Bias Detection provides two sets of statistical test results, INTERCLUSTER and INTRACLUSTER, and a breakdown of the feature distribution within the cluster in the CATEGORY DISTRIBUTION tab.

INTERCLUSTER evaluates the distribution of values of the selected testing variable between the selected cluster and the entire data set, using either an ANOVA or a Chi Square test depending on the data type. INTRACLUSTER uses an ANOVA test to evaluate how classification confidence values are distributed across the different values of the variable within the selected cluster.

Intercluster Nominal Value Bias Statistics

Bias_Test_Chi_Square_Test.png

For NOMINAL and META_NOMINAL variables, Bias Testing conducts a Chi Square Goodness of Fit Test, returning the three values produced by the test:

Chi Square: The test statistic. Lower values mean it is more likely that the distributions observed in the cluster and in the full data set come from the same population of data.

P Value: The probability of seeing differences in the distribution at least this large if both samples actually came from the same population. Lower values mean the observed differences are less likely to be due to random chance.

Total Categorical Values: The number of different values in the column across the entire data set.
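The three values above can be illustrated with a short sketch. This is not ML Studio's implementation; it is a minimal, standard-library-only example that assumes the expected counts for the cluster are obtained by scaling the full data set's category proportions to the cluster size, and that every value in the cluster also appears in the full data set. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Closed form for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Closed form for odd df, built on the complementary error function.
    tail = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + total


def goodness_of_fit(cluster_values, dataset_values):
    """Return (chi square, p value, number of categories) for one feature."""
    categories = sorted(set(dataset_values))
    observed = Counter(cluster_values)
    reference = Counter(dataset_values)
    n_cluster, n_dataset = len(cluster_values), len(dataset_values)
    chi_square = 0.0
    for category in categories:
        # Expected count if the cluster followed the full data set's proportions.
        expected = n_cluster * reference[category] / n_dataset
        chi_square += (observed[category] - expected) ** 2 / expected
    df = len(categories) - 1
    return chi_square, chi_square_sf(chi_square, df), len(categories)


# Hypothetical feature values for the full data set and one selected cluster.
dataset = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
cluster = ["A"] * 4 + ["B"] * 6 + ["C"] * 10

chi_square, p_value, total_categories = goodness_of_fit(cluster, dataset)
print(f"Chi Square: {chi_square:.2f}")
print(f"P Value: {p_value:.4f}")
print(f"Total Categorical Values: {total_categories}")
```

In this made-up example the cluster over-represents category C relative to the full data set, so the Chi Square value is large and the P Value is small.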

Intercluster Real Value Bias Statistics

One_Way_ANOVA.png

For REAL and META_REAL variables, Bias Testing conducts a One Way ANOVA (Analysis of Variance), returning the 9 values that comprise the test and its components:

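Degrees of Freedom Between Groups: The number of groups being tested minus 1.

Degrees of Freedom Within Groups: The total number of records being tested minus the number of groups.

F: The test statistic, equal to Mean Squares Between Groups divided by Mean Squares Within Groups. Lower values mean it is more likely that the distributions are the same.

Mean Squares Between Groups: The Sum of Squares Between Groups divided by the Degrees of Freedom Between Groups; the average variation between the group means.

Mean Squares Within Groups: The Sum of Squares Within Groups divided by the Degrees of Freedom Within Groups; the average variation within the groups.

Omega Squared: The estimated effect size, expressed as the share of the feature's total variance that is explained by group membership.

P Value: The probability of seeing differences in the distribution at least this large if the groups actually came from the same population. Lower values mean the observed differences are less likely to be due to random chance.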

Sum of Squares Between Groups: Total deviation from the mean between groups

Sum of Squares Within Groups: Total deviation from the mean within groups

Total Sum of Squares: The sum of both Sum of Squares Between and Within Groups
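To show how the values in this list relate to one another, here is a minimal sketch of a One Way ANOVA table built with the Python standard library only. It is an illustration under stated assumptions, not ML Studio's code: the function name, dictionary keys, and sample values are hypothetical, and Omega Squared uses the common textbook estimator. The P Value is left out because it requires the F distribution, which the standard library does not provide (a statistics package such as SciPy is one option).

```python
from statistics import fmean


def one_way_anova(groups):
    """Return the ANOVA table entries for a list of groups of numbers."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of records
    grand_mean = fmean([x for g in groups for x in g])
    means = [fmean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of each record around its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ss_total = ss_between + ss_within

    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    return {
        "df_between": df_between,
        "df_within": df_within,
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
        # Textbook estimator of the share of variance explained by the groups.
        "omega_squared": (ss_between - df_between * ms_within) / (ss_total + ms_within),
    }


# Hypothetical REAL feature values for a selected cluster and the full data set.
cluster = [5, 6, 7]
full_data_set = [1, 2, 3, 2, 3, 4, 5, 6, 7]

for name, value in one_way_anova([cluster, full_data_set]).items():
    print(f"{name}: {value:.4f}")
```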

007.png

Intracluster Bias Test Results

The INTRACLUSTER Bias Test conducts an ANOVA test comparing the distribution of Classification Scores within the tested cluster for the various feature values (for Categorical features) or feature value bins (for REAL features). The results of this test identify if the model performs differently for records with different values within the selected cluster.

Intra_Cluster_ANOVA.png

Intracluster Bias Statistics

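Degrees of Freedom Between Groups: The number of groups being tested minus 1.

Degrees of Freedom Within Groups: The total number of records being tested minus the number of groups.

F: The test statistic, equal to Mean Squares Between Groups divided by Mean Squares Within Groups. Lower values mean it is more likely that the distributions are the same.

Mean Squares Between Groups: The Sum of Squares Between Groups divided by the Degrees of Freedom Between Groups; the average variation between the group means.

Mean Squares Within Groups: The Sum of Squares Within Groups divided by the Degrees of Freedom Within Groups; the average variation within the groups.

Omega Squared: The estimated effect size, expressed as the share of the variance in prediction confidence that is explained by the feature value.

P Value: The probability of seeing differences in the distribution at least this large if the groups actually came from the same population. Lower values mean the observed differences are less likely to be due to random chance.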

Sum of Squares Between Groups: Total deviation from the mean between groups

Sum of Squares Within Groups: Total deviation from the mean within groups

Total Sum of Squares: The sum of both Sum of Squares Between and Within Groups
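The grouping step behind the INTRACLUSTER test can be sketched as follows. This is an assumption-laden illustration rather than the product's implementation: the record layout (a dictionary with a "score" key), the function names, and the bin edges are all hypothetical, and the binning ML Studio applies to REAL features may differ from the simple edge-based bins used here.

```python
from collections import defaultdict
from statistics import fmean


def group_scores(records, feature, bins=None):
    """Group classification scores by feature value, or by bin when bin edges are given."""
    groups = defaultdict(list)
    for record in records:
        value = record[feature]
        if bins is not None:
            # Bin index = number of bin edges that lie strictly below the value.
            value = sum(value > edge for edge in bins)
        groups[value].append(record["score"])
    return dict(groups)


def f_statistic(groups):
    """One-way ANOVA F statistic for a mapping of group label to scores."""
    scores = [s for g in groups.values() for s in g]
    grand_mean = fmean(scores)
    means = {label: fmean(g) for label, g in groups.items()}
    ss_between = sum(len(g) * (means[label] - grand_mean) ** 2 for label, g in groups.items())
    ss_within = sum((s - means[label]) ** 2 for label, g in groups.items() for s in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical records from one selected cluster.
cluster_records = [
    {"gender": "F", "score": 0.90},
    {"gender": "F", "score": 0.80},
    {"gender": "F", "score": 0.85},
    {"gender": "M", "score": 0.60},
    {"gender": "M", "score": 0.70},
    {"gender": "M", "score": 0.65},
]

by_gender = group_scores(cluster_records, "gender")
print("F:", round(f_statistic(by_gender), 4))
```

A large F here would indicate that classification scores differ between the feature values inside the cluster, which is the situation the INTRACLUSTER test is meant to surface.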

Bias Category Distribution

The Category Distribution tab has a table for both the selected cluster and the full data set, with the distribution and the model’s precision for each value in the category. REAL values use the same binning as the clustering visualization.

Significant differences in model accuracy within a cluster suggest that the model is less accurate with one group than another and therefore may perform in a biased manner.
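As a rough illustration of what the two Category Distribution tables contain, the sketch below computes each value's share of records and the model's precision for that value, once for a cluster and once for the full data set. It assumes a binary classifier with a positive class of 1 and hypothetical field names ("group", "predicted", "actual"); ML Studio's own calculation is not documented here and may differ.

```python
from collections import defaultdict


def category_distribution(records, feature, positive=1):
    """Return share of records and precision for each value of a feature."""
    counts = defaultdict(lambda: {"n": 0, "tp": 0, "fp": 0})
    for record in records:
        entry = counts[record[feature]]
        entry["n"] += 1
        if record["predicted"] == positive:
            # A positive prediction is a true or false positive depending on the label.
            entry["tp" if record["actual"] == positive else "fp"] += 1

    total = len(records)
    table = {}
    for value, entry in counts.items():
        predicted_positive = entry["tp"] + entry["fp"]
        table[value] = {
            "share": entry["n"] / total,
            # Precision is undefined when the model made no positive predictions.
            "precision": entry["tp"] / predicted_positive if predicted_positive else None,
        }
    return table


# Hypothetical records; the cluster is a subset of the full data set.
full_data_set = [
    {"group": "A", "predicted": 1, "actual": 1},
    {"group": "A", "predicted": 1, "actual": 1},
    {"group": "A", "predicted": 1, "actual": 0},
    {"group": "A", "predicted": 0, "actual": 0},
    {"group": "B", "predicted": 1, "actual": 1},
    {"group": "B", "predicted": 1, "actual": 0},
    {"group": "B", "predicted": 0, "actual": 1},
    {"group": "B", "predicted": 0, "actual": 0},
]
cluster = full_data_set[:5]

for name, records in (("cluster", cluster), ("full data set", full_data_set)):
    print(name, category_distribution(records, "group"))
```

Comparing the precision column across values within the same table is what reveals whether the model is less reliable for one group than for another.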

 
