Feature Stability

Mehmet Ugurbil

University of Minnesota

02-Jul-2018

Aim

1. Find out when a feature selector finds at least one Markov blanket.
2. See how stable strong features are compared to other types of features.
3. Compare various feature selectors on various types of datasets.

Null Hypothesis

Strong features have higher stability than others.

Experiment Design

1. Run feature selection on cross-validation sets drawn from small samples.
- - Calculate stability metrics for the selected features.
2. Construct an ROC curve in which the features are the samples, ranked by their stability, and the target is whether each feature is strong; summarize it by its AUC.
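
The two steps above can be sketched as follows. This is a minimal illustration, not the code used for the experiments: the function names, the choice of selection frequency and mean pairwise Jaccard similarity as stability metrics, and the toy selected sets are all assumptions made for the example. The AUC is computed with the rank-based (Mann-Whitney) formulation, with ties counted as one half.

```python
from itertools import combinations


def selection_frequency(selected_sets, n_features):
    """Per-feature stability: fraction of runs in which each feature was selected."""
    counts = [0] * n_features
    for selected in selected_sets:
        for feature in selected:
            counts[feature] += 1
    return [count / len(selected_sets) for count in counts]


def mean_jaccard(selected_sets):
    """Set-level stability: average Jaccard similarity over all pairs of runs."""
    sims = []
    for a, b in combinations(selected_sets, 2):
        union = a | b
        sims.append(len(a & b) / len(union) if union else 1.0)
    return sum(sims) / len(sims)


def auc(scores, labels):
    """Probability that a random positive scores above a random negative (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical output of a feature selector on four cross-validation sets
# (feature indices 0..5); features 0 and 1 are assumed to be the strong ones.
selected_sets = [{0, 1, 2}, {0, 2, 3}, {0, 2, 4}, {0, 1}]
is_strong = [1, 1, 0, 0, 0, 0]

stability = selection_frequency(selected_sets, n_features=6)
print(stability)                  # [1.0, 0.5, 0.75, 0.25, 0.25, 0.0]
print(auc(stability, is_strong))  # 0.875
print(mean_jaccard(selected_sets))
```

An AUC near 1 would indicate that ranking features by stability separates strong features from the rest; an AUC near 0.5 would indicate that stability carries no information about strongness.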

Observations

1. The ability of stability to detect strong features increases monotonically as sample size increases.
2. Performance is higher in the absence of multiplicity.

Dataset Descriptions

TIE-Net = Original simulated data - TIE near-faithful causal network.
TIE-Net-Reduced1 = TIE-Net with multiplicity removed according to the original graph.
TIE-Net-Reduced2 = TIE-Net with multiplicity removed using the TIE* algorithm.
- - Note that this is not a single dataset, but one dataset for each repeat at each sample size (550 total).
- - Consequently, feature stability is not meaningful for this dataset; it is included for completeness.
TIE-Net-Weak1 = TIE-Net with weak variables multiplied 50 times.
TIE-Net-Weak2 = TIE-Net-Weak1 with Gaussian noise, uniformly random deviation.
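
As a rough illustration of how the two Weak variants might be built, the sketch below replicates a single weak variable and optionally perturbs each copy. It rests on two assumptions that the descriptions above do not confirm: that "multiplied 50 times" means 50 copies of each weak variable, and that "uniformly random deviation" means each copy receives Gaussian noise whose standard deviation is drawn uniformly at random. The function name and the `max_sd` parameter are hypothetical.

```python
import random


def replicate_with_noise(column, n_copies=50, max_sd=1.0, seed=0):
    """Return n_copies noisy copies of one variable (a list of values).

    Assumption: each copy gets zero-mean Gaussian noise whose standard
    deviation is drawn uniformly from [0, max_sd]. With max_sd=0 the
    copies are exact replicas (the assumed Weak1 construction).
    """
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        sd = rng.uniform(0.0, max_sd)  # per-copy noise level (assumed)
        copies.append([x + rng.gauss(0.0, sd) for x in column])
    return copies


weak_variable = [0, 1, 1, 0, 1]  # toy values, not taken from TIE-Net
weak1_copies = replicate_with_noise(weak_variable, max_sd=0.0)  # exact replicas
weak2_copies = replicate_with_noise(weak_variable, max_sd=1.0)  # noisy replicas
print(len(weak1_copies), len(weak2_copies))  # 50 50
```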

Breakdown of Stability

Figure panels: Frame 1, Frame 2.

The End