"Extracting Models From Data Sets: An Experiment Using Notes-to-Self"
Abstract: We report results from an experiment designed to study how people extract patterns from their observations. The novel experimental design asks subjects to organize different sets of observations (data) with the goal of making predictions in similar situations. We study whether the predictions subjects make in each environment are consistent with their using some "model" that posits specific statistical relationships between different variables. We find that the predictions of most subjects can be rationalized by some model. Importantly, we find that the most commonly used model is the optimal one, in that it maximizes prediction accuracy. Deviations from the optimal model often involve the use of simpler models that fail to account for statistically relevant correlations in the data. Variation in the set of observations presented to subjects across environments allows us to test whether the way subjects learn from data displays a key aspect of causal reasoning: identification of conditional independence between variables. While we find strong evidence that it does, we also observe that failures to identify conditional independence increase with the noise in the data. Complemented by ancillary non-choice data that emerges as a by-product of our design, our results provide insights into how people form models of the world by studying data and how they use these models to make predictions.