Inferring Arithmetic Expressions from Data

Abstract

We present a framework for learning arithmetic expressions from a set of observations. Our aim is to introduce a Bayesian method for the problem known as equation discovery. The method measures a degree of belief (a posterior probability) for each expression in a set of hypothesized expressions in order to find those that best explain the observed data, and this measure is the basis for choosing one hypothesis over another. We distinguish two tasks in the process of equation discovery: exploring the space of arithmetic expressions, and evaluating how well an expression describes the data. Separating the two allows us to investigate them independently. For the first task, we use a context-free grammar to construct a large set of expressions, which we take as our hypothesis space; each hypothesis is an arithmetic expression to be tested against the data. We also use the grammar to evaluate the complexity of each expression, which is presented to the model in the form of a prior probability. Our main focus here is the second task: posterior evaluation using a Bayesian formulation. The method tests a hypothesized expression against a set of provided samples that have quantitative input features and calculates a likelihood, which expresses the degree to which the hypothesis describes the data. The final posterior probability, calculated from the prior and the likelihood, is the measure by which each expression is judged.
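
As an illustration only, the two tasks can be sketched in a few lines of Python. Everything specific here is an assumption made for the sketch and is not taken from the paper: the toy grammar `E -> x | y | (E + E) | (E * E)`, the production probabilities in `RULE_PROB`, the Gaussian noise model with a fixed `sigma`, and the names `enumerate_expressions`, `log_likelihood`, and `posterior`. The prior of an expression is the product of the probabilities of the productions in its derivation, so longer (more complex) expressions receive a lower prior; the posterior is obtained by normalizing prior times likelihood over the enumerated hypotheses.

```python
import math
from itertools import product

# Hypothetical toy grammar: E -> x | y | (E + E) | (E * E).
# Assumed production probabilities (they sum to 1).
RULE_PROB = {"x": 0.3, "y": 0.3, "+": 0.2, "*": 0.2}

# Assumed toy observations: ((x, y), target), generated from x * y + x.
SAMPLES = [((1, 2), 3.0), ((2, 3), 8.0), ((3, 1), 6.0), ((0, 4), 0.0), ((2, 2), 6.0)]


def enumerate_expressions(depth):
    """Task 1: list (text, function, log_prior) for all derivations up to `depth`."""
    leaves = [
        ("x", lambda x, y: x, math.log(RULE_PROB["x"])),
        ("y", lambda x, y: y, math.log(RULE_PROB["y"])),
    ]
    if depth == 0:
        return leaves
    smaller = enumerate_expressions(depth - 1)
    result = list(leaves)
    for (ta, fa, pa), (tb, fb, pb) in product(smaller, repeat=2):
        # Default arguments bind the sub-expression functions at definition time.
        result.append((
            f"({ta} + {tb})",
            lambda x, y, fa=fa, fb=fb: fa(x, y) + fb(x, y),
            math.log(RULE_PROB["+"]) + pa + pb,
        ))
        result.append((
            f"({ta} * {tb})",
            lambda x, y, fa=fa, fb=fb: fa(x, y) * fb(x, y),
            math.log(RULE_PROB["*"]) + pa + pb,
        ))
    return result


def log_likelihood(fn, samples, sigma):
    """Gaussian log-likelihood of the targets given the expression's outputs."""
    norm = -math.log(sigma * math.sqrt(2.0 * math.pi))
    return sum(norm - 0.5 * ((t - fn(x, y)) / sigma) ** 2 for (x, y), t in samples)


def posterior(hypotheses, samples, sigma=0.5):
    """Task 2: normalized posterior over the hypotheses, sorted best first."""
    scores = [(text, fn, lp + log_likelihood(fn, samples, sigma))
              for text, fn, lp in hypotheses]
    top = max(s for _, _, s in scores)
    # Log-sum-exp keeps the normalization numerically stable.
    log_z = top + math.log(sum(math.exp(s - top) for _, _, s in scores))
    ranked = [(text, fn, math.exp(s - log_z)) for text, fn, s in scores]
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


if __name__ == "__main__":
    for text, _, prob in posterior(enumerate_expressions(2), SAMPLES)[:3]:
        print(f"{text}: {prob:.3f}")
```

Because the grammar generates syntactically distinct but algebraically equal expressions, such as `((x * y) + x)` and `(x + (x * y))`, posterior mass in this sketch is split among equivalent forms. Truncating the enumeration at a fixed depth also means the priors do not sum to one, which the final normalization absorbs.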