1. When true distribution is outside the model
Singular learning theory currently assumes that the true distribution lies inside the family of models under consideration. One problem is to extend, or at least understand, the theory when this assumption fails.
Problem 1.1.
Suppose the true probability distribution $q$ of a data-generating process lies outside the family of models $\mathcal{M}$ under consideration. Under what conditions does the posterior distribution concentrate on the distribution in $\mathcal{M}$ with the smallest KL divergence to $q$?
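For a parametrized family $\mathcal{M} = \{p_\theta\}$ (a standard framing in the misspecified setting; the parametrization itself is an assumption not stated in the problem), the natural candidate limit is the pseudo-true parameter minimizing the KL divergence to $q$:

```latex
\theta^{*} \;=\; \operatorname*{arg\,min}_{\theta}\; \mathrm{KL}\bigl(q \,\|\, p_\theta\bigr)
        \;=\; \operatorname*{arg\,min}_{\theta}\; \mathbb{E}_{X \sim q}\!\left[\log \frac{q(X)}{p_\theta(X)}\right].
```

The problem then asks when the posterior concentrates around $p_{\theta^*}$, and this minimizer need not be unique in singular models.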
Problem 1.2.
[Russell] Understand the asymptotics of the stochastic complexity when the true distribution is not in any model under consideration.
Problem 1.3.
As a concrete example, suppose the data is generated from a log-normal distribution, and suppose we try to fit two model classes: (a) a normal distribution, and (b) a gamma distribution. Understand the asymptotic behavior in this case.
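A minimal numerical sketch of this setup (the specific log-normal parameters, sample size, and seed are illustrative choices, not part of the problem): draw data from a log-normal distribution and fit both misspecified model classes by maximum likelihood. The class with the higher average log-likelihood is the one whose KL-closest member is nearer to the true distribution $q$.

```python
import numpy as np
from scipy import stats

# True distribution q: log-normal (parameters chosen for illustration).
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

# (a) Normal model class: MLE is the sample mean and standard deviation.
mu_hat, sd_hat = data.mean(), data.std()
ll_normal = stats.norm.logpdf(data, loc=mu_hat, scale=sd_hat).sum()

# (b) Gamma model class: MLE of shape and scale, location fixed at 0.
shape_hat, _, scale_hat = stats.gamma.fit(data, floc=0)
ll_gamma = stats.gamma.logpdf(data, shape_hat, loc=0, scale=scale_hat).sum()

# (1/n) * log-likelihood estimates -E_q[log p_theta] up to the constant
# entropy of q, so comparing the two fits compares KL divergences to q.
print(f"normal: {ll_normal:.1f}  gamma: {ll_gamma:.1f}")
```

Since the log-normal is positive and right-skewed, the gamma family (also positive and skewed) fits markedly better than the symmetric normal here; neither class contains $q$, which is exactly the regime Problems 1.1 and 1.2 ask about.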
Problem 1.4.
[Yongli Zhang] As another example, suppose the true model is the regression $Y = \exp(X) + \epsilon$, where $\epsilon \sim N(\mu,\sigma^2)$, and suppose we try to fit a linear regression model $y = bx + \epsilon$ to the data.
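A small simulation of this example (the choices $X \sim N(0,1)$, $\mu = 0$, a small noise scale, and a no-intercept fit are assumptions added for illustration, not part of the problem statement): generate data from $Y = \exp(X) + \epsilon$ and fit the misspecified linear model by least squares.

```python
import numpy as np

# True process: Y = exp(X) + eps; misspecified fit: y = b*x (no intercept).
# X ~ N(0,1) and the noise scale are illustrative assumptions.
rng = np.random.default_rng(1)
n = 20_000
x = rng.normal(size=n)
y = np.exp(x) + rng.normal(scale=0.1, size=n)

# Least-squares slope for the no-intercept model: b = sum(x*y) / sum(x*x).
b_hat = (x * y).sum() / (x * x).sum()

# Under X ~ N(0,1), the pseudo-true slope is
#   b* = E[X exp(X)] / E[X^2] = e^{1/2},
# the b minimizing E[(Y - bX)^2], i.e. the KL-closest linear model.
print(b_hat, np.exp(0.5))
```

The fitted slope converges to the pseudo-true value $b^* = e^{1/2}$ rather than to any "true" parameter, since no member of the linear family matches the data-generating process.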
Cite this as: AimPL: Singular learning theory, available at http://aimpl.org/singularlearning.