Some findings from the paper "Discovering Influential Factors in VAEs".
We believe that data can be generated by a number of independent generative factors. In the work on discovering the influential generative factors of the variational autoencoder (VAE), we found that the KL-divergence term optimized by the VAE promotes both the independence of the factors and the sparsity of the mutual information between each factor and the data. The influential generative factors extracted by the VAE capture the variation in the data well.
If we have $$p(z_1,\cdots,z_h|x)=p(z_1|x)\cdots p(z_h|x)$$ and $$p(z_1,\cdots,z_h)=p(z_1)\cdots p(z_h),$$ then, according to the theorem in the paper, the mutual information separates across dimensions: $$I(z_1,\cdots,z_h;x)=I(z_1;x)+\cdots+I(z_h;x).$$ Therefore the second objective in the VAE decomposes as $$E_{x}D_{KL}(q(\mathbf{z}|x)||p(\mathbf{z}))=\sum_{i=1}^{h}\left[I(z_i;x)+D_{KL}(q(z_i)||p(z_i))\right].$$ Because the mutual information is separable, minimizing this term drives $I(z_i;x)$ toward zero in many dimensions (posterior collapse on those dimensions), which allows us to use a small number of factor dimensions and still achieve good performance on subsequent classification or reconstruction tasks.
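To make the per-dimension decomposition concrete, here is a minimal numerical sketch (our own illustration, not from the paper), assuming a diagonal Gaussian posterior $q(\mathbf{z}|x)=\mathcal{N}(\mu(x),\mathrm{diag}(\sigma^2(x)))$ and a standard-normal prior; the encoder outputs below are made up. Dimensions whose posterior matches the prior contribute zero KL, which is what posterior collapse looks like at the level of individual factors.

```python
import numpy as np

# Minimal sketch (not the paper's code): for a diagonal Gaussian posterior
# q(z|x) = N(mu(x), diag(sigma(x)^2)) and a standard-normal prior p(z),
# the VAE's KL term separates into per-dimension contributions.

def kl_per_dim(mu, log_var):
    """KL( N(mu_i, sigma_i^2) || N(0, 1) ) for each latent dimension i."""
    return 0.5 * (mu**2 + np.exp(log_var) - 1.0 - log_var)

# Hypothetical encoder outputs for one input x, with h = 4 latent dimensions.
mu      = np.array([1.2, 0.0, -0.8, 0.0])
log_var = np.array([-1.5, 0.0, -0.9, 0.0])

per_dim = kl_per_dim(mu, log_var)

# The total KL is exactly the sum of the per-dimension terms, mirroring
# the separability of the mutual information above.
print(per_dim)        # dims 2 and 4 are exactly 0: collapsed to the prior
print(per_dim.sum())  # total KL contribution for this x
```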
According to the properties of the mutual-information lasso, it yields an $\ell_1$-type penalty on the per-dimension mutual informations, which promotes a sparse solution in which most $I(z_i;x)$ are exactly zero.
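As a short worked step (our own restatement, not quoted from the paper): since mutual information is nonnegative, the sum over dimensions is exactly an $\ell_1$ norm of the mutual-information vector, $$\sum_{i=1}^{h} I(z_i;x)=\sum_{i=1}^{h}\left|I(z_i;x)\right|=\left\|\big(I(z_1;x),\cdots,I(z_h;x)\big)\right\|_1,$$ and minimizing an $\ell_1$ penalty, as in the lasso, pushes many coordinates to exactly zero, so only a few "influential" factor dimensions retain information about $x$.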