We introduce a variational inference framework for training the Gaussian process latent variable model and thus performing Bayesian nonlinear dimensionality reduction. This method allows us to variationally integrate out the input variables of the Gaussian process and compute a lower bound on the exact marginal likelihood of the nonlinear latent variable model. The maximization of the variational lower bound provides a Bayesian training procedure that is robust to overfitting and can automatically select the dimensionality of the nonlinear latent space. We demonstrate our method on real-world datasets. The focus in this paper is on dimensionality reduction problems, but the methodology is more general. For example, our algorithm is immediately applicable for training Gaussian process models in the presence of missing or uncertain inputs.
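The lower bound mentioned above is an instance of the standard variational (Jensen) inequality. As a generic sketch, with $Y$ the observed data, $X$ the latent inputs being integrated out, and $q(X)$ a variational distribution over them, the bound takes the form:

\[
\log p(Y) \;=\; \log \int p(Y \mid X)\, p(X)\, \mathrm{d}X
\;\ge\; \mathbb{E}_{q(X)}\!\left[ \log p(Y \mid X) \right] \;-\; \mathrm{KL}\!\left( q(X) \,\|\, p(X) \right).
\]

Maximizing the right-hand side over $q(X)$ and the model hyperparameters yields the training procedure; the KL term acts as a regularizer, which is what makes the procedure robust to overfitting and allows unneeded latent dimensions to be switched off. This is only the generic form of the inequality; the specific bound derived in the paper involves additional constructions beyond this sketch.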