Real-world data, such as images of faces and digits, are usually high-dimensional, which leads to the well-known curse of dimensionality in statistical pattern recognition.

Various dimensionality reduction methods have been proposed to discover the underlying manifold structure, which plays an important role in many tasks, such as pattern classification[9] and information visualization[16].

Principal Component Analysis (PCA) is one of the most popular linear dimensionality reduction techniques[10][17]. It projects the original data onto the principal directions of maximal variance and does not consider any relations among the data points.
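
To make this concrete, here is a minimal NumPy sketch of PCA under the standard formulation: center the data, eigendecompose the covariance matrix, and project onto the k directions of largest variance. The function name and interface are illustrative, not taken from the paper.

```python
import numpy as np

def pca(X, k):
    """Project an (n x d) data matrix onto its top-k principal directions."""
    Xc = X - X.mean(axis=0)                          # center the data
    cov = np.cov(Xc, rowvar=False)                   # (d x d) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # k max-variance directions
    return Xc @ top                                  # (n x k) embedding

# usage: Y = pca(np.random.randn(200, 50), k=2)
```

Note that the objective depends only on the global covariance: no pairwise relation between data points enters the computation, which is exactly the limitation noted above.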

Linear Discriminant Analysis (LDA) is a supervised method that finds a linear subspace optimal for discriminating data from different classes[2]. Marginal Fisher Analysis (MFA) extends LDA by characterizing intraclass compactness and interclass separability[19]. Both methods use class-label information as a weak data relation to seek a low-dimensional separating subspace.
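
A minimal sketch of Fisher LDA may help here, assuming the standard scatter-matrix formulation: maximize between-class scatter relative to within-class scatter via a generalized eigenproblem. The code below is illustrative only, not the paper's implementation.

```python
import numpy as np

def lda(X, y, k):
    """Fisher LDA: find k directions maximizing between-class scatter
    relative to within-class scatter (k is at most #classes - 1)."""
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))                            # within-class scatter
    Sb = np.zeros((d, d))                            # between-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # solve Sb w = lambda Sw w, using the pseudo-inverse for robustness
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:k]
    return X @ eigvecs[:, order].real                # (n x k) discriminative embedding
```

The class labels enter only through the scatter matrices, i.e., as the "weak data relation" mentioned above.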

However, the low-dimensional manifold structure of real data is usually very complicated, and a simple parametric model such as PCA cannot easily capture it. Exploiting data relations has proved to be a promising means of discovering the underlying structure. For example, ISOMAP[15] learns a low-dimensional manifold by retaining the geodesic distances between pairwise data points in the original space. Locally Linear Embedding (LLE)[12] represents each data point as a linear combination of its neighbors and preserves this linear relation in the projected low-dimensional space. Laplacian Eigenmaps (LE)[1] pursues a low-dimensional manifold by minimizing pairwise distances in the projected space, weighted by the corresponding distances in the original space. These methods share a common weakness: they suffer from the out-of-sample problem. Neighborhood Preserving Embedding (NPE)[3] and Locality Preserving Projection (LPP)[4] are linear approximations to LLE and LE, respectively, that handle the out-of-sample problem.
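
To illustrate the graph-based flavor of these methods, here is a simplified NumPy sketch of Laplacian Eigenmaps: build a k-nearest-neighbor graph with heat-kernel weights, then embed via the bottom nontrivial eigenvectors of the graph Laplacian. For brevity it uses the unnormalized Laplacian rather than the generalized eigenproblem L y = λ D y of the original paper, so treat it as a sketch rather than a faithful implementation.

```python
import numpy as np

def laplacian_eigenmaps(X, k, n_neighbors=5, sigma=1.0):
    """Embed (n x d) data into k dimensions via the graph Laplacian."""
    n = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)     # squared distances
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D2[i])[1:n_neighbors + 1]          # skip the point itself
        W[i, idx] = np.exp(-D2[i, idx] / (2 * sigma ** 2))  # heat-kernel weights
    W = np.maximum(W, W.T)                                  # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W                          # unnormalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, 1:k + 1]          # drop the trivial constant eigenvector
```

The out-of-sample problem is visible here: the embedding is defined only for the n training points, with no map for a new sample, which is what NPE and LPP linearize away.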

The autoencoder algorithm[13] belongs to a special family of dimensionality reduction methods implemented with artificial neural networks. It aims to learn a compressed representation of an input by minimizing its reconstruction error. Recently, the autoencoder algorithm and its extensions[8][18][11] have demonstrated a promising ability to learn meaningful features from data, which can reveal the intrinsic data structure. However, these methods consider only self-reconstruction and fail to explicitly model relations among the data.
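
For concreteness, the following is a minimal single-hidden-layer autoencoder in NumPy with tied weights, trained by gradient descent on the mean squared reconstruction error. The architecture, learning rate, and tied-weight choice are assumptions made for this sketch, not details from the cited papers.

```python
import numpy as np

def train_autoencoder(X, k, lr=0.05, epochs=500, seed=0):
    """Learn a k-dimensional code for (n x d) data by minimizing
    the mean squared self-reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(0.0, 0.1, (d, k))     # encoder weights (decoder uses W.T)
    b = np.zeros(k)                      # hidden bias
    c = np.zeros(d)                      # reconstruction bias
    for _ in range(epochs):
        H = np.tanh(X @ W + b)           # encode: (n x k) hidden code
        X_hat = H @ W.T + c              # decode: linear reconstruction
        E = X_hat - X                    # reconstruction error
        dH = (E @ W) * (1.0 - H ** 2)    # backprop through tanh
        W -= lr * (X.T @ dH + E.T @ H) / n   # tied-weight gradient
        b -= lr * dH.mean(axis=0)
        c -= lr * E.mean(axis=0)
    return np.tanh(X @ W + b)            # the learned compressed representation

# usage: codes = train_autoencoder(np.random.randn(200, 30), k=10)
```

Note that the loss treats each sample in isolation: nothing in the objective relates one data point to another, which is precisely the limitation the paragraph above points out.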

References:
[1] M. Belkin and P. Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. Advances in Neural Information Processing Systems, 2002.
[2] R. Duda, P. Hart, and D. Stork. Pattern classification, 2nd edition. Wiley-Interscience, Hoboken, NJ, 2000.
[3] X. He, D. Cai, S. Yan, and H. Zhang. Neighborhood preserving embedding. International Conference on Computer Vision, 2005.
[4] X. He and P. Niyogi. Locality preserving projections. Advances in Neural Information Processing Systems, 2004.
[8] H. Lee, C. Ekanadham, and A. Ng. Sparse deep belief net model for visual area v2. Advances in Neural Information Processing Systems, 2008.
[9] K. Lee, J. Ho, M. Yang, and D. Kriegman. Video-based face recognition using probabilistic appearance manifolds. IEEE Conference on Computer Vision and Pattern Recognition, 2003.
[10] K. Pearson. On lines and planes of closest fit to systems of points in space. Philosophical Magazine, 1901.
[11] S. Rifai, P. Vincent, X. Muller, X. Glorot, and Y. Bengio. Contractive auto-encoders: Explicit invariance during feature extraction. International Conference on Machine Learning, 2011.
[12] S. Roweis and L. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 2000.
[13] D. Rumelhart, G. Hinton, and R. Williams. Learning internal representations by error propagation. Parallel Distributed Processing. Vol 1: Foundations. MIT Press, Cambridge, MA, 1986.
[15] J. Tenenbaum, V. de Silva, and J. Langford. A global geometric framework for nonlinear dimensionality reduction. Science, 2000.
[16] J. Venna, J. Peltonen, K. Nybo, H. Aidos, and S. Kaski. Information retrieval perspective to nonlinear dimensionality reduction for data visualization. Journal of Machine Learning Research, Vol. 11, 2010.
[17] R. Vidal, Y. Ma, and S. Sastry. Generalized principal component analysis (GPCA). IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 12, pp. 1945-1959, 2005.
[18] P. Vincent, H. Larochelle, Y. Bengio, and P. Manzagol. Extracting and composing robust features with denoising autoencoders. International Conference on Machine Learning, 2008.
[19] S. Yan, D. Xu, B. Zhang, H. Zhang, Q. Yang, and S. Li. Graph embedding and extensions: A general framework for dimensionality reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007.
