Working with Subspace Embeddings, Part 6 (Machine Learning 2024)

Monodeep Mukherjee
2 min read · Apr 24, 2024
1. Wijsman Hyperspaces: Subspaces and Embeddings (arXiv)

Authors: Jiling Cao, Heikki J. K. Junnila, Warren B. Moors

Abstract: In this paper, topological properties of Wijsman hyperspaces are investigated. We study the existence of isolated points in Wijsman hyperspaces. We show that every Tychonoff space can be embedded as a closed subspace in the Wijsman hyperspace of a complete metric space which is locally $\mathbb{R}$.
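As background the paper takes for granted (this definition is standard, not quoted from the abstract): the Wijsman hyperspace of a metric space $(X, d)$ is the set $CL(X)$ of nonempty closed subsets of $X$, carrying the weakest topology that makes every distance functional continuous. A minimal LaTeX sketch:

```latex
% Standard definition (background, not quoted from the paper):
% the Wijsman hyperspace of a metric space (X, d) is CL(X), the set of
% nonempty closed subsets of X, with the weakest topology making every
% distance functional continuous:
\[
  A \;\longmapsto\; d(x, A) := \inf_{a \in A} d(x, a),
  \qquad \text{for each fixed } x \in X .
\]
```

The embedding result above then says this hyperspace is topologically rich: every Tychonoff space sits inside it as a closed subspace, for a suitable complete metric space.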

2. Low-distortion Subspace Embeddings in Input-sparsity Time and Applications to Robust Linear Regression (arXiv)

Authors: Xiangrui Meng, Michael W. Mahoney

Abstract: Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$, with constant probability we can construct a low-distortion embedding matrix $\Pi \in \mathbb{R}^{O(\mathrm{poly}(d)) \times n}$ that embeds $\mathcal{A}_p$, the $\ell_p$ subspace spanned by $A$'s columns, into $(\mathbb{R}^{O(\mathrm{poly}(d))}, \|\cdot\|_p)$; the distortion of our embeddings is only $O(\mathrm{poly}(d))$, and we can compute $\Pi A$ in $O(\mathrm{nnz}(A))$ time, i.e., input-sparsity time. Our result generalizes the input-sparsity-time $\ell_2$ subspace embedding of Clarkson and Woodruff [STOC '13]; for completeness, we present a simpler and improved analysis of their construction for $\ell_2$. These input-sparsity-time $\ell_p$ embeddings are optimal, up to constants, in terms of their running time, and the improved running time propagates to applications such as $(1 \pm \varepsilon)$-distortion $\ell_p$ subspace embedding and relative-error $\ell_p$ regression. For $\ell_2$, we show that a $(1+\varepsilon)$-approximate solution to the $\ell_2$ regression problem specified by the matrix $A$ and a vector $b \in \mathbb{R}^n$ can be computed in $O(\mathrm{nnz}(A) + d^3 \log(d/\varepsilon)/\varepsilon^2)$ time. For $\ell_p$, via a subspace-preserving sampling procedure, we show that a $(1 \pm \varepsilon)$-distortion embedding of $\mathcal{A}_p$ into $\mathbb{R}^{O(\mathrm{poly}(d))}$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n)$ time, and that a $(1+\varepsilon)$-approximate solution to the $\ell_p$ regression problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|_p$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n + \mathrm{poly}(d)\, \log(1/\varepsilon)/\varepsilon^2)$ time. Moreover, we can improve the embedding dimension (equivalently, the sample size) to $O(d^{3+p/2} \log(1/\varepsilon)/\varepsilon^2)$ without increasing the complexity.
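To make "input-sparsity time" concrete, here is a minimal NumPy sketch of the CountSketch-style sparse embedding this line of work builds on, used in the standard sketch-and-solve way for $\ell_2$ regression. This is an illustration under my own naming (`countsketch` and `sketched_lstsq` are hypothetical helpers), not the authors' code: because each input row is hashed to a single signed sketch row, applying $\Pi$ costs one pass over the nonzeros of $A$.

```python
import numpy as np

def countsketch(A, m, rng):
    """Apply a CountSketch-style sparse embedding Pi to A.

    Each column of Pi has one nonzero entry (+/-1) in a uniformly random
    row, so Pi @ A can be accumulated in O(nnz(A)) time without ever
    forming Pi explicitly.
    """
    n, d = A.shape
    rows = rng.integers(0, m, size=n)        # hash each input row to a sketch row
    signs = rng.choice([-1.0, 1.0], size=n)  # independent random signs
    SA = np.zeros((m, d))
    np.add.at(SA, rows, signs[:, None] * A)  # unbuffered signed accumulation
    return SA

def sketched_lstsq(A, b, m, seed=0):
    """Sketch-and-solve least squares: solve min_x ||Pi (Ax - b)||_2."""
    rng = np.random.default_rng(seed)
    SAb = countsketch(np.column_stack([A, b]), m, rng)  # sketch [A | b] jointly
    SA, Sb = SAb[:, :-1], SAb[:, -1]
    x, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    return x

# Toy tall-and-skinny instance: n = 100000 >> d = 20.
rng = np.random.default_rng(1)
A = rng.standard_normal((100_000, 20))
b = A @ rng.standard_normal(20) + 0.01 * rng.standard_normal(100_000)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_lstsq(A, b, m=2_000)  # sketch size grows like d^2/eps^2
print(np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```

For $\ell_p$ with $p \in [1, 2)$, the abstract's recipe is different: first an $O(\mathrm{poly}(d))$-distortion embedding, then subspace-preserving row sampling to reach $(1 \pm \varepsilon)$ distortion. The sketch above only illustrates the $\ell_2$ sketch-and-solve path.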
