My experience is that while data keeps growing at an exponential rate, its information content does not. In finance at least, you can easily get 100 million data points per series per day if you want everything, and you might be dealing with thousands of series. That sample rate, and the number of series, is usually 99.99% redundant, because the eigenvalues drop off almost to zero very quickly after about 10 dimensions, and often far fewer. There's very little reason to store petabytes of ticks that you will never query. It's much more reasonable in many cases to do brutal (and yes, lossy) dimensionality reduction _at ingest time_, store the first few principal components + outliers, and monitor eigenvalue stability (in case some new, previously negligible factor starts increasing in importance). The result is a much smaller dataset that is tractable and in many cases revelatory, because it's actually usable.
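A rough sketch of that ingest-time pipeline, on synthetic data (the 99% variance threshold and 5x-median outlier cutoff are illustrative choices, not anything canonical):

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake "tick" matrix: 1000 observations of 50 correlated series,
# secretly driven by only 3 underlying factors plus a little noise.
factors = rng.standard_normal((1000, 3))
loadings = rng.standard_normal((3, 50))
X = factors @ loadings + 0.01 * rng.standard_normal((1000, 50))

# Eigendecomposition of the covariance matrix, sorted descending.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep only the components explaining (say) 99% of variance.
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.99)) + 1

# Store the small factor representation instead of the raw series.
scores = Xc @ eigvecs[:, :k]          # 1000 x k instead of 1000 x 50

# Flag outliers: rows poorly explained by the retained factors.
resid = Xc - scores @ eigvecs[:, :k].T
resid_norms = np.linalg.norm(resid, axis=1)
outliers = np.where(resid_norms > 5 * np.median(resid_norms))[0]

print(k, scores.shape)  # with 3 true factors, k should come out around 3
```

Monitoring `explained`/`eigvals` over time is the "eigenvalue stability" part: a previously negligible eigenvalue creeping up tells you a new factor is emerging.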
You can store the main eigenvectors for a set rolling period and see how the space evolves along them, all the while also storing the new ones. In effect the whole idea is to get away from "individual security space" and into "factor space", which is much smaller, and see how the factors are moving. Also, a lot of the time you just care about the outliers -- the (small number of) instruments or clusters of instruments that are trading in an unusual way -- which you then either try to explain... or trade against. Also keep in mind that lower-order factors tend to be much more stationary, so there's a lot of alpha there -- if you can execute the trades efficiently (which is why most successful quant shops like Citadel and Jane Street are market MAKERS, not takers, btw).
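One plausible way to quantify "how the space evolves" (a sketch on synthetic data; the window/step sizes and the subspace-overlap metric are my choices, not the poster's):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, k = 600, 20, 3
# Synthetic panel: 20 series driven by 3 fixed factors, so the
# factor space should be stable across windows.
X = rng.standard_normal((T, 3)) @ rng.standard_normal((3, n)) \
    + 0.05 * rng.standard_normal((T, n))

def top_eigvecs(window, k):
    """Top-k eigenvectors of the window's covariance matrix."""
    cov = np.cov(window - window.mean(axis=0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, np.argsort(vals)[::-1][:k]]

window = 200
prev = top_eigvecs(X[:window], k)
overlaps = []
for start in range(100, T - window + 1, 100):
    cur = top_eigvecs(X[start:start + window], k)
    # Subspace overlap: mean squared singular value of V_prev^T V_cur.
    # Values near 1 mean the factor space is stable; a drop means
    # the factors are rotating / a new factor is taking over.
    s = np.linalg.svd(prev.T @ cur, compute_uv=False)
    overlaps.append(float(np.mean(s ** 2)))
    prev = cur

print(overlaps)  # near 1.0 here, since the true loadings are fixed
```

On real data you would store each window's eigenvectors and watch this overlap series for regime changes.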
Not OP, but I think they are referring to the fact that you can use PCA (principal component analysis) on a matrix of datapoints to approximate it. Works out of the box in scikit-learn.
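For instance, a minimal sketch of the "out of the box" version, on synthetic low-rank data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# 200 samples in 10 dimensions, but really driven by 2 latent factors.
X = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 10))

pca = PCA(n_components=2)
Z = pca.fit_transform(X)          # compressed representation: 200 x 2
X_hat = pca.inverse_transform(Z)  # approximate reconstruction: 200 x 10

print(pca.explained_variance_ratio_.sum())  # ~1.0 for rank-2 data
print(np.abs(X - X_hat).max())              # tiny reconstruction error
```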
You can do (lossy) compression on rows of vectors (treated as a matrix) by taking the top N eigenvectors of the covariance matrix (those with the N largest eigenvalues) and using them to approximate the original matrix, with increasing accuracy as N grows, via some simple linear operations. If the numbers are highly correlated, you can get a huge amount of compression with minor loss this way.
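The "simple linear operations" spelled out in plain numpy (a from-scratch sketch on synthetic near-rank-4 data, no scikit-learn):

```python
import numpy as np

rng = np.random.default_rng(7)
# Highly correlated rows: 100 vectors in 30 dims from 4 latent factors.
A = rng.standard_normal((100, 4)) @ rng.standard_normal((4, 30)) \
    + 0.01 * rng.standard_normal((100, 30))

mean = A.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(A - mean, rowvar=False))
V = eigvecs[:, ::-1]         # eigenvectors in descending eigenvalue order

N = 4
W = V[:, :N]                 # top-N eigenvectors
coeffs = (A - mean) @ W      # store 100 x N coefficients instead of 100 x 30
A_hat = coeffs @ W.T + mean  # approximate reconstruction: two matmuls

rel_err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
print(rel_err)  # small, since the data is nearly rank-4
```

So the compressed form is just `mean`, `W`, and `coeffs`; growing N trades storage for accuracy.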
Personally I like to use it to visualize the linear separability of a high-dimensional set of vectors by taking a 2-component PCA and plotting the components as x/y values.
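A sketch of that trick with two synthetic classes (the cluster offset is contrived so they do separate; the plotting call is left as a comment to keep the example dependency-light):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
# Two clusters in 50 dimensions, one shifted away from the other.
a = rng.standard_normal((100, 50))
b = rng.standard_normal((100, 50)) + 4.0
X = np.vstack([a, b])

xy = PCA(n_components=2).fit_transform(X)  # 200 x 2, usable as x/y coords

# Visual check, e.g.:
#   import matplotlib.pyplot as plt
#   plt.scatter(xy[:100, 0], xy[:100, 1])
#   plt.scatter(xy[100:, 0], xy[100:, 1])
#   plt.show()

# The first component should pick up the between-cluster direction,
# so the two classes land far apart along x.
print(xy[:100, 0].mean(), xy[100:, 0].mean())
```

If the two point clouds overlap in this plane it doesn't prove inseparability (PCA is unsupervised), but a clean split is a quick positive signal.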