My experience is that while data keeps growing at an exponential rate, its information content does not. In finance at least, you can easily get 100 million data points per series per day if you want everything, and you might be dealing with thousands of series. That sample rate, and the number of series, is usually 99.99% redundant, because the eigenvalues drop off almost to zero very quickly after about 10 dimensions, and often far fewer. There's very little reason to store petabytes of ticks that you will never query. It's much more reasonable in many cases to do brutal (and yes, lossy) dimensionality reduction _at ingest time_, store the first few principal components + outliers, and monitor eigenvalue stability (in case some new, previously negligible factor starts increasing in importance). The result is a much smaller dataset that is tractable and in many cases revelatory, because it's actually usable.
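A rough sketch of that ingest-time pipeline, on synthetic data (the 99% variance threshold and 5x-median outlier cutoff are illustrative choices, not anything canonical):

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake "tick" matrix: 1000 observations of 50 correlated series,
# secretly driven by only 3 underlying factors plus a little noise.
factors = rng.standard_normal((1000, 3))
loadings = rng.standard_normal((3, 50))
X = factors @ loadings + 0.01 * rng.standard_normal((1000, 50))

# Eigendecomposition of the covariance matrix, sorted descending.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep only the components explaining (say) 99% of variance.
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.99)) + 1

# Store the small factor representation instead of the raw series.
scores = Xc @ eigvecs[:, :k]          # 1000 x k instead of 1000 x 50

# Flag outliers: rows poorly explained by the retained factors.
resid = Xc - scores @ eigvecs[:, :k].T
resid_norms = np.linalg.norm(resid, axis=1)
outliers = np.where(resid_norms > 5 * np.median(resid_norms))[0]

print(k, scores.shape)  # with 3 true factors, k should come out around 3
```

Monitoring `explained`/`eigvals` over time is the "eigenvalue stability" part: a previously negligible eigenvalue creeping up tells you a new factor is emerging.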
You can store the main eigenvectors for a set rolling period and see how the space evolves along them, all the while also storing the new ones. In effect the whole idea is to get away from "individual security space" and into "factor space", which is much smaller, and see how the factors are moving. Also, a lot of the time you just care about the outliers -- the (small number of) instruments or clusters of instruments that are trading in an unusual way -- which you then either try to explain... or trade against. Also keep in mind that lower-order factors tend to be much more stationary, so there's a lot of alpha there -- if you can execute the trades efficiently (which is why most successful quant shops like Citadel and Jane Street are market MAKERS, not takers, btw).
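One plausible way to quantify "how the space evolves" (a sketch on synthetic data; the window/step sizes and the subspace-overlap metric are my choices, not the poster's):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, k = 600, 20, 3
# Synthetic panel: 20 series driven by 3 fixed factors, so the
# factor space should be stable across windows.
X = rng.standard_normal((T, 3)) @ rng.standard_normal((3, n)) \
    + 0.05 * rng.standard_normal((T, n))

def top_eigvecs(window, k):
    """Top-k eigenvectors of the window's covariance matrix."""
    cov = np.cov(window - window.mean(axis=0), rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    return vecs[:, np.argsort(vals)[::-1][:k]]

window = 200
prev = top_eigvecs(X[:window], k)
overlaps = []
for start in range(100, T - window + 1, 100):
    cur = top_eigvecs(X[start:start + window], k)
    # Subspace overlap: mean squared singular value of V_prev^T V_cur.
    # Values near 1 mean the factor space is stable; a drop means
    # the factors are rotating / a new factor is taking over.
    s = np.linalg.svd(prev.T @ cur, compute_uv=False)
    overlaps.append(float(np.mean(s ** 2)))
    prev = cur

print(overlaps)  # near 1.0 here, since the true loadings are fixed
```

On real data you would store each window's eigenvectors and watch this overlap series for regime changes.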
Not OP, but I think they are referring to the fact that you can use PCA (principal component analysis) on a matrix of datapoints to approximate it. Works out of the box in scikit-learn.
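For instance, a minimal sketch of the "out of the box" version, on synthetic low-rank data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# 200 samples in 10 dimensions, but really driven by 2 latent factors.
X = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 10))

pca = PCA(n_components=2)
Z = pca.fit_transform(X)          # compressed representation: 200 x 2
X_hat = pca.inverse_transform(Z)  # approximate reconstruction: 200 x 10

print(pca.explained_variance_ratio_.sum())  # ~1.0 for rank-2 data
print(np.abs(X - X_hat).max())              # tiny reconstruction error
```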
You can do (lossy) compression on rows of vectors (treated as a matrix) by taking the top N eigenvectors of the covariance matrix (those with the N largest eigenvalues) and using them to approximate the original matrix, with increasing accuracy as N grows, via some simple linear operations. If the numbers are highly correlated, you can get a huge amount of compression with minor loss this way.
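The "simple linear operations" spelled out in plain numpy (a from-scratch sketch on synthetic near-rank-4 data, no scikit-learn):

```python
import numpy as np

rng = np.random.default_rng(7)
# Highly correlated rows: 100 vectors in 30 dims from 4 latent factors.
A = rng.standard_normal((100, 4)) @ rng.standard_normal((4, 30)) \
    + 0.01 * rng.standard_normal((100, 30))

mean = A.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(A - mean, rowvar=False))
V = eigvecs[:, ::-1]         # eigenvectors in descending eigenvalue order

N = 4
W = V[:, :N]                 # top-N eigenvectors
coeffs = (A - mean) @ W      # store 100 x N coefficients instead of 100 x 30
A_hat = coeffs @ W.T + mean  # approximate reconstruction: two matmuls

rel_err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
print(rel_err)  # small, since the data is nearly rank-4
```

So the compressed form is just `mean`, `W`, and `coeffs`; growing N trades storage for accuracy.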
Personally I like to use it to visualize the linear separability of a high-dimensional set of vectors by taking a 2-component PCA and plotting the components as x/y values.
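A sketch of that trick with two synthetic classes (the cluster offset is contrived so they do separate; the plotting call is left as a comment to keep the example dependency-light):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
# Two clusters in 50 dimensions, one shifted away from the other.
a = rng.standard_normal((100, 50))
b = rng.standard_normal((100, 50)) + 4.0
X = np.vstack([a, b])

xy = PCA(n_components=2).fit_transform(X)  # 200 x 2, usable as x/y coords

# Visual check, e.g.:
#   import matplotlib.pyplot as plt
#   plt.scatter(xy[:100, 0], xy[:100, 1])
#   plt.scatter(xy[100:, 0], xy[100:, 1])
#   plt.show()

# The first component should pick up the between-cluster direction,
# so the two classes land far apart along x.
print(xy[:100, 0].mean(), xy[100:, 0].mean())
```

If the two point clouds overlap in this plane it doesn't prove inseparability (PCA is unsupervised), but a clean split is a quick positive signal.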