Potentially dumb question, but why does “who viewed my profile” (the feature mentioned in the post as the original use case at LinkedIn) require a realtime OLAP datastore anyway?
That's a great question. Here's a bit of history on Pinot, from when it was invented at LinkedIn, before it was ASF-incubated as "Apache Pinot."
LinkedIn had originally been using traditional OLTP systems to power the "Who Viewed My Profile" app, and they simply hit a wall. So LinkedIn looked at the other analytics databases available at the time. Most couldn't handle a large number of queries per second (QPS), which, as a social media platform, LinkedIn readily anticipated. This is why they defined the category as "user-facing, real-time analytics."
[This is terminology mostly specific to the Apache Pinot crowd, though I see that StarRocks / CelerData also recently started talking about user-facing analytics. So I wrote up an article explaining it here:
Other similar systems that LinkedIn benchmarked at the time couldn't deliver the numbers they needed at the low latency, high concurrency, and large scale (terabytes to petabytes of data) they anticipated. So they wrote their own solution.
Pinot was originally intended for marketing purposes to capture live intent & action data.
The same Pinot infrastructure eventually grew to other real-time use cases. "Who Viewed My Profile" was followed by "Company Follow Analytics," then sales or recruiting, and even internal A/B testing.
[I wasn't at LinkedIn when this happened; this lore was passed down to me by others. Disclosure: yes, I work at StarTree.]
LinkedIn is just another social network, only more work-related. When you post something new, you want to see how many likes and comments it gets, just like on Instagram. "Who viewed my profile" has real value on LinkedIn: the viewer could be someone who is hiring, someone who might watch your talk, or someone who might buy your product. I have personally started some business conversations after noticing that someone viewed my profile, or at least added them as new connections.
Got it. To that end, Apache Pinot has a special index, called the star-tree index, that lets certain dimensions be drilled into at a finer granularity than others. It's part of what makes Pinot so fast.
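For anyone curious about the general idea behind a star-tree, here is a toy sketch in Python. It is not Pinot's implementation: the function names (`build_star_tree`, `query`), the sample rows, and the choice to fully materialize every node are all assumptions made for illustration. The point it tries to show is that sums are pre-aggregated along an ordered list of dimensions, and each level gets an extra "star" child that aggregates across all values of that dimension, so a query that does not filter on a dimension can follow the star branch instead of scanning rows.

```python
from collections import defaultdict

STAR = "*"  # toy marker for "any value of this dimension"


def build_star_tree(rows, dims, metric):
    """Recursively pre-aggregate `metric` over `dims`, in the given order."""
    node = {"sum": sum(r[metric] for r in rows), "children": {}}
    if not dims:
        return node
    head, rest = dims[0], dims[1:]
    groups = defaultdict(list)
    for r in rows:
        groups[r[head]].append(r)
    # One child per concrete value of the current dimension.
    for value, group in groups.items():
        node["children"][value] = build_star_tree(group, rest, metric)
    # The star child ignores the current dimension entirely.
    node["children"][STAR] = build_star_tree(rows, rest, metric)
    return node


def query(tree, dims, filters):
    """Return the pre-aggregated sum for equality filters on some dimensions."""
    node = tree
    for d in dims:
        node = node["children"].get(filters.get(d, STAR))
        if node is None:  # no rows match this filter value
            return 0
    return node["sum"]


# Hypothetical sample data: profile views by viewer country and browser.
DIMS = ["country", "browser"]
ROWS = [
    {"country": "US", "browser": "chrome", "views": 3},
    {"country": "US", "browser": "firefox", "views": 2},
    {"country": "DE", "browser": "chrome", "views": 5},
]
tree = build_star_tree(ROWS, DIMS, "views")

print(query(tree, DIMS, {}))                     # 10
print(query(tree, DIMS, {"country": "US"}))      # 5
print(query(tree, DIMS, {"browser": "chrome"}))  # 8
```

Each lookup walks one node per dimension, no matter how many rows were ingested. The cost is extra storage for the pre-aggregated nodes, and the order of the dimensions decides which ones get the finest breakdown.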