Hacker News

I am curious about the motivation for choosing ClickHouse over Apache Pinot and Apache Druid. It could be helpful for other folks when choosing an OLAP db from among them.


For us, a significant reason was the ClickHouse cloud-hosted offering, rather than having to manage a cluster ourselves. Their use of S3 as the backing storage medium means that large-scale data retention is quite affordable.

A good comparison we've referenced: https://leventov.medium.com/comparison-of-the-open-source-ol...


For reference, Apache Druid has an equivalent in Imply Polaris, and Apache Pinot has an equivalent in Startree. I can't speak for Startree, but Polaris similarly uses S3 for backing.


When I was highly engaged with Imply (Druid) a few years ago, S3 was also used as a backing storage. Is this not the case anymore?


I think both Pinot and Druid offer cloud-hosted solutions nowadays. Maybe you started early, when only ClickHouse had that offering. Is cloud hosting the only reason you chose ClickHouse? I am also wondering whether it's possible to let users choose the data source.


Slight hijack. I / we went through a very similar tech selection process for timeseries metrics (not logging) ~1.5 years ago. We looked at Druid, ElasticSearch, TimeScale and a bunch of others.

Main takeaways were: the SQL flavor and its aggregations in CH are amazing. Running on a single node for dev laptops is trivial. It’s crazy fast with almost zero tuning.

It does not surprise me at all that CH is powering new products and startups.

Note: hosted CH did not exist yet. We are using Altinity to run our cluster.


> Note: hosted CH did not exist yet. We are using Altinity to run our cluster.

It exists now actually. We (highlight) are on hosted ClickHouse, which went GA a few months ago. https://clickhouse.com/cloud


Thanks for the shout out! "Altinity" in this case means Altinity.Cloud, which is a high-performance cloud ClickHouse. It's been around for over 2.5 years.

Disclaimer: I work at Altinity.


if you can afford SQL then you're not really doing timeseries in any meaningful sense


Clickhouse is fast and doesn’t have absurd architectural complexity.


+1


Not having to deal with a JVM is a major plus tbh.


I've seen so many variations of this comment on HN and I'm still not sure why not having to deal with the JVM is a major plus.


I'm perfectly fine with JVMs, but at a guess, some of it is the usual snobbery toward anything unfamiliar. But some of it is due to associating JVMs with enterprise nightmares. And some is that JVM tuning is a bit of a dark art. I've made some very good money going in and turning JVM knobs that others were afraid to touch. (The secret, by the way, is to hack together some decent load simulation and then measure not just median numbers but things like 99th percentile latency.)
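The measure-the-tail approach above can be sketched in a few lines. This is a minimal, hypothetical illustration (the `request_fn` stand-in and `percentile` helper are mine, not any particular tool): fire requests, record wall-clock latencies, then read off the median and the 99th percentile rather than just the mean.

```python
import random
import time

def percentile(samples, p):
    """Return the p-th percentile of a list of samples (nearest-rank method)."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, int(round(p / 100 * len(ordered))) - 1))
    return ordered[k]

def simulate_load(request_fn, n=1000):
    """Fire n requests sequentially and record the latency of each."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        request_fn()
        latencies.append(time.perf_counter() - start)
    return latencies

# Hypothetical stand-in for a real request; swap in an actual client call.
lat = simulate_load(lambda: time.sleep(random.uniform(0.0, 0.002)))
print("median:", percentile(lat, 50))
print("p99:   ", percentile(lat, 99))
```

The point of looking at p99 instead of the median is that GC pauses and similar hiccups hide entirely in the tail: a system can have a great median and still stall one request in a hundred.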


Have you ever operated a fleet of critical JVM instances and needed more memory? Don't go over 32GB of RAM in an instance or the operating characteristics of your entire app change. Compressed ordinary object pointers ("oops"). They are a blast to debug / operate!

https://stackoverflow.com/questions/25120546/trick-behind-jv...
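The 32GB cliff comes from compressed oops: with the default 8-byte object alignment, the JVM can store references as 32-bit offsets, which cover 2^32 × 8 bytes of heap. A quick back-of-the-envelope check (plain arithmetic, no JVM involved):

```python
# With compressed oops, object references are stored as 32-bit values
# that are shifted left by 3 bits (8-byte alignment) before being used
# as addresses, so the largest addressable heap is 2^32 * 8 bytes.
ALIGNMENT = 8          # default JVM object alignment in bytes
REF_BITS = 32          # width of a compressed reference

max_heap = (2 ** REF_BITS) * ALIGNMENT
print(max_heap // 2 ** 30, "GiB")  # → 32 GiB

# Above this, the JVM falls back to full 64-bit pointers, so every
# reference doubles in size: a heap slightly over 32GB can hold less
# live data than one just under it.
```

This is why the common advice is to cap heaps around 30-31GB, or jump well past 32GB so the extra memory outweighs the fatter pointers.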


JVM runtimes have a relatively high startup cost, are not often good 'citizens' in an instance running multiple types of software, and the build processes for a lot of JVM deliverables are an ungodly mess.

Many of those bells and whistles are near-necessary in the enterprise world, but you have the accumulated mass of 'red zones' and developmental landmines in that ecosystem that can quickly turn you off it as a whole if you want to understand the whole system.


I still don't understand some of this -- I developed in Java for 5+ years.

>JVM runtimes have a relatively high startup cost

I think many people are okay with that when developing server software that's going to run for weeks at a time. It can get a bit annoying when trying to rapidly iterate. And I think things are changing pretty quickly with AOT builds and general improvements.

>and the build processes for a lot of JVM deliverables is an ungodly mess.

I recall using "mvn package." That's it. This was on two different systems that served a good bit of traffic and weren't simple trivial projects.


I don't know if it's a standard Java thing or just an IntelliJ thing, but there's a setting that will hot-patch a running JVM when you change code. Things can get messy if you (or your dependencies) make assumptions about the ClassLoader being used, but other than that it works great.

Still not as good as C#'s debugger in Visual Studio (hit a breakpoint, edit the code, drag the execution back before the problem, resume and run the patched version) but nothing I've seen really is.

Setting up Gradle projects is a bit more involved depending on your setup, but in the end it's still a single command to build an executable JAR.


Yeah, it's been a second since I've used IntelliJ/Spring, but I recall that being the case as well.

Gradle is something I've never messed with, but that makes sense.


I take it you haven't experienced the hell that is to deal with Hadoop JARs. It's absolutely ridiculous.


Having to worry about GC in a database is a pretty bad experience. It also tends to require way more resources than necessary, and just a pretty complex configuration


gc isn't the issue, the jvm is the issue


basically, the jvm is technically sophisticated but operationally complicated

it sucks to use

many people believe otherwise, but those people have rich jvm experience, which is not easy to get


Druid has like 9 different node types and inherits the whole Hadoop configuration mess and complexity


3, and there's absolutely no need for hadoop, particularly with MSQ


Anecdata: tried out Druid and Clickhouse for my SaaS. Couldn’t get Druid working. CH ran in 2 minutes.


Interesting, just found another post from yesterday about the comparison: https://news.ycombinator.com/item?id=35642522, though the comparison is coming from the Pinot team.

