The important difference is that we used a more realistic temperature profile, which as you say does affect compression for that column. Schema design (including sort order, compression, and codecs) for the remaining columns is just good ClickHouse practice. Much of the storage and I/O savings is in the date, time, and sensor_id columns.
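For readers who want to see what "good ClickHouse practice" means here, the sketch below is the general shape of such a table; the column names, types, and codec choices are illustrative rather than the exact DDL from the article:

    -- Illustrative ClickHouse schema for per-sensor readings (not the
    -- article's verbatim DDL). Sorting by (sensor_id, date, time) keeps
    -- each sensor's readings adjacent, which is what lets delta-style
    -- codecs shrink the date, time, and sensor_id columns so effectively.
    CREATE TABLE sensor_readings
    (
        sensor_id   UInt32        CODEC(DoubleDelta, LZ4),
        date        Date          CODEC(DoubleDelta, LZ4),
        time        DateTime      CODEC(DoubleDelta, LZ4),
        temperature Decimal(5, 2) CODEC(T64, LZ4)
    )
    ENGINE = MergeTree
    PARTITION BY toYYYYMM(date)
    ORDER BY (sensor_id, date, time);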
It's also useful to note that the materialized view results would be essentially the same no matter how you generate and store the raw data, because the materialized view down-samples temperature max/min to daily aggregates. The aggregated data are vastly smaller regardless of how you generate the readings.
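To make the down-sampling concrete, a daily roll-up materialized view looks roughly like this (table and column names are assumed, not taken from the article):

    -- Illustrative daily roll-up: one aggregate row per sensor per day.
    -- Whatever the raw temperature profile looks like, this target table
    -- is orders of magnitude smaller than the raw readings.
    CREATE MATERIALIZED VIEW sensor_daily_mv
    ENGINE = AggregatingMergeTree
    ORDER BY (sensor_id, date)
    AS SELECT
        sensor_id,
        date,
        minState(temperature) AS temp_min,
        maxState(temperature) AS temp_max
    FROM sensor_readings
    GROUP BY sensor_id, date;

    -- Query side: merge the partial aggregate states into final values.
    SELECT
        sensor_id,
        date,
        minMerge(temp_min) AS daily_min,
        maxMerge(temp_max) AS daily_max
    FROM sensor_daily_mv
    GROUP BY sensor_id, date;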
The article illustrates that if you really had such an IoT app and designed it properly you could run analytics with surprisingly few resources. I think that's a significant point.
That's what you wanted to show, but what you ended up showing is that if you have different data, then the query performance can be quite good.
I get the desire to critique the temperature profile, but completely changing it makes the comparison worthless. From a data perspective it's like saying "if all the sensors just report 1 for temperature on every reading, computing the min, max, and average is super fast". No shit, but that wasn't the task.
But they didn't set the temperature readings to anything that would advantage their tests. Without access to the original data, they simply generated a dataset as close to the original data and volume as possible. The fact that they spent a few sentences discussing the temperature profile doesn't invalidate the test.
Looking at this your way: Scylla used an INT, Altinity used a Decimal type with specialized compression (T64). I can tell you that this would have hampered ClickHouse and advantaged Scylla. It's the opposite of what you're saying. They actually performed this benchmark with one arm tied behind their back.
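If anyone wants to check how much the Decimal + T64 choice actually costs, ClickHouse reports per-column compressed and uncompressed sizes; a query along these lines (table name from the sketch upthread, not the article's schema) shows it directly:

    -- Compare on-disk vs raw bytes for each column of the readings table.
    SELECT
        name,
        formatReadableSize(data_compressed_bytes)   AS on_disk,
        formatReadableSize(data_uncompressed_bytes) AS uncompressed
    FROM system.columns
    WHERE table = 'sensor_readings'
    ORDER BY name;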
It's a funny benchmark anyway because the two systems have very different use cases, but that doesn't invalidate the result.
Then you should provide results for both test datasets to make the point about using a more realistic approach. Materialized views are not news, nor are properly designed analytics applications. For me the important part is how ClickHouse is better and why.
A column store will be orders of magnitude faster at analytical queries than any row-store system. This is fundamental architecture, and the data used makes little to no difference. You could use the exact ScyllaDB dataset duplicated to trillions of rows and still arrive at the same relative performance figures.
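Concretely, the benchmark boils down to scans like the one below. A column store reads only the sensor_id and temperature columns off disk to answer it, while a row store has to pull every full row through memory, so the actual values in those columns barely move the needle (table and column names here are just illustrative):

    -- Per-sensor min/max/avg: a column store touches only the two columns
    -- referenced, regardless of what the temperature values happen to be.
    SELECT
        sensor_id,
        min(temperature) AS temp_min,
        max(temperature) AS temp_max,
        avg(temperature) AS temp_avg
    FROM sensor_readings
    GROUP BY sensor_id;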