Update: Wow. Reading your solutions was a real eye-opener for me. It never struck me that you can exploit the fact that unmaterialized CTEs are not evaluated for rows that are not needed by another SELECT, and use this the same way one uses laziness in Haskell. This is great stuff, thanks again for sharing!
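For anyone curious, a minimal sketch of that laziness (PostgreSQL 12+; the CTE body here is purely illustrative):

```sql
-- With NOT MATERIALIZED the CTE is inlined into the outer query, so the
-- outer LIMIT stops generate_series after 5 rows instead of producing
-- the full million first.
WITH nums AS NOT MATERIALIZED (
    SELECT g AS n FROM generate_series(1, 1000000) AS g
)
SELECT n FROM nums LIMIT 5;
```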
Many libraries for Python, Rust, and Go support COPY BINARY.
The times I've tested it, the improvement is very small compared to plain COPY, or COPY with CSV, while it requires more work and thought upfront to ensure the binary format actually works correctly.
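For reference, on the server side the difference is just the FORMAT option (table and file names here are made up):

```sql
COPY metrics FROM '/tmp/metrics.csv' WITH (FORMAT csv);    -- plain CSV copy
COPY metrics FROM '/tmp/metrics.bin' WITH (FORMAT binary); -- binary copy
-- The binary file must follow PostgreSQL's binary copy format exactly
-- (signature header, per-field byte lengths, network byte order), which
-- is where the extra upfront work comes in.
```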
Timescaler here: if you configure timescaledb.compress_segmentby well, and the data suits the compression, you can achieve 20x or more compression.
(On some metrics data internally, I have seen a 98% reduction in data size.)
One of the reasons this works is that you only pay the per-tuple overhead once per grouped row, which can cover as many as 1,000 rows.
The other is the compression algorithm itself, which can be TimescaleDB's built-in column compression or plain PostgreSQL TOAST.
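As a sketch, on a hypothetical `metrics` hypertable (the `device_id` and `time` column names are assumptions):

```sql
ALTER TABLE metrics SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id', -- rows with equal device_id are grouped
    timescaledb.compress_orderby   = 'time DESC'  -- and ordered within each group
);
-- Automatically compress chunks older than a week.
SELECT add_compression_policy('metrics', INTERVAL '7 days');
```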
You can put it upright on a piano and put sheet music at eye level.
I personally use flowkey + an electric piano connected through USB. Beats a tablet for me!
The oldest I can find is from 1998 (PostgreSQL 6.3), but it was probably in use even before.
> Postgres offers substantial additional power by incorporating the following four additional basic concepts in such a way that users can easily extend the system:
>
> - classes
> - inheritance
> - types
> - functions
>
> Other features provide additional power and flexibility:
>
> - constraints
> - triggers
> - rules
> - transaction integrity
>
> These features put Postgres into the category of databases referred to as object-relational
A large number of unsoundness bugs only work if you have access to the stdlib, because they're flaws in stdlib types and functions that use `unsafe` internally, and are supposed to present a safe interface around it.
If you a) don't have access to unsafe, and b) don't have access to the stdlib that lets you do powerful things without unsafe, then you're very limited in what you can do.
https://smallcultfollowing.com/babysteps/blog/2016/10/02/obs... discusses this further. Conceptually, you can think of "entirely Safe Rust" as a very limited language, to which you then progressively add "capabilities" by exposing safe interfaces implemented with unsafe code. For example, Vec and Box (which require unsafe internally) grant safe code the ability to do heap allocations.
It's true that this is not designed as a security boundary. As I note in my comment above, the PL/Rust devs also make that clear. That doesn't mean it has no value as part of a defence in depth strategy.
We do this too for PostgreSQL, to ensure the tests are really fast:
- we create a template database using the migrations
- for *every* integration test we do `CREATE DATABASE test123 TEMPLATE test_template;`
- we tune the PostgreSQL instance inside Docker to speed things up, for example by disabling synchronous_commit
On a successful test, we drop the test123 database. On a failed test, we keep the database around, so we can inspect it a bit.
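A sketch of the moving parts (database names are illustrative; the settings are for a throwaway test instance only):

```sql
-- Once, at suite start-up: create the template and run migrations against it.
CREATE DATABASE test_template;

-- Per test: clone the template (a fast file-level copy), drop it on success.
CREATE DATABASE test123 TEMPLATE test_template;
DROP DATABASE test123;

-- Tuning for a disposable Docker instance; never do this in production.
ALTER SYSTEM SET synchronous_commit = off;
ALTER SYSTEM SET fsync = off;
```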
The really great thing about this approach (IMHO) is that you can validate certain constraint violations.
For example, exclusion constraints are great for modelling use cases where overlapping ranges should be avoided. In our (Go) code, the test cases can use the SQLSTATE code, or the constraint name, to figure out if we hit the error we expect to hit.
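For instance (hypothetical table): a second, overlapping booking for the same room fails with SQLSTATE 23P01 (exclusion_violation), which the test can assert on.

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;  -- needed for "room WITH =" below

CREATE TABLE booking (
    room   int,
    during tstzrange,
    -- No two rows may have the same room AND overlapping time ranges.
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```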
This approach is pretty much as fast as our unit tests (your mileage may vary), but it prevents way more bugs from being merged into our codebase.
Just did a sequential run (to get some better measurements), and this is an excerpt of the things happening in the PostgreSQL instance inside the Docker container, for creating and dropping the databases:
08:25:37.114 UTC [1456] LOG: statement: CREATE DATABASE "test_1675239937111796557" WITH template = test_template
[noise]
08:25:48.002 UTC [1486] LOG: statement: DROP DATABASE "test_1675239947937354435"
Start time of first test:
2023-02-01 08:25:03.633 UTC
Finish time of last test:
2023-02-01 08:26:13.861 UTC
82 tests, or 0.856 seconds per test (sequentially).
In parallel, we take 6.941 seconds for 82 tests, or 0.085 seconds per test.
We apply the overall same strategy (individual DB for each test created from a template).
Our whole test suite (integration + unit tests) takes ~80 seconds to run for ~800 integration tests (each with their own DB) and 300 unit tests. And that's on my dev laptop (T14s, cpu: i7-1185G7) without much optimization (mainly fsync = off in postgresql.conf).
In fact, I just ran a quick test, and just putting the DB on a tmpfs cuts that time to ~40 seconds.
So overall 0.1 to 0.05 seconds per test on average, the same ballpark as the parent (and it's actually a bit of an overestimation, since we have a dozen or so slow tests taking 5 to 10 seconds each).
A single query can write to multiple tables, using CTEs in PostgreSQL for example.
You could compose a SQL query that maps multiple result sets to one result set, although that feels a bit awkward:
WITH a AS (
    INSERT INTO a (k, v) VALUES ('a', 1.0) RETURNING *
), b AS (
    INSERT INTO b (k, v) VALUES ('b', 2.0) RETURNING *
)
SELECT row_to_json(a) FROM a
UNION ALL
SELECT row_to_json(b) FROM b;
You can only SELECT, you can't nest CTEs, the query planner has no understanding of joins that cross CTE boundaries so they are totally unoptimized, you can't use DISTINCT or GROUP BY, etc. Really, the CTE implementation in SQL Server is basically a parser-level hack.
Show me any SQL dialect that will allow the CTE you give here (edit: and what on earth is that supposed to actually mean?)
> Or can you only use SELECT for WITH queries
You can only use a SELECT inside a CTE (or should only be able to), because the 'E' stands for 'expression'. It seems Postgres does allow an INSERT with an output, which sort of makes sense, but I doubt it's in the standard.
Your last sentence makes no sense to me. Give an example.
Also, you failed to give an example showing that GROUP BY/DISTINCT aren't allowed in CTEs.
Good to know, I wasn't aware PostgreSQL supports this.
I'm currently on SQL Server and it doesn't support INSERT as a CTE (and I think most DBMSes out there still don't). It would definitely make my life easier if it did...