Something I've never been able to find satisfactory information on (and unfortunately this article also declares it out of scope) is what the actual, hard on-the-wire and on-disk differences between SDR and HDR are. Like yes, I know HDR = high dynamic range = bigger difference between light and dark, but what technical changes were needed to accomplish this?
The way I understand it, we've got the YCbCr that is being converted to an RGB value which directly corresponds to how bright we drive the R, G, and B subpixels. So wouldn't the entire range already be available? As in, post-conversion to RGB you've got 256 levels for each channel which can be anywhere from 0 to 255 or 0% to 100%? We could go to 10-bit color which would then give you finer control with 1024 levels per channel instead of 256, but you still have the same range of 0% to 100%. Does the YCbCr -> RGB conversion not use the full 0-255 range in RGB?
Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant, but that wouldn't change the on-disk or on-the-wire formats. Those formats have changed (video files are specifically HDR or SDR and operating systems need to support HDR to drive HDR monitors), so clearly I am missing something but all of my searches only find people comparing the final image without digging into the technical details behind the shift. Anyone care to explain or have links to a good source of information on the topic?
The keywords you're missing are color spaces and gamma curves. For a given bandwidth, we want to efficiently allocate color encoding as well as brightness (logarithmically to capture the huge dynamic range of perceptible light). sRGB is one such standard that we've all agreed upon, and output devices all ostensibly shoot for the sRGB target, but may also interpret the signal however they'd like. This is inevitable, to account for the fact that not all output devices are equally capable. HDR is another set of standards that aims to expand the dynamic range, while also pinning those values to actual real-life brightness values. But again, TVs and such may interpret those signals in wildly different ways, as evidenced by the wide range of TVs that claim to have "HDR" support.
This was probably not the most accurate explanation, but hopefully it's enough to point you in the right direction.
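To make the gamma-curve point a bit more concrete, here's a rough Python sketch (the decode step is the standard sRGB transfer function; the rest is just illustration) showing that 8-bit sRGB code values spend far more precision near black than near white:

```python
# Minimal sketch: how the sRGB transfer function spends 8-bit code values.
# The decode step below is the standard sRGB formula; everything else is
# just illustration.

def srgb_to_linear(c: float) -> float:
    """Convert a normalized sRGB value (0..1) to linear light (0..1)."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

def linear_step(code: int) -> float:
    """Linear-light difference between adjacent 8-bit sRGB codes."""
    return srgb_to_linear((code + 1) / 255) - srgb_to_linear(code / 255)

# Near black, one code step is a tiny change in linear light;
# near white, one code step is a much bigger change.
print(f"step 10 -> 11:   {linear_step(10):.6f}")   # ~0.0003
print(f"step 250 -> 251: {linear_step(250):.6f}")  # ~0.009
```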
> Naturally, we can stick brighter backlights in our monitors to make the difference between light and dark more significant,
It's actually the opposite that makes the biggest difference with the physical monitor. CRTs always had a residual glow that caused blacks to be grays; it was very hard to get true black on a CRT unless it was off and had been for some time. It wasn't until a pixel could emit no light at all that black was actually black.
Sony did a demo when they released their OLED monitors where they had their top model of each monitor type side by side: CRT, LCD, OLED. The CRT was just gray while the OLED was actually black, to the point that I was thinking in my head that surely this was a joke and the OLED wasn't actually on. That's precisely when the narrator said "and just to show that the monitors are all on" as the video switched to a test pattern.
As for the true question you're getting at, TFA mentions things like the color matrix, primaries, and transfer settings in the file. Depending on those values, the decoder makes decisions about the math it uses to calculate the output values. You can apply any of those settings to the same video and arrive at different results. Using the wrong ones will make your video look bad, so ensuring your file carries the correct values is important.
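To show why the matrix flag matters, here's a small sketch (just the standard BT.601 vs BT.709 luma coefficients, nothing from TFA) decoding the same YCbCr sample with each matrix and getting different RGB out:

```python
# Sketch: decoding the same (full-range, normalized) YCbCr sample with
# BT.601 vs BT.709 coefficients. The coefficients are the standard ones;
# the sample value is arbitrary.

def ycbcr_to_rgb(y: float, cb: float, cr: float, kr: float, kb: float):
    """Y in 0..1, Cb/Cr in -0.5..0.5, full range, no quantization."""
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return (r, g, b)

sample = (0.5, 0.2, -0.1)  # arbitrary Y, Cb, Cr

bt601 = ycbcr_to_rgb(*sample, kr=0.299, kb=0.114)    # SD matrix
bt709 = ycbcr_to_rgb(*sample, kr=0.2126, kb=0.0722)  # HD matrix

print("BT.601:", [round(v, 3) for v in bt601])
print("BT.709:", [round(v, 3) for v in bt709])
# Same bytes in the file, visibly different colors out of the decoder.
```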
Note that CRTs did not have bad blacks; they were far better than LCD displays. I am currently using an IPS display and it has pretty good blacks, notably better than a normal LCD display. But I remember CRTs being even better (probably just me being nostalgic for the good ol' days when we were staring straight into an electron beam with only an inch of leaded glass to protect us). I don't think they were lying; OLEDs are very, very good (except for the burn-in issue, but that's solvable), but I would be wary about the conclusions of a demo designed to sell something.
For what it's worth, the display I liked best was a monochrome terminal, a VT220. Let me explain: a CRT does not really have pixels as we think of them on a modern display, but it does have a shadow mask, which is nearly the same thing. However, a monochrome CRT (as found in a terminal or oscilloscope) has no shadow mask. The text on that VT220 was tight, and it was a surprisingly good reading experience.
Clicking the "Limit to SDR" and "Allow Full HDR (as supported)" should show a significant difference if you device supports HDR. If you don't see a difference then your device doesn't support HDR (or your browser)
For these images, there's a specific extension to JPEG where they store the original JPEG like you've always seen, plus a separate embedded gain map that adds brightness if the device supports it. That's for stills (JPEGs), not video, but the "on the wire" difference is that gain map.
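Very roughly, and leaving out the real spec's offsets, gamma, and min/max gain metadata, the gain-map idea looks something like this (a deliberately simplified sketch, not the actual gain-map math):

```python
# Deliberately simplified sketch of the gain-map idea: the base SDR image
# is rendered as-is on SDR displays; on an HDR display, each pixel gets
# boosted by a per-pixel gain, up to the headroom the metadata allows.
# The real formats add offsets, a gamma on the map, and min/max gain values.

def apply_gain_map(sdr_linear: float, gain: float, headroom: float) -> float:
    """
    sdr_linear: base image pixel in linear light (0..1)
    gain:       gain-map pixel (0 = no boost, 1 = full boost)
    headroom:   how much brighter the HDR rendition may get (e.g. 4x)
    """
    return sdr_linear * headroom ** gain

# A bright highlight gets pushed well above SDR white...
print(apply_gain_map(0.9, 1.0, 4.0))   # 3.6
# ...while a midtone with a low gain value barely moves.
print(apply_gain_map(0.3, 0.1, 4.0))   # ~0.34
```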
I'm not an expert, but for videos, at the moment, as far as I can tell, they switched them to 10 bits (SDR is 8 bits) and added metadata that maps those 10 bits to values above "white", where white = 100 nits. This metadata (PQ or HLG) can map those 10 bits up to 10,000 nits.
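For the PQ half of that, the transfer function itself is public (SMPTE ST 2084), and a quick sketch of how a 10-bit code maps to absolute nits looks like this (constants are the standard PQ ones; for simplicity this treats the code as full range, while real files usually use limited range):

```python
# Sketch of the PQ (SMPTE ST 2084) EOTF: a nonlinear 0..1 signal value is
# mapped to an absolute luminance in cd/m^2 (nits), up to 10,000.
# Constants are the standard PQ ones. For simplicity this treats the 10-bit
# code as full range (0..1023); real video files typically use limited range.

M1 = 2610 / 16384           # 0.1593017578125
M2 = 2523 / 4096 * 128      # 78.84375
C1 = 3424 / 4096            # 0.8359375
C2 = 2413 / 4096 * 32       # 18.8515625
C3 = 2392 / 4096 * 32       # 18.6875

def pq_to_nits(signal: float) -> float:
    """signal: nonlinear PQ value in 0..1; returns luminance in nits."""
    e = signal ** (1 / M2)
    num = max(e - C1, 0.0)
    den = C2 - C3 * e
    return 10000.0 * (num / den) ** (1 / M1)

for code in (0, 512, 1023):
    print(code, round(pq_to_nits(code / 1023), 1))
# 0 -> 0 nits, 512 -> ~93 nits, 1023 -> 10000 nits: the curve is very
# nonlinear, spending most codes on the darker end where eyes are sensitive.
```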
HDR is nothing more than metadata about the color space. The way the underlying pixel data is encoded does not change. HDR consists of:
1. A larger color space, allowing for more colors (through different color primaries) and a higher brightness range (through a different gamma function)
2. Metadata (either static, or per-scene/per-frame) such as a scene's peak brightness or concrete tonemapping settings, which can help players and displays map the video's colors to the set of colors the display can actually show.
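As a toy illustration of point 2, here's roughly how a player might use that metadata (the field names and the naive tonemap below are made up for the example; real players and displays use much more sophisticated curves):

```python
# Toy illustration: a player reads the content's peak brightness from
# metadata and decides how to squeeze it onto a less capable display.
# Field names and the tonemapping are made up for this example.

from dataclasses import dataclass

@dataclass
class HdrStaticMetadata:
    max_content_nits: float    # brightest pixel anywhere in the content
    max_frame_avg_nits: float  # brightest frame-average luminance

def tonemap(pixel_nits: float, meta: HdrStaticMetadata,
            display_peak_nits: float) -> float:
    """Naive tonemap: scale so the content's peak lands on the display's peak."""
    if meta.max_content_nits <= display_peak_nits:
        return pixel_nits  # display can show everything as-is
    return pixel_nits * display_peak_nits / meta.max_content_nits

meta = HdrStaticMetadata(max_content_nits=4000.0, max_frame_avg_nits=400.0)
print(tonemap(4000.0, meta, display_peak_nits=600.0))  # 600.0
print(tonemap(100.0, meta, display_peak_nits=600.0))   # 15.0
```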
I actually have a more advanced but more compact "list of resources" on video stuff in another gist; that has a section on color spaces and HDR:
> With 10 bits per sample, Rec. 2020 uses video levels where the black level is defined as code 64 and the nominal peak is defined as code 940. Codes 0–3 and 1,020–1,023 are used for the timing reference. Codes 4 through 63 provide video data below the black level, while codes 941 through 1,019 provide video data above the nominal peak.
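In code terms, those limited-range levels mean a decoder normalizes roughly like this (a sketch of the level mapping described in the quote, not any particular decoder's implementation):

```python
# Sketch of the 10-bit limited ("video") range levels from the quote:
# code 64 is reference black, code 940 is nominal peak, and codes outside
# 4..1019 are reserved for timing rather than video data.

BLACK = 64
PEAK = 940

def normalize_10bit_video_level(code: int) -> float:
    """Map a 10-bit limited-range luma code to a nominal 0..1 signal."""
    if code < 4 or code > 1019:
        raise ValueError("codes 0-3 and 1020-1023 are timing reference, not video")
    return (code - BLACK) / (PEAK - BLACK)

print(normalize_10bit_video_level(64))    # 0.0  (black)
print(normalize_10bit_video_level(940))   # 1.0  (nominal peak)
print(normalize_10bit_video_level(1019))  # >1.0 (above nominal peak)
```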