
Just a small detail that isn't mentioned in the article:

in NFC form, "base characters and modifiers are combined into a single rune whenever possible"

the interesting detail is "whenever possible": since NFC works by first decomposing and then recomposing, there are some cases in which, even after NFC normalization, the characters remain decomposed

an example is 𝅘𝅥𝅮 (U+1D160, MUSICAL SYMBOL EIGHTH NOTE), whose normalized composed (NFC) form is made of 3 different codepoints
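A minimal Go sketch of this behaviour (not from the article; it assumes the golang.org/x/text/unicode/norm package is available):

    // NFC of U+1D160 stays decomposed into three codepoints,
    // because U+1D160 is in the composition exclusion set.
    package main

    import (
        "fmt"

        "golang.org/x/text/unicode/norm"
    )

    func main() {
        s := "\U0001D160" // MUSICAL SYMBOL EIGHTH NOTE

        for _, r := range norm.NFC.String(s) {
            fmt.Printf("U+%04X ", r)
        }
        fmt.Println()
        // Expected: U+1D158 U+1D165 U+1D16E
    }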

I tried to look at the algorithm for generating the composition table, and it seems it's generated from the decomposition table... if that's so, I can't understand how it could happen that some code points have an NFC form longer than one codepoint

more details: http://stackoverflow.com/questions/17897534/can-unicode-nfc-...

does anyone know the cause behind this?



1. It's decompose, reorder, compose. So you can see some weird stuff like ḋ◌̣ (U+1E0B U+0323) → NFD = d◌̣◌̇ (U+0064 U+0323 U+0307) → NFC = ḍ◌̇ (U+1E0D U+0307)
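As a sketch of that decompose/reorder/compose pipeline (again assuming golang.org/x/text/unicode/norm):

    // ḋ (U+1E0B) followed by combining dot below (U+0323) is
    // decomposed, canonically reordered (dot below before dot above),
    // and recomposed into ḍ (U+1E0D) followed by dot above (U+0307).
    package main

    import (
        "fmt"

        "golang.org/x/text/unicode/norm"
    )

    func dump(label, s string) {
        fmt.Printf("%s:", label)
        for _, r := range s {
            fmt.Printf(" U+%04X", r)
        }
        fmt.Println()
    }

    func main() {
        s := "\u1E0B\u0323" // ḋ + combining dot below

        dump("input", s)                  // U+1E0B U+0323
        dump("NFD  ", norm.NFD.String(s)) // U+0064 U+0323 U+0307
        dump("NFC  ", norm.NFC.String(s)) // U+1E0D U+0307
    }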

2. It's not compression, it's normalisation. So it's not "compose everything you can". I cannot tell you the exact algorithm off the top of my head, but:

the reason for U+1D160 is that it's in the CompositionExclusions list.


Thanks, after looking up CompositionExclusions I discovered the rationale:

http://unicode.org/reports/tr15/#Primary_Exclusion_List_Tabl...

> When a character with a canonical decomposition is added to Unicode, it must be added to the composition exclusion table if there is at least one character in its decomposition that existed in a previous version of Unicode. If there are no such characters, then it is possible for it to be added or omitted from the composition exclusion table. The choice of whether to do so or not rests upon whether it is generally used in the precomposed form or not.



