
Arabic letters render differently depending on where they sit in a word: at the start, in the middle, at the end, or next to certain other characters.

This is the main problem you sometimes see in movies that try to show something in Arabic and get the rendering wrong. They probably take each letter on its own, in its isolated form, and try to build the words that way, so the letters do not join and the whole thing looks like a mess.
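To make the positional forms concrete, here is a minimal Python sketch. It relies on the fact that Unicode also carries legacy "presentation form" code points whose compatibility decompositions name the position they represent. Normal Arabic text stores only the base letter (here U+0628, BEH) and leaves the choice of shape to the font and shaping engine; the helper name `describe_forms` is made up for illustration.

```python
import unicodedata

# Compatibility "presentation form" code points for ARABIC LETTER BEH (U+0628).
# Ordinary text stores only U+0628; a shaping engine picks the right glyph.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

def describe_forms(forms):
    """Return (position tag, base letter name) for each presentation form."""
    result = []
    for ch in forms:
        # decomposition() yields e.g. "<initial> 0628": a tag plus the base letter
        tag, base_hex = unicodedata.decomposition(ch).split()
        base = chr(int(base_hex, 16))
        result.append((tag, unicodedata.name(base)))
    return result

for tag, name in describe_forms(BEH_FORMS):
    print(tag, name)
```

All four code points decompose to the same base letter, which is the point: one letter, four shapes. Laying out the isolated shape for every letter is roughly the mistake described above.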



Fwiw, Greek also has one letter (sigma) whose rendering depends on where in the word it appears. It's the same letter; it just looks different when it is word-final. But Unicode decided to split it into two codepoints rather than treat it as a rendering issue. Therefore the rendering of Greek codepoints never depends on position within the word, even though the rendering of Greek letters can. Instead it's up to the user to make sure that a word-final lowercase sigma is encoded with a different codepoint, GREEK SMALL LETTER FINAL SIGMA (U+03C2), 'ς', rather than the usual codepoint, GREEK SMALL LETTER SIGMA (U+03C3), 'σ'.
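A small Python sketch of what "up to the user" can look like in practice. The helper `fix_final_sigma` is hypothetical and uses a deliberately naive rule (a sigma preceded by a word character and not followed by one), so it ignores edge cases such as abbreviations or elision marks. Python's own `str.lower()` applies a similar context rule when lowercasing a capital sigma.

```python
import re
import unicodedata

SIGMA = "\u03C3"        # σ  GREEK SMALL LETTER SIGMA
FINAL_SIGMA = "\u03C2"  # ς  GREEK SMALL LETTER FINAL SIGMA

# Naive rule (an assumption, not a full implementation of Unicode's
# Final_Sigma condition): sigma after a word character, not before one.
_WORD_FINAL_SIGMA = re.compile(r"(?<=\w)" + SIGMA + r"(?!\w)")

def fix_final_sigma(text):
    """Replace word-final U+03C3 with U+03C2; leave other sigmas alone."""
    return _WORD_FINAL_SIGMA.sub(FINAL_SIGMA, text)

print(unicodedata.name(SIGMA))
print(unicodedata.name(FINAL_SIGMA))
print(fix_final_sigma("οδοσ σοφοσ"))
print("ΟΔΟΣ".lower())  # str.lower() picks the final form by context
```

Both lowercase forms uppercase to the same capital sigma (U+03A3), so the distinction is lost on a round trip through uppercase unless the lowercasing step is context-aware.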



