
More than two bytes wide Unicode characters are broken

asked 2014-02-26 19:40:56 +0300

Penguin

updated 2015-02-03 02:08:48 +0300

Nicolas

There seems to be some kind of mishandling of Unicode characters that are encoded with more than two bytes, meaning all characters beyond roughly code point U+FF00 (not exact, but in any case before U+FFFF). Such characters are broken into two or more characters, each one or two bytes wide. I have found this issue in the virtual keyboard (Popper.qml), Android text input and SMS sending (e.g. 😄😃😊😉😍😘😚).
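To make the "wider than two bytes" description concrete, here is a minimal Python sketch (not Sailfish or Qt code; the function names are made up for illustration) that reports how many UTF-8 bytes and UTF-16 code units a character needs, and what a consumer sees if it treats every 16-bit unit as a separate character.

```python
def encoded_widths(ch):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one character."""
    utf8_bytes = len(ch.encode("utf-8"))
    utf16_units = len(ch.encode("utf-16-be")) // 2
    return utf8_bytes, utf16_units


def naive_16bit_split(text):
    """Mimic a consumer that treats every 16-bit unit as a whole character."""
    data = text.encode("utf-16-be")
    return ["0x%04X" % int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]


if __name__ == "__main__":
    # ASCII letter, Latin-1 letter, euro sign (BMP), emoji (outside the BMP)
    for ch in ["A", "\u00e9", "\u20ac", "\U0001F604"]:
        print("U+%04X" % ord(ch), encoded_widths(ch), naive_16bit_split(ch))
```

Only the emoji (U+1F604) needs two 16-bit units; splitting it between them yields two halves that are not characters on their own, which would match the "broken into two or more characters" symptom if that is indeed what happens on the device.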

Sailfish OS has no stock input method for typing such characters; they can only be entered by copy-pasting received text or by installing a third-party virtual keyboard that includes such symbols (e.g. The Emoji Keyboard, https://openrepos.net/content/penguin/emoji-keyboard).

SailfishOS does not handle variation selectors properly either, even if the font defines them properly. A variation selector is shown as a square instead of selecting the alternative variant of the glyph, or even being ignored. Similarly, symbols constructed from two characters by placing one on top of the other do not display correctly: depending on the symbol, they are shown either as two symbols side by side or as a symbol next to a square (e.g. 1⃣2⃣3⃣4⃣5⃣6⃣7⃣8⃣9⃣0⃣).
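For reference, the sequences involved can be inspected with Python's standard `unicodedata` module. This is only a sketch for examining the text itself, not the renderer; `describe` is a hypothetical helper, and the variant with U+FE0F is included as an assumption about what an emoji-style keycap sequence looks like.

```python
import unicodedata


def describe(text):
    """List (code point, Unicode name) for every code point in text."""
    return [("U+%04X" % ord(c), unicodedata.name(c, "<unnamed>")) for c in text]


# Digit one followed by the combining mark that draws the keycap around it
keycap_one = "1\u20E3"
# Same sequence with a variation selector between the digit and the mark
keycap_one_with_selector = "1\uFE0F\u20E3"

if __name__ == "__main__":
    for sample in (keycap_one, keycap_one_with_selector):
        for code_point, name in describe(sample):
            print(code_point, name)
        print()
```

A text engine that supports these sequences draws one glyph for each sample; the symptoms described above correspond to the mark or the selector being drawn as a separate glyph or a square.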

  • The SailfishOS version 1.0.4.20 update did not fix this issue
  • Still unfixed in SailfishOS version 1.1.1.27

1 Answer


answered 2014-06-18 14:34:01 +0300

jbrooks

There is quite a bit of confusion here, so I'm going to start a separate answer to try to clarify it.

First of all, UTF-16 (and by extension QString, QChar, and Qt as a whole) can encode the entire Unicode character set, including the characters that can't be represented in 16 bits. Surrogate pairs are supported, and the characters do render correctly in Qt applications. Surrogate pairs are often ignored in text parsing, and that can result in some issues, but it's generally trivial.
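As an illustration of how a surrogate pair represents a character that does not fit in 16 bits, here is a small Python sketch of the standard UTF-16 arithmetic. The function names are invented for this example and this is not Qt code; Qt does the equivalent internally.

```python
def to_surrogate_pair(code_point):
    """Split a code point outside the BMP into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000          # 20 significant bits remain
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


if __name__ == "__main__":
    high, low = to_surrogate_pair(0x1F604)
    print("U+1F604 -> 0x%04X 0x%04X" % (high, low))
    print("round trip: U+%04X" % from_surrogate_pair(high, low))
```

Code that indexes or truncates a string by 16-bit unit without checking for the 0xD800 to 0xDFFF range can separate the two halves, which is the kind of parsing issue mentioned above.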

So, breaking down the specific problems here:

  • Virtual keyboard (Popper.qml) not handling surrogate pairs

None of the built-in keyboards have characters outside the BMP that would require more than one QChar to encode. The keyboard QML is not considered a real API at present. Unfortunately, I don't think we can accept community patches to it yet.

  • Android text input

I'm not sure what you mean by this.

  • SMS sending

Our modem is failing to encode Unicode surrogate pairs in outgoing SMS properly. I will file a bug and look into this.
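The answer does not say how the encoding went wrong, so the following is only a sketch of what a correct payload would look like, assuming the 16-bit SMS alphabet carries the text as big-endian UTF-16 with surrogate pairs kept intact. The helper names and the constant are made up for this example and are not taken from ofono.

```python
# A single SMS carries 140 octets of user data, i.e. 70 sixteen-bit units.
UNITS_PER_SINGLE_SMS = 70


def encode_sms_text(text):
    """Encode text as big-endian 16-bit units, keeping surrogate pairs intact."""
    return text.encode("utf-16-be")


def units_used(text):
    """Count 16-bit units; a character outside the BMP counts as two."""
    return len(encode_sms_text(text)) // 2


def fits_single_sms(text):
    """Check whether the text fits in one unsegmented 16-bit SMS."""
    return units_used(text) <= UNITS_PER_SINGLE_SMS


if __name__ == "__main__":
    message = "Hi \U0001F604"
    print(encode_sms_text(message).hex())
    print(units_used(message), fits_single_sms(message))
```

Note that the emoji consumes two of the 70 available units, so length checks have to count code units rather than visible characters.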

  • Variation selector

I am not very familiar with Unicode variation selectors. Can you please try a simple QML example rendering a Unicode character with a variation selector using a specific font, and either:

1) Demonstrate that the same font renders the same character correctly using other non-Qt text engines, or
2) Demonstrate that the same QML with the same font works on desktop Qt but not on Sailfish

My guess is that this is a feature missing in Qt's text rendering. If you can make a test case for that, we can report it upstream.

I hope this helps.


Comments

For the curious, outgoing SMS containing Emoji characters are fixed by https://github.com/nemomobile-packages/ofono/pull/236

jbrooks (2014-06-18 15:14:56 +0300)

@jbrooks: Does that mean we will see it in an update soon?

Mohjive (2014-07-14 15:25:18 +0300)

@Mohjive: It won't be in the July update (update 8), but it's integrated for the one after that.

jbrooks (2014-07-14 15:30:39 +0300)