2

Broken handling of finnish/swedish alphabets in the People/contacts app

asked 2014-01-03 10:54:28 +0300

Rauha

updated 2014-01-03 15:18:53 +0300

MattVogt

When searching contacts, the People app fails to distinguish correctly between the letters A/O and Å/Ä/Ö. This causes the following problems:

Broken and amateurish grammar

These letters are not umlauts or marks for emphasis. They are independent members of the alphabet that have their own pronunciation rules and are in no way interchangeable.

Unlike the Scandinavian languages, Finnish also follows vowel harmony, and the A/Ä and O/Ö pairs clash strongly with each other. They cannot even be in the same syllable. Breaking vowel harmony gives, for lack of a better way of describing it, a yucky feeling in the brain for a native Finnish speaker.

Adds clutter to search results

No grammar rule, nor any native speaker, would ever mix them the way the app currently does. It's annoying and pointless clutter.

Internal inconsistency

The bug only affects searches with A and O. Searching with Ä and Ö works correctly: the keyboard doesn't suggest names starting with A if I search for Ä.

This also seems to affect only the non-English/Dutch letters of the Finnish and Swedish alphabets. The Danish/Norwegian letters Æ and Ø are handled correctly, despite the fact that they are the equivalents of the Finnish/Swedish Ä and Ö. Further confusion for anyone with lots of friends around the Nordics.

External inconsistency

Localisation in the rest of Sailfish and its core apps handles these letters grammatically correctly. Typing something on the keyboard starting with A (correctly) does not suggest words starting with Ä.


Comments

'Æ and Ø are handled correctly' - this is true from the perspective of the bug reported here, but the converse is true when searching from another locale. These characters need special handling for diacritic-insensitive search, since they do not have Unicode decompositions starting with 'A' and 'O' respectively. We will need this for naive searching to work. I guess the German 'sz' ligature (ß) will also need special handling, and we can add all the others as they're reported :)

MattVogt (2014-01-07 09:33:36 +0300)
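
To illustrate the point in the comment above: Æ, Ø and ß have no Unicode decomposition, so stripping combining marks after NFD normalization leaves them untouched. The following is a minimal Python sketch, not the People app's code; the `EXTRA_FOLDS` table and the `fold` helper are hypothetical names, and the choice of "ae" and "o" as fold targets is an assumption.

```python
import unicodedata

# Hypothetical fold table for letters that have no Unicode decomposition.
EXTRA_FOLDS = {"\u00e6": "ae", "\u00f8": "o"}  # æ, ø


def fold(text):
    """Return a diacritic-insensitive, case-insensitive search key."""
    # casefold() lowercases and already maps the German sharp s to "ss".
    decomposed = unicodedata.normalize("NFD", text.casefold())
    # Drop combining marks, so "ö" (o + diaeresis) becomes plain "o".
    without_marks = (ch for ch in decomposed if not unicodedata.combining(ch))
    # Letters with no decomposition need an explicit mapping.
    return "".join(EXTRA_FOLDS.get(ch, ch) for ch in without_marks)
```

With this sketch, `fold("Ørsted")` gives `"orsted"` and `fold("Öberg")` gives `"oberg"`, whereas NFD normalization alone returns `"Ø"` unchanged.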

3 Answers

2

answered 2014-01-04 21:44:26 +0300

Rauha

updated 2014-01-04 21:56:07 +0300

@sp3000

Well, I've been using computing devices since I got my Commodore 64 in 1984. This is the first localised software I've used in those 30 years that does not handle the Finnish alphabet correctly (and that includes every contacts search in email or phone contacts apps). Not only do European and Asian software makers handle this correctly, but even American ones in their anglophile bubble know how to get this right.

I supposed it was bound to happen eventually that I would find software so badly localised, but I'm mightily surprised to see it happen in a product from a Finnish corporation that currently only has resellers in Finland. I suppose the most worrying aspect of your answer to this question, and to a few related questions in other threads, is that there seems to be no interest at all in fixing this problem. Ever thought that search handling should be based on language settings, as it seems to be in just about every competing product?

And please, nobody 'fix' the 'bug' with æ and ø. At least let us non-English speakers have those letters handled correctly, and you English speakers can continue writing software based on your own alphabet.


Comments

@Rauha: to the best of my knowledge, sp3000 is not a representative of Jolla. Accusing him/her of disinterest in fixing the problem is verging on offensive behaviour, and certainly not helping to build a collaborative community on this site.

MattVogt (2014-01-07 07:47:34 +0300)
1

answered 2014-01-07 07:43:18 +0300

MattVogt

Currently, the People app search works by normalizing Unicode characters to decomposed form. Then, if the search term includes diacritics, we test that they are present in the match, and if the search term does not include diacritics, we perform a diacritic-insensitive match.
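
The behaviour described above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the People app's implementation: the function names are hypothetical, and prefix matching plus case-insensitivity are assumed, since the description does not say how names are compared beyond the handling of diacritics.

```python
import unicodedata


def decompose(text):
    """Casefold and normalize to decomposed form (NFD)."""
    return unicodedata.normalize("NFD", text.casefold())


def strip_marks(text):
    """Decompose, then drop all combining marks (diacritics)."""
    return "".join(ch for ch in decompose(text) if not unicodedata.combining(ch))


def has_marks(text):
    """True if the decomposed text contains any combining mark."""
    return any(unicodedata.combining(ch) for ch in decompose(text))


def matches(term, name):
    """Assumed prefix match following the rule described above."""
    if has_marks(term):
        # Search term has diacritics: they must be present in the name.
        return decompose(name).startswith(decompose(term))
    # Search term has no diacritics: match diacritic-insensitively.
    return strip_marks(name).startswith(strip_marks(term))
```

Under this sketch, `matches("A", "Äijä")` is true while `matches("Ä", "Aalto")` is false, which is the asymmetry reported in the question.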

I think this issue can be corrected by testing whether the original Unicode characters are members of the current locale's alphabet, prior to decomposition; if they are alphabet members in their composed form, we will not decompose them, and thus insist upon diacritic-sensitive matching. This would still allow users to match foreign names without the diacritics, as long as their device is in a language that does not consider those characters to be proper alphabetic members.
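
A minimal Python sketch of this proposal, again under assumptions: the `LOCALE_LETTERS` table is hypothetical and deliberately incomplete, the function names are illustrative, and prefix matching is assumed. It is not the code that was shipped.

```python
import unicodedata

# Hypothetical per-locale table of composed letters that count as
# alphabet members in their own right (escapes: å, ä, ö).
LOCALE_LETTERS = {
    "fi": frozenset("\u00e5\u00e4\u00f6"),
    "sv": frozenset("\u00e5\u00e4\u00f6"),
    "en": frozenset(),
}


def partial_decompose(text, locale):
    """Decompose every character except the locale's own letters."""
    protected = LOCALE_LETTERS.get(locale, frozenset())
    composed = unicodedata.normalize("NFC", text.casefold())
    return "".join(
        ch if ch in protected else unicodedata.normalize("NFD", ch)
        for ch in composed
    )


def strip_marks(text):
    """Drop combining marks; protected letters stay intact."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


def matches(term, name, locale):
    """Assumed prefix match that respects the locale's alphabet."""
    t = partial_decompose(term, locale)
    n = partial_decompose(name, locale)
    if any(unicodedata.combining(ch) for ch in t):
        # Term still carries foreign diacritics: require them in the name.
        return n.startswith(t)
    # Otherwise ignore foreign diacritics, but never the locale's letters.
    return strip_marks(n).startswith(t)
```

With the locale set to `"fi"`, `matches("A", "Äijä", "fi")` is false but `matches("E", "Émile", "fi")` is still true; with `"en"`, `matches("A", "Äijä", "en")` remains true.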


Comments

Now, this is exactly what I meant by my previous comment.

jsiren (2014-01-09 08:37:18 +0300)

The change suggested in this comment has been implemented and released in version 1.0.3.8. Please report if the new behavior is still below expectations.

MattVogt (2014-02-02 09:33:18 +0300)
0

answered 2014-01-03 14:10:09 +0300

sp3000

Matching extended Latin characters in the data by common Latin subset characters in the search spec is necessary for speakers of one language to find names in another language without being overly familiar with the target alphabet. I'd be pretty well stumped remembering offhand which way the accent on an e might be pointing in a French name or whether to dot the i in a Turkish name, and I imagine the dotted variety of letters in some Nordic languages is equally obscure to the average English speaker, for example. (The bit about æ and ø not being matched by ae and o sounds like a bug?)


Comments

1

Downvoted because it's not that simple (if you meant just direct mapping). The search tool needs to respect the user's locale, and collate accordingly. In Finnish distinguish between o and ö (Koli/köli), in English not ("coördinator"). This is a non-trivial problem; however, solutions exist.

jsiren (2014-01-04 23:37:13 +0300)

Thank you so much, jsiren, for your comment. Finally some support on this problem from someone not used to the English(/Dutch) alphabet.

Rauha (2014-01-05 00:08:21 +0300)

@jsiren well yeah, I try not to suggest non-trivial things ;) but sure, you can exclude mapping to those data chars that are directly available on the user's keyboard, for instance, and that'll reduce some noise. (Note that æ and ø are rather hidden on the fi keyboard, and ...entirely unavailable in en, oops?)

sp3000 (2014-01-05 11:15:49 +0300)
Stats

Asked: 2014-01-03 10:54:28 +0300

Seen: 249 times

Last updated: Jan 07 '14