Cannot (easily) find songs/albums with Turkish letter İ

asked 2016-01-05 14:07:07 +0300

ossi1967

How to reproduce

Tag media files (in my case: MP3) with a string that contains the Turkish/Azerbaijani character İ (Unicode 'Latin capital letter I with dot above' U+0130) in either the title or the album string. (I guess it also happens with the Artist, but I didn't check.)

Wait for the file to be indexed.

Open the stock Media Player app and search for the audio file using a case-insensitive search (instead of İ, type lower case i, the regular ASCII character present on most keyboards).

Expected results

The string containing the capital İ is found when typing lower case i, just as lower case a matches upper case A in the search.

Actual results

Nothing is found.

Possible workarounds

Install and activate the Turkish keyboard to type upper case İ in the search, or search for other substrings. For both workarounds you need to know what the actual problem is. People who are just used to typing lower-case strings in the search field will simply believe that "İsyankar" by Mustafa Sandal was accidentally deleted and won't try a workaround.

Additional information

Unicode defines i as the lower case equivalent of İ. On the other hand, the upper case equivalent of i can be either I (most Western languages) or İ (Turkish/Azerbaijani). This is because the upper case I in these languages pairs with the dotless lower case ı. It is possible that the comparison fails because the lower case search input is converted to upper case and i becomes I instead of İ. I don't know in which layer this happens, though.
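The suspected failure can be reproduced in plain JavaScript. This is only a sketch: it assumes the search boils down to a case-converted substring test, and `containsUpper` and `containsLower` are hypothetical names, not functions from the Media Player. It also shows that lowercasing both sides does not necessarily help, because ECMAScript's locale-independent `toLowerCase()` maps İ to i followed by a combining dot above (U+0307) rather than to a bare i.

```javascript
// Hypothetical matcher that uppercases both sides before comparing.
function containsUpper(title, term) {
    return title.toUpperCase().indexOf(term.toUpperCase()) !== -1;
}

// Hypothetical matcher that lowercases both sides before comparing.
function containsLower(title, term) {
    return title.toLowerCase().indexOf(term.toLowerCase()) !== -1;
}

var title = "\u0130syankar"; // "İsyankar", starting with U+0130

// "isyankar" uppercases to "ISYANKAR", which is not in "İSYANKAR".
console.log(containsUpper(title, "isyankar"));

// In engines that follow the ECMAScript case mapping, the lowercased
// title starts with "i" + U+0307, so "isyankar" is not a substring.
console.log(containsLower(title, "isyankar"));

// Plain ASCII letters behave as expected in both directions.
console.log(containsUpper("Abba", "abba"), containsLower("Abba", "abba"));
```

Whether the QML JavaScript engine on the device lowercases İ the same way has not been checked here, so the second result should be treated as an assumption for that environment.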


Comments

1

As far as I know, it happens in the UI/QML. The search works by checking whether the search term can be found in the title of the song and, if so, appending the song to the list. So this issue is in JavaScript (or whatever compares the strings), which doesn't think lowercased dot-I's and i's are equal. A workaround could be to replace all the dot-I's with normal i's (not in the song name that is being shown; I mean in the code where the song name is lowercased). However, this would slow down the search (though not by a noticeable amount if you don't have tons of music).
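A rough sketch of that workaround in plain JavaScript of the kind usable from QML. The names `searchKey` and `titleMatches` are hypothetical and not taken from the Media Player source, and the sketch assumes the search is a simple substring test on lowercased strings.

```javascript
// Hypothetical helper: builds the string used for matching only.
// The title shown in the UI is left untouched.
function searchKey(text) {
    // Replace U+0130 (İ) with a plain "i" before lowercasing, so the
    // result does not depend on how the engine lowercases İ.
    return text.replace(/\u0130/g, "i").toLowerCase();
}

// Hypothetical matcher: true if the folded term occurs in the folded title.
function titleMatches(title, term) {
    return searchKey(title).indexOf(searchKey(term)) !== -1;
}

console.log(titleMatches("\u0130syankar", "isyankar"));   // true
console.log(titleMatches("\u0130syankar", "\u0130syan")); // true
```

The extra `replace` is a single pass over each title, which is in line with the expectation that the slowdown stays unnoticeable for ordinary music collections.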

jollailija ( 2016-01-05 14:58:19 +0300 )
1

I do not use the stock Media Player, but I would expect pressing, say, 'a' during a search to match all variants: aAàÀáÁâÂäÄãÃåÅāĀ...

That should not be too difficult to do in QML. I did something similar in my SMS counter patch to optionally convert accented characters into plain ASCII. Creating the mapping was a bit arduous, but anyone is free to borrow the ready-made map from my patch :)
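An illustrative sketch of such a mapping in plain JavaScript. The table is a tiny hand-written sample, not the map from the SMS counter patch, and `FOLD_MAP` and `foldForSearch` are hypothetical names; a real table would need many more entries.

```javascript
// Sample folding table (assumption: only a few entries, for illustration).
// Keys are lowercase accented letters, plus the two Turkish i variants.
var FOLD_MAP = {
    "à": "a", "á": "a", "â": "a", "ä": "a", "ã": "a", "å": "a", "ā": "a",
    "İ": "i", "ı": "i"
};

// Fold a string to a lowercase, accent-free key for matching.
function foldForSearch(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var ch = text.charAt(i);
        // Look the character up as-is first (this catches İ before any
        // lowercasing), otherwise lowercase it and look it up again.
        if (!FOLD_MAP.hasOwnProperty(ch)) {
            ch = ch.toLowerCase();
        }
        out += FOLD_MAP.hasOwnProperty(ch) ? FOLD_MAP[ch] : ch;
    }
    return out;
}

console.log(foldForSearch("İsyankar")); // isyankar
console.log(foldForSearch("Àá"));       // aa
```

Applying the same function to both the song title and the search term before comparing them would let a plain 'a' or 'i' match its accented variants, at the cost of maintaining the table.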

pichlo ( 2016-01-05 23:41:05 +0300 )

Okay, the component they use for filtering looks like this:

FilterModel {
    id: filteredPluginList
    sourceModel: pluginList

    // Filter out if the value is not "1"
    filterRegExp: mainPageHeader.searchText !== "" ? /^1$/ : RegExp("")
}

So, now we know where the problem is?

jollailija ( 2016-01-06 11:01:28 +0300 )