descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Bugs which are supposed to be fixed in the next test version (not available yet)

Ddavid
Posts: 11
Joined: Sun Oct 13, 2019 6:44 pm

descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by Ddavid »

I'm on Japanese Windows and I have a folder whose name contains the Simplified Chinese character "蓝", which does not exist in Japanese fonts.

When I edit the description of this folder (with Ctrl+D), descript.ion contains "xx?xx" instead of "xx蓝xx", and the character cannot be displayed in XnView MP.
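
A minimal Python sketch of what is probably happening here, assuming the description is written in the system ANSI code page (cp932 on Japanese Windows) and that characters without a mapping are replaced by "?". The function name is made up for illustration; this is not XnView MP's actual code.

```python
def to_ansi_bytes(text: str, codepage: str = "cp932") -> bytes:
    """Encode text in a legacy code page, replacing unencodable characters with '?'."""
    return text.encode(codepage, errors="replace")


description = "xx蓝xx"
print(to_ansi_bytes(description))    # b'xx?xx' because cp932 has no mapping for "蓝"
print(description.encode("utf-8"))   # UTF-8 can represent the character (3 bytes)
```

Once the "?" has been written, the original character cannot be recovered from the file.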
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

By default, descript.ion doesn't use Unicode. Please create a descript.ion text file and save it as Unicode...
Pierre.
helmut
Moderator
Posts: 8688
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by helmut »

Hmm, I thought the big advantage of XnView MP compared to XnView Classic is its Unicode support. I think Unicode should also be supported for description files.
XnTriq
Moderator & Librarian
Posts: 6138
Joined: Sun Sep 25, 2005 3:00 am
Location: Ref Desk

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by XnTriq »

My findings (so far):
xnview wrote: Tue Sep 01, 2015 4:44 am
waily wrote: Mon Aug 31, 2015 2:37 am Is it possible to keep the old descript.ion but use Unicode internally for display and modification?
Something like adding a layer that transcodes ANSI to Unicode in file I/O.
This way XnView MP can handle everything internally in Unicode and still keep the old descript.ion.
I don't understand. I can't change the encoding for descriptions without breaking compatibility.
xnview wrote: Mon Jan 13, 2020 10:04 am By default, the descript.ion file created is ANSI (for compatibility). If you create an empty Unicode descript.ion file, it will be used.
xnview wrote: Thu Oct 29, 2020 2:14 pm
xephu wrote: Sun Oct 25, 2020 8:15 am Browser - Metadata - Encoding - Comment is also set to UTF-8.
This setting is for embedded comment.
By default, if descript.ion doesn't exist, ANSI is used. You can force UTF-8 by setting

useUtf8ForDescription=true

in xnview.ini.
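
For the "empty Unicode descript.ion" approach quoted above, here is a small Python sketch. It assumes that "Unicode" means UTF-8 with a BOM, which the quoted posts do not spell out, so treat it as a guess to test on a copy of a folder first. The helper name is hypothetical.

```python
import codecs
from pathlib import Path


def create_empty_utf8_description(folder: str) -> Path:
    """Create a descript.ion that contains only the UTF-8 BOM, unless one already exists."""
    target = Path(folder) / "descript.ion"
    if target.exists():
        raise FileExistsError(f"{target} already exists; not overwriting it")
    target.write_bytes(codecs.BOM_UTF8)  # bytes EF BB BF, no description entries yet
    return target
```

Refusing to overwrite an existing file is deliberate, since an existing descript.ion may hold descriptions in the old encoding.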
helmut
Moderator
Posts: 8688
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by helmut »

Thank you for gathering these valuable links and info, XnTriq.

useUtf8ForDescription=true
This setting sounds very promising. AFAIK, UTF-8 has partial ASCII compatibility, so I wonder why UTF-8 isn't activated by default and why this should break compatibility. The way it is now (UTF-8 disabled) is pretty useless for many Asian users.
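
A short Python sketch of where the compatibility question comes from, assuming an older program that reads descript.ion in a Western ANSI code page (cp1252 is only an example). ASCII-only descriptions are byte-identical in both encodings, but anything beyond ASCII is not.

```python
ascii_text = "holiday photos"
accented = "café"

# Pure ASCII produces identical bytes in UTF-8 and in a legacy code page.
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("cp1252")

# Non-ASCII text does not: an ANSI-only reader misinterprets the UTF-8 bytes.
misread = accented.encode("utf-8").decode("cp1252")

print(same_bytes)  # True
print(misread)     # cafÃ©
```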
XnTriq
Moderator & Librarian
Posts: 6138
Joined: Sun Sep 25, 2005 3:00 am
Location: Ref Desk

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by XnTriq »

helmut wrote: Tue Jul 26, 2022 7:49 pm This setting sounds very promising. AFAIK, UTF-8 has partial ASCII compatibility, so I wonder why UTF-8 isn't activated by default and why this should break compatibility. The way it is now (UTF-8 disabled) is pretty useless for many Asian users.
Pierre's remarks on UTF-8 with BOM:
xnview wrote: Mon Dec 28, 2020 1:11 pm By default, text files must be in the system codec. I can only detect UTF-8 text files with a BOM.
I'm assuming he refers to iptc.def and/or iptc.ini in this case.
https://en.wikipedia.org/wiki/UTF-8#Byte_order_mark wrote: If the UTF-16 Unicode byte order mark (BOM, U+FEFF) character is at the start of a UTF-8 file, the first three bytes will be 0xEF, 0xBB, 0xBF.

The Unicode Standard neither requires nor recommends the use of the BOM for UTF-8, but warns that it may be encountered at the start of a file trans-coded from another encoding. While ASCII text encoded using UTF-8 is backward compatible with ASCII, this is not true when Unicode Standard recommendations are ignored and a BOM is added. A BOM can confuse software that isn't prepared for it but can otherwise accept UTF-8, e.g. programming languages that permit non-ASCII bytes in string literals but not at the start of the file. Nevertheless, there was and still is software that always inserts a BOM when writing UTF-8, and refuses to correctly interpret UTF-8 unless the first character is a BOM (or the file only contains ASCII).
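
The kind of BOM-based detection Pierre describes could look roughly like this in Python. This is only a sketch under assumptions: the function name and the cp1252 default are mine, and it is not taken from XnView MP.

```python
import codecs


def decode_description(raw: bytes, system_codec: str = "cp1252") -> str:
    """Decode as UTF-8 if the data starts with the UTF-8 BOM, else use the system codec."""
    if raw.startswith(codecs.BOM_UTF8):
        # Strip the three BOM bytes (EF BB BF) before decoding the rest.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    return raw.decode(system_codec, errors="replace")
```

Without the BOM, UTF-8 bytes fall through to the system codec branch and would be misread, which matches the limitation quoted above.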
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

helmut wrote: Tue Jul 26, 2022 7:49 pm The way it is now (UTF-8 disabled) is pretty useless for many Asian users.
It's to stay compatible with old description files used by other apps.
Pierre.
helmut
Moderator
Posts: 8688
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by helmut »

xnview wrote: Wed Jul 27, 2022 7:33 am
helmut wrote: Tue Jul 26, 2022 7:49 pm The way it is now (UTF-8 disabled) is pretty useless for many Asian users.
It's to stay compatible with old description files used by other apps.
OK. But couldn't XnView have some smart logic that detects when the user enters non-ASCII characters in a description and then saves the description file in UTF-8? A message could warn that a description saved in UTF-8 might not be readable by older or other software, but I guess most users won't care because they don't use any software other than XnView to read description files.
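
A rough Python sketch of that idea, just to illustrate the rule. The function name and the cp1252 default are placeholders, and the warning message is left out.

```python
def choose_description_encoding(description: str, legacy_codec: str = "cp1252") -> str:
    """Keep the legacy codec for ASCII-only text, switch to UTF-8 otherwise."""
    if description.isascii():
        return legacy_codec  # the bytes are identical in both encodings anyway
    return "utf-8"


print(choose_description_encoding("holiday photos"))  # cp1252
print(choose_description_encoding("xx蓝xx"))           # utf-8
```

A gentler variant would switch only when the legacy code page cannot represent the text, so existing ANSI descriptions with accented letters stay untouched.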
Ddavid
Posts: 11
Joined: Sun Oct 13, 2019 6:44 pm

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by Ddavid »

helmut wrote: Wed Jul 27, 2022 8:19 pm
xnview wrote: Wed Jul 27, 2022 7:33 am
helmut wrote: Tue Jul 26, 2022 7:49 pm The way it is now (UTF-8 disabled) is pretty useless for many Asian users.
It's to stay compatible with old description files used by other apps.
OK. But couldn't XnView have some smart logic that detects when the user enters non-ASCII characters in a description and then saves the description file in UTF-8? A message could warn that a description saved in UTF-8 might not be readable by older or other software, but I guess most users won't care because they don't use any software other than XnView to read description files.
That's right. Why do I need to switch from XnView Classic to XnView MP? Because I want to build a fully Unicode environment. I don't care about other apps that cannot handle Unicode, because I will leave them behind as well.
Unicode support is one of the most important selling points of XnView MP. The official XnView page says: "Compared to XnView Classic: World-Wide compatible. XnView MP offers Unicode support. Enhanced translations for many languages as well as a brand new and convenient modular interface." Make Unicode the default, like the OS does.
You can still keep the option for users of non-Unicode apps. If they turn off Unicode, notify them with "The description/filename/folder name/content in zip file contains Unicode characters, please edit again" every time they type a Unicode character.
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

Ddavid wrote: Wed Jul 27, 2022 10:59 pm You can still keep the option for users of non-Unicode apps. If they turn off Unicode, notify them with "The description/filename/folder name/content in zip file contains Unicode characters, please edit again" every time they type a Unicode character.
Have you tried

useUtf8ForDescription=true
?
Pierre.
Ddavid
Posts: 11
Joined: Sun Oct 13, 2019 6:44 pm

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by Ddavid »

xnview wrote: Thu Jul 28, 2022 7:56 am
Ddavid wrote: Wed Jul 27, 2022 10:59 pm You can still keep the option for users of non-Unicode apps. If they turn off Unicode, notify them with "The description/filename/folder name/content in zip file contains Unicode characters, please edit again" every time they type a Unicode character.
Have you tried

useUtf8ForDescription=true
?
It "basically" works.
The problem is that even with

useUtf8ForDescription=true

old descript.ion files that are not Unicode are not converted to Unicode when they are modified. Only newly created descript.ion files are Unicode.
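
As a possible workaround outside XnView MP, an old file could be converted by hand. Below is a Python sketch, assuming the old file's code page is known and that XnView MP keeps using UTF-8 once the file starts with a BOM (I have not verified this for every case, so back up the file first). The function name is hypothetical.

```python
import codecs
from pathlib import Path


def convert_description_to_utf8(path: str, legacy_codec: str) -> bool:
    """Rewrite an ANSI descript.ion as UTF-8 with BOM. Returns False if already converted."""
    file = Path(path)
    raw = file.read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        return False  # already UTF-8 with BOM, leave it untouched
    text = raw.decode(legacy_codec)  # raises UnicodeDecodeError on a wrong codec guess
    file.write_bytes(codecs.BOM_UTF8 + text.encode("utf-8"))
    return True
```

For a file written on Japanese Windows, the call would be something like convert_description_to_utf8("descript.ion", "cp932").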
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

Yes, for compatibility.
Pierre.
helmut
Moderator
Posts: 8688
Joined: Sun Oct 12, 2003 6:47 pm
Location: Frankfurt, Germany

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by helmut »

xnview wrote: Sun Jul 31, 2022 11:05 am Yes, for compatibility.
If compatibility restricts users and usage, that's no good. Perhaps an additional setting

forceUtf8ForDescription=true

would be a good thing.

And XnView MP's default for the "useUtf8ForDescription" setting should be "true" so that new users really benefit from UTF-8 and are not restricted.

From my point of view, software should be smart and support the user in the best way. As written in a post above, there could be a warning message whenever the user enters characters that require UTF-8 and/or overwrites a non-UTF-8 description file. The way it is now is cumbersome.
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

ok
Pierre.
xnview
Author of XnView
Posts: 38817
Joined: Mon Oct 13, 2003 7:31 am
Location: France

Re: descript.ion does not support unicode, Version 1.0 64bits (Apr 28 2022)

Post by xnview »

See issue for current status and some details.
Pierre.