Problem with German Umlauts


Herb

Hello,

 

I am working with Daminion standalone version 0.9.8 build 716 and have some problems with German umlaut characters. I see these problems within the IPTC section of metadata.

It occurs when I write keywords into an image that has no IPTC tags.

 

When I write a keyword that contains a German umlaut (which is not an ASCII character, but is contained in the codepage of my Windows system), an IPTC section is created and the umlaut is encoded in the system codepage.

 

When I write a keyword that contains, e.g., a Chinese Unicode character and a German umlaut, both are encoded in UTF-8.

 

Therefore I would like to propose that all IPTC tags are always encoded in UTF-8, or that a preference defines which codepage is used.

 

Best regards

Herb


If I understood correctly, the problem is that the "umlaut is coded in the system codepage"? But why is this a problem? Please explain.


Hello Murat,

 

thanks for your quick reply.

 

I see the following problems:

1) Let's assume I add a tag (tag1) that contains a German umlaut.

I am working with a German Windows, so the umlaut is stored in the system codepage: "ö" is stored as H'F6.

Then somebody else adds a Chinese character to tag2. This is stored in UTF-8.

(I assume that tag1 remains unchanged.)

Now I send this picture to you, and you will have trouble reading the IPTC metadata, because you do not know whether to read it as UTF-8 or as Latin-1 (Windows-1252).

 

2) On the other hand, if you use a Windows system that runs with, e.g., a Cyrillic codepage, I do not know which character H'F6 represents on your system.

Even reading the IPTC metadata in the <system-codepage> encoding does not solve the problem.

You need to be able to read both UTF-8 and Latin-1 on your Cyrillic system.

 

Always using UTF-8 would reduce the problem, but it does not cover all (theoretical) cases.
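
The ambiguity in points 1) and 2) can be illustrated with a minimal Python sketch. It assumes that the German system codepage is Windows-1252 and the Cyrillic one is Windows-1251. The helper name `decode_guess` is hypothetical and only shows one common heuristic (try UTF-8 first, then fall back to a codepage); it is not a description of how Daminion reads metadata.

```python
# Byte written on a German Windows system for "ö" (assumed codepage: cp1252).
raw_umlaut = "ö".encode("cp1252")          # b'\xf6'

# The same byte is a different character under a Cyrillic codepage
# (assumed: cp1251).
as_german = raw_umlaut.decode("cp1252")    # 'ö'
as_cyrillic = raw_umlaut.decode("cp1251")  # 'ц'

# A lone 0xF6 byte is not valid UTF-8, so a strict UTF-8 reader rejects it.
try:
    raw_umlaut.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False


def decode_guess(data: bytes, fallback: str = "cp1252") -> str:
    """Hypothetical helper: try UTF-8 first, then a caller-supplied codepage."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback)
```

The fallback only works if the reader already knows which codepage the writer used, which is exactly the information missing from the file.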

 

Thanks and

Best regards

Herb


This problem can be partially solved by the IPTC CodedCharacterSet field (1:90). But there will be a problem if we update just one field in Unicode while all other fields are still stored in your system code page. We would need to convert all fields from (which code page?) to UTF-8.
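
As a rough sketch of such a conversion: an IPTC IIM dataset starts with the tag marker 0x1C, followed by the record number, the dataset number, and a two-byte big-endian length. CodedCharacterSet is dataset 1:90, and the ISO 2022 escape sequence ESC % G announces UTF-8. The function names below are hypothetical, and the choice of Windows-1252 as the source codepage is purely an assumption, since the file itself does not say which codepage the old fields were written in.

```python
import struct

UTF8_MARKER = b"\x1b%G"  # ISO 2022 escape sequence announcing UTF-8


def iim_dataset(record: int, dataset: int, data: bytes) -> bytes:
    """Build one standard (non-extended) IIM dataset."""
    if len(data) > 0x7FFF:
        raise ValueError("extended datasets are not handled in this sketch")
    # 0x1C tag marker, record number, dataset number, big-endian length.
    return struct.pack(">BBBH", 0x1C, record, dataset, len(data)) + data


def keywords_as_utf8(raw_keywords, source_codepage="cp1252"):
    """Re-encode legacy keyword bytes to UTF-8 and prepend dataset 1:90."""
    out = iim_dataset(1, 90, UTF8_MARKER)
    for raw in raw_keywords:
        text = raw.decode(source_codepage)  # assumed source codepage
        out += iim_dataset(2, 25, text.encode("utf-8"))  # 2:25 = Keywords
    return out
```

For example, `keywords_as_utf8([b"B\xf6rse"])` re-encodes the single Windows-1252 keyword "Börse" and marks the result as UTF-8. A real file would carry further datasets, and every other text field would have to be converted in the same pass.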

 

 

It's better to use XMP only, to avoid all the encoding issues (nightmares) above, because:

- it's faster

- it's Unicode compliant

- it doesn't have the limitations of IPTC (for example, the length limits on categories and keywords)

- it offers more fields than IPTC

- it supports more formats


Hello Murat,

 

thanks for your reply.

Yes, I agree; your arguments are all fine.

Of course I will use XMP,

but I do not know how to prevent an IPTC section from being created.

 

Please see also my other thread "Add compatibility to old programs"

 

As described in my latest post #5, when an IPTC section is created within an *.orf file

- UTF-8 encoding is used and

- the tag CodedCharacterSet is explicitly set to UTF-8.

 

Why is UTF-8 encoding not used for JPEG files?

 

Best regards

Herb
