IPTC import: language and encoding

Hi,

I have two problems when importing images with IPTC fields.

a) I want my site to be available in both French and English, but since I'm French, all my titles, descriptions and so on were filled in in French in Lightroom. The Zenphoto gallery options are set to "HTTP Accept Language" and "Multilingue". Unfortunately, when I upload the pictures, the French texts are imported into the "English (United States)" fields while the "French (France)" fields remain blank. On my localhost with MAMP it works correctly (French texts go into the French fields).

b) Just like many people before me, I can't manage to import accented characters correctly. All é, à, è, etc. either turn into squares with "008E" inside or truncate the string, depending on the encoding I choose for IPTC fields in the Zenphoto options. Here again, the descriptions and keywords were entered in Lightroom on a Mac. I can correct them manually after import, but ...

Can anyone help with these two issues?

Many thanks,

C.

Comments

  • a) Actually, when there is only one string, as is the case with IPTC data, it will show up in whatever the "default" language is. So it should be possible for you to change your language from "HTTP Accept Language" to French and have the fields appear in the French section. Then create the English versions of the text. Once there is more than one language version, the text will stick with its language.

    b) This is a bit more problematic. The best solution is to set Lightroom to use UTF-8 encoding. Then it should flag the IPTC data as UTF-8 and Zenphoto will know how to deal with it. Unfortunately, the flags for the data do not seem to be strictly applied. So if your data is stored as UTF-8 but not identified as such, Zenphoto will import it incorrectly. I'd try telling Zenphoto that the encoding is UTF-8 and see if that turns out correct. Otherwise you will have to find out what encoding is being used.
  • cmjn Member
    Thank you sbillard,

    So let's set b) aside for the moment. I've tried telling Zenphoto the encoding was UTF-8, but the output is still wrong. I'll investigate this later and see what Lightroom is doing.

    For a), I've changed my language to French, leaving the "multilingue" option ticked so that I have all my language fields, but nothing has changed, even for new uploads: IPTC data is still imported into the "English" fields. Could the problem come from my web host (Nuxit), since the same scripts work correctly on my localhost? If it helps, here is an excerpt from my phpinfo:
    HTTP_ACCEPT_CHARSET ISO-8859-1,utf-8;q=0.7,*;q=0.7
    HTTP_ACCEPT_ENCODING gzip,deflate
    HTTP_ACCEPT_LANGUAGE fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3

    Is there a way to modify a script so that everything is imported into the French fields? I'd like the website to adapt to visitors' languages, but my admin will always be in French. Maybe this should depend on the install language rather than on the gallery language options?
  • a) It should be working the way I described. I will figure out why it is not and fix it.
  • The fix is up and will be included in the nightly build. With this fix you should not have to set your language to French; HTTP Accept Language should work as well, given your phpinfo.
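
For readers wondering how the `HTTP_ACCEPT_LANGUAGE` value from the phpinfo excerpt above is typically interpreted, here is a minimal sketch in Python. It is not Zenphoto's code: the function names and the locale identifiers `en_US` and `fr_FR` are assumptions made for illustration. It ranks the header entries by their q-value and returns the first installed locale whose primary language matches.

```python
def parse_accept_language(header):
    """Return (language_tag, q) pairs sorted by descending q-value."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0  # an entry without an explicit q-value counts as 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        ranked.append((tag.strip().lower(), q, position))
    # Highest q first; ties keep the order in which the browser sent them.
    ranked.sort(key=lambda entry: (-entry[1], entry[2]))
    return [(tag, q) for tag, q, _ in ranked]


def pick_language(header, available, default):
    """Pick the first available locale matching the visitor's preferences."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = tag.split("-")[0]
        for candidate in available:
            normalized = candidate.lower().replace("_", "-")
            if normalized == tag or normalized.split("-")[0] == primary:
                return candidate
    return default


# Header value taken from the phpinfo excerpt in this thread.
header = "fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3"
print(pick_language(header, ["en_US", "fr_FR"], "en_US"))  # fr_FR
```

With this header, a French locale is selected even though English is listed first among the available locales, which matches the behaviour expected in the thread.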
  • cmjn Member
    Many thanks for fixing it so quickly, I'll try it tomorrow!
  • cmjn Member
    It works :-)

    For b), I confirm that the IPTC fields look OK when the images don't come from Lightroom.
  • Thanks for the report. Wish I could help more on the lightroom bit.
  • olihar Member
    This is a known problem with Lightroom, Adobe knows about it and does not give a s*** about it.

    Lightroom uses a Mac OS charset on the Mac side and a Windows charset on the Windows side, instead of using UTF-8 across the board.

    We can only hope they start to listen soon, I will not buy Lightroom until they fix this terrible problem. I will still be using it though.
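
The symptoms in the original question are consistent with this. In Mac OS Roman, é is stored as the single byte 0x8E. Read as ISO-8859-1 that byte becomes the control character U+008E (the square with "008E" inside), and read as UTF-8 it is invalid, which would explain a string being cut at the first accent. The short Python sketch below only illustrates the byte-level behaviour; it is not part of Zenphoto, and the sample word is made up.

```python
# "été" as Lightroom on a Mac would store it, assuming Mac OS Roman.
raw = "été".encode("mac_roman")
print(raw)  # b'\x8et\x8e'

# Wrong guess 1: ISO-8859-1 maps 0x8E to the control character U+008E.
as_latin1 = raw.decode("iso-8859-1")
print(f"U+{ord(as_latin1[0]):04X}")  # U+008E

# Wrong guess 2: 0x8E cannot start a UTF-8 sequence, so decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False
print(utf8_ok)  # False

# Right guess: decode as Mac OS Roman, then re-encode as UTF-8.
fixed = raw.decode("mac_roman").encode("utf-8")
print(fixed.decode("utf-8"))  # été
```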
  • Can Lightroom be configured to use straight UTF-8? I know that Adobe Bridge can.

    EDIT: If there is a translation table from Mac to UTF-8, we could add that as an IPTC encoding selection.
  • olihar Member
    No, Lightroom can't be changed to UTF-8. On the Mac, everything coming out of Lightroom seems to use the Mac OS Roman charset.

    I have tried using Photo Mechanic to change the charset to UTF-8, and that works fine.

    It is just a pretty expensive way of changing IPTC to UTF-8, so I only used the trial version of the program.

    I can supply some JPGs exported from Lightroom with the Mac OS Roman charset in the IPTC data for you to have a look at and try out.
  • Have you tried the `Western European (MAC)` encoding? I'm not sure it will be supported by the UTF-8 conversion software. You can dump the results of `mb_list_encodings()` to see what your server supports.
  • olihar Member
    I have not tried this in a pretty long time, and yes I tried Western European (MAC) back then.

    Here are 2 photos from Lightroom that might be of interest to try out.
    http://www.olihar.com/junk/zen/charset/

    2005_01_22_161733R.jpg seems to work fine.

    http://img.skitch.com/20090530-tnmgamk19jppc6gstcmfa3p9up.png

    2005_06_11_174058R.jpg has a problem at the letter í: everything after that letter does not go into the database.

    http://img.skitch.com/20090530-trkeqb9u9grpr6ukg3466g5csk.png

    I hope these two files can shed some light on the problem. They are identical exports; the only difference is a few letters in the IPTC data. As I have discovered and posted here on the forum a few times before: when the letters ð or þ are used in an IPTC field, the characters áíéóúæö seem to be left alone; if ð and þ are not used in the field, the sentence gets cut off right where the áíéóúæö characters are.
  • Sorry, but neither of these downloaded with IPTC data. Maybe you can make a ticket and attach the files?
  • Also, it would be useful to see the dump of the `mb_list_encodings()` array. At least on my server, Western European (MAC) is not a supported character set. If that is also the case on your server, you could modify the lib-utf8 convert() function to prefer `iconv()` over `mb_convert_encoding()`. At least on my site, `iconv()` reports that it does support the Mac characters.
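
The suggestion above amounts to changing the order in which conversion backends are tried. The following Python sketch shows that pattern in general terms. It is not the lib-utf8 code: both backend functions are hypothetical stand-ins (one that lacks Mac encodings, one that has them), and the names are invented for illustration.

```python
def limited_backend(raw, source_encoding):
    """Hypothetical converter that does not know any Mac encodings."""
    supported = {"iso-8859-1", "utf-8"}
    if source_encoding.lower() not in supported:
        raise LookupError(f"unsupported encoding: {source_encoding}")
    return raw.decode(source_encoding).encode("utf-8")


def full_backend(raw, source_encoding):
    """Hypothetical converter that supports every codec Python ships with."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_to_utf8(raw, source_encoding, backends):
    """Try each (name, function) backend in order; return the first success."""
    for name, convert in backends:
        try:
            return name, convert(raw, source_encoding)
        except LookupError:
            continue  # this backend lacks the encoding, try the next one
    raise LookupError(f"no backend supports {source_encoding}")


# The order of this list decides which backend is preferred.
backends = [("limited", limited_backend), ("full", full_backend)]
used, converted = convert_to_utf8("é".encode("mac_roman"), "mac_roman", backends)
print(used, converted.decode("utf-8"))  # full é
```

Putting the more capable backend first, as proposed for `iconv()`, means it handles every request; keeping it second makes it a fallback for encodings the first one rejects.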
  • cmjn Member
    Thanks for the tip!

    In lib-utf8.php I have swapped lines 52-54 with lines 55-58 so that iconv() is chosen first.
    Then I changed Image options > IPTC encoding to Western European (Mac), and now all my IPTC fields with French characters are fine :-)))) I hope this will work for you as well, olihar.

    On the other hand, I have downgraded to stable 1.2.4 because the nightly build in which you fixed the language issue (a) above) sometimes behaved unstably. Will the fix be available in the next stable version? There's no urgency for me; I'm working on the design and not populating the gallery yet.
  • Great. After the 1.2.5 release we will formalize this so that you can make the choice through an option.

    But what is the problem you are having with the nightly build? This will soon be the 1.2.5 release.

    NOTE: you can also apply your change to the 1.2.4 release.