I use my Zenphoto setup (v. 1.6) to host photos with titles/tags in several languages. The main interface is in English and I don't use multilingual mode.
I've discovered that when Zenphoto imports metadata from images, it limits the Title to a maximum of 36 symbols and each single Tag to a maximum of 33 symbols IF the metadata was written in Cyrillic script (Russian, Ukrainian, etc.). The Description is imported without shortening.
You can manually edit the Title of such images to restore the full length, but it's impossible to rename a Tag to a longer version, which is very frustrating.
Those limits are undocumented and never mentioned. Is it possible to remove them?
@acrylian
In Database info:
character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: utf8mb3
character_set_system: utf8mb3
character_sets_dir: /usr/share/mysql/charsets/
collation_connection: utf8mb4_0900_ai_ci
collation_database: utf8mb4_general_ci
collation_server: utf8_unicode_ci
In the tables themselves, utf8mb4_unicode_520_ci is listed for specific fields.
@acrylian
Sure!
A few examples of both:
Clipped title: "Петропавловская церковь в Петерго" instead of "Петропавловская церковь в Петергофе"
direct image link: https://www.photo.private-universe.net/albums/travel/russia/peterhof/petropavlovskaya-tserkov-v-petergofe.jpg
Clipped tags:
tag: "Государственный музей истории рел" instead of "Государственный музей истории религии"
direct image link: https://www.photo.private-universe.net/albums/365-projects/2016/12-december/11.12.2016-find-buddha-and-turn-right.jpg
Meanwhile I tried a third-party library for reading metadata and got the same results. I also checked my local server: everything is UTF-8 (the "mb4" extra is just a MySQL thing), so it "should" work. I also tried our live server, with the same result.
Perhaps one or both of the native PHP functions are not multibyte-safe for some reason. I could not find any information about that, apart from the general encoding setting, which we already handle as intended.
Perhaps also check the encoding of the data written to the image itself. The IPTC keywords are stored as binary, so a wild guess (I have no knowledge of how tools write such data to images) is that for some reason they are not in the proper encoding before being converted, or something similar.
I did my checks and some googling.
First, here is a post from the ExifTool author on the limitations of the various metadata standards (EXIF, IPTC, XMP), where he talks specifically about encoding and the implications for various languages.
https://exiftool.org/gui/articles/where_what.html
The IPTC section imposes limits on field length, which is the source of the problem.
There is no encoding setting in Lightroom 5.3, but I also use a geotagger app, and it is set to write everything in UTF-8, even if the original metadata is encoded differently.
I checked the metadata of my photos using the ExifTool GUI.
EXIF has no fields for Title or Keywords, just Description.
XMP has fields for Title, Keywords (the field is named Subject) and Description, and all my metadata is preserved there in full.
IPTC has fields for Title (the Object name field), Keywords and Description, and here we can see longer entries shortened in Object name and Keywords.
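To illustrate why a length limit hits Cyrillic text so much earlier than Latin text, here is a minimal Python sketch. It assumes a limit of 64 bytes, which is what the IPTC IIM standard specifies for Object Name and for each single keyword; the constant and the helper name `truncate_utf8` are made up for this example and are not part of Zenphoto or PHP. Cyrillic letters take two bytes each in UTF-8, so 64 bytes hold only about 32 of them.

```python
# Assumed limit (IPTC IIM: Object Name and each keyword, 64 bytes).
IPTC_MAX_BYTES = 64


def truncate_utf8(text: str, max_bytes: int = IPTC_MAX_BYTES) -> str:
    """Cut text to max_bytes of UTF-8, dropping any half-cut character."""
    raw = text.encode("utf-8")[:max_bytes]
    # errors="ignore" discards an incomplete multibyte sequence at the end.
    return raw.decode("utf-8", errors="ignore")


title = "Петропавловская церковь в Петергофе"
tag = "Государственный музей истории религии"

for value in (title, tag):
    clipped = truncate_utf8(value)
    # Show character count, UTF-8 byte count, and the clipped result.
    print(len(value), len(value.encode("utf-8")), repr(clipped))
```

Under that assumption, both example strings from this thread are cut at exactly the point where they appear clipped on the site, which suggests a byte limit rather than a character limit. It does not show which tool applied the cut.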
https://www.photo.private-universe.net/albums/-temp/iptc-1.jpg
https://www.photo.private-universe.net/albums/-temp/iptc-2.jpg
So maybe the solution is to read the XMP fields first (the data is similar)?
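As a rough illustration of the "XMP first" idea, the following Python sketch pulls the XMP packet out of the raw file bytes, reads `dc:title` and `dc:subject`, and falls back to the IPTC value only when XMP has nothing. This is not how Zenphoto or its xmpMetadata plugin is implemented; the function names and the sample packet are hypothetical, and the sketch assumes the XMP packet is embedded in one piece as UTF-8 (no sidecar file, no extended XMP split across segments).

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XMP_END = b"</x:xmpmeta>"


def extract_xmp_packet(data: bytes):
    """Return the raw <x:xmpmeta> element found in the file bytes, or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end = data.find(XMP_END, start)
    if end == -1:
        return None
    return data[start:end + len(XMP_END)]


def read_xmp_title_and_keywords(data: bytes):
    """Read dc:title (first entry) and dc:subject (all entries) from XMP."""
    packet = extract_xmp_packet(data)
    if packet is None:
        return None, []
    root = ET.fromstring(packet)
    title_el = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    title = title_el.text if title_el is not None else None
    keywords = [li.text for li in root.findall(".//dc:subject/rdf:Bag/rdf:li", NS)]
    return title, keywords


def prefer_xmp(xmp_value, iptc_value):
    """Use the XMP value when present, otherwise fall back to IPTC."""
    return xmp_value if xmp_value else iptc_value


# Hypothetical sample: a minimal XMP packet wrapped in JPEG start/end markers.
SAMPLE_XMP = (
    '<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    '<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<dc:title><rdf:Alt><rdf:li xml:lang="x-default">'
    'Петропавловская церковь в Петергофе'
    '</rdf:li></rdf:Alt></dc:title>'
    '<dc:subject><rdf:Bag><rdf:li>'
    'Государственный музей истории религии'
    '</rdf:li></rdf:Bag></dc:subject>'
    '</rdf:Description></rdf:RDF></x:xmpmeta>'
).encode("utf-8")
FAKE_JPEG = b"\xff\xd8" + SAMPLE_XMP + b"\xff\xd9"

xmp_title, xmp_keywords = read_xmp_title_and_keywords(FAKE_JPEG)
clipped_iptc_title = "Петропавловская церковь в Петерго"
print(prefer_xmp(xmp_title, clipped_iptc_title))
print(xmp_keywords)
```

The fallback keeps images that only carry IPTC data working as before, while images with both sets of fields get the unclipped XMP values.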
Yes, it seems to be an IPTC issue. But the image actually stores the values correctly, as I can see in image editors. So it is either the PHP function imposing a limit or doing something wrong. We first thought we had found something, but…
This is probably a general issue with other non-Western-European characters and encodings as well.
Yes, in your case definitely try XMP using the plugin.
It worked!
Enabling the xmpMetadata plugin and refreshing the metadata for my site let me reload the full versions of Title and Keywords.
Some notes:
https://www.zenphoto.org/news/xmpmetadata/ - wrong status and dead link, as the plugin is included with Zenphoto.
Maybe it would help other users to mention, in the xmpMetadata plugin description, the Help files and the Admin, the possible benefits for tags in non-Latin-based languages, so people enable it right away?
Great that it worked! With the other library I mentioned, I also discovered that the IPTC values are truncated while the XMP values (which I was somehow not aware of) are correct. So perhaps it is indeed the PHP iptcparse() function just following some "official standard" limits.
Thanks for the note about the wrong link on the plugin page. It seems that applies to all official plugins.
Thanks, we'll also think about your suggestion about a note.