ZenphotoCMS Forum
xmpMetadata and (X)HTML character references - Printable Version


xmpMetadata and (X)HTML character references - guilhem - 08-01-2012

Hello,

I'm wondering: since XMP is XML-friendly, shouldn't xmpMetadata decode HTML character references? exiftool -tagsfromfile img.jpg img.xmp produces an XMP where & (&), ' ('), " ("), > (>), and < (




xmpMetadata and (X)HTML character references - sbillard - 08-01-2012

The plugin probably should be decoding HTML entities. We will add that to the list.




xmpMetadata and (X)HTML character references - guilhem - 09-01-2012

All right, thanks. If you want me to open a ticket, just let me know.




xmpMetadata and (X)HTML character references - sbillard - 09-01-2012

Normally, yes, but I have made the change and it will be in the nightly tonight. I would appreciate some testing, though.




xmpMetadata and (X)HTML character references - guilhem - 09-01-2012

Wow, that was fast!
I checked it against https://en.wikipedia.org/wiki/Character_entity_reference, and as far as I saw the entity and numeric (decimal and hexadecimal) references are rendered properly, except for one thing: the ampersand (&amp;, &#38; and &#x26;) "eats" one character too many in some (!) cases. Try, e.g., to render &'&&apos;&x&§&e. On the other hand, &a&a looks fine.
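The expectation in this post can be written down as a small check: whichever of the three ampersand references is used, the character right after it should survive decoding. The sketch below is in Python purely for illustration and uses the standard library's html.unescape as a stand-in decoder; it is not the plugin's PHP code, and decode_with_suffix is a hypothetical helper name.

```python
from html import unescape

# Three ways to write an ampersand as a character reference:
# named entity, decimal, and hexadecimal.
AMPERSAND_REFS = ["&amp;", "&#38;", "&#x26;"]


def decode_with_suffix(ref: str, suffix: str) -> str:
    """Decode `ref` immediately followed by `suffix` (hypothetical helper)."""
    return unescape(ref + suffix)


for ref in AMPERSAND_REFS:
    for suffix in ["x", "e", "a", "'"]:
        decoded = decode_with_suffix(ref, suffix)
        # A correct decoder keeps the character that follows the ampersand.
        assert decoded == "&" + suffix, (ref, suffix, decoded)
```

Running the same pairs through the decoder under test, and writing the result to a file rather than a browser, would show whether a character is really being dropped.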




xmpMetadata and (X)HTML character references - sbillard - 10-01-2012

It is hard to read/write HTML entities on a website. But it seems to me that what you are describing is that the translation fails when you have a naked ampersand preceding an entity. That is, of course, not legal: an ampersand is supposed to be represented by &amp;.




xmpMetadata and (X)HTML character references - guilhem - 10-01-2012

Oops, sorry for the mess. No, I mean that if you write an entity that represents the ampersand, then in some cases the character that immediately follows is ignored. Try, e.g., to render https://pastebin.com/raw.php?i=y77sUUcB: the first line is messed up, while the second is fine.




xmpMetadata and (X)HTML character references - sbillard - 10-01-2012

It looks like it is rendering correctly to me. However, remember that the output may cause you issues: &§ is not valid HTML.




xmpMetadata and (X)HTML character references - guilhem - 11-01-2012

Ah? I know that is not valid HTML, but with the first line of my above paste (it's raw, there is no translation), I would expect &'&'&x&§&e, but I get http://i.imgur.com/hYVsx.png.
Don't you get the same result?




xmpMetadata and (X)HTML character references - sbillard - 11-01-2012

No, I do not; I get what you expect. I am guessing that what you see is a result of the browser trying to interpret the &§.

Anyway, I did my tests by saving the result to a disk file to keep the browser out of the picture.

But I did notice that &apos; does not get converted. So maybe we really need a full XML character table and not just the PHP html_entity_decode(). I'll work on that.
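A minimal sketch of what such an XML character table could look like follows. It is written in Python for illustration only and is not the change that went into the plugin; decode_xml_refs and XML_ENTITIES are hypothetical names. It assumes the goal is to decode the five entities predefined by XML 1.0 plus decimal and hexadecimal numeric references, in a single pass so that the output of one replacement is never decoded a second time.

```python
import re

# The five entities predefined by XML 1.0. Any other name (e.g. &sect;)
# is an HTML entity and is deliberately left untouched in this sketch.
XML_ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# One pattern covering hexadecimal, decimal, and named references.
_REF = re.compile(r"&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Za-z][A-Za-z0-9]*));")


def decode_xml_refs(text):
    """Decode XML character references in a single pass (hypothetical helper)."""

    def replace(match):
        hex_digits, dec_digits, name = match.groups()
        if name is not None:
            # Unknown names are returned unchanged rather than guessed at.
            return XML_ENTITIES.get(name, match.group(0))
        code = int(hex_digits, 16) if hex_digits is not None else int(dec_digits)
        # Leave out-of-range code points as they were written.
        return chr(code) if code <= 0x10FFFF else match.group(0)

    return _REF.sub(replace, text)


print(decode_xml_refs("&apos;"))      # a plain apostrophe
print(decode_xml_refs("&amp;apos;"))  # the literal text &apos;
```

Because the substitution never rescans its own output, an encoded ampersand followed by other text (the case discussed earlier in the thread) comes out as an ampersand plus that text, with nothing consumed.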