After upgrading my local copy of a site (WAMP 3.1.4, PHP 7.10, MySQL 5.7.23) from ZP 1.4.14 to 1.5, I discovered that pages containing Cyrillic characters are no longer displayed correctly. I repeated a clean installation/upgrade as described below, and the loss of Cyrillic support is reproducible:
I don't dare test it on the production site at the moment, so I can't rule out that this is related to my WAMP configuration. But since I have no problems with the older versions of ZP, I think it is a bug.
Comments
Actually nothing changed from 1.4.14 regarding the database. I just tried on my local MAMP install and "проверка" works fine here and also on our own site. Here is a test article: https://www.zenphoto.org/news/proverka/
You said you set up a new database. Please check the encoding, which should generally be utf8_unicode_ci; that is what Zenphoto will actually set. Also check that not only the tables and columns but the database itself is set correctly, as well as the encoding options in Zenphoto itself. Also try with the TinyMCE text editor enabled.
Thanks for checking it. I confirm I use utf8_unicode_ci for the database, tables and columns, and that TinyMCE is enabled. If it works in MAMP, it might be a WAMP-related issue. What is strange to me is that I see the correct characters on the Edit page, in the Content field, but not in the PHP-rendered page.
Can you take a look at the database columns of each directly? Are the strings both stored the same way? If all is right, they should be stored directly (apart from some serialized array wrapping that belongs to the multilingual storage).
This is what I see in the database in phpMyAdmin:
a:1:{s:5:"fr_FR";s:16:"проверка";}
for the title field and
a:1:{s:5:"fr_FR";s:23:"<p>проверка</p>";}
for content. fr_FR was en_US before editing the record with TinyMCE.
A record made with TinyMCE switched off reads as:
a:1:{s:5:"fr_FR";s:17:"проверка2";}
and
a:1:{s:5:"fr_FR";s:24:"проверка tést 2";}
and the Cyrillic characters and é are then displayed as �
If "fr_FR" changes from "en_US", that means you switched the language on the backend.
TinyMCE does do some encoding of special chars when saving. Does the exact same happen on the other install?
I just tried the same locally without TinyMCE and it still works for me, even with a freshly created page. I would assume there is some tiny detail off somewhere…
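The serialized values quoted from phpMyAdmin earlier in the thread can be sanity-checked outside of Zenphoto. The following is a minimal sketch, assuming the script file itself is saved as UTF-8; check_row() is a hypothetical helper and not part of Zenphoto. PHP's unserialize() rejects a string whose declared byte length (the number after s:) does not match the actual byte count, so a successful round trip plus a UTF-8 validity check suggests the data in the database is intact and the damage happens later, at render time.

```php
<?php
// Serialized values as quoted from phpMyAdmin in the thread.
// This file must be saved as UTF-8 for the byte counts to match.
$rows = [
    'a:1:{s:5:"fr_FR";s:16:"проверка";}',
    'a:1:{s:5:"fr_FR";s:23:"<p>проверка</p>";}',
    'a:1:{s:5:"fr_FR";s:24:"проверка tést 2";}',
];

/**
 * Hypothetical helper: true if the row unserializes cleanly and every
 * value in it is a valid UTF-8 string.
 */
function check_row(string $serialized): bool
{
    // unserialize() fails (returns false and raises a notice/warning)
    // when a declared s:N length does not match the real byte length;
    // the @ only silences that message.
    $data = @unserialize($serialized, ['allowed_classes' => false]);
    if (!is_array($data)) {
        return false;
    }
    foreach ($data as $text) {
        // An empty pattern with the /u modifier matches any valid UTF-8
        // string; preg_match() returns false on invalid UTF-8.
        if (!is_string($text) || preg_match('//u', $text) !== 1) {
            return false;
        }
    }
    return true;
}

foreach ($rows as $row) {
    echo (check_row($row) ? 'ok  ' : 'BAD ') . $row . PHP_EOL;
}
```

If all rows pass, the stored bytes are consistent with UTF-8 (two bytes per Cyrillic letter and for é), which matches the observation that the Edit page shows the text correctly.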
Ok, I've found a fix: in the file functions-common, line 358 reads in zp1.5 as
$str = tidyHTML($str);
After I changed it to what it (approximately) was in zp 1.4.14 (see below), everything works fine.
Thanks, will take a look at that. There had been some changes because of issues with truncated text and broken html. Btw, do you actually have the tidy extension on WAMP (I do in MAMP)?
My goodness, how easy it was! Yes, I have it, and it was disabled. After enabling it, I've got all my letters back.
Thanks! I can only imagine how excellent your paid support should be
Thanks ;-) Well, the issue should not happen without tidy so I quickly tested with that line change and at least for me then all works even with tidy as before. So probably we can re-add that line with 1.5.1.
That line only prevented evaluation of tidyHTML($str), and tidyHTML() itself checks for the presence of the tidy class and, if it is absent,
in zp1.5:
in zp1.4:
So the issue is most probably related to htmLawed().
This should probably be tested with a longer text using Cyrillic chars, to see how it all works with actual truncation. The missing comparison should probably be re-added, as the tidy call is not really necessary if the string is the same anyway.
And yes, htmLawed is a kind of workaround if tidy is not there as tidy is superior.
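The two ideas discussed here (use tidy when the extension is loaded and fall back to another cleaner otherwise, and skip the repair pass entirely when truncation did not change the string) can be sketched as follows. This is not Zenphoto's actual code: repair_html() and shorten_html() are hypothetical names, the fallback is passed in as a callable standing in for htmLawed, and the sketch assumes the mbstring extension is available.

```php
<?php
/**
 * Hypothetical helper, not Zenphoto's tidyHTML(): repair markup with the
 * tidy extension when it is loaded, otherwise hand the string to a
 * fallback cleaner supplied by the caller.
 */
function repair_html(string $html, callable $fallback): string
{
    if (class_exists('tidy')) {
        $tidy = new tidy();
        // The third argument tells tidy that the input is UTF-8.
        $tidy->parseString($html, ['show-body-only' => true, 'wrap' => 0], 'utf8');
        $tidy->cleanRepair();
        return trim(tidy_get_output($tidy));
    }
    return $fallback($html);
}

/**
 * Hypothetical helper: cut a string to $maxChars characters and repair
 * the markup only if something was actually cut off.
 */
function shorten_html(string $html, int $maxChars, callable $fallback): string
{
    // mb_substr() counts characters, so it never splits a multi-byte
    // UTF-8 sequence (unlike the byte-based substr()).
    $short = mb_substr($html, 0, $maxChars, 'UTF-8');
    if ($short === $html) {
        // Nothing was truncated, so the markup cannot have been broken
        // by the cut and the repair pass is skipped.
        return $html;
    }
    return repair_html($short, $fallback);
}

// Usage: the identity function stands in for a real fallback cleaner.
$identity = function (string $s): string {
    return $s;
};
echo shorten_html('<p>проверка</p>', 100, $identity) . PHP_EOL;
```

With the comparison in place, a short string never reaches the repair step, so a problem in the fallback cleaner would only show up on text that is really truncated.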
Ok, I've tested it with the tidy extension switched off. xdebug shows that the string (independent of its length, by the way) changes its value to unreadable characters in line 677 of lib-htmLawed.php, which says (not clear to me yet):
htmlawed is a third party library which we generally don't touch and have no hand in. I will try to reproduce this later on.
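To narrow down where a string stops being valid UTF-8 without stepping through a debugger, the processing steps can be run one at a time with a validity check after each. The sketch below is only an illustration: first_corrupting_step() is a hypothetical helper, and the "byte cut" step is a stand-in for any byte-oriented operation. The thread does not establish what htmLawed actually does at that line.

```php
<?php
/** True if $s is a valid UTF-8 string (empty pattern plus /u modifier). */
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

/**
 * Hypothetical helper: apply the named steps in order and return the name
 * of the first step whose output is no longer valid UTF-8, or null if the
 * string survives all of them.
 *
 * @param array<string, callable> $steps
 */
function first_corrupting_step(string $input, array $steps): ?string
{
    $current = $input;
    foreach ($steps as $name => $step) {
        $current = $step($current);
        if (!is_valid_utf8($current)) {
            return $name;
        }
    }
    return null;
}

$steps = [
    'trim' => 'trim',
    // Stand-in for a byte-oriented operation: cutting after 4 bytes
    // keeps "<p>" plus only the first byte of the two-byte letter "п".
    'byte cut' => function (string $s): string {
        return substr($s, 0, 4);
    },
];

var_dump(first_corrupting_step('<p>проверка</p>', $steps));
```

In a real investigation the steps would be the actual calls in the rendering path; replacing them one by one shows which call first produces the � characters.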