Member
olejorik   2018-11-04, 19:36
#1

After upgrading my local copy (WAMP 3.1.4, PHP 7.10, MySQL 5.7.23) of a site from ZP 1.4.14 to 1.5, I discovered that pages containing Cyrillic characters are no longer displayed correctly. I repeated a clean installation and upgrade as described below, and the loss of Cyrillic support is reproducible:

  1. Clean installation of ZP 1.4.14
  2. Make a clean database
  3. Choose the Zenpage theme
  4. Activate the zenpage plugin
  5. Create a new page with Unicode content (I used "проверка" as both title and content)
  6. Everything works OK
  7. Upgrade to ZP 1.5 (from GitHub, https://github.com/zenphoto/zenphoto), with or without removing the unknown files
  8. The content of the page is not shown correctly, while the title is OK
  9. The article content is not lost, though (it is visible in the Edit Page menu)

I don't dare to test it on the production site at the moment, so I can't exclude the possibility that this is related to the WAMP configuration. But as I have no problems with the older versions of ZP, I think it is a bug.

Administrator
acrylian   2018-11-04, 21:16
#2

Actually nothing changed from 1.4.14 regarding the database. I just tried on my local MAMP install and "проверка" works fine here and also on our own site. Here is a test article: https://www.zenphoto.org/news/proverka/

You said you set up a new database. Please check that the encoding is utf8_unicode_ci, which is what Zenphoto itself sets. Also check that not only the tables and columns but also the database itself is set correctly, and check the encoding options in Zenphoto itself. Also try with the tinyMCE text editor enabled.

Member
olejorik   2018-11-04, 22:08
#3

Thanks for checking it. I confirm that I use utf8_unicode_ci for the database, tables and columns, and that tinyMCE is enabled. If it works in MAMP, it might be a WAMP-related issue. What is strange to me is that I see the correct characters in the Content field of Edit Page, but not in the PHP-rendered page.

Administrator
acrylian   2018-11-04, 22:34
#4

Can you look directly at the database columns on each install? Are the strings stored the same way on both? If all is right, they should be stored as-is (apart from some serialized array wrapping that belongs to the multilingual storage).

Member
olejorik   2018-11-04, 22:54
#5

this is what I see in the database in phpMyAdmin:
a:1:{s:5:"fr_FR";s:16:"проверка";} for title field
a:1:{s:5:"fr_FR";s:23:"проверка ";} for content.

fr_FR was en_US before I edited the record with TinyMCE.

A record made with TinyMCE switched off reads as:
a:1:{s:5:"fr_FR";s:17:"проверка2";},
a:1:{s:5:"fr_FR";s:24:"проверка tést 2";}
and the Cyrillic characters and é are then displayed as �.
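
For anyone reading these values: the `s:N` prefix in PHP's serialized format is a byte count, not a character count. The sketch below (the variable names are only illustrative) shows that the title and the non-TinyMCE values quoted above have exactly the lengths expected for intact UTF-8, which suggests the stored data is fine and the damage happens later, at render time.

```php
<?php
// Strings copied from the thread; variable names are illustrative only.
$title = 'проверка';

// strlen() counts bytes; each Cyrillic letter is 2 bytes in UTF-8.
$bytes = strlen($title);

// Count characters with a UTF-8 aware regex (avoids needing mbstring).
$chars = preg_match_all('/./su', $title);

// Rebuild the stored value the same way PHP serializes an array.
$stored = serialize(array('fr_FR' => $title));

echo $bytes, " bytes, ", $chars, " characters\n";
echo $stored, "\n";
echo strlen('проверка2'), "\n";        // compare with s:17 above
echo strlen('проверка tést 2'), "\n";  // compare with s:24 above
```

If a length in the database did not match the byte count of the visible text, that would point at a storage or connection encoding problem instead.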

Administrator
acrylian   2018-11-04, 23:22
#6

If "fr_FR" changed from "en_US", that means you switched the language on the backend.

tinymce does do some encoding of special chars when saving. Does the exact same happen on the other install?

I just tried the same locally without tinymce and it still works for me, even with a freshly created page. I would assume there is some tiny detail off somewhere…

Member
olejorik   2018-11-05, 10:20
#7

Ok, I've found a fix.
In zp1.5, line 358 of the file functions-common reads:
$str = tidyHTML($str);

After I changed it back to what it (approximately) was in zp 1.4.14 (see below), everything works fine.

if ($str != $original) {
    $str = tidyHTML($str);
}

Administrator
acrylian   2018-11-05, 10:36
#8

Thanks, will take a look at that. There had been some changes because of issues with truncated text and broken html. Btw, do you actually have the tidy extension on WAMP (I do in MAMP)?

Member
olejorik   2018-11-05, 11:23
#9

My goodness, how easy it was! Yes, I have it, and it was disabled. After enabling it, I've got all my letters back.
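
For others hitting the same symptom, a quick way to see whether tidy is available is a one-off script like the sketch below. It only uses PHP's built-in `extension_loaded()` and `class_exists()`; the variable name is illustrative. On WAMP the web server and the command line may read different php.ini files, so it is probably safest to open the script through the browser.

```php
<?php
// Report whether the tidy extension is available to this PHP runtime.
// $hasTidy is an illustrative name, not something from Zenphoto.
$hasTidy = extension_loaded('tidy') && class_exists('tidy');

echo $hasTidy ? "tidy is enabled\n" : "tidy is NOT enabled\n";
```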

Thanks! I can only imagine how excellent your paid support must be.

Administrator
acrylian   2018-11-05, 11:27
#10

Thanks ;-) Well, the issue should not happen without tidy either, so I quickly tested that line change, and at least for me everything then works as before, even with tidy. So we can probably re-add that check in 1.5.1.

Member
olejorik   2018-11-05, 13:11
#11

That line only prevented tidyHTML($str) from being evaluated. tidyHTML() itself checks for the presence of the tidy class, and if it is absent, it does the following.
In zp1.5:

return trim(htmLawed($html, array('tidy' => '2s2n')));

In zp1.4:

return $html;

So the issue is most probably related to htmLawed().
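
To make the interaction concrete, here is a minimal sketch of the control flow described above. It is not the Zenphoto code: `tidy_html_sketch()`, `clean_after_truncate()` and the deliberately lossy `$lossy` callback are hypothetical stand-ins, and the tidy branch is left as a no-op. It only illustrates that the `!=` guard hides a misbehaving fallback when the text is unchanged, while a truncated string still passes through it.

```php
<?php
// Hypothetical stand-in for tidyHTML(); NOT the Zenphoto implementation.
// $fallback plays the role of the htmLawed() call used when tidy is absent.
function tidy_html_sketch($html, callable $fallback, $hasTidy) {
    if ($hasTidy) {
        return $html; // the real code would run tidy here (omitted)
    }
    return trim($fallback($html));
}

// Hypothetical caller showing the 1.4.14-style guard.
function clean_after_truncate($str, $original, callable $fallback, $hasTidy) {
    if ($str != $original) { // only clean up when the text was changed
        $str = tidy_html_sketch($str, $fallback, $hasTidy);
    }
    return $str;
}

// A deliberately lossy fallback for the demo: replaces every non-ASCII byte.
$lossy = function ($html) {
    return preg_replace('/[\x80-\xFF]/', '?', $html);
};

$text = 'проверка';
$unchanged = clean_after_truncate($text, $text, $lossy, false);   // guard skips the fallback
$truncated = clean_after_truncate('пров', $text, $lossy, false);  // fallback runs

echo $unchanged, "\n";
echo $truncated, "\n";
```

Under these assumptions the guard would be a workaround for untruncated pages only, so truncated Cyrillic text is worth testing separately.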

Administrator
acrylian   2018-11-05, 13:14
#12

This should probably be tested with a longer text using Cyrillic chars, to see how it all works on actual truncation. The missing comparison should probably be re-added, as tidying is not really necessary if the string is unchanged anyway.

And yes, htmLawed is a kind of workaround if tidy is not there as tidy is superior.

Member
olejorik   2018-11-05, 21:32
#13

Ok, I've tested it with the tidy extension switched off. xdebug shows that the string (independent of its length, by the way) changes its value to unreadable characters in line 677 of lib-htmLawed.php, which reads as follows (I don't fully understand it yet):

$t = preg_replace(array('`(]*(?)\s+`', '`\s+`', '`(]*(?) `'), array(' $1', ' ', '$1'), preg_replace_callback(array('`()`sm', '`()`sm', '`(]*?>)(.+?)()`sm'), 'hl_aux2', $t));
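
Without a debugger, the same kind of before/after comparison can be done by logging a short summary of the string around the suspect call. The helper below is a hypothetical sketch (`describe_bytes()` is not part of Zenphoto or htmLawed), and the "damage" is simulated by dropping one byte purely for demonstration. It relies on the fact that `preg_match()` with the `u` modifier fails on invalid UTF-8.

```php
<?php
// Hypothetical helper: summarise a string so two snapshots can be compared.
function describe_bytes($label, $s) {
    $valid = preg_match('//u', $s) === 1 ? 'yes' : 'no'; // 'u' rejects invalid UTF-8
    return sprintf('%s: utf8=%s bytes=%d hex=%s', $label, $valid, strlen($s), bin2hex($s));
}

$before = 'проверка';

// Simulated damage for the demo only: cut the first 2-byte character in half.
$after = substr($before, 1);

echo describe_bytes('before', $before), "\n";
echo describe_bytes('after', $after), "\n";
```

Calling such a helper immediately before and after the line in question would show which bytes change, which may help when reporting the problem upstream.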
Administrator
acrylian   2018-11-06, 08:53
#14

htmLawed is a third-party library which we generally don't touch and have no hand in. I will try to reproduce this later on.

  