Problems with static HTML Plugin and Charsets

Hi,

I updated my Zenphoto from an older version and activated the static HTML plugin. It all works fine except that some characters in my template, like "é", are displayed garbled after the page is cached. When I deactivate the static HTML cache plugin, everything looks fine.

Do you have a clue what I can do to avoid this?

Thanks for your help!

Comments

  • acrylian Administrator, Developer
    If you use a non-standard theme, check whether your theme files are UTF-8 encoded.
  • How can I see that?
  • You would have to have a text editor that can show you the format. For Windows Notepad++ is an excellent choice.
    http://notepad-plus-plus.org/
  • acrylian Administrator, Developer
    And for Mac I recommend the free Textwrangler:
    http://www.barebones.com/products/textwrangler/
  • Thanks very much for your help! The encoding seems to work well now with the static html plugin! But now i get a new error:

    On top of each page appears:

    Warning: Cannot modify header information - headers already sent by (output started at ..path../themes/Images/functions.php:1) in ..path../index.php on line 153

    I tried making all files of the theme utf-8 - but the warning appears again and again. It also appears without static html plugin.

    What can i do to remove the warning?
  • The warning tells you that at line 1 of the functions.php file some kind of output has happened. Most likely the first thing in the file is not `<?php` but perhaps a blank or such.
  • Yes, that is what I would normally expect, but there is no blank space or anything like it. If I change the encoding of functions.php back to ANSI / UTF-8 without BOM, it works perfectly. The warning appears only if I change the encoding to UTF-8. I change the encoding with Notepad++ as you recommended.
  • UTF 8 (with BOM) would do that. The BOM is a character that lets software know the file is UTF-8. Unfortunately, that does cause problems for PHP. So UTF-8 without BOM is the correct setting.

    [edit] Note that without the BOM, the file could be either UTF-8 or ASCII. Normally that does not matter, but if your editor thinks the file is ASCII and you add text with diacritical marks, they will be saved in a single-byte (ANSI) encoding rather than as UTF-8. What we usually do is deliberately place a UTF-8 diacritical character in a comment at the top of the file. This "forces" the editor into UTF-8 mode.

    In addition, the editors we use can be set to default to UTF-8 for the character set in case we forget the above.
  • Thanks a lot for your help. Sadly this doesn't work for me. If I use "ANSI as UTF-8" (without BOM) and place //äöü--// at the top of my file, the diacritical characters are damaged as soon as I activate the static HTML plugin.

    Without the plugin it works perfectly. The only way i can get the diacritical characters to be displayed correctly with the static html plugin is to make the file utf-8 - but then the PHP Warning appears (even without static html plugin).

    Maybe the static html plugin modifies the page in a way that the characters are not displayed correctly?
  • Probably you are not really getting UTF-8 characters into your file. The editor has to have an option to use UTF-8 encoding for diacritical characters without resorting to a BOM in the file. Nothing else is going to work.

    But I do not really understand one thing. You say things work correctly when not cached? The results really should not be different as all that is happening is that Zenphoto captures the output and stores it exactly. Then re-serves the same later.

    Please try the following:

    Remove all the files in your HTML cache. Then view the page in question. Check the HTML with the browser's view source. At the bottom there will be no reference to caching. Save this file. Note also whether your characters show up correctly.

    Refresh the browser page and view again. This time there will be a reference to the cache serving. Save under a different name.

    Compare the two files with a file comparison utility. There should be only the last line about cache serving as a difference.
  • i just saved and compared the uncached and cached version. The comparison utility i used (comparison suite) said there is no difference between the two files. But the files differ in size - the uncached version is 29.442 bytes long, the cached version has a size of 29.762 bytes.

    I opened both files in Notepad++ and took a look at the encoding. It's "UTF-8 as ANSI" in both cases.

    Do you have a clue what this could mean? Which comparison tool do you use?

    Thanks a lot for your help and a happy new year :)
  • So they both look identical but have a different size?
    I use notepad++ as well v5.9.4 (UNICODE) and the only time that you see a difference like that is when the line endings are different. Changing from Unix to Dos/Windows (from edit->EOL Conversion) will show a difference like that. Changing from utf-8 to ansi will of course reduce your length (in notepad++) by one per character that can no longer be displayed. For me it will display a question mark if I do this.

    HTH
    edit: ahh I suppose the difference in the math could be from a few hundred characters that get crushed into ansi (and lost) but how the files would compare the same is beyond me...
  • I think I have found the difference. Can you try the 1.4.2 RC1 build (after tonight's build) and see if the problem is fixed? The HTML cache plugin was not sending the standard headers as part of the page delivery, so perhaps the browser did think the character set was ANSI and not UTF-8.
  • awesome, this worked! Thanks a lot for your help sbillard!
  • Thanks for the report back.
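
For anyone who lands here with the same "headers already sent ... functions.php:1" warning: the BOM discussed above is three bytes (EF BB BF) sitting in front of `<?php`, which PHP treats as output. A minimal Python sketch for checking and stripping it follows. It is not part of Zenphoto, the function names `has_utf8_bom` and `strip_utf8_bom` are hypothetical, and the sample bytes are made up for illustration.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file contents: a BOM followed by the PHP opening tag.
sample = codecs.BOM_UTF8 + b"<?php echo 'hi';"
cleaned = strip_utf8_bom(sample)
```

In practice you would read the theme file in binary mode (`open(path, "rb")`), pass the bytes through `strip_utf8_bom`, and write them back. Saving as "UTF-8 without BOM" in the editor achieves the same thing.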
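
The cached-versus-uncached comparison suggested above can also be done at the byte level, which avoids a comparison tool silently normalising line endings or encodings. The sketch below is only an illustration under that assumption. `first_difference` and `size_hints` are hypothetical helper names, and you would feed them the bytes of the two saved files.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One input is a prefix of the other; they differ where the shorter ends.
        return min(len(a), len(b))
    return None


def size_hints(data: bytes) -> dict:
    """Count things that commonly explain a size difference between two saves."""
    return {
        "size": len(data),
        "crlf": data.count(b"\r\n"),                       # Windows line endings
        "non_ascii": sum(1 for byte in data if byte >= 0x80),  # multi-byte or ANSI chars
    }
```

Comparing the `crlf` and `non_ascii` counts of the two files shows whether the extra bytes come from line endings or from characters that were re-encoded on save.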
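
To illustrate the cause found at the end of the thread, here is a rough Python sketch of what happens when UTF-8 bytes are delivered without a charset header. It assumes the browser falls back to a Windows-1252 ("ANSI") interpretation. Real browsers use more elaborate detection, and `render` is a hypothetical helper, not anything from Zenphoto.

```python
from typing import Optional


def render(body: bytes, content_type: Optional[str]) -> str:
    """Decode a response body roughly the way a browser might."""
    charset = "cp1252"  # assumed fallback when no charset is announced
    if content_type and "charset=" in content_type.lower():
        charset = content_type.lower().split("charset=", 1)[1].split(";")[0].strip()
    # errors="replace" avoids exceptions on bytes the charset cannot map.
    return body.decode(charset, errors="replace")


page = "é".encode("utf-8")  # the two bytes C3 A9

without_header = render(page, None)
with_header = render(page, "text/html; charset=UTF-8")
```

With the header present the two bytes are read as one "é". Without it, each byte is read as a separate ANSI character, which is why the cached pages looked damaged even though the stored bytes were the same.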