Major flaw: pre-caching can DOS the server

Hi

1) I am a new Zenphoto user, testing the nightly build from 0501-trunk.

So far I like it - it's clean and simple to set up, and it let me rsync the album directory over from Gallery (which I'm looking to switch away from :-) ). That was a really nice touch to ease the conversion, at least for the images.

2) However, there is a very serious flaw in Zenphoto's pre-caching. Perhaps I didn't set something up correctly, but even on a single album it seems to fork one httpd process per image. This drove the server load through the roof, put memory under pressure, and eventually the server started to swap. I killed it before the machine died.

I had tried this with 1.2.9, and the kernel ran out of swap and started killing processes; eventually the machine had to be rebooted.

This is a pretty major flaw. It would be very good to process these images serially rather than attempting them all in parallel. Would it be possible to add an option (defaulting to 4, perhaps) to cap the number of parallel threads?

I had the same problem with 1.2.9, so this is not new in 1.3.0.

I'll keep evaluating, but this DOS button really is nasty ...

The server is running Apache on Linux.

Because of the DOS nature of this, is there a way to disable pre-caching until it can be fixed? (I saw http://www.zenphoto.org/trac/changeset/5329 but did not see an option in the admin toolbox. Is it there already, or perhaps that changeset is not about pre-caching?)

3) After the gallery was created, I logged out and back in, and it refused to accept my password. Fortunately I had set up mail, so the password reset worked. I have not logged out again, so I'm not sure whether I will need to reset my password every time I log out, but I had a similar problem with 1.2.9 (using Chrome on Linux).

Thank you for any help ... should I file a bug on Trac for this?

gene

Comments

  • acrylian Administrator, Developer
    Are you really talking about the "precache" buttons on the overview and album admin pages? Nothing is precached by default or automatically; you always have to press a button.

    Anyway, since this topic comes up regularly, please read here about why precaching is not necessary: http://www.zenphoto.org/2009/03/troubleshooting-zenphoto/#22
  • gjunk Member
    Thank you - yep, I did mean the pre-caching, as you say.

    I do understand it is not necessary, though there is a potential small benefit, as you point out.

    Please can it be disabled, since it can crash the server - or can the number of images processed in parallel at least be capped? As it stands, anyone downloading and installing Zenphoto can, with the best of intentions, crash their host server (real or virtual). That's still bad, no?

    If it's not necessary, isn't very useful, and on the downside can DOS/crash the server - then why have it at all?

    Gene/
  • The quickest way to disable it would be to cut off your fingers. Or, as the doctor says "if it hurts, don't do it".

    The function has proved useful for some users. If not for you then you know what to do.
  • gjunk Member
    I'm looking at it from the perspective of a server hoster rather than a user ... but I get your point. (I will adjust the web server settings to be far more restrictive, though that is not really desirable either.)

    Nonetheless ... would it be hard to serialize this? Honestly, it would run faster anyway. Once things are swapping like mad, they don't go faster - they go much, much slower - so limiting this to, say, 2 images at a time would actually improve the feature.
    Looking quickly at the code before, it looked like it did process serially - unless I was looking at the wrong thing. It does make sense that serial processing would be faster on many servers, and it certainly wouldn't kill the memory. I'd think it would have been more work to code it to process them in parallel, and for no particularly compelling reason. Are you sure they're all done at once? I know I had to increase my PHP memory limit to process many of my images (4k x 3k pixels); maybe the memory pressure you saw came from the apparent inefficiency of the GD library versus something like ImageMagick (which doesn't seem to work for me, but that is likely a server config issue on my end).

    I agree with sbillard that the feature is useful for some and should remain, and I also agree with you that processing serially makes more sense - but as I said, it seemed to me that it already did that.

    I guess I can play around with that while trying to figure out why IMagick doesn't work for me, so maybe I can kill two birds with one stone.
  • gjunk Member
    Hiya - what I see on the server looks like a fork bomb of httpd processes ... each one on its own is not too bad. The first time I tried this was on a new install: I rsynced the Gallery album tree (about 2,000 images) and pressed the pre-cache button; within a short period I think there were tens if not hundreds of httpd processes. It was hard to tell, because the load average hit 140 and the machine was swapping out like mad.

    The second time (using 1.3.0) I did it on a single album of about 40 images or so.

    So yeah - it sure looks like it runs in parallel on the server. As a rule of thumb for this kind of work, 2 processes per allocated CPU would probably work decently, in my meek opinion.
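    The capping idea above can be sketched generically. This is not Zenphoto code (Zenphoto's work happens in per-request PHP processes); it is just a minimal Python illustration of bounding concurrency to a small multiple of the CPU count, with `resize_image` as a hypothetical stand-in for the real resize-and-cache step:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def resize_image(path):
    # Hypothetical stand-in for the real resize-and-cache work.
    return path

def precache(paths, workers_per_cpu=2):
    # Bound concurrency so a large album can never spawn hundreds of
    # heavy workers at once; 2 per CPU is the rule of thumb suggested above.
    max_workers = max(1, workers_per_cpu * (os.cpu_count() or 1))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(resize_image, paths))
```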

    thanks for following up ...

    gene
    Ahh, I see how it does it - it prints out the image links, so what really causes all those processes is the flood of incoming requests for the pictures from the page links. The cache isn't generated in the loop; it's generated when the final page is displayed.

    Is your browser set to initiate an abnormally high number of simultaneous connections to the target server? I think the default is usually around 4, so you shouldn't see more than 4 processes at once.
    It actually depends on your server how this is processed. If you limit the connections the browser can open, it will "serialize" the process. All that is happening is that the browser is requesting each image and its thumb.

    So, the moral of the story is that if you configure your server to allow unlimited thread activation when a browser requests items, then you run the possibility of someone overwhelming the server.
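    Since the browser's parallel requests are what multiply the server processes, one workaround is to drive the pre-cache from a small script that fetches each image URL one at a time. A rough Python sketch - the URL list is an assumption, and the fetcher is injectable so the loop can be exercised without a live server:

```python
from urllib.request import urlopen

def precache_serially(image_urls, fetch=None):
    # Request each image URL one at a time, so the server never has to
    # resize more than one image concurrently. The default fetcher does
    # a real HTTP GET; pass your own for testing or for auth handling.
    fetch = fetch or (lambda url: urlopen(url).read())
    done = []
    for url in image_urls:
        fetch(url)  # first hit makes the server generate the cached copy
        done.append(url)
    return done
```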
    Yes, this would also be an issue with too many users flooding the site - though it takes fewer image-processing threads to overload a server than image-serving threads. Still, the "DOS" aspect of the bug is more a matter of server settings than of the image caching process itself.
  • gjunk Member
    I don't disagree that the server should protect itself, but overloading the server is nonetheless a design issue in my view, and Zenphoto would benefit from some changes too!

    Apache (a pretty common server) defaults MaxClients to 256, which is typically a non-issue unless those 256 processes are all running heavy PHP code, as is the case here.

    I have lowered the Apache settings to prevent Zenphoto from taking down the server, though I will need to run further tests to establish the right limits. This penalizes lightweight requests, however, so it is not ideal either.

    Unfortunately, Apache does not offer the kind of fine-grained controls that would allow 256 normal processes while also limiting CPU/memory usage to prevent this kind of overload.
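    For reference, lowering those limits for the prefork MPM looks something like the fragment below. The numbers are only illustrative assumptions, not recommendations - tune them to your host's RAM and the PHP memory limit:

```apache
# httpd.conf - cap simultaneous worker processes so a burst of
# image-resize requests cannot exhaust memory (values are illustrative)
<IfModule mpm_prefork_module>
    StartServers         4
    MinSpareServers      4
    MaxSpareServers      8
    MaxClients          32
    MaxRequestsPerChild 500
</IfModule>
```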

    That said - perhaps processing the images serially in a PHP loop would be better. It would generally perform a lot faster than the current approach, and be a better house guest on the server too ... win win!!

    Thanks for followup ...
  • gjunk Member
    I apologize for not being familiar with PHP or the Zenphoto code, but would something like this work:

    use inotify (or better, create a lock file and wait for it to be removed) to check for the existence of the cache file before loading the next link to process the next image.
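    The wait-for-the-cache-file idea could look something like this polling sketch - in Python rather than PHP, and polling rather than inotify, since inotify is Linux-only; `wait_for_cache` is a hypothetical name:

```python
import os
import time

def wait_for_cache(cache_path, timeout=30.0, poll=0.1):
    # Poll until the cached file appears, or give up after `timeout`
    # seconds. A portable stand-in for an inotify-based wait.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(cache_path):
            return True
        time.sleep(poll)
    return False
```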

    Something like that ...

    gene/
    There's already a loop that serially generates the image links for all images in all albums. When the resulting page is displayed, the browser attempts to load all the images; since they aren't yet cached, the caching happens automatically - but, as you said, perhaps for too many images at once.

    Simply calling the cache creation functions directly from within that loop should create the images serially, though this may have the side effect of not sending any image links to the browser until the loop is complete, making the page appear "hung". It might be necessary to push the content out little by little as it processes. I should have some time to look into this tomorrow.
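    That approach - generating each cache file inside the loop and flushing a progress line as it goes (in PHP this would be an echo followed by ob_flush()/flush()) - can be sketched in Python; `make_cache` is a hypothetical stand-in for the real resize call:

```python
import sys

def precache_with_progress(images, make_cache, out=sys.stdout):
    # Create each cached image inside the loop, then flush a progress
    # line immediately so the page never appears "hung" mid-run.
    for i, img in enumerate(images, 1):
        make_cache(img)
        out.write(f"cached {i}/{len(images)}: {img}\n")
        out.flush()
```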
  • gjunk Member
    That makes sense ... do you think inotify can be used to help manage the content flow somehow?

    Thanks for looking at this ...

    gene
  • I have created a ticket to follow up on this: http://www.zenphoto.org/trac/ticket/1498
  • gjunk Member
    That's great - thank you so much ...