How multiple caching layers can confuse the hell out of everybody a.k.a. when will my story publish!?!
When you put Varnish, Squid, nginx, or any other reverse proxy cache / HTTP accelerator on top of a modern web framework or web application that has its own caching layer, you may run into some strange, seemingly uncontrollable TTLs on your content. I have come to rely heavily on Varnish in the last year, and on Squid before that. If your reverse proxy is set up defensively (and in my opinion it should be), then you probably tend to somewhat ignore the HTTP headers coming out of the application layer and put an additional TTL on top of all of the (non-dynamic) content.
A for instance: there is a certain editorial site that I maintain where I cache everything in Varnish for 30 mins… period. This means that once an editor pushes the publish button and the content is picked up into the cache, it won’t change for half an hour. OR WILL IT? Turns out that since the application layer also has a cache lifetime of something like 5 mins, the object in Varnish may or may not refresh at the half-hour mark, and a change can take noticeably longer than 30 mins to show up.
How? Well, that’s where it gets difficult to explain. It all depends on when the change gets published with respect to the application server’s cache expiry on the object AND Varnish’s cache expiry on the object AND the time of the next request for the object after Varnish’s cache expiry has happened. HAH! Before we go any further a picture is needed.
As you can see from the graph, there are two sawtooth-looking lines representing the two types of caches in use on this site. I have made the assumption that an editor publishes a change to an object (say the homepage of a site) every half hour. Really they may publish 25 times within that half hour, but let's just keep it simple.
So in the best case an editor publishes a change, and that publish event happens RIGHT BEFORE the object expires in both Varnish AND the application server's cache. The user request comes in RIGHT AFTER the object has expired in both places, so the request flows all the way upstream to the application server, which re-renders the page with the new content, caches it, and sends it back to Varnish, which also caches it and then sends it to the client. That is the BEST POSSIBLE case. That RARELY happens.
What normally happens is that a user request comes in, hits the Varnish cache and is served a moderately old object. I say moderately because, on average, the object is somewhere near the middle of its TTL. IF the Varnish TTL has expired on an object, the next most likely thing is that the app server is somewhere in the middle of the TTL for the object within ITS cache, so the app server will just serve the cached response and Varnish will do what it knows best and CACHE THE OBJECT FOR ANOTHER CACHE CYCLE. That means ANOTHER 30 mins in this scenario. Hmmm, editors are not too happy about this situation.
Thing is, it gets worse. What CAN happen is that the editor publishes a change immediately AFTER the app server rolls over its cache for the object, so the app server has just re-cached the old content and won't see the editor's change until its next cycle. That old content is then served to Varnish, which could (as in the last example) roll over its own cache RIGHT at the end of the app server's cache cycle for the object, BUT BEFORE the app server's cache actually expires. As explained in the last example, the app server serves Varnish the old object, Varnish re-caches it, and that old object sits in the Varnish cache for an extra cache cycle. SO in the worst case the editor's story doesn't get picked up for what is essentially the sum of one FULL cache cycle of EACH cache: the app server's TTL plus Varnish's TTL.
All of this is VERY hard to explain to laypeople. Hence the picture above with the perty colorful stars.
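For anyone who prefers numbers to pictures, here is a rough Python sketch of the same three cases. It is only a toy model, not how Varnish or the app actually work internally: both layers are treated as lazy whole-page TTL caches that only refetch when a request arrives after expiry, a client hits Varnish once a minute, and the names (TTLCache, minutes_until_visible) and the content "version numbers" are made up for the illustration. The 30 and 5 minute TTLs are the ones from this post.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TTLCache:
    """Toy cache for one object: refetches only when asked after expiry."""
    ttl: int                        # minutes an entry stays fresh
    upstream: Callable[[int], int]  # where to fetch from on a miss
    value: int                      # content version currently cached
    expires_at: int                 # minute at which the entry goes stale

    def get(self, now: int) -> int:
        if now >= self.expires_at:
            # Expired: pull whatever the next layer has and keep it a full TTL.
            self.value = self.upstream(now)
            self.expires_at = now + self.ttl
        return self.value


def minutes_until_visible(publish_at: int, app_expires_at: int,
                          varnish_expires_at: int, app_ttl: int = 5,
                          varnish_ttl: int = 30,
                          horizon: int = 240) -> Optional[int]:
    """Minutes between the publish and the first client that sees it."""
    def origin(now: int) -> int:
        # Version 1 is the old page, version 2 is the editor's change.
        return 2 if now >= publish_at else 1

    # Both caches start out holding the old page (version 1).
    app = TTLCache(app_ttl, origin, 1, app_expires_at)
    varnish = TTLCache(varnish_ttl, app.get, 1, varnish_expires_at)

    for now in range(horizon):  # one client request per minute
        if varnish.get(now) == 2:
            return now - publish_at
    return None


if __name__ == "__main__":
    # Editor publishes at minute 1 in every case.
    print("best:   ", minutes_until_visible(1, 1, 1))    # both expire at publish time
    print("typical:", minutes_until_visible(1, 3, 16))   # Varnish is mid-cycle
    print("worst:  ", minutes_until_visible(1, 5, 4))    # Varnish refetches just before the app expires
```

With these inputs the three cases come out to 0, 15 and 33 minutes. The worst case lands at 33 rather than 35 only because the toy model works in whole minutes; shrink the time step and it creeps up toward the app TTL plus the Varnish TTL.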
I hope this helps someone out there. I know I will use it over and over again.
P.S. I do realize you could always just get rid of the application server's cache. Varnish is performing the exact same function, so why bother? Fact is, I simplified this example to HTML buffer caching in both layers so that it would be easier to understand. What is really going on is that the application layer is not caching the HTML buffer. It is actually caching the results of database queries in memcache for a period of time. In addition, there are "partials" strewn throughout the app that are cached independently of context, which gives finer-grained control over areas that change less frequently than Varnish allows for. BTW, I do realize Varnish supports ESI, which allows certain of these partials to be re-calculated at the app server rather than at Varnish, even when Varnish serves the rest of the page from its cache. I hope to utilize ESI in the future, but that will require an app re-write. ALSO, I would love to have the application server send cache invalidation requests upstream to Varnish when its own cache cycles roll over on objects. Unfortunately the off-the-shelf app we are using is not that smart, and again it would require a re-write to get that in there.
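If the app ever did get that smart, the invalidation call itself could be tiny. Below is a minimal Python sketch of what it might look like, under some loud assumptions: Varnish does not honor PURGE requests out of the box, so this presumes the VCL has been written to accept PURGE from the app servers, and the host name, port, path, and function names here are all hypothetical.

```python
import urllib.request


def build_purge_request(varnish_base_url: str, path: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP PURGE request for one cached path."""
    url = varnish_base_url.rstrip("/") + "/" + path.lstrip("/")
    # PURGE is not a standard HTTP method; Varnish only acts on it if the
    # VCL is configured to handle it (assumed here, not shown).
    return urllib.request.Request(url, method="PURGE")


def purge(varnish_base_url: str, path: str, timeout: float = 2.0) -> int:
    """Send the PURGE and return the HTTP status code of the reply."""
    request = build_purge_request(varnish_base_url, path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical Varnish address and page; nothing is sent in this demo.
    demo = build_purge_request("http://varnish.example:6081/", "/news/home")
    print(demo.get_method(), demo.full_url)
```

The app would call something like purge() right after it re-renders an object, so the proxy drops its copy at the same moment instead of holding it for the rest of its own TTL.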
