Update: Cloudflare’s new Cache Reserve feature stores your files in cache indefinitely, thereby solving this problem. It’s a paid service, but small to medium-sized sites will barely see any cost. This puts Cloudflare way above all other CDN providers.
After years of working with different CDN providers, I’ve reached a disturbing conclusion. All CDNs ignore your cache retention preferences. They flush your data long before you want them to. And I have the data to prove it. This drags down your cache hit ratio and slows down your site.
Ideally, you want your CDN to keep your files in its cache for as long as possible. This way, more of your customers get fast loading times, and there’s less load on your origin server, which can be used to serve HTML instead. All CDNs let you control the cache retention of your files, usually in one of the following ways:
- Through the “cache-control” response header from your origin server
- Via explicit instructions on the CDN settings
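As a rough illustration of the first option, here is a minimal sketch of how an origin server might build a one-year “cache-control” value. The helper name `cache_control_for` is hypothetical, and I’m assuming you want both `max-age` (browsers) and `s-maxage` (shared caches such as CDNs) set to the same lifetime.

```python
from datetime import timedelta
from typing import Optional


def cache_control_for(ttl: timedelta, shared_ttl: Optional[timedelta] = None) -> str:
    """Build a Cache-Control header value (hypothetical helper for illustration).

    max-age applies to every cache; s-maxage, if given, applies only to
    shared caches such as a CDN's edge servers.
    """
    directives = ["public", f"max-age={int(ttl.total_seconds())}"]
    if shared_ttl is not None:
        directives.append(f"s-maxage={int(shared_ttl.total_seconds())}")
    return ", ".join(directives)


one_year = timedelta(days=365)
print(cache_control_for(one_year, shared_ttl=one_year))
```

For a 365-day lifetime this prints `public, max-age=31536000, s-maxage=31536000`. How you attach the value to a response depends on your web server or framework.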
Cloudflare even allows you to explicitly set the “Edge Cache TTL” for individual pages via its Page Rules section.
“Wonderful!” you think. “I’ll just tell the CDNs to store my data for one year and be done with it. That way, my static files will never be served from my origin server.” Pretty cool, right?
Not so fast.
All CDNs Flush Your Data After 2 Days if Not Accessed
My site gets infrequent traffic from certain countries. But that traffic is important. It pays the bills. So imagine my surprise when my cache hit ratio was very low for these countries. I became suspicious and tested out three CDNs for myself:
- Cloudflare
- KeyCDN
- BunnyCDN
I found that, without exception, they all flush your files from cache if they’re not accessed for 2-3 days. It doesn’t matter what your cache retention policies are. It doesn’t matter if you explicitly set the TTL for certain pages. None of it matters. They will flush infrequently accessed data from their cache, no matter how badly you want them to retain it.
CDN Cache Retention: Proof with Data
All three CDNs include a response header that tells us if they’re serving cached content. Here they are for each of them:
CDN | Response Header |
---|---|
Cloudflare | cf-cache-status |
KeyCDN | X-Cache |
BunnyCDN | cdn-cache |
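If you want to repeat the check yourself, here is a minimal sketch of reading those headers with Python’s standard library. The function names are my own, and the header list assumes Cloudflare’s header is sent under its full name, `cf-cache-status`. Header names are compared case-insensitively, since servers vary in how they capitalize them.

```python
import urllib.request
from typing import Mapping, Optional

# Cache-status headers for the three CDNs tested in this article.
CACHE_STATUS_HEADERS = ("cf-cache-status", "x-cache", "cdn-cache")


def cache_status(headers: Mapping[str, str]) -> Optional[str]:
    """Return the CDN's cache status (e.g. "HIT" or "MISS"), or None if absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    for name in CACHE_STATUS_HEADERS:
        if name in lowered:
            return lowered[name].strip().upper()
    return None


def fetch_cache_status(url: str) -> Optional[str]:
    """Request a URL and report the cache status found in the response headers."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return cache_status(dict(response.headers.items()))


# Offline demonstration with a hand-written header set:
print(cache_status({"X-Cache": "HIT", "Content-Type": "text/css"}))
```

Calling `fetch_cache_status` with the URL of one of your own CDN-served files performs the live check. Keep in mind that each request is itself an access, so it counts toward keeping the file warm.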
Based on this, I created 3 dummy CSS files and primed each of the caches. Then I accessed them at different times to test whether or not they were still cached. And here are the results:
Start Date: 5:00 pm 13th May 2019
Time Since Start | Cloudflare | KeyCDN | BunnyCDN |
---|---|---|---|
5 hours | HIT | HIT | HIT |
17 hours | HIT | HIT | HIT |
29 hours | HIT | HIT | HIT |
49 hours | HIT | HIT | HIT |
72 hours | MISS | MISS | MISS |
My cache retention settings for all three CDNs were at least one week for my new CSS files. Despite that, each provider flushed my files from its cache sometime between the 49-hour and 72-hour checks, so after roughly 2 days and certainly before day 3. What’s the point of a CDN allowing you to specify cache retention policies if it’s only going to ignore them?
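The reasoning behind that window can be sketched in a few lines: the file must have been evicted somewhere between the last check that returned HIT and the first that returned MISS. The observations below are copied from the results table (all three CDNs produced the same column), and the function name `eviction_window` is my own.

```python
from typing import List, Optional, Tuple

# (hours since start, cache status), identical for all three CDNs in the table.
observations = [(5, "HIT"), (17, "HIT"), (29, "HIT"), (49, "HIT"), (72, "MISS")]


def eviction_window(
    checks: List[Tuple[int, str]],
) -> Tuple[Optional[int], Optional[int]]:
    """Return (last HIT hour, first MISS hour) bracketing the eviction."""
    last_hit = None
    for hours, status in sorted(checks):
        if status == "HIT":
            last_hit = hours
        else:
            return last_hit, hours
    return last_hit, None  # never evicted during the observed period


last_hit, first_miss = eviction_window(observations)
print(f"Evicted between {last_hit}h and {first_miss}h after the start")
```

This prints `Evicted between 49h and 72h after the start`. Checking more often would narrow the window, at the cost of touching the files more frequently.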
Honestly, I was surprised that Cloudflare’s free plan was able to match the paid products of the other two. I expected KeyCDN and BunnyCDN to keep the files for longer, but I was wrong. After much consideration, this is why I feel that Cloudflare is still the best CDN even though it’s free.
Also, my anecdotal experience is that Cloudflare clears its HTML cache far more frequently, even if there’s a page rule set to explicitly keep it for longer. I’ve heard that Incapsula has long cache retention policies, but they’re too expensive for me to just test out.
It’s Particularly Bad for Low-Traffic Sites
Sites whose content is infrequently accessed are the biggest losers here. There’s a good chance that certain content won’t be accessed for more than 2 days on certain POP servers, and that will cause a delay for the few visitors who do finally visit.
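To put a rough number on that, here is a back-of-the-envelope sketch. It assumes requests for a file reach a given POP independently at a steady average rate (a Poisson model, which is my simplification and not something I measured), and that the POP evicts the file after exactly 2 days without an access. Under those assumptions, the chance that a request arrives after the file has been flushed is exp(-rate × window).

```python
import math


def miss_probability(requests_per_day: float, retention_days: float = 2.0) -> float:
    """Chance that the gap since the previous request exceeds the retention window.

    Assumes Poisson arrivals, so gaps between requests are exponentially
    distributed and P(gap > window) = exp(-rate * window).
    """
    return math.exp(-requests_per_day * retention_days)


# Illustrative per-POP request rates, not measured values.
for rate in (0.1, 0.5, 2.0, 10.0):
    print(f"{rate:>5} requests/day -> {miss_probability(rate):.1%} of requests miss")
```

Under this model, a file requested once every two days at a POP (0.5 requests per day) would miss on about 37% of requests, while a file requested ten times a day would almost never miss.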
As far as I know, there’s no way to change this behavior. CDNs flush their cache frequently to avoid overburdening their servers with useless storage. One workaround would perhaps be for a CDN to charge its users a fee for pull storage POPs, so they could keep the content indefinitely. But such a provider doesn’t exist.
Push Zones are Not a Solution
Push zones are where you manually upload your data onto a CDN instead of waiting for it to be passively pulled from your origin server. However, this doesn’t mean that the edge POPs will store the data. Each POP still has to download it from the push zone the first time a request comes through.
There’s no easy solution here. CDNs are incredibly useful if you can trust them to store your files in cache and serve them to customers when needed. But if it’s cleared out regularly, then it defeats the purpose. If you have some better insight into this problem, let me know in the comments below!