Ensuring maximum CDN performance

Every internet project needs an infrastructure to support it, and that infrastructure must be sized appropriately for each phase. This means consumption is not excessive when it is unnecessary, resources do not sit idle when they are not needed, and capacity can grow when we need it most. All of this without neglecting security, so that we do not build up technical debt that is expensive to pay off later, whether through the changes we are then forced to make or, in the worst case, through the theft of our information.

Leading my team at Transparent Edge, I have spent many years helping businesses and other organizations manage their origin platforms.

The reality is that many lack experience implementing web platforms, or do not have the skills to carry out this type of work, which calls for senior Systems and Security teams in the design and implementation of the internet infrastructure as well as in its subsequent administration.

We are talking about teams that, like ours, work at the origin and are focused at all times on performance, scalability, operational benefit and security.

In this post I want to use a real case to show how we work at Transparent Edge to guarantee maximum performance from our next-generation CDN, and what we do for organizations that want and need to introduce one into their service but do not know how, or do not dare, to do so.

Getting the most out of a next-generation CDN

Some time ago, a company came to us with serious instability problems in its web platform, a microservices-based platform that was serving its product in production.

The first thing to do in these cases is to analyse the situation. To do this, we implemented several monitoring solutions that gave us data on which to base our subsequent decisions about how to get the best performance from the CDN.

Thanks to this monitoring, and thanks to the direct contact between our technical teams and the client’s internal technical teams, we were able to detect bottlenecks, configuration defects and other problems that meant that content delivery was not as efficient as the client needed and the end user desired.

In parallel, our client, who until then had relied on one of the largest non-European CDN service providers, decided to change and hire the next-generation CDN from Transparent Edge.

The change was simple: we added the configurations in our dashboard and changed a CNAME in the client’s DNS. All their traffic started going through the edge nodes of our CDN. And then we were faced with the harsh reality: none of the requests were being cached.
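
For illustration, the switch amounts to a single record in the client's zone (the domain names here are hypothetical, not our actual edge hostnames):

; Traffic for www now resolves to the CDN edge instead of the origin
www.example.com.   IN   CNAME   client-id.edge.example-cdn.net.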

The importance of being part of the customer’s team

The investigation we immediately launched revealed that every response carried a no-cache configuration. This not only prevented the CDN's cache from being used, but also added a few extra milliseconds to each request, because the edge servers, even though their caches sat unused, were still acting as proxies through which all traffic passed.
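
A typical anti-caching setup looks something like this in the origin's responses (reconstructed here for illustration; these are not the client's exact values):

Cache-Control: no-cache, no-store, must-revalidate
Pragma: no-cache

With headers like these, the edge can store nothing, so every single request travels all the way to the origin.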

The previous provider had never raised the alarm and simply kept billing for the traffic generated. Why? Because the other CDNs lack origin services. We not only have a next-generation CDN, supported in its daily operation by a team of engineers. We also have a team of senior Systems and Security engineers who work as if they were part of the client's team, looking for potential problems and for opportunities to keep the CDN running at full capacity and to maximize the operational benefits for the IT area.

Back to our client, they confirmed that they had had quite a few issues updating content in the past and had therefore decided to introduce no-cache headers to prevent anything from being cached.

Thanks to the symbiosis between our technical teams and the client's teams, we came up with a proposed solution: start by caching only the static content of their applications (images, JavaScript, CSS and so on), leaving the more dynamic content for a second phase.

One might expect this measure to reduce traffic to the origin significantly, but it did not, because every response was accompanied by a Vary: Cookie header. A separate cached object was therefore generated for each user's cookie, accessible only to that user, which fragmented the cache so badly that it became ineffective.

The emergency solution we then came up with was to configure something like this on the nodes at the edge:

if(bereq.url~"^(avi|bmp|doc|flv|ico|mov|mp3|pdf|png|jpg|jpeg|css|js|xml)$") {

     unset beresp.http.set-cookie;

     unset beresp.http.pragma;

     unset beresp.http.etag;

     set beresp.ttl=86400;

}
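
The rules above act on the response side. A common companion measure in Varnish-style VCL, sketched here under the same file-extension assumption rather than taken from the client's actual configuration, is to strip cookies from matching requests before the cache lookup, so that Vary: Cookie can no longer split the cache for static objects:

sub vcl_recv {
    # Without a Cookie header, all requests for a static asset
    # collapse onto a single cached variant despite Vary: Cookie.
    if (req.url ~ "\.(avi|bmp|doc|flv|ico|mov|mp3|pdf|png|jpg|jpeg|css|js|xml)$") {
        unset req.http.Cookie;
    }
}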

There are more elaborate solutions that do not decide what to do based on the request URL, that honor the cache headers received from the origin, and that even allow images to be generated dynamically. But at that moment we needed a quick fix to reduce the load on the origin servers, and this was the most appropriate one for our client.
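
As a taste of that kind of approach, here is a minimal sketch (an illustration, not the configuration we deployed) that trusts whatever caching policy the origin declares and only applies a fallback TTL when the origin declares nothing at all:

sub vcl_backend_response {
    # Honor the origin's own Cache-Control/Expires headers;
    # only impose a default TTL when neither is present.
    if (!beresp.http.Cache-Control && !beresp.http.Expires) {
        set beresp.ttl = 1h;
    }
}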

After this, it was time to decide what we wanted to cache, and where. Everywhere? In browsers? Also on the edge servers of our next-generation CDN?

The client was skeptical about caching some of their content in browsers, so we proposed caching only on the edge nodes by having their origins send something like this:

Cache-control "max-age=0, s-maxage=2629800"

These objects were cached at the edge for up to a month (2,629,800 seconds), but never in the users' browsers (max-age=0). This allowed our client to control what was delivered at all times, invalidating whatever needed to be invalidated at their discretion. For that, they could rely on our real-time invalidation solution, which integrates easily via API and allows content to be invalidated programmatically. Invalidation takes effect in a matter of seconds on our edge nodes, with no need to worry about browsers holding on to stale content in their local caches.
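
Our API is the supported way to trigger those purges. Purely to illustrate the mechanics, Varnish-based edge nodes typically implement invalidation along these lines (the ACL below is hypothetical, not our production setup):

acl invalidators {
    "203.0.113.0"/24;   # hypothetical network allowed to purge
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ invalidators) {
            return (synth(405, "Purging not allowed"));
        }
        # Evict the object stored for req.url from the edge cache
        return (purge);
    }
}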

The results

With this, and a few other measures involving both our architecture engineers and the client's development team, we managed to make the service stable. We also achieved maximum CDN performance and ensured that users enjoyed an excellent browsing experience in terms of delivery speed.

Our work for this client enabled them to increase the operational profit of their IT area by significantly reducing infrastructure costs. They now serve a daily average of 70,000 requests per second from a very small origin platform, thanks to 99% of objects being served from cache.

Fermín Manzanedo is co-founder and COO of Transparent Edge.

Critically minded and always in search of continuous improvement, Fermín has contributed to the development of a good number of companies, including large media groups, solving their information technology problems with solutions designed around the specific goals of their businesses. He is also one of the ‘fathers’ of the first Spanish CDN. A physicist by training, he always brings a strategic vision to the administration of IT systems.