21 Sep 23

Securing content through the use of tokens

The invisible ones of the CDN

Content delivery networks play a fundamental role in video broadcasting and, more precisely, in securing the transmission itself. In recent years we have worked extensively on this front with one of our clients, using tokens. In today’s post I will tell you what we have accomplished together.

Unlike other resources that a CDN can deliver, videos have a differentiating characteristic: they are heavy, very heavy. According to our real-time traffic monitoring, more than 30% of the traffic delivered daily by our CDN corresponds to video resources, despite accounting for only 1% of total requests.

This figure simply illustrates the weight I am talking about, because it directly affects bandwidth and the cost of outgoing (egress) traffic. Sizing these two variables correctly is essential to guarantee the quality of the service (availability, response time, low latency, etc.) and the user experience.

This is how we distribute the videos

At Transparent Edge – and with permission from Julius Caesar – we follow a divide et impera approach. This approach invites us to break a complex problem into subproblems of the same nature but simpler and, consequently, easier to manage. Thus, we carry out the transmission in a staggered, progressive manner, in smaller, lighter fragments called chunks.

We went from having a single resource, a video of a certain length, to having a multitude of small resources. To determine the order in which these chunks are played, we use an index. We call these indexes manifests.

In the case of video on demand, the manifest could be the index of all the chunks that make up the video, from start to finish; in the case of a live broadcast, by contrast, the manifest will be refreshed gradually as the broadcast progresses.

In terms of cache policies, in the first case we could set relatively high persistence times, for example a 'max-age' of one week in the 'Cache-Control' header. In the second case, however, the cache policy should be radically different, with a persistence of just a few seconds, depending on the duration of those chunks.
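
To make this more concrete, here is a minimal sketch, in Python, of how those two policies might be chosen. The helper name and the exact values are illustrative assumptions, not our actual configuration.

    def cache_control_for(asset_type: str) -> str:
        """Return a Cache-Control header value for a given video asset type."""
        if asset_type in ("vod_manifest", "chunk"):
            # A VOD manifest and already-published chunks do not change,
            # so a long persistence (for example, one week) is reasonable.
            return "public, max-age=604800"
        if asset_type == "live_manifest":
            # A live manifest is refreshed as the broadcast progresses,
            # so it should only persist for a few seconds,
            # roughly in line with the chunk duration.
            return "public, max-age=4"
        # Anything else: be conservative and do not cache.
        return "no-store"

    print(cache_control_for("vod_manifest"))   # public, max-age=604800
    print(cache_control_for("live_manifest"))  # public, max-age=4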

How to secure content using tokens

We can imagine a token as a kind of dynamic signature, generated by the application from which the user plays the content and verified by the client’s origin. We say it is dynamic because it is updated as playback progresses and is valid only for a certain period of time.

To generate a token we need certain variables: the user’s IP address, the start and end of the token’s validity period, a secret key, etc. The specific variables depend on the particular needs of each client.

In any case, these variables are concatenated in a certain way (e.g. [<IP address>][<valid from>][<valid until>][<secret key>]) and a cryptographic or hash function is applied to them. This function will convert the input string into an unintelligible sequence of characters, as if Schrödinger’s cat had escaped from its box and, enraged (and perhaps zombie-like), had started jumping around on the computer keyboard.

The resulting hash, together with the public variables (the user’s IP address and the start and end of the token’s validity period), will allow us to compose the token itself, which will be sent to the client’s backend, for example: <IP address>-<hash>-<valid from>-<valid until>.
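
As an illustration, here is a minimal sketch of that generation step in Python. The field order, the separator and the use of SHA-256 are assumptions for the example; the real scheme depends on what is agreed with each client.

    import hashlib
    import time

    # Hypothetical secret shared between the player application and the origin.
    SECRET_KEY = "change-me"

    def generate_token(ip: str, valid_from: int, valid_until: int,
                       secret: str = SECRET_KEY) -> str:
        """Build a token of the form <IP address>-<hash>-<valid from>-<valid until>.

        The concatenation order and the SHA-256 digest are illustrative choices;
        any scheme agreed between the client and the CDN works the same way.
        """
        payload = f"{ip}{valid_from}{valid_until}{secret}"
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        return f"{ip}-{digest}-{valid_from}-{valid_until}"

    now = int(time.time())
    print(generate_token("203.0.113.7", now, now + 300))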

The origin will take this token from the incoming request and extract the different fields that make it up. With those variables and the secret key, it will repeat the same steps previously executed on the player’s side and compare the resulting hashes. For the received token to be valid, both must match.

At this point, we can find different scenarios (illustrated in the sketch after this list):

  • If the origin does not receive a token, or if the token is invalid, the origin may respond with a 401 (Unauthorized);
  • If the token validity period has not yet started, the origin may return a 404 (Not Found);
  • If the validity period has expired, the origin may return a 410 (Gone).
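
Continuing the hypothetical scheme from the previous snippet (same field order, separator and SHA-256), a minimal sketch of that verification on the origin side could look like this; the status codes follow the scenarios listed above.

    import hashlib
    import time

    SECRET_KEY = "change-me"  # same hypothetical secret as on the player side

    def validate_token(token: str, secret: str = SECRET_KEY) -> int:
        """Re-compute the hash from the token fields and return an HTTP status code."""
        try:
            ip, received_hash, valid_from, valid_until = token.split("-")
            valid_from, valid_until = int(valid_from), int(valid_until)
        except ValueError:
            return 401  # missing or malformed token

        payload = f"{ip}{valid_from}{valid_until}{secret}"
        expected_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()

        if received_hash != expected_hash:
            return 401  # the signatures do not match
        now = int(time.time())
        if now < valid_from:
            return 404  # the validity period has not started yet
        if now > valid_until:
            return 410  # the validity period has expired
        return 200  # valid token: serve the content

    print(validate_token("203.0.113.7-not-a-real-hash-0-1"))  # 401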

The challenge from a CDN perspective

The use of tokens inevitably affects the cache key: each request becomes different from all the others, even when they refer to the same resource or asset (chunk or manifest), which collapses the hit ratio and the efficiency of the cache itself.

To deal with this problem we can follow two strategies:

  1. Implement a preflight request: when a request for token-secured content is received, an external service at the client’s origin (backend) is invoked to tell us whether the token is valid and, if so, the cached object is returned. To do this, Varnish Enterprise provides us with very interesting tools, such as the http VMOD. This approach is similar to the one we follow in other use cases and scenarios of a very different nature to the one we are dealing with now, such as the implementation of payment walls or paywalls.
  2. Delegate the token verification mechanism to the CDN itself. To do this, the client must share with us how the signature is generated. This is the strategy we have used in the case I am presenting to you today. At Transparent Edge, we provide an out-of-the-box solution to protect content using a token, which can be easily integrated into our clients’ VCL configuration. Here, Varnish Enterprise provides us with the crypto VMOD, which makes our lives easier when carrying out the cryptographic operations we talked about earlier (a conceptual sketch follows this list).
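
The following conceptual sketch, in Python rather than VCL, illustrates the idea behind the second strategy: validate the token at the edge and leave it out of the cache key, so that requests for the same chunk collapse into a single cached object. It assumes the token travels as a query-string parameter called "token"; the parameter name and the URLs are illustrative assumptions.

    from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

    def cache_key(url: str) -> str:
        """Return the URL without the token parameter, to be used as the cache key."""
        parts = urlsplit(url)
        query = [(k, v) for k, v in parse_qsl(parts.query) if k != "token"]
        return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))

    # Two requests for the same chunk, each carrying a different token...
    url_a = "https://cdn.example.com/video/chunk_042.ts?token=AAA"
    url_b = "https://cdn.example.com/video/chunk_042.ts?token=BBB"

    # ...map to the same cache key once the token has been validated and stripped.
    assert cache_key(url_a) == cache_key(url_b)
    print(cache_key(url_a))  # https://cdn.example.com/video/chunk_042.ts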

The main advantage of the chosen approach is not only that it ensures adequate cache efficiency, but also that it frees the client’s backend from all the computation required to validate, one after another, the tokens sent with the underlying requests. In addition, executing these calculations at the edge improves the user experience thanks to a network of nodes strategically distributed around the world.

In the case we are addressing in this post, we have worked alongside our client for the last few years. It is an honour to continue walking alongside them and adapting our functionalities to the particular needs of each project.

As Antonio Machado said, “Walker, your footprints are the path and nothing else; walker, there is no path, the path is made by walking.”

We keep walking.

Alberto Suárez López is a systems administrator at Transparent Edge.

More Asturian than cider, Alberto studied Technical Engineering in Management Information Technology and a Postgraduate Degree in Web Engineering at the University of Oviedo. Since then, he has faced every possible type of UNIX infrastructure and defeated them all thanks to his vast knowledge, which makes him an all-rounder when faced with any technical problem.