30 Nov 23
As the volume of data on a website increases, analyzing it manually can become a nightmare. An anomaly detection system can automate and simplify this strategic task, detecting incidents and suspicious behavior.
It’s not common to start by asking why, but it’s as good a way as any. What the hell has happened that we decided to jump into this mess? Our platform already does its job perfectly, making your website fly and protecting it from attacks at all times. Do we need this kind of service? The answer is a definite YES.
With a platform like ours, a vast amount of information is in motion, and democratizing that data is not easy. Extracting information from said data can be a complex, laborious, and even intimidating task for our clients. That’s why we’re here: to make things easy for them.
Data is a window to knowledge, in this case about your website. For your website to continue flying and being a fortress, you have to know exactly what is happening. And preferably, at the moment it happens.
Another question? I'm starting to like this; let's keep it going.
The anomaly detection system, as its name implies, is a system that tries to find anomalies in your website traffic and alerts you when they occur. As simple as that. As complicated as that.
Sounds good, but what is an anomaly? In contrast to the negativity that the term evokes, it doesn’t necessarily have to be something bad. Depending on the nature of your website, a tenfold increase in traffic at a given moment can be considered an anomaly, positive or negative, but still, an anomaly. And, as such, it is important that you are aware that these events are happening and at the moment they happen.
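The post doesn't describe the service's internals, but the "tenfold increase" example can be sketched with a simple baseline comparison. Everything here is illustrative: the function name, the sliding window, and the `factor` threshold are assumptions, not the platform's actual detection logic.

```python
from collections import deque

def detect_traffic_anomaly(history, current, factor=10.0):
    """Flag `current` as anomalous if it exceeds `factor` times the
    average of recent traffic samples. A toy stand-in for the real,
    unpublished detection logic."""
    if not history:
        return False
    baseline = sum(history) / len(history)
    return current > factor * baseline

# Sliding window of requests-per-minute samples.
window = deque([100, 110, 95, 105, 98], maxlen=60)

print(detect_traffic_anomaly(window, 1200))  # True: over 10x the baseline
print(detect_traffic_anomaly(window, 300))   # False: elevated, but not anomalous
```

Note that the check is direction-agnostic only by omission; as the post says, an anomaly can be positive or negative, so a production version would likely also flag sudden drops.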
At the time of writing this post, the system covered six different types of anomalies.
This service integrates with another recently developed feature, our ACL management system: if a trusted IP triggers an anomaly, it can be added to a list and disregarded.
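The suppression idea can be sketched in a few lines using Python's standard `ipaddress` module. The allowlist contents and the function name are hypothetical; the real ACL system is managed from the panel.

```python
import ipaddress

# Hypothetical ACL of trusted networks whose traffic should never
# raise an anomaly alert (addresses are documentation ranges).
TRUSTED = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]

def should_alert(source_ip: str) -> bool:
    """Return False when the offending IP matches a trusted ACL entry."""
    addr = ipaddress.ip_address(source_ip)
    return not any(addr in net for net in TRUSTED)

print(should_alert("203.0.113.42"))  # False: inside a trusted network
print(should_alert("192.0.2.9"))     # True: not on the ACL, so alert
```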
Of course, the sensitivity of these probes can be adjusted in our panel to match your site’s needs and, above all, to ensure that the information we send is useful and doesn’t end up in your spam folder.
This anomaly detection system is based on our analytics tool which, as you know, allows you to see in detail what is happening on your website in real-time.
As you can imagine, moving this much data isn’t easy, and having analytics like ours with such a level of detail is costly on many levels.
The analytics system is a large database based on Elasticsearch. We send the logs of all our systems there in real-time, to be able to exploit that data more comfortably, either from our panel or to feed other tools like the one we are discussing today.
To transport these logs, we use Filebeat and Kafka. The architecture is similar to this diagram:
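To make the flow concrete without standing up real services, here is a minimal sketch that models the three stages with in-memory stand-ins: a queue playing the role of a Kafka topic (fed by Filebeat in the real deployment) and a list playing the role of an Elasticsearch index. The function names and document shape are assumptions for illustration only.

```python
import json
from collections import deque

kafka_topic = deque()  # stand-in for a Kafka topic
es_index = []          # stand-in for an Elasticsearch index

def ship_log(line: str) -> None:
    """Filebeat's role: pick up each log line and publish it to Kafka."""
    kafka_topic.append(json.dumps({"message": line}))

def consume_and_index() -> None:
    """Consumer's role: drain the topic and index documents into ES."""
    while kafka_topic:
        es_index.append(json.loads(kafka_topic.popleft()))

ship_log("GET /index.html 200 12ms")
ship_log("GET /missing 404 3ms")
consume_and_index()
print(len(es_index))  # 2 documents indexed
```

In production the same shape holds, but Filebeat ships files it tails, Kafka buffers the stream between producers and consumers, and Elasticsearch makes the documents queryable in near real time.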
And this is the result:
The anomaly detection service is tied to our advanced analytics, which retain data for one week. Compared to the four hours of retention offered by basic analytics, this is a great advantage. Additionally, all clients with this system will receive a free license to activate it on one of their company's sites.
Jorge Román is the co-founder and CEO of Transparent Edge.
Jorge Román is a systems technician who over the years has given way to the CEO he had inside, or a CEO who over the years has given way to the systems technician he had inside. He doesn't have a clear answer and often thinks about it while mopping the office floor and serving coffees. The rest of the time, he manages the first CDN of Spanish origin, raises two daughters, and still finds time to read about entrepreneurship. Sleeping is left for another life.