If you’ve been following the Alcion blog over the past couple of months, you’ll know that Niraj Tolia, Alcion’s CEO, and I have explored the increasingly blurred lines between data protection and cyber-security. In Niraj’s blog post, he outlined Alcion’s thesis – that the overwhelming majority of data outages are likely to be caused by cyberattacks – and my previous article gives you a sense of the true cost of a breach by walking through a real-world case study. The bottom line is that ransomware represents a credible, high-severity threat to every business, irrespective of size or sector, and its capabilities are ever-evolving.
Alcion is focused on building a data protection solution for the modern age – in addition to our cloud-native architecture, Alcion’s platform leverages artificial intelligence in multiple components of the product, from intelligent backups to compliance scoring to ransomware detection. But what separates us from other data protection services (and other startups with “.ai” domains) is that we’ve built these AI models not just to pay lip-service to this exciting new technology trend, but to help solve the problem that we obsess over – protecting our customers’ mission-critical data.
This article is going to drill down into just one of our AI-driven features, ransomware detection, and explain how it provides industry-leading cyber-security protection for our customers. As the article will show, Alcion’s composable architecture has been purpose-built for AI-driven data protection workflows, allowing us to efficiently implement fine-grained ransomware detection techniques that our larger legacy competitors have claimed were unfeasible or, frankly, impossible.
Our ransomware detection algorithm is built on the following five pillars, each of which will be discussed in detail below:
Alcion uses anomaly detection models to detect ransomware in customers’ environments. Anomaly detection refers to a family of unsupervised machine learning algorithms that learn what's "normal" within a given data set or system and then alert on deviations from this norm. The unsupervised nature of these models was particularly important to address the cold start problem. Specifically, it enables Alcion to offer AI-driven ransomware detection without having a corpus of pre-aggregated training data.
Alcion’s anomaly detection models are considered “multivariate” because they’re designed to process observations which consist of multiple signals. Alcion publishes one observation for each backed-up file and, by collecting multiple signals for each file, we’re able to detect attacks from a wide range of ransomware strains. This is non-trivial because strains differ in how they encrypt data. For example, some ransomware strains encrypt the entirety of the file content so that the victim can’t salvage any unencrypted content, while others only encrypt a portion of the file content so that they can expedite the attack process. Lockbit falls into the latter category and Splunk cybersecurity researchers observed this strain to be capable of encrypting over 98,000 files, totaling 53 GB, in less than 5 minutes because it only encrypts 4 KB of each file.
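To make the partial-encryption problem concrete, here is a minimal sketch of what a multivariate, per-file observation could look like. The signals shown (byte entropy of the whole file, of its first chunk, and of its highest-entropy chunk) are assumptions chosen for illustration; this article does not disclose the signals Alcion actually collects, and the names `shannon_entropy` and `file_observation` are hypothetical.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for empty input, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def file_observation(content: bytes, chunk_size: int = 4096) -> dict:
    """Build one multivariate observation for a file (hypothetical signals)."""
    chunks = [content[i:i + chunk_size] for i in range(0, len(content), chunk_size)]
    chunk_entropies = [shannon_entropy(c) for c in chunks] or [0.0]
    return {
        "size_bytes": len(content),
        "whole_file_entropy": shannon_entropy(content),
        "first_chunk_entropy": chunk_entropies[0],
        "max_chunk_entropy": max(chunk_entropies),
    }


# A 1 MiB text-like file, and a copy whose first 4 KB is replaced with random
# bytes to imitate a strain that only encrypts a small portion of each file.
plain = (b"quarterly revenue report, confidential\n" * 30000)[:1024 * 1024]
partially_encrypted = os.urandom(4096) + plain[4096:]

before = file_observation(plain)
after = file_observation(partially_encrypted)

for name in ("whole_file_entropy", "first_chunk_entropy", "max_chunk_entropy"):
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this toy example the whole-file entropy barely moves, because only about 0.4% of the bytes changed, while the first-chunk signal jumps to nearly 8 bits per byte. That is the intuition behind collecting several signals per file rather than relying on a single one.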
We spent months doing research and analysis on a diverse set of ransomware strains. Using these findings, we identified a set of signals which are highly representative of ransomware-encrypted files and built a system to feed these inputs to our anomaly detection models. This feature engineering was arguably the most important part of the project because a model is only as good as the data. But that’s only one aspect of the “secret sauce” – another differentiator here is that Alcion collects a multivariate observation for each file on every backup. This breaks through the industry consensus which, as recently as 2022, held that it was prohibitively inefficient to collect signals at this granularity and frequency. This gives Alcion a leg up in detecting ransomware attacks which are programmed to encrypt only a subset of available data. These attacks are no less severe because, as the same Splunk cybersecurity researchers noted, “the catastrophic apex may be when a single critical file is encrypted.”
Consider the following scenario: ACME corporation is a financial consulting firm that has 75 employees, of which 5 are executives. There is a targeted ransomware attack against just the executive team that works with critical client and business data. If multivariate observations were to be collected from all 75 users and fed into a single anomaly detection model, the performance would be underwhelming because the patterns in file-related activity differ greatly between the 75 employees. This skews the aggregate model’s perception of “normal” such that it’s less likely to raise an alarm.
Instead, we achieve the most accurate inference results with a separate anomaly detection model for each user. Each model is assigned to a specific user and trained only on data collected from that user’s backups, so it learns the file-related activity trends specific to that user and its threat inferences are tailored accordingly. The same technique is also applied to other distinct resources, such as individual SharePoint sites, that contain data at risk of ransomware.
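The ACME scenario can be sketched in a few lines. This is a deliberately simplified illustration, not Alcion’s model: it uses a single hypothetical signal (files modified per backup), a toy z-score detector named `ZScoreModel`, invented user names and histories, and an arbitrary threshold of 3.0.

```python
from statistics import mean, pstdev


class ZScoreModel:
    """Toy anomaly model: learns a mean and standard deviation, scores by |z|."""

    def __init__(self, observations):
        self.mu = mean(observations)
        self.sigma = pstdev(observations) or 1.0  # avoid dividing by zero

    def score(self, value: float) -> float:
        return abs(value - self.mu) / self.sigma


# Hypothetical history: number of files modified in each backup, per user.
history = {
    "exec_alice": [4, 5, 6, 5, 4, 6, 5, 5],                   # quiet executive
    "analyst_bob": [180, 220, 150, 300, 90, 260, 210, 240],   # heavy user
    "analyst_cara": [120, 310, 200, 170, 280, 140, 230, 190],  # heavy user
}

# One pooled model over everyone versus one model per user.
pooled = ZScoreModel([v for obs in history.values() for v in obs])
per_user = {user: ZScoreModel(obs) for user, obs in history.items()}

# A targeted attack modifies 60 of Alice's files in a single backup.
THRESHOLD = 3.0
pooled_score = pooled.score(60)
alice_score = per_user["exec_alice"].score(60)

print(f"pooled model:   {pooled_score:.2f} (alert: {pooled_score > THRESHOLD})")
print(f"per-user model: {alice_score:.2f} (alert: {alice_score > THRESHOLD})")
```

Sixty modified files is unremarkable against the pooled baseline, which is dominated by the heavy users, but it is far outside Alice’s own history. Only the per-user model raises an alert.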
By training our models at the most granular level, the user, our ransomware detection feature is also able to detect anomalies more quickly. Early detection is of utmost importance when it comes to cyber-attacks – the sooner admins are alerted to an attack, the sooner they can isolate the infected systems and protect any remaining unencrypted data. This, in turn, reduces the leverage of the bad actor.
Furthermore, this design ensures that data is never shared between tenants.
To the best of our knowledge, Alcion is the first enterprise data protection service to offer ransomware insights at resource-level granularity. But we took it one step further and implemented “ensemble methods” for each resource, meaning that each inference result published to the customer is actually a combination of results from many models. Each model in the ensemble analyzes the likelihood of threat for a specific ransomware attack profile. For instance, certain models are refined to account for the speed at which data is encrypted (remember the Lockbit example from above). Meanwhile, other models are optimized to detect ransomware attacks based on the encryption approach – some strains encrypt the data in-place while other strains delete the original file and replace it with a new file containing the encrypted content. These are just a few examples of the domain-specific knowledge that’s been encoded into our models.
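As a rough sketch of the ensemble idea, the example below combines two profile-specific scorers for a single resource, one for in-place encryption and one for delete-and-replace. Everything here is an assumption made for illustration: the `BackupObservation` fields, the scoring formulas, and the choice to combine members by taking the maximum are hypothetical and are not taken from Alcion’s implementation. Speed-based profiles are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupObservation:
    """Hypothetical per-resource summary of one backup."""
    files_modified: int
    files_deleted: int
    files_created: int
    high_entropy_fraction: float  # share of changed files that look encrypted
    baseline_modified: float      # typical files_modified for this resource
    baseline_deleted: float       # typical files_deleted for this resource


def _burst(value: float, baseline: float) -> float:
    """Map 'how many times above baseline' onto 0..1, saturating at 10x."""
    return min((value / max(baseline, 1.0)) / 10.0, 1.0)


def in_place_profile(obs: BackupObservation) -> float:
    """Many existing files rewritten with high-entropy content."""
    return _burst(obs.files_modified, obs.baseline_modified) * obs.high_entropy_fraction


def delete_and_replace_profile(obs: BackupObservation) -> float:
    """Originals deleted and a matching number of high-entropy files created."""
    if obs.files_deleted == 0:
        return 0.0
    paired = min(obs.files_created, obs.files_deleted) / obs.files_deleted
    burst = _burst(obs.files_deleted, obs.baseline_deleted)
    return burst * paired * obs.high_entropy_fraction


PROFILES = {
    "in_place": in_place_profile,
    "delete_and_replace": delete_and_replace_profile,
}


def ensemble_score(obs: BackupObservation):
    """Score every profile and report the strongest one."""
    scores = {name: fn(obs) for name, fn in PROFILES.items()}
    top = max(scores, key=scores.get)
    return scores[top], top, scores


obs = BackupObservation(
    files_modified=3, files_deleted=400, files_created=400,
    high_entropy_fraction=0.95, baseline_modified=5.0, baseline_deleted=2.0,
)
score, top, scores = ensemble_score(obs)
print(top, round(score, 2), scores)
```

Taking the maximum means a strong match on any one attack profile is enough to surface a threat, even when the other members see nothing unusual. A production ensemble could just as reasonably use weighted voting or a learned combiner.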
In summary, we’ve designed our ransomware detection feature so that our customers get threat insights tailored to each protected resource and each of the distinct ransomware attack profiles that have been compiled by cybersecurity experts at Alcion.
Even though the variance in observations is greatly reduced by our resource-scoped models, trends in file-related activity for a resource can evolve over time. For example, we would expect to see new file-access patterns when a user changes roles or onboards to a new project. To account for this, we’ve built our models to support continuous learning (also referred to as “online learning”) so that stale observations are automatically pruned and replaced with fresh observations. This is different from traditional machine learning models where training and inference are discrete processes – once a model is deployed, it can’t learn from new observations; it can only make inferences on them. In the best-case scenario, new versions of the model are being trained on incoming observations, but end-users don’t benefit from these improvements until the new version is deployed.
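One simple way to picture continuous learning is a model that scores each new observation against a bounded window of recent history, so older observations fall away as new ones arrive. The sketch below is an illustration under that assumption only: the `SlidingWindowModel` class, the single files-per-backup signal, the window size, and the z-score rule are all hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class SlidingWindowModel:
    """Toy online anomaly model over a bounded window of recent observations."""

    def __init__(self, window_size: int = 30, min_history: int = 5):
        # A deque with maxlen drops the oldest entry once the window is full.
        self.window = deque(maxlen=window_size)
        self.min_history = min_history

    def score(self, value: float) -> float:
        """Return |z| against the current window (0.0 until enough history)."""
        if len(self.window) < self.min_history:
            return 0.0
        sigma = pstdev(self.window) or 1.0  # avoid dividing by zero
        return abs(value - mean(self.window)) / sigma

    def observe(self, value: float) -> float:
        """Score the observation, then learn from it."""
        result = self.score(value)
        self.window.append(value)
        return result


# Hypothetical files-modified-per-backup counts before and after a role change.
old_role = [9, 10, 11, 10, 9, 11] * 5
new_role = [98, 100, 102, 101, 99, 100] * 5

model = SlidingWindowModel(window_size=30)
for value in old_role:
    model.observe(value)

first_new_score = model.score(100)  # new pattern judged against old history
for value in new_role:
    model.observe(value)
adapted_score = model.score(100)    # same value after the window has turned over

print(f"before adapting: {first_new_score:.1f}, after adapting: {adapted_score:.1f}")
```

The same value goes from highly anomalous to ordinary once the window has been refreshed, with no retraining or redeployment step. A window this naive would also adapt to a sufficiently patient attacker, so a real system needs additional safeguards that this sketch does not attempt.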
It’s helpful to compare these model architectures through the lens of software deployment frameworks (continuous deployments vs. versioned deployments) – with the continuous deployment framework, customers benefit from improvements and fixes in real time. We believe that the ability to adapt in real time to changes in file-related activity patterns is a key differentiator for Alcion’s ransomware detection capabilities.
Note that this continuous learning capability has been refined so that Alcion is still able to detect ransomware attacks that encrypt data at a slower rate. Traditionally, ransomware attacks try to encrypt data as quickly as possible, but this slow-moving approach could be used to avoid other ransomware detection alarms based on resource consumption (e.g., CPU utilization or network I/O).
In the previous sections, we walked through the architecture of Alcion’s custom-built, AI-driven ransomware detection feature. While our anomaly detection models boast industry-leading detection capabilities, we’ve augmented this offering with insights from Microsoft Defender. Specifically, Alcion will check for ransomware-related signals from Microsoft Defender and seamlessly integrate any relevant data into our system so that Alcion customers can see all alerts in one place.
Our XDR integration also allows us to benefit from detected attacks impacting other parts of a customer’s IT infrastructure (e.g., employee laptops or cloud VMs). Beyond feeding these signals into our AI models, we can use them to trigger proactive backups that capture clean data before the attack spreads further.
We tested our ransomware threat detection against a number of prevalent ransomware strains. As an example, the above results show the effectiveness of Alcion’s ransomware detection algorithms against the Bad Rabbit ransomware strain on a variety of different file types and file sizes. The green dotted line represents the threshold which separates ransomware-encrypted files (represented by red dots) from non-ransomware-encrypted files (represented by blue dots).
As an emerging leader in the Microsoft 365 data protection space, Alcion found itself in a unique position to be able to redesign ransomware detection from the ground up. Rather than settling for industry parity, we invested heavily in in-house research and development. As a result, we’ve been able to incorporate findings from the frontier of cybersecurity research and build artificial intelligence models that yield insights with unprecedented granularity.
Don't let your organization become the next ransomware headline. Equip yourself with a defense mechanism that's been purpose-built to combat today's threats and anticipate tomorrow's. Let Alcion's AI-driven platform be your first line of defense against ransomware – you can try Alcion for free! The trial runs for 14 days, and no credit card is required. Find instructions on how you can get a free Microsoft 365 test/sandbox domain in five minutes and use it to trial Alcion. If you have questions or need support, find us on Discord or contact us via our support page.