The UEBA Illusion: Why Traditional UEBA Falls Short

Detection Strategies
March 28, 2025

User and Entity Behavior Analytics (UEBA) has been put on a pedestal by security teams thanks to its promise to uncover insider threats, malicious behaviors, and sophisticated attacks through the power of advanced analytics. Yet many of us who’ve deployed these solutions know the reality paints a far less beautiful picture…

In this article, I’ll share my personal experiences with implementing UEBA (more than once!) and highlight why traditional approaches often fail to deliver actionable results. Then, I’ll explore how a detection engineering-driven method–one that enriches and contextualizes data before data science steps in–promises to radically improve the signal-to-noise ratio for SOC teams while actually delivering on UEBA’s promises.

My UEBA Journey: A Tale of Two Organizations–Chasing the ROI Carrot

Part 1: Many Months… Many Many Many Many Months… (ref. 50 Cent…)

The first time I implemented a UEBA product for a Fortune 200 organization, I quickly learned that the deployment is only half of the battle. Once the tool was installed, configured, and pointed at the right data feeds, the vendor told me: “Now we just need a few months to baseline your environment.”

  • Why so long? The system attempts to normalize the endless heap of endpoint, network, and user data and define what “normal” looks like.
  • The result? A painful waiting game. No real value for the SOC until the system finishes learning, and believe me, what it “learned” wasn’t anything impressive. But hey, checkbox checked for our insider threat program requirement.

Of course, once those months passed, we saw fewer anomalies being raised to our triage teams, but ironically, I still had an avalanche of worthless, un-triageable alerts wreaking havoc on our response teams. The tool flagged everything from sysadmin patching to normal application upgrades as “suspicious”, and from its point of view, they were! In the system’s defense, it had absolutely no context of what threats actually looked like, even when a real threat smacked the models that powered it right in the face. By the time I filtered out the noise, we’d spent valuable weeks explaining to the triage team why these so-called anomalies were actually just business as usual.

Part 2: Black Box System, Black Box Anomalies, Bird Box Analysts?

If you don’t know me, I can’t stand not knowing why something isn’t working. So, I dove into the backend of our UEBA product, analyzing how thresholds were calculated and how data points were ingested and leveraged. After a month of tuning thresholds, reconfiguring data pulls, adding model-specific backend filtering, and adjusting the baseline periods, I had a partial success story–but at what cost?

  • As the first two letters of UEBA suggest, only I (the user) and any higher-level Entity knew about these custom tweaks, so the SOC was effectively blind if I was OOO.
  • My triage analysts still faced alerts like “Anomalous Network Activity” with minimal context about why the activity was suspicious.
  • The correlation logic was primarily statistical, lacking the security intelligence that explains what makes an event suspicious, along with core enrichment properties like known TTP references.

In short, the alerts lacked context. Meanwhile, custom detection rules my team had built (with real security logic) were catching more legitimate threats–and insider threats–than this big, expensive platform. And this story repeated at another Fortune 200 environment, plus multiple consulting gigs. Over and over, we faced:

  1. Long baselining periods–delaying ROI by months.
  2. Noisy “anomalies” that lacked clear investigation paths.
  3. Limits on data ingestion–if the UEBA product didn’t receive the exact data points it expected, all bets were off.
  4. Complexities of large networks that can make even datasets with all the right data points completely unusable and downright detrimental to the product’s capabilities (like “impossible travel” that is actually normal due to multiple proxies, VPNs, and dynamic routing).

At the end of these implementations, I am not quite sure we ever got that UEBA ROI carrot in our hands; it always seemed just out of reach.

Why Normalized Data Alone Isn’t Enough

Many UEBA tools do attempt to normalize data. They map logs from firewalls, endpoints, authentication systems, and more into a centralized schema. The premise sounds great: once everything’s categorized into like-schemas, the product can run advanced models on top of it and leverage each data domain as needed, regardless of the underlying data sets.
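
To make that concrete, here is a minimal sketch of what normalization does (and doesn’t do). The field names and schema below are purely illustrative, not any particular vendor’s data model:

```python
# Hypothetical normalizers: two very different raw log shapes mapped into one
# common schema so analytics can treat them uniformly.

def normalize_firewall(raw: dict) -> dict:
    return {
        "event_time": raw["ts"],
        "user": raw.get("usr"),
        "src_ip": raw["src"],
        "dst_ip": raw["dst"],
        "action": raw["act"],            # e.g. "allow" / "deny"
        "source_type": "firewall",
    }

def normalize_auth(raw: dict) -> dict:
    return {
        "event_time": raw["timestamp"],
        "user": raw["account_name"],
        "src_ip": raw.get("client_ip"),
        "dst_ip": None,
        "action": raw["outcome"],        # e.g. "success" / "failure"
        "source_type": "authentication",
    }

# Both records now share a schema a model can consume, but nothing here says
# anything about TTPs, intent, or threat intel -- the data is tidy, not
# contextualized.
```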

The Problem of “Data Science on Raw” 

“Normalized” does not mean “security contextualized.” These models still chase anomalies in a broad dataset. Something might be an outlier mathematically, but that doesn’t mean it’s malicious.

  • Admin Activity: OS patching, system troubleshooting, and network & infrastructure changes might look different from the normal daily log sets, but they’re benign.
  • Application Rollouts: A big software update might trigger resource spikes or new processes taking effect to support daily functionality, or it may just be part of an upgrade script–again, unusual but safe.
  • Unpredictable Network Events: The mix of external proxies, VPNs, and SSO/IDP providers can resemble “impossible travel,” but it’s simply how your environment is structured.

Without tying these events to known threat intel or analyzing their behaviors in the context of real TTPs, analysts see a never-ending stream of ambiguous alerts that read like “could be suspicious… or not.” That’s how you end up with two months of “baselining,” only to realize the anomalies never quite settle down.

Rethinking UEBA: Detection Engineering First, Data Science Second

After facing these challenges multiple times, I began shifting my approach. Instead of letting a UEBA solution “see everything” in raw or even normalized logs, why not feed it a curated stream of alerts–each containing real security context?

Step 1: Create Atomic-Level Detections

Rather than waiting for a black-box model to declare something “weird,” detection engineers define atomic detections that address actual TTPs and suspicious patterns. This can be a combination of behavioral, signature, and composite-based detections, such as:

  • Parent-child process anomalies such as PowerShell spawning rundll32
  • Remote admin tool utilization such as PSExec, SMBExec, WinRM, WMI, etc.
  • Unusual file modifications that align with known ransomware behaviors
  • Network calls to known malicious or suspicious domains or IPs
  • Insider Threat-specific behaviors such as employees on job boards, exfil to cloud drives, emailing their personal emails, etc.

Each detection includes relevant context (a brief sketch follows this list):

  • MITRE ATT&CK references
  • Threat Intel/CMDB Lookups such as domain reputations, known malicious IOCs, asset & identity information, etc.
  • Metadata about the detection use case itself and what it is trying to achieve, whether it should be triaged, how it should be triaged, confidence and severity scoring, etc.
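
Putting those two lists together, here’s a minimal sketch of how an atomic detection and its context might be expressed. The class, field names, and example values are my own illustration under these assumptions, not a specific product’s format:

```python
from dataclasses import dataclass, field

@dataclass
class AtomicDetection:
    """One purpose-built detection with its security context attached."""
    name: str
    description: str                 # why firing this is suspicious
    mitre_attack: list[str]          # ATT&CK technique IDs it maps to
    severity: str                    # e.g. "high"
    confidence: str                  # e.g. "medium"
    triage: str                      # guidance for the analyst
    enrichments: list[str] = field(default_factory=list)

    def matches(self, event: dict) -> bool:
        raise NotImplementedError

class PowerShellSpawnsRundll32(AtomicDetection):
    """Parent-child process anomaly: powershell.exe spawning rundll32.exe."""
    def matches(self, event: dict) -> bool:
        return (
            event.get("parent_process", "").lower().endswith("powershell.exe")
            and event.get("process", "").lower().endswith("rundll32.exe")
        )

detection = PowerShellSpawnsRundll32(
    name="PowerShell spawning rundll32",
    description="rundll32 launched by PowerShell, a common proxy-execution pattern",
    mitre_attack=["T1059.001", "T1218.011"],   # PowerShell; Rundll32
    severity="high",
    confidence="medium",
    triage="Review the rundll32 command line and loaded DLL; escalate if unsigned or remote",
    enrichments=["asset/identity info from CMDB", "domain reputation for any network IOCs"],
)
```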

Step 2: “Alert Lake”

These atomic alerts–already validated as “worth looking at” by your detection engineering initiatives–are enriched with additional intelligence and stored in a central alert lake. Instead of seeing raw logs, your SOC sees structured alerts that say why they triggered and what TTP they map to.

The result is a smaller dataset, but one that’s high in security relevance. Each alert is effectively “labeled” with a reason for suspicion.
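
As a rough illustration, continuing the sketch above, a fired detection could be shaped into a self-describing record and appended to the lake. The JSONL file here is just a stand-in; in practice the destination might be a table in your SIEM or data warehouse:

```python
import json
from datetime import datetime, timezone

def to_alert_record(detection, event: dict, enrichment: dict) -> dict:
    """Shape a fired detection into a structured, labeled alert."""
    return {
        "alert_time": datetime.now(timezone.utc).isoformat(),
        "detection_name": detection.name,
        "why_it_fired": detection.description,
        "mitre_attack": detection.mitre_attack,
        "severity": detection.severity,
        "confidence": detection.confidence,
        "triage_guidance": detection.triage,
        "entity": {"host": event.get("host"), "user": event.get("user")},
        "enrichment": enrichment,        # e.g. CMDB owner, domain reputation
        "raw_event": event,              # keep the evidence, not just the verdict
    }

def write_to_alert_lake(record: dict, path: str = "alert_lake.jsonl") -> None:
    """Append one labeled alert to a newline-delimited JSON 'alert lake'."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
```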

Step 3: Data Science on Context-Rich Signals

Only now does your data science team step in. It doesn’t have to rummage through millions of log lines searching for ephemeral anomalies. Instead, it consumes a curated feed of suspicious events (a simplified sketch follows the list below):

  • Clustering can reveal if multiple suspicious events across different systems share a commonality.
  • Sequential analysis can detect multi-stage attacks by linking related alerts in a short timeframe.
  • Machine learning can rank these alerts by likelihood of being malicious and escalate anomalies from an alert pool of pre-determined “weird,” rather than relying on the traditional means of baselining normal behavior. ML applied downstream means fewer hamster-wheel chases.
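
Here is a deliberately simple sketch of what that downstream analysis can look like, continuing the alert-lake format above. It is a crude stand-in for real clustering or ML models, and the window and threshold are arbitrary assumptions, but it shows the key point: the inputs are already-labeled alerts, not raw logs.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def correlate_alerts(alerts: list[dict], window: timedelta = timedelta(hours=24)) -> list[dict]:
    """Group labeled alerts by entity and escalate entities that trip several
    distinct ATT&CK techniques within the window (a toy proxy for the
    clustering / sequencing / ranking approaches described above)."""
    by_entity = defaultdict(list)
    for alert in alerts:
        key = (alert["entity"].get("host"), alert["entity"].get("user"))
        by_entity[key].append(alert)

    escalations = []
    for (host, user), group in by_entity.items():
        group.sort(key=lambda a: a["alert_time"])
        first = datetime.fromisoformat(group[0]["alert_time"])
        last = datetime.fromisoformat(group[-1]["alert_time"])
        techniques = {t for a in group for t in a["mitre_attack"]}
        if last - first <= window and len(techniques) >= 3:
            escalations.append({
                "host": host,
                "user": user,
                "alert_count": len(group),
                "distinct_techniques": sorted(techniques),
            })
    return escalations
```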

(You can find more about implementing data science at this level within my Detection Engineering and Escalation Recommendations framework here.)

With detection engineering taking the first pass, you’re not looking for random outliers. You’re analyzing purpose-built alerts. This dramatically reduces the “noise” problem while maximizing your team’s confidence in each alert that hits their queue. 

Case in Point: Faster Time to Value, Smarter Triage

In my experience, once we pivoted to a “detections first” pipeline, we saw:

  1. Fewer Overwhelmed Analysts: Triage teams no longer scrolled through hundreds of “anomaly” alerts for routine events or spent unnecessary time determining the validity of an alert output.
  2. Shorter Baselining Windows: Data science efforts now revolve around already-labeled signals, leading to better outputs and faster turnaround on new R&D efforts without the need for heavy baselining.
  3. Clearer Investigation Paths: Each alert carried explicit reasons for suspicion, letting analysts tie an event back to known TTPs or a set of malicious behaviors.
  4. Real Threat Coverage: Because detection engineers anchored the logic in real attacker techniques, the pipeline caught more genuine threats with less guesswork.

Evolving Past Traditional UEBA

Traditional UEBA places too much faith in pure anomaly detection on raw (or only superficially normalized) data. My personal experiences confirm that you can spend months baselining and tuning only to deliver minimal value. Instead, a detection engineering-first approach addresses the fundamental gap: raw data rarely translates directly to a security threat without deeper context and labeling. 

  1. Build: Create atomic, context-rich detections.
  2. Collect: Store those alerts in an “alert lake,” ensuring everything is traceable back to a known TTP or suspicious condition.
  3. Correlate: Let data science handle these curated alerts, surfacing advanced patterns that single detections might miss. 

UEBA’s core idea–examining user and entity behavior–remains valid. The key is to do it after you’ve injected security context. That’s how you move from months of frustrating “baselining” to a truly valuable analytics engine that triage teams can trust. 

Kevin Gonzalez
VP of Security, Operations, and Data
