The Unknown Unknown Problem in Threat Intelligence


There's a quiet assumption baked into most threat intelligence programs: that if we just monitor enough feeds, subscribe to enough platforms, and map enough TTPs to MITRE ATT&CK, we'll eventually have a complete picture of the threat landscape. It's a comforting idea. It's also wrong, and the reason it's wrong has less to do with our tools and more to do with how knowledge itself works.

In 2002, Donald Rumsfeld gave a press briefing that became famous for all the wrong reasons - mocked as evasive double-talk. But buried in that ridiculed statement was a genuinely useful epistemological framework, one that maps almost perfectly onto the structural blind spots every CTI team eventually runs into.

The Four Quadrants, Applied to CTI

Known knowns - this is your bread and butter. CVE-2026-XXXXX affects this product, the patch is here, the exploit is in the wild, here's the advisory. Most of a daily threat brief lives here. It's necessary work, but it's also the easiest work, because the problem has already been named by someone else before it reaches you.

Known unknowns - you know there's a gap, you just haven't filled it yet. "We know APT29 has historically pivoted through compromised SOHO routers, but we don't know if they're using this technique against our client's industry right now." This is where most "intelligence requirements" and collection plans live. You can write a ticket for it. You can task an analyst with closing it.

Unknown knowns - this one is underrated. This is information that exists somewhere in your organization but isn't being used as intelligence. A SOC analyst noticed something weird three months ago and mentioned it in passing on a call. A customer's IT team quietly rotated credentials after an "unrelated" incident. The data is there, but nobody connected it to the threat picture because nobody asked the right question at the right time.

Unknown unknowns - and here's the one that actually keeps CTI leads up at night. These are the threats, techniques, infrastructure patterns, or actor motivations that you have no framework to even look for, because your mental model of the threat landscape doesn't have a slot for them yet.

Why This Isn't Just Philosophy

Here's where it gets practical. Most CTI tooling - Google Threat Intelligence, Threat Command, any feed-based platform - is fundamentally built to surface known knowns and accelerate known unknowns. You configure a Threat Profile around the actors, sectors, and TTPs you've already decided matter. The Digital Threat Monitoring (DTM) module alerts on brand mentions, keywords, and aliases you've already defined. Rapid7's modules score IOCs against indicators that someone, somewhere, already classified.

This is not a criticism of the tools - it's their job, and they do it well. But it means the entire intelligence pipeline is shaped like a filter with pre-drilled holes. Anything that doesn't fit through one of those holes simply doesn't appear. It's not flagged as "unknown" - it's just absent, and absence looks identical to "nothing happened."
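To make the "pre-drilled holes" point concrete, here is a minimal sketch of a keyword filter that keeps track of what it drops instead of discarding it silently. It is not how any particular platform works; the watchlist, the event strings, and the names `FilterResult` and `keyword_filter` are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FilterResult:
    matched: list[str] = field(default_factory=list)
    unmatched: list[str] = field(default_factory=list)


def keyword_filter(events: list[str], keywords: set[str]) -> FilterResult:
    """Split events into those that hit a predefined keyword and those that don't.

    Keywords are expected in lowercase; matching is a plain substring test.
    """
    result = FilterResult()
    for event in events:
        text = event.lower()
        if any(keyword in text for keyword in keywords):
            result.matched.append(event)
        else:
            # A typical pipeline would drop this event without a trace.
            result.unmatched.append(event)
    return result


# Hypothetical watchlist and event stream, for illustration only.
WATCHLIST = {"examplecorp"}
EVENTS = [
    "New domain registered: examplecorp-login.test",
    "Forum post offering access to an unnamed logistics firm",
    "Paste mentions ExampleCorp VPN credentials",
]

result = keyword_filter(EVENTS, WATCHLIST)
print(f"alerts: {len(result.matched)}, silently dropped: {len(result.unmatched)}")
```

The only design choice that matters here is that `unmatched` exists at all. Even a bare count of what fell outside the filter turns "nothing happened" into "this many things happened that we had no slot for."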

Think about how a Threat Actor Profile gets built. You start with attributed campaigns, known infrastructure, documented TTPs - all things that became "known" only because someone else, somewhere else, already got compromised, investigated it, and published findings. Your entire early-warning capability is downstream of someone else's incident. By construction, CTI is reactive at the meta level even when individual alerts feel proactive.

Where the Unknown Unknowns Actually Hide

A few patterns I've seen, and that show up repeatedly in postmortems of "how did we miss this":

Novel infrastructure that doesn't pattern-match yet. When a threat actor stands up new C2 infrastructure using a hosting provider, ASN, or TLS fingerprint combination that hasn't been seen before, every signature-based and even most heuristic-based detection will be silent. JARM fingerprints, favicon hashes, and passive DNS pivoting are powerful after you have a seed indicator, but the first sighting of something genuinely new produces no pivot point. It's a needle that doesn't know it's in a haystack yet.

Cross-domain blind spots. A threat intel team focused on brand abuse and phishing may have zero visibility into a geopolitical shift that's about to reshape the actor landscape relevant to their client. The Iran-Israel-US tension dynamics, for instance, don't show up in a DTM dashboard - they show up weeks later as a sudden spike in destructive wiper activity from groups like Handala Hack, by which point it's "known" but the lead time is gone. The unknown unknown wasn't technical; it was the absence of a geopolitical sensing layer feeding into the technical one.

Second-order effects of your own defenses. When an organization rolls out a new email security control, attackers adapt, but the direction of that adaptation is genuinely unknowable in advance. Maybe they shift to QR code phishing. Maybe they pivot to SMS. Maybe they target a third-party vendor instead. The space of possible adaptations is enormous, and you can't build a Threat Profile around a technique that's a creative response to your specific defensive posture, because it doesn't exist anywhere until an adversary invents it.

The "boring" infrastructure. Dynamic DNS services, legitimate cloud providers, residential proxy networks - these get used by actors precisely because they're not on anyone's watchlist. The unknown unknown here isn't a new IOC, it's a new category of normal that's being weaponized. By the time "residential proxies as C2 relay" becomes a documented TTP with a MITRE ID, the actors using it have already moved three steps ahead.

So What Do You Actually Do About This?

You can't monitor for what you can't conceptualize - that's the whole point. But there are structural moves that shrink the unknown-unknown space, even if they can't eliminate it.

Build in deliberate "weak signal" channels that bypass your filters. Most alerting is threshold-based and keyword-based by necessity, but that means anything below threshold or outside the keyword list is invisible. Some teams solve this by having analysts spend a fixed percentage of time on unstructured exploration - just reading, with no specific IOC in mind, across forums, research blogs, and adjacent industry advisories. It feels unproductive. It's actually where most "we should add this to our Threat Profile" moments come from.
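One way to give a weak-signal channel a mechanical foothold, alongside the unstructured reading time, is to route a small random sample of below-threshold items to a human. The sketch below assumes each item carries a numeric `score`; the function name, the field names, and the numbers are hypothetical, not taken from any platform.

```python
import random


def weak_signal_sample(items, threshold, sample_size, seed=None):
    """Return (alerts, sample).

    alerts: items scoring at or above the threshold (the normal alert queue).
    sample: a random draw from everything below the threshold, so an analyst
            sees some of what the threshold would otherwise hide.
    """
    alerts = [item for item in items if item["score"] >= threshold]
    below = [item for item in items if item["score"] < threshold]
    rng = random.Random(seed)
    sample = rng.sample(below, min(sample_size, len(below)))
    return alerts, sample


# Hypothetical scored items, for illustration only.
ITEMS = [
    {"id": "a", "score": 0.91},
    {"id": "b", "score": 0.42},
    {"id": "c", "score": 0.15},
    {"id": "d", "score": 0.08},
]

alerts, sample = weak_signal_sample(ITEMS, threshold=0.8, sample_size=2, seed=7)
print("alert queue:", [item["id"] for item in alerts])
print("weak-signal review:", [item["id"] for item in sample])
```

The draw is uniform on purpose. Ranking the sample by score would rebuild the same filter one level down.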

Treat your own assumptions as a documented, falsifiable list. Every Threat Profile encodes assumptions: "this actor targets financial services," "this TTP requires initial access via phishing." Write these assumptions down explicitly, and periodically ask two questions: what would have to be true for this assumption to be wrong, and would we even notice if it became wrong? This converts some unknown unknowns into known unknowns just by making the implicit explicit.
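An assumption list can be as simple as a small structured record per assumption. The sketch below is one possible shape, not a standard: the field names, the 90-day review interval, and the example entries are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Assumption:
    statement: str         # what the Threat Profile takes for granted
    falsified_by: str      # what evidence would show it is wrong
    detection_source: str  # where that evidence would surface; "" if nowhere
    last_reviewed: date


def needs_attention(a: Assumption, today: date, max_age_days: int = 90) -> list[str]:
    """Return the reasons (possibly none) this assumption should be revisited."""
    reasons = []
    if not a.detection_source:
        reasons.append("nothing we collect would reveal this becoming wrong")
    if today - a.last_reviewed > timedelta(days=max_age_days):
        reasons.append("review overdue")
    return reasons


# Hypothetical register entries, for illustration only.
REGISTER = [
    Assumption(
        statement="This actor targets financial services",
        falsified_by="Confirmed targeting of another sector",
        detection_source="sector ISAC reporting",
        last_reviewed=date(2025, 1, 10),
    ),
    Assumption(
        statement="This TTP requires initial access via phishing",
        falsified_by="Intrusion using the TTP with a different entry vector",
        detection_source="",
        last_reviewed=date(2025, 5, 1),
    ),
]

for entry in REGISTER:
    for reason in needs_attention(entry, today=date(2025, 6, 1)):
        print(f"{entry.statement!r}: {reason}")
```

An empty `detection_source` is the useful signal. It marks an assumption that could become wrong without anyone noticing, which answers the second question directly.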

Diversify the epistemics of your sources, not just the count. Ten feeds that all source from the same handful of researchers give you ten known-knowns, not ten perspectives. A genuinely different epistemic source - a regional-language forum, a sector-specific ISAC with different visibility, an academic researcher working on something tangential - is more valuable for unknown-unknown reduction than another commercial feed covering the same ground.
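A rough way to check whether several feeds are really one perspective is to measure how much their indicators overlap. The sketch below uses Jaccard similarity on indicator sets. Treat it as a proxy only: high overlap hints at shared upstream sourcing but doesn't prove it, and low overlap doesn't guarantee independence. The feed names and indicators are invented.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pairwise_overlap(feeds: dict[str, set[str]]) -> dict[tuple[str, str], float]:
    """Jaccard similarity for every pair of feeds, keyed by sorted feed-name pairs."""
    return {
        (x, y): jaccard(feeds[x], feeds[y])
        for x, y in combinations(sorted(feeds), 2)
    }


# Hypothetical feeds using documentation-range indicators, for illustration only.
FEEDS = {
    "feed_a": {"192.0.2.10", "192.0.2.11", "bad.example.test"},
    "feed_b": {"192.0.2.10", "192.0.2.11", "bad.example.test"},
    "feed_c": {"198.51.100.7", "other.example.test"},
}

for pair, score in pairwise_overlap(FEEDS).items():
    print(pair, round(score, 2))
```

Two feeds that score close to 1.0 against each other are adding volume, not visibility. The one that overlaps least with everything else is the one worth a closer look.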

Build feedback loops from "unknown known" to "known known." This is the cheapest win and the most underused one. Your own SOC, IR team, and customer-facing staff have already encountered an enormous amount of information that was never formalized. A lightweight, low-friction way for frontline staff to flag "this felt odd" without needing to prove it's significant often surfaces patterns that no external feed would catch, because the pattern is specific to your environment.

Accept that some intelligence value comes from being wrong fast, not right slow. A hypothesis like "we think there's a connection between this infrastructure cluster and this actor, but we're not confident" has more operational value circulated early - even if later disproven - than a fully validated assessment that arrives after the relevant window has closed. The unknown-unknown problem is partly a confidence threshold problem; teams that require near-certainty before sharing anything will systematically under-detect novel activity.

The Honest Conclusion

The unknown unknown problem doesn't have a solution in the sense of a tool you can buy or a framework you can fully implement. What it has is a posture - one of structural humility about the limits of any intelligence apparatus, no matter how well-resourced.

The CIA triad, the Pyramid of Pain, MITRE ATT&CK, the Diamond Model - all of these are extraordinarily useful models. But a model is, by definition, a simplification, and every simplification has an edge where it stops describing reality. The unknown unknowns live exactly at those edges. The job of a mature CTI program isn't to pretend the edges don't exist - it's to know roughly where your edges are, and to build just enough redundancy and weak-signal capacity that when something walks in from outside the model, you have at least a chance of noticing the footprints before you see the thing itself.

That's not a comfortable place to end an intelligence report. But comfort was never really the deliverable.

Until next time, thank you for reading :)