Computer & Information Science & Engineering
NLPC Laboratory FINS

GTA3 Workshop @ ICDM 2025

Navigating the Blue Nowhere:
A Framework for Mapping Validated
Adversarial Trajectories

Shlok Gilda, Karsten Martiny, Justin Ho,
Laura Tinnel, Grit Denker, and Bonnie J. Dorr

University of Florida & SRI International

The Challenge: From Chaos to Clarity

🎯 Goal

Construction of a Knowledge Graph from CTI reports by translating natural language inputs to formal knowledge

Example Input1

"Sandworm was first observed in the victim's environment in June 2022, when the actor deployed the Neo-REGEORG webshell..."

↓

Knowledge Graph

Entities connected by relationships

Sandworm

Actor

uses

→

Neo-REGEORG

Software

Enables temporal analysis & pattern clustering
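The graph fragment above can be sketched as plain (subject, relation, object) triples; entity names and the "uses" relation mirror the slide's example and are not a fixed CBPE API:

```python
# Minimal sketch of the example KG fragment as (subject, relation, object) triples.
# Names follow the slide's example ("Sandworm" uses "Neo-REGEORG"), not a fixed API.
triples = [("Sandworm", "uses", "Neo-REGEORG")]
entity_types = {"Sandworm": "Actor", "Neo-REGEORG": "Software"}

def neighbors(graph, subject):
    """(relation, object) pairs for a subject -- the basis for path queries."""
    return [(rel, obj) for s, rel, obj in graph if s == subject]

print(neighbors(triples, "Sandworm"))  # → [('uses', 'Neo-REGEORG')]
```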

The Problem with LLMs

LLMs are unreliable for CTI and hallucinate, creating dangerous misrepresentations.2

⚠️

Performance degrades on complex reports

🔄

Inconsistent outputs across runs

❌

High confidence in wrong answers

1 Google/Mandiant. (2022). Sandworm Disrupts Power in Ukraine Using a Novel Attack Against Operational Technology.

2 Mezzi, E., et al. (2025). Large Language Models are Unreliable for Cyber Threat Intelligence. Springer.

Core Contributions

The Cyber Behavior Pattern Extractor (CBPE) is a neuro-symbolic framework that generates validated, temporally aware knowledge graphs from multi-modal CTI sources

🔗

Neuro-Symbolic Pipeline

Multi-modal pipeline for transparent and reliable KG construction that implements the scaffolding paradigm.1

Input

Unstructured reports
(text + visuals)

→

Output

Validated
Knowledge Graph

LLM extraction + Schema validation

Scaffolding: Robust structure around LLMs to ensure reliability through constraints and validation

✓

Automated Validation

Two-stage validation loop corrects hallucinations without reliance on pre-existing trusted KGs.2

Syntactic Validation

+

Semantic Validation

Two-stage loop: Schema conformance, source verification, hallucination correction

⏱️

Time-Ordered Extraction

Extracts time-ordered TTP patterns to construct attacker playbooks for sequential prediction.

1
→
2
→
3

Capture temporal relationships

Graph structure enables: Path queries, pattern clustering, sequential prediction

1 Mezzi, E., et al. (2025). Large Language Models are Unreliable for Cyber Threat Intelligence. Springer.

2 Wu, Z., et al. (2024). KGV: Integrating Large Language Models with Knowledge Graphs for Cyber Threat Intelligence Credibility Assessment. arXiv.

The Foundation: MITRE ATT&CK

What is MITRE ATT&CK?

The globally recognized knowledge base of adversary tactics, techniques, and procedures (TTPs) based on real-world observations.1

✓

Standard framework for describing cyberattacks

✓

Used by security teams worldwide

✓

Provides common language for threat intelligence

ATT&CK Hierarchy Example

TACTIC

→

TECHNIQUE

→

SUB-TECHNIQUE

Initial Access

→

Phishing

→

Spearphishing Attachment

(T1566.001)

How CBPE Uses MITRE ATT&CK:

CBPE extends the MITRE ATT&CK ontology to capture temporal attack sequences, creating validated knowledge graphs that model dynamic adversarial behavior over time.

We ingest and process the cited threat reports referenced by MITRE with our framework to enrich the knowledge base.

1 MITRE ATT&CK. (2025). https://attack.mitre.org

The CBPE Pipeline

System Architecture

📚

Threat Intelligence Data

PDF, HTML, MITRE data

→
⚙️

Pre-processing

Sentences with section context

→
🤖

LLM-based Formalization

w.r.t. given schema

📋
Formal Schema
↓
→

Candidate CST

✓

Validation

pass

→
⚙️

Post-processing

Valid KG instances

→
🕸️

Threat Intelligence KG

Validated knowledge

fail

Key Innovation: Automated validation loop ensures data fidelity by detecting and correcting LLM hallucinations through iterative feedback

Step 1: Multi-Modal Preprocessing

Text Ingestion

CBPE ingests reports and preserves document structure (chapters, sections, paragraphs) for coherent context.

Text Input:1

"Sandworm was first observed in the victim's environment in June 2022, when the actor deployed the Neo-REGEORG webshell on an internet-facing server."

Visual Processing

Vision-Language Models (VLMs) extract information from diagrams, timelines, and screenshots.

Visual Input (Attack Chain):2

Attack Chain Diagram

VLM Output Example

"A four-step attack chain is depicted. First, an ISO file ('a.iso') is mounted on a MicroSCADA server, leading to the execution of 'n.bat'. Second, 'n.bat' executes 'Scilc.exe', which is installed on the server. Third, 'Scilc.exe' creates a file 's1.txt'. Finally, the server communicates with an RTU using IEC-104/101 protocols."

1 Google/Mandiant. (2022). Sandworm Disrupts Power in Ukraine Using a Novel Attack Against Operational Technology.

Pipeline: Formalization

Structuring Unstructured Text

The LLM transforms unstructured text into Concrete Syntax Trees (CSTs) guided by a formal schema.

What is a CST?

Concrete Syntax Trees are parse trees defined by a formal grammar. We use them as knowledge units that formally capture the full semantic content of individual input sentences.

Why CSTs?

CSTs bridge the gap between raw text and knowledge graphs. They preserve meaning of individual sentences in a structured form.

Key Features

The CST captures entities (actors, software), actions (deployed), temporal context (June 2022), and relationships in a structured format ready for validation.

Note: This is a domain-specific modeling choice

Source Text Example:

"Sandworm was first observed in June 2022, when the actor deployed the Neo-REGEORG webshell. Roughly one month later, Sandworm deployed GOGETTER..."

Generated CST Structure

ReportChunkCST(
  actors: ["Sandworm"],
  timeline: [
    EventEntity(
      id: "event_1",
      temporal_descriptor: "June 2022",
      actors: ["Sandworm"],
      software: ["Neo-REGEORG"]
    ),
    EventEntity(
      id: "event_2",
      temporal_descriptor: 
        "Roughly one month later",
      preceding_event_ids: ["event_1"],
      actors: ["Sandworm"],
      software: ["GOGETTER"]
    )
  ]
)
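A rough Python analogue of this structure, using stdlib dataclasses (field names mirror the slide; the actual CBPE schema is richer):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical dataclass mirror of the CST shown above; illustrative, not CBPE's schema.
@dataclass
class EventEntity:
    id: str
    temporal_descriptor: str
    actors: List[str]
    software: List[str]
    preceding_event_ids: List[str] = field(default_factory=list)

@dataclass
class ReportChunkCST:
    actors: List[str]
    timeline: List[EventEntity]

cst = ReportChunkCST(
    actors=["Sandworm"],
    timeline=[
        EventEntity("event_1", "June 2022", ["Sandworm"], ["Neo-REGEORG"]),
        EventEntity("event_2", "Roughly one month later", ["Sandworm"],
                    ["GOGETTER"], preceding_event_ids=["event_1"]),
    ],
)
```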

Pipeline: Two-Stage Validation

Validation

Stage 1:
Syntactic Validation

Is the CST structurally sound and typed correctly?

Stage 2:
Semantic Validation

Is the CST factually accurate compared to the source?

Stage 1: Syntactic Validation

Type Checking

Each field in the CST is checked for conformance to the schema's data types.

Example:

Schema requires: sequence_index: int

❌ Invalid CST:

{"sequence_index": "second"}

✓ Valid CST:

{"sequence_index": 2}

Error Detection

The system automatically provides specific feedback to the LLM for correction when malformed data is detected.

Common errors:

Missing required fields

Event missing "actors" field

Incorrect data types

Expected integer, got string

Invalid temporal formats

"June" instead of "2022-06-01"
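The checks listed above can be sketched as a small validator that emits corrective feedback strings (field names are illustrative, not CBPE's exact schema):

```python
# Sketch of syntactic validation: type/shape checks that produce human-readable
# error strings suitable as corrective feedback to the LLM. Field names are illustrative.
def validate_event(event: dict) -> list:
    errors = []
    if "actors" not in event:
        errors.append('Event missing "actors" field')
    idx = event.get("sequence_index")
    if idx is not None and not isinstance(idx, int):
        errors.append(f"sequence_index: expected integer, got {type(idx).__name__}")
    return errors

print(validate_event({"sequence_index": "second"}))
# Reports both the missing field and the type mismatch
```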

Stage 2: Semantic Validation

🔄

Reformulation

The LLM converts the validated CST back into natural language text.

Example: CST → "Sandworm deployed GOGETTER."

⚖️

Comparison

An evaluator LLM checks if the reformulated text is semantically equivalent to the original source.

Example: Compare "Sandworm deployed GOGETTER" with original text.

Why use an LLM to validate an LLM? Extraction is complex, while equivalence checking is a constrained comparison task where LLMs perform reliably.1

1 Y. Hayashi (2025). Evaluating LLMs' Capability to Identify Lexical Semantic Equivalence. COLING.

Validation in Action

"...Sandworm deployed GOGETTER..."

Let's see what the LLM extracts

β†’ Running Initial Extraction

Attempt 1: LLM Extraction

Generated CST

{
  "event_2": {
    "actors": [{"name": "GOGETTER"}],
    "software": [{"name": "Sandworm"}],
    "actions": [{"name": "deployed"}]
  }
}

Reformulation

"GOGETTER deployed Sandworm."

❌ Semantic Validation: FAILED

Reason: Original text states 'Sandworm deployed GOGETTER', but the CST claims 'GOGETTER deployed Sandworm'.

⟳ FEEDBACK SENT TO LLM

System generates corrective feedback automatically

Attempt 2: Corrected Extraction

Generated CST (Corrected)

{
  "event_2": {
    "actors": [{"name": "Sandworm"}],
    "software": [{"name": "GOGETTER"}],
    "actions": [{"name": "deployed"}]
  }
}

Reformulation

"Sandworm deployed GOGETTER."

✓ Semantic Validation: PASSED

The reformulated text is semantically equivalent to the source. The data is validated and ready for the Knowledge Graph.
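The extract/reformulate/judge loop shown here can be sketched as follows; `extract`, `reformulate`, and `judge` stand in for LLM calls and are illustrative names, not CBPE's actual interfaces:

```python
# Sketch of the semantic-validation retry loop. extract/reformulate/judge are
# stand-ins for LLM calls (illustrative names, not CBPE's actual API).
def validate_with_retries(source, extract, reformulate, judge, max_attempts=3):
    feedback = None
    for _ in range(max_attempts):
        cst = extract(source, feedback)       # LLM extraction (with feedback, if any)
        restated = reformulate(cst)           # CST -> natural language
        ok, reason = judge(source, restated)  # evaluator LLM: semantically equivalent?
        if ok:
            return cst                        # validated, ready for the KG
        feedback = reason                     # retry with automatic corrective feedback
    return None                               # give up; flag for human review
```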

Pipeline: Postprocessing

Enrich the Knowledge Graph

Validated CSTs enrich the final KG. This step performs entity resolution, merging duplicates like "Sandworm" and "APT44".

Example: Add New Fact

Validated CST for GOGETTER
⬇
USES_SOFTWARE(Sandworm, GOGETTER)
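Entity resolution during this step can be sketched with a simple alias map (the APT44/Sandworm pair follows the slide; the alias source, e.g. ATT&CK group aliases, is an assumption):

```python
# Sketch of entity resolution in post-processing: canonicalize aliases before
# adding facts, so duplicates like "Sandworm"/"APT44" merge into one node.
ALIASES = {"APT44": "Sandworm"}  # illustrative alias pair from the slide

def canonical(name):
    return ALIASES.get(name, name)

def add_fact(kg, subj, rel, obj):
    kg.add((canonical(subj), rel, canonical(obj)))

kg = set()
add_fact(kg, "Sandworm", "USES_SOFTWARE", "GOGETTER")
add_fact(kg, "APT44", "USES_SOFTWARE", "GOGETTER")  # duplicate under an alias
assert kg == {("Sandworm", "USES_SOFTWARE", "GOGETTER")}
```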

Building Intelligence Over Time

MITRE ATT&CK
→
Report 1
→
Report 2
→
Unified KG
Initial State
(from MITRE ATT&CK)
Sandworm → USES_SOFTWARE → CaddyWiper
After Report 1
Sandworm → USES_SOFTWARE → CaddyWiper
Sandworm → USES_SOFTWARE → GOGETTER (NEW)
After Report 2
Sandworm → USES_SOFTWARE → CaddyWiper
Sandworm → USES_SOFTWARE → GOGETTER
GOGETTER → USES_LIBRARY → Yamux (NEW DETAIL)

Temporal Pattern Extraction

Narrative Flow

The order of events in a report is crucial for capturing the sequence of adversarial actions for predictive modeling.

Temporal Keywords

Keywords enhance the clarity of attack timelines, providing insight into campaign phases.

Example: "...observed in June 2022... Roughly one month later, Sandworm deployed GOGETTER..."

Source Text

"Sandworm was first observed in the victim's environment in June 2022, when the actor deployed the Neo-REGEORG webshell. Roughly one month later, Sandworm deployed GOGETTER, which proxies communications for its C2 server using Yamux over TLS."

→

Extracted Timeline

June 2022
Deploy Webshell
Neo-REGEORG
↓
July 2022
Deploy Tunneler
GOGETTER with Yamux
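Resolving the relative descriptor against the preceding event's date can be sketched as below; the month lookup and "one month later" rule are toy heuristics for this example, not CBPE's actual resolver:

```python
# Sketch: resolving temporal descriptors into concrete dates. The month lookup and
# the "one month later" rule are toy heuristics for the slide's example only.
from datetime import date
from typing import Optional

MONTHS = {"june": 6, "july": 7}  # toy lookup covering only this example

def resolve(descriptor: str, anchor: Optional[date] = None) -> Optional[date]:
    text = descriptor.lower()
    for name, m in MONTHS.items():
        if name in text:
            return date(2022, m, 1)          # year fixed to the example report
    if "month later" in text and anchor:
        year = anchor.year + (anchor.month == 12)
        return date(year, anchor.month % 12 + 1, 1)
    return None

t1 = resolve("June 2022")                    # 2022-06-01
t2 = resolve("Roughly one month later", t1)  # 2022-07-01
```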

Build "Attacker Playbooks"

By clustering attack timelines, CBPE identifies recurring TTP sequences, enabling defenders to anticipate an adversary's next move.

Key Insight: Detect patterns early to predict and preempt later stages before they unfold.

Example: Ransomware Campaign

1
Spearphishing
(T1566.001)
→
2
OS Credential Dumping
(T1003)
→
3
Lateral Movement
(T1021)
→
4
Data Encrypted
(T1486)
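Playbook-based prediction on this sequence can be sketched as a prefix match against the clustered TTP chain (the playbook content mirrors the slide; the matching rule is an illustrative simplification):

```python
# Sketch of playbook-based prediction: match observed techniques against a known
# clustered TTP sequence and anticipate the next step. Playbook mirrors the slide.
PLAYBOOK = ["T1566.001", "T1003", "T1021", "T1486"]  # spearphish -> dump creds -> move -> encrypt

def predict_next(observed, playbook=PLAYBOOK):
    """Return the next expected technique if the observation matches a playbook prefix."""
    n = len(observed)
    if n < len(playbook) and playbook[:n] == observed:
        return playbook[n]
    return None

print(predict_next(["T1566.001", "T1003"]))  # → T1021 (lateral movement expected next)
```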

Key Takeaways

1. Reliable CTI automation requires robust scaffolding around LLMs, not just scaling models alone.

2. CBPE's automated validation loop detects and corrects hallucinations without requiring pre-existing trusted knowledge graphs.

3. Temporal modeling enables dynamic attacker playbooks that support predictive defense strategies.

4. The validated KG serves as a dynamic repository for confirming known behaviors and assimilating novel threat intelligence.

Thank You

Contact Information

Shlok Gilda

University of Florida

shlokgilda@ufl.edu

GTA3 Workshop @ ICDM 2025