Potential PowerShell Obfuscation via High Numeric Character Proportion

Detects long PowerShell script block content with unusually high numeric character density (high digit-to-length ratio), often produced by byte arrays, character-code reconstruction, or embedded encoded blobs. Attackers use numeric-heavy obfuscation to conceal payloads and rebuild them at runtime to avoid static inspection.

Rule type: esql
Rule indices:

Rule Severity: low
Risk Score: 21
Runs every:
Searches indices from: now-9m
Maximum alerts per execution: 100
References:

Tags:

Domain: Endpoint
OS: Windows
Use Case: Threat Detection
Tactic: Defense Evasion
Data Source: PowerShell Logs
Resources: Investigation Guide

Version: 9
Rule authors:

Elastic

Rule license: Elastic License v2

Setup

PowerShell Script Block Logging must be enabled to generate the events used by this rule (e.g., 4104). Setup instructions: https://ela.st/powershell-logging-setup

Investigation guide

Triage and analysis

Disclaimer: This guide was created by humans with the assistance of generative AI. While its contents have been manually curated to include the most valuable information, always validate assumptions and adjust procedures to match your internal runbooks and incident triage and response policies.

Investigating Potential PowerShell Obfuscation via High Numeric Character Proportion

This rule flags long PowerShell script blocks with unusually digit-dense content. Numeric-heavy script blocks are often used to conceal payloads as byte arrays or character codes that are decoded at runtime. Triage should focus on reconstructing the full script content, determining how it was initiated, and identifying any decoded or executed secondary content.
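
As a triage aid, the following minimal Python sketch shows how a digit-dense fragment made of character codes can be turned back into readable text offline, without executing the script. The helper name `decode_char_codes` and the sample values are illustrative assumptions, not part of the rule or of any Elastic tooling. It only handles plain decimal and 0x-prefixed codes; real samples may add arithmetic, XOR, or compression layers that need further handling.

```python
import re


def decode_char_codes(fragment: str) -> str:
    """Decode decimal or 0x-prefixed character codes found in a fragment.

    Hypothetical helper for offline analysis: it extracts numeric tokens
    and maps each one to its Unicode character. It never runs the script.
    """
    # Hex tokens are matched first so "0x57" is not split into "0" and "57".
    tokens = re.findall(r"0[xX][0-9a-fA-F]+|\d+", fragment)
    codes = [
        int(tok, 16) if tok.lower().startswith("0x") else int(tok)
        for tok in tokens
    ]
    # Skip values outside the Unicode range instead of raising an error.
    return "".join(chr(code) for code in codes if code < 0x110000)


if __name__ == "__main__":
    # Benign sample: decimal codes for the text "Write-Host".
    print(decode_char_codes("87,114,105,116,101,45,72,111,115,116"))
```

If the decoded output is itself numeric-heavy or encoded, repeat the process on that output and record each layer for later hunting.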

Key alert fields to review

user.name, user.domain, user.id: Account execution context for correlation, prioritization, and scoping.
host.name, host.id: Host execution context for correlation, prioritization, and scoping.
file.path, file.directory, file.name: File-origin context when the script block is sourced from an on-disk file.
powershell.file.script_block_text: Script block content that matched the detection logic.
powershell.file.script_block_id, powershell.sequence, powershell.total: Script block metadata to pivot to other fragments or reconstruct full script content when split across multiple events.
Esql.script_block_tmp: Transformed script block where detection patterns replace original content with a marker to support scoring/counting and quickly spot match locations.
Esql.script_block_ratio: Number of characters matching the alert's target character set (digits, for this rule) divided by the total script block length (0-1).
Esql.script_block_pattern_count: Count of matches for the detection pattern(s) observed in the script block content.
powershell.file.script_block_entropy_bits: Shannon entropy of the script block. Higher values may indicate obfuscation.
powershell.file.script_block_surprisal_stdev: Standard deviation of surprisal across the script block. Low values indicate uniform randomness. High values indicate mixed patterns and variability.
powershell.file.script_block_unique_symbols: Count of distinct characters present in the script block.
powershell.file.script_block_length: Script block length (size) context.
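
To build intuition for the statistical fields above, the sketch below computes comparable values in Python from a script block string. It assumes character-level Shannon entropy and a frequency-weighted standard deviation of per-character surprisal; the exact formulas used to populate the powershell.file.* fields are not stated in this guide, so treat the results as approximations. The function name `script_block_stats` and the output keys are hypothetical.

```python
import math
from collections import Counter


def script_block_stats(text: str) -> dict:
    """Approximate entropy-style statistics for a script block.

    Assumption: statistics are computed over single characters. The
    values stored in the alert fields may be derived differently.
    """
    if not text:
        return {"length": 0, "unique_symbols": 0,
                "entropy_bits": 0.0, "surprisal_stdev": 0.0}

    counts = Counter(text)
    total = len(text)

    # Surprisal of a character is -log2 of its relative frequency.
    surprisal = {ch: -math.log2(n / total) for ch, n in counts.items()}

    # Shannon entropy is the frequency-weighted mean surprisal.
    entropy = sum((n / total) * surprisal[ch] for ch, n in counts.items())

    # Spread of surprisal around that mean, weighted the same way.
    variance = sum((n / total) * (surprisal[ch] - entropy) ** 2
                   for ch, n in counts.items())

    return {
        "length": total,
        "unique_symbols": len(counts),
        "entropy_bits": entropy,
        "surprisal_stdev": math.sqrt(variance),
    }


if __name__ == "__main__":
    print(script_block_stats("72,101,108,108,111"))
```

A string in which every character is equally frequent yields a surprisal standard deviation of zero, which matches the "uniform randomness" reading of low values described above.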

Possible investigation steps

Review powershell.file.script_block_text to characterize the numeric content:
- Look for long comma-separated numbers, repeated digit sequences, or 0x-prefixed values that may represent reconstructed bytes.
- Identify string reconstruction patterns (for example, casting numeric values to characters) and any subsequent decoding or decompression logic.
- Note any execution primitives that would run derived content (for example, invoking dynamically built commands or loading content into memory).
If the script is fragmented, use powershell.file.script_block_id, powershell.sequence, and powershell.total to collect the related script block events on the same host.name and user.id, then reconstruct the complete content in the correct order before drawing conclusions.
Establish execution context and scope using host.name, host.id, agent.id, and user.id:
- Determine whether the user context is expected to run PowerShell and whether similar script blocks have occurred recently on the same host or by the same user.
- Look for other alerts on the same host or user that could indicate staging, persistence, or lateral movement.
Assess script origin using file.path and file.directory when present:
- Determine whether the script is sourced from a location consistent with approved administration or automation workflows.
- If the script is file-backed, check for other security telemetry referencing the same path to identify file creation, modification, or repeated execution patterns.
Correlate with adjacent telemetry (as available in your environment) using the host and user pivots above:
- Process execution telemetry near the alert time to identify the PowerShell host process and its parent, and to understand how PowerShell was launched.
- Network telemetry for outbound connections or downloads that could support payload retrieval or command and control.
- File activity for dropped payloads or staging artifacts related to the script content or its on-disk source.
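
The fragment reconstruction step above can be sketched in Python as follows. The sketch assumes the events have already been exported as dictionaries with flattened field names (for example from a search result), and that powershell.sequence is 1-based and runs up to powershell.total. The function name `reconstruct_script_blocks` and the input shape are hypothetical; adapt them to however your environment exports events.

```python
from collections import defaultdict


def reconstruct_script_blocks(events):
    """Rebuild full scripts from fragmented script block events.

    Assumption: each event is a dict with flattened keys, and fragments
    of one script share host.id and powershell.file.script_block_id.
    """
    parts = defaultdict(dict)
    totals = {}

    for event in events:
        key = (event["host.id"], event["powershell.file.script_block_id"])
        # Index fragments by sequence so ordering and duplicates are handled.
        parts[key][event["powershell.sequence"]] = (
            event["powershell.file.script_block_text"]
        )
        totals[key] = event["powershell.total"]

    rebuilt = {}
    for key, fragments in parts.items():
        # Report gaps rather than silently returning partial content.
        missing = [seq for seq in range(1, totals[key] + 1)
                   if seq not in fragments]
        rebuilt[key] = {
            "text": "".join(fragments[seq] for seq in sorted(fragments)),
            "complete": not missing,
            "missing": missing,
        }
    return rebuilt
```

Checking the `missing` list before analysis helps avoid drawing conclusions from a script that is only partially collected.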

False positive analysis

Legitimate scripts that embed binary content as numeric arrays (for example, packaging resources into scripts or deploying configuration blobs) can appear digit-dense.
Administrative tooling that generates large reports, inventories, or exports may include extensive numeric identifiers and constants.
Some legitimate security or management products may produce numeric-heavy PowerShell content as part of automation; validate against known software, expected execution accounts, and change windows.

Response and remediation

If malicious behavior is suspected, contain the affected host to prevent further execution and reduce the risk of follow-on activity.
Preserve the script content from powershell.file.script_block_text (and any reconstructed multi-part content) for deeper analysis and to support incident response and retrospective hunting.
If file.path is present and the source is not authorized, remove or quarantine the script and investigate related host artifacts and execution mechanisms.
Investigate potential account compromise for the associated user.id by reviewing recent authentication and endpoint activity; take credential and session remediation actions in line with your procedures.
Hunt for related activity using host.id, agent.id, user.id, and distinctive script patterns identified during triage to find additional impacted systems.
Apply preventive controls based on findings, such as tightening PowerShell usage for affected accounts, improving script provenance controls, and enhancing monitoring for similar obfuscation patterns.

Rule Query

from logs-windows.powershell_operational* metadata _id, _version, _index
| where event.code == "4104"

// Filter out smaller scripts that are unlikely to implement obfuscation using the patterns we are looking for
| eval Esql.script_block_length = length(powershell.file.script_block_text)
| where Esql.script_block_length > 1000

// replace the patterns we are looking for with the 🔥 emoji to enable counting them
// The emoji is used because it's unlikely to appear in scripts and has a consistent character length of 1
| eval Esql.script_block_tmp = replace(powershell.file.script_block_text, """[0-9]""", "🔥")

// count how many patterns were detected by calculating the number of 🔥 characters inserted
| eval Esql.script_block_pattern_count = Esql.script_block_length - length(replace(Esql.script_block_tmp, "🔥", ""))

// Calculate the ratio of numeric characters to total length
| eval Esql.script_block_ratio = Esql.script_block_pattern_count::double / Esql.script_block_length::double

// keep the fields relevant to the query, although this is not needed as the alert is populated using _id
| keep
    Esql.script_block_pattern_count,
    Esql.script_block_ratio,
    Esql.script_block_length,
    Esql.script_block_tmp,
    powershell.file.*,
    file.directory,
    file.path,
    powershell.sequence,
    powershell.total,
    _id,
    _version,
    _index,
    host.name,
    host.id,
    agent.id,
    user.id

// Filter for scripts with high numeric character ratio
| where Esql.script_block_ratio > 0.5

// Exclude noisy Windows Defender and SentinelOne patterns
| where not (
    file.directory == "C:\\ProgramData\\Microsoft\\Windows Defender Advanced Threat Protection\\Downloads" or
    file.directory like (
        "C:\\\\ProgramData\\\\Microsoft\\\\Windows Defender Advanced Threat Protection\\\\DataCollection*",
        "C:\\\\Program Files\\\\SentinelOne\\\\Sentinel Agent*"
    )
  )
  // ESQL requires this condition, otherwise it only returns matches where file.directory exists.
  or file.directory is null
| where not powershell.file.script_block_text like "*[System.IO.File]::Open('C:\\\\ProgramData\\\\Microsoft\\\\Windows Defender Advanced Threat Protection\\\\DataCollection*"
| where not powershell.file.script_block_text : "26a24ae4-039d-4ca4-87b4-2f64180311f0"
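
For readers who want to test candidate script content against the scoring logic outside of Elasticsearch, the sketch below mirrors the query's replace-and-count steps in Python. It is an approximation under stated assumptions: the function name `score_digit_ratio` and its parameters are hypothetical, the path and GUID exclusions are not reproduced, and Python's `len` counts Unicode code points, which may not match ES|QL's `length` for every input.

```python
import re

MARKER = "\U0001F525"  # the fire emoji used as the replacement marker


def score_digit_ratio(script_block_text, min_length=1000, threshold=0.5):
    """Mirror the query's length filter, marker replacement, and ratio.

    Returns None for script blocks at or below min_length, matching the
    query's `> 1000` filter. Exclusions from the query are not applied.
    """
    length = len(script_block_text)
    if length <= min_length:
        return None

    # Replace every ASCII digit with the marker (one character each).
    tmp = re.sub(r"[0-9]", MARKER, script_block_text)

    # Count markers via the length difference after removing them.
    # A marker already present in the original text would inflate this.
    pattern_count = length - len(tmp.replace(MARKER, ""))
    ratio = pattern_count / length

    return {
        "pattern_count": pattern_count,
        "ratio": ratio,
        "matches": ratio > threshold,
    }


if __name__ == "__main__":
    # 1200 characters, two digits per three characters.
    print(score_digit_ratio("12," * 400))
```

Because the comparison is strictly greater than 0.5, a script block of single digits separated by commas sits exactly on the threshold and does not match, while two-digit or longer values separated by commas do.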

Framework: MITRE ATT&CK

Tactic:
- Name: Defense Evasion
- Id: TA0005
- Reference URL: https://attack.mitre.org/tactics/TA0005/
Technique:
- Name: Obfuscated Files or Information
- Id: T1027
- Reference URL: https://attack.mitre.org/techniques/T1027/
Technique:
- Name: Deobfuscate/Decode Files or Information
- Id: T1140
- Reference URL: https://attack.mitre.org/techniques/T1140/

Framework: MITRE ATT&CK

Tactic:
- Name: Execution
- Id: TA0002
- Reference URL: https://attack.mitre.org/tactics/TA0002/
Technique:
- Name: Command and Scripting Interpreter
- Id: T1059
- Reference URL: https://attack.mitre.org/techniques/T1059/
Sub Technique:
- Name: PowerShell
- Id: T1059.001
- Reference URL: https://attack.mitre.org/techniques/T1059/001/