@xmldom/xmldom CVE-2026-41675

HIGH
XML Injection (aka Blind XPath Injection) (CWE-91)
2026-04-22 https://github.com/xmldom/xmldom GHSA-x6wf-f3px-wcqx
Share

Lifecycle Timeline

1
Analysis Generated
Apr 23, 2026 - 06:53 vuln.today

Blast Radius

ecosystem impact
† from your stack dependencies † transitive graph · vuln.today resolves 4-path depth
  • 4 npm packages depend on @xmldom/xmldom (4 direct, 0 indirect)

Ecosystem-wide dependent count for version 0.9.0.

DescriptionNVD

Summary

The package allows attacker-controlled processing instruction data to be serialized into XML without validating or neutralizing the PI-closing sequence ?>. As a result, an attacker can terminate the processing instruction early and inject arbitrary XML nodes into the serialized output.

---

Details

The issue is in the DOM construction and serialization flow for processing instruction nodes.

When createProcessingInstruction(target, data) is called, the supplied data string is stored directly on the node without validation. Later, when the document is serialized, the serializer writes PI nodes by concatenating <?, the target, a space, node.data, and ?> directly.

That behavior is unsafe because processing instructions are a syntax-sensitive context. The closing delimiter ?> terminates the PI. If attacker-controlled input contains ?>, the serializer does not preserve it as literal PI content. Instead, it emits output where the remainder of the payload is treated as live XML markup.

The same class of vulnerability was previously addressed for CDATA sections (GHSA-wh4c-j3r5-mjhp / CVE-2026-34601), where ]]> in CDATA data was handled by splitting. The serializer applies no equivalent protection to processing instruction data.

---

Affected code

lib/dom.js - createProcessingInstruction (lines 2240-2246):

js
createProcessingInstruction: function (target, data) {
    var node = new ProcessingInstruction(PDC);
    node.ownerDocument = this;
    node.childNodes = new NodeList();
    node.nodeName = node.target = target;
    node.nodeValue = node.data = data;
    return node;
},

No validation is performed on data. Any string including ?> is stored as-is.

lib/dom.js - serializer PI case (line 2966):

js
case PROCESSING_INSTRUCTION_NODE:
    return buf.push('<?', node.target, ' ', node.data, '?>');

node.data is emitted verbatim. If it contains ?>, that sequence terminates the PI in the output stream and the remainder appears as active XML markup.

Contrast - CDATA (line 2945, patched):

js
case CDATA_SECTION_NODE:
    return buf.push(g.CDATA_START, node.data.replace(/]]>/g, ']]]]><![CDATA[>'), g.CDATA_END);

---

PoC

Minimal (from @tlsbollei report, 2026-04-01)

js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const doc = new DOMImplementation().createDocument(null, 'r', null);
doc.documentElement.appendChild(
    doc.createProcessingInstruction('a', '?><z/><?q ')
);
console.log(new XMLSerializer().serializeToString(doc));
// <r><?a ?><z/><?q ?></r>
//          ^^^^ injected <z/> element is active markup

With re-parse verification (from @tlsbollei report)

js
const assert = require('assert');
const { DOMParser, XMLSerializer } = require('@xmldom/xmldom');

const doc = new DOMParser().parseFromString('<r/>', 'application/xml');
doc.documentElement.appendChild(doc.createProcessingInstruction('a', '?><z/><?q '));
const xml = new XMLSerializer().serializeToString(doc);
assert.strictEqual(new DOMParser().parseFromString(xml, 'application/xml')
    .getElementsByTagName('z').length, 1); // passes - z is a real element

---

Impact

An application that uses the package to build XML from untrusted input can be made to emit attacker-controlled elements outside the intended PI boundary. That allows the attacker to alter the meaning and structure of generated XML documents.

In practice, this can affect any workflow that generates XML and then stores it, forwards it, signs it, or hands it to another parser. Realistic targets include XML-based configuration, policy documents, and message formats where downstream consumers trust the serialized structure.

As noted by @tlsbollei: this is the same delimiter-driven XML injection bug class previously addressed by GHSA-wh4c-j3r5-mjhp for createCDATASection(). Fixing CDATA while leaving PI creation and PI serialization unguarded leaves the same standards-constrained issue open for another node type.

---

Disclosure

This vulnerability was publicly disclosed at 2026-04-06T11:25:07Z via xmldom/xmldom#987, which was subsequently closed without being merged.

---

Fix Applied

> ⚠ Opt-in required. Protection is not automatic. Existing serialization calls remain > vulnerable unless { requireWellFormed: true } is explicitly passed. Applications that pass > untrusted data to createProcessingInstruction() or mutate PI nodes with untrusted input > (via .data = or CharacterData mutation methods) should audit all serializeToString() > call sites and add the option.

XMLSerializer.serializeToString() now accepts an options object as a second argument. When { requireWellFormed: true } is passed, the serializer throws InvalidStateError before emitting any ProcessingInstruction node whose .data contains ?>. This check applies regardless of how ?> entered the node - whether via createProcessingInstruction directly or a subsequent mutation (.data =, CharacterData methods).

On @xmldom/xmldom ≥ 0.9.10, the serializer additionally applies the full W3C DOM Parsing §3.2.1.7 checks when requireWellFormed: true:

  1. Target check: throws InvalidStateError if the PI target contains a : character or is an ASCII case-insensitive match for "xml".
  2. Data Char check: throws InvalidStateError if the PI data contains characters outside the XML Char production.
  3. Data sequence check: throws InvalidStateError if the PI data contains ?>.

On @xmldom/xmldom ≥ 0.8.13 (LTS), only the ?> data check (check 3) is applied. The target and XML Char checks are not included in the LTS fix.

PoC - fixed path

js
const { DOMImplementation, XMLSerializer } = require('@xmldom/xmldom');

const doc = new DOMImplementation().createDocument(null, 'r', null);
doc.documentElement.appendChild(doc.createProcessingInstruction('a', '?><z/><?q '));

// Default (unchanged): verbatim - injection present
const unsafe = new XMLSerializer().serializeToString(doc);
console.log(unsafe);
// <r><?a ?><z/><?q ?></r>

// Opt-in guard: throws InvalidStateError before serializing
try {
  new XMLSerializer().serializeToString(doc, { requireWellFormed: true });
} catch (e) {
  console.log(e.name, e.message);
  // InvalidStateError: The ProcessingInstruction data contains "?>"
}

The guard catches ?> regardless of when it was introduced:

js
// Post-creation mutation: also caught at serialization time
const pi = doc.createProcessingInstruction('target', 'safe data');
doc.documentElement.appendChild(pi);
pi.data = 'safe?><injected/>';
new XMLSerializer().serializeToString(doc, { requireWellFormed: true });
// InvalidStateError: The ProcessingInstruction data contains "?>"

Why the default stays verbatim

The W3C DOM Parsing and Serialization spec §3.2.1.3 defines a require well-formed flag whose default value is false. With the flag unset, the spec explicitly permits serializing PI data verbatim. This matches browser behavior: Chrome, Firefox, and Safari all emit ?> in PI data verbatim by default without error.

Unconditionally throwing would be a behavioral breaking change with no spec justification. The opt-in requireWellFormed: true flag allows applications that require injection safety to enable strict mode without breaking existing code.

Residual limitation

createProcessingInstruction(target, data) does not validate data at creation time. The WHATWG DOM spec (§4.5 step 2) mandates an InvalidCharacterError when data contains ?>; enforcing this check unconditionally at creation time is a breaking change and is deferred to a future breaking release.

When the default serialization path is used (without requireWellFormed: true), PI data containing ?> is still emitted verbatim. Applications that do not pass requireWellFormed: true remain exposed.

AnalysisAI

Processing instruction injection in @xmldom/xmldom allows attackers to break out of XML PI context and inject arbitrary markup by including the '?>' delimiter in PI data. When untrusted input is passed to createProcessingInstruction(), the serializer emits it verbatim without escaping, causing everything after '?>' to be parsed as live XML nodes instead of PI content. …

Sign in for full analysis, threat intelligence, and remediation guidance.

RemediationAI

Within 24 hours: Identify all applications and dependencies using @xmldom/xmldom via software composition analysis (SCA) scanning; audit which versions are currently deployed. Within 7 days: Upgrade to @xmldom/xmldom version 0.8.13 (LTS) or 0.9.10+ and enable the {requireWellFormed: true} flag in all createProcessingInstruction() calls; validate in pre-production. …

Sign in for detailed remediation steps.

Share

CVE-2026-41675 vulnerability details – vuln.today

This site uses cookies essential for authentication and security. No tracking or analytics cookies are used. Privacy Policy