Analyzing and Mitigating XML External Entity Vulnerabilities in Apache NiFi
Background
XML external entity processing is a markup language feature that remains a perennial problem across the spectrum of software applications. Although XML 1.0 Specification Section 4.2.2 provides the official definition of an external entity, support for this capability exposes an application to a number of threats. Depending on the deployment environment, XML external entity processing creates the potential for Denial-of-Service attacks, disclosure of sensitive information, and enumeration of network resources, among other vulnerabilities.
Document Type Definitions
An XML Document Type Definition (DTD) is the most common location for external references. Unrestricted XML parsers interpret entity declarations and request referenced resources during the evaluation process.
The following provides an example XML document containing a DTD that declares an external entity named configuration, which the document references in the source element:
<?xml version="1.0" ?>
<!DOCTYPE source [
<!ENTITY configuration SYSTEM "http://localhost/configuration">
]>
<source>&configuration;</source>
During parsing, an unrestricted XML parser makes a request to http://localhost/configuration and replaces the &configuration; entity with the response returned. Although this might be a useful feature in limited cases when processing information from trusted sources, it can expose unprotected applications to a range of problems.
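The substitution described above can be observed locally without any network access. The following is a minimal sketch, not part of NiFi: it assumes the JAXP implementation bundled with the JDK running with default settings, and the class and method names are hypothetical. It swaps the HTTP URL for a temporary file URI so that the entity target is under the control of the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/** Illustration only (hypothetical class): an unrestricted parser following an external entity. */
final class UnrestrictedParserDemo {

    /** Writes the contents to a temporary file, then reads them back through an entity. */
    static String resolveLocalFile(String contents) {
        Path file = null;
        try {
            file = Files.createTempFile("configuration", ".txt");
            Files.write(file, contents.getBytes(StandardCharsets.UTF_8));

            // Same shape as the example document, with a file URI instead of an HTTP URL
            String xml = "<?xml version=\"1.0\" ?>\n"
                    + "<!DOCTYPE source [\n"
                    + "<!ENTITY configuration SYSTEM \"" + file.toUri() + "\">\n"
                    + "]>\n"
                    + "<source>&configuration;</source>";

            // Default factory: no secure processing feature and no access restrictions
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document document = builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return document.getDocumentElement().getTextContent();
        } catch (IOException | ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Demonstration failed", e);
        } finally {
            if (file != null) {
                file.toFile().delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveLocalFile("sensitive-value"));
    }
}
```

With default JDK settings, the returned element text is expected to match the file contents, which is exactly the disclosure path an attacker would use against files readable by the application.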
Common Vulnerabilities
The Open Web Application Security Project has listed XXE vulnerabilities among its Top Ten most critical security issues for many years. The Common Weakness Enumeration defines CWE-611 as the standard identifier for improper restriction of XML external entity references.
The ubiquity of XML, combined with insecure default settings in XML parsers, has contributed to the persistent nature of vulnerabilities related to XML external entity references. Although framework abstractions can offer protection against these issues, using XML capabilities in many programming languages requires careful implementation to avoid dangerous code.
Java Processing
The Java API for XML Processing includes multiple configuration classes and programming interfaces for performing various XML operations. The JAXP specification is organized into several categories:
- Document Object Model (DOM)
- Extensible Stylesheet Language Transformations (XSLT)
- Simple API for XML (SAX)
- Streaming API for XML (StAX)
JAXP also supports Schema Validation and XML Path Language capabilities. Each category requires careful evaluation and configuration to implement a secure system.
Introduction
Apache NiFi handles XML in a variety of ways using both framework and extension components. As a flexible format, XML provides configuration for authentication and authorization components, and also enables structured data transmission for custom data flows. With these varied use cases, NiFi presents a number of places where it is necessary to implement secure XML processing.
Over the course of multiple releases, NiFi has resolved several vulnerabilities related to unrestricted XML external entity parsing. These vulnerabilities have impacted both internal framework processing and configurable extension components. The attack surface and severity have varied depending on the location, but each vulnerability shared the same characteristics. The number of issues over the years illustrates the challenge of implementing consistent defenses against XML external entity attacks. The recurrence of similar problems also highlighted the need for a reusable solution. In response to the most recent instance of vulnerable XML processing, Apache NiFi 1.16.1 introduced a new module implementing a common approach to secure XML handling.
Enumerated Vulnerabilities
NiFi has published the following vulnerabilities related to direct processing of XML external entities:
- CVE-2017-12623
  - Summary: Flow Template Processing
  - Corrected Version: NiFi 1.4.0
  - NVD Base Score: 6.5
- CVE-2018-1309
  - Summary: SplitXml Processing
  - Corrected Version: NiFi 1.6.0
  - NVD Base Score: 9.8
- CVE-2019-10080
  - Summary: XMLFileLookupService Configuration
  - Corrected Version: NiFi 1.10.0
  - NVD Base Score: 6.5
- CVE-2020-13940
  - Summary: Bootstrap and Framework Configuration
  - Corrected Version: NiFi 1.12.0
  - NVD Base Score: 5.5
- CVE-2021-44145
  - Summary: TransformXml Configuration
  - Corrected Version: NiFi 1.15.1
  - NVD Base Score: 6.5
- CVE-2022-29265
  - Summary: Component and Viewer Processing
  - Corrected Version: NiFi 1.16.1
  - NVD Base Score: 7.5
Flow Template Processing
CVE-2017-12623 describes vulnerabilities related to processing flow templates, which authenticated users can download and upload. Flow templates enable reusable component groups to be managed outside NiFi, making it easier to share flow configuration details across environments.
NiFi relies on Java Architecture for XML Binding to parse an XML stream into standard model objects. The default configuration of JAXB and the supporting XML stream sources do not disable XML external entity references, which allowed authenticated users to upload dangerous XML documents.
NIFI-4353 covered the initial resolution to vulnerable template processing, and NIFI-4357 extended the approach to address other uses of JAXB and XMLStreamReader in multiple framework configuration components. These changes introduced a reusable XmlUtils class in nifi-security-utils that disabled DTD and External Entity support on an XMLInputFactory used to create XMLStreamReader instances. This approach addressed issues with multiple framework configuration sources, such as Login Identity Providers and Authorizers. The impact of this vulnerability was limited due to the requirement for authentication in order to provide XML documents. NiFi 1.4.0 and following incorporated these improvements.
SplitXml Processing
CVE-2018-1309 addressed vulnerable XML handling in the SplitXml Processor, which enables separating a single XML document into multiple documents using a configurable depth property. SplitXml leverages the Simple API for XML to read XML streams using an event listener approach.
NIFI-4869 and the associated pull request provided the resolution to vulnerable SAX usage in the SplitXml Processor. The resolution expanded the XmlUtils class to include a new method for creating a SAXParser that disabled external entities and disallowed DTD declarations. The severity of vulnerable processing in SplitXml depends on when and where it is configured in the context of a particular flow. Flows without SplitXml are not vulnerable, but flows configured to receive and process XML from untrusted sources present a serious risk. NiFi 1.6.0 and following provided a secure version of the SplitXml Processor.
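A SAXParser that disables external entities and disallows DTD declarations can be sketched as follows. This is an illustration under stated assumptions rather than the NiFi XmlUtils code: the class and method names are hypothetical, and the feature identifiers are the ones recognized by the Xerces-based parser bundled with the JDK, so another JAXP implementation may reject them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper showing restricted SAX configuration. */
final class RestrictedSaxParsing {
    // Feature identifiers understood by the JDK-bundled parser (assumption for other implementations)
    private static final String DISALLOW_DOCTYPE = "http://apache.org/xml/features/disallow-doctype-decl";
    private static final String EXTERNAL_GENERAL_ENTITIES = "http://xml.org/sax/features/external-general-entities";
    private static final String EXTERNAL_PARAMETER_ENTITIES = "http://xml.org/sax/features/external-parameter-entities";

    static SAXParser newRestrictedParser() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE declaration
        factory.setFeature(DISALLOW_DOCTYPE, true);
        // Defense in depth: never resolve external general or parameter entities
        factory.setFeature(EXTERNAL_GENERAL_ENTITIES, false);
        factory.setFeature(EXTERNAL_PARAMETER_ENTITIES, false);
        return factory.newSAXParser();
    }

    /** Returns true when the restricted parser refuses to read the document. */
    static boolean isRejected(String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            newRestrictedParser().parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return false;
        } catch (SAXException e) {
            return true;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or reading failed", e);
        }
    }
}
```

Disallowing the DOCTYPE declaration is the strictest of these settings, since it also rejects documents that declare only internal entities.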
XMLFileLookupService Configuration
CVE-2019-10080 addressed the behavior of the XMLFileLookupService, which enables externalized configuration for Processors such as LookupAttribute. The XMLFileLookupService reads an XML document specified using the Configuration File property. The service leverages the Apache Commons Configuration library to load XML documents, providing a standard format, without disabling DTD processing.
NIFI-6301 improved the security of XMLFileLookupService with the introduction of a SafeXMLConfiguration class designed to disable vulnerable processing with Document Object Model components. Exposure to vulnerable processing through XMLFileLookupService requires both configuring the Controller Service itself and providing a document containing XML external entities. This vulnerability presents a low level of concern given the limited attack surface and requirement for authenticated access. NiFi 1.10.0 and following included an updated version of the XMLFileLookupService component.
Bootstrap and Framework Configuration
CVE-2020-13940 involved unrestricted external entity processing in the NiFi Bootstrap Notification Manager configuration and multiple framework configuration files, such as the Access Policy Provider, User Group Provider, and State Manager. Processing bootstrap and framework configuration files leverages DOM DocumentBuilder methods, which read elements into memory for simplified access.
NIFI-7680 added several methods to XmlUtils supporting instantiation of DocumentBuilder instances with external access disabled. Changes associated with this issue involved a number of framework classes, as well as several unit tests. Exploiting vulnerable processing required local filesystem access, minimizing the scope of potential problems. NiFi 1.12.0 and following incorporated the bootstrap and framework improvements.
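One way to create a DocumentBuilder with external access disabled is sketched below. The class and method names are hypothetical and do not reproduce the XmlUtils methods; the sketch assumes JAXP 1.5 or later with the JDK-bundled implementation, where the external access attributes accept a list of allowed protocols.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical helper for reading configuration files with external access disabled. */
final class RestrictedDocumentBuilders {

    static DocumentBuilder newRestrictedBuilder() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Empty protocol lists: no external DTD, entity, or schema may be retrieved
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        factory.setXIncludeAware(false);
        return factory.newDocumentBuilder();
    }

    /** Parses a configuration document, letting parser errors reach the caller. */
    static Document parseConfiguration(String xml) throws SAXException {
        try {
            return newRestrictedBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or reading failed", e);
        }
    }
}
```

Unlike disallowing the DOCTYPE declaration entirely, this configuration still accepts documents with internal entity declarations and only refuses to fetch external resources.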
TransformXml Configuration
CVE-2021-44145 outlines processing issues related to XSLT sources configured in the TransformXml Processor. NiFi 1.3.0 introduced the Secure processing property to TransformXml, neutralizing dangerous XML sources, but the configuration did not apply to XML stylesheets configured from the XSLT file name or XSLT Lookup properties.
The TransformXml Processor converts input XML to a new output structure based on the configured XSLT source.
NIFI-9399 improved the behavior of TransformXml, applying the Secure processing property when reading the XSLT source using the Streaming API for XML. The changes leveraged the existing XmlUtils method for creating and configuring secure XMLStreamReader instances. Although vulnerable XSLT files could be introduced more easily when using a Lookup Service, the scope of the problem was limited due to the requirement for authenticated access to configure the TransformXml Processor. NiFi 1.15.1 and following incorporated TransformXml Processor improvements.
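The underlying point is that restrictions must cover the stylesheet as well as the input document. The sketch below illustrates that idea with the XSLT implementation bundled with the JDK, which is an assumption: it does not reproduce the TransformXml code or its StAX-based reader, the class and method names are hypothetical, and a different TransformerFactory implementation may not recognize the access attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.XMLConstants;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper applying restrictions to both stylesheet and input. */
final class RestrictedTransforms {

    static TransformerFactory newRestrictedFactory() throws TransformerConfigurationException {
        TransformerFactory factory = TransformerFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Block external DTDs and entities while the stylesheet itself is parsed
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        // Block xsl:import, xsl:include, and document() from reaching external resources
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
        return factory;
    }

    /** Compiles the stylesheet with the restricted factory and applies it to the input. */
    static String transform(String stylesheet, String input) throws TransformerException {
        Transformer transformer = newRestrictedFactory()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
        StringWriter output = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(input)), new StreamResult(output));
        return output.toString();
    }
}
```

With this configuration, a stylesheet that declares and references an external entity is expected to fail during compilation instead of being resolved.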
Component and Viewer Processing
CVE-2022-29265 involved runtime XML external entity handling in the following Processors:
- EvaluateXPath
- EvaluateXQuery
- ValidateXml
In addition to vulnerable Processors, the Standard Content Viewer was also susceptible to XML external entity attacks when viewing XML documents using the formatted selection. All of these components were vulnerable when configured using default property values.
The EvaluateXPath and EvaluateXQuery Processors leverage SAX for parsing input documents, while ValidateXml uses StAX for reading sources to be validated. The Standard Content Viewer also leveraged StAX for input sources, as well as JAXP XSLT components for formatting output documents.
Resolving issues with these components required implementing secure settings for several styles of Java XML processing. Starting with the reusable approach that XmlUtils provided, NIFI-9901 involved a complete refactor of XML handling across both framework and extension components.
NIFI-9901 introduced a new nifi-xml-processing module with discrete components for common Java XML processing operations. This approach avoided unnecessary dependencies in nifi-security-utils and replaced XmlUtils utility methods with interfaces and standard implementations that encapsulated operations for DOM, SAX, StAX, and Validator interfaces.
NIFI-9943 resolved vulnerabilities specific to the Standard Content Viewer and added reusable components for XSLT processing. This strategy replaced direct references to JAXP interfaces with access to components provided in nifi-xml-processing, and also incorporated common vulnerability testing.
The severity of CVE-2022-29265 varies depending on the flow configuration and sources of data processed. Flows that do not use EvaluateXPath, EvaluateXQuery, or ValidateXml are less vulnerable to runtime attacks. Even without these Processors, however, flows that process XML from untrusted sources are still vulnerable to issues with the Standard Content Viewer. Using the formatted option to view XML FlowFile content triggers unrestricted processing, which includes evaluating XML external entities. NiFi 1.16.1 included the refactored approach to XML processing across framework and extension components.
Enabling Secure Processing
Implementing a secure approach to XML processing requires different settings depending on the JAXP API used. The implementation strategy also varies depending on the runtime Java version and JAXP implementation. Historical approaches continue to work on recent versions of Java, but a secure solution requires both correct code and controlled runtime class configuration.
JDK Enhancement Proposal 185 introduced new configuration properties and improved existing secure processing features to restrict access to external XML resources. Java 7 Update 40 and Java 8 included JAXP 1.5, which incorporated JEP 185 improvements. Supporting secure XML processing in Java 7 is more complicated, but Java 8 provides a streamlined approach when using the standard JAXP implementation.
The Java API for XML Processing (JAXP) Security Guide provides a detailed overview of potential attacks and secure programming recommendations. NiFi components in the nifi-xml-processing module follow the general guidelines and provide reusable abstractions around JAXP capabilities.
Configuring JAXP Components
Different Java XML processing components require different approaches to secure configuration. Each JAXP API has an associated class that must be configured to avoid potential XML processing vulnerabilities.
The majority of JAXP configuration classes support the Feature for Secure Processing, which disables XML external entity processing and also sets limits on memory consumption. StAX components do not support the feature, but can be configured using specific properties.
XML Processing Configuration Classes
The following classes control XML processing capabilities, and any reference to these classes should be evaluated to confirm a secure implementation:
- javax.xml.parsers.DocumentBuilderFactory
- javax.xml.parsers.SAXParserFactory
- javax.xml.stream.XMLInputFactory
- javax.xml.transform.TransformerFactory
- javax.xml.validation.SchemaFactory
- javax.xml.validation.Validator
- javax.xml.validation.ValidatorHandler
- javax.xml.xpath.XPathFactory
All of these components except for the StAX XMLInputFactory support the Feature for Secure Processing identified by the javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING constant.
Enabling Feature for Secure Processing
Calling the setFeature method on supporting classes, using the feature flag as follows, enables secure processing:
factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
This approach is sufficient for components outside the javax.xml.stream package when running on Java 8 and following, using the standard JDK implementation.
Providing other JAXP implementations through different dependencies can alter runtime behavior. For this reason, it is important to know both the runtime Java version and the runtime JAXP implementation.
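The same setFeature call applies to each supporting factory, and the concrete factory class reveals which JAXP implementation was selected at runtime. The following sketch shows both points for SchemaFactory and XPathFactory; the class and method names are hypothetical, and the reported class names depend on the runtime classpath.

```java
import javax.xml.XMLConstants;
import javax.xml.validation.SchemaFactory;
import javax.xml.xpath.XPathFactory;
import javax.xml.xpath.XPathFactoryConfigurationException;
import org.xml.sax.SAXException;

/** Hypothetical helper enabling secure processing and reporting the runtime implementation. */
final class SecureProcessingFeature {

    static SchemaFactory schemaFactory() throws SAXException {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    static XPathFactory xpathFactory() throws XPathFactoryConfigurationException {
        XPathFactory factory = XPathFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    /** Reports the Java version and the concrete JAXP classes selected at runtime. */
    static String describeRuntime() throws SAXException, XPathFactoryConfigurationException {
        return "java.version=" + System.getProperty("java.version")
                + " schema=" + schemaFactory().getClass().getName()
                + " xpath=" + xpathFactory().getClass().getName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(describeRuntime());
    }
}
```

Logging this description at startup is one inexpensive way to notice when a dependency has replaced the JDK implementation with another parser.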
Disabling External Entities for Streaming Components
Without support for the secure processing feature flag, configuring StAX components requires setting additional properties on XMLInputFactory instances prior to creating XML readers.
The javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD property controls which protocols the parser can use to retrieve external resources. Setting the value to an empty string disables all protocols, which blocks XML external entity references, as follows:
factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
As an alternative to disabling external access, Document Type Definition support can be disabled using the javax.xml.stream.XMLInputFactory.SUPPORT_DTD property, and XML external entities can be disabled using the javax.xml.stream.XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES property. Setting the property values to false is more restrictive than disabling external access from a DTD, but provides a more secure implementation.
factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
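Put together, the two properties can be applied before creating a reader, as in the following sketch. The class and method names are hypothetical, and the behavior described assumes the StAX implementation bundled with the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical helper creating StAX readers with DTD and external entity support disabled. */
final class RestrictedStreamReaders {

    static XMLStreamReader newRestrictedReader(InputStream inputStream) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newFactory();
        // Properties must be set on the factory before the reader is created
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return factory.createXMLStreamReader(inputStream);
    }

    /** Reads the whole stream and counts start elements, failing on disallowed content. */
    static int countElements(String xml) throws XMLStreamException {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        XMLStreamReader reader = newRestrictedReader(new ByteArrayInputStream(bytes));
        int elements = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    elements++;
                }
            }
        } finally {
            reader.close();
        }
        return elements;
    }
}
```

Because the reader is pull-based, the exception surfaces only when the cursor reaches the entity reference, so callers need to consume the stream inside the same error handling scope.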
Abstracted Processing Interfaces
The nifi-xml-processing module includes the following interfaces that correspond to JAXP components:
- Document Object Model
- Extensible Stylesheet Language Transformations
- Simple API for XML
- Streaming API for XML
- Validation
Each interface has an associated standard implementation that includes the features and properties necessary to enable secure processing. The standard classes are compatible with the JAXP implementation bundled with the JDK, requiring no external dependencies.
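The general shape of such an abstraction can be sketched as an interface with one standard implementation that owns the factory configuration. The names below (DocumentParser, StandardDocumentParser, XmlProcessingException) are hypothetical and are not the actual nifi-xml-processing types; the DOCTYPE feature identifier assumes the JDK-bundled parser.

```java
import java.io.IOException;
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

/** Hypothetical unchecked exception shared by all implementations. */
class XmlProcessingException extends RuntimeException {
    XmlProcessingException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical abstraction: callers never configure DocumentBuilderFactory themselves. */
interface DocumentParser {
    Document parse(InputStream inputStream);
}

/** Standard implementation that applies restrictive settings on every call. */
final class StandardDocumentParser implements DocumentParser {
    private static final String DISALLOW_DOCTYPE = "http://apache.org/xml/features/disallow-doctype-decl";

    @Override
    public Document parse(InputStream inputStream) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature(DISALLOW_DOCTYPE, true);
            return factory.newDocumentBuilder().parse(inputStream);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Preserve the JAXP exception as the cause so callers and tests can inspect it
            throw new XmlProcessingException("XML parsing failed", e);
        }
    }
}
```

Keeping the factory private to the implementation means a caller cannot accidentally relax a setting, which is the main advantage over a utility method that returns a configurable JAXP object.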
Verifying Secure Processing
Java components configured for secure processing will throw exceptions when reading XML that contains external entity references. The exception class and message will be different depending on the particular JAXP component referenced, and each component type will throw an exception according to the API used. For example, both DOM and SAX parsers will throw an org.xml.sax.SAXParseException while StAX readers will throw a javax.xml.stream.XMLStreamException when encountering external references.
Implementing Unit Tests
The nifi-xml-processing resources directory includes a number of standard XML files with and without external entities. These files provide a straightforward method to confirm that each JAXP component has the required configuration settings. Using the standard JUnit 5 assertThrows method enables automated verification of expected parsing behavior when reading XML external entities.
Components in nifi-xml-processing throw a generalized ProcessingException, so checking the exception cause verifies expected results. The following illustrates the behavior expected when attempting to parse an XML stream containing an external entity:
ProcessingException e = assertThrows(ProcessingException.class, () -> provider.parse(stream));
assertInstanceOf(SAXParseException.class, e.getCause());
Configuring Static Code Analysis
Static code analysis has some limitations in terms of detection capabilities and runtime awareness, but it provides a useful layer of evaluation for XML processing.
The SpotBugs Maven Plugin provides generalized code analysis and incorporates a plugin architecture to support additional features. The Find Security Bugs plugin for SpotBugs includes analyzers capable of detecting XML external entity vulnerabilities for the majority of JAXP components. Although the Find Security Bugs analyzer cannot detect all types of XML external entity vulnerabilities as of version 1.12.0, it provides an additional layer of evaluation for XML components within the nifi-xml-processing module.
Configuring static analysis on a single module does not avoid the potential for introducing future issues outside nifi-xml-processing, but it provides a basic level of verification. Unlike previous methods in NiFi XmlUtils, most classes within nifi-xml-processing do not provide direct access to configurable JAXP components, reducing the possibility of misconfiguration.
Conclusion
The flexibility and features of XML continue to make it a common standard for formatted configuration and communication. Although Java 8 simplified the steps necessary to implement secure processing, external entity support remains both a common capability and potential vulnerability. The number of XML processing issues in NiFi over the years is a reminder of the challenges associated with maintaining a secure implementation across a large set of capabilities. With the introduction of a dedicated module and a comprehensive review of the NiFi repository, versions 1.16.1 and following provide the best protection against XXE attacks.