Analyzing and Mitigating XML External Entity Vulnerabilities in Apache NiFi

2022-05-31 • 12 minute read • David Handermann

Background

XML external entity processing is a markup language feature that remains a perennial problem across the spectrum of software applications. Although XML 1.0 Specification Section 4.2.2 provides the official definition of an external entity, support for this capability exposes an application to a number of threats. Depending on the deployment environment, XML external entity processing creates the potential for Denial-of-Service attacks, disclosure of sensitive information, and enumeration of network resources, among other vulnerabilities.

Document Type Definitions

An XML Document Type Definition (DTD) is the most common location for external references. Unrestricted XML parsers interpret entity declarations and request referenced resources during the evaluation process.

The following provides an example XML document containing a DTD that declares an external entity named configuration, which the document references in the source element:

<?xml version="1.0" ?>
<!DOCTYPE source [
  <!ENTITY configuration SYSTEM "http://localhost/configuration">
]>
<source>&configuration;</source>

During parsing, an unrestricted XML parser makes a request to http://localhost/configuration and replaces the &configuration; entity with the response returned. Although this might be a useful feature in limited cases when processing information from trusted sources, it can expose unprotected applications to a range of problems.
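The behavior described above can be reproduced locally without a network request. The following is a minimal sketch, not code from NiFi: the class and method names are hypothetical, and the entity points at a temporary file instead of an HTTP URL. It assumes a JDK whose default JAXP settings still permit external access, in which case the parser is expected to replace the entity with the file contents.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demonstration class, not part of NiFi
final class UnrestrictedParsingDemo {

    // Parses with the factory defaults: no features or properties are set
    static String readRootText(final String xml) throws Exception {
        final DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        final byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        final Document document = builder.parse(new ByteArrayInputStream(bytes));
        return document.getDocumentElement().getTextContent();
    }

    // Points the external entity at a temporary local file instead of an HTTP URL
    static String demonstrate() throws Exception {
        final Path file = Files.createTempFile("configuration", ".txt");
        try {
            Files.write(file, "sensitive-value".getBytes(StandardCharsets.UTF_8));
            final String xml = "<?xml version=\"1.0\" ?>\n"
                    + "<!DOCTYPE source [\n"
                    + "  <!ENTITY configuration SYSTEM \"" + file.toUri() + "\">\n"
                    + "]>\n"
                    + "<source>&configuration;</source>";
            return readRootText(xml);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Swapping the file URI for a path such as a credentials file or an internal HTTP endpoint shows how the same document structure leads to information disclosure or network enumeration.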

Common Vulnerabilities

The Open Web Application Security Project (OWASP) has listed XML external entity (XXE) vulnerabilities among its Top Ten most critical security issues for many years. The Common Weakness Enumeration defines CWE-611 as the standard identifier for improper restriction of XML external entity references.

The ubiquity of XML, combined with insecure default settings in XML parsers, has contributed to the persistent nature of vulnerabilities related to XML external entity references. Although framework abstractions can offer protection against these issues, using XML capabilities in many programming languages requires careful implementation to avoid dangerous code.

Java Processing

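The Java API for XML Processing (JAXP) includes multiple configuration classes and programming interfaces for performing various XML operations. The JAXP specification is organized into several categories:

- Document Object Model (DOM)
- Simple API for XML (SAX)
- Streaming API for XML (StAX)
- Extensible Stylesheet Language Transformations (XSLT)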

JAXP also supports Schema Validation and XML Path Language capabilities. Each category requires careful evaluation and configuration to implement a secure system.

Introduction

Apache NiFi handles XML in a variety of ways using both framework and extension components. As a flexible format, XML provides configuration for authentication and authorization components, and also enables structured data transmission for custom data flows. With these varied use cases, NiFi presents a number of places where it is necessary to implement secure XML processing.

Over the course of multiple releases, NiFi has resolved several vulnerabilities related to unrestricted XML external entity parsing. These vulnerabilities have impacted both internal framework processing and configurable extension components. The attack surface and severity have varied depending on the location, but each vulnerability shared the same characteristics. The number of issues over the years illustrates the challenge of implementing consistent defenses against XML external entity attacks. The recurrence of similar problems also highlighted the need for a reusable solution. In response to the most recent instance of vulnerable XML processing, Apache NiFi 1.16.1 introduced a new module implementing a common approach to secure XML handling.

Enumerated Vulnerabilities

NiFi has published the following vulnerabilities related to direct processing of XML external entities:

CVE-2017-12623
- Summary: Flow Template Processing
- Corrected Version: NiFi 1.4.0
- NVD Base Score: 6.5
CVE-2018-1309
- Summary: SplitXml Processing
- Corrected Version: NiFi 1.6.0
- NVD Base Score: 9.8
CVE-2019-10080
- Summary: XMLFileLookupService Configuration
- Corrected Version: NiFi 1.10.0
- NVD Base Score: 6.5
CVE-2020-13940
- Summary: Bootstrap and Framework Configuration
- Corrected Version: NiFi 1.12.0
- NVD Base Score: 5.5
CVE-2021-44145
- Summary: TransformXml Configuration
- Corrected Version: NiFi 1.15.1
- NVD Base Score: 6.5
CVE-2022-29265
- Summary: Component and Viewer Processing
- Corrected Version: NiFi 1.16.1
- NVD Base Score: 7.5

Flow Template Processing

CVE-2017-12623 describes vulnerabilities related to processing flow templates, which authenticated users can download and upload. Flow templates enable reusable component groups to be managed outside NiFi, making it easier to share flow configuration details across environments.

NiFi relies on Java Architecture for XML Binding (JAXB) to parse an XML stream into standard model objects. The default configuration of JAXB and the supporting XML stream sources do not disable XML external entity references, which allowed authenticated users to upload dangerous XML documents.

NIFI-4353 covered the initial resolution to vulnerable template processing, and NIFI-4357 extended the approach to address other uses of JAXB and XMLStreamReader in multiple framework configuration components. These changes introduced a reusable XmlUtils class in nifi-security-utils that disabled DTD and External Entity support on an XMLInputFactory used to create XMLStreamReader instances. This approach addressed issues with multiple framework configuration sources, such as Login Identity Providers and Authorizers. The impact of this vulnerability was limited due to the requirement for authentication in order to provide XML documents. NiFi 1.4.0 and following incorporated these improvements.

SplitXml Processing

CVE-2018-1309 addressed vulnerable XML handling in the SplitXml Processor, which enables separating a single XML document into multiple documents using a configurable depth property. SplitXml leverages the Simple API for XML (SAX) to read XML streams using an event listener approach.

NIFI-4869 and the associated pull request provided the resolution to vulnerable SAX usage in the SplitXml Processor. The resolution expanded the XmlUtils class to include a new method for creating a SAXParser that disabled external entities and disallowed DTD declarations. The severity of vulnerable processing in SplitXml depends on when and where it is configured in the context of a particular flow. Flows without SplitXml are not vulnerable, but flows configured to receive and process XML from untrusted sources present a serious risk. NiFi 1.6.0 and following provided a secure version of the SplitXml Processor.
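A SAXParser that disallows DTD declarations and disables external entities can be sketched as follows. This is an illustration under stated assumptions rather than the XmlUtils source: the class and method names are hypothetical, and the feature identifiers are the ones recognized by the Xerces-based parser bundled with the JDK, so other JAXP implementations may reject them.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not the NiFi XmlUtils implementation
final class RestrictedSaxSketch {
    // Feature identifiers recognized by the parser bundled with the JDK
    private static final String DISALLOW_DOCTYPE = "http://apache.org/xml/features/disallow-doctype-decl";
    private static final String EXTERNAL_GENERAL = "http://xml.org/sax/features/external-general-entities";
    private static final String EXTERNAL_PARAMETER = "http://xml.org/sax/features/external-parameter-entities";

    static SAXParser createParser() throws ParserConfigurationException, SAXException {
        final SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Reject any document that contains a DOCTYPE declaration
        factory.setFeature(DISALLOW_DOCTYPE, true);
        // Disable external entities as a second layer of protection
        factory.setFeature(EXTERNAL_GENERAL, false);
        factory.setFeature(EXTERNAL_PARAMETER, false);
        return factory.newSAXParser();
    }

    // Returns true when the parser reports a parse error, false when parsing completes
    static boolean isRejected(final String xml) {
        final SAXParser parser;
        try {
            parser = createParser();
        } catch (final ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("Parser configuration failed", e);
        }
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return false;
        } catch (final SAXParseException e) {
            return true;
        } catch (final SAXException | IOException e) {
            throw new IllegalStateException("Unexpected parsing failure", e);
        }
    }
}
```

Disallowing DOCTYPE declarations is the stricter of the two settings, so documents that legitimately depend on a DTD will also be rejected.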

XMLFileLookupService Configuration

CVE-2019-10080 improved the behavior of the XMLFileLookupService, which enables externalized configuration for Processors such as LookupAttribute. The XMLFileLookupService reads an XML document specified using the Configuration File property. The service leverages the Apache Commons Configuration library to load XML documents in a standard format, but loaded them without disabling DTD processing.

NIFI-6301 improved the security of XMLFileLookupService with the introduction of a SafeXMLConfiguration class designed to disable vulnerable processing with Document Object Model components. Exposure to vulnerable processing through XMLFileLookupService requires both configuring the Controller Service itself and providing a document containing XML external entities. This vulnerability presents a low level of concern given the limited attack surface and requirement for authenticated access. NiFi 1.10.0 and following included an updated version of the XMLFileLookupService component.

Bootstrap and Framework Configuration

CVE-2020-13940 involved unrestricted external entity processing in the NiFi Bootstrap Notification Manager configuration and multiple framework configuration files, such as the Access Policy Provider, User Group Provider, and State Manager. Processing bootstrap and framework configuration files leverages DOM DocumentBuilder methods, which read elements into memory for simplified access.

NIFI-7680 added several methods to XmlUtils supporting instantiation of DocumentBuilder instances with external access disabled. Changes associated with this issue involved a number of framework classes, as well as several unit tests. Exploiting vulnerable processing required local filesystem access, minimizing the scope of potential problems. NiFi 1.12.0 and following incorporated the bootstrap and framework improvements.
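Creating a DocumentBuilder with external access disabled might look like the sketch below. The class and method names are hypothetical and do not reproduce the XmlUtils methods. The sketch assumes the standard JDK JAXP implementation, which supports the ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA attributes.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper, not the NiFi XmlUtils implementation
final class RestrictedDomSketch {

    static DocumentBuilder createBuilder() throws ParserConfigurationException {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // An empty protocol list denies access to external DTDs, entities, and schemas
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        return factory.newDocumentBuilder();
    }

    // Reads the document into memory and returns the text of the root element
    static String readRootText(final String xml)
            throws ParserConfigurationException, SAXException, IOException {
        final Document document = createBuilder().parse(new InputSource(new StringReader(xml)));
        return document.getDocumentElement().getTextContent();
    }
}
```

With this configuration, documents without external references parse normally, while a reference to an external entity is expected to fail with a SAXParseException before any resource is requested.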

TransformXml Configuration

CVE-2021-44145 outlines processing issues related to XSLT sources configured in the TransformXml Processor, which converts input XML to a new output structure based on the configured XSLT source. NiFi 1.3.0 introduced the Secure processing property to TransformXml, neutralizing dangerous XML sources, but the property did not apply to XML stylesheets configured through the XSLT file name or XSLT Lookup properties.

NIFI-9399 improved the behavior of TransformXml, applying the Secure processing property when reading the XSLT source using the Streaming API for XML. The changes leveraged the existing XmlUtils method for creating and configuring secure XMLStreamReader instances. Although vulnerable XSLT files could be introduced more easily when using a Lookup Service, the scope of the problem was limited due to the requirement for authenticated access to configure the TransformXml Processor. NiFi 1.15.1 and following incorporated TransformXml Processor improvements.

Component and Viewer Processing

CVE-2022-29265 involved runtime XML external entity handling in the following Processors:

- EvaluateXPath
- EvaluateXQuery
- ValidateXml

In addition to vulnerable Processors, the Standard Content Viewer was also susceptible to XML external entity attacks when viewing XML documents using the formatted selection. All of these components were vulnerable when configured using default property values.

The EvaluateXPath and EvaluateXQuery Processors leverage SAX for parsing input documents, while ValidateXml uses StAX for reading sources to be validated. The Standard Content Viewer also leveraged StAX for input sources, as well as JAXP XSLT components for formatting output documents.

Resolving issues with these components required implementing secure settings for several styles of Java XML processing. Starting with the reusable approach that XmlUtils provided, NIFI-9901 involved a complete refactor of XML handling across both framework and extension components.

NIFI-9901 introduced a new nifi-xml-processing module with discrete components for common Java XML processing operations. This approach avoided unnecessary dependencies in nifi-security-utils and replaced XmlUtils utility methods with interfaces and standard implementations that encapsulated operations for DOM, SAX, StAX, and Validator interfaces. NIFI-9943 resolved vulnerabilities specific to the Standard Content Viewer and added reusable components for XSLT processing. This strategy replaced direct references to JAXP interfaces with access to components provided in nifi-xml-processing, and also incorporated common vulnerability testing.

The severity of CVE-2022-29265 varies depending on the flow configuration and sources of data processed. Flows that do not use EvaluateXPath, EvaluateXQuery, or ValidateXml are less vulnerable to runtime attacks. Even without these Processors, however, flows that process XML from untrusted sources are still vulnerable to issues with the Standard Content Viewer. Using the formatted option to view XML FlowFile content triggers unrestricted processing, which includes evaluating XML external entities. NiFi 1.16.1 included the refactored approach to XML processing across framework and extension components.

Enabling Secure Processing

Implementing a secure approach to XML processing requires different settings depending on the JAXP API used. The implementation strategy also varies depending on the runtime Java version and JAXP implementation. Historical approaches continue to work on recent versions of Java, but a secure solution requires both correct code and controlled runtime class configuration.

JDK Enhancement Proposal 185 introduced new configuration properties and improved existing secure processing features to restrict access to external XML resources. Java 7 Update 40 and Java 8 included JAXP 1.5, which incorporated JEP 185 improvements. Supporting secure XML processing in Java 7 is more complicated, but Java 8 provides a streamlined approach when using the standard JAXP implementation.

The Java API for XML Processing (JAXP) Security Guide provides a detailed overview of potential attacks and secure programming recommendations. NiFi components in the nifi-xml-processing module follow the general guidelines and provide reusable abstractions around JAXP capabilities.

Configuring JAXP Components

Different Java XML processing components require different approaches to secure configuration. Each JAXP API has an associated class that must be configured to avoid potential XML processing vulnerabilities.

The majority of JAXP configuration classes support the Feature for Secure Processing, which disables XML external entity processing and also sets limits on memory consumption. StAX components do not support the feature, but can be configured using specific properties.

XML Processing Configuration Classes

The following classes control XML processing capabilities, and any reference to these classes should be evaluated to confirm a secure implementation:

All of these components except for the StAX XMLInputFactory support the Feature for Secure Processing, identified by the javax.xml.XMLConstants.FEATURE_SECURE_PROCESSING flag.

Enabling Feature for Secure Processing

Calling the setFeature method on supporting classes, using the feature flag as follows, enables secure processing:

factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

This approach is sufficient for components outside the javax.xml.stream package when running on Java 8 and following, using the standard JDK implementation.
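The same setFeature call applies to several JAXP factory classes, although each one declares different checked exceptions. The following sketch gathers the calls in one hypothetical class. The factory classes and the feature constant are standard JAXP, while the wrapper class and method names are assumptions made for illustration.

```java
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical collection of factory methods, each enabling the secure processing feature
final class SecureProcessingFactories {

    static DocumentBuilderFactory documentBuilderFactory() throws ParserConfigurationException {
        final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    static SAXParserFactory saxParserFactory() throws ParserConfigurationException, SAXException {
        final SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    static TransformerFactory transformerFactory() throws TransformerConfigurationException {
        final TransformerFactory factory = TransformerFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }

    static SchemaFactory schemaFactory() throws SAXException {
        // W3C XML Schema is the schema language required of every JAXP implementation
        final SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory;
    }
}
```

Reading the flag back with getFeature confirms that a factory accepted the setting, but it does not prove that a replacement JAXP implementation on the classpath enforces the same restrictions.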

Providing other JAXP implementations through different dependencies can alter runtime behavior. For this reason, it is important to know both the runtime Java version and the runtime JAXP implementation.

Disabling External Entities for Streaming Components

Without support for the secure processing feature flag, configuring StAX components requires setting additional properties on XMLInputFactory instances prior to creating XML readers.

The javax.xml.XMLConstants.ACCESS_EXTERNAL_DTD property controls which protocols the parser can use to retrieve external DTDs and external entities. Setting the value to an empty string disables all protocols, which blocks XML external entity references, as follows:

factory.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");

As an alternative to disabling external access, Document Type Definition support can be disabled using the javax.xml.stream.XMLInputFactory.SUPPORT_DTD property, and XML external entities can be disabled using the javax.xml.stream.XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES property. Setting the property values to false is more restrictive than disabling external access from a DTD, but provides a more secure implementation.

factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
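Putting the two properties together, a restricted StAX reader can be sketched as follows. The class and method names are hypothetical, and the behavior noted in the comments assumes the StAX implementation bundled with the JDK.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper showing the stricter StAX configuration
final class RestrictedStaxSketch {

    static XMLInputFactory createFactory() {
        final XMLInputFactory factory = XMLInputFactory.newFactory();
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return factory;
    }

    // Reads every event and returns the concatenated character data
    static String readText(final String xml) throws XMLStreamException {
        final XMLStreamReader reader = createFactory().createXMLStreamReader(new StringReader(xml));
        final StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                // With DTD support disabled, an entity reference fails here as undeclared
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
        } finally {
            reader.close();
        }
        return text.toString();
    }
}
```

The properties must be set on the XMLInputFactory before it creates a reader, since readers created earlier keep the configuration that was active at creation time.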

Abstracted Processing Interfaces

The nifi-xml-processing module includes the following interfaces that correspond to JAXP components:

Document Object Model
- org.apache.nifi.xml.processing.parsers.DocumentProvider
Extensible Stylesheet Language Transformations
- org.apache.nifi.xml.processing.transform.TransformProvider
Simple API for XML
- org.apache.nifi.xml.processing.sax.InputSourceParser
Streaming API for XML
- org.apache.nifi.xml.processing.stream.XMLEventReaderProvider
- org.apache.nifi.xml.processing.stream.XMLStreamReaderProvider
Validation
- org.apache.nifi.xml.processing.validation.SchemaValidator

Each interface has an associated standard implementation that includes the features and properties necessary to enable secure processing. The standard classes are compatible with the JAXP implementation bundled with the JDK, requiring no external dependencies.

Verifying Secure Processing

Java components configured for secure processing will throw exceptions when reading XML that contains external entity references. The exception class and message will be different depending on the particular JAXP component referenced, and each component type will throw an exception according to the API used. For example, both DOM and SAX parsers will throw an org.xml.sax.SAXParseException while StAX readers will throw a javax.xml.stream.XMLStreamException when encountering external references.

Implementing Unit Tests

The nifi-xml-processing resources directory includes a number of standard XML files with and without external entities. These files provide a straightforward method to confirm that each JAXP component has the required configuration settings. Using the standard JUnit 5 assertThrows method enables automated verification of expected parsing behavior when reading XML external entities.

Components in nifi-xml-processing throw a generalized ProcessingException, so checking the exception cause verifies expected results. The following illustrates the behavior expected when attempting to parse an XML stream containing an external entity:

ProcessingException e = assertThrows(ProcessingException.class, () -> provider.parse(stream));
assertInstanceOf(SAXParseException.class, e.getCause());
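The same cause check can be expressed without a test framework. In the sketch below, SketchProcessingException is a hypothetical stand-in for the ProcessingException in nifi-xml-processing, and the parse method is an assumed wrapper around a DocumentBuilder that disallows DOCTYPE declarations, not the NiFi implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class CauseVerificationSketch {

    // Hypothetical stand-in for the generalized ProcessingException
    static final class SketchProcessingException extends RuntimeException {
        SketchProcessingException(final String message, final Throwable cause) {
            super(message, cause);
        }
    }

    // Wraps every checked failure in the generalized exception, keeping the original as the cause
    static Document parse(final InputStream stream) {
        try {
            final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return factory.newDocumentBuilder().parse(stream);
        } catch (final ParserConfigurationException | SAXException | IOException e) {
            throw new SketchProcessingException("Parsing failed", e);
        }
    }

    // Returns true only when parsing fails with a SAXParseException as the cause
    static boolean isRejectedByParser(final String xml) {
        final InputStream stream = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        try {
            parse(stream);
            return false;
        } catch (final SketchProcessingException e) {
            return e.getCause() instanceof SAXParseException;
        }
    }
}
```

Checking the cause type distinguishes a parser rejection from unrelated failures, such as an I/O error, that the wrapper would report with the same exception class.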

Configuring Static Code Analysis

Static code analysis has some limitations in terms of detection capabilities and runtime awareness, but it provides a useful layer of evaluation for XML processing.

The SpotBugs Maven Plugin provides generalized code analysis and incorporates a plugin architecture to support additional features. The Find Security Bugs plugin for SpotBugs includes analyzers capable of detecting XML external entity vulnerabilities for the majority of JAXP components. Although the Find Security Bugs analyzer cannot detect all types of XML external entity vulnerabilities as of version 1.12.0, it provides an additional layer of evaluation for XML components within the nifi-xml-processing module.

Configuring static analysis on a single module does not avoid the potential for introducing future issues outside nifi-xml-processing, but it provides a basic level of verification. Unlike previous methods in NiFi XmlUtils, most classes within nifi-xml-processing do not provide direct access to configurable JAXP components, reducing the possibility of misconfiguration.

Conclusion

The flexibility and features of XML continue to make it a common standard for formatted configuration and communication. Although Java 8 simplified the steps necessary to implement secure processing, external entity support remains both a common capability and potential vulnerability. The number of XML processing issues in NiFi over the years is a reminder of the challenges associated with maintaining a secure implementation across a large set of capabilities. With the introduction of a dedicated module and a comprehensive review of the NiFi repository, versions 1.16.1 and following provide the best protection against XXE attacks.