Deciphering Apache NiFi Component Property Encryption
Background
The concepts and terminology surrounding encryption can be as unintelligible as the enciphered contents themselves. Layers of advanced mathematics, specialized vocabulary, and complex acronyms can confuse even experienced developers. Translating these concepts into usable documentation is challenging, making it difficult for an administrator to select the best configuration for a secure system. With a basic understanding of the concepts involved, however, it is possible to make informed decisions using available configuration options.
Introduction
Encryption of sensitive component properties is one of the foundational features of Apache NiFi. With the ability to send and receive data from a wide array of systems and services, protecting passwords, keys, and other credentials is vital. NiFi has supported encrypting these component properties from project incubation, but recent versions introduced several important new features. Through the addition of minimum length requirements, enhanced key derivation functions, authenticated encryption, and automatic key generation, NiFi 1.14.0 provides greater protection for sensitive component properties than previous versions. Understanding the concepts involved provides a basis of comparison for both new and existing capabilities.
Encryption Terminology
A thorough treatment of encryption and related terminology is beyond the scope of this present discussion, but highlighting several primary concepts should provide sufficient background. Cryptographic hashing, key derivation, and symmetric-key algorithms can be complicated to understand, and those with greater expertise may find the current summary too simplistic. For the purpose of relative comparison, however, a summary understanding provides the knowledge necessary for basic system security.
Cryptographic Hashing
A hash can be thought of as a compressed representation of arbitrary information. A hash function is a reusable method for generating a compact, fixed-size representation from one or more input parameters. The Java hashCode() method is one example, representing an object as an integer. Those familiar with the minimum and maximum values of an Integer understand that having over 4.2 billion possible values may not be enough to represent every combination of property values, so different objects can end up sharing the same hash code.
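A rough sketch of why a 32-bit hash code cannot distinguish every input: the function below mimics the arithmetic of Java's String.hashCode() in Python. The language is chosen only for illustration, the function name is hypothetical, and the sketch assumes characters in the Basic Multilingual Plane. Two different two-character strings produce the same value.

```python
def java_string_hash(text: str) -> int:
    """Mimic Java's String.hashCode() using 32-bit signed arithmetic."""
    value = 0
    for char in text:
        # Multiply the running value by 31, add the character code,
        # and keep only the low 32 bits, as Java int overflow would.
        value = (value * 31 + ord(char)) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit result as a signed Java int.
    return value - 0x100000000 if value >= 0x80000000 else value


print(java_string_hash("Aa"))  # 2112
print(java_string_hash("BB"))  # 2112, a collision with "Aa"
```

Collisions of this kind are harmless for hash tables, but they are exactly what a cryptographic hash function is designed to make impractical to find.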
Cryptographic hashing has the same basic purpose as a regular hash function: generating a compact representation from input parameters. In addition, a cryptographic hash function should have several other important attributes. These attributes include protection against reverse engineering the original input from the hash, and avoidance of collisions in which two different inputs produce the same output representation. Hashes generated for encryption and signature verification are some of the most important building blocks for a secure application. MD5 is one of the most common cryptographic hashing functions of the last several decades, but recent members of the Secure Hash Algorithm family, such as SHA-256 and SHA-512, have become the standard for modern security protocols.
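As a small illustration using Python's standard hashlib module (not anything taken from NiFi), the following sketch compares digest sizes for MD5 and SHA-256 and shows that changing a single character of the input yields an unrelated digest. The input strings are arbitrary examples.

```python
import hashlib

message = b"nifi.sensitive.props.key"
altered = b"nifi.sensitive.props.kez"  # differs from message in the last byte

md5_digest = hashlib.md5(message).hexdigest()
sha256_digest = hashlib.sha256(message).hexdigest()
sha256_altered = hashlib.sha256(altered).hexdigest()

# Each hexadecimal character encodes 4 bits of the digest.
print(len(md5_digest) * 4)     # 128
print(len(sha256_digest) * 4)  # 256

# A one-byte change in the input produces a different digest.
print(sha256_digest == sha256_altered)  # False
```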
Key Derivation
Key derivation is the process of producing a key from a simple input parameter, such as a word or phrase. Strong key derivation functions use additional inputs and various processing methods to generate a more complex representation. Many functions use a series of cryptographic hashing operations, combined with random information, to produce a value suitable for use as an encryption key. Advanced functions provide configurable properties that influence the amount of processing cycles and memory used to derive the output representation. Argon2 and PBKDF2 are examples of key derivation functions with configurable complexity parameters.
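The idea of configurable complexity can be sketched with scrypt, which is available in Python's standard hashlib module when Python is built against a recent OpenSSL. This is an illustration of the concept only: the function name, password, and cost values are assumptions and do not reflect NiFi's internal settings.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, cost: int = 2**12) -> bytes:
    """Derive a 32-byte key from a password using scrypt."""
    return hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,   # random input that makes each derivation distinct
        n=cost,      # CPU and memory cost, must be a power of two
        r=8,         # block size, scales memory use
        p=1,         # parallelization factor
        dklen=32,    # 32 bytes = 256 bits
    )


salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
print(len(key))  # 32
```

Raising the cost parameter makes each derivation slower and more memory intensive, which is what limits the rate at which an attacker can test candidate passwords.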
Symmetric-Key Encryption
Symmetric-key encryption involves using the same key for both sides of the cryptographic process. As opposed to asymmetric algorithms, where encryption uses a public key and decryption uses a private key, a symmetric-key algorithm requires the same key for both operations. Using a shared key has positives and negatives, often trading some amount of configuration security for a greater level of configuration simplicity. Modern protocols use a combination of asymmetric and symmetric algorithms to support multiple features including authentication and communication security. AES is one of the most common symmetric-key algorithms used for securing information, whether stored or transmitted.
NiFi Component Property Encryption
NiFi supports encryption for a variety of purposes, including sensitive component properties. Although NiFi uses the terms sensitive and property in different contexts, sensitive component properties describes configurable attributes in Processors and Controller Services. NiFi uses a different approach to encrypt values in nifi.properties and other configuration files.
Sensitive Component Properties
Each component declares the sensitive status of a property through a custom descriptor attribute, which NiFi indicates in component documentation. Administrators cannot change the sensitive status of a property, as protection of the property is the responsibility of the component developer in relation to usage such as logging and external service access. In more recent versions of NiFi, Parameter Contexts provide a method to reuse shared values, but parameters marked as sensitive must be used in conjunction with component properties having the same status.
Configurable Settings
NiFi uses a configurable key and algorithm to determine the encryption scheme used when storing sensitive component properties. The persistent flow configuration includes component definitions and relationships as well as configuration attributes. The flow configuration contains each configured property, including an encrypted representation of sensitive values. The encrypted representation cannot be deciphered without the correct properties key and associated algorithm. This provides a layer of protection for the persistent configuration, but also requires the careful selection and tracking of the properties key.
The Sensitive Properties Algorithm
The nifi.properties configuration defines the Sensitive Properties Algorithm using a setting named nifi.sensitive.props.algorithm. This property combines several attributes in a single string. The Sensitive Properties Algorithm specifies the key derivation function, the symmetric-key algorithm, and the key size that NiFi uses when encrypting component properties. Selecting the optimal algorithm depends on several factors, and an exhaustive comparison of each option could be the subject of a separate discussion. For the current consideration, available configuration options can be divided into two primary categories.
Password-Based Encryption Algorithms
The first category includes algorithms implemented in the Bouncy Castle security library, which supports several cryptographic hash functions and symmetric-key algorithms. NiFi has supported algorithms in the first category from the beginning of the project. Although NiFi supports these algorithms for backward compatibility, most of the options do not meet current best practices for information security. Algorithms in this category include the following:
PBEWITHMD5AND256BITAES-CBC-OPENSSL
PBEWITHSHA256AND256BITAES-CBC-BC
PBEWITHSHAAND192BITAES-CBC-BC
Breaking down the algorithm values into their component parts describes the associated features. Taking the first algorithm as an example, PBE stands for Password-Based Encryption, which is the general term describing the encryption process. WITHMD5 indicates using the MD5 cryptographic hash function for key derivation. AND256BITAES-CBC specifies the use of AES in cipher block chaining mode with a key size of 256 bits for symmetric-key encryption. OPENSSL refers to the implementation scheme for handling key derivation parameters, in this case OpenSSL. Other algorithms use BC, referring to Bouncy Castle.
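The breakdown above can be expressed as a small parser. The regular expression and function name below are hypothetical and cover only names shaped like the three examples listed; this is not how NiFi or Bouncy Castle interpret the string internally.

```python
import re

# Pattern for names such as PBEWITHMD5AND256BITAES-CBC-OPENSSL
PBE_PATTERN = re.compile(
    r"^PBEWITH(?P<hash>[A-Z0-9]+?)"              # hash function, e.g. MD5 or SHA256
    r"AND(?P<key_bits>\d+)BIT(?P<cipher>[A-Z]+)"  # key size and cipher, e.g. 256 and AES
    r"-(?P<mode>[A-Z]+)-(?P<scheme>[A-Z]+)$"      # mode and scheme, e.g. CBC and OPENSSL
)


def parse_pbe_algorithm(name: str) -> dict:
    """Split a password-based encryption algorithm name into its parts."""
    match = PBE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unrecognized algorithm name: {name}")
    parts = match.groupdict()
    parts["key_bits"] = int(parts["key_bits"])
    return parts


print(parse_pbe_algorithm("PBEWITHMD5AND256BITAES-CBC-OPENSSL"))
```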
Strong Key Derivation Function Algorithms
The second category includes algorithms based on strong key derivation functions and authenticated symmetric-key encryption. NiFi 1.12.0 introduced support for a new custom algorithm using Argon2 for key derivation and AES in Galois/Counter Mode for symmetric-key encryption. NiFi 1.14.0 expanded the number of available options to include algorithms leveraging other strong key derivation functions, including bcrypt, PBKDF2, and scrypt. NiFi 1.12.0 and later versions support the following Argon2 algorithms:
NIFI_ARGON2_AES_GCM_256
NIFI_ARGON2_AES_GCM_128
NiFi 1.14.0 added support for the following algorithms:
NIFI_BCRYPT_AES_GCM_256
NIFI_PBKDF2_AES_GCM_256
NIFI_SCRYPT_AES_GCM_256
The NIFI prefix indicates that the algorithm implementation is specific to Apache NiFi. The second element indicates the key derivation function, such as ARGON2. The remaining elements specify the symmetric-key algorithm and key size. AES-GCM provides integrity protection as well as confidentiality, and each option supports key sizes of either 128 or 256 bits.
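Because the elements are separated by underscores, the structure is easy to illustrate in code. The class and function names in this sketch are hypothetical and exist only to label the four elements described above.

```python
from typing import NamedTuple


class NiFiAlgorithm(NamedTuple):
    kdf: str       # key derivation function, e.g. ARGON2
    cipher: str    # symmetric-key algorithm, e.g. AES
    mode: str      # mode of operation, e.g. GCM
    key_bits: int  # key size in bits, e.g. 256


def parse_nifi_algorithm(name: str) -> NiFiAlgorithm:
    """Split a name such as NIFI_ARGON2_AES_GCM_256 into its elements."""
    # Unpacking raises ValueError when the name does not have five elements.
    prefix, kdf, cipher, mode, key_bits = name.split("_")
    if prefix != "NIFI":
        raise ValueError(f"Expected NIFI prefix in algorithm name: {name}")
    return NiFiAlgorithm(kdf, cipher, mode, int(key_bits))


print(parse_nifi_algorithm("NIFI_ARGON2_AES_GCM_256"))
```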
The Sensitive Properties Key
The nifi.properties configuration defines the Sensitive Properties Key using a property named nifi.sensitive.props.key, which specifies the source string used to derive an encryption key. Although the property name includes the word key, it is more precise to describe it as a password. From a technical perspective, the Sensitive Properties Key is not the encryption key itself, but the password from which NiFi derives the actual encryption key. This distinction is important to remember when considering the configuration of the Sensitive Properties Algorithm.
Encrypted component properties cannot be deciphered without the exact combination of the Sensitive Properties Key and the Sensitive Properties Algorithm that NiFi used for initial encryption. In other words, the same Sensitive Properties Key string results in a different encryption key when configured with a different Sensitive Properties Algorithm. After configuring components in a NiFi flow, this property cannot be changed without using other utilities.
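The dependence on the algorithm can be demonstrated with Python's standard hashlib module by deriving a key from the same password with two different key derivation functions. The password, salt, and scrypt parameters are placeholders chosen for the example, and the way NiFi selects and stores salts is not shown here.

```python
import hashlib

password = b"example-properties-key"  # hypothetical password value
salt = b"0123456789abcdef"            # fixed only to keep the comparison repeatable

# Same password and salt, two different key derivation functions.
pbkdf2_key = hashlib.pbkdf2_hmac("sha512", password, salt, 160_000, dklen=32)
scrypt_key = hashlib.scrypt(password, salt=salt, n=2**12, r=8, p=1, dklen=32)

print(len(pbkdf2_key), len(scrypt_key))  # 32 32
print(pbkdf2_key == scrypt_key)          # False
```

Both results are valid 256-bit keys, but data encrypted under one cannot be decrypted with the other, which is why the key and algorithm must be tracked together.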
The initial version of NiFi included an internal default string as a fallback for the Sensitive Properties Key, allowing an administrator to avoid configuring a custom setting. Starting with version 1.8.0, NiFi logged a multiline error message when encountering an empty string for the Sensitive Properties Key. This approach provided backward compatibility, but required manual configuration to implement a secure configuration. NiFi 1.14.0 introduced several changes related to the Sensitive Properties Key, requiring manual adjustments when upgrading, but providing a much more secure configuration for new installations.
Improved Default Settings
NiFi version 1.14.0 introduced several improvements to both the Sensitive Properties Key and Sensitive Properties Algorithm. In earlier versions, the default nifi.properties configuration included a blank Sensitive Properties Key and a Sensitive Properties Algorithm with a weak cryptographic hash function.
NiFi version 1.13.2 and earlier contained the following line in nifi.properties included with the binary distribution:
nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL
The release of NiFi 1.14.0 addressed latent issues in both the Sensitive Properties Key and the Sensitive Properties Algorithm through a combination of updated defaults and mandatory configuration settings.
New Default Sensitive Properties Algorithm
NiFi version 1.14.0 changed the default algorithm configuration to use PBKDF2 with AES-GCM and a key size of 256 bits:
nifi.sensitive.props.algorithm=NIFI_PBKDF2_AES_GCM_256
The new default algorithm replaces the insecure MD5 hash function with PBKDF2 for key derivation. The internal PBKDF2 configuration uses SHA-512 with 160,000 iterations as opposed to MD5 with 1000 iterations. This provides a much stronger key derivation process while maintaining reasonable performance on modern systems.
The new default algorithm also uses AES-GCM in place of AES-CBC to support integrity checking as well as confidentiality. Attacks against AES-CBC involve some level of effort, but recent encryption protocols have removed support for the CBC mode of operation. The Transport Layer Security protocol removed cipher suites based on AES-CBC in favor of authenticated algorithms such as AES-GCM starting in TLS 1.3. With the constant advance of processing power, AES-GCM provides an additional layer of protection for sensitive information.
Mandatory Sensitive Properties Key
NiFi 1.14.0 also changed the internal handling of the Sensitive Properties Key. Rather than logging an error and falling back to a default string, version 1.14.0 throws an error in the absence of a Sensitive Properties Key:
Sensitive Properties Key [nifi.sensitive.props.key] not found:
See Admin Guide section [Updating the Sensitive Properties Key]
NiFi logs the following error when upgrading from a previous version with an existing flow configuration:
Flow Configuration [flow.xml.gz] Found:
Migration Required for blank Sensitive Properties Key [nifi.sensitive.props.key]
NiFi logs a different error when running in clustered mode without a Sensitive Properties Key:
Clustered Configuration Found:
Shared Sensitive Properties Key [nifi.sensitive.props.key] required for cluster nodes
Several options are available for updating the Sensitive Properties Key.
Setting the Sensitive Properties Key
NiFi 1.14.0 includes a new option for setting the key using the NiFi script command. As referenced in the error message, the Administration Guide describes setting the Sensitive Properties Key using the following command:
nifi.sh set-sensitive-properties-key PROPERTIES_KEY
The command reads the current Sensitive Properties Key and Sensitive Properties Algorithm from nifi.properties to process the existing flow configuration, and then uses the new Sensitive Properties Key to write the updated flow configuration. The command also updates the value of nifi.sensitive.props.key in nifi.properties. New Sensitive Properties Key values have a minimum length requirement of 12 characters.
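A script that prepares a new key before calling the command could apply the same length rule up front. The function and constant names below are hypothetical; the sketch checks only the 12-character minimum stated above and is not NiFi's own validation code.

```python
MINIMUM_KEY_LENGTH = 12  # minimum length stated for new Sensitive Properties Key values


def check_properties_key(candidate: str) -> str:
    """Return the candidate key, or raise ValueError when it is too short."""
    if len(candidate) < MINIMUM_KEY_LENGTH:
        raise ValueError(
            f"Sensitive Properties Key must be at least "
            f"{MINIMUM_KEY_LENGTH} characters, found {len(candidate)}"
        )
    return candidate


check_properties_key("a-long-enough-example-key")  # passes without error
```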
The NiFi Encrypt-Config Tool also includes parameters for setting the Sensitive Properties Key. The following NiFi Toolkit command can be used to change the Sensitive Properties Key:
encrypt-config.sh -n nifi.properties -f flow.xml.gz -x -s PROPERTIES_KEY
The command performs the same basic operation as the set-sensitive-properties-key command, updating both nifi.properties and flow.xml.gz.
Setting the Sensitive Properties Algorithm
The NiFi Encrypt-Config Tool can be used to change the Sensitive Properties Algorithm using additional command arguments. Just as changing the Sensitive Properties Key requires updating both NiFi properties and flow configuration, changing the Sensitive Properties Algorithm requires updating both files. The following NiFi Toolkit command updates the flow configuration using the specified algorithm and new Sensitive Properties Key:
encrypt-config.sh -n nifi.properties -f flow.xml.gz -x -s PROPERTIES_KEY -A NIFI_ARGON2_AES_GCM_256
The nifi.properties configuration must be updated with the new Sensitive Properties Algorithm after running the command:
nifi.sensitive.props.algorithm=NIFI_ARGON2_AES_GCM_256
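For administrators who script this manual step, the update amounts to replacing one line of the properties file. The helper below is a hypothetical sketch that operates on the file contents as text; it assumes simple name=value lines and does not handle comments, escapes, or line continuations.

```python
def set_property(properties_text: str, name: str, value: str) -> str:
    """Return properties text with name set to value, appending it when absent."""
    prefix = f"{name}="
    lines = properties_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if line.startswith(prefix):
            lines[index] = prefix + value  # overwrite the existing setting
            replaced = True
    if not replaced:
        lines.append(prefix + value)       # add the setting when it is missing
    return "\n".join(lines) + "\n"


original = "nifi.sensitive.props.algorithm=PBEWITHMD5AND256BITAES-CBC-OPENSSL\n"
print(set_property(original, "nifi.sensitive.props.algorithm", "NIFI_ARGON2_AES_GCM_256"))
```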
The nifi.properties configuration must include both the Sensitive Properties Key and Sensitive Properties Algorithm which NiFi used to encrypt flow configuration component properties. If either of these settings does not match the original values, NiFi throws the following error when configured with one of the algorithms using AES-GCM encryption:
EncryptionException: Decryption Failed with Algorithm [AES/GCM/NoPadding]
Caused by: javax.crypto.AEADBadTagException: mac check in GCM failed
NiFi throws a similar error when configured with one of the algorithms using AES-CBC encryption:
EncryptionException: Decryption Failed with Algorithm [PBEWITHMD5AND256BITAES-CBC-OPENSSL]
Caused by: javax.crypto.IllegalBlockSizeException: last block incomplete in decryption
If it is not possible to recover the correct Sensitive Properties Key and Sensitive Properties Algorithm, encrypted property values can be removed from the flow configuration. This requires reentering the sensitive properties using the NiFi user interface, but it preserves the flow configuration itself.
Random Sensitive Properties Key Generation
The NiFi binary distribution includes a blank property for nifi.sensitive.props.key, but version 1.14.0 added support for generating a random Sensitive Properties Key in new installations. When running as a single node without an existing flow configuration, the system checks for the absence of the Sensitive Properties Key and sets a random string. NiFi logs the following warning when generating a new Sensitive Properties Key:
Generating Random Sensitive Properties Key [nifi.sensitive.props.key]
NiFi leverages the Java SecureRandom class to generate 24 bytes and then converts the binary array to a string of 32 Base64 characters. NiFi writes the generated string to nifi.properties and logs the following informational message:
NiFi Properties [nifi.properties] updated with Sensitive Properties Key
The random string is suitable for single node installations, but will not work with clustered deployments since all nodes must share the same Sensitive Properties Key.
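For clustered deployments, an administrator can generate one comparable value and configure it on every node. The following Python sketch mirrors the approach described above, using 24 random bytes encoded as 32 Base64 characters. It relies on the standard secrets module rather than Java SecureRandom, the function name is hypothetical, and the use of the standard Base64 alphabet is an assumption.

```python
import base64
import secrets


def generate_properties_key() -> str:
    """Generate a random 32-character string from 24 random bytes."""
    random_bytes = secrets.token_bytes(24)  # 24 bytes from the operating system CSPRNG
    # Base64 encodes every 3 bytes as 4 characters, so 24 bytes become 32 characters.
    return base64.b64encode(random_bytes).decode("ascii")


key = generate_properties_key()
print(len(key))  # 32
```

Since 24 is a multiple of 3, the encoded string needs no padding characters.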
Conclusion
Component property encryption is a complicated subject, and can be a source of confusion given the array of concepts involved. NiFi abstracts the details using several configuration properties, and NiFi 1.14.0 provides notable improvements to the implementation. Upgrading the configuration to use a strong key derivation function and selecting a complex Sensitive Properties Key are two important parts of maintaining a secure deployment. A basic awareness of strengths and weaknesses in various algorithms is useful not only for configuring NiFi, but for understanding the relative security of numerous applications and communication protocols.