ExceptionFactory

Producing content that a reasonable developer might want to read

Restructuring Apache NiFi Support for OpenPGP

NiFi OpenPGP PGP Encryption

2021-09-14 • 15 minute read • David Handermann

Background

The OpenPGP specification has its roots in the Pretty Good Privacy software program released in 1991. As a public standard for encryption and digital signatures, OpenPGP supports secure communication across multiple platforms. As of this writing, RFC 4880 represents the current official specification for message formatting and supported algorithms. With its technological history, OpenPGP has accumulated both capabilities and criticisms, but ongoing support continues in a number of programming languages and applications. See Surveying Pretty Good Privacy After Three Decades for additional background on both the technical specification and various implementations.

Original Implementation

Support for OpenPGP in Apache NiFi goes back to version 0.1.0, which added encryption and decryption capabilities to the EncryptContent Processor. NiFi leverages the Bouncy Castle cryptographic library to handle all OpenPGP processing functions. Aside from the introduction of a configurable symmetric cipher property in NiFi 1.10.0, OpenPGP support has remained largely unchanged since the original implementation.

Design Considerations

Although the EncryptContent Processor supports a number of OpenPGP encryption and decryption features, it also includes a handful of latent issues. Aside from overloading the Encryption Algorithm property to include PGP and PGP_ASCII_ARMOR as supported values, configuring the Processor requires selecting a different combination of properties depending on whether it is intended to perform encryption or decryption. As a result of the initial design, the Processor also lacks support for alternative public key algorithms. Although NiFi 1.10.0 included support for configurable symmetric cipher algorithms, the internal encryption process uses Zip compression as the default setting.

Performance and Error Handling

The original implementation suffered from poor performance in some scenarios due to loading and searching keyring files for every invocation of the Processor. The flexibility of the OpenPGP format together with the Processor design complicated failure handling, resulting in a large variety of potential error conditions. Lack of logging and discarding of some exceptions also made troubleshooting more difficult than necessary. These concerns presented serious maintenance challenges and slowed progress on potential enhancements.

Redesigned Components

Issues with the original implementation and the scope of requested features presented an opportunity for a new approach to OpenPGP processing. At a fundamental level, public-key cryptography requires different inputs for different operations: a public key for encryption and a private key for decryption. Although OpenPGP also supports password-based encryption, requiring a shared secret for encryption and decryption, splitting the cipher operations into separate components provides a much clearer indication of the necessary properties. NiFi provides a high degree of flexibility in both flow design and component implementation, so developing with the right level of abstraction is important for supporting composable processing capabilities.

In addition to splitting encryption and decryption operations, abstracting the retrieval of public and private keys also provides a helpful separation between loading required resources and processing files. Through the use of Controller Services, loading keys for encryption or decryption becomes both a generalized concern and an opportunity for reusing common resources. A combination of these concepts provided the basis for a new solution with new components.
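The separation between loading keys and processing files can be sketched in a few lines of Python. This is only an illustration of the idea, not NiFi code: the class name CachingKeyService, the enable and find_key methods, and the route_all function are all hypothetical, and the keyring is modeled as a plain dictionary.

```python
from typing import Callable, Dict, List, Optional, Tuple


class CachingKeyService:
    """Hypothetical stand-in for a key service that parses its keyring once."""

    def __init__(self, load_keyring: Callable[[], Dict[str, bytes]]):
        self._load_keyring = load_keyring
        self._keys: Dict[str, bytes] = {}

    def enable(self) -> None:
        # The expensive step (file access and parsing) runs once when the
        # service is enabled, not once per processed file.
        self._keys = self._load_keyring()

    def find_key(self, search: str) -> Optional[bytes]:
        # Lookups reuse the keys already held in memory.
        return self._keys.get(search)


def route_all(messages: List[bytes], service: CachingKeyService,
              search: str) -> List[Tuple[str, bytes]]:
    """Hypothetical processor loop: each message reuses the loaded key."""
    results = []
    for message in messages:
        key = service.find_key(search)
        results.append(("success" if key is not None else "failure", message))
    return results
```

With this arrangement, the component that processes files never touches the keyring file, which is the property that made per-invocation keyring loading unnecessary.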

New Features

Before walking through the implementation details, it is worth highlighting a number of new and improved features. In comparison to the original implementation, the new Processors and Controller Services incorporate the following capabilities:

New Processors

NiFi 1.14.0 includes two new Processors supporting OpenPGP messages: EncryptContentPGP and DecryptContentPGP. The EncryptContentPGP Processor includes several required properties, while the DecryptContentPGP Processor is capable of parsing OpenPGP messages to determine format details. Each Processor depends on a corresponding Controller Service to handle messages using public-key cryptography. The Processors do not require Controller Services to perform password-based encryption and decryption.

New Controller Services

As part of the refactored implementation, NiFi 1.14.0 defines two new Controller Service interfaces: PGPPublicKeyService and PGPPrivateKeyService. Implementing these interfaces, StandardPGPPublicKeyService provides support for EncryptContentPGP and StandardPGPPrivateKeyService provides support for DecryptContentPGP. Abstracting configuration and retrieval of public and private keys streamlined the testing process for both Processors and allowed greater configuration flexibility for the Controller Services.

Encryption Properties

EncryptContentPGP includes several configurable properties with default values to control encryption processing. The Processor requires the following properties and specifies the corresponding default values:

These default properties are equivalent to the following properties from the EncryptContent Processor:

Symmetric-Key Algorithm Configuration

The Symmetric-Key Algorithm property in EncryptContentPGP supports a subset of available OpenPGP encryption algorithms and does not include the following options due to proven or potential cipher weaknesses:

Supported symmetric-key algorithms include AES and Camellia with key sizes of 128, 192, or 256 bits. The OpenPGP implementation of AES operates in Cipher Feedback mode, which does not incorporate integrity checking associated with other modes such as GCM or CCM.

The default property value of AES_256 uses the largest available key size together with the most widely supported symmetric-key algorithm. The EncryptContentPGP Processor also includes the Modification Detection Code Packet on all encrypted messages to support basic integrity checking.

Compression Algorithm Configuration

The Compression Algorithm property in EncryptContentPGP defaults to the common ZIP algorithm, and also supports new options not previously available in the EncryptContent Processor. These additional compression algorithm options include:

Although the OpenPGP specification indicates that implementations should compress messages prior to encryption, there are different perspectives regarding the relative security of using compression with encryption. The default ZIP option provides a high level of compatibility, while using UNCOMPRESSED may be a better option for smaller automated messages. For larger messages, BZIP2 can provide more effective compression at the expense of greater CPU usage during compression.
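The size trade-off between these options can be explored outside of NiFi with the Python standard library. The sketch below is not how NiFi or Bouncy Castle perform compression. It relies on the fact that RFC 4880 defines the ZIP option as a raw DEFLATE stream, and the function names are chosen for illustration only.

```python
import bz2
import zlib
from typing import Dict


def compress_zip(data: bytes) -> bytes:
    """Raw DEFLATE stream (RFC 1951), the format behind the OpenPGP ZIP option."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)  # negative wbits: no header
    return compressor.compress(data) + compressor.flush()


def compress_bzip2(data: bytes) -> bytes:
    """BZIP2 stream at the highest compression level."""
    return bz2.compress(data, 9)


def compare_sizes(data: bytes) -> Dict[str, int]:
    """Return the payload size in bytes for each compression option."""
    return {
        "UNCOMPRESSED": len(data),
        "ZIP": len(compress_zip(data)),
        "BZIP2": len(compress_bzip2(data)),
    }


if __name__ == "__main__":
    sample = b"Lorem ipsum dolor sit amet\n" * 2000
    for name, size in compare_sizes(sample).items():
        print(f"{name}: {size} bytes")
```

Running the comparison against representative flow content, rather than the repetitive sample used here, gives a better indication of whether compression is worth the CPU cost for a particular flow.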

File Encoding Configuration

The File Encoding property in EncryptContentPGP defaults to BINARY and also supports Base64-encoded ASCII, also known as ASCII Armor. Binary encoding is much more efficient in terms of message size, but ASCII encoding supports transfer over textual protocols such as SMTP.
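To make the size difference concrete, the following Python sketch wraps arbitrary bytes in ASCII Armor as described in RFC 4880 section 6, including the CRC-24 checksum line. It is an illustration rather than the encoder NiFi uses. The 64-character line width is an assumption that mirrors common GnuPG output, and the armor function name is hypothetical.

```python
import base64


def crc24(data: bytes) -> int:
    """CRC-24 checksum as defined for OpenPGP ASCII Armor in RFC 4880."""
    crc = 0xB704CE
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def armor(data: bytes, block_type: str = "PGP MESSAGE", width: int = 64) -> str:
    """Wrap binary data in ASCII Armor; the line width is an assumed default."""
    body = base64.b64encode(data).decode("ascii")
    lines = [body[i:i + width] for i in range(0, len(body), width)]
    checksum = base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")
    parts = [f"-----BEGIN {block_type}-----", ""]  # blank line ends the armor headers
    parts.extend(lines)
    parts.append("=" + checksum)
    parts.append(f"-----END {block_type}-----")
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    binary = bytes(range(256)) * 4
    text = armor(binary)
    print(f"binary: {len(binary)} bytes, armored: {len(text)} characters")
```

Base64 expands every three bytes into four characters, so the armored form is roughly one third larger than the binary form before accounting for line breaks, the header, and the footer.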

Password-Based Configuration

The Passphrase property in EncryptContentPGP specifies the string source for protecting messages using password-based encryption. The Bouncy Castle implementation of OpenPGP uses the Iterated and Salted String-to-Key function for deriving an encryption key from the configured passphrase. EncryptContentPGP uses the SHA-1 hash function and leverages the Bouncy Castle default setting of 65536 iterations with a random salt of eight bytes. As with any password-based algorithm, the strength of the encryption rests ultimately on the length and complexity of the passphrase.
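The key derivation can be approximated with a short Python sketch of the Iterated and Salted String-to-Key function from RFC 4880 section 3.7.1.3. This is not the Bouncy Castle implementation, and the function names are illustrative. Note that RFC 4880 defines the count as the number of octets fed into the hash rather than the number of hash invocations, and encodes it in a single octet, of which 0x60 decodes to 65536.

```python
import hashlib
import os


def decode_count(coded: int) -> int:
    """Decode the one-octet S2K count field into an octet count (RFC 4880)."""
    return (16 + (coded & 15)) << ((coded >> 4) + 6)


def s2k_iterated_salted(passphrase: bytes, salt: bytes, count: int, key_len: int) -> bytes:
    """Derive key_len bytes using SHA-1 over repeated salt + passphrase."""
    material = salt + passphrase
    # The salt and passphrase are always hashed at least once in full.
    total = max(count, len(material))
    full, rest = divmod(total, len(material))
    key = b""
    context = 0
    while len(key) < key_len:
        # Additional hash contexts are preloaded with zero octets when the
        # requested key is longer than a single SHA-1 digest of 20 bytes.
        digest = hashlib.sha1(b"\x00" * context)
        for _ in range(full):
            digest.update(material)
        digest.update(material[:rest])
        key += digest.digest()
        context += 1
    return key[:key_len]


if __name__ == "__main__":
    salt = os.urandom(8)  # random salt of eight bytes
    key = s2k_iterated_salted(b"correct horse battery staple", salt, decode_count(0x60), 32)
    print(key.hex())
```

Because the count only stretches the work per guess by a fixed factor, a short or predictable passphrase remains the weakest part of the scheme.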

Public Key Configuration

In contrast to the original EncryptContent implementation, EncryptContentPGP delegates public key retrieval to a configurable Controller Service. When configured with a value for Public Key Service, the EncryptContentPGP Processor also requires a value for the Public Key Search property. With this approach, the configured service can reference a keyring containing multiple public keys, and the processor must be configured to select a specific public key for encryption operations.

The StandardPGPPublicKeyService supports configuring a file path reference using the Keyring File property, or providing the contents of a public key encoded using ASCII Armor in the Keyring property. Public keys encoded using ASCII Armor contain multiple Base64 lines with a standard header:

-----BEGIN PGP PUBLIC KEY BLOCK-----

Interpretation of the Public Key Search property depends on the configured Public Key Service. The StandardPGPPublicKeyService supports searching the User ID packet of each public key, matching on name, email address, or the combined user identifier string. The standard service implementation also supports matching against the numeric key identifier when the Public Key Search is configured with 16 hexadecimal characters.

The OpenPGP public key for the NiFi Security email address provides an example of potential configuration options for the Public Key Search property. The public key is available for download from keys.openpgp.org. For the purposes of the following examples, the public key should be saved to a file named public.key.asc.

The following GNU Privacy Guard command can be used to print the packet information contained in a file named public.key.asc:

gpg --list-packets public.key.asc

The command output displays the contents of each packet in a separate section. The first packet contains the public key identifier and the second packet contains the user identifier:

# off=0 ctb=c6 tag=6 hlen=3 plen=525 new-ctb
:public key packet:
        version 4, algo 1, created 1490292491, expires 0
        pkey[0]: [4096 bits]
        pkey[1]: [17 bits]
        keyid: AFF2B36823B944E9
# off=528 ctb=cd tag=13 hlen=2 plen=47 new-ctb
:user ID packet: "Apache NiFi Security <security@nifi.apache.org>"

The keyid field of the public key packet provides the hexadecimal representation that can be specified in the Public Key Search property of EncryptContentPGP. Configuring the Processor with AFF2B36823B944E9 in Public Key Search requires an exact match against the key identifier, avoiding any potential ambiguity related to the user identifier.
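The header values in the gpg output (ctb, tag, hlen, and plen) follow the new-format packet header rules in RFC 4880 section 4.2. The Python sketch below decodes such a header from raw bytes. It is a simplified illustration: the function name is hypothetical, and partial body lengths and old-format headers are deliberately not handled.

```python
from typing import Tuple


def parse_new_format_header(data: bytes) -> Tuple[int, int, int]:
    """Return (tag, header length, body length) for a new-format packet header."""
    ctb = data[0]
    if ctb & 0xC0 != 0xC0:
        raise ValueError("not a new-format packet header")
    tag = ctb & 0x3F  # the low six bits carry the packet tag
    first = data[1]
    if first < 192:
        # One-octet length
        return tag, 2, first
    if first < 224:
        # Two-octet length
        return tag, 3, ((first - 192) << 8) + data[2] + 192
    if first == 255:
        # Five-octet length: marker octet followed by a four-octet big-endian value
        return tag, 6, int.from_bytes(data[2:6], "big")
    raise ValueError("partial body lengths are not handled in this sketch")


if __name__ == "__main__":
    # ctb=c6 with a two-octet length, then ctb=cd with a one-octet length
    print(parse_new_format_header(bytes([0xC6, 0xC1, 0x4D])))
    print(parse_new_format_header(bytes([0xCD, 0x2F])))
```

Tag 6 identifies a Public-Key packet and tag 13 identifies a User ID packet, which matches the section names that gpg prints. The created field in the output is a Unix timestamp in seconds.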

The user ID packet contains both the name and email address that can also be specified in the Public Key Search property. Using the full user identifier of Apache NiFi Security <security@nifi.apache.org> in Public Key Search provides the most precise approach to matching against the user identifier. The service also supports partial matching using the name or email address. Configuring the full user identifier or email address avoids potential unexpected matches when the supplied keyring contains multiple entries.
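The search behavior described in this section can be modeled with a small Python sketch. It does not reproduce the StandardPGPPublicKeyService code: the PublicKeyEntry type and find_public_key function are hypothetical, and the choices to try the key identifier first, fall back to the user identifier, and compare case-sensitively are assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PublicKeyEntry:
    """Hypothetical keyring entry holding only the fields used for searching."""
    key_id: int
    user_id: str


KEY_ID_PATTERN = re.compile(r"^[0-9A-Fa-f]{16}$")


def find_public_key(keyring: List[PublicKeyEntry], search: str) -> Optional[PublicKeyEntry]:
    """Match 16 hexadecimal characters as a key ID, otherwise search user IDs."""
    if KEY_ID_PATTERN.match(search):
        wanted = int(search, 16)
        for entry in keyring:
            if entry.key_id == wanted:
                return entry
    # Assumed fallback: partial, case-sensitive match on the user identifier
    for entry in keyring:
        if search in entry.user_id:
            return entry
    return None


if __name__ == "__main__":
    keyring = [
        PublicKeyEntry(0x19FE266F132D8430, "nifi-flow"),
        PublicKeyEntry(0xAFF2B36823B944E9, "Apache NiFi Security <security@nifi.apache.org>"),
    ]
    print(find_public_key(keyring, "AFF2B36823B944E9"))
    print(find_public_key(keyring, "security@nifi.apache.org"))
```

A partial search value such as a first name returns the first matching entry in this sketch, which illustrates why the key identifier or full user identifier is the safer choice for keyrings with multiple entries.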

Decryption Properties

Configuring content decryption is much simpler than encryption since OpenPGP messages contain all the necessary algorithm information. DecryptContentPGP provides two configurable properties: Passphrase and Private Key Service. The Passphrase property supports password-based encryption and the Private Key Service property supports public key encryption.

OpenPGP messages indicate the type of encryption strategy, and messages encrypted using a public key include the associated key identifier. With this information, DecryptContentPGP attempts to read messages using configured properties. When processing public-key encrypted messages, DecryptContentPGP searches for matching private keys based on the key identifier listed in the message itself.
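The selection process can be pictured with the following Python sketch. It is a simplified model under stated assumptions and not the DecryptContentPGP implementation: the SessionKeyPacket type, the choose_strategy function, and the returned strings are hypothetical. The model reflects that an OpenPGP message can carry Public-Key Encrypted Session Key packets, each naming a key identifier, as well as Symmetric-Key Encrypted Session Key packets for password-based encryption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SessionKeyPacket:
    """Hypothetical view of an encrypted session key packet in a message."""
    kind: str                      # "public-key" or "password"
    key_id: Optional[int] = None   # present only for public-key packets


def choose_strategy(packets: List[SessionKeyPacket],
                    private_keys: Dict[int, object],
                    passphrase: Optional[str]) -> str:
    """Pick the first packet that a configured property is able to handle."""
    for packet in packets:
        if packet.kind == "public-key" and packet.key_id in private_keys:
            return f"private-key:{packet.key_id:016X}"
        if packet.kind == "password" and passphrase is not None:
            return "passphrase"
    raise LookupError("no configured property can decrypt this message")


if __name__ == "__main__":
    message = [SessionKeyPacket("public-key", 0x19FE266F132D8430)]
    keys = {0x19FE266F132D8430: object()}
    print(choose_strategy(message, keys, passphrase=None))
```

The important point is that the message drives the decision, which is why the decryption side needs far fewer properties than the encryption side.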

Private Key Configuration

Similar to the public key service implementation, the StandardPGPPrivateKeyService supports configuring a file path using Keyring File, or providing the ASCII-encoded contents of a private key in the Keyring property. The service requires the Key Password property to read private keys. The service is capable of reading multiple private keys from configured properties as long as all private keys have the same password.

Flow Configuration

A full consideration of NiFi flow design with OpenPGP components is beyond the scope of the current discussion, but describing some basic examples provides a starting point for integration. Both EncryptContentPGP and DecryptContentPGP include the de facto standard success and failure routing relationships.

When configuring DecryptContentPGP, it is important to note that it does not incorporate digital signature verification. For this reason, content entering the Processor should not be considered trusted without some other means of authenticating the data source. Although EncryptContentPGP includes a modification detection code, it does not sign messages. With these caveats, the new Processors can support a number of use cases, and the new Controller Services can be leveraged for additional development efforts related to signing and verification.

Interoperation with GNU Privacy Guard

GNU Privacy Guard provides several capabilities necessary for building a functional flow using public key encryption. Leveraging GPG for key generation and initial processing demonstrates the capabilities of the new NiFi components as well as other potential integration options.

Key Pair Generation

For the purpose of demonstrating public key encryption, the first step is obtaining a public and private key. Generating a key pair requires a user identifier. Most user identifiers consist of a name and an email address. The following command generates a key pair with nifi-flow as the user identifier using the default GPG algorithm:

gpg --quick-generate-key nifi-flow

The command prompts for a passphrase to protect the private key and also sets an expiration based on the default GPG configuration. The command output includes the hexadecimal key identifier as well as the key fingerprint, expiration, and algorithm:

gpg: key 19FE266F132D8430 marked as ultimately trusted
public and secret key created and signed.

pub   rsa3072 2021-09-14 [SC] [expires: 2023-09-14]
      2CCAE4781C90BBFDCB830EB719FE266F132D8430
uid                      nifi-flow
sub   rsa3072 2021-09-14 [E]

As indicated in the output, the key identifier is 19FE266F132D8430, the user identifier is nifi-flow, and the key algorithm is RSA with a size of 3072 bits. The command stores the generated key pair in the default GPG keyring of the user running the command.
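The relationship between the fingerprint and the key identifier in this output follows RFC 4880 section 12.2: for version 4 keys, the key identifier is the low-order 64 bits of the 160-bit fingerprint. The Python sketch below shows the derivation, using a function name chosen for illustration.

```python
def key_id_from_fingerprint(fingerprint_hex: str) -> str:
    """Return the 16-character key ID for a version 4 key fingerprint."""
    fingerprint = bytes.fromhex(fingerprint_hex)
    if len(fingerprint) != 20:
        raise ValueError("expected a 160-bit version 4 fingerprint")
    # The key ID is the last eight bytes of the fingerprint
    return fingerprint[-8:].hex().upper()


if __name__ == "__main__":
    print(key_id_from_fingerprint("2CCAE4781C90BBFDCB830EB719FE266F132D8430"))
```

This is why the last 16 characters of the fingerprint line match the key identifier shown on the first line of the gpg output.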

Exporting Public Keys

In order for NiFi to perform encryption operations, it is necessary to export the public key from the GPG keyring. The following command exports the public key to a file named /tmp/nifi-flow.public.key encoded using ASCII Armor:

gpg --export --armor nifi-flow > /tmp/nifi-flow.public.key

The file contains Base64-encoded lines with standard header and footer lines indicating the PGP public key contents.

Exporting Private Keys

NiFi requires a private key to decrypt OpenPGP messages. The following command exports the private key to a file named /tmp/nifi-flow.private.key encoded using ASCII Armor and protected using the provided passphrase:

gpg --export-secret-keys --armor nifi-flow > /tmp/nifi-flow.private.key

Although the file is protected with a passphrase, read permissions on the file should be restricted to the NiFi user.
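When key files are provisioned by a script, the permissions can be tightened programmatically. The following Python sketch limits access to the owning account on POSIX systems. It assumes the script runs as the account that owns the file and that this account is the NiFi user, and the function name is illustrative.

```python
import os
import stat


def restrict_to_owner(path: str) -> None:
    """Allow read and write access for the file owner only (mode 0600)."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


if __name__ == "__main__":
    key_path = "/tmp/nifi-flow.private.key"
    if os.path.exists(key_path):
        restrict_to_owner(key_path)
        print(oct(stat.S_IMODE(os.stat(key_path).st_mode)))
```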

Encrypting Files using GPG

To provide an encrypted input file for testing the DecryptContentPGP Processor, run the following command to generate a file containing a random string:

uuidgen > /tmp/generated

Run the following GPG command to encrypt the file using the nifi-flow public key with ASCII Armor encoding:

gpg --encrypt --armor --recipient nifi-flow /tmp/generated

The GPG command leaves the input file unchanged and creates a new encrypted file with the following name:

/tmp/generated.asc

Decrypting Files using DecryptContentPGP

The DecryptContentPGP Processor requires an incoming connection for processing, and the GetFile Processor provides a simple method for reading input files.

Configure a GetFile Processor with the following properties and values:

Configure a StandardPGPPrivateKeyService Controller Service with the following properties and values, substituting KEY PASSWORD with the passphrase entered during key pair generation:

Enable the Controller Service after entering the required properties.

Configure a DecryptContentPGP Processor with the following properties:

To log attributes after decryption processing, configure a LogAttribute Processor with the success relationship selected for automatic termination. The Bulletin Level option should be set to INFO. This LogAttribute configuration is useful for the purposes of demonstration, but should not be used for production flows.

After configuring each Processor, connect the GetFile relationship named success to DecryptContentPGP, and connect both the success and failure relationships from DecryptContentPGP to LogAttribute.

Start the DecryptContentPGP and LogAttribute Processors, and use the Run Once option on GetFile to trigger processing. The LogAttribute Processor should generate a Bulletin Board entry that includes the standard FlowFile attributes as well as the following OpenPGP attributes after successful decryption:

The pgp.literal.data.filename attribute contains the name of the file prior to encryption. The pgp.literal.data.modified attribute contains the timestamp in milliseconds when the file was encrypted. The pgp.symmetric.key.algorithm.id attribute contains the numeric identifier of the Symmetric-Key Algorithm that the originator used for encryption.
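The numeric attribute values are easier to read once translated. The Python sketch below maps symmetric-key algorithm identifiers to names using the registry in RFC 4880 section 9.2 and the Camellia additions in RFC 5581, and converts a millisecond timestamp to a UTC date. The function names and the label format are illustrative and are not produced by NiFi.

```python
from datetime import datetime, timedelta, timezone

# Symmetric-key algorithm identifiers from RFC 4880 and RFC 5581
SYMMETRIC_KEY_ALGORITHMS = {
    0: "PLAINTEXT",
    1: "IDEA",
    2: "TRIPLE_DES",
    3: "CAST5",
    4: "BLOWFISH",
    7: "AES_128",
    8: "AES_192",
    9: "AES_256",
    10: "TWOFISH",
    11: "CAMELLIA_128",
    12: "CAMELLIA_192",
    13: "CAMELLIA_256",
}


def describe_algorithm(attribute_value: str) -> str:
    """Translate the attribute string into an algorithm name."""
    return SYMMETRIC_KEY_ALGORITHMS.get(int(attribute_value), "UNKNOWN")


def modified_to_utc(attribute_value: str) -> datetime:
    """Convert a millisecond timestamp attribute into a UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(milliseconds=int(attribute_value))


if __name__ == "__main__":
    print(describe_algorithm("9"))
    print(modified_to_utc("1631577600000").isoformat())
```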

Encrypting Files using EncryptContentPGP

The EncryptContentPGP Processor requires an incoming connection, and the GenerateFlowFile Processor is a convenient option for producing files.

Configure a GenerateFlowFile Processor with the following properties and values, including a custom filename property:

Configure a StandardPGPPublicKeyService Controller Service with the following properties:

Enable the Controller Service after entering the required properties.

Configure an EncryptContentPGP Processor with the following properties:

Configure a LogAttribute Processor with the Bulletin Level option set to INFO.

Configure a PutFile Processor with the following properties, and select both the success and failure relationships for automatic termination:

Connect the GenerateFlowFile relationship named success to EncryptContentPGP, and connect both the success and failure relationships from EncryptContentPGP to LogAttribute. Connect the success relationship from LogAttribute to PutFile.

Start the EncryptContentPGP and LogAttribute Processors along with PutFile, then use the Run Once option on GenerateFlowFile to trigger processing. The LogAttribute Processor should generate a Bulletin Board entry that includes the following OpenPGP attributes after successful encryption:

The PutFile Processor should write an encrypted file to the following location:

/tmp/generate-flow-file

Decrypting Files using GPG

Run the following command to decrypt and display the contents of the file that NiFi encrypted using EncryptContentPGP:

gpg --decrypt /tmp/generate-flow-file

The GPG command should prompt for the passphrase entered when generating the key pair, and then print the following information along with the content entered in GenerateFlowFile:

gpg: encrypted with 3072-bit RSA key, ID 19FE266F132D8430, created 2021-09-14
  "nifi-flow"
Lorem ipsum dolor sit amet

Removing Generated Keys

Removing keys from the internal GPG keyring requires running several commands. After completing the interoperation steps described, the following command can be used to remove the generated private key:

gpg --delete-secret-keys nifi-flow

Press y when prompted to confirm deletion of the selected key. After removing the private key, the following command can be used to delete the corresponding public key:

gpg --delete-keys nifi-flow

Press y when prompted to confirm deletion. Run the following command to confirm removal of the generated private key:

gpg --list-secret-keys

Conclusion

The new OpenPGP components released in NiFi 1.14.0 bring improvements to both flow configuration and processing capabilities. The EncryptContent Processor remains functional, but existing flows should be migrated to use EncryptContentPGP and DecryptContentPGP for all OpenPGP message handling. The new Controller Services supporting these Processors provide optimized access to public and private keys, avoiding configuration issues inherent in the EncryptContent design. More development effort is necessary to implement message signing and verification, but the new Controller Services provide a starting point for future work. Building on the foundation of the Bouncy Castle library, NiFi can support interoperation with a variety of OpenPGP applications.