Rebooting Bootstrap for Apache NiFi 2
Structural Background
The bootstrap module for Apache NiFi has provided process control as a foundational feature of the application since the initial release of version 0.0.1. Shell scripts designed for different platforms provide user-facing commands, but as the name implies, the bootstrap module manages the lifecycle of the Java Virtual Machine process for the application. This layered implementation strategy keeps shared capabilities in Java code, avoiding additional complexity in scripts specific to each supported platform. From preparing the application class path to formatting command arguments, the bootstrap module ensures that the NiFi launch process is both consistent and configurable across supported operating systems.
Introduction
The redesigned bootstrap process for Apache NiFi 2 combines modernized process handling with standardized HTTP monitoring, providing a robust structure for various deployment strategies. The rebuilt bootstrap components maintain support for existing shell script commands, minimizing the direct impact on users while introducing important structural improvements. Significant refactoring of bootstrap components also involved restructuring the framework runtime module with status reporting over HTTP.
The historical approach to loading application properties required unclear coupling between the framework and bootstrap modules, leading to confusing errors that did not appear during the standard build process. With a decoupled loading process, the bootstrap and framework modules now have a clearer separation of responsibilities. Refactoring code that has evolved over almost a decade presented a number of challenges. The release of a new major version provided the opportunity to bring a fresh approach to the NiFi bootstrap process. Revisiting and rebuilding application startup is neither glamorous nor trivial, but streamlined startup benefits current users, and also provides an example for applications with similar process management concerns.
Inception and Evolution
The bootstrap module originated under project issue NIFI-145 as a strategy for configuring the arguments required to launch the application process. The bootstrap.conf configuration file supports defining an extensible list of arguments, and the bootstrap process constructs the command to be executed. The basic purpose and configuration structure for the bootstrap process remain the same in Apache NiFi 2.
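The idea of turning a list of configured arguments into a launch command can be sketched in a few lines of Java. This is an illustration under stated assumptions rather than the NiFi implementation: the java and java.arg. property names, the BootstrapCommandSketch class, and the example.Application main class are placeholders chosen for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.TreeMap;

final class BootstrapCommandSketch {
    // Assumed property prefix for JVM arguments; illustrative only
    static final String ARGUMENT_PREFIX = "java.arg.";

    // Builds a command from configuration text and a placeholder main class
    static List<String> buildCommand(String configuration, String mainClass) throws IOException {
        Properties properties = new Properties();
        properties.load(new StringReader(configuration));

        // Sort arguments by property name for a deterministic command.
        // Sorting is lexical, so java.arg.10 sorts before java.arg.2.
        TreeMap<String, String> arguments = new TreeMap<>();
        for (String name : properties.stringPropertyNames()) {
            if (name.startsWith(ARGUMENT_PREFIX)) {
                arguments.put(name, properties.getProperty(name));
            }
        }

        List<String> command = new ArrayList<>();
        // Assumed property naming the Java executable, defaulting to java
        command.add(properties.getProperty("java", "java"));
        command.addAll(arguments.values());
        command.add(mainClass);
        return command;
    }

    public static void main(String[] args) throws IOException {
        String configuration = "java=java\njava.arg.1=-Xms1g\njava.arg.2=-Xmx1g\n";
        System.out.println(buildCommand(configuration, "example.Application"));
    }
}
```

Adding another argument requires only a new property with the same prefix, which is what makes the list extensible without changing the launching code.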
Over the course of multiple releases, NiFi added support for new bootstrap features, including lifecycle notifications in version 0.3.0, providing methods for alerting when the application started or stopped. NiFi 1.0.0 introduced support for encrypted properties in configuration files, requiring significant changes to the startup process. Subsequent versions introduced application property loading from external services, which removed the need for a local encryption key, but required more complex configuration and a number of additional Java libraries.
Restructuring
Additional bootstrap features resulted in significant expansion of bundled dependencies, growing the size and scope of the standard NiFi distribution. Although these features experienced some adoption, the weight of maintaining numerous classes and libraries for optional features prompted their deprecation in the final versions of NiFi 1, and subsequent removal from NiFi 2. NiFi 2 maintains support for encrypting or externalizing sensitive flow configuration properties, but protecting application properties is outside the scope of features as of NiFi 2.6.0. For process monitoring, NiFi 2 removed lifecycle notification handling, but introduced new status features accessible over HTTP.
Redesigned Status Communication
The historical communication approach between the bootstrap process and the application process consisted of a BootstrapListener in the application process and a NiFiListener in the bootstrap process. Both classes supported a simple protocol over TCP that allowed the bootstrap process to send commands to the application, and allowed the application to provide launch status to bootstrap. This approach built on standard Java socket capabilities and avoided potential platform challenges with handling process signals. With the introduction of additional bootstrap commands for cluster status and diagnostics, the implementation grew more complicated, without providing programmatic access to these status details.
NiFi 2 replaced the socket protocol with a new implementation built on the simple HttpServer class, first introduced as a standard component in Java 1.6. Although not intended for high performance, the simple HttpServer provides an ideal solution for implementing basic HTTP services without additional libraries. NiFi uses Eclipse Jetty for standard operations, but with purposeful decoupling of the bootstrap, runtime, and framework modules, avoiding external dependencies during initialization is essential. With HttpServer, the HttpHandler interface is the primary extension point for implementing HTTP services with a single handle method for processing an HttpExchange with request and response properties.
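The relationship between HttpServer, HttpHandler, and HttpExchange can be shown with a minimal sketch. The HealthServerSketch and HealthHandler names are hypothetical, and the handler is far simpler than anything in NiFi. It shows only the extension point described above.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

final class HealthServerSketch {
    // Hypothetical handler that reports HTTP 200 without a response body
    static final class HealthHandler implements HttpHandler {
        @Override
        public void handle(HttpExchange exchange) throws IOException {
            // A response length of -1 indicates that no body will be sent
            exchange.sendResponseHeaders(200, -1);
            exchange.close();
        }
    }

    // Starts a server on the loopback address; port 0 selects an ephemeral port
    static HttpServer start(int port) throws IOException {
        InetSocketAddress address = new InetSocketAddress("127.0.0.1", port);
        // A backlog of 0 selects the system default
        HttpServer server = HttpServer.create(address, 0);
        server.createContext("/health", new HealthHandler());
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start(0);
        System.out.println("Listening on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```

Everything used here ships with the JDK, which is the property that matters for a component running before the rest of the application has loaded.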
The StandardManagementServer class in the framework runtime module manages an instance of HttpServer and registers several HttpHandler implementations that provide health and status. The Management Server defaults to listening on the localhost address with port 52020, and logs the URL on startup. Application status is available at the following default URL:
http://127.0.0.1:52020/health
The Management Server returns an HTTP 200 status code when the application is running.
Application cluster status is available at the following path under the default URL:
http://127.0.0.1:52020/health/cluster
The Management Server returns an HTTP 200 status code when the application is connected or connecting to a cluster, and returns an HTTP 503 in other cases.
As implementation details of process communication, these HTTP services are not associated with or published in the REST API documentation. However, these HTTP resources support simple status checking for capabilities such as liveness probes for Kubernetes.
As the intended consumer for these HTTP services, the redesigned bootstrap process uses the standard Java HttpClient to check application status. Introduced in Java 11, HttpClient provides an improved alternative to the historical HttpURLConnection class. With HttpServer and HttpClient both available in modern versions of Java, bootstrap communication in NiFi 2 provides a complete solution using HTTP without additional dependencies.
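A status check with HttpClient might look like the following sketch. The HealthCheckSketch class, the five-second timeouts, and the choice to treat connection failures as "not running" are assumptions for illustration, not details taken from the NiFi bootstrap code.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class HealthCheckSketch {
    // Assumed timeout for both connecting and receiving a response
    private static final Duration TIMEOUT = Duration.ofSeconds(5);

    // Returns true when the health resource responds with HTTP 200
    static boolean isRunning(URI healthUri) throws InterruptedException {
        HttpClient client = HttpClient.newBuilder().connectTimeout(TIMEOUT).build();
        HttpRequest request = HttpRequest.newBuilder(healthUri).timeout(TIMEOUT).GET().build();
        try {
            HttpResponse<Void> response = client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() == 200;
        } catch (IOException e) {
            // Connection failures and timeouts are treated as not running
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Default Management Server address described in this article
        URI healthUri = URI.create("http://127.0.0.1:52020/health");
        System.out.println("Running: " + isRunning(healthUri));
    }
}
```

The same method works for the cluster status path, where an HTTP 503 response would produce a false result.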
Refactored Process Tracking
With Java 21 as the minimum required version, NiFi 2 incorporates refactored process tracking using the java.lang.ProcessHandle interface introduced in Java 9.
Following common practices, the original bootstrap command used a file to track the application process identifier. The bootstrap command also used a separate file to track the application control port for status communication. Rebuilding status communication using HTTP eliminated the need for tracking the application control port, but tracking the application process identifier required a different strategy.
Rather than maintaining a file for PID tracking, the bootstrap command in NiFi 2 uses a strategy that iterates over ProcessHandle.allProcesses() and searches for a process argument that matches the expected location of the nifi.properties configuration. This approach works on macOS and Linux where instances of ProcessHandle.Info return values for the process arguments.
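The search over running processes can be approximated with a short stream pipeline. The ProcessFinderSketch class and the substring comparison are assumptions for illustration, and the file path in the usage example is a placeholder. The matching rules in NiFi may differ.

```java
import java.util.Arrays;
import java.util.Optional;

final class ProcessFinderSketch {
    // Finds the first visible process with an argument containing the expected value
    static Optional<ProcessHandle> findByArgument(String expectedValue) {
        return ProcessHandle.allProcesses()
                .filter(handle -> hasArgument(handle, expectedValue))
                .findFirst();
    }

    private static boolean hasArgument(ProcessHandle handle, String expectedValue) {
        // Arguments are optional because some platforms do not provide them
        return handle.info().arguments()
                .map(arguments -> Arrays.stream(arguments).anyMatch(argument -> argument.contains(expectedValue)))
                .orElse(false);
    }

    public static void main(String[] args) {
        // Placeholder path for the configuration file location
        Optional<ProcessHandle> found = findByArgument("/opt/nifi/conf/nifi.properties");
        System.out.println(found.map(ProcessHandle::pid).map(String::valueOf).orElse("not found"));
    }
}
```

Because ProcessHandle.Info returns an Optional for arguments, a platform that provides no arguments simply yields no match, which is why a different approach is needed there.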
As described in JDK-8176725, the ProcessHandle implementation for Microsoft Windows does not return anything for process arguments, requiring an alternative solution for locating the NiFi application process. Instead of falling back to PID file tracking, changes in NiFi 2.3.0 for NIFI-14156 introduced a solution based on the VirtualMachine class. The VirtualMachine.list() method returns descriptors for running Java Virtual Machine processes, which provide access to associated Java System Properties. Building on this foundation, the NiFi bootstrap process locates the application process based on matching the nifi.properties location set as a System Property.
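One way to read System Properties from the listed descriptors is to attach to each Java Virtual Machine, as in the sketch below. The sketch assumes the jdk.attach module is available, and the VirtualMachineFinderSketch class is hypothetical. The property name and path in the usage example are placeholders, and the NiFi implementation may take a different route to the same information.

```java
import com.sun.tools.attach.AttachNotSupportedException;
import com.sun.tools.attach.VirtualMachine;
import com.sun.tools.attach.VirtualMachineDescriptor;
import java.io.IOException;
import java.util.Optional;
import java.util.Properties;

final class VirtualMachineFinderSketch {
    // Returns the identifier of the first JVM with a matching System Property
    static Optional<String> findProcessId(String propertyName, String expectedValue) {
        String currentId = Long.toString(ProcessHandle.current().pid());
        for (VirtualMachineDescriptor descriptor : VirtualMachine.list()) {
            if (currentId.equals(descriptor.id())) {
                continue; // Attaching to the current JVM is not permitted by default
            }
            try {
                VirtualMachine virtualMachine = VirtualMachine.attach(descriptor);
                try {
                    Properties properties = virtualMachine.getSystemProperties();
                    if (expectedValue.equals(properties.getProperty(propertyName))) {
                        return Optional.of(descriptor.id());
                    }
                } finally {
                    virtualMachine.detach();
                }
            } catch (AttachNotSupportedException | IOException e) {
                // Skip JVMs that cannot be attached or queried
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder property name and value for illustration
        System.out.println(findProcessId("example.properties.path", "/opt/example/conf/app.properties"));
    }
}
```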
Standardized Process Handling
Building on tracking improvements using the ProcessHandle interface, NiFi 2 also eliminated use of operating system
ps and kill commands for status and shutdown.
The ProcessHandle.destroy() method requests normal termination of the associated process and returns whether the request succeeded. The ProcessHandle.destroyForcibly() method requests forced termination. After calling either method, the onExit() method returns a CompletableFuture that can be used to wait for a selected duration for completion of process shutdown.
Building on these methods, the NiFi 2 bootstrap command attempts normal shutdown and waits for a configurable graceful
shutdown period. If the application does not complete resource closure operations within the graceful shutdown period,
the bootstrap command initiates forced destruction of the process. Using the destroy methods on ProcessHandle
provides a common implementation that works across platforms without calling external commands.
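The sequence of normal termination, a bounded wait, and forced termination can be sketched as follows. The ProcessStopSketch class, the method names, and the use of the same period for both waits are assumptions for illustration. The usage example starts a sleep command, which assumes a Unix-like system.

```java
import java.io.IOException;
import java.time.Duration;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ProcessStopSketch {
    // Returns true when the process exited after normal or forced termination
    static boolean stop(ProcessHandle handle, Duration gracefulPeriod) throws InterruptedException {
        // Request normal termination and wait for the graceful period
        handle.destroy();
        if (exited(handle, gracefulPeriod)) {
            return true;
        }
        // Fall back to forced termination when the graceful period elapses
        handle.destroyForcibly();
        return exited(handle, gracefulPeriod);
    }

    private static boolean exited(ProcessHandle handle, Duration timeout) throws InterruptedException {
        try {
            handle.onExit().get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (ExecutionException e) {
            return !handle.isAlive();
        }
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // The sleep command is assumed to exist on the current platform
        Process process = new ProcessBuilder("sleep", "30").start();
        System.out.println("Stopped: " + stop(process.toHandle(), Duration.ofSeconds(5)));
    }
}
```

Waiting on onExit() avoids polling isAlive() in a loop and keeps the timeout handling in one place.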
Streamlined Process Execution
In addition to application command preparation, the NiFi bootstrap process also monitors the application process, attempting to restart the application process in the event of an unexpected shutdown. Monitoring and conditional restart are standard behaviors for the nifi.sh start command. The nifi.sh run command starts the application process, but does not perform monitoring. Prior to NiFi 2, the bootstrap process continued running regardless of the initialization command. NiFi 2 changed the contract between the nifi.sh script and the bootstrap process for the run command, eliminating the need for the persistent bootstrap process.
Although the bootstrap process requires a minimal amount of memory, running more than one process in cases such as
containerized deployments is not optimal. With changes in NiFi 2, the nifi.sh run command uses bootstrap to prepare
the Java command for direct execution, allowing the operating system to manage the lifecycle of the application process.
The standard container image for NiFi 2 uses this approach to launch the application, avoiding unnecessary memory
consumption and simplifying process tracking. Combining this approach with HTTP status monitoring supports improved
resource allocation for scalable NiFi clusters.
Conclusion
Application startup and shutdown are basic operations that define the environment for the duration of system operation. Modern frameworks provide common patterns for lifecycle handling, often obviating the need for direct process control. For systems such as Apache NiFi, with support for extensible bootstrap configuration across multiple platforms, maintaining robust lifecycle management is critical to application stability. Rebuilding on the foundation of Java 21 enabled NiFi 2 to eliminate historical workarounds, reduce external dependencies, maintain core capabilities, and introduce standardized status monitoring. Although less visible than other improvements, the refreshed bootstrap implementation highlights the value of revisiting historical approaches and adapting to current alternatives.