Introducing Apache NiFi HTTP Request Logging
Background
In 1993, the National Center for Supercomputing Applications released a server that set the standard for hosting web resources. Although development of the NCSA HTTPd server ended in 1995, the source code provided the foundation for the Apache HTTP Server Project. The NCSA server supported logging of HTTP requests using a pattern known as the Common Log Format. Apache HTTP Server and other web servers supported the format, which became the de facto standard for logging request attributes and response status. The Common Log Format reflects its historical roots, but it continues to provide a basis for evaluating HTTP server communication.
Introduction
One of the core capabilities of Apache NiFi is data flow management through a web interface. NiFi enables both direct user interaction and external integration through an extensive REST API. To support these features, NiFi includes an embedded Jetty web server. Although NiFi supports extensive logging for data processing and user interaction, releases prior to 1.16.0 did not support configurable HTTP request logging.
The nifi-user.log file tracks authentication and authorization decisions, but log messages do not follow a standard pattern and lack important details in some cases. NiFi does not log access to static resources, such as documentation, to the nifi-user.log file, so previous releases made it difficult to determine the full scope of HTTP communication. For clustered installations, or deployments protected behind a proxy server, troubleshooting the source of an HTTP request is subject to additional challenges.
NiFi 1.16.0 introduces a new configurable approach to HTTP request logging, improving system observability through detailed request and response properties. The logging implementation leverages standard Jetty request logging features to provide pattern definitions that should be familiar to those with experience configuring web server logging. The request logging strategy integrates with the standard Logback configuration, allowing flexible destinations and retention policies. With straightforward default settings and documented custom options, NiFi HTTP request logging provides a simple and powerful way to monitor the health and status of system communication.
Implementation
HTTP request logging in NiFi involves the following essential elements:
- Jetty Request Log interface for processing HTTP requests and responses
- Jetty Custom Request Log implementation providing message formatting
- SLF4J Request Log Writer with logger named org.apache.nifi.web.server.RequestLog
- Logback Appender for writing messages to nifi-request.log
With a basic understanding of these implementation elements, it is possible to adjust the configuration to meet the requirements of various deployment environments.
Configuration Settings
NiFi HTTP request logging consists of the Logback configuration and a format configuration in NiFi properties. The Logback configuration controls the output destination and retention policy, and the NiFi properties configuration controls the format of each log record.
Logback Configuration
The default Logback configuration for NiFi includes the following file appender definition for persisting HTTP request logs:
<appender name="REQUEST_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>nifi-request.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>nifi-request_%d.log</fileNamePattern>
        <maxHistory>30</maxHistory>
    </rollingPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%msg%n</pattern>
    </encoder>
</appender>
The default appender writes each log message on a single line to a file named nifi-request.log, located in the standard NiFi logs directory. The default rolling policy creates a new file every day for a maximum of 30 days. This policy follows the same default strategy as other NiFi logs. The default encoder defines a pattern that consists of the message %msg element and a new line %n element, relying on the message to include other important elements such as the timestamp.
The default Logback configuration also includes the following logger definition for routing request log messages to the request file appender:
<logger name="org.apache.nifi.web.server.RequestLog" level="INFO" additivity="false">
    <appender-ref ref="REQUEST_FILE"/>
</logger>
Upgrade Considerations
When upgrading from versions of NiFi prior to 1.16.0, it is important to define a logger for org.apache.nifi.web.server.RequestLog to avoid mixing HTTP request logs with other application messages. With INFO as the level for HTTP request logs, nifi-app.log is the default location unless otherwise specified. Using the updated logback.xml from the 1.16.0 release, or adding the logger and appender elements to an existing configuration, is sufficient to ensure expected routing of HTTP request logs after upgrading.
Disabling HTTP Request Logging
For some use cases, such as deployment behind an HTTP gateway, NiFi HTTP request logging may not be useful enough to warrant the additional stream of information. Changing the logger level attribute to OFF in the NiFi Logback configuration disables HTTP request logging, as shown in the following logger definition:
<logger name="org.apache.nifi.web.server.RequestLog" level="OFF" />
Format Configuration
The request log format can be configured in nifi.properties using the following property:
nifi.web.request.log.format
The property supports configurable format elements as defined in the Jetty Custom Request Log documentation.
NiFi 1.16.0 includes the following default value for the format property:
%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i"
The default property value follows the Combined Log Format, which appends the HTTP Referer and User-Agent headers to the Common Log Format.
Default Log Format Elements
The following log provides an example of the default format configuration as written to nifi-request.log:
127.0.0.1 - user [15/Apr/2022:12:00:00 +0000] "GET /nifi-api/ HTTP/1.1" 200 2048 "-" "curl/7.82.0"
The example record can be interpreted as follows:
- 127.0.0.1: Client Internet Protocol address
- -: Placeholder for RFC 1413 identification
- user: Username provided during authentication
- [15/Apr/2022:12:00:00 +0000]: Timestamp with timezone offset from GMT
- GET: HTTP method requested
- /nifi-api/: Resource path requested
- HTTP/1.1: HTTP protocol version requested
- 200: HTTP response status code returned
- 2048: HTTP response body size in bytes
- "-": HTTP Referer request header or placeholder when not provided
- "curl/7.82.0": HTTP User-Agent request header or placeholder when not provided
Information logged from client requests is not subject to filtering or sanitization. Log processing systems should enforce standard ranges for expected values to avoid problems related to malicious requests.
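As an illustration of how a log processing system might read the default format and enforce an expected range, the following Python code is a minimal sketch rather than a definitive parser. The pattern, the parse_record function, and the field names are hypothetical and not part of NiFi or Jetty. The sketch assumes that user names may contain spaces, as with certificate distinguished names, and that header values do not contain embedded double quotes.

```python
import re

# Hypothetical pattern for records in the default Combined Log Format.
# The user field is matched lazily because identities may contain spaces.
COMBINED_PATTERN = re.compile(
    r'^(?P<client>\S+) (?P<ident>\S+) (?P<user>.*?) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"$'
)


def parse_record(line):
    """Return a dictionary of fields, or None when the line is not acceptable."""
    match = COMBINED_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    status = int(record["status"])
    # Reject status codes outside the range defined for HTTP responses
    if not 100 <= status <= 599:
        return None
    record["status"] = status
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


record = parse_record(
    '127.0.0.1 - user [15/Apr/2022:12:00:00 +0000] '
    '"GET /nifi-api/ HTTP/1.1" 200 2048 "-" "curl/7.82.0"'
)
```

Lines that do not match the pattern return None, which allows a processing pipeline to set aside unexpected records instead of passing unvalidated values downstream.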
The following log provides an example of the default format, including the Referer HTTP header, which most browsers provide when making HTTP requests:
127.0.0.1 - user [15/Apr/2022:12:00:00 +0000] "GET /nifi-api/flow/current-user HTTP/1.1" 200 370 "https://localhost:8443/nifi/" "Mozilla/5.0 (X11; Linux x86_64; rv:99.0) Gecko/20100101 Firefox/99.0"
Custom Log Format Codes
The Jetty Custom Request Log supports several format codes that can be configured to gather additional HTTP request information.
The CustomRequestLog documentation provides a complete description of available format codes. Adjusting the request log format codes requires restarting NiFi to apply the changes.
Request Timestamp Formatting
The %t format code supports customization of the request timestamp. The format code supports specifying a format, timezone, and locale. The following property configuration provides an example of an ISO 8601 date and time with millisecond precision in Coordinated Universal Time:
%{client}a - %u %{yyyy-MM-dd'T'HH:mm:ss.SSS'Z'|UTC}t "%r" %s %O "%{Referer}i" "%{User-Agent}i"
The following log provides an example of the timestamp:
127.0.0.1 - user [2022-04-15T12:00:00.500Z] "GET /nifi-api/ HTTP/1.1" 200 2048 "-" "curl/7.82.0"
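One advantage of the ISO 8601 pattern is that standard libraries can parse it without custom month-name handling. The following Python code is a minimal sketch of reading the timestamp from the example record; the parse_request_timestamp function is a hypothetical helper, and it assumes the literal Z suffix always indicates UTC, as configured in the property above.

```python
from datetime import datetime, timezone


def parse_request_timestamp(value):
    """Parse a timestamp written with the pattern yyyy-MM-dd'T'HH:mm:ss.SSS'Z'."""
    # The Z is matched as a literal character, so the timezone is attached explicitly
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


timestamp = parse_request_timestamp("2022-04-15T12:00:00.500Z")
```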
Request Processing Duration
The %T format code logs the amount of time taken to process and return an HTTP request, in seconds. Because most requests take less than one second to process, the ms unit specification is more useful, providing the processing duration in milliseconds.
The following property configuration appends the request processing duration in milliseconds to the default format:
%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i" %{ms}T
Using %D or %{us}T provides the request processing duration in microseconds.
Request Body Size
The %I format code supports tracking the size of HTTP requests sent to the server. The information is useful in conjunction with the request processing duration, as larger requests often take longer to process.
The following property configuration appends both the request processing duration and the request size to the default format:
%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i" %{ms}T %I
HTTP GET requests typically do not include a body, so the logged request body size will be 0. Most HTTP POST requests include a body, and its size will be logged.
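With the duration and size appended as the last two fields, a log processor can read them without parsing the rest of the record. The following Python code is a minimal sketch under that assumption. The function names, the threshold, and the sample records are hypothetical and were not produced by a running NiFi instance.

```python
def parse_duration_and_size(line):
    """Split the trailing %{ms}T and %I fields from a request log record."""
    # Assumes the configured format ends with: %{ms}T %I
    prefix, duration_ms, request_bytes = line.rstrip().rsplit(" ", 2)
    return prefix, int(duration_ms), int(request_bytes)


def find_slow_requests(lines, threshold_ms):
    """Return (duration, request size, record prefix) for slow requests."""
    slow = []
    for line in lines:
        prefix, duration_ms, request_bytes = parse_duration_and_size(line)
        if duration_ms >= threshold_ms:
            slow.append((duration_ms, request_bytes, prefix))
    return slow


# Hypothetical records for illustration only
SAMPLE_LINES = [
    '127.0.0.1 - user [15/Apr/2022:12:00:00 +0000] '
    '"GET /nifi-api/ HTTP/1.1" 200 2048 "-" "curl/7.82.0" 12 0',
    '127.0.0.1 - user [15/Apr/2022:12:00:01 +0000] '
    '"POST /nifi-api/example HTTP/1.1" 201 1024 "-" "curl/7.82.0" 340 512',
]

slow_requests = find_slow_requests(SAMPLE_LINES, threshold_ms=100)
```

Splitting from the right avoids ambiguity caused by spaces inside the quoted request line and header values earlier in the record.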
Request Proxied Entities
NiFi supports the concept of proxying requests on behalf of an authenticated entity. When NiFi is deployed in a cluster, individual nodes are responsible for replicating some HTTP requests to other nodes in order to access distributed information or maintain a consistent configuration.
NiFi cluster communication uses mutual TLS with X.509 certificates for authenticating nodes, and uses the following HTTP request header to send client entity information when replicating requests:
X-ProxiedEntitiesChain
Logging client entity information enables traceability to the client that initiated the original HTTP request. NiFi also supports receiving and authenticating requests through gateway proxy servers, using the same HTTP request header.
The following property configuration appends the header containing the chain of entities to the request log:
%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i" %{X-ProxiedEntitiesChain}i
The request log will contain the standard - character as a placeholder when the header is not present. The header log encloses each entity using the left angle bracket < and right angle bracket > characters.
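Because each entity is enclosed in angle brackets, the logged header value can be split into an ordered list of identities. The following Python code is a minimal sketch of that step. The parse_proxied_entities function and the sample identities are hypothetical, and the sketch does not attempt to handle any escaping or encoding that may be applied to identities containing special characters.

```python
import re


def parse_proxied_entities(header_value):
    """Return the ordered list of entities from a logged chain value."""
    # The - placeholder indicates that the request did not include the header
    if header_value == "-":
        return []
    return re.findall(r"<([^<>]*)>", header_value)


# Hypothetical chain with an end user followed by a proxying node
entities = parse_proxied_entities("<user><CN=nifi-node-1>")
```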
Request Replication
NiFi uses several custom HTTP headers when replicating requests across cluster nodes. The following HTTP request header provides a unique identifier for tracing a transaction between cluster nodes:
X-RequestTransactionId
The transaction identifier consists of a random UUID, generated on the node initiating the request replication.
The following property configuration appends the transaction identifier header to the request log:
%{client}a - %u %t "%r" %s %O "%{Referer}i" "%{User-Agent}i" %{X-RequestTransactionId}i
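When request logs from several cluster nodes are collected in one place, the transaction identifier can be used to correlate the records that belong to the same replicated request. The following Python code is a minimal sketch that assumes the header is the last field of each record, as in the property configuration above. The function name, sample records, and identifier values are hypothetical.

```python
from collections import defaultdict


def group_by_transaction(lines):
    """Group request log records by the trailing transaction identifier."""
    groups = defaultdict(list)
    for line in lines:
        prefix, transaction_id = line.rstrip().rsplit(" ", 1)
        # Skip records where the - placeholder shows the header was absent
        if transaction_id != "-":
            groups[transaction_id].append(prefix)
    return dict(groups)


# Hypothetical records gathered from two nodes, for illustration only
SAMPLE_LINES = [
    '10.0.0.1 - user [15/Apr/2022:12:00:00 +0000] "GET /nifi-api/ HTTP/1.1" '
    '200 2048 "-" "-" 00000000-0000-4000-8000-000000000001',
    '10.0.0.2 - user [15/Apr/2022:12:00:00 +0000] "GET /nifi-api/ HTTP/1.1" '
    '200 2048 "-" "-" 00000000-0000-4000-8000-000000000001',
    '127.0.0.1 - user [15/Apr/2022:12:00:01 +0000] "GET /nifi-api/ HTTP/1.1" '
    '200 2048 "-" "curl/7.82.0" -',
]

transactions = group_by_transaction(SAMPLE_LINES)
```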
Conclusion
NiFi HTTP request logging enables a number of new monitoring strategies and provides flexibility to support a variety of deployment scenarios. With a default configuration that follows standard format conventions, HTTP request logs can be processed using available log analysis systems. Building on the documented format code features of Jetty, NiFi 1.16.0 also supports enhanced request tracking for both standalone and clustered installations.