Building OpenTelemetry Collection in Apache NiFi with Netty
Background
OpenTelemetry supports the primary pillars of software observability using a common protocol with implementations in multiple programming languages. With standard specifications for logs, metrics, and traces, OpenTelemetry presents a modern alternative to historical de facto solutions and various vendor-based strategies. For teams that understand observability concepts and benefits, a unified approach to application performance monitoring is essential to a stable and secure system. The OpenTelemetry Protocol (OTLP) specification defines the foundational building blocks for telemetry encoding and transmission, enabling integration with both OpenTelemetry components and external services. With support for gRPC or HTTP transmission, the OTLP specification enables straightforward interoperation across the landscape of languages and services.
Introduction
Apache NiFi introduced the ListenOTLP Processor in versions 2.0.0-M1 and 1.24.0. ListenOTLP combines several important characteristics of the OTLP specification in a single component with straightforward default settings. The Netty framework supports several NiFi components and also forms the basis for micro-batched processing of OpenTelemetry in the ListenOTLP Processor. With reusable components for HTTP and TLS, Netty combines high performance with a pluggable design that enables ListenOTLP to support various aspects of the OpenTelemetry Protocol without introducing excessive complexity. Coupled with Jackson for JSON processing and the HubSpot library for handling Protocol Buffers, the NiFi solution for OpenTelemetry collection combines security, performance, and ease of configuration.
OpenTelemetry Protocol Details
Reviewing the scope and features of OTLP provides helpful background for understanding the implementation details of the ListenOTLP Processor.
OTLP defines gRPC and HTTP as the supported modes of transport for OpenTelemetry information. The gRPC protocol itself builds on HTTP/2 and defines structured communications using Protocol Buffers. OTLP defines requests and responses using Protobuf definitions that serve as interface definitions regardless of the implementing language. The Protobuf structures also define the model for encoding telemetry as JSON for transmission over HTTP.
Summarizing OTLP Specification 1.0.0, OpenTelemetry can be transmitted using any of the following strategies:
- gRPC
- Protobuf over HTTP
- JSON over HTTP
Considering transmission from a layered perspective, all OpenTelemetry communication occurs over HTTP, with HTTP/2 supporting gRPC, Protobuf, or JSON, and HTTP/1.1 supporting Protobuf or JSON. The fact that gRPC consists of Protobuf messages means that although some header information is different, handling both gRPC and Protobuf over HTTP can be accomplished with minimal overhead.
Netty Framework Features
The Netty framework includes powerful capabilities such as multithreading for socket handling, Application-Layer Protocol Negotiation with TLS, and HTTP processing for both HTTP/1.1 and HTTP/2. These composable features make Netty an ideal foundation for building custom network services that support protocols such as OpenTelemetry.
Building servers or clients with Netty requires some familiarity with event-driven programming. The project user guide provides a helpful introduction to the basic concepts involved in writing a network server. The guide highlights the ChannelInboundHandlerAdapter as an introduction to asynchronous processing of bytes buffered from a socket channel. The Netty ServerBootstrap is another central class, supporting thread pool configuration and the pipeline responsible for handling inbound connections. Based on the core design of a channel handler pipeline, Netty provides numerous implementation classes supporting a variety of standard network protocols.
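As a minimal sketch of these concepts, the following bootstrap wires a ChannelInboundHandlerAdapter into a server pipeline. It assumes Netty on the classpath; the echo behavior, class name, and port number are illustrative and not taken from ListenOTLP.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

class EchoServer {
    public static void main(String[] args) throws InterruptedException {
        EventLoopGroup bossGroup = new NioEventLoopGroup(1);  // accepts inbound connections
        EventLoopGroup workerGroup = new NioEventLoopGroup(); // handles socket channel I/O
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(bossGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel channel) {
                            channel.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                                @Override
                                public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                    ctx.writeAndFlush(msg); // echo buffered bytes back to the client
                                }
                            });
                        }
                    });
            bootstrap.bind(4317).sync().channel().closeFuture().sync();
        } finally {
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }
    }
}
```

The boss and worker groups illustrate the thread pool configuration mentioned above, and the anonymous handler shows where asynchronous channel reads arrive.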
Netty HTTP and ALPN
The netty-codec-http module provides handler classes that implement client and server communication for HTTP/1.1. The HttpServerCodec class combines request decoding and response encoding for streamlined configuration, which allows custom code to implement the logic required to handle structured HTTP requests and responses. The FullHttpRequest interface provides a straightforward abstraction for reading HTTP header and body information after the server class has decoded the request from the socket channel.
The netty-codec-http2 module depends on netty-codec-http and provides additional protocol handling to support HTTP/2. Netty provides several approaches for handling HTTP/2, but when it is necessary to support both HTTP/2 and HTTP/1.1, the InboundHttp2ToHttpAdapter provides a convenient translation to HTTP/1.1 objects. This approach supports using a single handler for processing HTTP requests in the Netty pipeline, regardless of protocol version. The single pipeline for both protocol versions may not be suitable for applications that require custom handling for specific HTTP/2 semantics, but for many applications, adapting requests to a single request interface is the best solution.
ALPN is a generic extension to the TLS protocol that allows clients to request specific application protocol versions during the TLS handshake process. HTTP/2 uses ALPN to support transparent fallback to HTTP/1.1 for clients that do not support HTTP/2. Netty provides a flexible ApplicationProtocolNegotiationHandler that allows custom classes to read client application protocol names and configure the appropriate set of pipeline handlers. As a general strategy, clients that do not support HTTP/2 may not present any ALPN information, which means servers should default to HTTP/1.1 while it remains supported.
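The negotiation and fallback behavior described above can be sketched with Netty classes as follows. This assumes the netty-handler and netty-codec-http2 modules on the classpath; the content length limit is an arbitrary illustration, not a ListenOTLP setting.

```java
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.codec.http2.DefaultHttp2Connection;
import io.netty.handler.codec.http2.Http2Connection;
import io.netty.handler.codec.http2.HttpToHttp2ConnectionHandlerBuilder;
import io.netty.handler.codec.http2.InboundHttp2ToHttpAdapterBuilder;
import io.netty.handler.ssl.ApplicationProtocolNames;
import io.netty.handler.ssl.ApplicationProtocolNegotiationHandler;

class HttpProtocolNegotiationHandler extends ApplicationProtocolNegotiationHandler {
    private static final int MAX_CONTENT_LENGTH = 1048576; // illustrative limit

    HttpProtocolNegotiationHandler() {
        // Default to HTTP/1.1 when the client presents no ALPN information
        super(ApplicationProtocolNames.HTTP_1_1);
    }

    @Override
    protected void configurePipeline(ChannelHandlerContext ctx, String protocol) {
        if (ApplicationProtocolNames.HTTP_2.equals(protocol)) {
            // Translate HTTP/2 frames to HTTP/1.1 objects for a single downstream handler
            Http2Connection connection = new DefaultHttp2Connection(true);
            ctx.pipeline().addLast(new HttpToHttp2ConnectionHandlerBuilder()
                    .connection(connection)
                    .frameListener(new InboundHttp2ToHttpAdapterBuilder(connection)
                            .maxContentLength(MAX_CONTENT_LENGTH)
                            .propagateSettings(true)
                            .build())
                    .build());
        } else {
            // HTTP/1.1 request decoding with aggregation to FullHttpRequest objects
            ctx.pipeline().addLast(new HttpServerCodec(), new HttpObjectAggregator(MAX_CONTENT_LENGTH));
        }
    }
}
```

Either branch can feed the same request handler, matching the single-pipeline strategy described for both protocol versions.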
Netty and NiFi Integration
As part of refactoring syslog components, NiFi 1.14.0 introduced the nifi-event-transport module with reusable abstractions for building Netty clients and servers. The module includes straightforward factory classes for constructing network servers using standard properties for address, port number, and TLS negotiation. The NettyEventServerFactory removes the need for repeating common server construction steps, supporting NiFi Processors such as ListenTCP, ListenSyslog, and ListenBeats, as well as ListenOTLP. These components highlight both the flexibility of Netty and the power of reusable classes for building network services.
ListenOTLP exposes these standard settings as properties within the Processor. The Address and Port properties control the network socket on which the Processor listens for inbound requests. The SSL Context Service and Client Authentication properties control the TLS negotiation process and determine whether clients must present a certificate for mutual authentication.
The Worker Threads property determines the number of threads allocated to handle socket connection processing. This number should never exceed the number of CPU cores and should remain in the single digits for most deployment scenarios. The Queue Capacity and Batch Size properties should be considered and adjusted together based on general volume expectations. The Queue Capacity places a limit on the number of messages that can be held in memory before the NiFi framework invokes the ListenOTLP Processor to write queued messages to a FlowFile. With the OTLP Specification supporting reliable delivery and retry using standard response codes, the queue can remain limited without concern for dropping requests at peak volumes.
OpenTelemetry Content Negotiation
With Netty as the foundation, implementing support for each of the OTLP transport formats required an additional layer of protocol processing to handle gRPC, Protobuf, and JSON using a single server.
OTLP defines TCP port 4317 as the default for gRPC, and TCP port 4318 as the default for HTTP, with the official OpenTelemetry Collector requiring separate configuration for each protocol. Although ListenOTLP could have followed a similar strategy, the HTTP Content-Type header provides a standard method for determining applicable protocol handling.
The OTLP Specification defines the following Content-Type values for the respective transport formats:
- application/grpc for gRPC
- application/x-protobuf for Protobuf over HTTP
- application/json for JSON over HTTP
Selecting the applicable request handler according to the Content-Type not only supports a single network server for all transport protocols, but also enables basic request validation.
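The mapping from Content-Type header to transport format can be sketched as follows; the enum and method names are illustrative rather than ListenOTLP internals.

```java
// Maps the OTLP Content-Type values listed above to a transport format
class TelemetryContentType {
    enum TransportFormat { GRPC, PROTOBUF, JSON, UNSUPPORTED }

    static TransportFormat fromContentType(String contentType) {
        if (contentType == null) {
            return TransportFormat.UNSUPPORTED;
        }
        // gRPC Content-Type values may carry a suffix such as application/grpc+proto
        if (contentType.startsWith("application/grpc")) {
            return TransportFormat.GRPC;
        }
        if (contentType.equals("application/x-protobuf")) {
            return TransportFormat.PROTOBUF;
        }
        if (contentType.equals("application/json")) {
            return TransportFormat.JSON;
        }
        return TransportFormat.UNSUPPORTED;
    }
}
```

A lookup of this kind allows one server socket to route gRPC, Protobuf, and JSON requests to format-specific decoders.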
HTTP Request Validation
Following the OTLP Specification provides a clear path for initial request validation using standard HTTP properties.
All OpenTelemetry requests use the POST method regardless of transport format, resulting in an HTTP 405 response code for requests using other HTTP methods.
The HTTP Content-Type header indicates the selected transport format, resulting in an HTTP 415 response code for anything other than gRPC, Protobuf, or JSON.
Each OpenTelemetry request type uses a standard URL path according to the transport format and telemetry type, providing a clear indication of whether the client is sending logs, metrics, or traces. URL paths outside the expected values for gRPC or HTTP result in an HTTP 404 response code, indicating that the requested path is not found.
With these validation checks applied, the ListenOTLP Processor can select the appropriate strategy for parsing the HTTP request body. Each of these HTTP header elements can be misrepresented, but header validation avoids common errors.
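The validation sequence described above can be sketched as a single check returning the applicable status code. Only the OTLP/HTTP default paths are modeled here; gRPC service paths differ, and the class name is illustrative.

```java
// Returns the HTTP status for a request: 405, 415, 404, or 200 when acceptable
class OtlpRequestValidator {
    static int validate(String method, String contentType, String path) {
        if (!"POST".equals(method)) {
            return 405; // Method Not Allowed: OTLP requires POST
        }
        if (!isSupportedContentType(contentType)) {
            return 415; // Unsupported Media Type: not gRPC, Protobuf, or JSON
        }
        if (!isKnownPath(path)) {
            return 404; // Not Found: path outside expected telemetry endpoints
        }
        return 200;
    }

    private static boolean isSupportedContentType(String contentType) {
        return contentType != null && (contentType.startsWith("application/grpc")
                || contentType.equals("application/x-protobuf")
                || contentType.equals("application/json"));
    }

    private static boolean isKnownPath(String path) {
        // OTLP/HTTP default paths for logs, metrics, and traces
        return "/v1/logs".equals(path) || "/v1/metrics".equals(path) || "/v1/traces".equals(path);
    }
}
```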
HTTP Content Processing
With the transport format and telemetry type determined, the next step involves decoding buffered bytes to object representations. This step provides content validation, ensuring that Protobuf or JSON payloads follow the structure defined in the OTLP Specification.
Both gRPC and HTTP transport formats support gzip compression. The HTTP Content-Encoding header indicates the presence of gzip compression for Protobuf or JSON requests, whereas gRPC indicates compressed status using an initial byte flag prior to the Protobuf message itself.
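The gRPC framing mentioned above prefixes each message with a one-byte compressed flag and a four-byte big-endian length. A minimal sketch of reading such a frame follows; gzip handling uses java.util.zip, whereas real gRPC negotiates the compression codec through headers, and the class name is illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.zip.GZIPInputStream;

// Reads a single gRPC length-prefixed message: [flag:1][length:4][payload]
class GrpcFrameReader {
    static byte[] readMessage(byte[] frame) {
        ByteBuffer buffer = ByteBuffer.wrap(frame);
        boolean compressed = buffer.get() == 1; // initial byte flag indicates compression
        int length = buffer.getInt();           // 4-byte big-endian payload length
        byte[] payload = new byte[length];
        buffer.get(payload);
        return compressed ? gunzip(payload) : payload;
    }

    private static byte[] gunzip(byte[] compressedPayload) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressedPayload))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            in.transferTo(out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```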
After determining compressed status, message parsing uses either standard Protobuf or Jackson JSON components to decode objects. Following parsing, ListenOTLP places messages on an internal queue. If the internal queue reaches the limit defined in the Queue Capacity property, ListenOTLP returns an unavailable response code, which informs the client that the request should be retried. This approach avoids volume-related service failures, ensuring a level of operation even under stress.
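The backpressure behavior described above can be modeled with a bounded queue; the class name, capacity handling, and the 200/503 status values are an illustration of the pattern, not ListenOTLP internals.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Bounded queue that signals service unavailability instead of dropping messages
class BoundedTelemetryQueue {
    private final BlockingQueue<byte[]> queue;

    BoundedTelemetryQueue(int queueCapacity) {
        this.queue = new ArrayBlockingQueue<>(queueCapacity);
    }

    // Returns 200 when the message is queued, or 503 so the client retries later
    int offer(byte[] message) {
        return queue.offer(message) ? 200 : 503;
    }
}
```

Because OTLP clients treat an unavailable response as retryable, a full queue translates into deferred delivery rather than data loss.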
Message Serialization
The last portion of ListenOTLP processing consists of writing one or more messages to NiFi FlowFiles. ListenOTLP uses a batching strategy based on the configured Batch Size property to write telemetry messages containing up to the number of records defined. Running ListenOTLP on a more frequent schedule can produce more FlowFiles containing a smaller number of records, subject to the volume of telemetry received. Optimal scheduling and batch sizing depend on subsequent pipeline operations.
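The batching step can be sketched as draining up to the configured batch size from the internal queue on each scheduled run; the names here are illustrative rather than ListenOTLP internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;

// Drains up to batchSize queued messages, representing the records for one FlowFile
class MessageBatcher {
    static List<String> nextBatch(BlockingQueue<String> queue, int batchSize) {
        List<String> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        return batch;
    }
}
```

Each scheduled invocation produces one batch, so a faster schedule yields more batches with fewer records each.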
Regardless of the input transport format, ListenOTLP writes all messages using JSON. This strategy enables subsequent processing in NiFi using any of the available JSON components. Although JSON is more verbose than Protobuf, it provides greater opportunity for selective adjustment using existing NiFi Processors and Controller Services. With JSON as one of the supported formats defined in the OTLP Specification, it also enables transfer to other systems that support OpenTelemetry. Serialized FlowFiles include standard attributes indicating the resource type and the count of messages included for generalized routing.
Each OpenTelemetry resource element also includes attributes for client.socket.address and client.socket.port, indicating the socket information of the client that sent the request. Other NiFi components can parse these attributes for additional routing and processing decisions.
Conclusion
The ListenOTLP Processor is an important element for integrating NiFi into observability pipelines based on OpenTelemetry. With widespread adoption across languages, frameworks, and vendors, OpenTelemetry provides a clear path for building robust and interoperable monitoring solutions, without requiring architectural designs tied to a particular vendor. As one of numerous Processors in NiFi, ListenOTLP highlights the adaptability of the NiFi framework and its ability to build pipelines that bridge the gap between historical approaches and modern solutions.