The Apache Flink® Web UI is a tool shipped with Apache Flink that lets users monitor and interact with their Flink clusters, jobs, and tasks. It is essential for both developers and administrators, helping them manage Flink jobs anywhere from a local development environment to enterprise deployments of IBM Event Processing on native Kubernetes (K8s) or OpenShift Container Platform (OCP).
However, in such environments the user must be authenticated to access web applications.
This is the first of two articles. It covers the current state of the Flink Web UI and, at a high level, how it can be made secure. The follow-on article will go into more detail on how to implement this.
Apache Flink Web UI Overview
The Flink Web UI is the web-based user interface for Apache Flink and is part of the Flink distribution.
It provides a graphical view of job execution, allowing you to monitor and manage your Flink jobs. You can submit new jobs, track their progress, and analyze their performance. The figure below shows the Overview tab of the Flink Web UI.
Additionally, it provides insights into the job's metrics, such as latency, throughput, and memory usage. You can also view the status and metrics of completed Flink jobs, making it easier to troubleshoot and analyze past job executions.
With the Flink Web UI, you can see the configuration and the logs of the Job Manager and the Task Managers, and interact with Flink's built-in features, such as checkpointing, savepoints, job management, and flame graphs.
Overview of Task Managers in Web UI
Job details in Web UI
Job Manager in Web UI
The Flink Web UI offers the following benefits in enterprise environments, including production, staging, and QA:
1. Real-time data processing and analysis: the Web UI provides a web-based interface for monitoring and managing Flink jobs, tracking their progress, and visualizing the data flow within the system.
2. Monitoring: it lets developers monitor the incoming and outgoing streams of data, track the processing time, and identify bottlenecks or other issues in the system.
3. Debugging: it provides a debugging interface that lets developers inspect the state of running jobs, identify errors, and debug their code. This can be especially helpful when dealing with complex streaming applications.
4. Metrics and performance indicators: it provides a dashboard that displays key metrics and performance indicators for running jobs. This can be used to monitor the health and performance of the system and to set up alerts for issues or anomalies.
5. Integration: The REST API of the Flink Web UI can be called by other tools like Prometheus. This is particularly useful in enterprise environments.
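To make the integration point concrete, here is a minimal Python sketch of what a calling tool might do with a response shaped like the one returned by the REST API's /jobs/overview endpoint. The sample payload, the job names, and the helper name count_jobs_by_state are assumptions made for illustration only; a real tool would fetch the document over HTTP from the Job Manager instead of using an embedded string.

```python
import json
from collections import Counter

# Hypothetical sample shaped like a /jobs/overview response.
# The job IDs, names, and states below are made up for illustration.
SAMPLE_RESPONSE = """
{"jobs": [
  {"jid": "a1", "name": "orders-enrichment", "state": "RUNNING"},
  {"jid": "b2", "name": "fraud-detection", "state": "RUNNING"},
  {"jid": "c3", "name": "nightly-backfill", "state": "FINISHED"}
]}
"""


def count_jobs_by_state(response_text: str) -> dict:
    """Count jobs per state in a /jobs/overview style JSON document."""
    payload = json.loads(response_text)
    # A missing "jobs" key is treated as an empty cluster.
    return dict(Counter(job["state"] for job in payload.get("jobs", [])))


if __name__ == "__main__":
    print(count_jobs_by_state(SAMPLE_RESPONSE))
```

A summary like this could feed a dashboard or an alerting rule, for example to flag when no job is in the RUNNING state.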
Running the Apache Flink Web UI securely
The Job Manager hosts the Flink Web UI application and its REST API. By default, the connection to the application is unencrypted, which is fine for local use. For enterprise deployments, Flink supports encrypting both internal and external (REST) connections with Secure Sockets Layer (SSL).
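As a rough illustration of what enabling SSL on the REST endpoint involves, the Python sketch below renders the relevant Flink configuration keys as "key: value" lines. The security.ssl.rest.* option names come from the Flink configuration reference; the file paths, the placeholder password, and the helper names rest_ssl_settings and render_settings are assumptions made for this example, and a real deployment would inject secrets through its own mechanism.

```python
def rest_ssl_settings(keystore: str, truststore: str, password: str) -> dict:
    """Collect the Flink options that enable SSL on the REST endpoint."""
    return {
        "security.ssl.rest.enabled": "true",
        "security.ssl.rest.keystore": keystore,
        "security.ssl.rest.keystore-password": password,
        "security.ssl.rest.key-password": password,
        "security.ssl.rest.truststore": truststore,
        "security.ssl.rest.truststore-password": password,
    }


def render_settings(settings: dict) -> str:
    """Render the options as 'key: value' lines for a Flink configuration file."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical paths and placeholder password, for illustration only.
    print(render_settings(rest_ssl_settings(
        keystore="/opt/flink/certs/rest.keystore",
        truststore="/opt/flink/certs/rest.truststore",
        password="<from-secret>",
    )))
```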
The Web UI has no built-in mechanism to authenticate users, which can be fine for local use.
The Apache Flink documentation says: “The rationale behind delegating authentication to a proxy is that such proxies offer a wide variety of authentication options and thus better integration into existing infrastructures.”
However, for enterprise deployments in OCP/K8s environments, the Web UI must run in compliance with the organization's security policies.
When you deploy Flink in an OCP/K8s cluster, it's important to realize that the Job Manager is not accessible from outside the cluster. To expose the Flink Web UI, first define an Ingress on K8s, as described in the Flink documentation, or a Route on OCP.
Part 2 will cover the Ingress and Route configuration needed to access the Web UI. However, exposing a URL to the Flink UI is not enough: you also need to make sure that network traffic is encrypted and that users are authenticated before they access the Web UI resources and the REST API.
How to secure access to the Flink Web UI
The Apache Flink documentation recommends deploying a “side car proxy” to secure access to the Apache Flink Web UI and its REST API. We chose Nginx as the reverse proxy: all external connections to the Job Manager go through an Nginx Pod, which acts as a proxy between the browser and the Job Manager.
We can opt for either basic authentication or Single Sign-On (SSO) authentication using a third-party OpenID Connect (OIDC) provider. OIDC is an authentication protocol that allows users to authenticate and log into applications using their existing credentials from an OIDC provider.
These two approaches can be used with or without encryption to secure the entire communication between the browser, the proxy, and the Job Manager. We will go through the details of the configuration for both cases in the second part of this blog series.
Basic Authentication
Nginx natively supports basic authentication, restricting access based on a .htpasswd file that lists the user accounts (username and password) allowed to access the Flink Web UI.
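To give an idea of what such a file contains, the Python sketch below builds and verifies one entry using the salted SHA-1 "{SSHA}" scheme, which is among the password formats the Nginx documentation lists for its user file. The scheme is chosen here only because it needs nothing beyond the standard library; it is not a recommendation, and in practice the file is commonly generated with a tool such as Apache's htpasswd. The function names and the sample credentials are assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import os
from typing import Optional

PREFIX = "{SSHA}"


def ssha_entry(username: str, password: str, salt: Optional[bytes] = None) -> str:
    """Return a 'user:{SSHA}...' line: base64(sha1(password + salt) + salt)."""
    salt = os.urandom(8) if salt is None else salt
    digest = hashlib.sha1(password.encode("utf-8") + salt).digest()
    encoded = base64.b64encode(digest + salt).decode("ascii")
    return f"{username}:{PREFIX}{encoded}"


def verify(entry: str, username: str, password: str) -> bool:
    """Check a username/password pair against a single file entry."""
    user, _, hashed = entry.partition(":")
    if user != username or not hashed.startswith(PREFIX):
        return False
    raw = base64.b64decode(hashed[len(PREFIX):])
    digest, salt = raw[:20], raw[20:]  # a SHA-1 digest is 20 bytes long
    candidate = hashlib.sha1(password.encode("utf-8") + salt).digest()
    return hmac.compare_digest(candidate, digest)


if __name__ == "__main__":
    # Hypothetical account, for illustration only.
    line = ssha_entry("flink-admin", "s3cret")
    print(line)
    print(verify(line, "flink-admin", "s3cret"))
```

Each line of the file holds one account, so adding or removing a user amounts to adding or removing a line.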
In this case, when the user tries to access the Flink Web UI, the browser displays a dialog like the one in the figure below, asking for a valid username and password. The user's input is then checked against the contents of the .htpasswd file.
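Behind that dialog, the browser sends the credentials in an Authorization header, as defined by the HTTP Basic scheme. The Python sketch below shows how that header is built and how a server can read it back; the function names are assumptions made for illustration, and the sample credentials are the well-known ones from the HTTP Basic authentication specification.

```python
import base64
from typing import Tuple


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value a browser sends for Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth_header(value: str) -> Tuple[str, str]:
    """Recover (username, password) from an Authorization header value."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates the username from the password.
    username, _, password = decoded.partition(":")
    return username, password


if __name__ == "__main__":
    header = basic_auth_header("Aladdin", "open sesame")
    print(header)
    print(parse_basic_auth_header(header))
```

Because Base64 is an encoding and not encryption, anyone who can read the traffic can recover the password, which is why basic authentication is best combined with an encrypted connection.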
OIDC Authentication
In this case, Nginx delegates the authentication part to an OIDC provider.
When a user logs into an application using OIDC, the application sends a request to the OIDC provider to verify the user's identity. If the user is authenticated, the OIDC provider issues an OIDC token, which is sent back to the application. The token is then used to access the user's resources and perform the necessary actions. Using an OIDC provider also enables Single Sign-On (SSO) across multiple applications.
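The ID token issued by an OIDC provider is a JSON Web Token (JWT): three Base64url segments separated by dots, the middle one carrying the claims about the user. The Python sketch below decodes that middle segment for inspection. It deliberately does not verify the signature, so it must not be used to make access decisions; the demo token, its claim values, and the function names are assumptions made for illustration.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Base64url-decode a JWT segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def decode_claims(token: str) -> dict:
    """Read the claims of a JWT WITHOUT verifying its signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


def make_demo_token(claims: dict) -> str:
    """Build an unsigned demo token; real tokens are signed by the provider."""
    header = b64url_encode(json.dumps({"alg": "none", "typ": "JWT"}).encode("utf-8"))
    body = b64url_encode(json.dumps(claims).encode("utf-8"))
    return f"{header}.{body}."


if __name__ == "__main__":
    # Hypothetical issuer, audience, and user, for illustration only.
    token = make_demo_token({
        "iss": "https://idp.example.com",
        "aud": "flink-web-ui",
        "sub": "1234",
        "preferred_username": "alice",
    })
    print(decode_claims(token))
```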
Keycloak® can be used as an OIDC provider to authenticate users for both the Flink Web UI and the IBM Event Processing low-code authoring application.
When users try to access the Flink Web UI, they are first redirected to the OIDC provider's login page, as shown below with Keycloak as the OIDC provider.
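That redirect is an OIDC authentication request: a URL on the provider's authorization endpoint carrying a handful of standard query parameters. The Python sketch below builds such a URL for the authorization code flow. The endpoint, client ID, and redirect URI are hypothetical values, and build_authorization_url is a name invented for this example; in a real deployment the proxy constructs this request, and the endpoint comes from the provider's configuration.

```python
import secrets
from typing import Optional
from urllib.parse import urlencode


def build_authorization_url(
    authorization_endpoint: str,
    client_id: str,
    redirect_uri: str,
    state: Optional[str] = None,
) -> str:
    """Build an OIDC authentication request for the authorization code flow."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # where the provider sends the user back
        "scope": "openid",  # marks the request as an OIDC request
        # Random value echoed back by the provider, used to detect forged responses.
        "state": state or secrets.token_urlsafe(16),
    }
    return f"{authorization_endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and client values, for illustration only.
    print(build_authorization_url(
        "https://idp.example.com/authorize",
        client_id="flink-web-ui",
        redirect_uri="https://flink.example.com/callback",
    ))
```

After a successful login, the provider redirects the browser back to the redirect URI with an authorization code, which is then exchanged for tokens.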
Conclusion
We have shown approaches for running the Web UI securely, so that enterprise deployments can benefit from what it offers: real-time data processing and analysis, monitoring, debugging, and tracking of job metrics and performance indicators.
Different forms of authentication are possible: basic authentication or an OIDC provider.
In the next part of the series, we will walk you through how to implement this important capability and make your Flink UI enterprise ready.
Contributors
- Mehdi Deboub
- Anu K T
- Sebastien Pereira
- David Radley