Java

Java is an object-oriented programming language that allows engineers to produce software for multiple platforms. Our resources in this Zone are designed to help engineers with Java program development, Java SDKs, compilers, interpreters, documentation generators, and other tools used to produce a complete application.

DZone's Featured Java Resources

A Practical Guide to Building Generative AI in Java

By Xavier Portilla Edo
Building generative AI applications in Java used to be a complex, boilerplate-heavy endeavor. You’d wrestle with raw HTTP clients, hand-craft JSON payloads, parse streaming responses, manage API keys, and stitch together observability, all before writing a single line of actual AI logic. Those days are over. Genkit Java is an open-source framework that makes building AI-powered applications in Java as straightforward as defining a function. Pair it with Google’s Gemini models and Google Cloud Run, and you can go from zero to a production-deployed generative AI service in minutes, not days. This is a complete, working example. Clone it, set your API key, and run.

Why Genkit Java?

If you’re a Java developer, you’ve probably watched the GenAI revolution unfold mostly in Python and TypeScript. The tooling, frameworks, and tutorials are all skewed toward those ecosystems. Java developers were left to either build everything from scratch or use verbose, low-level SDKs. Genkit Java changes that.
Here’s what makes it different:

| Feature | Without Genkit | With Genkit |
| --- | --- | --- |
| Call Gemini | Manual HTTP client, JSON parsing, error handling | genkit.generate(...), one method call |
| Expose as API | Set up Spring Boot, write controllers, handle serialization | genkit.defineFlow(...), auto-exposed as HTTP endpoint |
| Structured output | Parse raw JSON strings, deserialize manually | outputClass(MyClass.class), Gemini returns typed Java objects |
| Tool calling | Parse function call responses, execute tools, re-submit | Define tools with genkit.defineTool(...), automatic execution |
| Observability | Manual OpenTelemetry setup, custom spans, metrics | Built-in tracing, metrics, and latency tracking, zero config |
| Dev/test your flows | cURL, Postman, write test harnesses | Genkit DevUI: visual, interactive, built-in |

What We’re Building

A Java application with a translation AI flow powered by Gemini via Genkit, showcasing:

- Typed flow inputs: a TranslateRequest class with @JsonProperty annotations as the flow input
- Structured LLM output: Gemini returns a TranslateResponse Java object directly (no manual JSON parsing)
- Typed flow outputs: the flow returns a fully typed TranslateResponse to the caller

All of this in a single Java file plus two model classes. No Spring Boot. No annotation soup. No XML configuration. Just clean, readable, type-safe code.

Prerequisites

- Java 21+ (Eclipse Temurin recommended)
- Maven 3.6+
- Node.js 18+ (for the Genkit CLI)
- A Google GenAI API key (free from Google AI Studio)
- Google Cloud SDK (only for Cloud Run deployment)

Install the Genkit CLI

The Genkit CLI is your command-line companion for developing and testing AI flows. Install it globally:

```shell
npm install -g genkit
```

Verify the installation:

```shell
genkit --version
```

The CLI is what powers the DevUI and provides a seamless development experience; more on that below.
Project Structure

```plaintext
genkit-java-getting-started/
├── src/
│   └── main/
│       ├── java/
│       │   └── com/example/
│       │       ├── App.java                # ← The main application
│       │       ├── TranslateRequest.java   # ← Typed flow input
│       │       └── TranslateResponse.java  # ← Typed flow + LLM output
│       └── resources/
│           └── logback.xml                 # Logging configuration
├── pom.xml                                 # Maven config with Genkit + Jib
├── run.sh                                  # Quick-start script
└── README.md                               # This article
```

Getting Started

1. Clone and Set Your API Key

```shell
git clone https://github.com/xavidop/genkit-java-getting-started.git
cd genkit-java-getting-started
export GOOGLE_API_KEY=your-api-key   # key from Google AI Studio
```

2. Run With the Genkit DevUI (Recommended)

```shell
genkit start -- mvn compile exec:java
```

That’s it. Two commands. Your AI-powered Java server is running on http://localhost:8080, and the Genkit DevUI is available at http://localhost:4000.

3. Or Run Directly (Without DevUI)

```shell
mvn compile exec:java
```

The Code: It’s Stupidly Simple

Step 1: Define Typed Input/Output Classes

Instead of using a raw Map or String, define proper Java classes with Jackson annotations. Genkit uses these annotations to generate JSON schemas that tell Gemini exactly what structure to return.

TranslateRequest.java, the flow input:

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonPropertyDescription;

/**
 * Input for the translate flow.
 */
public class TranslateRequest {

    @JsonProperty(required = true)
    @JsonPropertyDescription("The text to translate")
    private String text;

    @JsonProperty(required = true)
    @JsonPropertyDescription("The target language (e.g., Spanish, French, Japanese)")
    private String language;

    public TranslateRequest() {}

    public TranslateRequest(String text, String language) {
        this.text = text;
        this.language = language;
    }

    public String getText() { return text; }
    public void setText(String text) { this.text = text; }
    public String getLanguage() { return language; }
    public void setLanguage(String language) { this.language = language; }

    @Override
    public String toString() {
        return String.format("TranslateRequest{text='%s', language='%s'}", text, language);
    }
}
```

TranslateResponse.java, the flow output and the LLM structured output:

```java
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonPropertyDescription;

/**
 * Structured output for the translate flow.
 */
public class TranslateResponse {

    @JsonProperty(required = true)
    @JsonPropertyDescription("The original text that was translated")
    private String originalText;

    @JsonProperty(required = true)
    @JsonPropertyDescription("The translated text")
    private String translatedText;

    @JsonProperty(required = true)
    @JsonPropertyDescription("The target language")
    private String language;

    public TranslateResponse() {}

    public TranslateResponse(String originalText, String translatedText, String language) {
        this.originalText = originalText;
        this.translatedText = translatedText;
        this.language = language;
    }

    public String getOriginalText() { return originalText; }
    public void setOriginalText(String originalText) { this.originalText = originalText; }
    public String getTranslatedText() { return translatedText; }
    public void setTranslatedText(String translatedText) { this.translatedText = translatedText; }
    public String getLanguage() { return language; }
    public void setLanguage(String language) { this.language = language; }

    @Override
    public String toString() {
        return String.format(
            "TranslateResponse{originalText='%s', translatedText='%s', language='%s'}",
            originalText, translatedText, language);
    }
}
```

The @JsonPropertyDescription annotations are key: Genkit passes them to Gemini as part of the JSON schema, so the model knows exactly what each field means.

Step 2: Initialize Genkit

```java
// `jetty` is the HTTP server plugin instance (the JettyPlugin mentioned below);
// its construction is omitted in this excerpt.
Genkit genkit = Genkit.builder()
    .options(GenkitOptions.builder()
        .devMode(true)
        .reflectionPort(3100)
        .build())
    .plugin(GoogleGenAIPlugin.create())
    .plugin(jetty)
    .build();
```

That’s the entire setup. The GoogleGenAIPlugin reads your GOOGLE_API_KEY automatically. The JettyPlugin handles HTTP. Genkit wires everything together.

Step 3: Define a Flow With Typed Classes and Structured Output

```java
genkit.defineFlow(
    "translate",
    TranslateRequest.class,   // ← typed input
    TranslateResponse.class,  // ← typed output
    (ctx, request) -> {
        String prompt = String.format(
            "Translate the following text to %s.\n\nText: %s",
            request.getLanguage(), request.getText()
        );
        return genkit.generate(
            GenerateOptions.<TranslateResponse>builder()
                .model("googleai/gemini-3-flash-preview")
                .prompt(prompt)
                .outputClass(TranslateResponse.class) // ← Gemini returns a typed object!
                .config(GenerationConfig.builder()
                    .temperature(0.1)
                    .build())
                .build()
        );
    }
);
```

Look at what’s happening here:

- TranslateRequest.class as the flow input: Genkit automatically deserializes incoming JSON into a TranslateRequest object. No Map.get() casting.
- TranslateResponse.class as the flow output: the flow returns a typed object, serialized automatically to JSON for the HTTP response.
- outputClass(TranslateResponse.class) on the generate call: this is the magic. Genkit sends the JSON schema derived from TranslateResponse to Gemini, and Gemini returns structured JSON that Genkit deserializes into a TranslateResponse object. No response.getText() plus manual parsing.
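For intuition, the JSON schema Genkit derives from those annotations looks roughly like the sketch below. This is an illustration of the shape implied by the @JsonProperty and @JsonPropertyDescription annotations, not the framework’s exact output; field ordering and extra metadata may differ by version.

```json
{
  "type": "object",
  "properties": {
    "originalText": {
      "type": "string",
      "description": "The original text that was translated"
    },
    "translatedText": {
      "type": "string",
      "description": "The translated text"
    },
    "language": {
      "type": "string",
      "description": "The target language"
    }
  },
  "required": ["originalText", "translatedText", "language"]
}
```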
That single defineFlow call:

- Registers the flow in Genkit’s internal registry
- Exposes it as a POST /api/flows/translate HTTP endpoint
- Makes it visible in the DevUI
- Adds full OpenTelemetry tracing automatically
- Tracks token usage, latency, and error rates

Compare that to writing a Spring Boot controller + service + DTO + config + exception handler for the same functionality.

The Genkit DevUI: Your AI Playground

This is where Genkit truly shines for development. When you run with genkit start, the CLI launches a visual DevUI at http://localhost:4000.

What Can You Do in the DevUI?

- Browse all flows. See every flow you’ve registered, like translate, with its typed input/output schemas.
- Run flows interactively. Fill in a TranslateRequest JSON, click “Run”, and see the TranslateResponse instantly. No cURL needed.
- Inspect traces. Every flow execution is traced. See exactly which model was called, what the input/output was, how long it took, and how many tokens were used.
- View registered models and tools. See all available Gemini models and any tools you’ve defined.
- Test tool calling. Watch Gemini decide to call your tools in real time.
- Manage datasets and evaluations. Create test datasets and evaluate your AI outputs.

Deploying to Google Cloud Run

The project uses Jib to build and push container images directly from Maven: no Dockerfile and no Docker daemon required. Jib is configured in the pom.xml and builds optimized, layered container images.

Step-by-Step Deployment

```shell
# Set your GCP project
export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1

# Build the container image and push it to Google Container Registry
# No Docker needed, Jib does it all from Maven!
mvn compile jib:build -Djib.to.image=gcr.io/$PROJECT_ID/genkit-java-app

# Deploy to Cloud Run
gcloud run deploy genkit-java-app \
  --image gcr.io/$PROJECT_ID/genkit-java-app \
  --region $REGION \
  --platform managed \
  --allow-unauthenticated \
  --set-env-vars "GOOGLE_API_KEY=$GOOGLE_API_KEY" \
  --memory 512Mi
```

Two commands. No Docker. Your Java GenAI application is now live on a globally distributed, auto-scaling, serverless platform.

Why Jib?

- No Dockerfile: the container image is built directly from your Maven project
- No Docker daemon: doesn’t require Docker installed or running on your machine
- Fast rebuilds: separates dependencies, classes, and resources into layers, so only changed layers are rebuilt
- Reproducible: builds are deterministic and don’t depend on the local Docker environment
- Direct push: sends the image straight to GCR/Artifact Registry without a local docker push

You can also build a local Docker image (requires Docker running) with:

```shell
mvn compile jib:dockerBuild -Djib.to.image=genkit-java-app
```

Available Flows and API Examples

Once the server is running, test the translate flow.

Translate Text

Send a TranslateRequest JSON object and receive a structured TranslateResponse:

```shell
curl -X POST http://localhost:8080/api/flows/translate \
  -H 'Content-Type: application/json' \
  -d '{"text": "Building AI applications has never been easier", "language": "Spanish"}'
```

Example response (a TranslateResponse object):

```json
{
  "originalText": "Building AI applications has never been easier",
  "translatedText": "Construir aplicaciones de IA nunca ha sido tan fácil",
  "language": "Spanish"
}
```

Try other languages:

```shell
# French
curl -X POST http://localhost:8080/api/flows/translate \
  -H 'Content-Type: application/json' \
  -d '{"text": "Genkit makes Java AI development simple", "language": "French"}'

# Japanese
curl -X POST http://localhost:8080/api/flows/translate \
  -H 'Content-Type: application/json' \
  -d '{"text": "Hello world", "language": "Japanese"}'
```

Notice how the response is always a structured JSON object, not a raw string. That’s the power of outputClass(TranslateResponse.class): Gemini returns structured data that Genkit deserializes into your Java class automatically.

What Genkit Gives You for Free

When you use Genkit, you’re not just getting a wrapper around API calls. You get a production-grade framework.

Observability (Zero Config)

Every flow execution is automatically traced with OpenTelemetry:

- Latency tracking per flow, per model call
- Token usage (input/output/thinking tokens)
- Error rates and failure tracking
- Span hierarchy showing the full execution path

Plugin Ecosystem

Need to swap Gemini for another model? Change one line:

```java
// Switch from Gemini to OpenAI
.plugin(OpenAIPlugin.create())

// Or use Anthropic Claude
.plugin(AnthropicPlugin.create())

// Or run locally with Ollama
```

Genkit supports 10+ model providers, vector databases (Pinecone, Weaviate, PostgreSQL), Firebase integration, and more.

Type Safety

This is where Genkit really shines for Java developers. Flows, generate calls, and even LLM responses are fully typed:

```java
// The flow takes a TranslateRequest and returns a TranslateResponse
genkit.defineFlow("translate", TranslateRequest.class, TranslateResponse.class, ...);

// The LLM returns a TranslateResponse directly, no string parsing
genkit.generate(
    GenerateOptions.<TranslateResponse>builder()
        .outputClass(TranslateResponse.class)
        .build()
);
```

Genkit derives JSON schemas from your @JsonProperty and @JsonPropertyDescription annotations and sends them to Gemini, so the model returns structured data that maps directly to your Java classes. No Object casting, no response.getText() + objectMapper.readValue(), no runtime surprises.

What’s Next?

This getting-started project covers the fundamentals.
Genkit Java can do much more:

- RAG: retrieval-augmented generation with vector stores (Firestore, Pinecone, pgvector, Weaviate)
- Multi-agent orchestration: coordinate multiple AI agents
- Chat sessions: multi-turn conversations with session persistence
- Evaluations: RAGAS-style metrics to measure your AI output quality
- MCP integration: connect to Model Context Protocol servers
- Spring Boot: use the Spring plugin instead of Jetty for existing Spring apps
- Firebase: deploy as Cloud Functions with Firestore vector search

Explore the full Genkit Java documentation and the samples directory to dive deeper.

Conclusion

As you can see, it is very easy to use Genkit Java and Gemini to build powerful generative AI applications with minimal code. The combination of typed inputs/outputs, structured LLM responses, built-in observability, and seamless deployment makes Genkit Java the best way to build GenAI features in Java. You can find the full code of this example in the GitHub repository. Happy coding!
How to Configure JDK 25 for GitHub Copilot Coding Agent

By Bruno Borges
GitHub Copilot coding agent runs in an ephemeral GitHub Actions environment where it can build your code, run tests, and execute tools. By default, it uses the pre-installed Java version on the runner — but what if your project needs a specific version like JDK 25? In this post, I'll show you how to configure Copilot coding agent's environment to use any Java version, including the latest JDK 25, ensuring that Copilot can successfully build and test your Java projects.

The Problem

When Copilot coding agent works on your repository, it attempts to discover and install dependencies through trial and error. For Java projects, this means:

- Copilot might try to use an older JDK version pre-installed on the runner
- Build failures occur if your project requires newer Java features (records, pattern matching, virtual threads, etc.)
- Time is wasted as Copilot tries different approaches to fix JDK-related issues

The solution? Preconfigure Copilot's environment with the exact JDK version your project needs.

The Solution: copilot-setup-steps.yml

GitHub provides a special workflow file called copilot-setup-steps.yml that runs before Copilot starts working. Think of it as your project's "pre-flight checklist" for Copilot. Create this file at .github/workflows/copilot-setup-steps.yml:

```yaml
name: "Copilot Setup Steps"

on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/copilot-setup-steps.yml
  pull_request:
    paths:
      - .github/workflows/copilot-setup-steps.yml

jobs:
  # This job name MUST be exactly "copilot-setup-steps"
  copilot-setup-steps:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v5
      - name: Set up JDK 25
        uses: actions/setup-java@v5
        with:
          java-version: '25'
          distribution: 'temurin'
          cache: 'maven'
      - name: Verify Java version
        run: java -version
      - name: Download dependencies
        run: mvn dependency:go-offline -B
```

Let's break down the key parts:

1. The Job Name Is Critical

```yaml
jobs:
  copilot-setup-steps:  # MUST be this exact name
```

Copilot only recognizes the job if it's named copilot-setup-steps. Any other name will be ignored.

2. Setting Up JDK 25 With setup-java

```yaml
- name: Set up JDK 25
  uses: actions/setup-java@v5
  with:
    java-version: '25'
    distribution: 'temurin'
    cache: 'maven'
```

The actions/setup-java action supports multiple JDK distributions:

| Distribution | Vendor | Notes |
| --- | --- | --- |
| temurin | Eclipse Adoptium | Recommended, community standard |
| zulu | Azul | Good compatibility |
| corretto | Amazon | AWS-optimized |
| microsoft | Microsoft | Azure-optimized |
| oracle | Oracle | Official Oracle JDK |
| liberica | BellSoft | Full and lite versions available |

3. Caching Dependencies

```yaml
cache: 'maven'
```

This caches your Maven dependencies between Copilot sessions, significantly speeding up subsequent runs. For Gradle projects, use cache: 'gradle'.

4. Pre-downloading Dependencies

```yaml
- name: Download dependencies
  run: mvn dependency:go-offline -B
```

This ensures all dependencies are downloaded before Copilot starts. This is especially important if:

- You have private dependencies that require authentication
- Your project has many dependencies
- You want faster Copilot response times

Complete Example for a Maven Project

Here's a production-ready configuration for a Java 25 Maven project:

```yaml
name: "Copilot Setup Steps"

on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/copilot-setup-steps.yml
  pull_request:
    paths:
      - .github/workflows/copilot-setup-steps.yml

jobs:
  copilot-setup-steps:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v5
      - name: Set up JDK 25
        uses: actions/setup-java@v5
        with:
          java-version: '25'
          distribution: 'temurin'
          cache: 'maven'
      - name: Verify Java version
        run: |
          java -version
          echo "JAVA_HOME=$JAVA_HOME"
      - name: Build and cache dependencies
        run: |
          mvn dependency:go-offline -B
          mvn compile -DskipTests -B
```

Gradle Configuration

For Gradle projects, adjust the workflow accordingly:

```yaml
name: "Copilot Setup Steps"

on:
  workflow_dispatch:
  push:
    paths:
      - .github/workflows/copilot-setup-steps.yml
  pull_request:
    paths:
      - .github/workflows/copilot-setup-steps.yml

jobs:
  copilot-setup-steps:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v5
      - name: Set up JDK 25
        uses: actions/setup-java@v5
        with:
          java-version: '25'
          distribution: 'temurin'
          cache: 'gradle'
      - name: Setup Gradle
        uses: gradle/actions/setup-gradle@v4
      - name: Build and cache dependencies
        run: ./gradlew dependencies --write-locks
```

Handling Private Dependencies

If your project uses private Maven repositories, you'll need to configure authentication. Create secrets in the copilot environment:

1. Go to your repository Settings → Environments
2. Click the copilot environment (create it if it doesn't exist)
3. Add environment secrets for your credentials

Then reference them in your workflow:

```yaml
jobs:
  copilot-setup-steps:
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - name: Checkout code
        uses: actions/checkout@v5
      - name: Set up JDK 25
        uses: actions/setup-java@v5
        with:
          java-version: '25'
          distribution: 'temurin'
          cache: 'maven'
          server-id: private-repo
          server-username: ${{ secrets.MAVEN_USERNAME }}
          server-password: ${{ secrets.MAVEN_PASSWORD }}
      - name: Download dependencies
        run: mvn dependency:go-offline -B
```

Testing Your Configuration

The copilot-setup-steps.yml workflow automatically runs when you:

- Push changes to the workflow file
- Create a PR that modifies it
- Manually trigger it from the Actions tab

This lets you validate your setup before Copilot uses it.
Tips for Java Projects

Compile the Code

Pre-compiling your code helps Copilot understand your codebase faster:

```yaml
- name: Compile project
  run: mvn compile test-compile -DskipTests -B
```

Generate Sources

If you use code generation (Lombok, annotation processors, etc.):

```yaml
- name: Generate sources
  run: mvn generate-sources generate-test-sources -B
```

Install the Project Locally

For multi-module Maven projects:

```yaml
- name: Install modules
  run: mvn install -DskipTests -B
```

Using Larger Runners

For large Java projects, consider using larger GitHub-hosted runners:

```yaml
jobs:
  copilot-setup-steps:
    runs-on: ubuntu-4-core  # More CPU and RAM
    # ... rest of configuration
```

Larger runners provide more resources for:

- Faster dependency downloads
- Quicker compilation
- Running memory-intensive tests

Conclusion

By creating a copilot-setup-steps.yml workflow, you ensure that GitHub Copilot coding agent has access to the exact Java version your project needs. This eliminates build failures, speeds up Copilot's work, and provides a consistent development environment. Key takeaways:

- Create .github/workflows/copilot-setup-steps.yml with the exact job name copilot-setup-steps
- Use actions/setup-java@v5 to install JDK 25 or any version you need
- Enable caching for Maven or Gradle dependencies
- Pre-download dependencies to speed up Copilot's work
- Test your workflow by pushing changes or manually triggering it

Now Copilot can work with your Java 25 project just as effectively as it would on your local machine!
Data Driven API Testing in Java with Rest-Assured and TestNG: Part 1
By Faisal Khatri
Building a Sentiment Analysis Pipeline With Apache Camel and Deep Java Library (DJL)
By Vignesh Durai
Testing Legacy JSP Code
By Zoltán Csorba
Why “At-Least-Once” Is a Lie: Lessons from Java Event Systems at Global Scale

At-least-once delivery is treated like a safety net in Java event systems. Nothing gets lost. Retries handle failures. Duplicates are “a consumer problem.” It sounds practical, even mature. That assumption doesn’t survive production. At-least-once delivery is not useless. It is just not a correctness guarantee, and treating it like one is where systems get into trouble. Once you’re operating at scale, especially in regulated environments where correctness and auditability matter, at-least-once delivery stops being a comfort. It becomes a liability. Not because duplication exists, but because duplication collides with real software, real workflows, and real constraints that the slogan never mentions.

The myth survives because it fits a clean mental model. If something fails, retry. If a message appears twice, make the consumer idempotent. The pipeline keeps moving, and teams can claim reliability without confronting the harder question: does the system preserve truth when reality gets messy?

The First Crack: Semantic Duplication

Two identical messages are rarely identical in effect. A retry can write a row twice, advance a workflow twice, emit a second downstream event, or trigger business logic that was never designed to be replayed. Engineers often talk about idempotency as if it were a single checkbox. In production, it is a boundary problem: where does the “one true effect” begin and end?

Getting the same message twice usually isn’t a crisis. The pain starts when that message is driving a workflow. Think of a simple chain like received, validated, and posted. The handler creates a row, updates the status, and fires an event downstream. Then a retry hits. Suddenly, the workflow can step forward again, and the next system sees a second “valid” transition. No alarms go off. Logs look normal. But you’ve just created a business error that’s hard to spot and even harder to unwind.

The Second Crack: Temporal Corruption

Event systems don’t only move data. They move history. Retries and replays bring old facts back into the present, where they can collide with newer states. If you have ever seen a reconciliation mismatch appear with no clear root cause, you’ve seen this failure mode. In regulated systems, the damage is not limited to incorrect numbers. The system loses its ability to explain itself. When you cannot prove how the state was derived, your “high availability” system turns into a high-availability generator of ambiguity.

Replays get especially ugly when time matters. A reporting pipeline might rebuild a month-end report by replaying events from storage. If late-arriving events exist, or if the replay happens against a newer reference dataset, you can regenerate a report that does not match the report you generated last month, even with “the same events.” At that point, you do not have a reporting system. You have a narrative generator. In many environments, “we can’t reproduce last month” is not an inconvenience. It is a compliance incident.

The Third Crack: False Confidence

At-least-once delivery becomes a psychological shortcut. Teams stop reasoning deeply about correctness because delivery “guarantees” feel like safety. Monitoring focuses on throughput and consumer lag, not semantic accuracy. Errors don’t show up as crashes. They show up later, during audits, reconciliations, or customer disputes, after the trail is cold and the retention window has rolled forward. Recovery stops being engineering and becomes forensics.

Why Java Makes This Worse

Java makes these problems easier to create and harder to diagnose. Transaction boundaries rarely align with acknowledgment boundaries. Side effects leak through abstractions. ORMs flush state implicitly. Retries reenter code paths that were never designed to be replay-safe. Even the comforting presence of transactional annotations can encourage the wrong intuition, because database atomicity is not the same thing as system truth.
The classic failure pattern looks like this: you process a message, you commit a database transaction, and then the acknowledgment fails because the network hiccups or the consumer process restarts. The broker redelivers. Your database now contains the effect, but your consumer sees the message again and replays the effect. Teams often believe transactional annotations save them here. They don’t. They only make the write atomic, not the world. If you cannot detect that you have already applied the business effect, retries turn into silent duplication.

What Actually Works at Scale

At this point, teams often reach for exactly-once semantics, hoping to buy certainty. That impulse is understandable, but it usually replaces one illusion with another. Exactly-once is not a magic property you turn on. It is a coordination problem, and at scale, it is expensive, leaky, and full of edge cases. The better framing is simpler and more honest. Reliable systems do not worship delivery guarantees. They control side effects.

Controlling side effects changes how systems are designed. It pushes architects to model business state explicitly instead of letting it emerge from message flow. It forces a distinction between receiving an event and applying its meaning. In practice, this often leads to designs where messages describe intent, while durable state changes are validated, versioned, and recorded separately.

This shift also changes how failures are handled. Instead of retrying blindly, systems become capable of answering a more important question: has this effect already been applied, and under what conditions? When that answer is explicit, retries are no longer dangerous. They become boring. And boring is exactly what you want when correctness matters. This starts with treating replay as a normal operating condition, not an exceptional bug. Consumers should be designed to tolerate reprocessing without changing the business outcome.
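A minimal sketch of such a replay-tolerant consumer: the effect is keyed by a business identifier, so redelivery of the same message cannot apply the effect twice. The class name, the `handle` signature, and the in-memory map are my own illustration; in a real system the deduplication record would live in a durable store (a table with a unique constraint), written in the same transaction as the business effect.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative replay-tolerant consumer. The business identifier, not the
 * broker's delivery, decides whether the effect has already been applied.
 */
public class ReplayTolerantConsumer {

    // Stand-in for a durable store keyed by business id. In production this
    // would be a database table with a unique constraint on the id.
    private final Map<String, String> appliedEffects = new ConcurrentHashMap<>();

    /**
     * Applies the effect exactly once per business id.
     * Returns true if the effect was applied now, false if this was a replay.
     */
    public boolean handle(String businessId, String effect) {
        // putIfAbsent is the whole trick: "apply again" becomes a provable no-op.
        return appliedEffects.putIfAbsent(businessId, effect) == null;
    }
}
```

With this shape, a broker redelivery simply returns false and the consumer acknowledges without re-running the business logic.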
That means binding identifiers to business meaning, not transport mechanics. It means making state transitions explicit and versioned, so “apply again” can be proven to be safe. It means building an audit trail that explains what happened, when it happened, and why repeating it would not change the outcome.

There is also a cost model here that teams tend to ignore. Every at-least-once system eventually pays an interest rate on duplicates. Early on, that interest is small and hidden. Later, it becomes engineering time spent writing compensations, building reconciliation jobs, cleaning up drift, and explaining mismatches to stakeholders. The cost is not the broker’s. It’s the complexity tax you pay forever because you never defined what “the same effect” means.

Closing Thoughts

At-least-once delivery persists because it is convenient. It lowers the cost of getting started and hides complexity early. In small systems, the cost of being wrong is often tolerable. In large-scale, regulated, or globally distributed systems, costs accumulate quietly until they become intolerable. Java engineers building event-driven architectures should stop asking how often a message is delivered and start asking a harder question: can the system ever prove what actually happened? Reliability at scale is not about delivery. It is about truth.

By Krishna Kandi
Beyond Ingestion: Teaching Your NiFi Flows to Think

If you are working with data pipelines, chances are you have crossed paths with Apache NiFi. For years, it's been the go-to way for getting data from point A to point B (and often C, D, and E). Its visual interface makes building complex routing, transformation, and delivery flows surprisingly easy, handling everything from simple log collection to intricate IoT data streams across countless organizations. It's powerful, it's flexible, and honestly, it just works really well for shuffling bits around reliably. We set up our sources, connect our processors, define our destinations, and watch the data flow — job done, right?

AI Opportunity

Well, mostly. While Apache NiFi is fantastic at the logistics of data movement, I started wondering: what if we could make the data smarter while it's still in motion? We hear about AI everywhere, crunching massive datasets after they've landed in a data lake or warehouse. But what about adding that intelligence during ingestion? Imagine enriching events, making routing decisions based on predictions, or flagging anomalies before the data even hits its final storage. That got me thinking about integrating AI directly within a NiFi flow. Sure, we can use processors like InvokeHTTP to call out to external AI APIs, and that definitely has its place. But I couldn't find many hands-on examples showing how to embed and run an AI or machine learning model inside a custom NiFi processor using Java, while leveraging NiFi's scalability and data handling capabilities for the AI component as well. It felt like a gap, a missed opportunity to truly combine the strengths of both worlds right there in the pipeline. So, I decided to roll up my sleeves and figure out how to do it.

Code

In this article, I want to share what I learned. We will walk through building a custom NiFi processor in Java that loads and runs a real machine learning model (using the Deep Java Library, or DJL) to perform analysis directly on the FlowFile data as it passes through.
No external calls needed for the core AI task! Let's dive into the code. You can refer to the full working code (which generates a NAR for NiFi) in my GitHub here. Below is the main Java method, which shows how to call PyTorch via DJL. This is a very simple use case where input is classified as positive, negative, or neutral based on incoming text.

Example text:

Plain Text
I am very happy with the results, it exceeded my expectations --> positive
This film was terribly boring and poorly acted. --> negative

This also shows the score from the given model.

Java
public void loadModel(final ProcessContext context) {
    getLogger().info("Loading command line classification model...");
    try {
        // Define criteria to load a text classification model.
        // *** IMPORTANT: Replace with a fine-tuned model if possible. ***
        // Using a generic BERT for sequence classification as a placeholder.
        Criteria<String, Classifications> criteria = Criteria.builder()
                //.optApplication(Application.NLP.TEXT_CLASSIFICATION)
                .setTypes(String.class, Classifications.class) // Input text, Output classification
                .optEngine("PyTorch")
                .optModelUrls("djl://ai.djl.huggingface.pytorch/distilbert-base-uncased-finetuned-sst-2-english")
                // If using a local model: .optModelPath(Paths.get("/path/to/your/model"))
                .optProgress(new ProgressBar())
                .build();

        this.model = criteria.loadModel();
        this.predictor = model.newPredictor();
        getLogger().info("Command line classification model loaded successfully.");
    } catch (Exception e) {
        getLogger().error("Failed to load command line classification model.", e);
        this.predictor = null; // Ensure predictor is null on failure
        // Throwing here will prevent the processor from starting if the model fails to load
        throw new RuntimeException("Failed to initialize AI model", e);
    }
}

Results:

For simplicity, I have used the results as attributes of the flow file itself.
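For context, here is a rough sketch of how such a processor's onTrigger() might use the predictor loaded above and attach the classification as attributes. It is a fragment of a NiFi processor class, not standalone code, and names like REL_SUCCESS, REL_FAILURE, and the sentiment.* attribute keys are illustrative assumptions rather than taken from the actual project:

```java
@Override
public void onTrigger(final ProcessContext context, final ProcessSession session) {
    FlowFile flowFile = session.get();
    if (flowFile == null || predictor == null) {
        return;
    }
    try {
        // Read the FlowFile content as text
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        session.exportTo(flowFile, bytes);
        String text = bytes.toString(StandardCharsets.UTF_8);

        // Run the DJL predictor created in loadModel()
        Classifications result = predictor.predict(text);
        Classifications.Classification best = result.best();

        // Attach the top label and its score as FlowFile attributes
        flowFile = session.putAttribute(flowFile, "sentiment.label", best.getClassName());
        flowFile = session.putAttribute(flowFile, "sentiment.score", String.valueOf(best.getProbability()));
        session.transfer(flowFile, REL_SUCCESS);
    } catch (Exception e) {
        getLogger().error("Classification failed", e);
        session.transfer(flowFile, REL_FAILURE);
    }
}
```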
The Java code can be changed to add these enrichments to the flow file content, and at the end, the flow file can be routed to final outputs such as a data lakehouse or warehouse.

Key Considerations

Consider building your own models based on requirements. For example, if we want to find out whether an incoming URL is malicious or benign, it would be better to prepare our own model based on the organization's needs.
Generally, NiFi memory/CPU spikes won't be high if we code custom NiFi processors the right way, even when an AI/ML model is incorporated into them. But when we prepare our own AI model, it's better to consider how it behaves for larger incoming FlowFiles.
I have used the PyTorch engine here, but there are other libraries to explore as well, such as TensorFlow, Apache MXNet, and ONNX Runtime.
Model URLs can also be passed to the NiFi processor as properties rather than embedded in the code. But we have to make sure all dependencies in pom.xml are already in place, else it will throw an error.

Conclusion

Adding AI smarts directly into NiFi with a custom processor isn't just theory; as we have seen, it's practically achievable using tools like DJL within Java. This approach lets you leverage NiFi's robust data handling while performing sophisticated analysis right in the flow. It moves AI processing closer to the data source, enabling immediate enrichment and smarter routing decisions. Give it a try — you might be surprised how much intelligence you can pack directly into your data pipelines.

References

https://djl.ai/
https://docs.djl.ai/master/index.html
https://github.com/deepjavalibrary/djl/blob/master/docs/model-zoo.md
https://nifi.apache.org/docs/nifi-docs/

By Madhusudhan Dasari
Responding to HTTP Session Expiration on the Frontend via WebSockets

There is no doubt that nowadays many of the software applications and products that contribute significantly to our well-being are real-time. Real-time software makes systems responsive, reliable, and safe, especially where timing is important — from healthcare and defense to entertainment and transportation. Such applications are helpful as they process and respond to data almost instantly or within a guaranteed time frame, which is critical when timing and accuracy directly affect performance, safety, or even user experience.

As a protocol that enables real-time, two-way (full-duplex) communication between a client and a server over a single, long-lived TCP connection, WebSockets are among the technologies used by such applications. The purpose of this article isn’t to describe in detail what WebSockets are; it’s assumed the reader is familiar with these concepts. Nevertheless, it briefly highlights the general workflow, then focuses on a concrete use case and exemplifies how WebSockets can address a real concern. In a simple web application, the HTTP user session expires, and yet an action at the front-end level is expected. To achieve this, the server (back-end) notifies the client (browser) about the event by leveraging WebSockets.

WebSockets: The Workflow

Normally, communication in web or REST applications happens via HTTP, in a request-response manner — the client asks, the server replies, then the connection closes. With WebSockets, the architecture is slightly different. Once the connection is established, both the client and the server can send data to each other, at any time. There’s no need to repeatedly open new requests, as the same connection is used.
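On the wire, the upgrade handshake that establishes such a connection looks roughly like the following exchange (the key/accept values shown are the standard example pair from RFC 6455, not values from this application):

```plaintext
GET /session-websocket HTTP/1.1
Host: localhost:8080
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```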
The workflow is the following:

As part of the communication handshake, the client sends an HTTP request with an Upgrade: websocket header.
If the server supports WebSockets, it responds with an HTTP 101 (Switching Protocols) status — it basically agrees and upgrades to the WebSocket protocol.
A persistent TCP connection is then established, which remains open (port 80 or 443).
Both ends can push messages instantly in either direction; data is transmitted via small “frames” with minimal overhead, instead of full HTTP messages.

In general, when (or whether) to use WebSockets is a trade-off each team shall analyze. In many cases, AJAX and HTTP streaming or long polling can be simpler and more effective. As clearly outlined in the Spring WebSocket Documentation — “It is a combination of low latency, high frequency, and high volume that makes the best case for the use of WebSocket.” Nevertheless, this article presents a slightly different use of them, one that proves to successfully solve the particular outlined challenge – to act at the front-end level when the HTTP session expires.

The Initial Implementation

To put the use case into practice, a simple web application running on a web container and holding an HTTP session is first created. The set-up is the following:

Java 21
Maven 3.9.9
Spring Boot – v.3.5.5
Spring Security – for application authentication, authorization and user session management
Spring WebSocket – for the WebSockets server-side implementation
Stomp v.2.3.3 and SockJS Client v.1.6.1 JavaScript libraries – for the WebSockets client-side implementation
Thymeleaf – for implementing the front-end (for simplicity, as part of the same application)

Once the dependencies are selected, they are added into the pom.xml file.
XML
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-security</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-thymeleaf</artifactId>
</dependency>
<dependency>
    <groupId>org.thymeleaf.extras</groupId>
    <artifactId>thymeleaf-extras-springsecurity6</artifactId>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-websocket</artifactId>
</dependency>

Next, the application is sketched up. In terms of front-end and user experience, the application is reduced to the minimum, so that this experiment can be fulfilled. It consists of three pages:

index.html – the starting point; it can be accessed without needing to authenticate
login.html – the place where the user can sign in, also accessible without authentication
home.html – the landing page the user is brought to after signing in successfully

To be able to access these pages, the following minimal configuration class is added, together with three view controllers, respectively.

Java
@Configuration
public class WebConfig implements WebMvcConfigurer {

    @Override
    public void addViewControllers(ViewControllerRegistry registry) {
        registry.addViewController("/")
                .setViewName("index");
        registry.addViewController("/login")
                .setViewName("login");
        registry.addViewController("/home")
                .setViewName("home");
    }
}

The context path of the application is set in the application.properties file as /app.

Properties files
server.servlet.context-path = /app

With what we have so far, as spring-boot-starter-security is discovered in the class path, if the application is launched, Spring Boot generates a default security password that is displayed in the logs upon start-up and can be used to sign into the application (the default username is user).
In order to better control the behavior, the default security configuration is overridden.

Java
@Configuration
@EnableWebSecurity
public class SecurityConfig {

    @Bean
    public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
        http
            .headers(AbstractHttpConfigurer::disable)
            .authorizeHttpRequests(authorizeHttpRequestsCustomizer ->
                authorizeHttpRequestsCustomizer.requestMatchers("/").permitAll()
                    .anyRequest().authenticated())
            .formLogin(formLoginCustomizer ->
                formLoginCustomizer.loginPage("/login")
                    .permitAll()
                    .defaultSuccessUrl("/home"))
            .logout(logoutCustomizer ->
                logoutCustomizer.invalidateHttpSession(true)
                    .deleteCookies("JSESSIONID")
                    .logoutSuccessUrl("/"))
            .sessionManagement(sessionManagementConfigurer ->
                sessionManagementConfigurer.maximumSessions(1)
                    .expiredUrl("/home"));
        return http.build();
    }

    @Bean
    public PasswordEncoder passwordEncoder() {
        return PasswordEncoderFactories.createDelegatingPasswordEncoder();
    }

    @Bean
    public UserDetailsService userDetailsService(PasswordEncoder passwordEncoder) {
        UserDetails user = User.builder()
                .username("horatiucd")
                .password(passwordEncoder.encode("a"))
                .roles("USER")
                .build();
        return new InMemoryUserDetailsManager(user);
    }
}

Very briefly, the simplistic UserDetailsService is configured with only one in-memory user, whose password is encrypted using a DelegatingPasswordEncoder with the default mappings. The security filter chain capable of being matched against an HttpServletRequest is built, and the restrictions on the pages are the ones specified above. The login page is accessible without authentication. Once signed in successfully, the user is taken to the home page. When signing out, the HttpSession is invalidated, the JSESSIONID cookie is deleted, and the user is redirected to the root application path. Regarding session management, as this is an aspect of interest in this article, the application is configured to allow just one session per user.
In addition, the duration of the session is customized in the application.properties file to last 1 minute. Such a value is not recommended in real applications, but it’s fine for the sake of this experiment.

Properties files
server.servlet.session.timeout = 1m

With these pieces of configuration in place, the application is re-run. If accessed at http://localhost:8080/app/, it displays the index.html view. From here, a user can go to the login.html page, provide the credentials, and sign in. To log out, the user can press the Sign Out button and then return to the application root.

As of now, as soon as a user session expires due to inactivity, a redirect to the home page is done at the very next action. In some use cases, this behavior is acceptable and perfectly fine, while in others, it is not. Let’s assume in this case it’s not.

The Improved Implementation

With that in mind, we will further exemplify how the interaction between the client and the server can be enhanced in case of session expiration. WebSockets are used to help the back-end communicate with the front-end when the user session has just expired and to allow the client to decide how to react to this event. Here, just a reload of the current page is done, basically forcing a redirect to the login page.

At the server-side level, in order to be aware of when sessions are created and/or terminated, a custom HttpSessionListener is added. According to the Java documentation, implementers “are notified of changes to the list of active sessions in a web application,” and that’s exactly what’s needed here.
Java
@Component
public record CustomHttpSessionListener() implements HttpSessionListener {

    private static final Logger log = LoggerFactory.getLogger(CustomHttpSessionListener.class);

    @Override
    public void sessionCreated(HttpSessionEvent event) {
        log.info("Session (ID: {}) created.", event.getSession().getId());
    }

    @Override
    public void sessionDestroyed(HttpSessionEvent event) {
        log.info("Session (ID: {}) destroyed.", event.getSession().getId());
    }
}

Additionally, the below HttpSessionEventPublisher bean instance is added so that HttpSessionApplicationEvents are published into the Spring WebApplicationContext.

Java
@Bean
public HttpSessionEventPublisher httpSessionEventPublisher() {
    return new HttpSessionEventPublisher();
}

With this configuration in place, we are now able to see lines like the ones below upon session creation or termination, respectively.

Plain Text
INFO 34104 --- [spring-security-app] [nio-8080-exec-8] c.h.s.l.CustomHttpSessionListener: Session (ID: 85891C66D081D6D24DFF6224FE54D21E) created.
...
INFO 34104 --- [spring-security-app] [alina-utility-2] c.h.s.l.CustomHttpSessionListener: Session (ID: 85891C66D081D6D24DFF6224FE54D21E) destroyed.

At this point, at least at the server-side level, we can act when such events are published. In order to notify the front-end, a STOMP message will be sent from the above CustomHttpSessionListener, from the sessionDestroyed() method. To be able to do this, STOMP messaging is enabled at the Spring configuration level.
Java
@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

    @Override
    public void configureMessageBroker(MessageBrokerRegistry registry) {
        registry.enableSimpleBroker("/topic");
        registry.setApplicationDestinationPrefixes("/ws");
    }

    @Override
    public void registerStompEndpoints(StompEndpointRegistry registry) {
        registry.addEndpoint("/session-websocket")
                .setHandshakeHandler(new UserHandshakeHandler())
                .withSockJS();
    }
}

Adding the @EnableWebSocketMessageBroker annotation enables broker-backed messaging over WebSocket using a higher-level messaging sub-protocol, here STOMP. Further customization is done by implementing WebSocketMessageBrokerConfigurer. While the former method is self-explanatory, the latter registers the mapping of each STOMP endpoint to a specific URL. In this experiment, one endpoint is enough – /session-websocket.

One last observation is worth making here. In the brief WebSocket introduction, it was mentioned that during the connection handshake, the client sends an HTTP request with an Upgrade header. When applications are integrated over the Internet, the messages exchanged via WebSockets might be impacted by proxy or firewall configurations that don’t permit passing the Upgrade headers. One possible and handy solution is to attempt to primarily use WebSocket and then, if that doesn’t work, to fall back on HTTP implementations that emulate the WebSocket interaction and expose the same application-level API, here SockJS. Fortunately, Spring Framework provides support for the SockJS protocol. Coming back to the configuration above, the endpoint is configured with the SockJS fallback. Moreover, the used HandshakeHandler is set to a custom one.
Java
public class UserHandshakeHandler extends DefaultHandshakeHandler {

    private static final Logger log = LoggerFactory.getLogger(UserHandshakeHandler.class);

    @Override
    protected Principal determineUser(ServerHttpRequest request, WebSocketHandler wsHandler,
                                      Map<String, Object> attributes) {
        ServletServerHttpRequest servletRequest = (ServletServerHttpRequest) request;
        SecurityContext securityContext = (SecurityContext) WebUtils.getSessionAttribute(
                servletRequest.getServletRequest(),
                HttpSessionSecurityContextRepository.SPRING_SECURITY_CONTEXT_KEY);

        String user = "anonymousUser";
        if (securityContext != null && securityContext.getAuthentication() != null) {
            user = securityContext.getAuthentication().getName();
            log.info("User connected via web socket: {}.", user);
        }
        return new UserPrincipal(user);
    }
}

The default contract for processing the WebSocket handshake request is modified by overriding the determineUser() method. Specifically, the currently logged-in user is read from the SecurityContext session attribute, if any. This identification is needed so that each user (session) has its own private WebSocket channel and thus, when messages are sent from the server towards the client, only the designated ones are sent and received, respectively. With this configuration in place, the service for sending messages can be created.

Java
@Service
public class WebSocketService {

    private final SimpMessagingTemplate template;

    public WebSocketService(SimpMessagingTemplate template) {
        this.template = template;
    }

    public void notifyUser(String user, String message) {
        template.convertAndSendToUser(user, "/topic/user-messages", new WebSocketMessage(message));
    }
}

The method is straightforward and uses the SimpMessagingTemplate to send messages to the particular user identified when the WebSocket connection is established, as previously described. Since the messages in this example are only used to signal an event, their content is just a string value.
Java
public record WebSocketMessage(String content) {}

To finish the server implementation, the WebSocketService is injected into the CustomHttpSessionListener and the sessionDestroyed() method is enhanced to use it.

Java
@Override
public void sessionDestroyed(HttpSessionEvent event) {
    SecurityContext securityContext = (SecurityContext) event.getSession()
            .getAttribute(HttpSessionSecurityContextRepository.SPRING_SECURITY_CONTEXT_KEY);
    if (securityContext != null && securityContext.getAuthentication() != null) {
        Authentication auth = securityContext.getAuthentication();
        String user = auth.getName();
        if (auth.isAuthenticated() && !"anonymousUser".equals(user)) {
            log.info("User's {} session expired.", user);
            webSocketService.notifyUser(user, "Session expired");
        }
    }
    log.info("Session (ID: {}) destroyed.", event.getSession().getId());
}

At the client-side level, a few configurations need to be done as well. First, the two JavaScript libraries are imported into the home.html page, together with jQuery.

HTML
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:th="https://www.thymeleaf.org"
      xmlns:sec="https://www.thymeleaf.org/thymeleaf-extras-springsecurity6"
      lang="en">
<head>
    <title>Home</title>
    <meta data-fr-http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <script src="https://code.jquery.com/jquery-3.7.1.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/sockjs-client@1.6.1/dist/sockjs.min.js"></script>
    <script src="https://cdnjs.cloudflare.com/ajax/libs/stomp.js/2.3.3/stomp.min.js"></script>
    <script th:src="@{/app.js}"></script>
</head>
<body>
    <div th:inline="text"><span th:remove="tag" sec:authentication="name"></span>, welcome!</div>
    <div>
        <form th:action="@{/logout}" method="post">
            <div>
                <button type="submit">Sign Out</button>
            </div>
        </form>
    </div>
</body>
</html>

Additionally, a local script — app.js — is added, which contains the client WebSocket initialization. The significant part is below.
When the page loads, the connection is initialized, and upon connection the client subscribes to the designated topic, through which it receives private user messages transmitted by the server.

JavaScript
$(document).ready(function () {
    connect();
});

function connect() {
    let socket = new SockJS('/app/session-websocket');
    let stompClient = Stomp.over(socket);

    stompClient.connect({}, function (frame) {
        stompClient.subscribe('/user/topic/user-messages', function (message) {
            console.log("Received message " + JSON.parse(message.body).content);
            window.location.reload();
        });
    });

    socket.onclose = function(event) {
        onSocketClose(); // handler defined elsewhere in app.js
    };
}

If the application is restarted and the user signs in successfully, examining the client console shows that the WebSocket connection has been established. From a user experience point of view, the behavior of the application is now different. If the session expires, the user is automatically brought to the sign-in page, whereas before, the next user interaction would have triggered that.

Conclusion

There is no doubt that WebSockets play a very important role in real-time applications nowadays, as they provide the capability of being dynamic and interactive. With their lightweight layer on top of TCP, WebSockets are really suitable when needing to exchange messages between clients and servers. Yet, in addition to these, there are numerous other possible use cases in which this technology can be applied, not necessarily for enhancing responsiveness and user experience, but for overcoming potential technical challenges, just like the simple one exemplified in this article.

Resources

Application source code is here.
Spring WebSockets Documentation
The picture was taken in Sinaia, Romania.

By Horatiu Dan DZone Core CORE
Java Developers: Build Something Awesome with Copilot CLI and Win Big Prizes!

Here’s today’s invitation: join the GitHub Copilot CLI Challenge and build something with Copilot right in your terminal. Visit the challenge page for the rules, FAQ, and submission template.

Why I’m Excited About Copilot CLI (especially for Java)

If you write Java for a living, you already know the truth: the terminal is where we build and test. It’s where feedback loops are short and where most productivity gains come from “small wins” repeated hundreds of times. Most Java developers use Maven or Gradle, and IDEs (especially IntelliJ) have fantastic support for both. But in practice, we still drop to the terminal quite regularly:

Run a very specific Maven goal or Gradle task
Reproduce CI as closely as possible
Add flags to isolate one failing test
Check output in the same environment your teammates (and CI) will see
Run a single test the same way CI does, even if you don’t remember the exact incantation

If we’re already inside the terminal running commands, we might as well ask Copilot CLI to help us do the right thing faster. GitHub Copilot CLI brings an agentic workflow to the place where those loops happen: the command line. And the best part? You can keep it grounded in your repo and your actual build output.

The Challenge (Quick Overview)

The challenge is quite open-ended: build an application using GitHub Copilot CLI. But there are judging criteria:

Use of GitHub Copilot CLI
Usability and user experience
Originality and creativity

The challenge has been running since January 22nd, but there’s still time. Submissions are due by February 15th at 11:59 PM PST, and the winners will be announced on February 26th.
This challenge comes with some really cool prizes:

The top 3 winners receive $1,000 USD, a GitHub Universe 2026 ticket, and a winner badge
The next 25 runners-up receive a 1-year GitHub Copilot Pro+ subscription and a runner-up badge
All valid submissions receive a GitHub completion badge

Ship a Java Tool (for End Users or Developers)

It can be a Spring Starter, a Quarkus Extension, a JavaFX application, a web application, Maven or Gradle plugins, a Java Swing application, plugins for IntelliJ or Eclipse, or even Apache JMeter! This is the kind of challenge where a small, well-executed tool can be more impressive than a giant “AI demo.” If you’re on the fence, here’s my recommendation:

Pick a problem you hit every week
Build a thin vertical slice in a day
Make it pleasant to use
Write a clean “how to run it” section
Tell the story of how Copilot CLI helped you iterate

Need Ideas?

These Java-friendly ideas fit the judging criteria and are realistic, shippable, and easy for judges to evaluate:

Test failure triage assistant for Maven/Gradle: Parse Surefire output, summarize likely causes, suggest next commands
Log explainer: Ingest a stack trace and environment info, generate a focused explanation and remediation checklist
Repo onboarding CLI: Generate a “first 30 minutes” guide (build, tests, conventions, release process)
Changelog helper: Read Git history and propose a changelog entry and release notes draft
OpenAPI → Spring Boot starter: Take an OpenAPI spec and scaffold a production-ready service layout

Prompts You Can Try Today (Copy/Paste Inspiration)

In your terminal (inside a repo), try asking Copilot CLI:

“Summarize why my Maven tests are failing from this output, then suggest the next 3 commands I should run.”
“Generate a JUnit 5 test for this class focusing on boundary cases.”
“Explain this stack trace like I’m onboarding to the project; point me to the likely source file and fix.”
“Propose a refactor that reduces duplication but keeps the public API stable.”
“Write a README section explaining how to run this tool, with examples.”

The key is to keep the agent grounded in real inputs: actual logs, actual code, real constraints.

Programmatic Control via the Copilot SDK for Java

If you want to build a Java app that programmatically drives Copilot CLI, start here: https://github.com/copilot-community-sdk/copilot-sdk-java

More Resources

Use these to bootstrap your project quickly:

Kotlin MCP development collection: https://github.com/github/awesome-copilot/blob/main/collections/kotlin-mcp-development.md
Java MCP development collection: https://github.com/github/awesome-copilot/blob/main/collections/java-mcp-development.md
Java development collection (Spring Boot, Quarkus, JUnit, Javadoc, upgrade guides): https://github.com/github/awesome-copilot/blob/main/collections/java-development.md
OpenAPI → Spring Boot application collection: https://github.com/github/awesome-copilot/blob/main/collections/openapi-to-application-java-spring-boot.md
Copilot SDK for Java: https://github.com/copilot-community-sdk/copilot-sdk-java

Ready? Here’s Your Next Step

If you build Java tools, this challenge is a great excuse to ship something useful and learn an agentic workflow you can reuse. Join the challenge and start your submission here: https://dev.to/challenges/github-2026-01-21

If you do build something, tag me on DEV or social media — I’d love to see what you ship.

By Bruno Borges
Bootstrapping a Java File System

So, what does a file system mean to you? Most think of file systems as directories and files accessed via your computer: local disk, remotely shared via NFS or SMB, thumb drives, something else. Sufficient for those who require basic file access, nothing more, nothing less.

That perspective on file systems is too limited: VCS repositories, archive files (zip/jar), and remote systems can be treated as file systems, potentially accessed via the same APIs used for local file access while still meeting security and data requirements. Or how about a file system that automatically transcodes videos to different formats or extracts audio metadata for vector searches? Wouldn’t it be cool to use standard APIs rather than create something customized? Definitely!

Java provides a file system abstraction that enables solution-specific implementations accessed via the APIs used for traditional disk-backed file systems. Potentially overwhelming at first blush, getting the basics bootstrapped is remarkably straightforward, with the implementation effort dependent on what your requirements need. In this post, I’ll explain the basics of Java’s file systems to get you started. I created a starter project, which is a bare-bones Java file system with two operations implemented (create directory and exists), used via a demo class. If you’re a glutton for punishment, you can also clone/fork my neo4j-filesystem project, which is an almost fully functional file system, minus some edge cases.

History of File Systems Within Java

A short, flippant, perhaps not even completely correct history of the Java APIs for file systems. Not required reading; jump ahead if you’re getting antsy to start actual work! The initial release of Java 1.0 provided access to the operating system’s file system via java.io.File, a simple implementation built on blocking I/O and single-threaded operations. Usable but limited, adequate performance, but definitely not scalable.
Acceptable for a small target audience, perhaps viewed as more proof-of-concept than anything; it's doubtful that James Gosling or anyone at Sun envisioned the behemoth Java has since become. Java's continued growth and inroads in software engineering led to its First Age of (I/O) Enlightenment: Java 1.4 introduced Java NIO, providing better abstractions, non-blocking I/O, multi-threaded operations, and more. Performance improved. And there was some rejoicing (maybe).

However, Java NIO was difficult to use (or so I’ve read), leading to the Second Age of (I/O) Enlightenment: Java 1.7 introduced Java NIO.2 to address the perceived usability issues and implemented a file system abstraction that allows customized file systems. No longer restricted to an OS perspective, one can now implement a file system based on one's specific requirements. The crown jewels: java.nio.file.Files, whose operations are delegated to whichever file system a given file/directory Path belongs to. And there was more rejoicing.

Prior to a recent project, I had not used (actually avoided) Java NIO and did not understand its value. Yes, java.nio.file.Files simplifies repetitive, templated I/O operations, nothing more, nothing less. I had a superficial understanding at best; legacy java.io.File was sufficient… until it wasn’t. Most important are the Java NIO.2 changes, which allow solutions to implement file systems based on their requirements that seamlessly integrate with the JVM. Third-party or customized implementations are no longer necessary, which should greatly simplify many aspects of your solution.

Before You Start

Implementing your first custom file system will not be quick, straightforward, nor painless, so I recommend you consider the following to create a high-level, conceptual design before you start coding. The design is not immutable; in fact, I fully expect course corrections and refinements as you get deeper.
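To see this delegation in action before writing your own provider, the JDK's built-in zip file system already demonstrates how the same java.nio.file.Files calls route to a non-default FileSystem. The sketch below (requires Java 13+ for the FileSystems.newFileSystem(Path, Map) overload; the file names are arbitrary) writes an entry into a fresh zip archive and reads it back:

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.Map;

public class ZipFsDemo {
    public static String demo() throws IOException {
        Path zip = Files.createTempDirectory("zipfs-demo").resolve("archive.zip");
        // "create=true" tells the zip provider to create the archive if missing
        try (FileSystem zipFs = FileSystems.newFileSystem(zip, Map.of("create", "true"))) {
            // Same Files API as local disk, but the Path belongs to the zip file system
            Path entry = zipFs.getPath("/greeting.txt");
            Files.writeString(entry, "hello from inside a zip");
        }
        // Reopen the archive and read the entry back through the same abstraction
        try (FileSystem zipFs = FileSystems.newFileSystem(zip, Map.of())) {
            return Files.readString(zipFs.getPath("/greeting.txt"));
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo()); // prints: hello from inside a zip
    }
}
```

Note how the code never touches java.util.zip directly: Files.writeString() and Files.readString() are dispatched to the zip provider purely because the Path came from zipFs.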
Even an in-your-head design makes future decisions easier to contextualize and implement. You’ll thank me later!

Note: Java is POSIX-biased — unsurprising considering its Sun Solaris origins — and therefore so is its file system abstraction: path separator, owner user/groups, access modes, file permissions, file types, etc. Moving away from POSIX likely means working around, rather than with, Java’s file systems. Possible? Yes. Recommended? No. You’ve been warned.

URI Design

URIs have two important purposes within Java file systems:

The URI identifies the specific file system with which to work.
The URI identifies a specific file or directory within the specified file system.

The URI is the core concept upon which a Java file system is built, and understanding its construction and purpose is required. A URI has four components that are relevant for file systems:

scheme: Uniquely identifies the file system implementation to which Java should delegate operations. Required and must be unique among all file systems present in the JVM. [The exception is the JVM’s default file system, typically file:// for OS local storage.]
user:password: Optional username/password for user authentication, such as authenticating to a remote system before operations are executed. Note: passwords in cleartext are not secure. Use at your own risk.
host: Optional. Host names generally identify the remote system to connect to, but can also be used to partition the file system for security or data management purposes. For example, each user or customer may have a dedicated partition.
/path/to/file: A fully-qualified or relative path identifying a specific directory or file. The most common path separator is a slash; using other special characters is possible (somewhat) but introduces other problems. If you require directories, stick with slashes. Again, you have been warned.

URI query strings may be used for unique requirements, but generally are not necessary.
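As a concrete illustration of these components, java.net.URI already splits such an identifier apart. The myfs scheme, credentials, and paths below are made-up examples, not names from the starter project:

```java
import java.net.URI;

public class UriParts {
    public static String describe(String s) {
        URI uri = URI.create(s);
        // scheme selects the provider; userInfo/host/path locate data within it
        return uri.getScheme() + " | " + uri.getUserInfo() + " | "
                + uri.getHost() + " | " + uri.getPath();
    }

    public static void main(String[] args) {
        System.out.println(describe("myfs://alice:secret@tenant-a/docs/report.txt"));
        // prints: myfs | alice:secret | tenant-a | /docs/report.txt
    }
}
```

A custom FileSystemProvider typically performs exactly this decomposition in its getPath() and newFileSystem() methods: the scheme routes the call to the provider, while the host and path select the partition and entry.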
File Tree Management

You need to track the directories and files comprising your file tree, but where? Relational and document-based NoSQL database systems are obvious choices, but your requirements may lead you elsewhere. What data or properties are stored with each entry? Definitely directory/file name and parent/child info. Security? Owners? Timestamps? Encryption key? Something else? Consider carefully what you include and what you don’t. Managing a file tree is challenging: operating system structures — e.g., *nix inodes or NTFS's master file table on Windows — have evolved to be highly optimized and efficient. Your challenge is to provide efficient access and navigation while maintaining your organization’s non-functional requirements. I’ve seen customized file systems struggle to provide both efficient tree management (write) and navigation (read), leading to ongoing hacks to fix the performance problem of the day. File trees are simple graphs, but arbitrary directory depth and the number of files/directories within a directory are problematic.

Binary Storage

Where does your file system store the actual file, the raw bytes representing the file uploaded to your file system? Storing locally is feasible, though counterintuitive; storing files externally is more likely, such as AWS S3, Azure Blob Storage, Google Cloud Storage, or even a database that supports blobs. Each approach has different functionality, limitations, and costs, so choose wisely for the sake of your implementation. Your solution may require functionality not always available from your file storage approach. Must files be automatically encrypted/decrypted? Do you need the file’s metadata to be extracted and stored separately? Can users request previous versions of a file? Anything else? File storage is as simple or as complicated as defined by your requirements.

The Bare Minimum

Four components must be present to bootstrap your Java file system.
Clone the starter file system repository if you want to follow along in your IDE: after reviewing it, you’ll understand how little magic is actually involved.

Implement Path: Represents a file system’s directory or file based on its URI representation, created either by direct calls to FileSystem.getPath() or indirectly via Path.of(). Implement the methods:

- A constructor to which a generic Path or a path as a string is passed in; the associated FileSystem instance may also be useful.
- getFileName()
- getParent()
- subpath()
- toUri()

Extend FileSystem: An instance of a file system, usually identified by a truncated URI (e.g., scheme and hostname). For example, a zip file has its own ZipFileSystem instance with which to interact with the zip file via the Java file system. Implement the methods:

- A constructor which accepts a FileSystemProvider
- getPath()
- provider()

Extend FileSystemProvider: The power engine of a Java file system, as the functionality implemented determines what/how your file system operates. A singleton is registered with the JVM at startup. Once registered, operations for the defined scheme are forwarded to the FileSystemProvider instance. Implement the methods:

- getScheme()
- newFileSystem()
- getFileSystem()
- getPath()

Register FileSystemProvider: Create the resource file META-INF/services/java.nio.file.spi.FileSystemProvider. The file must contain the fully-qualified class name of your file system provider.

That’s it. Really. Review the starter project. You now have a working Java file system that does absolutely nothing. At this point, you are ready to implement the functionality required by your custom file system. So far, so good!

Next Steps

Implement, implement, implement. Now the real fun begins. Some suggestions:

Database layer: Managing the file tree is fundamental to every operation of your file system, so you’ll need at least the basics in place immediately. Define the entries supported – directories, files, maybe symbolic links – with the necessary metadata.
Implement the CRUD operations. Emulate path navigation to ensure arbitrary depth doesn’t cause problems.

My first operations: I started with basic directory operations that don’t require storing files: createDirectory(), exists(), delete(), deleteIfExists(). Test by creating a FileSystem and making calls to Files. Start to get a feel for how things fit together.

File storage: After directories come files, so you can no longer avoid figuring out your file storage strategy. Early on, a local disk may actually be sufficient to allow you to proceed. Define a robust interface that allows additional implementations without requiring larger rework. A bit-bucket file storage implementation provides for large-scale file tree work when actual files aren’t required.

My first file operations: Implement FileSystemProvider.newInputStream() and FileSystemProvider.newOutputStream() to start creating files using Files.createFile() or Files.copy(). Now you’ve got something vaguely useful.

Local file system testing: My file system is intended to be a fully-functioning POSIX-based file system, so I dug deep into the code for working with a local file for better understanding: method return values, exceptions thrown, edge cases, enum interpretation, how attributes are implemented, etc. Create demos/tests using Files and see what works and what doesn’t; debug, refactor, etc.

Patience: Frustrating initially, rewarding later. The abstraction makes some things much more difficult than I expected/wanted, but little by little, the pieces start fitting together. This is not an afternoon’s work!

Final Thoughts

With a better understanding, I have additional ideas for unique ways to leverage Java file systems. Could I have asked AI to do this for me? Sure, but what fun would that have been! Creating a custom Java file system was both geeky and fun, perhaps more fun than I have had in a while, plus I have a deeper understanding of a core Java concept. Score!
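As a final, self-contained illustration of the Files-driven testing suggested above, the JDK's built-in zip file system can stand in for a custom provider. Every call below routes through a non-default FileSystem exactly as it would for your own implementation:

```java
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Exercise java.nio.file.Files against a non-default file system.
// The JDK's zip provider (scheme "jar") stands in for a custom one.
public class ZipFsDemo {
    public static void main(String[] args) throws Exception {
        Path zip = Files.createTempFile("demo", ".zip");
        Files.delete(zip); // let the zip provider create the archive itself
        URI uri = URI.create("jar:" + zip.toUri());

        try (FileSystem fs = FileSystems.newFileSystem(uri, Map.of("create", "true"))) {
            Path dir = fs.getPath("/docs");
            Files.createDirectory(dir);                        // routed to the zip provider
            Files.writeString(dir.resolve("hello.txt"), "hi"); // not to the default FS
            System.out.println(Files.readString(fs.getPath("/docs/hello.txt")));
        }
        Files.deleteIfExists(zip);
    }
}
```

Swap the jar: scheme for your own registered scheme and the same Files calls exercise your provider instead.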
References

Links

- https://github.com/scsosna99/java-file-system-starter
- https://github.com/scsosna99/neo4j-filesystem
- https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/nio/file/package-summary.html
- https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/nio/file/spi/FileSystemProvider.html

Image Credits

- “James Gosling 2008” by Peter Campbell is licensed under CC BY-SA 4.0.
- “URI Format” generated by Claude.AI based on my prompts.
- “File Tree” © 2026 Scott C Sosna
- “NetApp FAS270” by mondopiccolo is licensed under CC BY-NC 2.0.
- “Code Snippets” © 2026 Scott C Sosna

As originally published at https://scottsosna.com/2026/01/28/bootstrapping-a-java-file-system/.

By Scott Sosna DZone Core CORE
Jakarta EE 12 M2: Entering the Data Age of Enterprise Java

Every major Jakarta EE release tends to have a defining theme. Jakarta EE 11 was about modernization: a new baseline with Java 17, forward compatibility with Java 21, and a decisive cleanup of long-standing technical debt. Jakarta EE 12 builds directly on that momentum, but its direction is different. This release is less about removing the past and more about aligning the future. Jakarta EE 12 is best understood as the Data Age of enterprise Java. This is the release where persistence, querying, repositories, NoSQL, configuration, and even AI-related concerns begin to converge into a more coherent platform story. With the second milestone (M2) scheduled for the end of January 2026, Jakarta EE 12 is already showing how that alignment is taking shape.

From Jakarta EE 11 to Jakarta EE 12: Why This Release Matters

After the release of Jakarta EE 11 in June 2025, work on Jakarta EE 12 moved quickly. The goal was never to introduce a disruptive rewrite, but to improve integration, consistency, configuration, and developer productivity across the platform. Jakarta EE 12 follows a familiar milestone-based process: early milestones for experimentation and feedback, followed by refinement milestones, culminating in the final platform vote by the Jakarta EE Steering Committee. Milestone 2 is particularly important because it is where architectural direction becomes visible and individual specifications start to align. At a platform level, Jakarta EE 12 reflects several irreversible shifts:

- Java 21 becomes the baseline, with support for Java 25
- The deprecated Java SecurityManager has been fully removed
- Ambiguous and legacy APIs are cleaned up
- The platform explicitly prepares for modern protocols such as HTTP/3

These changes form the foundation on which the rest of the platform evolves.
CDI 5.0: Toward More Predictable Application Startup

Dependency injection remains the backbone of Jakarta EE applications, and Jakarta CDI 5.0 continues the trend of simplifying and clarifying the programming model. One recurring request from developers is better control over application startup behavior, especially for components that must be initialized eagerly. While CDI 5.0 does not yet standardize eager initialization semantics, the specification and ecosystem increasingly encourage explicit patterns for it. A common, future-facing approach looks like this:

Java
@ApplicationScoped
@Eager // conceptual example – not yet standardized
public class CacheInitializer {

    @Inject
    DataSource dataSource;

    public void init() {
        // warm up cache, validate connections, preload data
    }
}

This pattern reflects where CDI is heading: explicit lifecycle control, predictable startup, and better alignment with modern deployment environments such as Kubernetes and serverless platforms.

Jakarta MVC 3.1: Simplicity for Web-Facing Applications

While REST dominates back-end communication, traditional MVC still plays an important role in web applications. Jakarta MVC 3.1 continues to offer a clean, minimal programming model for server-side rendered applications. A simple controller illustrates the philosophy:

Java
@Controller
@Path("/hello")
public class HelloController {

    @GET
    public String hello() {
        return "hello.jsp";
    }
}

Jakarta MVC deliberately avoids complexity. It integrates cleanly with CDI, Jakarta REST, and the broader platform, making it a practical choice for applications that require server-side views without adopting heavyweight frameworks.

The Data Age: Persistence, Repositories, and Query Alignment

Jakarta EE 12 earns the title of the Data Age not because it introduces yet another persistence abstraction, but because it finally aligns several long-evolving ones into a coherent model.
Instead of forcing developers into a single way to access data, the platform now embraces multiple styles — repository-centric, ORM-centric, and query-centric — while ensuring they fit together cleanly. At the repository level, Jakarta Data 1.1 continues to define how applications express intent when accessing data. But Jakarta EE 12 shifts attention away from basic CRUD — which was already stable in Jakarta EE 11 — and toward how repositories express queries dynamically and portably. This is where restrictions become central. Restrictions provide a fluent, type-safe way to construct queries programmatically, without embedding query strings or leaking database semantics into application code. They are especially important in Jakarta EE 12 because they sit precisely at the intersection of repositories, querying, and database portability. Given a repository like this:

Java
@Repository
public interface Countries extends CrudRepository<Country, String> {

    @Find
    List<Country> filter(Restriction<Country> restriction);
}

A restriction represents a reusable predicate over the domain model. Simple queries remain expressive without becoming opaque:

Java
List<Country> european = countries.filter(
    Restrict.all(
        _Country.code.in("FI", "FR", "DE", "IT")
    )
);

More complex queries can be composed fluently, combining multiple conditions with clear semantics:

Java
List<Country> result = countries.filter(
    Restrict.all(
        _Country.code.notNull(),
        _Country.code.in("FI", "FR", "GR"),
        _Country.code.notEqualTo("FR")
    )
);

Logical alternatives are just as explicit:

Java
List<Country> americasOrAsia = countries.filter(
    Restrict.any(
        _Country.code.equalTo("CO"),
        _Country.code.equalTo("MY")
    )
);

What matters here is not syntactic convenience, but semantic stability. These restrictions mean the same thing whether the underlying store is relational or NoSQL. Jakarta EE 12 leans into this model because it captures the shared behavior of persistence systems, without flattening their differences.
This repository-level querying model aligns naturally with Jakarta Query, which becomes a first-class platform specification in Jakarta EE 12. Jakarta Query defines the Jakarta Common Query Language (JCQL), extracting the common subset of JPQL and Jakarta Data Query into a single place. String-based queries and restriction-based queries are no longer competing ideas; they are complementary tools within the same conceptual framework. On the ORM side, Jakarta Persistence 4.0 evolves independently while remaining compatible. As explained by Gavin King in his Jakarta Persistence 4.0 milestone article, the focus is on clarity, modern Java usage, and the provision of alternative programming models — not on absorbing repository semantics. One example of this is the new EntityAgent, which allows direct, stateless interaction with the database:

Java
var book = factory.callInTransaction(EntityAgent.class, agent -> {
    return agent.get(Book.class, isbn);
});

book.setTitle("Hibernate in Action");

factory.runInTransaction(EntityAgent.class, agent -> {
    agent.update(book);
});

This model does not replace repositories. Instead, it exists alongside them, serving developers who prefer explicit database interaction over persistence contexts. Jakarta EE 12 makes room for both approaches, without forcing either. Query alignment continues with improvements in projection and typing.
With Jakarta Persistence 4.0 and Jakarta Query, record-based projections no longer require special syntax:

Java
record Summary(String title, String isbn, LocalDate date) {}

var summaries = agent.createQuery("""
        select title, isbn, pubDate
        from Book
        where title like ?1
        """)
    .ofType(Summary.class)
    .setParameter(1, "%Jakarta%")
    .getResultList();

Meanwhile, Jakarta NoSQL 1.1 adopts the same query concepts, exposing a Query API that mirrors the relational model while remaining suitable for non-relational databases:

Java
Query query = Query.of("where code = :code")
    .bind("code", "FI");

Optional<Country> country = query.singleResult(Country.class);

Results can be consumed uniformly as Optional, List, or Stream, reinforcing the idea that query semantics belong to the platform, not to individual storage technologies. Taken together, these changes explain why Jakarta EE 12 is best described as the Data Age. The platform no longer treats persistence as a single abstraction centered on tables or entities. Instead, it provides a layered and aligned model:

- Jakarta Data expresses intent through repositories and restrictions
- Jakarta Query defines shared query semantics
- Jakarta Persistence provides relational lifecycle and ORM control
- Jakarta NoSQL brings non-relational stores into the same conceptual space

The result is not simplification by removing options, but simplification by alignment. Jakarta EE 12 gives developers the freedom to choose how they work with data — while ensuring those choices remain consistent, portable, and predictable across the platform.

Jakarta EE Meets AI: The Rise of Agentic Applications

Another notable addition to Jakarta EE 12 is the emergence of Jakarta Agentic AI, a new specification that has passed its creation review. Its goal is to define vendor-neutral APIs for building, deploying, and operating AI agents within Jakarta EE runtimes. This is not about embedding AI frameworks directly into the platform.
Instead, it provides structural concepts — agents, triggers, decisions, actions, and outcomes — that integrate naturally with CDI, persistence, and enterprise services. A simplified example illustrates the intent:

Java
@Agent
public class FraudDetectionAgent {

    @Inject
    LargeLanguageModel model;

    @Inject
    EntityManager entityManager;

    @Trigger
    void processTransaction(@Valid BankTransaction transaction) {
        // decide whether this transaction should be evaluated
    }

    @Decision
    Result checkFraud(BankTransaction transaction) {
        String output = model.query(
            "Is this a fraudulent transaction?", transaction);
        return new Result(isFraud(output));
    }

    @Action
    void handleFraud(Fraud fraud, BankTransaction transaction) {
        // notify systems and users
    }

    @Outcome
    void finalizeTransaction(BankTransaction transaction) {
        // persist result
    }
}

The significance here is architectural. Jakarta EE is acknowledging that AI-driven workflows are becoming a first-class concern in enterprise systems — and that they must integrate cleanly with data, transactions, and security.

Conclusion

Jakarta EE 12 is not defined by a single headline feature. Its importance lies in alignment.

- Data access converges through Jakarta Data, Persistence, NoSQL, and Query
- Configuration, integration, and cleanup improve developer productivity
- CDI and MVC continue to evolve without unnecessary reinvention
- AI enters the platform in a structured, enterprise-ready way

Calling Jakarta EE 12 the Data Age is not marketing — it reflects a deliberate architectural shift. Enterprise Java is no longer optimizing for a single persistence model or programming style. It is optimizing for coherence across diversity. Milestone 2 makes that direction visible. The final release will make it real.

By Otavio Santana DZone Core CORE
Next-Level Persistence in Jakarta EE: How We Got Here and Why It Matters

Enterprise Java persistence has never really been about APIs. It has always been about assumptions. Long before frameworks, annotations, or repositories entered the picture, the enterprise Java ecosystem was shaped by a single, dominant belief: persistence meant relational databases. That assumption influenced how applications were designed, how teams reasoned about data, and how the Java platform itself evolved. This article is inspired by a presentation given by Arjan Tijms, director of OmniFish, titled “Next-level persistence in Jakarta EE: Jakarta Data and Jakarta NoSQL.” Delivered in 2024, the talk offers a clear and pragmatic view of why Jakarta EE persistence needed to evolve, how Jakarta Data fits into the platform, and how it relates to Jakarta Persistence and Jakarta NoSQL. While the presentation provides the technical backbone, this article expands on the historical context and architectural motivations behind that evolution. When Java entered the enterprise world, relational databases were already well established, operationally understood, and widely trusted. It was therefore natural that early persistence efforts focused on SQL-based systems. JDBC provided low-level connectivity, and later Jakarta Persistence (JPA) formalized object–relational mapping as the primary way to bridge object-oriented code and relational schemas. For transactional systems with stable schemas and strong consistency requirements, this approach worked extremely well. Over time, JPA became so successful that it effectively defined what “persistence” meant in enterprise Java. Persistence was treated as an ORM problem, and the database was implicitly assumed to be relational. For many years, this assumption held true, and there was little incentive to challenge it. That changed as systems began to scale. 
The Rise of NoSQL and Polyglot Persistence As enterprise applications grew more distributed and data-intensive, new requirements emerged that did not always align well with relational databases. Horizontal scalability, flexible schemas, low-latency access, and data models optimized for specific access patterns became increasingly important. This shift led to the rise of NoSQL databases, including document stores, key-value stores, column databases, and graph databases, each designed to excel in a particular context rather than to serve as a universal solution. From this diversity emerged the idea of polyglot persistence. Instead of forcing all data into a single model, architects began selecting the most appropriate database for each part of the system. The analogy is linguistic rather than technical: a polyglot speaker chooses the best language depending on the context. In the same way, a system might use a relational database for financial transactions, a document store for evolving aggregates, and a key-value store for high-throughput lookups. While this approach gained traction architecturally, enterprise Java lagged behind at the platform level. There was even a public promise around Java EE 7 to include NoSQL integration as part of the standard platform. That promise, however, never materialized. As a result, developers who wanted to use NoSQL databases were forced to rely on vendor-specific APIs, framework-level abstractions, or entirely separate programming models. Persistence in Java became fragmented, with little guidance on how different approaches should coexist. This gap eventually triggered a community-driven effort to bring NoSQL back into the Java standard ecosystem. On July 14th, 2016, Otávio Santana proposed a project to reintroduce NoSQL integration into enterprise Java. 
Initially named Diana and targeted at the Apache Software Foundation, the project was motivated by a simple observation: persistence in Java had become too tightly coupled to relational assumptions. When Java EE transitioned to Jakarta EE under the Eclipse Foundation, the project followed. It was renamed Eclipse JNoSQL and officially moved to the Eclipse Foundation on November 10th, 2016. The initial ambition was broad, aiming to define common abstractions across very different NoSQL databases. Over time, however, the scope was deliberately reduced. Instead of attempting to standardize everything, the focus narrowed to areas where meaningful commonality actually exists, such as mapping and basic access patterns. That reduction was not a failure; it was a necessary architectural decision. After several years of iteration, Jakarta NoSQL reached its first official release on March 27th, 2024. During this process, an important realization emerged. Many persistence operations developers perform daily are not fundamentally ORM problems. They are application-level data access problems: creating, retrieving, updating, and deleting data; applying simple queries; projecting partial views; handling pagination; and managing transactional boundaries. These concerns exist regardless of whether the underlying database is relational. That realization led directly to the creation of Jakarta Data.

Jakarta Data

Jakarta Data, released as version 1.0 on June 6th, 2024, introduces a repository-style API that standardizes data access across both relational and NoSQL databases. Rather than replacing Jakarta Persistence, it complements it by operating at a different abstraction level. Jakarta Persistence remains an ORM API, strongly tied to SQL semantics, JDBC data sources, and provider-specific behavior. Jakarta Data, by contrast, is largely store-agnostic and optimized for fast, application-level data access.
This difference in abstraction is essential to understanding why Jakarta Data exists. Jakarta Persistence addresses how an object is mapped and stored in a relational database. Jakarta Data answers a different question: how an application interacts with data, independent of how that data is physically stored. At the core of Jakarta Data is the repository concept. A repository is defined as a mediator between an application’s domain logic and the underlying data storage, whether that storage is relational, NoSQL, or something else entirely. While this idea will feel familiar to developers who have used Spring Data or similar frameworks, Jakarta Data is careful not to enforce a single repository style. Early approaches favored DAO-style repositories, with one repository per entity and method-name–based queries. This approach is easy to understand but can feel unnatural in real-world domains, where many operations span multiple entities. Jakarta Data, therefore, also supports non-generic repositories, allowing engineers to group operations in ways that reflect domain concerns rather than database structure. The emphasis is on type safety and intentional design, not on mechanical uniformity. This layered approach becomes clearer when looking at the programming model. With Jakarta NoSQL, developers work with familiar mapping annotations and a template-based API. For example:

Java
@Entity
public class NoSQLEntity {

    @Id
    private String id;

    @Column
    private String name;
}

Using the Jakarta NoSQL template, interacting with the database is explicit and straightforward:

Java
@Inject
Template template;

Optional<NoSQLEntity> entity = template.find(NoSQLEntity.class, "1");

NoSQLEntity newEntity = new NoSQLEntity("1", "test");
template.insert(newEntity);

Jakarta Data builds on top of this foundation by introducing repositories that express intent more directly.
A simple repository example looks like this:

Java
@Repository
public interface Library {

    @Insert
    void addToCollection(Book book);

    @Delete
    void removeFromCollection(Book book);
}

Within a Jakarta Data repository, developers can define default methods; lifecycle methods annotated with @Insert, @Update, @Delete, or @Save; derived query methods annotated with @Find; explicit query methods annotated with @Query; projections; pagination; and resource accessor methods. The goal is not to hide complexity, but to concentrate it within a consistent, well-defined abstraction. Jakarta Data also introduces an important distinction in execution style. In Jakarta EE 11, repositories are stateless: changes must be explicitly persisted by calling a save method. Jakarta EE 12, currently in progress, will introduce managed, stateful repositories that automatically track and persist changes to entities. This allows teams to choose between explicit control and managed convenience based on architectural needs. Jakarta Data did not appear in isolation. It builds on lessons learned from earlier efforts such as GORM, Spring Data, DeltaSpike Data, and Panache. What differentiates Jakarta Data is not novelty, but standardization. It provides a vendor-neutral model that integrates cleanly with the rest of the Jakarta EE platform. Today, Eclipse JNoSQL 1.1.10 is the only implementation that spans NoSQL databases and supports Jakarta Data. Through its Jakarta Persistence driver, it can run on top of any JPA provider, such as Hibernate or EclipseLink. This enables the same repository abstraction to be applied across fundamentally different storage technologies. Looking ahead, the next major step is Jakarta Query, a specification designed to align query languages across Jakarta Persistence, Jakarta Data, and Jakarta NoSQL. Together, these efforts mark what many describe as the beginning of the “Data Age” in Jakarta EE.
Jakarta Data is not “JPA 2.0,” and Jakarta NoSQL is not an ORM for non-relational databases. Instead, they represent a shift in perspective. Persistence in enterprise Java is no longer defined solely by how data is stored, but by how applications interact with data in a world where multiple storage paradigms coexist. That shift has taken more than a decade — and Jakarta EE 11 is where it finally becomes visible.

References and Inspiration

- Arjan Tijms – Next-level persistence in Jakarta EE: Jakarta Data and Jakarta NoSQL: https://omnifish.ee/next-level-persistence-in-jakarta-ee-jakarta-data-and-jakarta-nosql-jfall-slides/
- Slides: https://drive.google.com/file/d/1qQH919SU4dcMFLclJrml4l-EhoB4V85x/view

By Otavio Santana DZone Core CORE
Best Java GUI Frameworks for Modern Applications

Java has become one of the world’s most versatile programming languages, chosen for its adaptability, stability, and platform independence. Its extensive ecosystem encompasses virtually every application type, from web development to enterprise solutions, game design, the Internet of Things (IoT), and beyond. With an estimated 51 billion active Java Virtual Machines (JVMs) globally, there is no question that Java powers a substantial portion of modern software infrastructure. However, designing dynamic and visually engaging applications takes more than coding skills — it requires the right tools. Java Graphical User Interface (GUI) frameworks are essential tools that transform basic code into visually appealing, interactive applications. This article explores the best Java GUI frameworks, highlighting their unique strengths, limitations, and ideal use cases to help you choose the best fit for your next project.

What to Consider When Choosing a Java GUI Framework

Selecting the right GUI framework for Java is pivotal to creating applications that excel in functionality and user experience (UX). This is because each framework offers distinct features that cater to specific requirements. Here’s a brief overview of some critical factors to consider:

- Performance: Java GUI frameworks vary in their ability to handle resource-intensive applications. Some are optimized for faster execution and better memory management, while others may trade performance for ease of development.
- Scalability: As your project grows, the GUI framework you choose should seamlessly support expansion. Some frameworks are particularly well-suited for applications that handle large datasets or high user volumes.
- Cross-Platform Compatibility: Not all Java GUI frameworks perform equally across operating systems. While some are truly cross-platform, others may require additional adjustments.
Consider where your application will run and choose accordingly.

- Ease of Use & Learning Curve: Frameworks vary in complexity. If you want faster adoption, look for strong documentation and community support.
- Community & Support: An active user base and thorough documentation make development and debugging easier and ensure long-term maintainability.

8 Best Java GUI Frameworks for Modern Applications

Below is an overview of the top eight Java GUI frameworks, each with unique features, strengths, and ideal use cases.

1. Swing

Swing is one of Java’s oldest and most widely used GUI frameworks. Built on top of the Abstract Window Toolkit (AWT), it provides a rich set of pre-built components such as buttons, tables, and lists.

Pros:
- Highly customizable components for advanced UI designs
- Platform-independent across operating systems
- Part of the Java Standard Library, making integration easy

Cons:
- Slower performance for highly graphical applications
- Outdated look and feel without customization
- Limited support for modern styling

Best Use Case:
- Desktop applications requiring flexibility and cross-platform compatibility
- Applications where a fully customizable UI is required

2. SWT (Standard Widget Toolkit)

Originally developed by IBM for the Eclipse IDE, SWT uses native OS widgets to provide a natural look and feel.

Pros:
- Fast performance using native widgets
- Platform-specific appearance
- Strong support for productivity tools

Cons:
- Less portable across platforms
- Reliance on native libraries complicates distribution
- More challenging to customize

Best Use Case:
- Desktop applications that need to resemble native OS applications closely
- Applications requiring native OS integration and high performance

3. JGoodies

JGoodies extends Swing with libraries such as JGoodies Forms and Binding to simplify layout management and data binding.
Pros:
- Cleaner, more modern look than standard Swing
- Powerful layout managers for complex UIs, including FormLayout
- Simplified data binding and validation

Cons:
- Requires additional libraries
- Smaller community compared to JavaFX
- Limited support for highly custom UI components

Best Use Case:
- Business applications requiring advanced layouts and data validation
- Where complex data binding and validation are required

4. JavaFX

JavaFX is a modern GUI framework designed for visually rich applications with support for 3D graphics, media streaming, and many other advanced UI types.

Pros:
- Strong multimedia and 3D support
- Scene Builder simplifies UI design
- High performance for complex visuals

Cons:
- Steeper learning curve
- Larger memory footprint than simpler frameworks
- Limited support in legacy applications

Best Use Case:
- Applications that require advanced graphics, animations, and media playback
- High-performance desktop applications with modern UI design

5. JIDE

JIDE is an enterprise-grade GUI framework offering advanced components for data-intensive applications.

Pros:
- Extensive enterprise-focused component library
- High flexibility and customization
- Ideal for complex data-driven UIs

Cons:
- Overkill for small projects
- Expensive licensing
- Limited open-source community

Best Use Case:
- Enterprise applications with complex UI requirements
- Data-intensive applications with advanced interaction needs

6. Apache Pivot

Apache Pivot is an open-source GUI toolkit using XML-based layouts.

Pros:
- Lightweight and cross-platform
- Simple XML UI definitions
- Small footprint, ideal for less resource-intensive applications

Cons:
- Limited advanced UI components
- Smaller community
- Lower performance for complex visuals

Best Use Case:
- Lightweight cross-platform applications needing easily maintainable UIs
- Projects that prioritize simplicity over advanced visual features

Note: Apache Pivot moved to the Attic in January 2025.

7. Hibernate

Hibernate is an ORM framework, not a GUI framework, but it complements GUI development through robust data persistence and retrieval.

Pros:
- Strong database integration and support for data handling
- Reduces boilerplate code
- Ideal for data-driven applications

Cons:
- No UI components, primarily a database-oriented tool
- Requires database expertise
- Not suitable for graphical applications

Best Use Case:
- Data-driven applications with minimal UI requirements
- Backend-heavy applications that need strong database interaction

8. Spring

Spring is a versatile framework used primarily for backend development but can support GUI applications through integrations.

Pros:
- Highly scalable
- Strong backend and integration support
- Extensive documentation and community

Cons:
- Not GUI-focused
- Complex for small applications
- Steeper learning curve for full-stack integration

Best Use Case:
- Enterprise applications requiring robust backend services with GUI integration
- Web applications integrated with Java backend services
Tabular Comparison of the Best Java GUI Frameworks

Let's look at the critical elements of each Java GUI framework in a tabular form:

| Framework | Key Features | Best Use Cases | Pros | Cons |
| --- | --- | --- | --- | --- |
| Swing | Pre-built components | Cross-platform desktop applications | Highly customizable | Lower performance for graphical apps |
| SWT | Native OS integration | Native-looking desktop apps | Fast, natural OS feel | Less portable across platforms |
| JGoodies | Basic, lightweight components | Business applications, advanced layout customization, complex data binding and validation | Up-to-date clean look, powerful layout managers, validation tools | Unsuitable for apps requiring widespread use of custom UI components |
| JavaFX | 3D graphics, media support | Visually rich apps | Advanced graphics and multimedia | Steeper learning curve |
| JIDE | Enterprise-grade components | Data-driven enterprise applications | Extensive library for complex UIs | Costly advanced components |
| Apache Pivot | XML-based design, cross-platform | Lightweight cross-platform applications | Simple UI definitions | Limited high-performance features |
| Hibernate | Database binding | Data-heavy backend applications | Excellent for data handling | Primarily a database framework |
| Spring | Backend integration, scalable | Complex, large-scale enterprise applications | Extensive backend support | Not GUI-focused; complex setup |

Conclusion

Choosing the right Java GUI framework is a decisive factor in your project’s success. Each framework offers strengths tailored to specific use cases. Aligning your choice with requirements such as performance, scalability, cross-platform compatibility, and ease of use will help you build a robust, user-centric application that meets both current and future needs.

By Rodolfo Ortega
Rate Limiting Beyond “N Requests/sec”: Adaptive Throttling for Spiky Workloads (Spring Cloud Gateway)

Most teams add rate limiting after an outage, not before one. I’ve done it both ways, and the “after” version usually looks like this: someone picks a number (say 500 rps), wires up a filter, and feels safer. Then the next incident happens anyway — because the problem wasn’t the number.

The real problems tend to be:

- Burstiness (traffic arrives in clumps, not a smooth stream)
- Retry amplification (timeouts → retries → more load → more timeouts)
- Noisy neighbors (one client or tenant degrades everyone)
- Capacity drift (your service is “fine” at 10:00 am and struggling at 10:10 am)

This article is about building a rate-limiting system at the gateway — still simple enough to run in a normal Spring Cloud Gateway (SCG) setup — but smarter than a single static “N req/sec.”

What “Good” Looks Like

I judge a limiter by these outcomes, not by the algorithm name:

- Stability: The system doesn’t spiral when traffic spikes.
- Fairness: One client can’t starve others.
- Predictability: Allowed requests don’t suffer runaway tail latency.
- Graceful degradation: When capacity drops, you get a controlled brownout, not a full blackout.

If your limiter returns 429 but your downstream still melts, you didn’t actually protect anything — you just moved the pain around.

Token Bucket vs. Leaky Bucket

You’ll hear “token bucket vs. leaky bucket” a lot. Here’s the version that matters in real systems.

Token Bucket: Lets You Burst, Enforces an Average

A token bucket is like a wallet that refills at a steady rate. Each request spends a token. If you have tokens saved up, you can burst. If you don’t, you wait (or get a 429).

Why it’s popular at the gateway: Bursts are common and often harmless. A user clicking “refresh” three times shouldn’t be punished if your system can handle it.

Where it bites you: If you set the bucket too large, a burst can still shock a fragile downstream. Token bucket doesn’t magically make your database love bursts.
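To make the mechanics concrete before we wire up SCG, here is a minimal single-node token bucket sketch. This is my own illustration, not SCG's code (SCG's RedisRateLimiter applies the same idea atomically in Redis), and the class and method names are invented for the example:

```java
// Minimal token bucket: refills at replenishRate tokens/sec, allows
// bursts up to burstCapacity. Illustrative single-JVM sketch only.
class TokenBucket {
    private final double replenishRate;   // tokens added per second (long-run average)
    private final double burstCapacity;   // max tokens the bucket can hold (burst bound)
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double replenishRate, double burstCapacity) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity;      // start full: permits an initial burst
        this.lastRefillNanos = System.nanoTime();
    }

    // Returns true if the request may proceed; false means respond 429.
    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        double elapsedSec = (now - lastRefillNanos) / 1_000_000_000.0;
        // Refill proportionally to elapsed time, capped at burstCapacity.
        tokens = Math.min(burstCapacity, tokens + elapsedSec * replenishRate);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

The two constructor arguments map directly onto the knobs you will see in the SCG configuration below: burstCapacity bounds the worst-case clump a downstream can receive, and replenishRate is the long-run average admission rate.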
Leaky Bucket: Smooths Output, But Queues Can Become a Slow Failure

The leaky bucket tries to output at a steady rate. If input is higher, requests pile up. Smoothing is useful — but queuing is not free. Queues add latency, and latency triggers retries. I’ve seen systems “look fine” on throughput charts while users time out because everything is stuck in a backlog.

My rule of thumb:

- Use a token bucket at the edge (gateways) to handle normal burstiness.
- Use smoothing closer to fragile dependencies only if you can bound the queue and the time spent waiting.

Spring Cloud Gateway Setup: The Part You Can Implement Today

SCG has a built-in RequestRateLimiter filter backed by Redis (RedisRateLimiter). It’s a token-bucket style limiter and it’s easy to wire up.

Snippet 1: Per-Client Fairness With a KeyResolver

application.yml:

```yaml
spring:
  redis:
    host: localhost
    port: 6379
  cloud:
    gateway:
      default-filters:
        # Optional: standardize error responses a bit
        - AddResponseHeader=X-Gateway, scg
      routes:
        - id: api_route
          uri: http://localhost:8081
          predicates:
            - Path=/api/**
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 50
                redis-rate-limiter.burstCapacity: 100
                key-resolver: "#{@clientKeyResolver}"
```

KeyResolver (use a header, API key, or principal):

```java
import org.springframework.cloud.gateway.filter.ratelimit.KeyResolver;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import reactor.core.publisher.Mono;

@Configuration
public class RateLimitKeys {

    @Bean
    public KeyResolver clientKeyResolver() {
        return exchange -> {
            // Pick something stable and hard to spoof in your environment:
            // API key, JWT subject, client id, mTLS cert fingerprint, etc.
            String clientId = exchange.getRequest().getHeaders().getFirst("X-Client-Id");
            if (clientId == null || clientId.isBlank()) clientId = "anonymous";
            return Mono.just(clientId);
        };
    }
}
```

Snippet 2: Add a Global Ceiling (Second Limiter)

This is the step a lot of teams miss. They build per-client limits and still overload downstream because all clients spike together. SCG lets you stack filters. Add a second limiter keyed to a constant string.

application.yml:

```yaml
spring:
  cloud:
    gateway:
      routes:
        - id: api_route
          uri: http://localhost:8081
          predicates:
            - Path=/api/**
          filters:
            # Per-client fairness
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 50
                redis-rate-limiter.burstCapacity: 100
                key-resolver: "#{@clientKeyResolver}"
            # Global system budget
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 500
                redis-rate-limiter.burstCapacity: 800
                key-resolver: "#{@globalKeyResolver}"
```

```java
@Bean
public KeyResolver globalKeyResolver() {
    return exchange -> Mono.just("global");
}
```

The Part That Makes It “Beyond N req/sec”: Adaptive Throttling

Static limits assume your capacity is static. It isn’t. Capacity changes because of deployments, garbage collection, a slow dependency, cache cold starts, noisy neighbors in your cluster, you name it. If your limiter doesn’t respond, the gateway will happily admit traffic at a rate your service can’t handle right now.

I’ve learned to keep the controller simple and stable. Avoid overfitting. Pick a few signals you trust and apply conservative adjustments.

What Signals Actually Work

You can start with two or three:

- p95 latency at the gateway or a key downstream hop
- 5xx rate
- saturation (thread pool queue depth, connection pool usage, CPU, or consumer lag)

How They Should Behave (This Matters More Than the Formula)

- Tighten fast when things look bad (protect quickly).
- Relax slowly when things look good (avoid oscillation).
- Add hysteresis: don’t flip-flop based on one bad interval.
Snippet 3: Pseudocode Controller (Implementation-Agnostic)

```text
Every 10 seconds:
  bad = (p95_latency_ms > 300) OR (error_rate_5xx > 1%) OR (saturation > 0.85)

  if bad:
      limit = limit * 0.85       # tighten quickly
      healthy_streak = 0
  else:
      healthy_streak += 1
      if healthy_streak >= 6:
          limit = limit * 1.05   # relax slowly
          healthy_streak = 6     # cap streak to avoid runaway

  limit = clamp(limit, min=100, max=800)
  publish(limit) -> gateway config source
```

“publish(limit)” Without Turning Your System Into a Science Project

You have options. The fastest path is usually one of these:

- Store the current limit in Redis and read it in a custom limiter
- Use Spring Cloud Config + refresh (works, but can be heavy if you refresh too often)
- Use a feature flag system if you already have one

If you want to keep SCG’s RedisRateLimiter but make it dynamic, you typically end up writing a small wrapper/custom filter so you can pull replenishRate and burstCapacity from a dynamic source. That’s a good second iteration; don’t start there if you’re still proving the concept.

Testing It Like You Mean It (Not Just “It Returns 429”)

If you only test “send 1000 requests, see some 429,” you’ll miss the failure modes that matter. Here’s the test plan I actually use:

- Baseline: Steady traffic from multiple clients. Confirm p95 stays stable.
- Noisy neighbor: One client ramps aggressively; others remain steady. Confirm fairness.
- Aggregate spike: All clients ramp together. Confirm the global ceiling protects the downstream.
- Degradation: Inject latency or reduce downstream capacity. Confirm adaptive throttling tightens quickly.
- Recovery: Remove the injection. Confirm throttling relaxes slowly (no flapping).

If you do this once with a simple load tool (k6, Gatling, JMeter), you’ll learn more than debating bucket algorithms for a week.

```javascript
import http from "k6/http";
import { sleep } from "k6";

export const options = { vus: 20, duration: "30s" };

export default function () {
  const clientId = __VU % 2 === 0 ? "clientA" : "clientB";
  http.get("http://localhost:8080/api/hello", {
    headers: { "X-Client-Id": clientId },
  });
  sleep(0.1);
}
```

Closing Thought

Rate limiting is usually presented as a switch: on or off, limit or no limit. In practice, it’s closer to a control system that protects your service from the physics of traffic — bursts, retries, and shifting capacity. Spring Cloud Gateway gives you solid primitives. The “beyond N req/sec” part is combining them: fairness, global budgets, and an adaptive loop that reacts before users feel the outage.

By Varun Pandey
How Global Payment Processors like Stripe and PayPal Use Apache Kafka and Flink to Scale

The recent announcement that Global Payments will acquire Worldpay for $22.7 billion has once again put the spotlight on the payment processing space. This move consolidates two giants and signals the growing importance of real-time, global payment infrastructure. But behind this shift is something deeper: data streaming has become the backbone of modern payment systems.

From Stripe’s 99.9999% Kafka availability to PayPal streaming over a trillion events per day, and Payoneer replacing its existing message broker with data streaming, the world’s leading payment processors are redesigning their core systems around streaming technologies. Even companies like Worldline, which developed its own Apache Kafka management platform, have made Kafka central to their financial infrastructure.

Payment processors operate in a world where every transaction must be authorized, routed, verified, and settled in milliseconds. These are not just financial operations — they are critical real-time data pipelines. From fraud detection and currency conversion to ledger updates and compliance checks, the entire value chain depends on streaming architectures to function without delay. This transformation highlights how data streaming is not just an enabler but a core requirement for building fast, secure, and intelligent payment experiences at scale.

What Is a Payment Processor?

A payment processor is a financial technology company that moves money between the customer, the merchant, and the bank. It verifies and authorizes transactions, connects with card networks, and ensures settlement between parties. In other words, it acts as the middleware that powers digital commerce — both online and in person. Whether swiping a card at a store, paying through a mobile wallet, or making an international wire transfer, a payment processor is often involved.
Key responsibilities include:

- Validating payment credentials
- Performing security and fraud checks
- Routing payment requests to the right financial institution
- Confirming authorization and handling fund settlement

Top Payment Processors

Based on transaction volume, global presence, and market impact, here are some of the largest players today:

- FIS / Worldpay – Enterprise focus, global reach
- Global Payments – Omnichannel, merchant services
- Fiserv / First Data – Legacy integration, banks & retailers
- PayPal – Global digital wallet and checkout
- Stripe – API-first, developer-friendly platform
- Adyen – Unified commerce for large enterprises
- Block (Square) – SME focus, hardware + software
- Payoneer – Cross-border B2B payments
- Worldline – Strong European presence

Each of these companies operates at massive scale, handling millions of transactions per day. Their infrastructure needs to be secure, reliable, and real time. That’s where data streaming comes in.

Payment Processors in the Context of Financial Services

Payment processors don’t work in isolation. They integrate with a wide range of:

- Banking systems: to access accounts and settle funds
- Card networks: such as Visa, Mastercard, and Amex
- B2B applications: such as ERP, POS, and eCommerce platforms
- Fintech services: including lending, insurance, and digital wallets

This means their systems must support real-time messaging, low latency, and fault tolerance. Downtime or errors can mean financial loss, customer churn, or even regulatory fines.

Why Payment Processors Use Data Streaming

Traditional batch processing isn’t good enough in today’s payment ecosystem. There are 20+ problems and challenges with batch processing. Financial services must react in real time to critical events such as:

- Payment authorizations
- Suspicious transaction alerts
- Currency conversion requests
- API-based microtransactions

In this context, Apache Kafka and Apache Flink, often deployed via Confluent, are becoming foundational technologies.
A data streaming platform enables an event-driven architecture, where each payment, customer interaction, or system change is treated as a real-time event. This model supports the decoupling of services, meaning systems can scale and evolve independently — an essential feature in mission-critical environments.

The Benefits of Apache Kafka and Flink for Payments

Apache Kafka is the de facto backbone for modern payment platforms because it supports the real-time, mission-critical nature of financial transactions. Together with Apache Flink, it enables a data streaming architecture that meets the demands of today’s fintech landscape. Key benefits include:

- Low latency – Process events in milliseconds
- High throughput – Handle millions of transactions per day
- Scalability – Easily scale across systems and regions
- High availability and failover – Ensure global uptime and disaster recovery
- Exactly-once semantics – Maintain financial integrity and avoid duplication
- Event sourcing – Track every action for audit and compliance
- Decoupled services – Support modular, microservices-based architectures
- Data reusability – Stream once, use across multiple teams and use cases
- Scalable integration – Bridge modern APIs with legacy banking systems

In the high-stakes world of payments, real-time processing is no longer a competitive edge — it is a core requirement. Data streaming provides the infrastructure to make that possible. Apache Kafka is not just for big data analytics. It enables transactional processing without compromising performance or consistency. See the full blog post for detailed use cases and architectural insights: Analytics vs. Transactions with Apache Kafka.

Data Streaming with Apache Kafka in the Payment Processor Ecosystem

Let’s look at how top payment processors use Apache Kafka.

Stripe: Apache Kafka for 99.9999% Availability

Stripe is one of the world’s leading financial infrastructure platforms for internet businesses.
It powers payments, billing, and financial operations for millions of companies — ranging from startups to global enterprises. Every transaction processed through Stripe is mission-critical, not just for Stripe itself, but for its customers whose revenue depends on seamless, real-time payments.

At the heart of Stripe’s architecture lies Apache Kafka, which serves as the financial source of truth. Kafka is used by the vast majority of services at Stripe, acting as the backbone for processing, routing, and tracking all financial data across systems.

The core use case is Stripe’s general ledger, which models financial activity using a double-entry bookkeeping system. This ledger covers every fund movement within Stripe’s ecosystem and must be accurate, complete, and observable in real time. Kafka ensures that all financial events — from payment authorizations to settlements and refunds — are processed with exactly-once semantics. This guarantees that the financial state is never overcounted, undercounted, or duplicated, which is essential in regulated environments.

Stripe also integrates Apache Pinot as a real-time OLAP engine, consuming events directly from Kafka for instant analytics. This supports everything from operational dashboards to customer-facing reporting, all without pulling from offline batch systems.

One of Stripe’s most advanced Kafka deployments supports 99.9999% availability, referred to internally as “six nines.” This is achieved through a custom-built multi-cluster proxy layer that routes producers and consumers globally. The proxy system ensures high availability even in the event of a cluster outage, allowing maintenance and upgrades with zero impact on live systems. This enables critical services, like Stripe’s “charge path” (the end-to-end flow of a transaction), to remain fully operational under all conditions.
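Stripe's ledger and proxy layer are proprietary, but the exactly-once guarantee described above builds on standard Kafka producer features: idempotence and transactions. A dependency-free sketch of that configuration follows; the broker address and transactional id are illustrative, and actually sending records would additionally require the kafka-clients library (KafkaProducer plus beginTransaction()/commitTransaction()), omitted here to keep the snippet self-contained:

```java
import java.util.Properties;

class ExactlyOnceProducerConfig {

    // Builds the standard producer settings behind Kafka's exactly-once mode.
    // With kafka-clients on the classpath you would pass these Properties to
    // new KafkaProducer<>(props) and wrap sends in a transaction.
    static Properties exactlyOnceProps(String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // example address
        // Idempotence lets the broker deduplicate retried sends of the same batch.
        props.setProperty("enable.idempotence", "true");
        // A stable transactional.id lets the broker fence zombie producer
        // instances and makes a sequence of sends commit or abort atomically.
        props.setProperty("transactional.id", transactionalId);
        // Settings required by idempotent delivery:
        props.setProperty("acks", "all");
        props.setProperty("retries", Integer.toString(Integer.MAX_VALUE));
        return props;
    }

    public static void main(String[] args) {
        // "ledger-writer-1" is a hypothetical id for illustration.
        Properties props = exactlyOnceProps("ledger-writer-1");
        System.out.println(props.getProperty("enable.idempotence"));
    }
}
```

On the consuming side, readers must also set isolation.level=read_committed so that messages from aborted transactions are never observed.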
Stripe’s use of Kafka illustrates how a data streaming platform can serve not just as a pipeline but as a foundational, highly available, and globally distributed layer of financial truth.

PayPal: Streaming Over a Trillion Events a Day

PayPal is one of the most widely used digital payment platforms in the world, serving hundreds of millions of users and businesses. As a global payment processor, its systems must be fast, reliable, and secure — especially when every transaction impacts real money movement across borders, currencies, and regulatory frameworks.

Apache Kafka plays a central role in enabling PayPal’s real-time data streaming infrastructure. Kafka supports some of PayPal’s most critical workloads, including fraud detection using AI/ML models, user activity tracking, risk and compliance, and application health monitoring. These are not optional features — they are fundamental to the integrity and safety of PayPal’s payment ecosystem.

The platform processes over 1 trillion Kafka messages per day during peak periods, such as Black Friday or Cyber Monday. Kafka enables PayPal to detect fraud in real time, respond to risk events instantly, and maintain continuous observability across services.

PayPal also uses Kafka to reduce analytics latency from hours to seconds. Streaming data from Kafka into tools like Google BigQuery allows the company to generate real-time insights that support payment routing, user experience, and operational decisions.

As a payment processor, PayPal cannot afford downtime or delays. Kafka ensures low latency, high throughput, and real-time responsiveness, making it a critical backbone for keeping PayPal’s global payments platform secure, compliant, and always available.

Payoneer: Migrating to Kafka for Scalable Event-Driven Payments

Payoneer is a global payment processor focused on cross-border B2B transactions. It enables freelancers, businesses, and online sellers to send and receive payments across more than 200 countries.
Operating at this scale requires not only financial compliance and flexibility, but also a robust, real-time data infrastructure.

As Payoneer transitioned from monolithic systems to microservices, it faced serious limitations with its legacy event-driven setup, which relied on RabbitMQ, Apache NiFi, and custom-built routing logic. These technologies supported basic asynchronous communication and change data capture (CDC), but they quickly became bottlenecks as the number of services and data volumes grew. NiFi couldn’t keep up with CDC throughput, RabbitMQ wasn’t scalable for multi-consumer setups, and the entire architecture created tight coupling between services.

To solve these challenges, Payoneer reengineered its internal event bus using Apache Kafka and Debezium. This shift replaced queues with Kafka topics, unlocking critical benefits like high throughput, multi-consumer support, data replay, and schema enforcement. Application events are now published directly to Kafka topics, while CDC data from SQL databases is streamed using Debezium Kafka connectors.

By decoupling producers from consumers and removing the need for complex rule-based queue routing, Kafka enables a cleaner, more resilient event-driven architecture. It also supports advanced use cases like event filtering, enrichment, and real-time analytics.

For Payoneer, this migration wasn’t just a technical upgrade — it was a foundational shift that enabled the company to scale securely, communicate reliably between services, and process financial events in real time. Kafka is now a critical backbone for Payoneer’s modern payment infrastructure.

Worldline: Building an Enterprise-Grade Kafka Management Platform

Worldline is one of Europe’s leading payment service providers, processing billions of transactions annually across banking, retail, and mobility sectors. In this highly regulated and competitive environment, real-time reliability and observability are critical.
These requirements made Apache Kafka a strategic part of Worldline’s infrastructure. While Worldline has not publicly detailed specific payment-related use cases, the company’s investment in Kafka is clear from its public GitHub projects, technical articles, and job postings. Kafka has been adopted across multiple teams and projects, powering asynchronous processing, data integration, and event-driven architectures that support Worldline’s role as a global payment processor.

One of the strongest indicators of Kafka’s importance at Worldline is the development of its own Kafka Manager tool. Built in-house and tailored to enterprise needs, this tool allows teams to monitor, manage, and troubleshoot Kafka clusters with greater efficiency and security — critical in PCI-regulated environments. The tool’s advanced features, such as offset tracking, message inspection, and cluster-wide configuration management, reflect the operational maturity required in the payments industry.

A 2021 blog post provides a good overview of the motivation and includes an interesting (though somewhat outdated) comparison explaining why Worldline built its own Kafka UI. Another notable project is wkafka, a wrapper around the Kafka library designed to initialize and run microservices in the Go programming language.

Worldline also continues to hire Kafka engineers and developers, indicating ongoing investment in scaling and optimizing its streaming infrastructure. Even without detailed public, use-case-focused stories, the message is clear: Kafka is essential to Worldline’s ability to deliver secure, real-time payment services at scale.

Beyond Payments: Value-Added Services for Payment Processors Powered by Data Streaming

Payment processing is just the beginning.
Data streaming supports a growing number of high-value services:

- Fraud prevention: Real-time scoring using historical and streaming features
- Analytics: Instant business insights from live dashboards
- Embedded finance: Integration with third-party platforms like eCommerce, ride-sharing, or SaaS billing
- Working capital loans: Credit decisions made in-stream based on transaction history

These services not only add revenue but also help payment processors retain customers and build ecosystem lock-in.

Looking Ahead: IoT, AI, and Agentic Systems

The future of payment processing is being shaped by:

- IoT payments: Cars, smart devices, and wearables triggering payments automatically
- AI and GenAI: Personal finance agents that manage subscriptions, detect overcharges, and negotiate fees in real time
- Agentic AI systems: Acting on behalf of businesses to manage invoicing, reconciliation, and cash flow, streaming events into Kafka and out to various APIs

Data streaming is the foundation for these innovations. Without real-time data, AI can’t make smart decisions, and IoT can’t interact with payment systems at the edge.

Modern payments are not just about money movement — they’re about data movement. Payment processors are leading examples of how companies create value by processing streams of data, not just rows in a database. A data streaming platform is no longer just a tool; it is a strategic enabler of real-time business. As the fintech space continues to consolidate and evolve, those who invest in event-driven architectures will have the edge in innovation, customer experience, and operational agility.

By Kai Wähner DZone Core CORE

Top Java Experts


Shai Almog

OSS Hacker, Developer Advocate and Entrepreneur,
Codename One

Software developer with ~30 years of professional experience in a multitude of platforms/languages. JavaOne rockstar/highly rated speaker, author, blogger and open source hacker. Shai has extensive experience in the full stack of backend, desktop and mobile. This includes going all the way into the internals of VM implementation, debuggers etc. Shai started working with Java in 96 (the first public beta) and later on moved to VM porting/authoring/internals and development tools. Shai is the co-founder of Codename One, an Open Source project allowing Java developers to build native applications for all mobile platforms in Java. He's the coauthor of the open source LWUIT project from Sun Microsystems and has developed/worked on countless other projects both open source and closed source. Shai is also a developer advocate at Lightrun.

Ram Lakshmanan

yCrash - Chief Architect

Want to become a Java Performance Expert? Attend my master class: https://ycrash.io/java-performance-training

The Latest Java Topics

Square, SumUp, Shopify: Data Streaming for Real-Time Point-of-Sale (POS)
POS systems are transforming into real-time, AI-driven platforms, fueled by mobile payments, Kafka, and Flink to empower every retail merchant.
March 9, 2026
by Kai Wähner DZone Core CORE
· 313 Views
Building a Java 17-Compatible TLD Generator for Legacy JSP Tag Libraries
Solving broken TLD generation in Java upgrades: an annotation-based, build-time approach that keeps JSP tags working and compatible with Java 17.
March 4, 2026
by Sravan Reddy Kathi
· 1,242 Views · 1 Like
Comparing Top 3 Java Reporting Tools
Reporting and document generation are essential for applications in business. Here’s a compact, hands-on overview of three reputable tools for Java.
March 3, 2026
by Sergei Iudaev
· 1,441 Views · 5 Likes
Rethinking Java Web UIs With Jakarta Faces and Quarkus
Do enterprise-grade Java applications really need heavy JavaScript libraries? This is the question we'll try to answer here.
February 27, 2026
by Nicolas Duminil DZone Core CORE
· 1,918 Views · 9 Likes
A Practical Guide to Building Generative AI in Java
Genkit Java makes building generative AI features in Java finally simple. With typed inputs/outputs, structured LLM responses, built-in observability, a powerful DevUI.
February 26, 2026
by Xavier Portilla Edo DZone Core CORE
· 1,897 Views · 2 Likes
How to Configure JDK 25 for GitHub Copilot Coding Agent
Set JDK 25 for the GitHub Copilot coding agent so it can build and test Java 25 projects while working on your tasks.
February 25, 2026
by Bruno Borges
· 1,683 Views · 2 Likes
Data Driven API Testing in Java with Rest-Assured and TestNG: Part 1
Learn how to perform data-driven API automation testing with Rest-Assured using object arrays and TestNG's @DataProvider annotation.
February 23, 2026
by Faisal Khatri DZone Core CORE
· 1,181 Views · 2 Likes
Building a Sentiment Analysis Pipeline With Apache Camel and Deep Java Library (DJL)
This tutorial shows how to build a sentiment analysis pipeline entirely in Java using Apache Camel and Deep Java Library (DJL).
February 23, 2026
by Vignesh Durai
· 1,883 Views · 2 Likes
Testing Legacy JSP Code
In this article, learn how to test JSP with the least effort while getting the most out of the automated tests, and keep focus on what matters.
February 18, 2026
by Zoltán Csorba
· 1,533 Views
Why “At-Least-Once” Is a Lie: Lessons from Java Event Systems at Global Scale
At-least-once delivery keeps data flowing, but retries can duplicate effects, corrupting timelines. Reliability comes from replay-safe consumers and controlled effects.
February 18, 2026
by Krishna Kandi
· 2,149 Views · 2 Likes
Beyond Ingestion: Teaching Your NiFi Flows to Think
Stop just moving data with NiFi — make it smarter. Here's how to embed an AI model right into your flow using a Java Custom Processor.
February 17, 2026
by Madhusudhan Dasari
· 1,343 Views · 1 Like
Responding to HTTP Session Expiration on the Frontend via WebSockets
Presents a slightly different use of WebSockets — an action is taken at the front-end level when the HTTP session expires, and the back-end signals it.
February 17, 2026
by Horatiu Dan DZone Core CORE
· 1,410 Views · 3 Likes
My Learning About Password Hashing After Moving Beyond Bcrypt
I started with bcrypt because it was easy and widely recommended, but I moved to Argon2 once I understood how modern attacks work.
February 16, 2026
by Dhiraj Ray
· 1,887 Views
Java Developers: Build Something Awesome with Copilot CLI and Win Big Prizes!
Join the GitHub Copilot CLI Challenge and build something with Copilot right in your terminal for cash prizes and tickets to GitHub Universe.
February 12, 2026
by Bruno Borges
· 1,713 Views
Bootstrapping a Java File System
Apps self-managing files often implement custom APIs that bring different problems. Java's File System provides a way to self-manage files within standard Java APIs.
February 12, 2026
by Scott Sosna DZone Core CORE
· 1,690 Views
Jakarta EE 12 M2: Entering the Data Age of Enterprise Java
Jakarta EE 12 aligns repositories, restrictions, queries, ORM, and NoSQL into a unified data model, making domain-centric data access a first-class platform feature.
February 11, 2026
by Otavio Santana DZone Core CORE
· 1,002 Views · 3 Likes
Next-Level Persistence in Jakarta EE: How We Got Here and Why It Matters
From JPA to Jakarta Data and NoSQL, Jakarta EE embraces store-agnostic repositories and polyglot persistence in the enterprise Java model.
February 9, 2026
by Otavio Santana DZone Core CORE
· 898 Views · 2 Likes
Best Java GUI Frameworks for Modern Applications
This article reviews the top 8 Java GUI frameworks — Swing, SWT, JGoodies, JavaFX, JIDE, Apache Pivot, Hibernate, and Spring.
February 6, 2026
by Rodolfo Ortega
· 2,000 Views
Rate Limiting Beyond “N Requests/sec”: Adaptive Throttling for Spiky Workloads (Spring Cloud Gateway)
Build smarter Spring Cloud Gateway throttling — fair per-client limits, a global cap, and adaptive tuning — to survive spikes without meltdowns.
February 4, 2026
by Varun Pandey
· 857 Views · 1 Like
How Global Payment Processors like Stripe and PayPal Use Apache Kafka and Flink to Scale
How top payment processor companies like Stripe, PayPal, Payoneer, and Worldline use data streaming for real-time payments and fraud detection.
February 3, 2026
by Kai Wähner DZone Core CORE
· 1,118 Views · 3 Likes