May 17, 2021
Exploring ZooKeeper-less Kafka
Sometimes, less is more. One case where that’s certainly true is dependencies. And so it shouldn’t come at a surprise that the Apache Kafka community is eagerly awaiting the removal of the dependency to the ZooKeeper service, which currently is used for storing Kafka metadata (e.g. about topics and partitions) as well as for the purposes of leader election in the cluster.
The Kafka improvement proposal KIP-500 ("Replace ZooKeeper with a Self-Managed Metadata Quorum") promises to make life better for users in many regards:
Better getting started and operational experience by requiring to run only one system, Kafka, instead of two
Removing potential for discrepancies of metadata state between ZooKeeper and the Kafka controller
Simplifying configuration, for instance when it comes to security
Better scalability, e.g. in terms of number of partitions; faster execution of operations like topic creation
Read More...
Apr 26, 2021
The Anatomy of ct.sym — How javac Ensures Backwards Compatibility
One of the ultimate strengths of Java is its strong notion of backwards compatibility: Java applications and libraries built many years ago oftentimes run without problems on current JVMs, and the compiler of current JDKs can produce byte code, that is executable with earlier Java versions.
For instance, JDK 16 supports byte code levels going back as far as to Java 1.7; But: hic sunt dracones. The emitted byte code level is just one part of the story. It’s equally important to consider which APIs of the JDK are used by the compiled code, and whether they are available in the targeted Java runtime version. As an example, let’s consider this simple "Hello World" program:
Read More...
Mar 8, 2021
FizzBuzz – SIMD Style!
Java 16 is around the corner, so there’s no better time than now for learning more about the features which the new version will bring. After exploring the support for Unix domain sockets a while ago, I’ve lately been really curious about the incubating Vector API, as defined by JEP 338, developed under the umbrella of Project Panama, which aims at "interconnecting JVM and native code".
Vectors?!? Of course this is not about renewing the ancient Java collection types like java.util.Vector (<insert some pun about this here>), but rather about an API which lets Java developers take advantage of the vector calculation capabilities you can find in most CPUs these days. Now I’m by no means an expert on low-level programming leveraging specific CPU instructions, but exactly that’s why I hope to make the case with this post that the new Vector API makes these capabilities approachable to a wide audience of Java programmers.
Read More...
Jan 31, 2021
Talking to Postgres Through Java 16 Unix-Domain Socket Channels
Update Feb 5: This post is discussed on Hacker News
Reading a blog post about what’s coming up in JDK 16 recently, I learned that one of the new features is support for Unix domain sockets (JEP 380). Before Java 16, you’d have to resort to 3rd party libraries like jnr-unixsocket in order to use them. If you haven’t heard about Unix domain sockets before, they are "data communications [endpoints] for exchanging data between processes executing on the same host operating system". Don’t be put off by the name btw.; Unix domain sockets are also supported by macOS and even Windows since version 10.
Read More...
Dec 28, 2020
jlink's Missing Link: API Signature Validation
Discussions around Java’s jlink tool typically center around savings in terms of (disk) space. Instead of shipping an entire JDK, a custom runtime image created with jlink contains only those JDK modules which an application actually requires, resulting in smaller distributables and container images.
But the contribution of jlink — as a part of the Java module system at large — to the development of Java application’s is bigger than that: with the notion of link time it defines an optional complement to the well known phases compile time and application run-time:
Link time is an opportunity to do whole-world optimizations that are otherwise difficult at compile time or costly at run-time. An example would be to optimize a computation when all its inputs become constant (i.e., not unknown). A follow-up optimization would be to remove code that is no longer reachable.
Read More...
Dec 21, 2020
ByteBuffer and the Dreaded NoSuchMethodError
The other day, a user in the Debezium community reported an interesting issue; They were using Debezium with Java 1.8 and got an odd NoSuchMethodError:
Read More...
Dec 16, 2020
Towards Continuous Performance Regression Testing
Functional unit and integration tests are a standard tool of any software development organization, helping not only to ensure correctness of newly implemented code, but also to identify regressions — bugs in existing functionality introduced by a code change. The situation looks different though when it comes to regressions related to non-functional requirements, in particular performance-related ones: How to detect increased response times in a web application? How to identify decreased throughput?
These aspects are typically hard to test in an automated and reliable way in the development workflow, as they are dependent on the underlying hardware and the workload of an application. For instance assertions on the duration of specific requests of a web application typically cannot be run in a meaningful way on a developer laptop, which differs from the actual production hardware (ironically, nowadays both is an option, the developer laptop being less or more powerful than the actual production environment). When run in a virtualized or containerized CI environment, such tests are prone to severe measurement distortions due to concurrent load of other applications and jobs.
This post introduces the JfrUnit open-source project, which offers a fresh angle to this topic by supporting assertions not on metrics like latency/throughput themselves, but on indirect metrics which may impact those. JfrUnit allows you define expected values for metrics such as memory allocation, database I/O, or number of executed SQL statements, for a given workload and asserts the actual metrics values — which are obtained from JDK Flight Recorder events — against these expected values. Starting off from a defined base line, future failures of such assertions are an indicator for potential performance regressions in an application, as a code change may have introduced higher GC pressure, the retrieval of unneccessary data from the database, or SQL problems commonly induced by ORM tools, like N+1 SELECT statements.
Read More...
Dec 13, 2020
Smaller, Faster-starting Container Images With jlink and AppCDS
A few months ago I wrote about how you could speed up your Java application’s start-up times using application class data sharing (AppCDS), based on the example of a simple Quarkus application. Since then, quite some progress has been made in this area: Quarkus 1.6 brought built-in support for AppCDS, so that now you just need to provide the -Dquarkus.package.create-appcds=true option when building your project, and you’ll find an AppCDS file in the target folder.
Things get more challenging though when combining AppCDS with custom Java runtime images, as produced using the jlink tool added in Java 9. Combining custom runtime images with AppCDS is very attractive, in particular when looking at the deployment of Java applications via Linux containers. Instead of putting the full Java runtime into the container image, you only add those JDK modules which your application actually requires. (Parts of) what you save in image size by doing so, can be used for adding an AppCDS archive to your container image. The result will be a container image which still is smaller than before — and thus is faster to push to a container registry, distribute to worker nodes in a Kubernetes cluster, etc. — and which starts up significantly faster.
Read More...
Nov 28, 2020
Quarkus and Testcontainers
The Testcontainers project is invaluable for spinning up containerized resources during your (JUnit) tests, e.g. databases or Kafka clusters.
For users of JUnit 5, the project provides the @Testcontainers extension, which controls the lifecycle of containers used by a test. When testing a Quarkus application though, this is at odds with Quarkus' own @QuarkusTest extension; it’s a recommended best practice to avoid fixed ports for any containers started by Testcontainers. Instead, you should rely on Docker to automatically allocate random free ports. This avoids conflicts between concurrently running tests, e.g. amongst multiple Postgres containers, started up by several parallel job runs in a CI environment, all trying to allocate Postgres' default port 5432. Obtaining the randomly assigned port and passing it into the Quarkus bootstrap process isn’t possible though when combining the two JUnit extensions.
Read More...
Oct 14, 2020
Class Unloading in Layered Java Applications
Layers are sort of the secret sauce of the Java platform module system (JPMS): by providing fine-grained control over how individual JPMS modules and their classes are loaded by the JVM, they enable advanced usages like loading multiple versions of a given module, or dynamically adding and removing modules at application runtime.
The Layrry API and launcher provides a small plug-in API based on top of layers, which for instance can be used to dynamically add plug-ins contributing new views and widgets to a running JavaFX application. If such plug-in gets removed from the application again, all its classes need to be unloaded by the JVM, avoiding an ever-increasing memory consumption if for instance a plug-in gets updated multiple times.
In this blog post I’m going to explore how to ensure classes from removed plug-in layers are unloaded in a timely manner, and how to find the culprit in case some class fails to be unloaded.
Read More...