Talking to Postgres Through Java 16 Unix-Domain Socket Channels
Update Feb 5: This post is discussed on Hacker News
Reading a blog post about what’s coming up in JDK 16 recently, I learned that one of the new features is support for Unix domain sockets (JEP 380). Before Java 16, you’d have to resort to 3rd party libraries like jnr-unixsocket in order to use them. If you haven’t heard about Unix domain sockets before, they are "data communications [endpoints] for exchanging data between processes executing on the same host operating system". Don’t be put off by the name, by the way; Unix domain sockets are also supported by macOS and even by Windows since version 10.
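To get an idea of the new API, here is a minimal sketch of a server and a client talking through a domain socket channel (the file path /tmp/demo.sock is arbitrary; this assumes a JDK 16 build):

import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class UnixSocketDemo {

    public static void main(String[] args) throws Exception {
        Path socketFile = Path.of("/tmp/demo.sock");
        Files.deleteIfExists(socketFile);
        UnixDomainSocketAddress address = UnixDomainSocketAddress.of(socketFile);

        // a server binds to a file system path instead of a host and port
        try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
            server.bind(address);

            // a client connects to that very path
            try (SocketChannel client = SocketChannel.open(address)) {
                System.out.println("connected: " + client.isConnected());
            }
        }
        finally {
            Files.deleteIfExists(socketFile);
        }
    }
}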
Databases such as Postgres or MySQL use them to offer an alternative to TCP/IP-based connections to client applications running on the same machine as the database. In such a scenario, Unix domain sockets are both more secure (no remote access to the database is exposed at all; file system permissions can be used for access control) and also more efficient than TCP/IP loopback connections.
A common use case is proxies for accessing Cloud-based databases, such as the GCP Cloud SQL Proxy. Running on the same machine as a client application (e.g. in a sidecar container in case of Kubernetes deployments), they provide secure access to a managed database, for instance taking care of the SSL handling.
My curiosity was piqued and I was wondering what it’d take to make use of the new Java 16 Unix domain socket support for connecting to Postgres. It was your regular evening during the pandemic, without much to do, so I thought "Let’s give this a try". As a test bed, I started with installing Postgres 13 on Fedora 33. Fedora might not always have the latest Postgres version packaged just yet, but following the official Postgres instructions it is straightforward to install newer versions.
In order to connect with user name and password via a Unix domain socket, one small adjustment to /var/lib/pgsql/13/data/pg_hba.conf is needed: the access method for the local connection type must be switched from the default value peer (which would try to authenticate using the operating system user name of the client process) to md5.
...
# TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
local   all             all                                     md5
...
Make sure to apply the changed configuration by restarting the database (systemctl restart postgresql-13), and things are ready to go.
The Postgres JDBC Driver
The first thing I looked into was the Postgres JDBC driver. Since version 9.4-1208 (released in 2016), it allows you to configure custom socket factories, a feature which was explicitly added with Unix domain sockets in mind. The driver itself doesn’t come with a socket factory implementation that’d actually support Unix domain sockets, but a few external open-source implementations exist. Most notably, junixsocket provides such a socket factory.
Custom socket factories must extend javax.net.SocketFactory, and their fully-qualified class name needs to be specified using the socketFactory driver parameter. So it should be easy to create a SocketFactory implementation based on the new UnixDomainSocketAddress class, right?
public class PostgresUnixDomainSocketFactory extends SocketFactory {

    @Override
    public Socket createSocket() throws IOException {
        var socket = new Socket();
        socket.connect(UnixDomainSocketAddress.of(
                "/var/run/postgresql/.s.PGSQL.5432")); (1)
        return socket;
    }

    // other create methods ...
}
(1) Create a Unix domain socket address for the default path of the socket on Fedora and related systems
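For completeness, this is how such a factory would be plugged into the driver via the socketFactory parameter; a sketch with example credentials (the host part of the URL is only needed syntactically, as the socket factory determines where the connection actually goes):

import java.sql.Connection;
import java.sql.DriverManager;

public class SocketFactoryTest {

    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://localhost/test_db"
                + "?socketFactory=dev.morling.demos.PostgresUnixDomainSocketFactory";

        try (Connection connection = DriverManager.getConnection(url, "test_user", "topsecret!")) {
            System.out.println("connected: " + !connection.isClosed());
        }
    }
}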
The factory compiles just fine; but it turns out not all socket addresses are equal, and java.net.Socket only connects to addresses of type InetSocketAddress (and the PG driver maintainers seem to sense some air of mystery around these "unusual" events, too):
org.postgresql.util.PSQLException: Something unusual has occurred to cause the driver to fail. Please report this exception.
at org.postgresql.Driver.connect(Driver.java:285)
...
Caused by:
java.lang.IllegalArgumentException: Unsupported address type
at java.base/java.net.Socket.connect(Socket.java:629)
at java.base/java.net.Socket.connect(Socket.java:595)
at dev.morling.demos.PostgresUnixDomainSocketFactory.createSocket(PostgresUnixDomainSocketFactory.java:19)
...
Now JEP 380 solely speaks about SocketChannel and stays silent about Socket; but perhaps obtaining a socket from a domain socket channel works?
public Socket createSocket() throws IOException {
    var sc = SocketChannel.open(UnixDomainSocketAddress.of(
            "/var/run/postgresql/.s.PGSQL.5432"));
    return sc.socket();
}
Nope, no luck either:
java.lang.UnsupportedOperationException: Not supported
at java.base/sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:226)
at dev.morling.demos.PostgresUnixDomainSocketFactory.createSocket(PostgresUnixDomainSocketFactory.java:17)
Indeed it looks like JEP 380 concerns itself only with the non-blocking SocketChannel API, while users of the blocking Socket API do not get to benefit from it. It should be possible to create a custom Socket implementation based on the socket channel support of JEP 380, but that’s going beyond the scope of my little exploration.
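For what it’s worth, connecting via the channel API itself works without complaint; a minimal sketch, using the same Fedora default socket path as above:

import java.net.UnixDomainSocketAddress;
import java.nio.channels.SocketChannel;

public class ChannelConnectTest {

    public static void main(String[] args) throws Exception {
        var address = UnixDomainSocketAddress.of("/var/run/postgresql/.s.PGSQL.5432");

        // open(SocketAddress) opens and connects the channel in one go
        try (SocketChannel channel = SocketChannel.open(address)) {
            System.out.println("connected: " + channel.isConnected());
        }
    }
}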
The Vert.x Postgres Client
If the Postgres JDBC driver doesn’t easily benefit from the JEP, what about other Java Postgres clients then? There are several non-blocking options, including the Vert.x Postgres client and R2DBC. The former is used to bring Reactive capabilities for Postgres into the Quarkus stack, too, so I turned my attention to it.
Now the Vert.x Postgres Client already has support for Unix domain sockets, by means of adding the right Netty native transport dependency to your project. So purely from a functionality perspective, there’s not that much to be gained here. But being able to use domain sockets also with the default NIO transport would still be nice, as it means one less dependency to take care of. So I dug a bit into the code of the Postgres client and Vert.x itself and figured out that two things needed adjustment:
- The NIO-based Transport class of Vert.x needs to learn about the fact that SocketChannel now also supports Unix domain sockets (currently, an exception is raised when trying to use them without a Netty native transport)
- Netty’s NioSocketChannel needs some small changes, as it tries to obtain a Socket from the underlying SocketChannel, which doesn’t work for domain sockets as we’ve seen above
Step 1 was quickly done by creating a custom sub-class of the default Transport class. Two methods needed changes: channelFactory() for obtaining a factory for the actual Netty transport channel, and convert() for converting a Vert.x SocketAddress into a NIO one:
public class UnixDomainTransport extends Transport {

    @Override
    public ChannelFactory<? extends Channel> channelFactory(
            boolean domainSocket) {
        if (!domainSocket) { (1)
            return super.channelFactory(domainSocket);
        }
        else {
            return () -> {
                try {
                    var sc = SocketChannel.open(StandardProtocolFamily.UNIX); (2)
                    return new UnixDomainSocketChannel(null, sc);
                }
                catch (Exception e) {
                    throw new RuntimeException(e);
                }
            };
        }
    }

    @Override
    public SocketAddress convert(io.vertx.core.net.SocketAddress address) {
        if (!address.isDomainSocket()) { (3)
            return super.convert(address);
        }
        else {
            return UnixDomainSocketAddress.of(address.path()); (4)
        }
    }
}
(1) Delegate creation of non domain socket factories to the regular NIO transport implementation
(2) This channel factory returns instances of our own UnixDomainSocketChannel type (see below), passing a socket channel based on the new UNIX protocol family
(3) Delegate conversion of non domain socket addresses to the regular NIO transport implementation
(4) Create a UnixDomainSocketAddress for the socket’s file system path
Now let’s take a look at the UnixDomainSocketChannel class. I was hoping to get away again with creating a sub-class of the NIO-based implementation, io.netty.channel.socket.nio.NioSocketChannel in this case. Unfortunately, though, the NioSocketChannel constructor invokes the taboo SocketChannel#socket() method. Of course that’d not be a problem when doing this change in Netty itself, but for my little exploration I ended up copying the class and making the required adjustments in that copy. A few small changes were needed:
- Avoiding the call to SocketChannel#socket() in the constructor:

  public UnixDomainSocketChannel(Channel parent, SocketChannel socket) {
      super(parent, socket);
      config = new NioSocketChannelConfig(this, new Socket()); (1)
  }

  (1) Passing a dummy socket instead of socket.socket(); it shouldn’t be accessed in our case anyways
- A few methods call the Socket methods isInputShutdown() and isOutputShutdown(); those can be by-passed by keeping track of the two shutdown flags ourselves (see the sketch after this list)
- As I was creating the UnixDomainSocketChannel in my own namespace instead of Netty’s packages, a few references to the non-public method NioChannelOption#getOptions() needed commenting out, which again shouldn’t be relevant for the domain socket case
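To illustrate the second item, keeping track of the shutdown state within the copied channel class could look roughly like the following sketch (the flags would be set from the corresponding shutdown methods; the field names are my own):

// inside the copied UnixDomainSocketChannel class

private volatile boolean inputShutdown;  // flipped once the input side is shut down
private volatile boolean outputShutdown; // flipped once the output side is shut down

@Override
public boolean isInputShutdown() {
    // the original implementation delegates to the underlying Socket,
    // which isn't available for domain socket channels
    return inputShutdown;
}

@Override
public boolean isOutputShutdown() {
    return outputShutdown;
}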
You can find the complete change in this commit. All in all, not exactly an artisanal piece of software engineering, but the little hack seemed good enough at least for taking a quick glimpse at the new domain socket support. Of course a real implementation could be done much more properly within the Netty project itself.
So it was time to give this thing a test ride.
As we need to configure the custom Transport implementation, retrieval of a PgPool instance is a tad more verbose than usual:
PgConnectOptions connectOptions = new PgConnectOptions()
    .setPort(5432) (1)
    .setHost("/var/run/postgresql")
    .setDatabase("test_db")
    .setUser("test_user")
    .setPassword("topsecret!");

PoolOptions poolOptions = new PoolOptions()
    .setMaxSize(5);

VertxFactory fv = new VertxFactory();
fv.transport(new UnixDomainTransport()); (2)
Vertx v = fv.vertx();

PgPool client = PgPool.pool(v, connectOptions, poolOptions); (3)
(1) The Vert.x Postgres client constructs the domain socket path from the given port and path (via setHost()); the full path will be /var/run/postgresql/.s.PGSQL.5432, just as above
(2) Construct a Vertx instance with the custom transport class
(3) Obtain a PgPool instance using the customized Vertx instance
We can then use the client instance as usual, only that it now will connect to Postgres through the domain socket instead of via TCP/IP. All this works solely with the default NIO-based transports, without the need for adding any Netty native dependency, such as its epoll-based transport.
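For instance, executing a simple query (the table and column names here are made up for illustration purposes):

client
    .query("SELECT name FROM todo WHERE id = 1")
    .execute(ar -> {
        if (ar.succeeded()) {
            for (Row row : ar.result()) {
                System.out.println("name: " + row.getString("name"));
            }
        }
        else {
            System.out.println("Failure: " + ar.cause().getMessage());
        }
    });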
I haven’t done any real performance benchmark at this point; in a quick ad-hoc test of executing a trivial SELECT query on a primary key 200,000 times, I observed a latency of ~0.11 ms when using Unix domain sockets (with both netty-transport-native-epoll and JDK 16 Unix domain sockets) and ~0.13 ms when connecting via TCP/IP. So definitely a noticeable improvement which can be a deciding factor for low-latency use cases, though in comparison to other reports, the latency reduction of ~15% appears to be at the lower end of the spectrum.
A more rigorous performance evaluation remains to be done, for instance also examining the impact on garbage collection. And it goes without saying that you should only trust your own measurements, on your own hardware, based on your specific workloads, in order to decide whether you would benefit from domain sockets or not.
Other Use Cases
Database connectivity is just one of the use cases for domain sockets; highly performant local inter-process communication comes in handy in many scenarios. One which I find particularly intriguing is the creation of modular applications based on a multi-process architecture.
When thinking of classic Java/Jakarta EE application servers for instance, you could envision a model where both the application server and each deployment are separate processes, communicating through domain sockets. This would have some interesting advantages, such as stricter isolation (so for instance an OutOfMemoryError in one deployed application won’t impact others) and re-deployments without any risk of classloader leaks, as the JVM of a deployment would be restarted. On the downside, you’d be facing a higher overall memory consumption (although that can at least partly be mitigated through class data sharing, which also works across JVM boundaries) and more costly (remote) method invocations between deployments.
Now the application server model has fallen out of favour for various reasons, but such a multi-process design still is very interesting, for instance for building modular applications that should expose a single web endpoint, while being assembled from a set of processes which are developed and deployed by several, independent teams. Another use case would be desktop applications that are made up of a set of processes for isolation purposes, as it’s e.g. done by most web browsers nowadays with distinct processes for separate tabs. JEP 380 should facilitate this model when creating Java applications, e.g. considering rich clients built with JavaFX.
Another really interesting feature of Unix domain sockets is the ability to transfer open file descriptors from one process to another. This allows for non-disruptive upgrades of server applications, without dropping any open TCP connections. This technique is used for instance by Envoy Proxy for applying configuration changes: upon a configuration change, a second Envoy instance with the new configuration is started up, takes over the active sockets from the previous instance, and after some "draining period" triggers a shutdown of the old instance. This approach enables a truly immutable application design within Envoy itself, with all its advantages, without the need for in-process configuration reloads. I highly recommend reading the two posts linked above; they are super-interesting.
Unfortunately, JEP 380 doesn’t seem to support file descriptor transfers. So for this kind of architecture, you’d still have to resort to the aforementioned junixsocket library, which explicitly lists file descriptor transfer support as one of its features. While you couldn’t take advantage of that using Java’s NIO API, it should be doable using alternative networking frameworks such as Netty. Probably a topic for another blog post on another one of those pandemic weekends ;)
And that completes my small exploration of Java 16’s support for Unix domain sockets. If you want to do your own experiments of using them to connect to Postgres, make sure to install the latest JDK 16 EA build and grab the source code of my experimentation from this GitHub repo.
It’d be my hope that frameworks like Netty and Vert.x make use of this JDK feature fairly quickly, as only a small amount of code changes is required, and users get to benefit from the higher performance of domain sockets without having to pull in any additional dependencies. In order to keep compatibility with Java versions prior to 16, multi-release JARs offer one avenue for integrating this feature.
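As a sketch of that approach (the class and method names are made up for illustration), a library could ship a small helper twice: a fallback in the regular part of the JAR, and a Java 16 variant placed under META-INF/versions/16:

// regular part of the JAR -- used on Java versions before 16
package com.example.net;

import java.net.SocketAddress;

public class DomainSocketAddresses {

    public static boolean supported() {
        return false;
    }

    public static SocketAddress of(String path) {
        throw new UnsupportedOperationException("Unix domain sockets require Java 16+");
    }
}

// placed under META-INF/versions/16 -- picked up on Java 16 and later
package com.example.net;

import java.net.SocketAddress;
import java.net.UnixDomainSocketAddress;

public class DomainSocketAddresses {

    public static boolean supported() {
        return true;
    }

    public static SocketAddress of(String path) {
        return UnixDomainSocketAddress.of(path);
    }
}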