Ben Rowland

I recently repaired a MIDI keyboard which had been causing me trouble for a long while. This was my first go at repairing a circuit board with surface-mount components, and it went well!

See the LCD display flicker on and off as the unit resets itself at random moments:

This model is pretty old now, and many of these units suffered a problem dubbed the “blue screen of death”. The name refers to the LCD going dead, leaving the blue backlight as the only sign of life.

I found a walkthrough of this repair in this excellent YouTube video. Fortunately I was able to follow along and replicate the steps, both the testing to find the problem and the fix itself.

Too long ago in 2004, this website started. I vividly recall the satisfaction when I typed www.benrowland.net into a browser, and it worked. I ran the website on Tomcat running in a DOS window on my Windows desktop, so it wasn’t quite a production-grade deployment. I had pointed my domain name to the IP address allocated to my home computer by my ISP, and there was something magical about watching the internet routing happen the way I’d hoped it would.

When upgrading a hobby Spring Boot project to Java 11, I hit a few issues which reminded me how closely coupled Gradle, the Spring Boot plugin, and the Java version you’re using are. Naively running gradlew bootRun on the project (which was using Gradle 3.1) resulted in this error: Could not determine java version from '11.0.5'. Fortunately I’d seen that one before, so I upgraded my version of Gradle to 5.

As software developers, we often want to talk to services over the internet, usually using HTTP. However, it’s now very common to see online services using HTTPS – an extension of HTTP which enables secure communications over a network. This move has made life more interesting for developers who want to interact with these services. Most of the time, connecting to a host over HTTPS “just works” from Java, but sometimes things don’t quite work … and Java can be a bit cryptic in how it reports the failure.

Chain of Trust

When a Java program connects to a host over HTTPS, it’s important to know you’re really communicating with who you think you are. If I write a program which connects to https://www.example.com, then I’m sending information over a secure channel. How can I be sure I’m not talking to a malicious third party who is somehow intercepting my traffic, including sensitive data such as user credentials? The identity of a server is proven with certificates.

Ciphers

So far, we’ve looked at how certificates can support the Authentication property of HTTPS. The certificate also enables a second property of HTTPS – Encryption of the traffic. This encryption is possible because the certificate contains a public key which allows clients to encrypt data, so only those holding the corresponding private key (the website owner) will be able to decrypt. This encryption is achieved using ciphers. There may be a range of ciphers available on both ends of the connection, and the cipher chosen for the communication will be agreed in the initial TLS handshake.
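To get a feel for the "range of ciphers" on the client side, here is a minimal sketch which lists the cipher suites a JVM enables by default. The class and method names are my own for illustration; the list you see will depend on your Java version and security configuration.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;

class CipherSuiteList {
    /** Returns the cipher suites enabled by default for TLS connections in this JVM. */
    static String[] defaultCipherSuites() {
        try {
            return SSLContext.getDefault().getDefaultSSLParameters().getCipherSuites();
        } catch (NoSuchAlgorithmException e) {
            // No default TLS context is available in this runtime.
            throw new IllegalStateException("No default TLS context available", e);
        }
    }

    public static void main(String[] args) {
        // Print one suite per line, in the JVM's order of preference.
        for (String suite : defaultCipherSuites()) {
            System.out.println(suite);
        }
    }
}
```

The server has its own list, and the handshake settles on a suite which both sides support.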

Perfect Forward Secrecy

At the present time, with the possibility of data breaches and digital eavesdropping, privacy is an important subject. What’s to prevent a malicious third-party from collecting traffic going to or from a website? It might seem adequate to use HTTPS to encrypt traffic to a website, meaning it’s impossible to decrypt the traffic unless you also have access to the private key. But what if a malicious actor collects encrypted traffic over a long period of time?

Normally in Java, if the main thread starts one or more non-daemon threads, the Java process will not terminate until the last non-daemon thread terminates.

Yet, I was surprised to find that a particular JUnit test completed normally, despite never calling shutdown() on a ThreadPoolExecutor it had started. No Java process was left behind. This was the case both when running the test from within IntelliJ and also from Maven (using the surefire plugin). Replicating the test code in a vanilla main() method led to the expected behaviour: a “hanging” process.

So what was going on? Surely something fascinating and enlightening, right? Running the JUnit test from my IDE revealed the underlying java invocation in the Console pane (abbreviated):

java -classpath "some-massive-classpath" com.intellij.rt.execution.junit.JUnitStarter -ideVersion5 MyTest,someMultiThreadedTest

So, the main method which launches the JUnit tests is in the class called JUnitStarter, which is an internal class within IntelliJ. A quick look at the code for JUnitStarter reveals the answer is very simple: an explicit call to System.exit() before main() returns. Maven Surefire’s ForkedBooter does the same thing.
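The "hanging" behaviour in a vanilla main() method can be sketched as follows. This is an illustration rather than the original test code, and the class and method names are made up. It confirms that the worker thread created by a default executor is a non-daemon thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class NonDaemonDemo {
    /** Runs a task on the pool and reports whether its worker thread is a daemon. */
    static boolean workerIsDaemon(ExecutorService pool) {
        Callable<Boolean> task = () -> Thread.currentThread().isDaemon();
        try {
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new IllegalStateException("Task did not complete", e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        System.out.println("worker thread is a daemon: " + workerIsDaemon(pool));
        // Remove this call and the idle non-daemon worker keeps the JVM alive
        // after main() returns. A test runner which ends with System.exit()
        // would hide that problem.
        pool.shutdown();
    }
}
```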

As always, some strange behaviour turns out to be something entirely simple! But this is something to watch out for. Ideally, unit tests wouldn’t test multithreaded code (rather, they would test logic which is abstracted from the surrounding threaded environment). But if you must test multi-threaded production code, then be aware that your tests could give a misleading positive result in cases such as this.

Bufferbloat - it’s making our internet slow. But what is it?

After reading this thought-provoking article about bufferbloat, I wanted to do two things: have a better understanding of the concept, and find evidence of its occurrence within my own set-up.

The term ‘bufferbloat’ was coined by Jim Gettys in 2010 as an explanation of much of today’s internet congestion, which can lead to very poor performance over an apparently “high bandwidth” network connection. In this article I will attempt to explain bufferbloat in a way accessible to those who are not network professionals.

Disclaimer: I am not a network professional either; I simply enjoy researching things. This article is purely an attempt to digest what I’ve learned, and hopefully pass on something interesting to others. I will also document how I solved one particular instance of the problem in my own network.

The internet - and indeed, any system of connected components - is made up of communication channels, each capable of a particular throughput. This can be visualised as a network of interconnected pipes, all of varying widths. Any point where a “large pipe” (high bandwidth) feeds into a smaller one (low bandwidth) can become a bottleneck when traffic levels are high.

To be clear, this situation of interconnected links with varying bandwidths is normal - for example where a backbone link carrying national traffic feeds into a smaller network servicing a particular set of subscribers. Usually, the subset of traffic coming through the bottleneck would not exceed that which the small pipe can service; otherwise the situation would clearly be inadequate.

However, temporary spikes in traffic during unusually busy periods can occur. At this point, one of two things can happen. Either the excess traffic is stored up in a buffer (a “holding area”) for the duration of the spike, or else the narrower link must reject the excess traffic as there’s nowhere for it to go.

In the first scenario, the excess traffic would slowly fill the buffer for the duration of the spike. The buffer would be drained into the smaller pipe as fast as can be supported. Once traffic levels return to normal, the buffer would empty back to its normal level. The upstream components would not be aware of this situation, as they would not experience any rejected traffic (dropped packets).

However, if the traffic spike is prolonged, then the buffer becomes full, and the situation is similar to that where no buffer exists: packets are dropped.

From the upstream producer’s point of view, the packet would need to be re-sent (as no acknowledgement was received). The re-sending process would continue whilst the bottleneck is in effect, and would appear as a slow (or stalled) data transfer.

To be clear, these buffers are good to have. In the early days of the internet (c. 1986), buffers were insufficiently sized. This led to heavy packet loss during times of even moderate contention, to the point where most of the traffic was retransmitted packets. This was clearly inadequate, and so the use of larger buffers was recommended. Importantly, congestion control algorithms were also brought into play in each host which transmits data. These algorithms attempt to detect the size of the downstream pipe by slowly ramping up traffic until packets start to be dropped, and then backing off.

So where’s the problem? The problem surfaces when the size of buffers is set too high. A buffer is just an area of memory, and as memory has become cheap, buffers have become larger, without adequate consideration of the consequences. A buffer which is too large gives a false indication that a bottlenecked pipe is bigger than it really is: a large data transfer simply fills the buffer, and no packets are dropped to tell the sender to slow down. The buffer doesn’t even serve its original purpose of absorbing spikes, as it is permanently full.

Why is this bad? If you’re doing a large upload (for example, sending a video to YouTube or backing up music to cloud storage) where an oversized transmit buffer is present, then web pages may appear to load very slowly (many seconds). The reason is that the tail-end of the large upload is sat in a large queue. A request to Google would sit at the back of the queue, and would have to wait until the buffer is emptied before it is sent on to the next link.

The solution is to tune the size of the buffer, such that it is only used to absorb temporary spikes in traffic, rather than giving false indications of high bandwidth during periods of contention. To be fair, the real solution is fairly complex, involving Active Queue Management to signal the onset of congestion so the rate of flow can be backed off before the buffer becomes full.

In many cases, these buffers exist in network equipment (such as routers) which is controlled by ISPs and similar organisations, but there are places under your own control where you can identify and fix this phenomenon. For my own situation, the issue was that during a large backup of files from my netbook to another computer on my network, it was virtually impossible to do anything else network-related on the netbook. During a large file upload to another computer on my LAN, a very slow wireless connection is a permanent bottleneck, with an observed effective throughput of 400kB/s (shown by the scp command), or 3Mbps.

By default, Linux allocates a transmit buffer maximum size of about 3MB (obtained via the following command, which gives minimum, default and maximum memory for the TCP transmit buffer):

sysctl -a | grep net.ipv4.tcp_wmem

If I start off a large upload and watch the size of this transmit buffer, the tx_queue settles at around 1.7MB. This value was obtained via:

cat /proc/net/tcp
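The tx_queue value is not easy to read by eye, as it is the hexadecimal left-hand half of the fifth column. Here is a minimal sketch of pulling it out of one row. The sample row is invented for illustration (it is not a capture from my netbook), and I am assuming the value is a byte count.

```java
class TcpQueueParser {
    /** Extracts tx_queue (data waiting to be sent) from one socket row of /proc/net/tcp. */
    static long txQueueBytes(String row) {
        String[] fields = row.trim().split("\\s+");
        // fields[4] holds "tx_queue:rx_queue", both written in hexadecimal.
        String txHex = fields[4].split(":")[0];
        return Long.parseLong(txHex, 16);
    }

    public static void main(String[] args) {
        // Hypothetical row: an established connection with a large transmit queue.
        String sampleRow = "   1: 0A01A8C0:D431 0B01A8C0:0016 01 0019F0A0:00000000 "
                + "01:00000014 00000000  1000        0 12345";
        System.out.println(txQueueBytes(sampleRow) + " bytes queued");
    }
}
```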

1.7MB of data was permanently sat in the buffer; this would take around 4 seconds to drain over a 400kB/s network link. So any requests for web pages whilst the transfer is going on will be sat in a 4 second queue. Not good. This setting certainly needed to be tweaked in my case. Setting it too low however would result in small windows of data being sent per-roundtrip, which would prevent TCP from ever ramping up to full throughput.
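The 4 second figure is simply the queued data divided by the rate at which the link drains it. A minimal sketch of that arithmetic, using the two numbers observed above (the names are my own):

```java
class QueueDelay {
    /** Time in seconds for a full queue to drain over a link of the given speed. */
    static double drainSeconds(double queuedBytes, double bytesPerSecond) {
        return queuedBytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        double queuedBytes = 1_700_000;   // roughly 1.7MB sat in the transmit queue
        double linkBytesPerSec = 400_000; // roughly 400kB/s observed over wireless
        System.out.println(drainSeconds(queuedBytes, linkBytesPerSec) + " seconds"); // prints 4.25 seconds
    }
}
```

Anything new which joins the back of the queue waits about this long before it even leaves the machine.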

The article quoted earlier suggests the recommended buffer size is the Bandwidth Delay Product. This is the bottleneck bandwidth, multiplied by the delay (or latency) that packets in the buffer take to reach their destination.
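As a formula, that is just bandwidth multiplied by delay. Here is a small sketch of the calculation. The figures in main() are purely illustrative (a 100Mbps link with a 20ms delay) and are not measurements from my network.

```java
class BandwidthDelayProduct {
    /** Bytes "in flight" on a link: bandwidth (bytes per second) times delay (milliseconds). */
    static long bdpBytes(long bytesPerSecond, long delayMillis) {
        return bytesPerSecond * delayMillis / 1000;
    }

    public static void main(String[] args) {
        long bytesPerSecond = 12_500_000; // illustrative: 100Mbps expressed in bytes per second
        long delayMillis = 20;            // illustrative delay
        System.out.println(bdpBytes(bytesPerSecond, delayMillis) + " bytes"); // prints 250000 bytes
    }
}
```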

So, my buffer size of 1.7MB with a latency of 1ms (over my home network) correlates to an imaginary bandwidth of 1.7MB/s, or around 14Mbps (in contrast to the real bottleneck bandwidth of around 3Mbps). So, the TCP transmit buffer was five times too large for my particular environment. Setting the TCP transmit buffer size to the approximately correct size of around 256KB mostly fixed the problem. I settled for a figure of 128KB - on my system this is a good compromise between bandwidth for large uploads, and latency for other interactive activity such as browsing or SSHing. This setting can be changed by editing /etc/sysctl.conf (the interface into kernel parameters).
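As a rough sketch, an entry capping the transmit buffer at 128KB might look like the fragment below. The three numbers are the minimum, default and maximum sizes in bytes. Only the maximum (131072, i.e. 128KB) reflects the figure above; the minimum and default shown here are assumed placeholder values, so check what your own system reports before changing anything.

```
# /etc/sysctl.conf
# TCP transmit buffer: minimum, default, maximum (bytes).
# The first two values are assumptions for illustration only.
net.ipv4.tcp_wmem = 4096 16384 131072
```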

Follow this with a refresh of the parameters, and you’re done:

sudo sysctl -p

Caveat: Your own mileage certainly may vary if you choose to tweak these settings. You’d be mad to do this on anything important without knowing exactly what you’re doing.

Note: There are a number of articles which suggest increasing the size of the network buffers in Linux, using a similar approach. Based on my understanding and experiences, this is fine if raw bandwidth is your goal, and particularly if you have a healthy upstream bandwidth. If you don’t have this bandwidth, then setting these buffers too high could harm your interactive network activity, while being unable to improve utilisation in an already saturated link.

It’s common to see “find the number of possibilities” problems in Computer Science. This kind of problem stems from Discrete Maths - an important pre-requisite for doing anything beyond the trivial, for example Cryptography or Graph Theory.

I found one of these problems on Project Euler. Project Euler is a collection of mathematically-inclined programming problems - probably more than you could ever solve in a lifetime (some of them are still unsolved by anybody). The particular problem which drew my attention doesn’t actually require any programming to solve.

The problem is based on the idea of finding routes between two points on a grid:

Starting in the top left corner of a 2x2 grid, there are 6 routes (without backtracking) to the bottom right corner. 
How many routes are there through a 20x20 grid?

This is pretty fundamental maths, but I find these kinds of techniques are always worth re-visiting, as it seems to be a case of “use it or lose it”.

Following is my approach, so don’t read any further if you want to try it yourself first!

I started by drawing a tree structure for the 2x2 grid, where each node had two choices: ‘R’ or ‘D’ (for go Right, or Down). This gave me a feel for things. Towards the end of some paths, there was clearly some pruning - where the only option is to head for the goal (rather than back-tracking or going out of bounds).
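The same tree can be walked in code. This is a minimal brute-force sketch (the names are my own), which is fine for tiny grids but far too slow to be a sensible way of tackling the 20x20 case.

```java
class RouteTree {
    /** Counts routes by walking the tree: at each node, go Right or Down if moves remain. */
    static long countRoutes(int rightsLeft, int downsLeft) {
        if (rightsLeft == 0 || downsLeft == 0) {
            // Pruned branch: only one kind of move is left, so head straight for the goal.
            return 1;
        }
        return countRoutes(rightsLeft - 1, downsLeft)   // choose 'R'
             + countRoutes(rightsLeft, downsLeft - 1);  // choose 'D'
    }

    public static void main(String[] args) {
        System.out.println(countRoutes(2, 2)); // prints 6 for the 2x2 grid
    }
}
```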

It then became clear that any plan for getting to the goal simply involved two Rs and two Ds. You clearly need to take two steps Right, and two steps Down to reach the goal, whatever your route. So the problem can be re-stated as “how many ways are there of arranging two Rs and two Ds?” Or more vividly: “If I have a bag containing two Kit-Kats and two Mars Bars, how many distinct ways can I eat them in sequence?”

Of course, the stated problem involves twenty each of Kit-Kats and Mars Bars. So if I was really hungry, how many ways could I eat them all? Suitably motivated, it’s time for some fun with combinatorics.

For the moment, let’s go back to the 2x2 grid, and ignore the repetition of Right and Down moves. This means we must take four distinct steps to reach the goal. So let’s assume that we have a bag of four chocolate bars - all different. How many ways can we draw them in sequence? Or more properly, how many permutations are there?

For the first choice, we have four options. Once we’ve made this first selection, we have three left to choose from. Then two, and finally there’s only one left. This naturally leads us to the factorial function:

4! = 4 x 3 x 2 x 1 = 24

So there are 24 ways (permutations) to draw four tasty chocolate treats. Now let’s amend our calculation, taking into account that two of the chocolate bars are identical. Say, two Milky Ways, one Kit-Kat, and one Mars Bar. This is easy to work out - in our 24 original permutations, each distinct sequence has been counted once for every ordering of the two identical items. There are 2! (2 x 1 = 2) ways to arrange two chocolate bars, so we divide our answer by this.

4!/2! = 12 permutations

Now, it’s only one more step to re-discover the example solution, by taking into account that there are two classes of two identical ‘objects’ (Right moves and Down moves), and so we end up with:

4!/(2!*2!) = 6 permutations.
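The three calculations above follow one pattern: the factorial of the total number of items, divided by the factorial of the size of each group of identical items. Here is a minimal sketch of that pattern, with names of my own choosing. It uses BigInteger because factorials outgrow a long very quickly.

```java
import java.math.BigInteger;

class Arrangements {
    static BigInteger factorial(int n) {
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Distinct orderings of items which fall into groups of identical items. */
    static BigInteger arrangements(int... groupSizes) {
        int total = 0;
        BigInteger repeats = BigInteger.ONE;
        for (int size : groupSizes) {
            total += size;
            repeats = repeats.multiply(factorial(size)); // orderings within each group are indistinguishable
        }
        return factorial(total).divide(repeats);
    }

    public static void main(String[] args) {
        System.out.println(arrangements(1, 1, 1, 1)); // four different bars: prints 24
        System.out.println(arrangements(2, 1, 1));    // two identical, two different: prints 12
        System.out.println(arrangements(2, 2));       // two Rs and two Ds: prints 6
    }
}
```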

Now it’s really easy to solve the stated problem - I won’t give away the solution of course!

Axiom 49 Keyboard Repair

How it started vs. How it's going - www.benrowland.net

Spring Boot, Gradle and Java 11

HTTPS and Java - Pitfalls and Best Practices - Part 1

HTTPS and Java - Pitfalls and Best Practices - Part 2

HTTPS and Java - Pitfalls and Best Practices - Part 3

HTTPS and Java - Pitfalls and Best Practices - Part 4

JUnit and Non-Daemon Threads

Banishing Bufferbloat

An Appetite for Combinatorics