Programming – The Cafes

Code Coverage Has a Blind Spot

Elliotte Rusty Harold — Sun, 31 Dec 2023 14:29:31 +0000

Here’s the coverage report on a recent PR of mine:

All modified and coverable lines are covered by tests ✅

    Comparison is base (a765aef) 85.95% compared to head (fe02e1b) 85.95%.

Additional details and impacted files

@@            Coverage Diff            @@
##             master     #546   +/-   ##
=========================================
  Coverage     85.95%   85.95%           
  Complexity     3480     3480           
=========================================
  Files           230      230           
  Lines          8225     8225           
  Branches        960      960           
=========================================
  Hits           7070     7070           
  Misses          865      865           
  Partials        290      290

Precisely identical. What happened? Did I change a comment? Well, no. In fact I added tests for
situations that were not currently covered, so why didn’t coverage increase?

The two new tests covered exceptional cases that Codecov doesn’t see because they test different arguments that can be passed to a public method. Code coverage percentages measure line and sometimes branch coverage, but they have a blind spot for the much larger number of permutations of input values.

In this case those arguments are null, and the tests cover null pointer exceptions, but really they could have been any different arguments to the method not already covered. Real coverage increased but the metrics don’t show it.
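A minimal sketch of the blind spot, using a hypothetical Labels.normalize method that is not taken from the PR above. A single test with an ordinary string executes every line of the method, so a line coverage tool reports it as fully covered. A second test that passes null exercises different behavior without touching a single new line.

```java
import java.util.Locale;

// Hypothetical class for illustration; not the code from the PR discussed above.
class Labels {

    // One line, no branches: any single successful call yields full line coverage.
    static String normalize(String label) {
        return label.trim().toLowerCase(Locale.ROOT);
    }

    // Returns true if normalize(null) throws NullPointerException.
    // A test asserting this adds no newly covered lines to the report.
    static boolean throwsOnNull() {
        try {
            normalize(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Hello ")); // prints "hello"
        System.out.println(throwsOnNull());        // prints "true"
    }
}
```

Both calls run exactly the same line of normalize, which is why the percentage doesn’t move.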

This is another example of why it’s important to write the tests first. Relying on tools to identify cases that aren’t covered misses a lot. Without delving too deep into the commit history, I’d bet that what happened here is that someone changed the behavior of an existing method. The existing tests failed, and they commented them out rather than understanding the change in behavior and deciding whether it made sense.

What they should have done is written a test for the desired new behavior or changed the existing tests to test for the new behavior. Of course, if the new behavior wasn’t desired then they should have instead reverted it, or fixed their new code so the tests didn’t break. Again, without delving into commit history, I’m not sure if this change was intended or not. For now, I’m sticking with the current behavior since it seems reasonable. But test-first changes, or at the very least not commenting out the failing tests, would have alleviated this uncertainty.

There are opportunities for code coverage tools to do better, though I don’t yet know of any that do. For instance, if a method is declared to throw a certain exception, or has an @throws Javadoc tag indicating a certain runtime exception, the tool could check that the exception was indeed thrown at some point during the tests.

Similarly it could inspect the arguments passed to a public method during a test and warn if the method only received positive numbers, or non-null objects, or non-empty strings, or whatever. It could even check that common edge conditions like 1, -1, 0, Inf, NaN, and Integer.MAX_VALUE are covered.
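To illustrate why edge values matter even at 100% line coverage, here is a small sketch with a hypothetical Midpoint helper (an assumption for illustration, not code from any project mentioned here). One test with small numbers covers every line of naive, yet the method is wrong near Integer.MAX_VALUE.

```java
// Hypothetical helper for illustration only.
class Midpoint {

    // Fully covered by a single test such as naive(1, 3),
    // but lo + hi overflows when both values are large.
    static int naive(int lo, int hi) {
        return (lo + hi) / 2;
    }

    // Avoids the overflow when 0 <= lo <= hi.
    static int safe(int lo, int hi) {
        return lo + (hi - lo) / 2;
    }

    public static void main(String[] args) {
        System.out.println(naive(1, 3));                                 // prints 2
        System.out.println(naive(Integer.MAX_VALUE, Integer.MAX_VALUE)); // prints -1
        System.out.println(safe(Integer.MAX_VALUE, Integer.MAX_VALUE));  // prints 2147483647
    }
}
```

A tool that noticed naive was only ever called with small positive arguments could flag exactly this gap.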

Why Python is Better than Java

Elliotte Rusty Harold — Sun, 10 Dec 2023 19:02:42 +0000

Reason 1: Mocking.

unittest.mock, Python’s mocking framework, is so much more powerful than EasyMock, Mockito, or any other Java mock framework I’ve ever used. You can replace any method you like with essentially arbitrary code. No longer do you have to contort APIs with convoluted dependency injection just to mock out network connections or reproduce error conditions.

Instead you just identify a method by name and module within the scope of the test method. When that method is invoked, the actual code is replaced with the mock code. You can do this long after the class being mocked was written. Model classes do not need to participate in their own mocking. You can mock any method anywhere at any time, in your own code or in dependencies. No dependency injection required. You can even mock fields.

By contrast Java only lets you mock objects (not methods) and only when you have an available API to insert the mock in place of the real thing.

Reason 2: None is its own type.

In Java null can be anything. In Python only None is None. A str can’t be None. A Foo can’t be None. An int can’t be None. Union[None, Foo] is not the same type as Foo. This is beautifully simple and obvious, once you break the Java habit of thinking of null as essentially whatever type you need. It helps to find and prevent bugs and reason about code.

Java has had @Nullable and @NonNull for some years now, but these have never been integrated into the language, require special tools to check them, and aren’t nearly as powerful as Python’s simple decision that None is its own thing.

Reason 3: Named method arguments with defaults

These are so much more readable and less error prone than method overloading, I hardly know where to begin. In fact, these are so wonderful I’m tempted to write all my Python code with only named arguments.

Reason 4: Calling super().__init__ last

In Java the superclass constructor is always called first, before any other statements in a subclass constructor. There are reasons for this, but it’s often inconvenient and can require a lot of hanky panky to get values ready before they’re naturally available.
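Here is a sketch of the kind of hanky panky I mean, using hypothetical Shape and Square classes. Because the super(...) call has to come first, any value the superclass needs must be computed inside the argument list, typically by pushing the work into a static helper.

```java
// Hypothetical classes for illustration only.
class Shape {
    final String name;

    Shape(String name) {
        this.name = name;
    }
}

class Square extends Shape {
    final int side;

    Square(int side) {
        // super(...) must be the first statement, so validation and
        // formatting are squeezed into a static helper call.
        super(describe(side));
        this.side = side;
    }

    // Static because instance members are not usable before super(...) completes.
    private static String describe(int side) {
        if (side <= 0) {
            throw new IllegalArgumentException("side must be positive: " + side);
        }
        return "square " + side + "x" + side;
    }
}
```

With this sketch, new Square(3).name is "square 3x3", but the logic that belongs in the constructor body has been moved somewhere less obvious.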

Python is just easier here. Call super().__init__ whenever you’re ready to call it: first, last, in the middle. It doesn’t matter.

Reason 5: f strings

Relatively new in Python (version 3.6), but far more readable and less error prone than what we had before.

When Java added varargs, printf, and java.lang.Formatter circa Java 1.5, we had decades of experience in C and C++ to teach us this was a bad API that led to dangerous bugs. Sadly, no one paid attention to that. Instead of creating a modern, sane format like Python’s f-strings, they mindlessly copied a misbegotten 30 year old kludge from C. At least Python eventually learned from Java’s mistake even if Java didn’t learn from C’s.
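A short sketch of how this API goes wrong in Java. The FormatPitfalls class and its failure helper are hypothetical names for illustration; String.format and the exception types are standard. The format string and its arguments are only matched up at runtime, so mismatches compile cleanly and then throw, or are silently ignored.

```java
import java.util.IllegalFormatException;

// Hypothetical demo class; String.format and the exceptions are standard Java.
class FormatPitfalls {

    // Returns the simple name of the exception String.format throws, or "none".
    static String failure(String format, Object... args) {
        try {
            String.format(format, args);
            return "none";
        } catch (IllegalFormatException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Wrong argument type for %d: compiles, fails at runtime.
        System.out.println(failure("%d items", "three"));
        // Too few arguments: compiles, fails at runtime.
        System.out.println(failure("%s has %d items", "cart"));
        // Too many arguments: the extra one is silently dropped.
        System.out.println(failure("%s", "a", "b"));
    }
}
```

The three calls report IllegalFormatConversionException, MissingFormatArgumentException, and none respectively.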

Why Java is Better than Python

Elliotte Rusty Harold — Sun, 10 Dec 2023 18:59:17 +0000

Reason 1: Strong, static, compile time typing

I used to take Pythonistas at their word that they didn’t actually need strong, static compile time type checking. That was before I spent over a year writing Python more or less full time.

I am constantly blocked by not understanding which variables have which types. I am frequently spelunking through many levels of code and popping open the debugger to find out what type a variable actually has when. Not having explicit, enforced types makes code far harder to understand and edit.

Corollary: var is a very bad idea for Java and should not be used.

This is well known in the Python community today. PEP 484 is basically an admission that inline typing is a necessity for robust code, and Guido has admitted as much. It’s why large Python shops like Meta and Google have invested in tools like Pyre and Pytype to add strong typing to the language. These tools help, but they’re not as good as Java’s strong, reliable, static type declarations and type checking.

Reason 2: Checked exceptions

There are only two ways exceptions are handled in Python:

Method 1: Crash and dump the stack.

Method 2: Catch all exceptions, log an error message and a stack trace for each one, and then crash.

Yes, Python programmers could do better. No, in practice they don’t. I’m not talking about one-off hobbyist scripts either. Even major programs from big tech companies like Google fail badly here. When I was doing a lot of work in GCP, I routinely crashed the Cloud SDK, written in Python, leading to a stack trace that was not in my code and that I could do nothing about. When you’re contacting a network service, e.g. to deploy a new version of an App Engine app, it’s entirely possible the remote server will return an error. When it does, the Cloud SDK should look at the error code and report the problem to the end user in an intelligible fashion. Instead, more often than not, it prints a stack trace from the Cloud SDK’s own code, even though this is not the code that’s at fault. If any code was buggy, it was on the server but that stack trace wasn’t available.

This simply doesn’t happen nearly as often in Java, and checked exceptions are the main reason why it doesn’t. Java programmers have to go out of their way to ignore network failures. Python programmers have to go out of their way to handle network failures.
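A minimal sketch of the mechanism, with a hypothetical Deployer class standing in for a real network client. Because deploy declares IOException, a caller that neither catches nor declares it will not compile, so the failure path has to be written down somewhere.

```java
import java.io.IOException;

// Hypothetical stand-in for a client that talks to a remote service.
class Deployer {

    // The throws clause is part of the contract; callers cannot overlook it.
    static String deploy(String version) throws IOException {
        if (version.isEmpty()) {
            throw new IOException("server rejected empty version");
        }
        return "deployed " + version;
    }

    // Removing the try/catch here is a compile error unless this method
    // declares IOException itself.
    static String deployAndReport(String version) {
        try {
            return deploy(version);
        } catch (IOException e) {
            // Report the problem intelligibly instead of dumping a stack trace.
            return "Deployment failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(deployAndReport("v2")); // prints "deployed v2"
        System.out.println(deployAndReport(""));   // prints "Deployment failed: server rejected empty version"
    }
}
```

Nothing stops a Java programmer from writing an empty catch block, but that takes deliberate effort, which is the point.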

Even programmers who want to do better can’t because their dependencies don’t do better. Libraries can and do throw any exception at any time, and change which exceptions they throw when in patch releases. None of this is ever documented. Unless you control 100% of the code in your project, you can’t assume anything about which exceptions might be thrown when. Even if you think you control all the code, there’s nothing to prevent another programmer, or even your future self, from changing the exceptions in the code you depend on.

Reason 3: Performance

Python is slow. Sometimes this doesn’t matter. Sometimes it does. It matters more often than you think. For instance, even when I’m working on a program without strong performance requirements, Python routinely takes a minute or more to start up and run a unit test. That breaks my flow and prevents the fast feedback cycle Python was supposed to give us by being interpreted instead of compiled.

By contrast in Java, fast methods run fast. If a test isn’t doing very much, I can run it in less than a second in Eclipse or IntelliJ. Nothing in Python in VSCode is ever that fast. Maybe VSCode is the problem, I’m not sure, but I do know that I iterate code-test-debug way faster in Java than in Python.

Reason 4: Access Protection

Python doesn’t have private. Double underscores are a polite suggestion not to access something, not a rule. Python’s general philosophy on this is often summed up as “We are all adults here”.

Having now dealt with very large Python code bases written by top notch, very smart, highly paid Python programmers, I am prepared to say no, no we are not. I am very tired of being unable to change a dunder Python method in my own team’s code because some alleged adult twelve time zones away made their mission critical product depend on the exact signature and behavior of our “private” method.

Tight dependency coupling is a bad traffic jam in Java. In Python it’s a Richter 9 earthquake that destroys all roads in a city and sets the gas stations on fire. The problems just aren’t at the same scale.

Reason 5: Javadoc

Python has doc comments but they’re rarely used or published. By contrast Java has a robust culture of documenting public APIs that enables Google to provide reliable answers for many questions. Outside the standard libraries, Python third party libraries rarely document methods adequately or make them available in a form Google can surface.

Python is not a DSL

Elliotte Rusty Harold — Sun, 16 Apr 2023 11:01:31 +0000

How many times have you seen someone use a hammer to pound screws because they are a hammer expert, they are comfortable with hammers, they don’t know how to use a screwdriver, and they don’t want to take a week to learn how to use a screwdriver? Maybe not so much if you’re a carpenter, but if you’re a software developer it happens all the time.

I’ve noticed a common anti-pattern of defining declarative DSLs in Turing complete languages, specifically Python, to avoid the overhead of learning new syntax and tools such as XML or JSON. Instead programmers define the DSL as a Python library and reuse the Python compiler with predictable results. Blaze/Bazel, Airflow, dataswarm, and many other projects have gone down this road. Gradle made the same mistake, only with Groovy instead of Python.

This is massive tech debt that causes massive problems (security, indeterminacy, irreproducibility) and has heavy cost. Never do this. It always leads to a huge expensive effort to redefine the language as its own thing (not Python) that still looks like Python, and the team ends up writing a complete parser in addition to everything else. XML is not that hard. Nut up and learn it.

Do not write declarative configs in a Turing complete language.
Do not invent Python subsets for config files. Starlark

Happy 20th Birthday Java!

Elliotte Rusty Harold — Thu, 21 May 2015 11:21:44 +0000

Happy 20th Birthday Java! Next year I’ll buy you a drink. InfoWorld has published some of my thoughts on the occasion, “Java at 20: How it changed programming forever”.

Why java.util.Arrays uses Two Sorting Algorithms

Elliotte Rusty Harold — Sat, 30 Mar 2013 11:51:03 +0000

java.util.Arrays uses quicksort (actually dual pivot quicksort in the most recent version) for primitive types such as int and mergesort for objects that implement Comparable or use a Comparator. Why the difference? Why not pick one and use it for all cases? Robert Sedgewick suggests that “the designer’s assessment of the idea that if a programmer’s using objects maybe space is not a critically important consideration and so the extra space used by mergesort maybe’s not a problem and if the programmer’s using primitive types maybe performance is the most important thing so we use the quicksort”, but I think there’s a much more obvious reason.

Quicksort is faster in both cases. Mergesort is stable in both cases. But for primitive types quicksort is stable too! That’s because primitive types in Java are like elementary particles in quantum mechanics. You can’t tell the difference between one 7 and another 7. Their value is all that defines them. Sort an array such as [7, 6, 6, 7, 6, 5, 4, 6, 0] into [0, 4, 5, 6, 6, 6, 6, 7, 7]. Not only do you not care which 6 ended up in which position. It’s a meaningless question. The array positions don’t hold pointers to the objects. They hold the actual values of the objects. We might as well say that all the original values were thrown away and replaced with new ones. Or not. It just doesn’t matter at all. There is no possible way you can tell the difference between the output of a stable and unstable sorting algorithm when all that’s sorted are primitive types. Stability is irrelevant with primitive types in Java.

By contrast when sorting objects, including sorting objects by a key of primitive type, you’re sorting pointers. The objects themselves do have an independent nature separate from their key values. Sometimes this may not matter all that much—e.g. if you’re sorting java.lang.Strings—but sometimes it matters a great deal. To borrow an example from Sedgewick’s Algorithms I class, suppose you’re sorting student records by section:

public class Student {

  String lastName;
  String firstName;
  int section;

}

Suppose you start with a list sorted by last name and then first name:

John	Alisson	2
Nabeel	Aronowitz	3
Joe	Jones	2
James	Ledbetter	2
Ilya	Lessing	1
Betty	Lipschitz	2
Betty	Neubacher	2
John	Neubacher	3
Katie	Senya	1
Jim	Smith	3
Ping	Yi	1

When you sort this again by section, if the sort is stable then it will still be sorted by last name and first name within each section:

Ilya	Lessing	1
Katie	Senya	1
Ping	Yi	1
John	Alisson	2
Joe	Jones	2
James	Ledbetter	2
Betty	Lipschitz	2
Betty	Neubacher	2
Nabeel	Aronowitz	3
John	Neubacher	3
Jim	Smith	3

However if you use quicksort, you’ll end up with something like this and have to re-sort each section by name to maintain the sorting by name:

Ilya	Lessing	1
Katie	Senya	1
Ping	Yi	1
Betty	Lipschitz	2
Betty	Neubacher	2
John	Alisson	2
Joe	Jones	2
James	Ledbetter	2
Jim	Smith	3
John	Neubacher	3
Nabeel	Aronowitz	3

That’s why stable sorts make sense for object types, especially mutable object types and object types with more data than just the sort key; and mergesort is such a sort. But for primitive types stability is not only irrelevant. It’s meaningless.
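The stable case can be checked with a short sketch. StableSortDemo and its nested Student are stand-ins I made up for this illustration, using a subset of the records above. List.sort is documented to be stable for objects, so students who share a section keep their existing last-name order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; the records are a subset of the table above.
class StableSortDemo {

    static final class Student {
        final String firstName;
        final String lastName;
        final int section;

        Student(String firstName, String lastName, int section) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.section = section;
        }
    }

    // Sorts a list that is already ordered by last name by section only,
    // then returns the last names in their new order.
    static List<String> lastNamesBySection() {
        List<Student> students = new ArrayList<>();
        students.add(new Student("John", "Alisson", 2));
        students.add(new Student("Nabeel", "Aronowitz", 3));
        students.add(new Student("Joe", "Jones", 2));
        students.add(new Student("Ilya", "Lessing", 1));
        students.add(new Student("Katie", "Senya", 1));
        students.add(new Student("Jim", "Smith", 3));

        // The comparator looks only at the section; ties keep their prior order.
        students.sort(Comparator.comparingInt((Student s) -> s.section));

        List<String> names = new ArrayList<>();
        for (Student s : students) {
            names.add(s.lastName);
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [Lessing, Senya, Alisson, Jones, Aronowitz, Smith]
        System.out.println(lastNamesBySection());
    }
}
```

Within each section the last names are still alphabetical, even though the comparator never looked at them.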

Why Functional Programming in Java is Dangerous

Elliotte Rusty Harold — Sun, 20 Jan 2013 11:47:14 +0000

In my day job I work with a lot of very smart developers who graduated from top university CS programs such as MIT, CMU, and Chicago. They cut their teeth on languages like Haskell, Scheme, and Lisp. They find functional programming to be a natural, intuitive, beautiful, and efficient style of programming. They’re only wrong about one of those.

The problem is that my colleagues and I are not writing code in Haskell, Scheme, Lisp, Clojure, Scala, or even Ruby or Python. We are writing code in Java, and in Java functional programming is dangerously inefficient. Every few months I find myself debugging a production problem that ultimately traces back to a misuse of functional ideas and algorithms in a language and more importantly a virtual machine that just wasn’t built for this style of programming.

Recently Bob Martin came up with a really good example that shows why. Here’s a bit of Clojure (a real functional language) that returns a list of the squares of the first 25 integers:

(take 25 (squares-of (integers)))

This code runs, and it runs reasonably quickly. The output is:

(1 4 9 16 25 36 49 64 … 576 625)

Now suppose we want to reproduce this in Java. If we write Java the way Gosling et al intended Java to be written, then the code is simple, fast, and obvious:

for (int i=1; i<=25; i++) {
    System.out.println(i*i);
}

But now suppose we do it functionally! In particular suppose we naively reproduce the Clojure style above:

import java.util.ArrayList;
import java.util.List;

public class Take25 {

    public static void main(String[] args) {    
        for (Object o : take(25, squaresOf(integers()))) {
            System.out.println(o);
        }
    }
    
    public static List<Integer> take(int n, List<Integer> list) {
        return list.subList(0, n);
    }
    
    public static List<Integer> squaresOf(List<Integer> list) {
        List<Integer> result = new ArrayList<Integer>();
        for (Integer number : list) {
            result.add(number.intValue() * number.intValue());
        }
        return result;
    }
    
    // Eagerly builds the whole list before anything can be taken from it.
    public static List<Integer> integers() {
        List<Integer> result = new ArrayList<Integer>();
        // The condition is always true for an int; the heap runs out long before i wraps around.
        for (int i = 1; i <= Integer.MAX_VALUE; i++) {
            result.add(i);
        }
        return result;
    }
    
}

Try to run that. Go ahead. I dare you....OK, recovered from the heap dump yet?

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:2760)
	at java.util.Arrays.copyOf(Arrays.java:2734)
	at java.util.ArrayList.ensureCapacity(ArrayList.java:167)
	at java.util.ArrayList.add(ArrayList.java:351)
	at Take25.integers(Take25.java:30)
	at Take25.main(Take25.java:9)

How did Clojure handle a function that returns every single int, while Java crapped out? The answer is that Clojure, like pretty much all true functional languages (and unlike Java) does lazy evaluation. It doesn't compute values it doesn't use. And it can get away with this because Clojure, unlike Java, is really and truly functional. It can assume that variables aren't mutated, that the order of evaluation doesn't matter, and thus that it can perform optimizations that a Java compiler can't. And this is why functional programming in Java is dangerous. Because Java isn't a true functional language, the JIT and javac can't optimize functional constructs as aggressively and efficiently as they can in a real functional language. Standard functional operations like returning infinite lists are death for a Java program. That's why functional programming in Java is dangerous.
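For contrast, here is a rough sketch of what it takes to get the same on-demand behavior in Java. Laziness has to be spelled out by hand, in this case with java.util.Iterator. The LazySquares class is a hypothetical name of mine, and this is one possible approach under that assumption, not a recommendation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: every lazy step is written out explicitly.
class LazySquares {

    // Yields 1, 2, 3, ... one value per call; nothing is stored.
    // (The int counter would wrap around after Integer.MAX_VALUE.)
    static Iterator<Integer> integers() {
        return new Iterator<Integer>() {
            private int current = 1;

            @Override
            public boolean hasNext() {
                return true;
            }

            @Override
            public Integer next() {
                return current++;
            }
        };
    }

    // Squares each value only when the consumer asks for it.
    static Iterator<Integer> squaresOf(final Iterator<Integer> source) {
        return new Iterator<Integer>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public Integer next() {
                int n = source.next();
                return n * n;
            }
        };
    }

    // Pulls at most n values from the source and stops.
    static List<Integer> take(int n, Iterator<Integer> source) {
        List<Integer> result = new ArrayList<>(n);
        for (int i = 0; i < n && source.hasNext(); i++) {
            result.add(source.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints the 25 squares from 1 to 625 without building a huge list.
        System.out.println(take(25, squaresOf(integers())));
    }
}
```

That is roughly forty lines of plumbing to do what the one-line Clojure expression does, and every method in the chain has to cooperate.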

You may object that I've set up a straw man here. OK, you can't return a list of all the integers (or even all the ints) in Java; but surely no one would really do that. Let's look at a more realistic approach. Here again I use recursion to compute the squares rather than a loop:

public class Squares {
    
    public static void main(String args[]) {
        squareAndPrint(1, Integer.parseInt(args[0]));
    }
    
    public static void squareAndPrint(int n, int max) {
        System.out.println(n * n);
        if (max > n) {
            squareAndPrint(n + 1, max);
        }
    }
    
}

That will run. But now suppose I don't want the first 25 squares but the first 25,000:

Ooops. Stack overflow. This is why in XOM I was very careful to use loops rather than recursion, even in places where recursion was much clearer. Otherwise a carefully configured XML document could have caused a XOM-using program to dump core. Avoiding arbitrarily large recursion in non-functional languages like Java and C isn't just a performance requirement, it's a security requirement too!
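For comparison, a loop-based sketch of the same computation. IterativeSquares is a hypothetical name, and collecting the results into an array rather than printing inside the loop is my own choice to keep the sketch easy to check. The stack depth stays constant no matter how large max gets.

```java
// Hypothetical loop-based counterpart to the recursive Squares class above.
class IterativeSquares {

    // Returns the squares of 1..max. No recursion, so no stack growth.
    // n * n fits in an int for every n up to 46340.
    static int[] squares(int max) {
        int[] result = new int[max];
        for (int n = 1; n <= max; n++) {
            result[n - 1] = n * n;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int square : squares(25000)) {
            System.out.println(square);
        }
    }
}
```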

Now before the flames begin, let me be clear about what I am not saying. I am not saying that functional programming is a bad idea. I am not saying that functional programming is inefficient. I actually love functional programming. Like my colleagues I find functional programming to be a natural, intuitive, and beautiful style of programming but only when it's done in a language that was designed for it from the beginning like Haskell. Functional idioms in Java are performance bugs waiting to bite you.

1% Problems

Elliotte Rusty Harold — Sun, 22 Jul 2012 15:58:29 +0000

I hate 1% problems. No this isn’t an OWS slogan. I’m thinking of those code issues that really aren’t a problem 99% of the time, but when they bite, they’re really hard to debug and they cause real pain. Several common cases in Java:

1. Using java.util.Date or java.util.Calendar instead of JodaTime.
2. Not specifying a Locale when doing language sensitive operations such as toLowerCase() and toUpperCase().
3. Not escaping strings passed to SQL, XML, HTML or other external formats.

What I hate most is that it’s really, really hard to convince other developers that these are problems they should take seriously. The excuses are common:

“No, I don’t have to specify a locale here because the strings are ASCII.”

“I’m only getting a timestamp; I don’t need a proper timezone.”

“The data we’re encoding is coming from a web service we control, and we know it’s not going to send us any formfeeds or null characters.”

“This string is a constant so we clearly don’t need to escape it”, and so on.

All these answers reduce to, “yes, there’s sort of a theoretical problem here; and maybe FindBugs is complaining; but it doesn’t really matter in this case, and I’ve got more important things to spend my time on.”

And you know what? The nay sayers are right, 99% of the time. The problem is that every one of these issues can bite badly that 1% of the time, and it’s usually not obvious when you’re in a 1% case. For instance, just because the string being used to construct an HTML attribute value today is a literal, doesn’t mean it won’t be refactored into a variable next year, and then a variable built from user input a year later. Suddenly there’s an XSS vulnerability in your code that two years ago everyone agreed clearly couldn’t happen, and thus no effort was put into preventing it.

Worse yet, although these problems are very easy to spot at the source code level–indeed can often be detected algorithmically by tools such as PMD or FindBugs–it’s usually not obvious what the cause of the problem is once it does manifest itself. For instance, out of all the myriad reasons a SOAP call might be consistently failing, is the possibility that the data contains an invisible form feed character the first thing that comes to mind?

I have seen major production problems caused by every one of these (#2 just this past week, and #3 the week before) and every one many times more than once. In the case of the failure to properly escape web service input before generating XML, the bug had lived in the code for years before an errant form feed showed up in the data stream and cost several engineer days trying to understand and fix the problem.

These aren’t hard or costly problems to prevent or fix, if we just develop good coding habits. Anytime you see a SQL statement built by string concatenation, alarm bells ought to be sounding. Anytime you see getBytes() invoked on a string without specifying a character set, you shouldn’t have to think twice about changing it to getBytes(Charsets.UTF_8). Anytime you see java.util.Date or java.util.Calendar in code, you should know that something is likely to go wrong, and probably at the worst possible time.
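Two of these habits can be shown in a few lines. This is a sketch with a hypothetical OnePercent class; it uses java.nio.charset.StandardCharsets from the JDK rather than a third-party Charsets class, and the Turkish locale is just one well-known case where the locale matters even for plain ASCII input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class for illustration.
class OnePercent {

    // Lowercases with an explicit locale rather than the platform default.
    static String lower(String s, Locale locale) {
        return s.toLowerCase(locale);
    }

    // Encodes with an explicit charset rather than the platform default.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // ASCII input, non-ASCII output: in Turkish, 'I' lowercases to dotless i (U+0131).
        System.out.println(lower("TITLE", turkish).equals("t\u0131tle")); // prints "true"
        System.out.println(lower("TITLE", Locale.ROOT));                  // prints "title"

        // One character, two bytes in UTF-8, but one byte in ISO-8859-1.
        System.out.println(utf8Length("\u00e9"));                                   // prints 2
        System.out.println("\u00e9".getBytes(StandardCharsets.ISO_8859_1).length); // prints 1
    }
}
```

Code that calls toLowerCase() or getBytes() with no argument behaves like one of these lines or the other depending on the machine it runs on, which is exactly what makes the resulting bugs so hard to track down.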

It’s like seeing a large stack of heavy boxes piled in front of an emergency exit. You don’t have to think about it, estimate the risk of fixing it compared to the risk of leaving it as is, file bug reports, or prioritize it compared to everything else you have to do. You just fix it as quickly as you can. These are dangerous situations; they’re easy to spot; and as professionals we have a duty to fix them when we find them and not to cause them in the first place.

Don’t Design for Reuse

Elliotte Rusty Harold — Sat, 14 Jul 2012 21:17:06 +0000

Last week one of my colleagues hit me with an idea that was so obvious when he pointed it out I wondered why I hadn’t realized it before:

If you’re designing for reuse, you’re doing it wrong.

In 2012 the only code you should be writing is what’s needed for the immediate task at hand. Don’t design for reuse. Don’t consider reuse. Don’t waste one minute of your day making code reusable.

The fact is any reusable code you need already exists. Want to connect to an HTTP server with full support for authentication and cookies? That sounds like something a lot of projects could use, so you should wrap it up in a nice HTTP class or library, right? Wrong. You should use Apache HttpClient instead.

Do you need to solve some initial value problems with a shooting method? Don’t crack out your numerical analysis textbook and start coding. Just download Flanagan’s Java Scientific Library or buy a NAG license instead. Need a date chooser widget and want to share it with your colleagues? Just tell them about JCalendar instead. Maybe that doesn’t have exactly the look and feel you were aiming for? Fair enough. Write your own component or fork the existing one, but realize that your very specific look isn’t likely to fit other people’s apps any better than JCalendar does, so don’t waste any time making yours reusable.

These examples are for Java, but the same is true in any major language including Perl, Python, Ruby, C++, C#, and Scala. In fact, if the language doesn’t have a library that solves the reusable parts of your problem, you shouldn’t be using that language for that problem.

Are there exceptions to this rule? I can think of two (and so far this feels like an exhaustive list).

The first exception is when you’re writing code for something so new that libraries don’t exist yet. If you’re the first one out of the gate, then make the code reusable. For instance, when I first wrote my XIncluder libraries the XInclude spec was still in development, and there really weren’t any alternatives in Java. These libraries became part of the proof of implementability that allowed the spec to advance to full recommendation status some years later. (That effort very nearly got me condemned to invited expert status on the working group, though fortunately saner heads prevailed.) Writing my own XInclude library made sense ten years ago, but I certainly wouldn’t repeat it today.

The second exception is for experts only, and I’m not even sure about this one. If you really are an expert in the field that the reusable code addresses, and if you have made a careful survey of the existing options and concluded that they are inadequate and you see how to do a better job, then, and only then, might you consider writing your own reusable code. This is what I did with XOM. Only after I had written a several hundred page book exhaustively documenting all the then current APIs for processing XML with Java and their strengths and weaknesses, did I sit down to design an API that improved on them. And although I do think I came up with the best such API yet designed, I’m still not sure that was the best use of my time. XOM is, IMHO, superior to what came before it; but it hasn’t been superior enough to really replace those other libraries in many applications. The need just wasn’t that great. As time passes, the code already available for reuse approaches “good enough” and the cost/benefit ratio of improving on it goes way up.

Are there other exceptions? Other times you really should write reusable code? I can’t think of any. Too many developers have spent too much time exploring the problem space, and made their work available for free at sites like SourceForge and GitHub. While there will always be new problems to be solved, there’s just not a lot of benefit to be gained by solving the old problems one more time. The next time you find yourself designing for reuse, stop and ask yourself whether you should be reusing someone else’s code instead.

My New Year’s Resolution

Elliotte Rusty Harold — Sat, 01 Jan 2011 14:41:36 +0000

In 2011 my New Year’s resolution is to do more things the easy way, even if it takes longer the first time. I am going to stop using brute force to solve problems. In particular:

I am finally going to memorize how one redirects both stderr and stdout to the same stream. (2>&1 |)
I am going to learn the sed? trick my advisor showed me 20 years ago for repeating a command from the shell history while substituting one word for another, instead of just using the arrow key to backup to and erase the string. (^string1^string2^ or !!:s/string1/string2/ or for global substitution, not just the first occurrence !!:gs/string1/string2/)
I am going to increase my regex fu and use regular expressions consistently instead of just editing 20 lines of copy and paste code. (This would be easier if every editor didn’t have subtly different syntax.)
I am going to use Python to automate repetitive tasks.