All modified and coverable lines are covered by tests. Comparison is base (a765aef) 85.95% compared to head (fe02e1b) 85.95%. Additional details and impacted files:

@@            Coverage Diff            @@
##             master     #546   +/-   ##
=========================================
  Coverage     85.95%   85.95%
  Complexity     3480     3480
=========================================
  Files           230      230
  Lines          8225     8225
  Branches        960      960
=========================================
  Hits           7070     7070
  Misses          865      865
  Partials        290      290
Precisely identical. What happened? Did I change a comment? Well, no. In fact I added tests for situations that were not currently covered, so why didn’t coverage increase?
The two new tests covered exceptional cases that Codecov doesn’t see, because they test different arguments that can be passed to a public method. Code coverage percentages check line and sometimes branch coverage, but they have a blind spot for the much larger number of permutations of input values.
In this case those arguments are null, and the tests cover null pointer exceptions, but really they could have been any different arguments to the method not already covered. Real coverage increased but the metrics don’t show it.
This is another example of why it’s important to write the tests first. Relying on tools to identify cases that aren’t covered misses a lot. Without delving too deep into the commit history, I’d bet that what happened here is that someone changed the behavior of an existing method. The existing tests failed, and they commented them out rather than understanding the change in behavior and deciding whether it made sense.
What they should have done is written a test for the desired new behavior or changed the existing tests to test for the new behavior. Of course, if the new behavior wasn’t desired, then they should have instead reverted it, or fixed their new code so the tests didn’t break. Again, without delving into commit history, I’m not sure if this change was intended or not. For now, I’m sticking with the current behavior since it seems reasonable. But test-first changes, or at the very least not commenting out the test failures, would have alleviated this uncertainty.
There are opportunities for code coverage tools to do better, though I don’t yet know of any that do. For instance, if a method is declared to throw a certain exception, or has an @throws Javadoc tag indicating a certain runtime exception, the tool could check that the exception was indeed thrown at some point during the tests.

Similarly, it could inspect the arguments passed to a public method during a test and warn if the method only ever received positive numbers, or non-null objects, or non-empty strings, or whatever. It could even check that common edge conditions like 1, -1, 0, Inf, NaN, and Integer.MAX_VALUE are covered.
unittest.mock, Python’s mocking framework, is so much more powerful than EasyMock, Mockito, or any other Java mock framework I’ve ever used. You can replace any method you like with essentially arbitrary code. No longer do you have to contort APIs with convoluted dependency injection just to mock out network connections or reproduce error conditions.
Instead you just identify a method by name and module within the scope of the test method. When that method is invoked, the actual code is replaced with the mock code. You can do this long after the class being mocked was written. Model classes do not need to participate in their own mocking. You can mock any method anywhere at any time, in your own code or in dependencies. No dependency injection required. You can even mock fields.
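A minimal sketch of what this looks like in practice. The fetch_status function and the 503 status here are hypothetical stand-ins for real network code:

```python
from unittest import mock
import urllib.request


def fetch_status(url):
    # Real code that opens a network connection.
    return urllib.request.urlopen(url).status


def test_fetch_status_without_network():
    # Replace urlopen by name, scoped to this test; fetch_status itself
    # needs no dependency injection hooks.
    fake_response = mock.Mock(status=503)
    with mock.patch("urllib.request.urlopen", return_value=fake_response):
        assert fetch_status("https://example.com/") == 503
```

The patch is applied and removed automatically around the with block, so the real urllib.request is untouched everywhere else.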
By contrast Java only lets you mock objects (not methods) and only when you have an available API to insert the mock in place of the real thing.
Reason 2: None is its own type
In Java null can be anything. In Python only None is None. A str can’t be None. A Foo can’t be None. An int can’t be None. Union[None, Foo] is not the same type as Foo. This is beautifully simple and obvious, once you break the Java habit of thinking of None as essentially whatever type you need. It helps to find and prevent bugs and to reason about code.
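A minimal sketch of what a type checker such as mypy enforces here (the greet function is invented for illustration):

```python
from typing import Optional


def greet(name: Optional[str]) -> str:
    # name is Union[None, str], not str, so a checker forces us to
    # handle the None case before calling any str methods.
    if name is None:
        return "Hello, stranger"
    return "Hello, " + name.title()
```

At runtime greet(None) simply returns the fallback, and a checker flags any call path that could reach name.title() while name is still None.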
Java has had @Nullable and @NonNull for some years now, but these have never been integrated into the language, require special tools to check them, and aren’t nearly as powerful as Python’s simple decision that None is its own thing.
Reason 3: Named method arguments with defaults
These are so much more readable and less error prone than method overloading, I hardly know where to begin. In fact, these are so wonderful I’m tempted to write all my Python code with only named arguments.
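For instance, a hypothetical connection helper (all names and defaults here are invented):

```python
def connect(host, port=443, timeout=30.0, retries=3, use_tls=True):
    # The returned string just makes the chosen values visible.
    return f"{host}:{port} tls={use_tls} timeout={timeout} retries={retries}"


# Callers name only what they change, so every call site reads like documentation:
connect("example.com", timeout=5.0)
connect("example.com", port=80, use_tls=False)
```

The Java equivalent needs a thicket of overloads or a builder class to express the same call sites.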
Reason 4: Calling super().__init__ last
In Java the superclass constructor is always called first, before any other statements in a subclass constructor. There are reasons for this, but it’s often inconvenient and can require a lot of hanky panky to get values ready before they’re naturally available.
Python is just easier here. Call super().__init__ whenever you’re ready to call it: first, last, or in the middle. It doesn’t matter.
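A sketch, with invented Shape and Rectangle classes, of computing a value before handing it to the superclass initializer:

```python
class Shape:
    def __init__(self, area):
        self.area = area


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height
        # Called last, once the value it needs has been computed;
        # a Java constructor could not defer super() like this.
        super().__init__(width * height)
```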
Reason 5: f-strings
Relatively new in Python — introduced in version 3.6 — but far more readable and less error prone than what we had before.
When Java added varargs, printf, and java.lang.Formatter circa Java 1.5, we had decades of experience in C and C++ to teach us this was a bad API that led to dangerous bugs. Sadly, no one paid attention to that. Instead of creating a modern, sane format like Python’s f-strings, they mindlessly copied a misbegotten 30-year-old kludge from C. At least Python eventually learned from Java’s mistake, even if Java didn’t learn from C’s.
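The same message three ways shows the difference (the values are invented):

```python
name, count = "widgets", 7

old_percent = "%d %s left" % (count, name)     # C-style: types and order are easy to get wrong
old_format = "{} {} left".format(count, name)  # verbose, values far from the text
new_fstring = f"{count} {name} left"           # values sit next to the text they belong to

assert old_percent == old_format == new_fstring
```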
I used to take Pythonistas at their word that they didn’t actually need strong, static compile time type checking. That was before I spent over a year writing Python more or less full time.
I am constantly blocked by not understanding which variables have which types. I am frequently spelunking through many levels of code and popping open the debugger to find out what type a variable actually has at any given moment. Not having explicit, enforced types makes code far harder to understand and edit.
Corollary: var is a very bad idea for Java and should not be used.
This is well known in the Python community today. PEP 484 is basically an admission that inline typing is a necessity for robust code, and Guido has admitted as much. It’s why large Python shops like Meta and Google have invested in tools like Pyre and Pytype to add strong typing to the language. These tools help, but they’re not as good as Java’s strong, reliable, static type declarations and type checking.
Reason 2: Checked exceptions
There are only two ways exceptions are handled in Python:
Method 1: Crash and dump the stack.
Method 2: Catch all exceptions, log an error message and a stack trace for each one, and then crash.
Yes, Python programmers could do better. No, in practice they don’t. I’m not talking about one-off hobbyist scripts either. Even major programs from big tech companies like Google fail badly here. When I was doing a lot of work in GCP, I routinely crashed the Cloud SDK, written in Python, leading to a stack trace that was not in my code and that I could do nothing about. When you’re contacting a network service, e.g. to deploy a new version of an App Engine app, it’s entirely possible the remote server will return an error. When it does, the Cloud SDK should look at the error code and report the problem to the end user in an intelligible fashion. Instead, more often than not, it prints a stack trace from the Cloud SDK’s own code, even though this is not the code that’s at fault. If any code was buggy, it was on the server, but that stack trace wasn’t available.
This simply doesn’t happen nearly as often in Java, and checked exceptions are the main reason why it doesn’t. Java programmers have to go out of their way to ignore network failures. Python programmers have to go out of their way to handle network failures.
Even programmers who want to do better can’t, because their dependencies don’t do better. Libraries can and do throw any exception at any time, and change which exceptions they throw in patch releases. None of this is ever documented. Unless you control 100% of the code in your project, you can’t assume anything about which exceptions might be thrown when. Even if you think you control all the code, there’s nothing to prevent another programmer, or even your future self, from changing the exceptions in the code you depend on.
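A sketch of the discipline checked exceptions would force, using the standard library’s urllib (the deploy function and its messages are invented):

```python
import sys
import urllib.error
import urllib.request


def deploy(url):
    try:
        return urllib.request.urlopen(url)
    except urllib.error.HTTPError as e:
        # The server rejected the request; report its status code
        # instead of dumping our own stack trace.
        sys.exit(f"Server returned {e.code} for {url}; check the request and retry.")
    except urllib.error.URLError as e:
        # We never reached the server at all.
        sys.exit(f"Could not reach {url}: {e.reason}")
```

Nothing in Python makes you write those except clauses; a checked exception in Java refuses to compile until you decide what to do with it.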
Reason 3: Performance
Python is slow. Sometimes this doesn’t matter. Sometimes it does. It matters more often than you think. For instance, even when I’m working on a program without strong performance requirements, Python routinely takes a minute or more to start up and run a unit test. That breaks my flow and prevents the fast feedback cycle Python was supposed to give us by being interpreted instead of compiled.
By contrast in Java, fast methods run fast. If a test isn’t doing very much, I can run it in less than a second in Eclipse or IntelliJ. Nothing in Python in VSCode is ever that fast. Maybe VSCode is the problem, I’m not sure, but I do know that I iterate code-test-debug way faster in Java than in Python.
Reason 4: Access Protection
Python doesn’t have private. Double underscores are a polite suggestion not to access something, not a rule. Python’s general philosophy on this is often summed up as “We are all adults here”.
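A sketch of how weak that suggestion is (the Account class is invented):

```python
class Account:
    def __init__(self, balance):
        self.__balance = balance  # only mangled to _Account__balance, not hidden

    def balance(self):
        return self.__balance


acct = Account(100)
# The polite suggestion is trivially ignored:
acct._Account__balance = -1_000_000
```

Nothing in the language prevents a distant caller from depending on, or mutating, this “private” state.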
Having now dealt with very large Python code bases written by top notch, very smart, highly paid Python programmers, I am prepared to say no, no we are not. I am very tired of being unable to change a dunder Python method in my own teams’ code because some alleged adult twelve time zones away made their mission critical product depend on the exact signature and behavior of our “private” method.
Tight dependency coupling is a bad traffic jam in Java. In Python it’s a Richter 9 earthquake that destroys all roads in a city and sets the gas stations on fire. The problems just aren’t at the same scale.
Reason 5: Javadoc
Python has doc comments but they’re rarely used or published. By contrast Java has a robust culture of documenting public APIs that enables Google to provide reliable answers for many questions. Outside the standard libraries, Python third party libraries rarely document methods adequately or make them available in a form Google can surface.
I’ve noticed a common anti-pattern of defining declarative DSLs in Turing-complete languages — specifically Python — to avoid the overhead of learning new syntax and tools such as XML or JSON. Instead programmers define the DSL as a Python library and reuse the Python compiler, with predictable results. Blaze/Bazel, Airflow, dataswarm, and many other projects have gone down this road. Gradle made the same mistake, only with Groovy instead of Python.
This is massive tech debt that causes massive problems (security, indeterminacy, irreproducibility) and carries a heavy cost. Never do this. It always leads to a huge, expensive effort to redefine the language as its own thing (not Python) that still looks like Python, and the team ends up writing a complete parser in addition to everything else. XML is not that hard. Nut up and learn it.
Do not write declarative configs in a Turing complete language.
Do not invent Python subsets for config files. <cough>Starlark</cough>
If I don’t like the choice, I’ll just change it back, easy peasy.
More significantly, I already made an explicit choice to do this by seeking out an option several levels deep in the UI. This is not an accident. Just do what I said.
For some reason, when installing some random editor, Apple allows the app to take over all existing file mappings it wants without even informing the user, but when I make a very explicit choice to take this fully reversible action, Apple thinks I need to be warned away from it and double confirmed.
Here’s a suggestion: when the user tells the computer to do something, it should do it. Don’t ask the user if they really, really want to unless the action is both irreversible and potentially harmful. For instance, printing on paper might be irreversible but it’s unlikely to be harmful so it shouldn’t be confirmed. Design UIs so the user makes explicit choices they understand, and give them the option to undo when they choose. If there’s a confirmation step in the UI, something is wrong.
This article is primarily focused on desktop browsers. I might have more to say about mobile platforms in a future post.
tl;dr: Use Firefox 70 or later with these three extensions:
This isn’t really achievable today, especially once logins and credit cards are considered, but you can get a lot closer. Here’s how.
Firefox is my recommended browser. Use version 70 or later.
Install these extensions:
Given these add-ons I don’t think it’s necessary any longer to worry about third party cookie preferences or clearing cookies on Exit in Firefox. I might be wrong about that, though.
Is HTTPS Everywhere needed in 2019? It doesn’t seem to be. I’ve deleted it from my computers.
Do not install Ghostery. It merely duplicates functionality of the above list.
Do not install the Google Analytics Opt-out Browser Add-on. Privacy Badger handles this.
Do not install the Duck Duck Go Privacy Tools. The only features of this you need are duplicated by the above list.
The one additional extension I’m considering adding to this list is Facebook Container,
though personally I almost never log into Facebook and do so in a private window when necessary. It might be redundant with the above.
Please comment if you know something about this.
Same basic list of extensions: uBlock Origin, Privacy Badger, and Cookie Autodelete.
Configure these settings:
I tend not to install any extensions in Safari. I don’t use it for day-to-day browsing and often leave it as a backup for when I need to see what a web site looks like without any of these content blocking add-ons in place.
This isn’t a browser issue, but it is important. The goal here is to keep your ISP (Comcast, Verizon, your employer, etc.) from spying on you. I recommend installing and routinely using a VPN. I currently use Private Internet Access, but I’ve also heard good things about NordVPN.
You might need to turn this off to watch Netflix, and some countries and employers block them.
Many of these extensions have the scary permission to “Access your data for all websites” or “Read and change your data on all websites.” This is because the extension permission model in the browser is insufficiently granular. For instance, it’s not usually possible to grant an extension permission to read and change cookies without granting it the permission to read the DOM of every page you see. Browser vendors need to improve this.
All of this needs to be done without carefully manually curating which sites I visit in which tabs and containers and incognito mode. It should be automatic and easy to use.
This changes almost by the month. I certainly would not have given the same recommendations a year ago, and probably won’t a year from now.
Comments, suggestions, and updates are much appreciated. I’d love to hear about other addons and tricks to protect browser privacy. If you’d like to explain why I might want to consider a different extension to fill the same need, that’s good too.
I used Google’s instructions for setting up a new WordPress site but that only got me to an empty site. I then had to import the old data and database which was non-trivial. Surprisingly the mysqldump of the old database is only about 10 MB.
The custom theme I use here had numerous deprecated functions. I had to turn on debugging info to fix some other issues, and that revealed the deprecated functions. I’ve now updated the theme to replace those.
Chrome tells me “This page is trying to load scripts from unauthenticated sources.” That seems to have something to do with the Google and Amazon ads. I haven’t figured out how to fix that yet. Probably it’s because they’re configured for http://cafe.elharo.com and temporarily this is on https://mokka-186313.appspot.com/. I don’t see the warning on the old site. If that doesn’t go away when I repoint the domain name, I’ll have to delve deeper.
The first hard nut to crack was getting the database connection set up. My first attempt failed because I used the password for the root database user instead of the wordpress database user. A doc bug has been filed to clarify that.
The second hard nut to crack was this bit of code in wp-db.php:
if ( WP_DEBUG ) {
mysqli_real_connect($this->dbh, $host, $this->dbuser, $this->dbpassword,
null, $port, $socket, $client_flags);
} else {
@mysqli_real_connect($this->dbh, $host, $this->dbuser, $this->dbpassword,
null, $port, $socket, $client_flags);
}
As best I can tell, this code leads to a non-critical DNS lookup error on the App Engine environment that doesn’t affect the running site directly. However it does output some extra junk onto the page before the rest of the page has loaded. This prevents WordPress from setting cookies which breaks logins. The error looks like a forgotten password problem, but it isn’t.
WordPress updates are likely to be a little more painful, since I can’t run them from WordPress itself because the file system is read-only. Instead I need to use the WordPress CLI locally and update with:
gcloud app deploy --promote --stop-previous-version app.yaml cron.yaml
I need to figure out if I can download my whole WordPress install from the cloud or if I have to copy it from one machine to the next. On the plus side, that is considerably more secure than a classic WordPress install that expects to write directly into the file system. I definitely had some problems with that on the old host.
I also need to run this for a month and see what sort of costs it incurs. pair.com cost me $17.95 a month which covered four sites. However HTTPS would have cost me a lot more, and they were never very helpful when I tried to set that up.
Although the initial migration preserved all my images, they now appear to have vanished. I suspect turning on one of the App Engine WordPress plugins makes WordPress look for them in Google Cloud Storage instead of the file system. Hmm, looks like the problem is that the images are all pointing to the old site at URLs like http://cafe.elharo.com/wp-content/uploads/2016/02/noservertypedefinition.png instead of with relative URLs. That’s weird. In any case, should be fixed now.
If you notice any remaining problems, please leave a comment. (Assuming comments work. I haven’t tested that yet.)
Once I’ve got the bugs worked out of this site, I’ll try to move Mokka mit Schlag over as well. That’s going to be trickier though because www.elharo.com is a mix of WordPress and classic static HTML. I need to figure out if App Engine Standard can handle that, or if I’ll need to move to something more complex like GCE.
There are a lot of things to complain about here, but the main one is not obvious from a static screenshot (though, as it turns out, a static screenshot has exactly the same problem):
The message cannot be copied.
The first thing any competent computer user does when presented with an incomprehensible error message such as “No server type definition” is to Google it. But why should the user have to retype it? This message should be able to be copied right out of the UI.
And why stop at error dialogs? Any confusing UI element should be able to be selected and copied. This is critical for tech support, debugging, documentation, and anything that happens when apps go wrong. Here’s a history window from Crashplan. None of these messages can be copied and emailed to tech support. If I want to tell them what Crashplan is doing wrong (something they refuse to believe can possibly be happening) I have to laboriously retype all this information:
This is one of the (few) things web apps often get right and desktop and mobile apps usually get wrong. As long as the UI is HTML, you can probably select it. Here’s a recent Firefox error message. The complete message is selectable so I can, for example, copy and paste it into an email to the site owner alerting her to the problem:
That’s not the only use for copyable text either. I can (and did) easily use it for the alt text of the above img element, which improves the searchability of the problem. I can search for the error message in the Firefox source code repository, if I feel like fixing the grammatical error in the message. There are probably a dozen other uses I haven’t yet imagined that are enabled by having easy access to the message as text. Text is fluid and accessible, and admits many different uses, in a way screenshots don’t.
When writing UI code, resist the urge to display non-selectable labels. For example, in Swing never use the JLabel class, because JLabels are not selectable without a lot of error-prone custom event handling code. Instead use a JTextPane and configure it so it looks like a JLabel:
JTextPane label = new JTextPane();
label.setText("Something bad happened!");
label.setEditable(false);
label.setBackground(null);
label.setBorder(null);
In some circumstances, you may also want to change the font to match what users expect in a label. Exploding Pixels has instructions for doing that if you find it necessary.
I blame Oracle for making this necessary. Why do toolkits even allow non-selectable text? There should be no such thing as an unselectable, uncopyable label. Uneditable, yes; but not uncopyable. JLabel must die.
Widget libraries (GTK+, Cocoa, AWT, Swing, SWT, etc.) should remove non-selectable labels. The default should be to allow the user to select and copy UI text.
There are a few widgets that have non-selectable text because clicking them already does something else. Enabling selection of menu items would probably confuse less dexterous users who select a menu item they mean to click. Perhaps buttons as well. But all labels — that is, all strings of text whose primary purpose is to be shown to the user — should be able to be copied and pasted. It’s simple respect for the user.
The new project is more traditional git: many branches, many developers, many forks. Perhaps the git/bitkeeper distributed model makes sense for projects like the Linux kernel where there are many independent repositories on many developers’ machines, none authoritative. However for a traditional single team, single repository project, git feels far too heavyweight and complex for my tastes. I find it slows me way down. However like most developers I’m slowly getting used to it, and developing my own small subset of the vast corpus of git functionality that I actually use.
Git is designed to support frequent commits, and to pass change requests back and forth as lists of commits so the development work is tracked, rather than by passing file diffs back and forth like most other systems. What really confuses me, though, is that no one seems to actually use it this way. If you want to submit a patch, you do not in fact send the list of commits that shows the history and ongoing work. Instead you rebase everything against master and send a single commit that squashes all the changes together, which seems to be exactly what git is designed to make unnecessary. In other words, we’re using git as if it were a traditional single-master system such as Subversion. Why? And does any project actually expect developers to send their full list of commits rather than a single squashed commit?
(Side note: Perforce is the best of both worlds here. To my knowledge, Perforce and its clones are the only version control system that manages to separate out the ongoing work in a change list and the final commit, and show you both depending on what you want to see.)
Regardless of the wisdom of discarding all history before submitting, like removing all the scaffolding before publishing a mathematical proof, it is how almost all git-based projects operate. Like most (all?) operations in git, it is far from obvious how to actually squash a series of commits down so it’s one clean diff with the current master. And also like most operations in git, there are multiple ways to do this. What follows is the approach I’ve found easiest and most reliable:
Update: As of April 1, 2016, this post is out of date for developers working on Github. Just use the new Confirm Squash and Merge button instead.
Assumptions:
There may be other, implicit assumptions about workflow I don’t realize yet. E.g. I don’t know if this works with a non-Github system.
Here’s the short version.
$ git checkout master
$ git fetch
$ git pull
$ git checkout feature_branch
$ git merge master
$ git reset origin/master
$ git add any/untracked/new/files
$ git commit -a -m "Here's what this feature does"
$ git push -f origin feature_branch
Finally, go to the github UI and merge origin/feature_branch into origin/master. Of course, this may change if your team has a different workflow or does not use github.
In more detail:
$ git checkout master
$ git fetch
$ git pull
$ git checkout feature_branch
$ git merge master
At this point, you’ll be presented with a bunch of screens in your editor of choice. Just save them all.
Add any untracked files and then commit the change:
$ git add path/to/untracked/file1 path/to/untracked/file2 ...
$ git commit -a -m "Here's what this feature does"

The one thing you lose in this process is your old commit message, so you need to enter it again.
$ git push -f origin feature_branch
There are other ways to squash git commits, in particular using the rebase command. However, in my experience rebasing gets very confusing after more than a few commits, especially if you’ve had to merge changes from other developers or branches into your feature_branch in the meantime. Resetting the origin effectively does a diff between your local branch and head, which is a lot easier to follow.
Copyright Barebones Software 1992-2015.
Has it really been more than 20 years since those early freeware versions on System 7? What is it about text editors that enables them to last so long? emacs and vi are even older.
This got me thinking. What software am I still using today that I was using 20 or more years ago?
In the Windows and Mac worlds, almost nothing. The operating systems have changed completely and aren’t remotely the same. Photoshop and Illustrator, I suppose, have a continuous history going back that far; and would be recognizable to a time traveling graphic designer from 1990. Adobe Acrobat is almost that old. QuarkXPress still exists, though PageMaker is dead. The Microsoft Office suite of Word, Excel, and PowerPoint is still going strong, though like the operating systems it runs on, it’s been rewritten and revised so much that little other than the brand name survives.
The Unix world has been kinder to its senior citizens (though as with Windows and Mac the operating system itself has changed out from under them). gcc is still the compiler of choice. Most of the other GNU tools are shipped with every Linux distro. My day job recently launched a LaTeX project of all things. If you told me back in my undergraduate days that I’d be fighting with LaTeX for the next 30 years, it might have convinced me to drop physics and take the MCATs. It’s scary to think I might retire before LaTeX does.
Scientific software has also lasted. Mathematica goes back to 1988, Maple and Matlab even further. SPSS and SAS are still going strong though R and Python are the choice of a new generation of statisticians. I’ve never used AutoCad, but it keeps chugging right along.
How much of today’s software will still be relevant 25 years from now? I don’t expect to still be using Microsoft Office in 2040. Photoshop and Illustrator could go the way of Persuasion, Freehand, and PageMaker. I hate to admit it, but LaTeX will probably still be confounding graduate students for decades to come. TextMate may join emacs and BBEdit in the ranks of editors burned into programmers’ muscle memory. If Adobe doesn’t kill the golden goose with more annual licensing shenanigans, Lightroom could outlast Photoshop.
Nonetheless, most of the software I use day to day (web browsers, eReaders, IDEs, etc.) is really just a display mechanism for data in a standard format and could be replaced by something better tomorrow. If the data lasts, the software is replaceable. And that is as it should be.