Another Java question LLMs can’t answer

This time I asked Gemini, “How does Maven version comparison differ from semver version comparison?” Its answer didn’t say anything patently false, but it failed to find the most important distinction. It’s operating at the level of an undergraduate who writes better than average but still doesn’t really understand the subject matter. It didn’t find anything new, although in this case there was something new to be found. (I found it myself a few weeks ago, but haven’t gotten around to publishing it on the Web yet.)

The answer Gemini gave is:

The core distinction between Maven and SemVer version comparison lies in their underlying philosophies and the strictness of their rules:

Semantic Versioning (SemVer):

  • Semantic Meaning:
    • SemVer is designed to convey the meaning of changes. The “Major.Minor.Patch” structure explicitly indicates the type of change (breaking, feature, bug fix).
    • This allows developers to understand the impact of upgrading to a new version.
  • Strict Rules:
    • SemVer adheres to a strict specification, defining how versions should be incremented and compared. This promotes predictability and consistency.
    • Clear Precedence:
      • It has very clear rules on how pre-release versions are handled, which is important for establishing a clear order.

Maven Version Comparison:

  • Token-Based Ordering:
    • Maven’s version comparison is primarily based on tokenizing the version string and comparing the resulting segments.
    • While it can handle many common versioning schemes, its logic is more focused on establishing an order.
  • Flexibility and Ambiguity:
    • Maven offers greater flexibility in version formats, which can lead to ambiguity in certain cases.
    • Maven’s version comparison can handle more diverse version string formats than strict SemVer.
  • Less Semantic Focus:
    • Maven’s comparison is less concerned with the semantic meaning of version changes and more focused on determining which version is “newer.”

Key Differences in Practice:

  • Pre-release Handling:
    • While both systems handle pre-release versions, their comparison logic can differ in edge cases. SemVer has very specific rules, where Maven’s can be more varied.
  • Version String Parsing:
    • Maven’s version comparison algorithm is designed to be more tolerant of different version string formats, while SemVer is very strict. This means that Maven might be able to compare version strings that SemVer would consider invalid.
  • Semantic Information:
    • Semver encodes semantic information into the version number, Maven does not.

In summary, SemVer is about meaning and strict rules, while Maven is about ordering and flexibility. This leads to differences in how they handle pre-release versions and interpret version strings.

What’s interesting here is that Gemini correctly incorporates the content about this topic that’s already on the Web. However, it completely misses a point that is not (yet) stated explicitly on the Web but that can be found by understanding the SemVer spec and the Maven code base. Specifically, it did not notice that Maven version comparison is case insensitive while SemVer version comparison is case sensitive. That’s a pretty big difference. It’s one I noticed by reading a lot less than Gemini did, and then by thinking about it. Then I whipped up a quick little experiment in the form of a unit test to prove that what I thought was true was in fact true. It was, and I’ll publish that on the Maven web site soonish.
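The case-sensitivity point can be illustrated with a minimal sketch. This is not the unit test mentioned above and it does not call Maven. The class SemverCaseDemo and its methods are hypothetical names, compareSemver is a simplified reading of the SemVer 2.0.0 precedence rules with no input validation, and compareCaseFolded is only a stand-in that lowercases both strings to mimic a case-insensitive comparison, not Maven's actual algorithm.

```java
import java.math.BigInteger;
import java.util.Locale;

public class SemverCaseDemo {

    // Splits "1.0.0-alpha+build" into {"1.0.0", "alpha"}; build metadata is ignored.
    static String[] split(String version) {
        int plus = version.indexOf('+');
        if (plus >= 0) {
            version = version.substring(0, plus);
        }
        int dash = version.indexOf('-');
        if (dash < 0) {
            return new String[] {version, ""};
        }
        return new String[] {version.substring(0, dash), version.substring(dash + 1)};
    }

    // Compares one pair of pre-release identifiers.
    static int compareIdentifier(String x, String y) {
        boolean xNumeric = x.matches("\\d+");
        boolean yNumeric = y.matches("\\d+");
        if (xNumeric && yNumeric) {
            return new BigInteger(x).compareTo(new BigInteger(y));
        }
        if (xNumeric) {
            return -1; // numeric identifiers sort before alphanumeric ones
        }
        if (yNumeric) {
            return 1;
        }
        return x.compareTo(y); // ASCII order, so upper case sorts before lower case
    }

    // Simplified SemVer precedence: core numbers first, then pre-release identifiers.
    static int compareSemver(String a, String b) {
        String[] left = split(a);
        String[] right = split(b);
        String[] leftCore = left[0].split("\\.");
        String[] rightCore = right[0].split("\\.");
        for (int i = 0; i < 3; i++) {
            int c = new BigInteger(leftCore[i]).compareTo(new BigInteger(rightCore[i]));
            if (c != 0) {
                return c;
            }
        }
        if (left[1].isEmpty() || right[1].isEmpty()) {
            // A release outranks a pre-release of the same core version.
            return Boolean.compare(left[1].isEmpty(), right[1].isEmpty());
        }
        String[] leftIds = left[1].split("\\.");
        String[] rightIds = right[1].split("\\.");
        for (int i = 0; i < Math.min(leftIds.length, rightIds.length); i++) {
            int c = compareIdentifier(leftIds[i], rightIds[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(leftIds.length, rightIds.length);
    }

    // Stand-in for a case-insensitive comparison (an assumption, not Maven's code).
    static int compareCaseFolded(String a, String b) {
        return compareSemver(a.toLowerCase(Locale.ROOT), b.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String upper = "1.0.0-ALPHA";
        String lower = "1.0.0-alpha";
        System.out.println("case sensitive:   " + Integer.signum(compareSemver(upper, lower)));
        System.out.println("case insensitive: " + Integer.signum(compareCaseFolded(upper, lower)));
    }
}
```

Under the case-sensitive rules the two strings are distinct versions, while the case-folded comparison treats them as equal. To check Maven itself rather than this stand-in, one could compare the same two strings with org.apache.maven.artifact.versioning.ComparableVersion from the maven-artifact library.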

It’s possible that I’m unfairly singling out Gemini here, simply because it’s the one I happen to have a subscription for. I should try other models too, but I’ll be surprised if any of them notice this. I went ahead and asked ChatGPT, and it also failed to notice the difference in case sensitivity. It also repeated some ambiguous text from the Maven documentation that I’m in the process of replacing with more precise language.

I also went ahead and asked ChatGPT, “Give me examples of valid semver version strings that Maven considers to be equal but semver considers to be different.” It still completely missed that SemVer is case sensitive and Maven is not, and it also began to make factually incorrect statements that, in a person, would demonstrate a severe misunderstanding of how Maven compares version strings.

This is why we need experienced senior developers and technical writers. They can discover and create new knowledge. So far I don’t see a lot of evidence of LLMs creating anything new, at least not in software development. They’re very good at doing the same thing again very quickly, as computers always have been, and that’s important. What I don’t see is LLMs discovering new facts, processes, and techniques that we don’t already know. The protein folding work is truly impressive, but it’s impressive in the same way that numerical solutions of differential equations are impressive. The models aren’t doing anything we don’t know how to do. They’re just doing it faster on a lot more data.

I know of a few cases where an ML model may have discovered something new in biology. For example, James Collins’s group at the MIT Jameel Clinic has been using deep learning to find new antibiotics, and has made progress against drug-resistant bacteria. It appears the ML models have revealed things the researchers would not otherwise have found. That’s pretty important. Possibly this is because these are not generic LLMs studying the literature, but very targeted ML models studying the raw data, where they can see patterns others haven’t seen yet. I need to look into that.
