Python is not a DSL

Sunday, April 16th, 2023

How many times have you seen someone use a hammer to pound screws because they are a hammer expert, they are comfortable with hammers, they don’t know how to use a screwdriver, and they don’t want to take a week to learn how to use a screwdriver? Maybe not so much if you’re a carpenter, but if you’re a software developer it happens all the time.

I’ve noticed a common anti-pattern: defining declarative DSLs in Turing-complete languages, specifically Python, to avoid the overhead of learning a new syntax and new tools such as XML or JSON. Instead, programmers define the DSL as a Python library and reuse the Python interpreter, with predictable results. Blaze/Bazel, Airflow, dataswarm, and many other projects have gone down this road. Gradle made the same mistake, only with Groovy instead of Python.

This is massive tech debt. It causes serious problems (security holes, nondeterminism, irreproducibility) and carries a heavy cost. Never do this. It always leads to a huge, expensive effort to redefine the language as its own thing (not Python) that still looks like Python, and the team ends up writing a complete parser in addition to everything else. XML is not that hard. Nut up and learn it.
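
To make the nondeterminism and security points concrete, here is a minimal sketch. The config text, the RETRIES environment variable, and both loader functions are made up for illustration; none of them come from Bazel, Airflow, or any other project named above. It assumes the "config" is a Python source string that the tool executes, which is roughly what a Python-hosted DSL amounts to.

```python
import json
import os

# Hypothetical "declarative" config written as Python source.
# Nothing stops it from reading the environment (or the network, or the clock).
PYTHON_CONFIG = """
import os
job = {"name": "nightly", "retries": int(os.environ.get("RETRIES", "3"))}
"""

# The same intent expressed as plain data.
JSON_CONFIG = '{"name": "nightly", "retries": 3}'


def load_python_config(source: str) -> dict:
    """Run the config as code. Anything Python can do, the config can do."""
    namespace: dict = {}
    exec(source, namespace)  # arbitrary code execution, by design
    return namespace["job"]


def load_json_config(source: str) -> dict:
    """Parse the config as data. No code runs."""
    return json.loads(source)


if __name__ == "__main__":
    # Same config text, two different environments, two different jobs.
    os.environ["RETRIES"] = "1"
    print(load_python_config(PYTHON_CONFIG))
    os.environ["RETRIES"] = "9"
    print(load_python_config(PYTHON_CONFIG))

    # The JSON config is the same job no matter where it is loaded.
    print(load_json_config(JSON_CONFIG))
```

With the Python version, the same file can describe different jobs on different machines, and whoever can edit the config can run code wherever it is loaded. The JSON version can only ever be the data it says it is.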

Do not write declarative configs in a Turing-complete language.
Do not invent Python subsets for config files. <cough>Starlark</cough>