Declarative programming is, currently, the dominant paradigm of an extensive and diverse set of domains such as databases, templating and configuration management.
In a nutshell, declarative programming consists of instructing a program on what needs to be done, instead of telling it how to do it. In practice, this approach entails providing a domain-specific language (DSL) for expressing what the user wants, and shielding them from the low-level constructs (loops, conditionals, assignments) that materialize the desired end state.
While this paradigm is a remarkable improvement over the imperative approach that it replaced, I contend that declarative programming has significant limitations, limitations that I explore in this article. Moreover, I propose a dual approach that captures the benefits of declarative programming while superseding its limitations.
CAVEAT: This article emerged as the result of a multi-year personal struggle with declarative tools. Many of the claims I present here are not thoroughly proven, and some are even presented at face value. A proper critique of declarative programming would take considerable time, effort, and I would have to go back and use many of these tools; my heart is not in such an undertaking. The purpose of this article is to share a few thoughts with you, pulling no punches, and showing what worked for me. If you’ve struggled with declarative programming tools, you might find respite and alternatives. And if you enjoy the paradigm and its tools, don’t take me too seriously.
If declarative programming works well for you, I’m in no position to tell you otherwise.
The Merits Of Declarative Programming
Before we explore the limits of declarative programming, it is necessary to understand its merits.
Arguably the most successful declarative programming tool is the relational database (RDB). It might even be the first declarative tool. In any case, RDBs exhibit the two properties that I consider archetypical of declarative programming:
- A domain specific language (DSL): the universal interface for relational databases is a DSL named Structured Query Language, most commonly known as SQL.
- The DSL hides the lower level layer from the user: ever since Edgar F. Codd’s original paper on RDBs, it is plain that the power of this model is to dissociate the desired queries from the underlying loops, indexes and access paths that implement them.
Before RDBs, most database systems were accessed through imperative code, which is heavily dependent on low-level details such as the order of records, indexes and the physical paths to the data itself. Because these elements change over time, code often stops working because of some underlying change in the structure of the data. The resulting code is hard to write, hard to debug, hard to read and hard to maintain. I’ll go out a limb and say that most of this code was in, all likelihood, long, full of proverbial rats’ nests of conditionals, repetition and subtle, state-dependent bugs.
In the face of this, RDBs provided a tremendous productivity leap for systems developers. Now, instead of thousands of lines of imperative code, you had a clearly defined data scheme, plus hundreds (or even just tens) of queries. As a result, applications had only to deal with an abstract, meaningful and lasting representation of data, and interface it through a powerful, yet simple query language. The RDB probably raised the productivity of programmers, and companies that employed them, by an order of magnitude.
What are the commonly listed advantages of declarative programming?
- Readability/usability: a DSL is usually closer to a natural language (like English) than to pseudocode, hence more readable and also easier to learn by non-programmers.
- Succinctness: much of the boilerplate is abstracted by the DSL, leaving less lines to do the same work.
- Reuse: it is easier to create code that can be used for different purposes; something that’s notoriously hard when using imperative constructs.
- Idempotence: you can work with end states and let the program figure it out for you. For example, through an upsert operation, you can either insert a row if it is not there, or modify it if it is already there, instead of writing code to deal with both cases.
- Error recovery: it is easy to specify a construct that will stop at the first error instead of having to add error listeners for every possible error. (If you’ve ever written three nested callbacks in node.js, you know what I mean.)
- Referential transparency: although this advantage is commonly associated with functional programming, it is actually valid for any approach that minimizes manual handling of state and relies on side effects.
- Commutativity: the possibility of expressing an end state without having to specify the actual order in which it will be implemented.
While the above are all commonly cited advantages of declarative programming, I would like to condense them into two qualities, which will serve as guiding principles when I propose an alternative approach.
- A high-level layer tailored to a specific domain: declarative programming creates a high-level layer using the information of the domain to which it applies. It is clear that if we’re dealing with databases, we want a set of operations for dealing with data. Most of the seven advantages above stem from the creation of a high-level layer that is precisely tailored to a specific problem domain.
- Poka-yoke (fool-proofness): a domain-tailored high-level layer hides the imperative details of the implementation. This means that you commit far fewer errors because the low-level details of the system are simply not accessible. This limitation eliminates many classes of errors from your code.
Two Problems With Declarative Programming
In the following two sections, I will present the two main problems of declarative programming: separatenessand lack of unfolding. Every critique needs its bogeyman, so I will use HTML templating systems as a concrete example of the shortcomings of declarative programming.
The Problem With DSLs: Separateness
Imagine that you need to write a web application with a non-trivial number of views. Hard coding these views into a set of HTML files is not an option because many components of these pages change.
The most straightforward solution, which is to generate HTML by concatenating strings, seems so horrible that you will quickly look for an alternative. The standard solution is to use a template system. Although there are different types of template systems, we will sidestep their differences for the purpose of this analysis. We can consider all of them to be similar in that the main mission of template systems is to provide an alternative to code that concatenates HTML strings using conditionals and loops, much like RDBs emerged as an alternative to code that looped through data records.
Let’s suppose we go with a standard templating system; you will encounter three sources of friction, which I will list in ascending order of importance. The first is that the template necessarily resides in a file separate from your code. Because the templating system uses a DSL, the syntax is different, so it cannot be in the same file. In simple projects, where file counts are low, the need to keep separate template files may duplicate or treble the amount of files.
I open an exception for Embedded Ruby templates (ERB), because those are integrated into Ruby source code. This is not the case for ERB-inspired tools written in other languages since those templates must also be stored as different files.
The second source of friction is that the DSL has its own syntax, one different from that of your programming language. Hence, modifying the DSL (let alone writing your own) is considerably harder. To go under the hood and change the tool, you need to learn about tokenizing and parsing, which is interesting and challenging, but hard. I happen to see this as a disadvantage.
You may ask, “Why on earth would you want to modify your tool? If you are doing a standard project, a well-written standard tool should fit the bill.” Maybe yes, maybe no.
A DSL never has the full power of a programming language. If it did, it wouldn’t be a DSL anymore, but rather a full programming language.
But isn’t that the whole point of a DSL? To not have the full power of a programming language available, so that we can achieve abstraction and eliminate most sources of bugs? Maybe, yes. However, most DSLs start simple and then gradually incorporate a growing number of the facilities of a programming language until, in fact, it becomes one. Template systems are a perfect example. Let’s see the standard features of template systems and how they correlate to programming language facilities:
- Replace text within a template: variable substitution.
- Repetition of a template: loops.
- Avoid printing a template if a condition is not met: conditionals.
- Partials: subroutines.
- Helpers: subroutines (the only difference with partials is that helpers can access the underlying programming language and let you out of the DSL straightjacket).
This argument, that a DSL is limited because it simultaneously covets and rejects the power of a programming language, is directly proportional to the extent that the features of the DSL are directly mappable to the features of a programming language. In the case of SQL, the argument is weak because most of the things SQL offers are nothing like what you find in a normal programming language. At the other end of the spectrum, we find template systems where virtually every feature is making the DSL converge towards BASIC.
Let’s now step back and contemplate these three quintessential sources of friction, summed up by the concept of separateness. Because it is separate, a DSL needs to be located on a separate file; it is harder to modify (and even harder to write your own), and (often, but not always) needs you to add, one by one, the features you miss from a real programming language.
Separateness is an inherent problem of any DSL, no matter how well designed.
We now turn to a second problem of declarative tools, which is widespread but not inherent.
Another Problem: Lack Of Unfolding Leads To Complexity
If I had written this article a few months ago, this section would have been named Most Declarative Tools Are #@!$#@! Complex But I Don’t Know Why. In the process of writing this article I found a better way of putting it: Most Declarative Tools Are Way More Complex Than They Need To Be. I will spend the rest of this section explaining why. To analyze the complexity of a tool, I propose a measure called the complexity gap. The complexity gap is the difference between solving a given problem with a tool versus solving it in the lower level (presumably, plain imperative code) that the tool intends to replace. When the former solution is more complex than the latter, we are in presence of the complexity gap. By more complex, I mean more lines of code, code that’s harder to read, harder to modify and harder to maintain, but not necessarily all of these at the same time.
Please note that we’re not comparing the lower level solution against the best possible tool, but rather against no tool. This echoes the medical principle of “First, do no harm”.
Signs of a tool with a large complexity gap are:
- Something that takes a few minutes to describe in rich detail in imperative terms will take hours to code using the tool, even when you know how to use the tool.
- You feel you are constantly working around the tool rather than with the tool.
- You are struggling to solve a straightforward problem that squarely belongs in the domain of the tool you are using, but the best Stack Overflow answer you find describes a workaround.
- When this very straightforward problem could be solved by a certain feature (which does not exist in the tool) and you see a Github issue in the library that features a long discussion of said feature with +1s interspersed.
- A chronic, itching, longing to ditch the tool and do the whole thing yourself inside a _ for- loop_.
I might have fallen prey to emotion here since template systems are not that complex, but this comparatively small complexity gap is not a merit of their design, but rather because the domain of applicability is quite simple (remember, we’re just generating HTML here). Whenever the same approach is used for a more complex domain (such as configuration management) the complexity gap may quickly turn your project into a quagmire.
That said, it is not necessarily unacceptable for a tool to be somewhat more complex than the lower level it intends to replace; if the tool yields code that is more readable, concise and correct, it can be worth it t. It’s an issue when the tool is several times more complex than the problem it replaces; this is flat-out unacceptable. Brian Kernighan famously stated that, “Controlling complexity is the essence of computer programming.” If a tool adds significant complexity to your project, why even use it?
The question is, why are some declarative tools so much more complex than they need be? I think it would be a mistake to blame it on poor design. Such a general explanation, a blanket ad-hominem attack on the authors of these tools, is not fair. There has to be a more accurate and enlightening explanation.
My contention is that any tool that offers a high level interface to abstract a lower level must unfold this higher level from the lower one. The concept of unfolding comes from Christopher Alexander’s magnum opus, The Nature of Order – in particular Volume II. It is (hopelessly) beyond the scope of this article (not to mention my understanding) to summarize the implications of this monumental work for software design; I believe its impact will be huge in years to come. It is also beyond this article to provide a rigorous definition of unfolding processes. I will use here the concept in a heuristic way.
An unfolding process is one that, in a stepwise fashion, creates further structure without negating the existing one. At every step, each change (or differentiation, to use Alexander’s term) remains in harmony with any previous structure, when previous structure is, simply, a crystallized sequence of past changes.
Interestingly enough, Unix is a great example of the unfolding of a higher level from a lower one. In Unix, two complex features of the operative system, batch jobs and coroutines (pipes), are simply extensions of basic commands. Because of certain fundamental design decisions, such as making everything a stream of bytes, the shell being a userland program and standard I/O files, Unix is able to provide these sophisticated features with minimal complexity.
To underline why these are excellent examples of unfolding, I would like to quote a few excerpts of a 1979 paper by Dennis Ritchie, one of the authors of Unix:
On batch jobs:
“… the new process control scheme instantly rendered some very valuable features trivial to implement; for example detached processes (with
&) and recursive use of the shell as a command. Most systems have to supply some sort of special
batch job submissionfacility and a special command interpreter for files distinct from the one used interactively.”
“The genius of the Unix pipeline is precisely that it is constructed from the very same commands used constantly in simplex fashion.”
This elegance and simplicity, I argue, comes from an unfolding process. Batch jobs and coroutines are unfolded from previous structures (commands run in a userland shell). I believe that because of the minimalist philosophy and limited resources of the team that created Unix, the system evolved stepwise, and as such, was able to incorporate advanced features without turning its back on to the basic ones because there weren’t enough resources to do otherwise.
In the absence of an unfolding process, the high level will be considerably more complex than necessary. In other words, the complexity of most declarative tools stem from the fact that their high level does not unfold from the low level they intend to replace.
This lack of unfoldance, if you forgive the neologism, is routinely justified by the necessity to shield the user from the lower level. This emphasis on poka-yoke (protecting the user from low level errors) comes at the expense of a large complexity gap that is self-defeating because the extra complexity will generate new classes of errors. To add insult to injury, these classes of errors have nothing to do with the problem domain but rather with the tool itself. We would not go too far if we describe these errors as iatrogenic.
Declarative templating tools, at least when applied to the task of generating HTML views, are an archetypical case of a high level that turns its back on the low level it intends to replace. How so? Because generating any non-trivial view requires logic, and templating systems, especially logic-less ones, banish logic through the main door and then smuggle some of it back through the cat door.
Note: An even weaker justification for a large complexity gap is when a tool is marketed as magic, or something that just works, the opaqueness of the low level is supposed to be an asset because a magic tool is always supposed to work without you understanding why or how. In my experience, the more magical a tool purports to be, the faster it transmutes my enthusiasm into frustration.
But what about the separation of concerns? Shouldn’t view and logic remain separate? The core mistake, here, is to put business logic and presentation logic in the same bag. Business logic certainly has no place in a template, but presentation logic exists nevertheless. Excluding logic from templates pushes presentation logic into the server where it is awkwardly accommodated. I owe the clear formulation of this point to Alexei Boronine, who makes an excellent case for it in this article.
My feeling is that roughly two thirds of the work of a template resides in its presentation logic, while the other third deals with generic issues such as concatenating strings, closing tags, escaping special characters, and so on. This is the two-faced low level nature of generating HTML views. Templating systems deal appropriately with the second half, but they don’t fare well with the first. Logic-less templates flat out turn their back on this problem, forcing you to solve it awkwardly. Other template systems suffer because they truly need to provide a non-trivial programming language so their users can actually write presentation logic.
To sum up; declarative templating tools suffer because:
- If they were to unfold from their problem domain, they would have to provide ways to generate logical patterns;
- A DSL that provides logic is not really a DSL, but a programming language. Note that other domains, like configuration management, also suffer from lack of “unfoldance.”
I would like to close the critique with an argument that is logically disconnected from the thread of this article, but deeply resonates with its emotional core: We have limited time to learn. Life is short, and on top of that, we need to work. In the face of our limitations, we need to spend our time learning things that will be useful and withstand time, even in the face of fast changing technology. That is why I exhort you to use tools that don’t just provide a solution but actually shed a bright light on the domain of its own applicability. RDBs teach you about data, and Unix teaches you about OS concepts, but with unsatisfactory tools that don’t unfold, I’ve always felt I was learning the intricacies of a sub-optimal solution while remaining in the dark about the nature of problem it intends to solve.
The heuristic I suggest you to consider is, value tools that illuminate their problem domain, instead of tools that obscure their problem domain behind purported features.
#codango #developer #development #coder #coding
We're happy to share this resource that we found. The content displayed on this page is property of it's original author and/or their organization.