Hachi
We do security research on Cardano
overzZzealous: the Peculiar Semantics of || and && in Plutus Core
Contents
TL;DR: We identified an issue in the semantics of Plutus Core, which we refer to as overzZzealous. This issue may slow down and raise the cost of evaluating code on the Cardano blockchain, or even cause validators to reject transactions and thus block assets. Since developers rarely engage with Plutus Core directly, most are oblivious to overzZzealous. Furthermore, as the problem comes from the design of Plutus Core, it is challenging to address. It has been reported to the Plutus team.
Introduction
One never-ending struggle in computing is to guarantee that code behaves as intended. Developers want to write programs whose behaviour aligns with their intentions and want interpreters/compilers to yield predictable results. In a well-designed and well-documented language, it is harder to make mistakes: either because the behaviour of the language is predictable and consistent with our expectations or because the type system prevents some classes of bugs from happening. If one of these points does not hold and the code does not behave as intended, we find ourselves stuck with a bug whose severity can range from mild to catastrophic.
In the Cardano environment, the language designers pay attention to the above points, which is extremely important as the pieces of code that are being written are meant to run on the blockchain, where they can manipulate all sorts of data, including that with monetary value. For instance, the Plutus language for writing off- and on-chain code is based on Haskell, a mature, strongly-typed, lazy, functional language with a rich ecosystem. However, contrary to Haskell and Plutus, the on-chain representation in Plutus Core – which Plutus compiles to – has strict semantics with the aim of making it more predictable for programmers.
Unfortunately, the choice of strict semantics does not come without problems.
These inevitably find their way into the Cardano codebase in various places.
This article shows that combining Haskell with strict semantics can have
unexpected and potentially dangerous consequences. We discovered that ||
and
&&
have strict semantics in Plutus Core, even though these operators typically
have lazy semantics even in programming languages that are otherwise strict. We
name this peculiar behaviour overzZzealous.
(See also our report of this behaviour to the Plutus team.)
The rest of this blog post consists of three parts. In the first part, we
discuss strict and lazy semantics of ||
and &&
in several programming
languages, bring the design of Plutus Core into context, and finally show how
overzZzealous manifests itself. In the second part, we guide
the reader step-by-step into reproducing and exhibiting this peculiar behaviour.
Finally, in the third part, we discuss whether this is a problem or not
(spoiler: in our opinion, it is).
About (Non-)Laziness of Boolean Operators
Lazy Operators and Strict Function Application
Programming languages usually include boolean operators “or” and “and”, often
written as ||
and &&
respectively. For instance, we can express that a
number n
is either smaller than 10
or bigger than 100
and divisible by 3
(ie. equal to 0
modulo 3
) with:
(n < 10 || n > 100) && (n mod 3 == 0)
In most programming languages (eg. C, Java, OCaml), these operators are lazy.
This means that they first evaluate their left-hand side and then evaluate
their right-hand side only if it is necessary. For instance, in n < 10 || n > 100
, the sub-expression n > 100
is evaluated only if n < 10
evaluated to
false
. In other words, a || b
behaves exactly like if a then true else b
.
Meanwhile, a && b
behaves exactly like if a then b else false
where b
is
evaluated only if a
evaluates to true
.
This behaviour differs from strict function application, where we evaluate all the arguments first. This is known as the strict semantics of function application. For instance, in OCaml, if we were to define our own “or” and “and” functions and use them to rewrite the above expression, such as
let my_or a b = if a then true else b
and my_and a b = if a then b else false
in my_and (my_or (n < 10) (n > 100)) (n mod 3 == 0)
then all the arguments (n < 10
, n > 100
and n mod 3 == 0
) are always
evaluated. In general, strict languages differentiate between normal functions and
operators such as ||
and &&
, whose semantics cannot be reproduced easily
with functions.
Haskell and Off-chain Plutus’ Pervasive Laziness
Unlike OCaml, Haskell features lazy semantics of function application. Arguments of a function call are only evaluated if the body of the function uses them. Consider the following Haskell code:
let my_or a b = if a then True else b
my_and a b = if a then b else False
in my_and (my_or (n < 10) (n > 100)) (n `mod` 3 == 0)
In the program above, we only evaluate n > 100
if n < 10
evaluates to False
. In such a
context, lazy ||
and &&
are not special in any way: they are just normal
functions in a world where function application is always lazy.
When off-chain (that is, when running anywhere else other than on the Cardano
blockchain), Plutus is simply Haskell with a Cardano-specific standard library.
Therefore, it shares the lazy semantics of function application. In fact,
||
and &&
are defined as ordinary functions in Plutus’ standard
library. For example, ||
is defined as:
(||) :: Bool -> Bool -> Bool
(||) l r = if l then True else r
On-chain Plutus’ Peculiar Semantics
Although on-chain Plutus shares the same syntax as Plutus and Haskell, its compilation pipeline, design, and semantics are entirely different from off-chain code. Two essential design choices matter for us in relation to overzZzealous:
- On-chain Plutus is built on top of Haskell, allowing it to enjoy Haskell’s mature compiler and rich ecosystem, particularly Haskell’s syntax and type system, which have been carefully refined over the years.
- On-chain Plutus is a language with strict function application. This is motivated by the fact that it allows for better predictability of what is going to be computed and how, an essential property for on-chain code.
However, together, these two reasonable design choices combine into an
unfortunate mixture: ||
and &&
, as in off-chain Plutus, are functions
and function application is strict, meaning that ||
and &&
will always
evaluate both their arguments. This behaviour can come as a confusing feature
as the usual understanding of these two operators (or functions depending on the
language) is that they are lazy.
This behaviour surprised us, and we believe this will surprise many other developers. Introducing unexpected behaviour in their programs has these two critical effects:
- Since programmers assume that
||
and&&
are lazy, they will not expect that their program becomes overzealous and runs as much code by constantly evaluating the right-hand sides of boolean operators. For on-chain code that runs in an environment with limited resources and where processing costs gas/fee/money, this can become expensive. - When the right-hand side includes side effects, such as throwing an error in
the relatively common
conditionMet || error ()
pattern, this can completely change the behaviour of a piece of code. For example, it may cause a validator to reject supposedly-valid transactions and potentially block assets.
Demonstrating overzZzealous
We want to demonstrate that overzZzealous can occur in a trustworthy environment. For that, we chose to reproduce the issue within the Plutus Starter project.
Cloning the Repository
Clone our fork of the Plutus Starter project. It takes the
main
branch of the upstream project, as of commit 48ab4d5
(October 12,
2021), and adds one commit updating the README
and adding two Shell scripts to
help reproduce the issue.
Playing the Game
As a first step, let us follow the Plutus Application Backend (PAB) example
described in its README. It describes an example
application with which “we can serve and interact with contracts over a web
API”. By default, the PAB is configured with a guessing game contract named
Game
, located at ./examples/src/Plutus/Contracts/Game.hs
.
Let us build and run the PAB executable with cabal run plutus-starter-pab
.
Once this is done, we can play the game with the Shell script
./play-game.sh
. As it provides a wrong guess by default, the
last line of the PAB executable’s output should look like this:
[WARNING] W8e8fdb9: Validation error: Phase2 4f530b2a2d96fa34e6b1393fca5b661eb36b6a83c5ec271d4d1316f1a329586b: ScriptFailure (EvaluationError ["PT5"] "CekEvaluationFailure")
Now let us restart the PAB executable: we quit the previous execution with
<enter>
and run cabal run plutus-starter-pab
again. Once this is done, we
can play the game again with the correct answer this time. This can
be done by running ./play-game.sh
with the --guess-right
argument. This
time, the last four lines of the PAB executable’s output should look like the
following:
[INFO] W162f384: Finished balancing. fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411
[INFO] W162f384: Submitting tx: fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411
[INFO] 25456589-c00f-4f3a-8b14-d8ce089e7d5f: "Waiting for guess or lock endpoint..."
[INFO] Slot 26: TxnValidate fd03c93cec942edbc2f2547d33a7a3dc45551230638bcb1fe9fb27082d087411
If everything works smoothly, we are ready to change the code of the on-chain
validator to include ||
and &&
and exhibit overzZzealous.
Changing the Semantics?
To expose this issue, let us consider the following lines from the Game
contract in ./examples/src/Plutus/Contracts/Game.hs
:
gameInstance :: Scripts.TypedValidator Game
gameInstance = Scripts.mkTypedValidator @Game
$$(PlutusTx.compile [|| validateGuess ||])
$$(PlutusTx.compile [|| wrap ||]) where
wrap = Scripts.wrapValidator @HashedString @ClearString
-- [...] --
-- | The validation function (Datum -> Redeemer -> ScriptContext -> Bool)
validateGuess :: HashedString -> ClearString -> ScriptContext -> Bool
validateGuess hs cs _ = isGoodGuess hs cs
isGoodGuess :: HashedString -> ClearString -> Bool
isGoodGuess (HashedString actual) (ClearString guess') = actual == sha2_256 guess'
In particular, we can see that validateGuess
is meant to be run on-chain
because it shows up as an argument to PlutusTx.compile
. For now, it only calls
the auxiliary function isGoodGuess
.
Let us change that slightly to something that should have the same semantics
but uses the ||
operator. This can be achieved by calling the Shell
script ./modify-code.sh
with the argument --with-or
. It makes the following modification to the source code:
diff --git a/examples/src/Plutus/Contracts/Game.hs b/examples/src/Plutus/Contracts/Game.hs
index 74625d2..b639356 100644
--- a/examples/src/Plutus/Contracts/Game.hs
+++ b/examples/src/Plutus/Contracts/Game.hs
@@ -94,7 +94,7 @@ clearString = ClearString . toBuiltin . C.pack
-- | The validation function (Datum -> Redeemer -> ScriptContext -> Bool)
validateGuess :: HashedString -> ClearString -> ScriptContext -> Bool
-validateGuess hs cs _ = isGoodGuess hs cs
+validateGuess hs cs _ = if True || error () then isGoodGuess hs cs else True
isGoodGuess :: HashedString -> ClearString -> Bool
isGoodGuess (HashedString actual) (ClearString guess') = actual == sha2_256 guess'
If the semantics of ||
were lazy, the expression True || error ()
should
evaluate to True
without ever evaluating the error ()
on the right-hand
side. Therefore, this whole expression should always call isGoodGuess hs cs
in
the same way as before.
Playing Again and— Wait What?
Let us then play the game again. We restart the PAB executable, and we run
./play-game.sh
again. This time, regardless of whether we provide the right or wrong
answer, the transaction cannot be validated! We always get a validation error
like the one below:
[WARNING] W8e8fdb9: Validation error: Phase2 4f530b2a2d96fa34e6b1393fca5b661eb36b6a83c5ec271d4d1316f1a329586b: ScriptFailure (EvaluationError ["PT5"] "CekEvaluationFailure")
What happened here? Contrary to our expectation, the ||
is not evaluated
lazily: both sides are always evaluated, no matter what. Since the right-hand
side calls error ()
, the validator fails systematically instead of checking
whether the guess is right or wrong.
Is This an Issue?
We firmly believe that it is. Let us explore why in more detail.
Semantic Side-Effects
Firstly, overzZzealous may introduce bugs when in the presence of side
effects. In pure code, there is no issue (other than a performance issue) since
both lazy and strict evaluations produce the same results. However, code with
side-effects such as trueCondition || error ()
would invalidate transactions
that are meant to be valid. In reality, the bug may not be as apparent as in our
example, as the side effect may be hidden deep inside a complex source code. A
non-exhaustive test suite might fail to detect it, leading to forever-locked
assets once the code gets deployed on-chain.
Performance
Secondly, on a blockchain such as Cardano, where evaluating code costs
gas/fee/money, this issue would lead to users spending more fees and time
evaluating unnecessary code. Most of the code we have seen deployed on the chain
makes use of explicit case-expressions, but the cheapCheck || expensiveChecks
pattern is still quite popular.
We have found an instance of this pattern in plutus-use-cases
where the
following innocent-looking code requires more resources than
an unnatural-looking if-then-else.
-- || is surprisingly way more expensive!
isUpdateValid = (not isCurrentValueSigned) ||
(fromMaybe False $ validateGameStatusChanges <$>
(osmGameStatus <$> extractSigendMessage (ovSignedMessage oracleData)) <*>
(osmGameStatus <$> extractSigendMessage outputSignedMessage))
-- if-then-else is way cheaper!
isUpdateValid = if not isCurrentValueSigned then True else
(fromMaybe False $ validateGameStatusChanges <$>
(osmGameStatus <$> extractSigendMessage (ovSignedMessage oracleData)) <*>
(osmGameStatus <$> extractSigendMessage outputSignedMessage))
We calculated the required budget using writeScriptsTo
as documented in the
official how-to.
writeScriptsTo
(ScriptsConfig "." (Scripts UnappliedValidators))
"updateOracleTrace"
Spec.Oracle.updateOracleTrace
def
The difference is rather significant, as shown in the following table. “Size” represents the size of the script in bytes; “CPU” and “Memory” are made-up units whose actual value is up to the ledger to decide.
With | Size | CPU | Memory |
---|---|---|---|
|| | 9,539 | 1,124,168,956 | 3,152,700 |
if-then-else | 3,531 | 698,345,767 | 2,069,978 |
When the cheap check is satisfied, the strict ||
still evaluates the expensive
right-hand side, which yields a much larger script and requires over 1.5 times
more resources in terms of both, CPU and memory consumption, than a short-circuit
solution with if-then-else
. This overzealous behaviour might be extremely
costly and wasteful if a frequently updated oracle falls into this case. To
reproduce our findings, one can run the experiment in this
branch to find out more.
Confusing Behaviour
The two previous issues might seem unproblematic considering that programmers
“just” have to know about overzZzealous to avoid it. However, we believe that
the strict semantics will confuse programmers. While we do not necessarily think
that short-circuit operators are superior, ||
and &&
simply behave so in
most popular languages. The most popular linter for Haskell, hlint
, even
suggests to replace if-then-else with ||
when possible:
src/Contracts/Oracle/OnChain.hs:(136,21)-(139,69): Warning: Redundant if
Found:
if not isCurrentValueSigned then
True
else
(fromMaybe False
$ validateGameStatusChanges
<$>
(osmGameStatus
<$> extractSigendMessage (ovSignedMessage oracleData))
<*> (osmGameStatus <$> extractSigendMessage outputSignedMessage))
Perhaps:
not isCurrentValueSigned
||
(fromMaybe False
$ validateGameStatusChanges
<$>
(osmGameStatus
<$> extractSigendMessage (ovSignedMessage oracleData))
<*> (osmGameStatus <$> extractSigendMessage outputSignedMessage))
A contract may have on-chain and off-chain code located close to each other. It
is worrying how two occurences of ||
or &&
just a few lines apart in the same module can
have different semantics. This is a disaster waiting to happen. &&
is even
more common than ||
with patterns such as con1 && con2 && con3
. Even though
most APIs would invalidate and refuse to submit the transaction off-chain for an
early False
, they are still wasteful calculations that could be saved. One
can never save enough resource in blockchain tech.
At Hachi, we only discovered this surprising behaviour while developing a concolic execution engine for Plutus Core. The community’s responses on our report to the Plutus team suggest that others are also surprised. We also received several private complaints from other developers in the ecosystem. We genuinely empathise with the people who fell into this trap and had to rewrite their code less idiomatically.
Solutions
We observed two kinds of mitigations to this issue which dApp developers currently employ: some
rewrite their usage of ||
and &&
as if-then-else to recover the intended
semantics. Many others write their own ad-hoc boolean functions and lazy lists
in order to emulate the behaviour which they expected from these operators. It
is unpleasant for the people who know of this behaviour. It is probably worse
for people who do not as they may only notice the issues once their scripts have
already been deployed to the blockchain.
Since the problem comes from the design of Plutus Core, it is not easy to fix. We have the following proposals to remedy this issue:
- Introduce a semantic change to add lazy
||
and&&
to Plutus Core. A simple fix is to compile them toif-then-else
. Of course, this implies breaking backwards compatibility. - Keep the current semantics for
||
and&&
, but add new operators such as|
and&
with lazy semantics. Document the differences between the two variants; it should hopefully catch more attention with two variants. - Remind dApp developers to write tests to catch this issue.
As security researchers who cheer for both sides, we feel it is better to implement ad-hoc changes on the compiler (such as making two particular symbols operators instead of handling them in a generic way as functions) for better UX and performance for dApp developers. The compilation pipeline is non-trivial, but the language itself is not vast. We can write more documentation and refactor code more freely while deployed scripts stick. A small compiler optimisation might translate to a considerable amount of ADA for long-lived and high-traffic dApps! Most dApp developers would not work close enough to the Plutus Core level to face this complexity.
Whatever the resolution to this issue may be, we believe that this peculiar behaviour should be documented and communicated better to dApp developers.
Conclusions
This blog post presents overzZzealous, a peculiarity in the semantics of
Plutus Core, the language of the Cardano blockchain. overzZzealous is simply
the fact that ||
and &&
are not lazy, which can lead to slower and more
expensive (in terms of gas/fee/money) evaluation for on-chain code. In some cases,
it can also lead to validators rejecting transactions that are supposedly correct,
potentially blocking assets.
overzZzealous came as a surprise to us and surprised several other
developers. We believe it will keep surprising people for as long as the
Cardano blockchain exists. At Hachi, we will continue
studying the raw on-chain scripts that use ||
, &&
and, more generally, how
strict semantics affect side-effect usage in scripts from the chain index.
Hachi is a research effort in security, programming language theory, and formal methods on the Cardano blockchain. Follow our blog at blog.hachi.one for frequently updated content in this exciting field.