Hachi

We do security research on Cardano


An Overview of Hachi's Security Toolkit for Plutus Core

Contents

Cardano uses validator scripts, also known as smart contracts on other blockchain platforms, to verify whether a new transaction should be accepted. Even though typically written in a high-level programming language like Plutus, the scripts are compiled to and stored on-chain in the form of Plutus Core (PLC).

Bugs in validator scripts can have catastrophic consequences, such as by allowing malicious users to wrongfully claim assets, or by preventing assets from ever being retrieved. Unfortunately, since we are in the early days of Cardano, there is a lack of security solutions. dApp developers have no reliable tools to detect and evaluate vulnerabilities in validator scripts.

For this reason, the Hachi team works in an effort to develop a security toolkit to analyse validator scripts on the Cardano platform. Concretely, our tools work with PLC, which brings the following benefits:

  • It guarantees that our tools will be compatible with different source languages of validators, be they Plutus or any other language that is introduced in the future.
  • We do not need to rely on the availability of validators’ source codes, which are not always made accessible by dApp developers.
  • This choice offers good precision as we work directly with what is deployed to the blockchain. In particular, this approach is resilient to bugs in the compilation pipeline as well as peculiarities in the source language.

Among several directions, we put our early focus on dynamic analysis techniques and developed new engines for symbolic execution, concolic execution, instrumentation, and hybrid fuzzing of PLC code. This article describes our work and some results of these exciting projects.

Transpiler

To make our development sustainable, we translate PLC programs to Racket with our plc2racket compiler. This allows us to re-use existing tools for Racket and build our tools upon them. As a result, we get instrumentation, symbolic/concrete evaluation, and debugging for (almost) free. Racket’s design as a powerful framework to develop other languages gives us flexibility that few other programming languages provide.

Symbolic Execution

Symbolic execution is a form of semantic analysis, for which program inputs are treated as symbols. Normal programs usually take concrete values like $0b10001$, $42$, or “hello world” as inputs. However, in symbolic execution, the inputs are abstract and are annotated with constraints indicating their domains such as $α,β,γ$ where $α ∈ \left\lbrace{0, 1}\right\rbrace^5$, $β ∈ \mathbb{Z}^+$, $γ ∈ \left\lbrace\text{“hi hachi”}, \text{“goodbye”}\right\rbrace$. A symbolic executor explores multiple execution traces and accumulates the constraints required to reach a particular outcome. For example, let us consider the following Racket program:

(define (f a b)
  (if (> a b)
      (+ b 3)
      (+ a b)))

It defines a function f with two parameters, a and b. When a is greater than b, f returns b + 3; otherwise, it returns a + b. Let us say that we want to find $α, β$ such that $f(α, β)$ returns 6. This is easily doable with symbolic execution:

(define-symbolic α integer?)
(define-symbolic β integer?)

(solve (= (f α β) 6))

The symbolic executor will visit both branches of the if statement inside f. Let $γ$ be the return value of $f$, then we end up with the following constraints where each argument of the disjunction corresponds to one of the two branches:

$$((α > β) ∧ (β + 3 = γ)) ∨ (¬(α > β) ∧ (α + β = γ))$$

Determining which values for $α, β ∈ \mathbb{Z}$ result in $\gamma = 6$ can now be accomplished by solving the above constraints in conjunction with $\gamma = 6$. We can either solve those constraints by hand or, even better, throw them into an SMT solver to extract the concrete values of $α$ and $β$.

In the case of Cardano validator scripts, we can use this technique to find the subset of inputs that are accepted by a given validator script. The interesting part is that we can add domain-specific constraints. For example, we can ask questions such as

“Can you give me valid inputs that the validator script accepts and that lead to assets being stolen by an attacker?”

If the script is faulty, then the symbolic engine will provide example inputs that we can use to construct the transaction which led to the erroneous result. If not, the SMT engine may produce proof verifying that the implementation follows the specification.

We built our framework named Medus which leverages Rosette to bring practical symbolic execution to Cardano validator scripts. As with all of our tools, Medus works with Untyped Plutus Core code. It provides functionality to easily define symbolic inputs for the script and properties about it. For example, consider the following data type that may be defined as part of a Plutus script:

data SaleDatum = SaleDatum
  { seller :: !PubKeyHash,
    price :: !Integer,
    cs :: !CurrencySymbol,
    tn :: !TokenName
  }

We can define symbolic inputs corresponding to this data type with the following Racket code:

(struct/plc 0
            SaleDatum
            ([seller ??bytestring]
             [price ??integer #:min 0]
             [cs ??bytestring]
             [tn ??bytestring]) #:transparent)
(define datum (??SaleDatum))

While Medus is still at an experimental stage, we managed to use this framework to test some real-world smart contracts, such as Martify, where we detected a Double Satisfaction vulnerability, and Meld’s Vesting, where we found that a trusted initial state will not lead to the same problem.

Concolic Execution

Concolic execution, also often called concolic testing or symbolic tracing, is a variant of symbolic execution that explores one execution trace at a time instead of all the possible traces. The name is a portmanteau of “concrete” and “symbolic”, since concolic execution is mix of both concrete and symbolic execution. The execution is concrete in that it takes a normal, concrete set of inputs and follows the one execution trace corresponding to those inputs. The execution is also symbolic in that the engine accumulates all the constraints discovered along the way, creating a set of constraints describing which other inputs would result in the same execution trace. Concretely, a concolic engine typically performs the following steps:

  1. The engine selects concrete inputs, either randomly, or guided by a user or another program. For example, for the function f from the example shown in the previous section, we might arbitrarily choose $0$ and $0$ for the two integer arguments a and b.
  2. With these concrete inputs, the engine executes the program concretely. This behaves like normal program execution, except that the engine records exactly which branches are being followed. For (f 0 0), the condition (> a b) would fail and execution would follow the else branch and return the result of (+ a b), which is $0$.
  3. The engine then gathers all the constraints encountered in the execution trace in question. For instance, while evaluating (f 0 0), the trace goes into the else branch of the if-then-else construct. The engine therefore records that the test (> a b) failed for those inputs. For efficiency reasons, this step is often done on-the-fly at the same time as step 2.
  4. Together, all these constraints define which other concrete inputs would follow exactly the same execution trace. The engine stores the trace somewhere or sends it to another program and starts again from step 1, except that it makes sure to not take an input that would lead to an execution trace that is already known. In the example above, this leads the engine to look for a and b such that (> a b) is true. For example, the engine could select $1$ for a and $0$ for b to satisfy this constraint, which indeed leads the engine to then explore the other branch of the if-then-else construct.

Eventually, a concolic engine might detect that there are no other inputs leading to new traces, meaning that all the possibilities have been found. Compared to symbolic execution, concolic execution is slower to discover all the traces, because it keeps restarting the execution from the beginning. However, it is always productive as it can return traces whenever it finds them. Moreover, it can easily be guided to find traces of particular interest to a user or another program.

At Hachi, we have developed Sekkou, a concolic execution framework for Racket. It is based on Rosette for the symbolic aspects and adds its own instrumentation and tooling on top of it. It is not specific to PLC, but works for all Racket programs.

Sekkou can easily be applied to PLC code, since we can compile PLC to Racket and use Medus to provides PLC’s standard library in Racket. Our experiments show that Sekkou can efficiently explore and enumerate the execution traces of real-world validator scripts, detecting exactly all the possible entries that lead a script to validate (or not). Sekkou can also be used in conjunction to another tool, as in the case of our hybrid fuzzer.

Hybrid Fuzzing

Fuzzing, also called fuzz testing, is one of the most productive, dynamic analysis techniques for finding software bugs. Fuzzing works by feeding randomly mutated input to a program, until an input is discovered which causes the program to reach an erroneous state, such as crash or hang. There are different types of fuzzers and we pay special attention to the greybox fuzzer, also known as coverage-guided fuzzer, which is both simple and highly effective: by tracking code coverage during program execution, a greybox fuzzer can generate and prioritise inputs that have a better chance of revealing new code paths, thus enabling quicker improvement of coverage.

Unfortunately, greybox fuzzers suffer from a fundamental shortcoming: they can easily get stuck, i.e. become unable to find new code paths when faced with a very specific condition. For example, given a condition such as if (x == 0xdeadbeef), the fuzzer would take a lot of time to generate the exact value of x required for the condition to succeed. We can overcome this issue with a hybrid fuzzer, which is a greybox fuzzer combined with concolic execution: whenever the fuzzing is stuck, the fuzzer calls the concolic execution engine in order to help find a new input that can get to the stuck branch. In other words, the majority of path exploration is offloaded to the greybox fuzzer, while the concolic engine is selectively used to drive execution across the paths that are guarded by tight constraints. Since we only explore a specific path in the concolic engine, this minimises the state explosion problem for the constraint solver. By leveraging the advantages of each approach, hybrid fuzzing can achieve better code coverage than applying greybox fuzzing or concolic execution independently.

To fuzz PLC code, we have built our own hybrid fuzzer named Hachifuzz:

  • We transpile PLC code to Racket using our plc2racket tool, then fuzz the Racket code.
  • For the greybox fuzzer, we reuse the well-known AFL++ fuzzer, which supports a good range of fuzzing mechanisms and optimisations.
  • The concolic engine is built on top of Sekkou, our concolic engine. This engine runs in parallel with AFL++. It periodically scans and detects “stuck” states and then tries to find new inputs to get the fuzzer past the relevant condition. The results are shared with the fuzzer, so that the fuzzing process can proceed with the newly discovered inputs.

Hachifuzz accomplishes the following:

  • We made AFL++ support Racket: AFL++ was originally designed to fuzz machine code, so we designed Hachifuzz on top of AFL++ to fuzz Racket code instead. To achieve this goal, we reused the fuzzer component of AFL++, but built our own client-side components, which generates coverage information, updates coverage to fuzzer, and sends feedback to classify fuzzing output, all for the Racket language.
  • Racket instrumentation: coverage information is an essential part of the greybox fuzzer. Hachifuzz uses transpiler to inject unique branch IDs into the output Racket code for reporting coverage. For better performance, we avoid pervasive instrumentation by only injecting coverage IDs into the conditional branches, thus minimising the execution overhead.
  • Concolic engine: for the concolic engine to work efficiently, it must know exactly where the code is stuck, i.e. at which conditional branch, so it can negate that condition. Hachifuzz handles this using code tracing together with the branch ID mechanism from the previous point to detect stuck points in the code.

Some early experiments show that Hachifuzz is effective at exploring code paths of PLC programs. We have been working to leverage this powerful tool to analyse and find software bugs in validator scripts from the Cardano blockchain, without having their original source code.

Conclusions

We have designed and implemented tools to analyse validator scripts in their PLC form. By combining symbolic execution, concolic execution, and hybrid fuzzing, we leverage the independent strengths of each technique for our toolkit.

Our goal is to make Cardano a safer place for dApp developers and users. This article gives an overview of our first steps towards detecting security bugs in validator scripts in order to realise our goal. Stay tuned for more updates of our work. In particular, we will introduce each of these tools in more detail in their own article in the near future.

Hachi is a research effort in security, programming language theory, and formal methods on the Cardano blockchain. Follow our blog at blog.hachi.one for frequently updated content in this exciting field.