The Current State of Knowledge About Zero Knowledge. Comprehensive Private Token Overview - Part 3
Mon, Jan 29, 2024 •10 min read
Category: Code stories
This post is Part 3 of our three-part series dedicated to ZKs and Private Tokens. Click here for Part 1, and Part 2.
Part 3: Return of the Hash
Theory vs practice. Expectations vs reality
In the previous part of the series, we delved into seemingly unrelated topics about blockchain, zero-knowledge proofs, and privacy. Now, let’s transition to practical considerations and explore the tools available for developers.
Undoubtedly, the creation of any kind of ZK technology needs ZK Proof. How can we create it? The core element in this process is known as a circuit, often referring to an arithmetic circuit—an ordered collection of operations (such as addition, multiplication) represented by gates. In the realm of ZK-circuits, this translates to a program specifying a calculation to be performed on certain data inputs, used by the prover to generate a proof of knowledge. Thankfully we’re not forced to create every complex circuit from logic gates (AND, OR etc), but instead can use dedicated programming languages. Moreover, it's possible to validate the proof on-chain, using smart contracts deployed on platforms like Ethereum. Let's explore the available options:
- circom by iden3: designed for ZK-circuit development, circom provides programmers with a powerful and precise Domain-Specific Language (DSL). However, newcomers may find it challenging to grasp the syntax and semantics underlying the mathematical equations.
- Noir by Aztec: a language for Aztec, a privacy-focused Layer 2 for Ethereum, heavily inspired by Rust. Noir is designed to be user-friendly for developers without a ZK background.
- Cairo by Starkware: another language dedicated to Layer 2, Cairo is a relatively low-level "intermediate" language known for its efficiency and scalability.
- Zokrates: while not a language per se, Zokrates is a toolbox for zkSNARKs. It simplifies the complexity of ZK-circuits, aiding in generating proofs in high-level languages and verifying them through Solidity smart contracts.
Other noteworthy languages include:
As evident, several options are already available, with more on the horizon! While developing on Layer 2 platforms like Starknet has dedicated languages, addressing the Ethereum Mainnet introduces intricacies due to subtle yet consequential differences between ZK and EVM elliptic curves. Let’s explain them now.
Elliptic curves and straightforward hashes
Once again, in this series of articles, I don't intend to delve too deeply into this complex theme, as covering everything comprehensively would require another series of papers. Numerous excellent resources on this subject can be found on the internet. So, let's keep it as simple as possible.
Blockchains like Bitcoin or Ethereum employ hashes as account addresses, which is obvious. However, not everyone understands how these addresses are created. It involves something like private and public keys, a touch of cryptography (or, as some might say, magic), and voila! Let's take a closer look.
In mathematics, there's a concept known as elliptic curves. While these curves can take various forms, they are essentially a set of points that satisfy a specific mathematical equation, such as:
y2 = x3 + ax + b.
Utilizing the cryptographic system known as ECDSA (Elliptic Curve Digital Signature Algorithm), it becomes possible to obtain a point on the elliptic curve (public key) through the multiplication of two numbers. One of these numbers represents the private key. The trick lies in the fact that calculating the public key from a known number (private key) is extremely fast, yet attempting to deduce the number (private key) from a given point (public key) is incredibly challenging. In essence, the public key can be derived through a one-way function from the private key.
To generate a public address for your account, simply append 0x to the beginning of the last 20 bytes of the Keccak-256 hash of the public key.
The curve employed in ECDSA is named the secp256k1 curve, and what holds significance for us is that operations on this curve entail arithmetic with 256-bit numbers. Why is this aspect crucial?
Zero-Knowledge Proofs and EVM Relationship Status: It’s complicated
Recall zkSNARKs, the proving system integral to Zero-Knowledge Proofs? zkSNARKs rely on elliptic curve cryptography, which permits operations only on numbers restricted to 254 bits. In summary, the numbers utilized on Ethereum's EVM surpass the 254-bit limits allowed in zkSNARK systems. Consequently, to implement ECDSA algorithms within ZK circuits, we must resort to 'non-native' field arithmetic.
This presents one of the most significant challenges in ensuring a smooth interaction between ZKP and EVM. While it is achievable, there's always a trade-off: time or money. A noteworthy example is the commendable work by the 0xPARK Foundation, which presented an implementation of an ECDSA circuit in the circom language, capable of generating zkSNARKs. Although it is possible to prove ownership of an account, there is still much to be done, as the performance falls short of expectations. The circom environment is resource-constrained, making proofs expensive and time-consuming.
Nevertheless, ongoing efforts have resulted in an improved approach to verifying ECDSA signatures in ZK. Spartan-ecdsa, introduced by Personae Labs, shows promise. It’s still a work in progress, with challenges like poor Keccak performance and excessively large proofs for on-chain verification via smart contracts still needing to be addressed.
There is always hope
Let's pause for a moment and contemplate our goals and available options. The objective is to validate the proof (a 254-bit system) on-chain, which is a 256-bit system. Two potential approaches come to mind:
- Convert 256-bit numbers to 254-bit ones, as previously discussed. However, this option, as assumed, is currently impractical.
- Convert 254-bit numbers to 256-bit ones, which we will dive into below.
Given that numerous encryption and description operations need to occur within zkSNARKs, necessitating 254-bit numbers for easy arithmetic, we can explore the utilization of another elliptic curve known as Baby Jubjub. Specifically designed to function efficiently within zkSNARKs on Ethereum, the ERC-2494 was introduced in 2020. Although it is no longer active, it remains a compelling and intriguing concept worth considering.
A Bitcoin Orchard: Merkle Trees
In the realm of Zero-Knowledge Proofs (ZKP) on the Ethereum Virtual Machine (EVM), or more broadly, whenever we navigate the blockchain landscape, a critical consideration emerges: data storage. While the ability to create and verify proofs is commendable, it becomes even more enchanting when we can efficiently store those proofs on the blockchain. A widely adopted solution, employed, for instance, by Tornado Cash, is the Merkle Tree. This data structure proves invaluable for privacy-centric projects by securely storing hashed (encrypted) data fragments while enabling the verification of specific data without exposing the entirety of the dataset. Let's once again refer to the diagram for a clearer illustration.
At its core, a Merkle Tree is a data structure crafted from encrypted (hashed) data fragments. Illustrated in the diagram, each pair undergoes a hashing process, generating the next "leaf" (node). This paired hashing repeats, ascending the tree until culminating in the top hash known as the Merkle Root. This ingenious structure facilitates the proof of data existence within the tree without necessitating the storage of the entire dataset. For an in-depth exploration, delve into this detailed explanation.
Merkle Trees works great for the UTXO Model, because the entire state can be stored with two Merkle Trees:
- a Note Tree - stores all system data, including smart contracts and asset ownership;
- a Nullifier Tree - stores data about spent (used) UTXOs.
Before using a UTXO, it's checked in the Nullifier Tree to ensure it hasn't been used. If all's clear, changes can be made in the Note Tree.
Encrypt. Verify. Repeat. zkSNARK
Understanding Merkle Trees is vital for comparing SNARKs on various circuits, especially if the criterion is hash verifying efficiency. The method of creating hashes within the Merkle Tree, however, is far from unequivocal; it hinges on factors like numerical systems (e.g., 256- / 254-bit), arithmetic complexity, and cryptographic security. The most prominent ones among many available hash functions are:
- MiMC - designed for very compact circuits with minimal amount of multiplications. Great for the tiniest circuits and as the base for more complicated algorithms (e.g. GMiMC),
- Poseidon - an enhanced version of MiMC with improved security, well-suited for scenarios requiring a collision-resistant, secure, and SNARKs-friendly algorithm,
- Pedersen - extremely useful and efficient, for e.g. homomorphic encryption, which enables solving some zk-security problems (like confidential payments) and building Merkle proofs. On the other hand, this can lead to significant security vulnerabilities.
Segueing back to zkSNARKs protocols, two prominent players — PLONK and Groth16 — take the spotlight. For a detailed exploration of their nuances and functionalities, dive into this article. In a nutshell, both Groth16 and PLONK, are not quantum-resistant, and exhibit distinct characteristics: Groth16 boasts a generally smaller proof size, whereas PLONK sidesteps the need for recurring trusted setup ceremonies.
Private Token: Proof of Concept
Let’s see what the Private Token system could look like, using ZKP and EVM blockchain. Private Token is deployed as an EVM smart contract on a ZK-rollup (L2) and uses zkSNARKs to verify (on-chain via smart contract) transfer correctness and keep privacy at the same time. It is designed with the UTXO Model and uses Merkle Tree for membership proof. The Private Token contract maps UTXOs with commitments, having a structure defined by owner address, value, and unique ID. The ZK Circuit, written in circom language, verifies input and output UTXO values and sender ownership. As a result, we get a new UTXO commitment, in the form of data hashed with e.g. Pedersen Hash Function. If valid, a new commitment is added to the Merkle Tree using the Baby Jubjub curve (254-bit) for an efficient proof system. UTXOs are stored on Private Token contract in an encrypted format and thanks to the Partially Homomorphic Encryption can be decrypted and read only by the owner of the account.
Simple, right? Well, if you're still a bit dazzled, no worries – I'm not grading anyone on their zero-knowledge prowess. Stick around, and we'll unravel more in the next posts!