This article is part of the MoonBlokz series, which focuses on creating a hyper-local blockchain using microcontrollers and radio communication. You can find the previous articles here:
Cryptographic algorithms are the fundamental building blocks of blockchains. Every aspect of a blockchain, including transactions, consensus algorithms, and wallets, relies on crypto components. Numerous cryptographic algorithms are available, each with unique strengths & weaknesses. In the case of MoonBlokz, the network’s minimalist design offers certain advantages but also imposes limitations.
Elliptic curves
Before designing the MoonBlokz crypto algorithms, I researched popular blockchains and found that most use some form of elliptic curve algorithms. I dug deeper into elliptical curves and found an excellent article by Andre Corbellini:
https://andrea.corbellini.name/2015/05/17/elliptic-curve-cryptography-a-gentle-introduction/
Before analyzing possible crypto algorithms, I collected the MoonBlokz crypto requirements. These will serve as my guidelines for identifying the most effective algorithm.
Signatures
Digital signatures are crucial in blockchains as tools for authentication, authorization, and non-repudiation. The initiator of each transaction signs it, and each block is signed by its creator. Furthermore, during the approval process, evidence blocks include multiple signatures. Managing digital signatures is the primary cryptographic feature we require, along with some necessary additions, such as hashing.
Now let’s see what we need about signatures:
It is difficult to assess the performance without a reference point. To guide my evaluation, I selected a reference HW configuration. I checked several microcontrollers and ended with an RP2040-based board with an integrated LoRa radio (https://www.waveshare.com/rp2040-lora.htm). Technically, it is a Raspberry Pi PICO 1.
With these capabilities, my nodes can remain compact without requiring any additional hardware components.
The device from Waveshare has some drawbacks. The 2MB flash memory is quite limited for high-capacity blockchain applications, and the LoRa radio module covers the debug port, so I have to work without debugging capabilities. This is a significant limitation. To test my code, I purchased an additional RP2040 that does not include LoRa but does have a debug port.
RP2350 (the successor to RP2040) can also be an option, offering a faster processor; however, I haven’t found an RP2350-based setup with an integrated LoRa radio.
From now on, I will accept solutions that run well on my RP2040 board, and I hope to reproduce them with different hardware setups in the future.
After reviewing the commonly used algorithms, I narrowed my focus to two for deeper inspection: BLS and Schnorr.
BLS (Boneh-Lynn-Shacham) signatures are a type of cryptographic digital signature scheme known for their short signatures and the unique property of signature aggregation.
Schnorr is a digital signature scheme known for its simplicity, security, and efficiency. It generates short signatures. While patented for a long time, its patent expiration has led to increased adoption, including its integration into Bitcoin through the Taproot upgrade.
Ethereum utilizes BLS in its beacon chain, a separate chain for coordinating the consensus mechanism. On the other hand, Bitcoin uses Schnorr. Both algorithms are based on elliptic curves.
The key aspects of BLS and Schnorr are:
┌──────────────────┬────────────────┬──────────────┬────────────────────┐
│ │ │Signature Size│Aggregated Signature│
│ Algorithm │Public Key Size │ (bytes) │ Size (bytes)* │
├──────────────────┼────────────────┼──────────────┼────────────────────┤
│ BLS │ 96 │ 48 │ 48+2 │
├──────────────────┼────────────────┼──────────────┼────────────────────┤
│ Schnorr │ 32 │ 64 │ n*32+32+2 │
└──────────────────┴────────────────┴──────────────┴────────────────────┘
*n is the number of aggregated signatures. This formula does not contain the
node identifiers contributing to the aggregated signature. It adds an extra
n*4 bytes for both cases. For BLS, I only need to store all node identifiers
and one signature. In contrast, I must store both node identifiers and public
nonces for every node and the aggregated signature for Schnorr.
From an algorithmic perspective, BLS would be the ideal choice for MoonBlokz. It offers the smallest signature size, with 96-byte public keys and 48-byte signatures. Additionally, it supports non-interactive aggregation, enabling the efficient storage of evidence blocks. Everything seemed perfect; however, when I tested the BLS libraries, I encountered performance issues on the reference hardware. I conducted benchmarks with multiple options and summarized my findings in the following table:
┌─────────────────┬───────┬──────────────┬──────────────┬──────────────┐
│ │ │ Test Project │ Test Project │ Test Project │
│ │ │ UF2 size │ signing time │ verification │
│ Solution │ Alg. │ (Kbytes) │ (secs) │ time (secs) │
├─────────────────┼───────┼──────────────┼──────────────┼──────────────┤
│ │ │ │ │ │
│bls12_381-bls │ BLS │ 220│ 4.6│ 13.1│
│ │ │ │ │ │
├─────────────────┼───────┼──────────────┼──────────────┼──────────────┤
│ │ │ │ │ │
│arkworks │ BLS │ 267│ 7.5│ 17│
│ │ │ │ │ │
├─────────────────┼───────┼──────────────┼──────────────┼──────────────┤
│Own Schnorr │ │ │ │ │
│implementation │Schnorr│ 118│ 11.6│ 13.5│
│(num-bigint) │ │ │ │ │
├─────────────────┼───────┼──────────────┼──────────────┼──────────────┤
│Own Schnorr │ │ │ │ │
│implementation │Schnorr│ 123│ 2.7│ 5.4│
│(num-bigint-dig) │ │ │ │ │
├─────────────────┼───────┼──────────────┼──────────────┼──────────────┤
│Own Schnorr │ │ │ │ │
│implementation │Schnorr│ 576│ 0.7│ 1.4│
│(malachite) │ │ │ │ │
└─────────────────┴───────┴──────────────┴──────────────┴──────────────┘
To validate a block, all transactions within it must be verified and confirmed. Typically, a block can contain around 20 node transfers. We can optimize this process by approximately 50% through batch verification. However, even with this optimization, the fastest BLS implementation I found still takes about two and a half minutes to validate a block, which is unacceptable for our network.
I have chosen the second option. Currently, I am using the Malachite big integer library. Although it produces large binaries, my reference hardware can accommodate them. If necessary, I can later optimize Malachite by removing any algorithm variations that are irrelevant to the specific numbers I use. Additionally, I ensure that the cryptographic algorithms are easily replaceable with other Schnorr and BLS implementations. By finding common abstractions for these two libraries, I can introduce different types of algorithms later without needing to modify other parts of the code.
There is a critical consideration regarding Malachite: its licensing. It is licensed under LGPL 3.0, which requires that I allow end-users to modify and relink the LGPL library. Since Rust does not support dynamic linking, MoonBlokz end-users must be able to recompile the whole library. Fortunately, this is not an issue for MoonBlokz, as I previously mentioned that I plan to distribute it in source code form.
We are now at the starting line. Let’s begin coding! I’m adding some code snippets to this article for illustration, but I’ve also uploaded the entire source code of the moonblokz-crypto-lib to GitHub. You can find the link at the end of this article.
I decided to develop a standalone library for MoonBlokz’s crypto module, enabling it to operate independently of the main framework. I chose the name moonblokz-crypto-lib for this module. While the name may lack creativity, it is simple and logical.
First, I gather the requirements:
Code structure
The primary question regarding this library is how to handle the varying data sizes of different implementations. I cannot simply use traits as return types due to the varying sizes. I want to minimize heap usage as much as possible, so I also cannot use Vec or other dynamic types. While I can use generics, this approach requires me to parametrize the embedding project with numerous data types, such as signature, aggregated signature, and public key. Instead, I opted for a simpler (though somewhat less elegant) solution: I created structs with the same name and same functions for each algorithm and used crate-level features to select between these implementations.
The code structure of the crypto library is the following:
I control the actual implementation to compile using cargo features. In Cargo.toml, I use the following structure:
[dependencies]
sha2 = { version = "0.10", default-features = false,optional = true}
malachite-base = {version="0.6", default-features=false,optional = true}
malachite-nz = {version="0.6", default-features=false,optional = true,features = ["32_bit_limbs"]}
num-bigint-dig = {git="https://github.com/petersallai/num-bigint.git", default-features=false,optional = true}
num-traits = { version = "0.2", default-features = false, optional = true }
bls12_381-bls={version = "0.5", default-features = false,optional = true}
dusk-bls12_381 = { version = "0.14", default-features = false, features = ["alloc", "pairings", "zeroize"],optional = true }
dusk-bytes = { version = "0.1.7", default-features = false,optional = true }
[features]
default = ["schnorr-num-bigint-dig"]
schnorr=["dep:sha2"]
bls=[]
schnorr-malachite = ["schnorr","dep:malachite-base", "dep:malachite-nz"]
schnorr-num-bigint-dig = ["schnorr","dep:num-bigint-dig","dep:num-traits"]
bls-bls12_381-bls = ["bls","dep:bls12_381-bls","dep:dusk-bls12_381","dep:dusk-bytes"]
The dependencies are only included if they are required by the selected feature.
In lib.rs we also use these features to control the used implementation. First, we add some validation logic to prevent compilation if no cryptographic implementation is selected or if multiple implementations are enabled:
#[cfg(any(
all(feature = "schnorr-malachite", any(feature = "schnorr-num-bigint-dig", feature = "bls-bls12_381-bls")),
all(feature = "schnorr-num-bigint-dig", any(feature = "schnorr-malachite", feature = "bls-bls12_381-bls")),
all(feature = "bls-bls12_381-bls", any(feature = "schnorr-malachite", feature = "schnorr-num-bigint-dig")),
))]
compile_error!("Only one crypto implementation feature can be enabled at a time");
#[cfg(not(any(feature = "schnorr-malachite", feature = "schnorr-num-bigint-dig", feature = "bls-bls12_381-bls")))]
compile_error!("At least one crypto implementation feature must be enabled");
Next, we select the implementation to use:
#[cfg(feature = "schnorr-num-bigint-dig")]
pub mod schnorr_num_bigint_dig_signer;
#[cfg(feature = "schnorr-num-bigint-dig")]
pub use schnorr_num_bigint_dig_signer::AggregatedSignature;
#[cfg(feature = "schnorr-num-bigint-dig")]
pub use schnorr_num_bigint_dig_signer::Crypto;
#[cfg(feature = "schnorr-num-bigint-dig")]
pub use schnorr_num_bigint_dig_signer::MultiSignature;
#[cfg(feature = "schnorr-num-bigint-dig")]
pub use schnorr_num_bigint_dig_signer::PublicKey;
#[cfg(feature = "schnorr-num-bigint-dig")]
pub use schnorr_num_bigint_dig_signer::Signature;
I created similar blocks for all implementations. I also defined constants based on the configured features. I used a two-layered feature definition. The first layer defines the algorithm (Schnorr or BLS), and the second layer defines the exact implementation. With this setup, I only need to define algorithm-dependent constants once for an algorithm.
#[cfg(feature = "schnorr")]
///Single signature size (in bytes)
pub const SIGNATURE_SIZE: usize = 64;
#[cfg(feature = "schnorr")]
///Multi signature size (in bytes)
pub const MULTI_SIGNATURE_SIZE: usize = 64;
#[cfg(feature = "schnorr")]
///Size of the public key (in bytes)
pub const PUBLIC_KEY_SIZE: usize = 32;
#[cfg(feature = "schnorr")]
///Size of the private key (in bytes)
pub const PRIVATE_KEY_SIZE: usize = 32;
#[cfg(feature = "schnorr")]
///Contant parts's size in an aggregated signature (in bytes)
pub const AGGREGATED_SIGNATURE_CONSTANT_SIZE: usize = 34;
#[cfg(feature = "schnorr")]
///Signature count dependent parts's size in an aggregated signature (in bytes)
pub const AGGREGATED_SIGNATURE_VARIABLE_SIZE: usize = 32;
This structure incurs no runtime overhead and only the required implementation compiles into a binary.
This code structure has some drawbacks:
I divided the functionality into five separate structs. Each struct has a trait defined in `lib.rs` to establish a consistent interface, ensuring that all implementations provide the same API. The embedding projects utilize these structs directly, as this approach is much simpler. Therefore, the traits primarily serve to maintain internal consistency.
At first glance, this structure appears overly complicated. Why don’t we use simple Signatures during the aggregation? I chose this structure to model the situation where modifications are required from the signer’s side to create aggregatable signatures.
In MoonBlokz, I use Signature for signing transactions and blocks. During the approval process, I utilize MultiSignatures in support messages and AggregatedSignatures in evidence blocks.
To implement Schnorr signing, I used this page as a reference:
learnmeabitcoin.com
It is a very detailed algorithm description of Schnorr signing, signature verification, and aggregation. I recommend reading it if you are interested in crypto algorithms.
Algorithmically, I changed only one element. The article used random numbers to create unpredictable nonces. I don’t like random numbers. It is very hard to generate cryptographically good random numbers in embedded environments. I switched to deterministic nonce generation. I use a simplified variation of RFC 6979 to generate a random value:
let mut k0: BigInt;
let mut counter = 0u32;
loop {
let counter_bytes = counter.to_le_bytes();
k0 = BigInt::from_bytes_le(Sign::Plus, &self.tagged_hash(b"nonce", &self.private_key_bytes, &message, &counter_bytes)) % &self.n;
if k0 > BigInt::zero() {
break;
}
counter += 1;
}
I combine the tag, private key, message, and a counter by hashing them together. This counter is used to address the rare case when the generated number equals zero.
In the case of Schnorr, there is no algorithmic difference between Signatures and MultiSignatures.
Handling aggregation in Schnorr signatures can be quite challenging. Currently, no non-interactive aggregation schemes are available. In the case of MoonBlokz, using interactive algorithms is not an option, as it would add significant complexity to the radio communication. Instead, I implemented a half-aggregation algorithm. I summarized the signature component from the signatures while keeping the public nonces (r) separate. This approach is necessary to calculate the individual challenges as follows:
e = int(H(tag || r || P[x] || m)) % n
where
By using this approach, I can halve the total signature size (n*32 + 32 + 2 bytes). Although this is significantly larger than the BLS aggregation (consistent 48 + 2 bytes), it is much more efficient than storing n signatures separately.
During testing, I found that the BigInteger library has a significant impact on performance. I initially used num-bigint and num-bigint-dig, but later found more powerful options, such as Malachite. However, I am not fond of Malachite’s API. It strictly separates Natural and Integer types, without allowing implicit conversions, and many methods, such as serialization, are only available for Natural types. In contrast, the implementation of num-bigint-dig is much cleaner and more readable than the solution provided by Malachite.
I had to fork the num_bigint_dig library because it depends on lazy_static, which is incompatible with the thumbv6m-none-eabi target used by my reference hardware. Anyway, I don’t use the lazy-static dependent functionality from num_bigint_dig. Therefore, I created a fork that makes `lazy_static` an optional dependency.
I also submitted a pull request with this change to the original library, but Tarcieri, the maintainer, informed me that they are migrating to `crypto-bigint` and will soon retire this library. I attempted to develop an implementation using `crypto-bigint`, but I got stuck on the missing `modpow` function. Tarcieri gave me some tips. I will revisit this issue later.
The BLS implementation is significantly shorter than Schnorr’s because it acts merely as a wrapper over the bls_12_381_bls library. In this case, the implementations of MultiSignature and Signature differ, while MultiSignature and AggregatedSignature utilize the same internal data structure.
I created two types of tests for the library. First, I added the unit tests to lib.rs. It required a shell script to run the tests separately for all implementations, because cargo test only runs with the default features. My script looks like this (it also creates coverage reports):
cargo llvm-cov clean
rm target/lcov.info
cargo llvm-cov nextest --no-report --no-default-features --features bls-bls12_381-bls --output-path ./target/lcov.info
cargo llvm-cov nextest --no-report --no-default-features --features schnorr-malachite --output-path ./target/lcov.info
cargo llvm-cov nextest --no-report --no-default-features --features schnorr-num-bigint-dig --output-path ./target/lcov.info
cargo llvm-cov report --lcov --output-path ./target/lcov.info
The unit tests are unsuitable for evaluating the library on target devices. Therefore, I created a new project that depends on the moonblokz-crypto-lib library. As an embedded framework, I am using Embassy (https://embassy.dev). I will dedicate a separate article later to the Embassy framework. Now I am only using it for LED blinking.
#![no_std]
#![no_main]
use alloc::vec::Vec;
use embassy_executor::Spawner;
extern crate alloc;
use embassy_rp::gpio;
use embassy_rp::peripherals::PIN_25;
use embassy_time::Timer;
use gpio::{Level, Output};
use {defmt_rtt as _, panic_probe as _};
use embedded_alloc::LlffHeap as Heap;
use moonblokz_crypto::{Crypto, CryptoTrait, MultiSignature, PublicKey, Signature};
const TEST_COUNT: usize = 10;
#[global_allocator]
static HEAP: Heap = Heap::empty();
struct Led<'d> {
led: Output<'d>,
}
impl<'d> Led<'d> {
fn new(pin: PIN_25) -> Self {
let led = Output::new(pin, Level::Low);
Self { led }
}
async fn blink(&mut self, count: u8) {
for i in 0..count {
if i != 0 {
Timer::after_secs(1).await;
}
self.led.set_high();
Timer::after_secs(1).await;
self.led.set_low();
}
}
}
#[embassy_executor::main]
async fn main(_spawner: Spawner) {
{
use core::mem::MaybeUninit;
const HEAP_SIZE: usize = 102400;
static mut HEAP_MEM: [MaybeUninit<u8>; HEAP_SIZE] = [MaybeUninit::uninit(); HEAP_SIZE];
unsafe { HEAP.init(&raw mut HEAP_MEM as usize, HEAP_SIZE) }
}
let p = embassy_rp::init(Default::default());
let mut led = Led::new(p.PIN_25);
loop {
led.blink(1).await;
//setup test signers (TEST_COUNT instances of Crypto)
let mut signers = Vec::<Crypto>::new();
for i in 0..TEST_COUNT {
let private_key = [(i + 1) as u8; 32];
if let Ok(signer) = Crypto::new(private_key) {
signers.push(signer);
} else {
panic!("Failed to create signer");
}
}
led.blink(2).await;
let mut signatures = Vec::<Signature>::new();
let message = b"Hello, world!";
//do signing TEST_COUNT times
for i in 0..TEST_COUNT {
let signer = &signers[i];
let signature = signer.sign(message);
signatures.push(signature);
}
led.blink(3).await;
//verify TEST_COUNT signatures
for i in 0..TEST_COUNT {
let signer = &signers[i];
let signature = &signatures[i];
if !signer.verify_signature(message, signature, signer.public_key()) {
panic!("Signature verification failed");
}
}
led.blink(4).await;
//create TEST_COUNT multi signatures
let mut multi_signatures = Vec::<MultiSignature>::new();
let mut public_keys = Vec::<&PublicKey>::new();
for i in 0..TEST_COUNT {
let signer = &signers[i];
let signature = signer.multi_sign(message);
multi_signatures.push(signature);
public_keys.push(signer.public_key());
}
led.blink(5).await;
//aggregate multi signatures
let multi_signature_refs: Vec<&MultiSignature> = multi_signatures.iter().collect();
let aggregated_signature_result = signers[0].aggregate_signatures(multi_signature_refs, message);
let aggregated_signature = match aggregated_signature_result {
Ok(sig) => sig,
Err(_) => panic!("Failed to aggregate signatures"),
};
led.blink(6).await;
//verify aggregated signature
if !signers[0].verify_aggregated_signature(message, &aggregated_signature, public_keys) {
panic!("Aggregate signature verification failed");
}
}
}
I made several tests with different settings. The comparison of the algorithms is already in the table above, but the details are also interesting. I tried the program with varying compiler settings. I used the following settings in Cargo.toml:
[profile.release]
lto = true
opt-level = "z"
incremental = false
codegen-units = 1
# note: debug = true is okay - debuginfo isn't flashed to the device!
debug = true
I also made tests with other opt-level settings. The detailed results are the following:
┌───────────────────────┬─────────┬─────────┬─────────┬─────────┬──────────┐
│ │ │ │ │ │ Verify │
│ │ │ │ Multi │Aggregate│aggregated│
│ │ Sign │ Verify │ -sign │ (10) │ (10) │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│bls-bls12_381-bls │ │ │ │ │ │
│(opt-level="z") │ 4.6s│ 13.1s│ 6.9s│ 3s│ 66s│
│220Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│bls-bls12_381-bls │ │ │ │ │ │
│(opt-level="s") │ 7s│ 17.1s│ 10.5s│ 3s│ 76s│
│428KBytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│bls-bls12_381-bls │ │ │ │ │ │
│(opt-level=3) │ 7.8s│ 19.6s│ 11.7s│ 3s│ 85s│
│718Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-num-bigint-dig │ │ │ │ │ │
│(opt-level="z") │ 2.7s│ 5.4s│ 2.7s│ 0s│ 85s│
│123Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-num-bigint-dig │ │ │ │ │ │
│(opt-level="s") │ 2s│ 4.1s│ 2s│ 0s│ 65s│
│181Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-num-bigint-dig │ │ │ │ │ │
│(opt-level=3) │ 2s│ 4.1s│ 2s│ 0s│ 63s│
│242Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-malachite │ │ │ │ │ │
│(opt-level="z") │ 0.9s│ 1.9s│ 0.9s│ 0s│ 29s│
│596Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-malachite │ │ │ │ │ │
│(opt-level="s") │ 0.7s│ 1.4s│ 0.7s│ 0s│ 22s│
│576Kbytes │ │ │ │ │ │
├───────────────────────┼─────────┼─────────┼─────────┼─────────┼──────────┤
│schnorr-malachite │ │ │ │ │ │
│(opt-level=3) │ 1s│ 2.1s│ 1s│ 0s│ 33s│
│1,124Kbytes │ │ │ │ │ │
└───────────────────────┴─────────┴─────────┴─────────┴─────────┴──────────┘
“Sign” is the simple signature creation time. “Verify” is the single signature verification time. “Multi-sign” is the creation time of a MultiSignature. “Aggregation” is the time required to aggregate 10 signatures, and “Verify aggregated” is the time to verify the previously aggregated signature.
There are many interesting points behind these numbers:
Based on these measurements, I use the Malachite version as the default in MoonBlokz. It is ten times faster than the BLS version and three times better than the num-bigint-dig based Schnorr.
I have a solution for now, but with quantum computing potentially on the horizon in the next decade, we need to consider how we can operate MoonBlokz in a post-quantum world. Currently, the answer is that we cannot.😟 The previously mentioned algorithms, such as Schnorr and BLS, are not quantum-safe. While several quantum-safe cryptographic algorithms exist, most require large signature sizes, typically exceeding a thousand bytes. This makes the MoonBlokz concept unrealistic, as even a single transaction would struggle to fit within a 2 KB block.
Currently, there is one exception: SQISign. It uses 64-byte-long public keys and 177-byte-long signatures. It is also a significant increase from 64 or 48-byte signatures, but it is not an instant stop sign like other algorithms. However, there are two main issues with SQISign:
We do not have a viable solution at this moment. However, cryptography is a rapidly evolving field, and as quantum computing becomes a real threat, we may discover new solutions to address these challenges.
Finally, we have all the cryptographic code required by the functionality of MoonBlokz. I have released the moonblokz-crypto-lib as an open-source component. You can find the repository here.
Feel free to use it, and if you like it, please consider giving it a star.
In the next article, we will discuss radio communication and radio packet routing to establish a mesh network.
Author: Péter Sallai
Source: https://medium.com/@peter.sallai/moonblokz-series-part-vi-crypto-algorithms-942d4a28fdc7