Update READMEs
howardwu committed Jul 28, 2018
1 parent d61baeb commit b914360
Showing 3 changed files with 88 additions and 84 deletions.
129 changes: 68 additions & 61 deletions README.md
@@ -6,12 +6,11 @@
</p>
<h4 align="center">Java library for distributed zero knowledge proof systems</h4>

___DIZK___ (pronounced */'disək/*) is a Java library for distributed zero knowledge proof systems. The library implements distributed multipoint polynomial evaluation, distributed polynomial interpolation, and distributed computation of Lagrange polynomials.
Using these scalable arithmetic computations, the library provides a distributed zkSNARK proof system that enables verifiable computations of up to billions of logical gates, far exceeding the scale of previous state-of-the-art solutions.
___DIZK___ (pronounced */'dizək/*) is a Java library for distributed zero knowledge proof systems. The library implements distributed polynomial evaluation/interpolation, computation of Lagrange polynomials, and multi-scalar multiplication. Using these scalable arithmetic subroutines, the library provides a distributed zkSNARK proof system that enables verifiable computations of up to billions of logical gates, far exceeding the scale of previous state-of-the-art solutions.

The library is developed by [SCIPR Lab](http://www.scipr-lab.org/) and contributors (see [AUTHORS](AUTHORS) file) and is released under the MIT License (see [LICENSE](LICENSE) file).

The library is developed as part of a conference paper called, *"[DIZK: A Distributed Zero Knowledge Proof System](https://eprint.iacr.org/2018/691)."*
The library is developed as part of a paper called *"[DIZK: A Distributed Zero Knowledge Proof System](https://eprint.iacr.org/2018/691)"*.

**WARNING:** This is an academic proof-of-concept prototype. This implementation is not ready for production use. It does not yet contain all the features, careful code review, tests, and integration that are needed for a deployment!

@@ -31,45 +30,43 @@ The directory structure is as follows:

* [__src__](src): Java directory for source code and unit tests
* [__main/java__](src/main/java): Java source code, containing the following modules:
* [__algebra__](src/main/java/algebra): Fields, groups, elliptic curves, FFT, multi-scalar multiplication
* [__bace__](src/main/java/bace): Batch arithmetic circuit evaluation
* [__common__](src/main/java/common): Standard arithmetic and Spark computation utilities
* [__configuration__](src/main/java/configuration): Configuration settings for the Spark cluster
* [__profiler__](src/main/java/profiler): Profiling infrastructure for zero-knowledge proof systems
* [__reductions__](src/main/java/reductions): Reductions between languages (used internally)
* [__relations__](src/main/java/relations): Interfaces for expressing statement (relations between instances and witnesses) as various NP-complete languages
* [__zk_proof_systems__](src/main/java/zk_proof_systems): Serial and distributed implementations of zero-knowledge proof systems
* [__algebra__](src/main/java/algebra): fields, groups, elliptic curves, FFT, multi-scalar multiplication
* [__bace__](src/main/java/bace): batch arithmetic circuit evaluation
* [__common__](src/main/java/common): standard arithmetic and Spark computation utilities
* [__configuration__](src/main/java/configuration): configuration settings for the Spark cluster
* [__profiler__](src/main/java/profiler): profiling infrastructure for zero-knowledge proof systems
* [__reductions__](src/main/java/reductions): reductions between languages (used internally)
* [__relations__](src/main/java/relations): interfaces for expressing statement (relations between instances and witnesses) as various NP-complete languages
* [__zk_proof_systems__](src/main/java/zk_proof_systems): serial and distributed implementations of zero-knowledge proof systems
* [__test/java__](src/test/java): Java unit tests for the provided modules and infrastructure

## Overview

This library implements distributed zero knowledge proof systems that enable scalable approaches for proving and verifying,
in zero knowledge, the integrity of computations.
This library implements a distributed zero knowledge proof system, enabling scalably proving (and verifying) the integrity of computations, in zero knowledge.

A prover who knows the witness for an NP statement (i.e., a satisfying input/assignment) can produce a short proof attesting to the truth of the NP statement. This proof can then be verified by anyone, and offers the following properties.

- **Zero knowledge** - the verifier learns nothing from the proof besides the truth of the statement.
- **Succinctness** - the proof is small in size and quick to verify.
- **Succinctness** - the proof is small in size and cheap to verify.
- **Non-interactivity** - the proof does not require back-and-forth interaction between the prover and the verifier.
- **Soundness** - the proof is computationally sound (such a proof system is called an argument).
- **Soundness** - the proof is computationally sound (such a proof is called an *argument*).
- **Proof of knowledge** - the proof attests not just that the NP statement is true, but also that the prover knows why.

These properties comprise a **zkSNARK**, which stands for Zero-Knowledge Succinct Non-interactive ARgument of Knowledge
(though zkSNARKs are also known as succinct non-interactive computationally-sound zero-knowledge proofs of knowledge).
For formal definitions and theoretical discussions about these, see [BCCT12], [BCIOP13], and the references therein.
These properties comprise a **zkSNARK**, which stands for *Zero-Knowledge Succinct Non-interactive ARgument of Knowledge*.
For formal definitions and theoretical discussions about these, see [BCCT12] [BCIOP13] and the references therein.

**DIZK** provides Java-based implementations using Apache Spark [Apa17] for:

1. General-purpose proof systems
- A serial and distributed preprocessing zkSNARK for the NP-complete language, *R1CS* (Rank-1 Constraint Systems), a language that resembles arithmetic circuit satisfiability, see [Gro16].
- A distributed Merlin-Arthur proof system for evaluating arithmetic circuits on batches of inputs, see [Wil16].
2. Scalable arithmetic computations
- A serial and distributed radix-2 fast Fourier transform (FFT), see [Sze11].
- A serial and distributed multi-scalar multiplication (MSM), see [BGMW93] [Pip76] [Pip80].
- A serial and distributed Lagrange interpolation (Lag), see [BT04].
3. Applications using the above proof systems for
- Authenticity of photos on three transformations - crop, rotation, and blur, see [NT16].
- Integrity of machine learning models with support for linear regression and covariance matrices, see [Bis06] [Can69] [LRF97] [vW97].
1. Proof systems
- A serial and distributed preprocessing zkSNARK for *R1CS* (Rank-1 Constraint Systems), an NP-complete language that resembles arithmetic circuit satisfiability. The zkSNARK is the protocol in [Gro16].
- A distributed Merlin-Arthur proof system for evaluating arithmetic circuits on batches of inputs; see [Wil16].
2. Scalable arithmetic
- A serial and distributed radix-2 fast Fourier transform (FFT); see [Sze11].
- A serial and distributed multi-scalar multiplication (MSM); see [BGMW93] [Pip76] [Pip80].
- A serial and distributed Lagrange interpolation (Lag); see [BT04].
3. Applications using the above zkSNARK for
- Authenticity of photos on three transformations (crop, rotation, blur); see [NT16].
- Integrity of machine learning models with support for linear regression and covariance matrices; see [Bis06] [Can69] [LRF97] [vW97].
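
For intuition about the R1CS language mentioned in the list above, here is a minimal sketch of what it means for an assignment to satisfy a constraint system: each constraint `i` requires `<a_i, z> * <b_i, z> = <c_i, z>` over a prime field, where `z` holds the constant 1, the public input, and the witness. The class `R1csSketch`, its method names, and the toy modulus 97 are assumptions made for illustration; this is not DIZK's API and not how the library represents constraints.

```java
import java.math.BigInteger;

/** Illustrative R1CS satisfiability check (hypothetical helper, not part of DIZK). */
final class R1csSketch {

    /** Inner product of a constraint row with the assignment z, reduced modulo p. */
    static BigInteger inner(long[] row, long[] z, BigInteger p) {
        BigInteger acc = BigInteger.ZERO;
        for (int j = 0; j < z.length; j++) {
            acc = acc.add(BigInteger.valueOf(row[j]).multiply(BigInteger.valueOf(z[j])));
        }
        return acc.mod(p);
    }

    /** True iff <a_i, z> * <b_i, z> = <c_i, z> (mod p) holds for every constraint i. */
    static boolean isSatisfied(long[][] a, long[][] b, long[][] c, long[] z, BigInteger p) {
        for (int i = 0; i < a.length; i++) {
            BigInteger lhs = inner(a[i], z, p).multiply(inner(b[i], z, p)).mod(p);
            if (!lhs.equals(inner(c[i], z, p))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Toy modulus chosen only for readability (an assumption, far too small for real use).
        BigInteger p = BigInteger.valueOf(97);

        // A single constraint over z = (1, x, w) encoding "w * w = x".
        long[][] a = {{0, 0, 1}};
        long[][] b = {{0, 0, 1}};
        long[][] c = {{0, 1, 0}};

        // Public input x = 9 with witness w = 3 satisfies the constraint; w = 4 does not.
        System.out.println(isSatisfied(a, b, c, new long[] {1, 9, 3}, p));
        System.out.println(isSatisfied(a, b, c, new long[] {1, 9, 4}, p));
    }
}
```

A zkSNARK for R1CS lets the prover convince a verifier that such a `w` exists (and is known to the prover) without revealing it; the check above is only the plain, non-zero-knowledge relation being proved.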

## Build guide

@@ -87,10 +84,9 @@ The library has the following dependencies:

### Why Java?

This library was designed and implemented in Java to utilize Apache Spark, which currently supports Java, Python, and Scala.
Our choice of Java lies in our need to architect a system that is accurate and high performance. This meant foregoing potential
higher-level code optimizations in favor of unambiguous lower-level program execution. Our study of the three language candidates
resulted in our choice of Java as the most viable candidate for enforcing these requirements.
This library uses Apache Spark, an open-source cluster-computing framework that natively supports Java, Scala, and Python. Among these, we found Java to fit our goals because we could leverage its rich features for object-oriented programming and we could control execution in a (relatively) fine-grained way.

While other libraries for zero knowledge proof systems are written in low-level languages (e.g., [libsnark](https://github.com/scipr-lab/libsnark) is written in C++ and [bellman](https://github.com/zkcrypto/bellman) in Rust), harnessing the speed of such languages in our setting is not straightforward. For example, we evaluated the possibility of interfacing with C (using native binding approaches like JNI and JNA), and concluded that the cost of memory management and process interfacing resulted in slower performance than purely native Java execution.

### Installation

@@ -105,7 +101,7 @@ Next, fetch the dependency modules:
git submodule init && git submodule update
```

Lastly, compile the source code:
Finally, compile the source code:
```$xslt
mvn compile
```
@@ -120,11 +116,11 @@ mvn test
## Profiler

Using Amazon EC2, the profiler benchmarks the performance of serial and distributed zero-knowledge proof systems, as well as their underlying primitives.
The profiler makes use of `spark-ec2` to manage the cluster compute environment and a set of provided scripts for easy launch, profiling, and shutdown.
The profiler uses `spark-ec2` to manage the cluster compute environment and a set of provided scripts for launch, profiling, and shutdown.

### Spark EC2

To manage the cluster compute environment, DIZK makes use of [`spark-ec2@branch-2.0`](https://github.com/amplab/spark-ec2/tree/branch-2.0).
To manage the cluster compute environment, DIZK uses [`spark-ec2@branch-2.0`](https://github.com/amplab/spark-ec2/tree/branch-2.0).
`spark-ec2` is a tool to launch, maintain, and terminate [Apache Spark](https://spark.apache.org/docs/latest/) clusters on Amazon EC2.

To set up `spark-ec2`, run the following commands:
@@ -137,7 +133,7 @@ pwd

Remember where the directory for `spark-ec2` is located, as this will need to be provided as an environment variable for the scripts as part of the next step.

### Profiling Scripts
### Profiling scripts

To begin, set the environment variables required to initialize the profiler in [init.sh](src/main/java/profiler/scripts/init.sh).
The profiling infrastructure will require access to an AWS account access key and secret key, which can be created with
@@ -195,9 +191,10 @@ Below we use *instance size* to denote the number of constraints in an R1CS inst

### libsnark *vs* DIZK

First, we measure the largest instance size (as a power of 2) that is supported by:
We measure the largest instance size (as a power of 2) that is supported by:

- the serial implementation of Groth’s protocol in [libsnark](https://github.com/scipr-lab/libsnark) [SCI17]
- the serial implementation of PGHR’s protocol in [libsnark](https://github.com/scipr-lab/libsnark)
- the serial implementation of Groth’s protocol in [libsnark](https://github.com/scipr-lab/libsnark)
- the distributed implementation of Groth's protocol in **DIZK**

<p align="center"><img src="https://user-images.githubusercontent.com/9260812/43099291-9203db9a-8e76-11e8-8d68-528d903500e1.png" width="68%"></p>
@@ -208,7 +205,7 @@ Instances of this size are much larger than what was previously possible via ser

### Distributed zkSNARK

Next, we benchmark the running time of the setup and the prover on an increasing number of constraints and with an increasing number of executors.
We benchmark the running time of the setup and the prover on an increasing number of constraints and with an increasing number of executors.
Note that we do not need to evaluate the zkSNARK verifier as it is a simple and fast algorithm that can be run even on a smartphone.

<p align="center"><img src="https://user-images.githubusercontent.com/9260812/43099290-91ec40c0-8e76-11e8-8391-c30fbddc4acd.png" width="67%"></p>
@@ -223,55 +220,65 @@ Our benchmarks of the setup and the prover show us that:

## References

[Apa17] [_Apache Spark_](http://spark.apache.org/), Apache Spark, 2017
[Apa17] [_Apache Spark_](http://spark.apache.org/),
Apache Spark,
2017

[Bis06] [_Pattern recognition and machine learning_](https://www.springer.com/us/book/9780387310732), Christopher M. Bishop, 2006
[Bis06] [_Pattern recognition and machine learning_](https://www.springer.com/us/book/9780387310732),
Christopher M. Bishop,
*Book*, 2006

[BCCT12] [_From extractable collision resistance to succinct non-interactive arguments of knowledge, and back again_](http://eprint.iacr.org/2011/443),
Nir Bitansky, Ran Canetti, Alessandro Chiesa, Eran Tromer,
Innovations in Computer Science (ITCS) 2012
Nir Bitansky, Ran Canetti, Alessandro Chiesa, Eran Tromer,
*Innovations in Theoretical Computer Science* (ITCS), 2012

[BCIOP13] [_Succinct non-interactive arguments via linear interactive proofs_](http://eprint.iacr.org/2012/718),
Nir Bitansky, Alessandro Chiesa, Yuval Ishai, Rafail Ostrovsky, Omer Paneth,
Theory of Cryptography Conference (TCC) 2013
*Theory of Cryptography Conference* (TCC), 2013

[BGMW93] [_Fast exponentiation with precomputation_](https://link.springer.com/chapter/10.1007/3-540-47555-9_18),
Ernest F. Brickell, Daniel M. Gordon, Kevin S. McCurley, and David B. Wilson,
International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT) 1992
*International Conference on the Theory and Applications of Cryptographic Techniques* (EUROCRYPT), 1992

[BT04] [_Barycentric Lagrange interpolation_](https://people.maths.ox.ac.uk/trefethen/barycentric.pdf), Jean-Paul Berrut and Lloyd N. Trefethen, 2004
[BT04] [_Barycentric Lagrange interpolation_](https://people.maths.ox.ac.uk/trefethen/barycentric.pdf),
Jean-Paul Berrut and Lloyd N. Trefethen,
*SIAM Review*, 2004

[Can69] [_A cellular computer to implement the Kalman filter algorithm_](https://dl.acm.org/citation.cfm?id=905686), Lynn E Cannon, 1969
[Can69] [_A cellular computer to implement the Kalman filter algorithm_](https://dl.acm.org/citation.cfm?id=905686),
Lynn E Cannon,
*Doctoral Dissertation*, 1969

[Gro16] [_On the size of pairing-based non-interactive arguments_](https://eprint.iacr.org/2016/260.pdf),
Jens Groth,
International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT) 2016
*International Conference on the Theory and Applications of Cryptographic Techniques (EUROCRYPT)*, 2016

[LRF97] [_Generalized cannon’s algorithm for parallel matrix multiplication_](https://dl.acm.org/citation.cfm?id=263591),
Hyuk-Jae Lee, James P. Robertson, and José A. B. Fortes,
International Conference on Supercomputing 1997
Hyuk-Jae Lee, James P. Robertson, and Jose A. B. Fortes,
*International Conference on Supercomputing*, 1997

[NT16] [_Photoproof: Cryptographic image authentication for any set of permissible transformations_](https://www.cs.tau.ac.il/~tromer/papers/photoproof-oakland16.pdf).
[NT16] [_Photoproof: Cryptographic image authentication for any set of permissible transformations_](https://www.cs.tau.ac.il/~tromer/papers/photoproof-oakland16.pdf),
Assa Naveh and Eran Tromer,
IEEE Symposium on Security and Privacy 2016
*IEEE Symposium on Security and Privacy*, 2016

[Pip76] [_On the evaluation of powers and related problems_](https://ieeexplore.ieee.org/document/4567910/),
Nicholas Pippenger,
Symposium on Foundations of Computer Science 1976
*Symposium on Foundations of Computer Science* (FOCS), 1976

[Pip80] [_On the evaluation of powers and monomials_](https://pdfs.semanticscholar.org/7d65/53e185fd90a855717ee915992e17f38c99ae.pdf)
[Pip80] [_On the evaluation of powers and monomials_](https://pdfs.semanticscholar.org/7d65/53e185fd90a855717ee915992e17f38c99ae.pdf),
Nicholas Pippenger,
SIAM Journal on Computing 1980

[SCI17] [_libsnark: a C++ library for zkSNARK proofs_](https://github.com/scipr-lab/libsnark), SCIPR Lab, 2017
*SIAM Journal on Computing*, 1980

[Sze11] [_Schönhage-Strassen algorithm with MapReduce for multiplying terabit integers_](https://people.apache.org/~szetszwo/ssmr20110429.pdf),
Tsz-Wo Sze, 2011
Tsz-Wo Sze,
*International Workshop on Symbolic-Numeric Computation*, 2011

[vW97] [_SUMMA: scalable universal matrix multiplication algorithm_](https://dl.acm.org/citation.cfm?id=899248), Robert A. van de Geijn and Jerrell Watts, 1997
[vW97] [_SUMMA: scalable universal matrix multiplication algorithm_](https://dl.acm.org/citation.cfm?id=899248),
Robert A. van de Geijn and Jerrell Watts,
*Technical Report*, 1997

[Wil16] [_Strong ETH breaks with Merlin and Arthur: Short non-interactive proofs of batch evaluation_](https://arxiv.org/pdf/1601.04743.pdf),
Ryan Williams, 2016
[Wil16] [_Strong ETH breaks with Merlin and Arthur: short non-interactive proofs of batch evaluation_](https://arxiv.org/pdf/1601.04743.pdf),
Ryan Williams,
*Conference on Computational Complexity*, 2016

## Acknowledgements

16 changes: 16 additions & 0 deletions src/main/java/algebra/curves/barreto_naehrig/README.md
@@ -0,0 +1,16 @@
# Barreto-Naehrig curves

This folder contains implementations and parameters for the following Barreto-Naehrig curves:

| BN254a | Order | # Bits | 2-adicity |
|--------------|--------------------------------------------------------------------|------------|-----------|
| Base Field | 0x30644e72e131a029b85045b68181585d97816a916871ca8d3c208c16d87cfd47 | 254 | 1 |
| Scalar Field | 0x30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001 | 254 | 28 |

| BN254b | Order | # Bits | 2-adicity |
|--------------|--------------------------------------------------------------------|------------|-----------|
| Base Field | 0x277a0785027142409a4f160886e00001921d70600000000188f4000000000001 | 254 | 50 |
| Scalar Field | 0x277a0785027142409a4f160886e000012d9614480000000188f4000000000001 | 254 | 50 |

The first curve is used mostly for testing.
The second curve is used for zkSNARKs on large instances. Indeed, the high 2-adicity of the scalar field enables efficient FFTs for very large domains (of size up to 2^50) over the scalar field.
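
The 2-adicity column can be reproduced directly from the field orders: for a prime `p`, it is the largest `s` such that `2^s` divides `p - 1`, which bounds the size of radix-2 FFT domains to `2^s`. The sketch below checks this with `java.math.BigInteger`; the class name `TwoAdicity` and its method are hypothetical names chosen for illustration and are not part of the library.

```java
import java.math.BigInteger;

/** Computes the 2-adicity of a prime field order (hypothetical helper, not part of DIZK). */
final class TwoAdicity {

    /** Returns the largest s such that 2^s divides p - 1, for an odd prime p given in hex. */
    static int twoAdicity(String hexModulus) {
        BigInteger p = new BigInteger(hexModulus, 16);
        // The index of the lowest set bit of p - 1 is its number of trailing zero bits.
        return p.subtract(BigInteger.ONE).getLowestSetBit();
    }

    public static void main(String[] args) {
        // Scalar field orders copied from the tables above (without the 0x prefix).
        String bn254aScalar = "30644e72e131a029b85045b68181585d2833e84879b9709143e1f593f0000001";
        String bn254bScalar = "277a0785027142409a4f160886e000012d9614480000000188f4000000000001";

        System.out.println("BN254a scalar field 2-adicity: " + twoAdicity(bn254aScalar)); // 28
        System.out.println("BN254b scalar field 2-adicity: " + twoAdicity(bn254bScalar)); // 50
    }
}
```

The same call on the BN254a base field order returns 1, which is why that field is unsuitable for large radix-2 FFT domains.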