zcash / halo2
The Halo2 zero-knowledge proving system
Home Page: https://zcash.github.io/halo2/
License: Other
let (_, h_eval) = self
    .h_evals
    .iter()
    .fold((C::Scalar::one(), C::Scalar::zero()), |(cur, acc), eval| {
        (cur * &x_3n, acc + &(cur * eval))
    });
Rewriting x^n as x, we can apply Horner's rule to simplify this to:
let h_eval = self
    .h_evals
    .iter()
    .rev()
    .fold(C::Scalar::zero(), |acc, eval| {
        acc * &x_3n + eval
    });
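(Both folds compute the same value: writing y = x_3n and the evaluations as e_0, …, e_{k-1}, the first fold accumulates e_0 + e_1·y + e_2·y^2 + … term by term, while the reverse fold evaluates the same sum by Horner's rule as (…(e_{k-1}·y + e_{k-2})·y + …)·y + e_0.)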
https://electriccoin.co/blog/the-pasta-curves-for-halo-2-and-beyond/
We can just migrate over our existing implementation of the Tweedle curves, since we won't be using those anymore.
See also zcash/zcash#4710
Document the initial set of benchmarks for milestone 1, including benchmarks for various circuit sizes.
Currently permutation arguments can only refer to advice columns, which makes it difficult to load constants.
Define a parameterized model that can predict the prover and verifier performance of a given Halo 2 circuit, so that I can understand how to optimize circuits. The model needs to be validated against measured performance.
We will want some way to create reusable gadgets that can be composed to create halo2 circuits. This is pretty closely coupled to the general circuit API.
If possible, it would be great to maintain the bellman approach of keeping circuit generation and witness synthesis close together. The current halo2 circuit API doesn't quite achieve that (they are both part of the same trait but occur in different functions), so we either need to adjust that API or design gadgets to abstract over it.
In #95 we added a tool for developers to debug their circuits while writing them. It would be useful to have this also be used to augment production errors when creating proofs. We could do this by running MockProver on the given circuit if Proof::create returns an invalid proof (rather than, say, the circuit struct not being set up correctly by e.g. not having all witnesses set). If MockProver did not find an issue despite the proof being invalid, that would itself mean we'd uncovered non-determinism within the circuit, which is incredibly important to convey to developers.
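A rough sketch of what this fallback could look like on the proving side; Proof::create and MockProver are the items mentioned above, while MyCircuit, the k parameter, MockProver::run's signature, and the error handling are assumptions for illustration:
// Sketch only: the exact signatures of Proof::create and MockProver::run,
// and the surrounding types, are assumed rather than taken from the crate.
fn create_proof_or_explain(
    k: u32,
    params: &Params,
    pk: &ProvingKey,
    circuit: &MyCircuit,
) -> Result<Proof, Error> {
    match Proof::create(params, pk, circuit) {
        Ok(proof) => Ok(proof),
        Err(invalid) => {
            // Proving produced an invalid proof; re-run the circuit in the
            // mock prover to pinpoint the unsatisfied constraint, if any.
            match MockProver::run(k, circuit).and_then(|prover| prover.verify()) {
                // MockProver found the offending constraint: report that instead.
                Err(constraint_failure) => Err(constraint_failure),
                // MockProver found nothing wrong, yet the real proof was invalid:
                // the circuit is non-deterministic between synthesis runs, which
                // we should surface loudly to the developer.
                Ok(()) => Err(invalid),
            }
        }
    }
}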
It's possible to structure halo2 proofs in such a way that they can share resources. We can then create multiple proofs at the same time, with most of the proof size and time in a common block, and then marginal size and time costs per proof.
Currently, the commitment opening implementation stores a fork value in the opening proof which the prover can vary until the point that is sampled at the beginning of the argument is on the curve. Let's remove this and instead make the proving fallible; the caller is then responsible for forking the transcript some other way (changing a blinding factor in a commitment) and trying again until it succeeds.
This is used in a few places and can be extracted into a common helper utility (see #53 (comment)):
// Closure to construct commitment to column of values
let commit_column = |column: &Polynomial<C::Scalar, LagrangeCoeff>| {
    let poly = pk.vk.domain.lagrange_to_coeff(column.clone());
    let blind = Blind(C::Scalar::rand());
    let commitment = params.commit_lagrange(&column, blind).to_affine();
    (poly, blind, commitment)
};
The helper utility will most likely live in the poly crate.
Possibly return cosets as well.
We should refactor how proofs are formed so that they e.g. come in as byte streams and we don't need to encode lengths in the proofs.
The current Contributor Agreement is missing a few sections, and its section numbering is off and needs to be corrected.
No need for this verbose name in poly::commitment.
A Sage prototype for the Pasta fields is here: https://github.com/zcash/pasta/blob/master/squareroottab.sage
This includes a few minor optimizations and simplifications that are not in the paper. We essentially follow section 3 but specialize it to the case ℓ_{0..k-2} = w, which greatly simplifies the algorithm. We also merge Tab1 and Tab2 (which is mentioned in the paper) and use perfect hashing for Tab3 (which is not mentioned).
Currently, the implementation randomly adds together commitments that are being opened at the same point, then performs a multi-point opening. However, this is inefficient when commitments are opened at multiple points, as then we have to do an unnecessary amount of arithmetic.
Instead, we'll make it so that the set of points at which every commitment is opened is queried separately, to minimize the amount of group arithmetic done by the verifier. It'll probably be nicer to break this abstraction out and separate it from the plonk prover; we could place it in the poly module somewhere.
In order to actually use the accumulation trick, we need "auxiliary" advice wires that are brought in from outside. These can engage in the argument itself, though the only example I can think of where that is useful is with public inputs. Somehow, the proof structure exposes these auxiliary evaluations so that the verifier can check them if needed, which is the case with the accumulation.
We've decided to call wires "columns" instead, and witness values "cells." Let's unify the terminology we're using with what's implemented.
Is there a possibility that absorbing two bases in a row could be confused with absorbing a point (in terms of transcript malleability)? Where do we use Transcript::absorb_base, and do we need to differentiate between absorbing individual bases and points (with e.g. some kind of domain separator)?
Opened from #43 (comment)
Right now, aux wires don't have blinding factors because the G element in the accumulator doesn't have one either. However, we can amend the polynomial commitment opening protocol so that it naturally supports a blinding factor of 1 for the G element just like fixed wires have.
Here:
Lines 180 to 190 in 18746f1
IncompatibleParams). So I think it is not an error and should just return Ok(false).
It's a pretty common building block in several circuits, and thus a useful test case (even if we won't use it ourselves in Pollard).
Instead of setting up two transcripts individually, merge the transcripts together so that the implementation can choose how to optimize for the particular setting.
Also, probably rename Hasher to Transcript.
Currently, the verifier performs a complete verification of the OpeningProof. We want to support both accumulation and batching in the opening argument, and here's a sketch of what that might look like:
We'll have an MSM (multiscalar multiplication) structure parallel to Params. This structure will let you add arbitrary terms (the scalar and the point) or add to terms that already exist (such as g and h). It'll also let you scale the entire thing by a random blinding factor. Finally, it'll let you check is_zero() to perform the multiexp and check that it results in zero.
The OpeningProof::verify() method will take an MSM as input for it to append the necessary values. It will not compute the G element (expensive computation) and instead return (1) a structure containing the challenges, and (2) a guard structure which lets the prover either supply the challenges and obtain an MSM, or supply the purported G value and obtain an MSM. The caller will either keep the challenges and the G value and place it in an accumulator object, or it will discard the challenges and check the MSM either immediately or later on.
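A rough sketch of the shape such an MSM structure could take, written against the group crate's traits; the names and methods here are illustrative rather than the final design, and the evaluation is done naively instead of with a real multiexp:
use group::{prime::PrimeCurveAffine, Group};

struct Msm<C: PrimeCurveAffine> {
    terms: Vec<(C::Scalar, C)>,
}

impl<C: PrimeCurveAffine> Msm<C> {
    fn new() -> Self {
        Msm { terms: Vec::new() }
    }

    // Add an arbitrary (scalar, point) term to the check.
    fn append_term(&mut self, scalar: C::Scalar, point: C) {
        self.terms.push((scalar, point));
    }

    // Scale every accumulated term by a random factor, so that several
    // independent checks can be folded into a single multiexp.
    fn scale(&mut self, factor: C::Scalar) {
        for (s, _) in self.terms.iter_mut() {
            *s = *s * factor;
        }
    }

    // Evaluate the accumulated sum (naively here) and check that it is the
    // group identity.
    fn is_zero(&self) -> bool {
        let acc = self
            .terms
            .iter()
            .fold(C::Curve::identity(), |acc, (s, p)| acc + p.to_curve() * *s);
        acc.is_identity().into()
    }
}
OpeningProof::verify() would then append its own terms to an Msm like this and return the guard described above, from which the caller either recovers the challenges for an accumulator or finalizes the check with is_zero().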
Currently, we take field elements from a transcript instantiated over the "other curve" and interpret them as field elements in the "current curve." This is fine for one curve's scalar field but not the other one, because it's smaller. However, it's really close in size so perhaps if the output of the hash is random, there's no way for it to overflow. (Perhaps even when it overflows it doesn't matter?)
Right now, in parallel_generator_collapse we do:
for (g_lo, g_hi) in g_lo.iter().zip(g_hi.iter()) {
    // TODO: could use multiexp
    tmp.push(((*g_lo) * challenge_inv) + &((*g_hi) * challenge));
}
This could be done instead with a small multiexp. Right now we only have Pippenger's multiexp implemented, but this would be more efficient using the standard "share doublings" approach, maybe with some w-NAF added in.
Furthermore, it may be possible to "defer" these individual multiexps to later rounds and just work over the original basis temporarily until the number of individual multiexps is reduced.
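For the two-term case, the "share doublings" idea (Shamir's trick) looks roughly like the sketch below, written against the ff/group traits; it assumes the scalar's to_repr() is little-endian bytes (true for the Pasta fields, but an assumption here) and ignores the w-NAF refinement:
use ff::PrimeField;
use group::Group;

// Compute a*x + b*y with one shared run of doublings (Shamir's trick).
fn shamir<G: Group>(a: G, x: G::Scalar, b: G, y: G::Scalar) -> G {
    // Precompute the sum used when both bits are set.
    let ab = a + b;
    let (x, y) = (x.to_repr(), y.to_repr());
    // Extract bit i from a little-endian byte representation.
    let bit = |repr: &[u8], i: usize| (repr[i / 8] >> (i % 8)) & 1;
    let mut acc = G::identity();
    for i in (0..G::Scalar::NUM_BITS as usize).rev() {
        // One doubling shared between both scalars...
        acc = acc.double();
        // ...followed by at most one addition per bit position.
        match (bit(x.as_ref(), i), bit(y.as_ref(), i)) {
            (1, 1) => acc += ab,
            (1, 0) => acc += a,
            (0, 1) => acc += b,
            _ => {}
        }
    }
    acc
}
Inside parallel_generator_collapse this would replace the two separate scalar multiplications done for each (g_lo, g_hi) pair.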
As of #53, the ProvingKey includes values of all fixed columns so that the prover can use them in lookup arguments.
However, not all fixed columns are used in a lookup argument. We can remove unused fixed columns in keygen.rs. We will also have to change how the prover indexes into this subset of fixed columns.
This would enable recursion to be implemented without being bound to a specific choice of algebraic hash, enabling other projects to keep using primitives they might already be depending on.
Right now you can only commit to polynomials of degree 2^k - 1, but in the PLONK protocol the verifier learns several openings of these polynomials, and thus zero-knowledge is broken. In order to fix this we have to commit to a polynomial of slightly larger degree (just enough to compensate for the number of openings), but this is not easy to do with the existing APIs, especially because of the FFTs and the assumptions about the size of the evaluation domains.
This will require changing how the Polynomial structure works (probably forcing it to keep track of the degree of the polynomial) among other things.
Currently the PLONK verifier will manually compute the f_commitment for input into the commitment opening proof verifier. Instead, it should just incorporate this computation into the MSM that it feeds into the commitment verifier. Then, the responsibility for scaling the MSM randomly falls to the plonk verifier instead.
It also may be best to return the Guard from the PLONK verifier.
I think it's ≤ 30, since rows are sometimes (for relative references) represented as i32, and we need some headroom.
Let's say that at some point we have a healthy ecosystem of third-party gadget implementations. A circuit implementation for a given protocol will need to make sure that it can pick up all security fixes and circuit-independent performance fixes, in both the proving system library and any third-party gadgets, without changing the circuit and invalidating existing proofs.
I propose that we assign a date-based version number (say yyyymmdd) to each circuit, and we pass this into all gadgets [*]. Now the stability policy for a gadget can be either:
How this would be expressed in a gadget implementation, is that if you're adding a circuit-breaking change on yyyymmdd, then you must make the change conditional on whether the circuit version is ≥ yyyymmdd. Using dates ensures that gadget authors don't have to coordinate, but circuit authors can control when to make breaking changes and then just pick up all such changes up to a given date (after reviewing them for security, of course). Non-breaking updates will be much easier in this system.
[*] We likely will have some kind of context argument passed to gadgets anyway when instantiating them, so the version doesn't have to be passed as a separate argument.
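For example, a gadget could gate a breaking change on the circuit version roughly like this (all names here, including GadgetContext and the constant, are hypothetical):
// Hypothetical context passed to gadgets on instantiation; the name and
// field are illustrative, not an actual halo2 API.
struct GadgetContext {
    // Date-based circuit version, e.g. 20210615 for 2021-06-15.
    circuit_version: u32,
}

// Date on which a hypothetical circuit-breaking fix landed in this gadget.
const BREAKING_FIX: u32 = 20210615;

fn configure_gadget(ctx: &GadgetContext) {
    if ctx.circuit_version >= BREAKING_FIX {
        // New behaviour: circuits versioned on or after the fix pick it up.
    } else {
        // Old behaviour, preserved so that existing circuits keep producing
        // (and verifying) exactly the same proofs.
    }
}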
Input: a set of regions, each of which takes up a given set of columns region.cols for a given number of rows region.rows.
Output: a mapping from regions to their start rows.
Greedy first fit algorithm (sketch):
For each column, keep track of the set of allocated [start, end) intervals so far.
We greedily find a placement for each region in turn (there is no backtracking at the region level).
It is probably easier to explain this just by giving a Python implementation:
#!/usr/bin/env python3
from math import inf


class Region:
    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols

    def __repr__(self):
        return "Region(%d, %r)" % (self.rows, self.cols)


class Column:
    def __init__(self, index):
        self.index = index
        # allocs is an unordered list of [start, end) pairs.
        self.allocs = []

    def __repr__(self):
        return "Column(%d): %r" % (self.index, self.allocs)


def plan(regions):
    for reg in regions:
        start = first_fit_region(reg, 0, 0, inf)
        # Some placement must be possible given infinite rows.
        assert start is not None
        print("%4r: %r" % (start, reg))


def first_fit_region(reg, col_index, start, end):
    assert start + reg.rows <= end
    if col_index == len(reg.cols):
        return start
    c = reg.cols[col_index]
    # Iterate over the unallocated nonempty intervals in c that intersect [start, end).
    for (c_start, c_end) in free_intervals(c.allocs, start, end):
        # Do we have enough room for this column of reg?
        if c_start + reg.rows <= c_end:
            # Recursively check whether the placement works for all other columns of reg.
            row = first_fit_region(reg, col_index + 1, c_start, c_end)
            if row is not None:
                # It works. On the way back out of the recursion, record the placement.
                assert row + reg.rows <= c_end
                c.allocs.append((row, row + reg.rows))
                return row
    # No placement worked; the caller will need to try other possibilities.
    return None


# allocs is a set of [a_start, a_end) pairs representing disjoint allocated intervals.
# Return all the *unallocated* nonempty intervals intersecting [start, end).
def free_intervals(allocs, start, end):
    row = start
    for (a_start, a_end) in sorted(allocs):
        if a_start >= end:
            # further allocated intervals don't intersect
            break
        if row < a_start:
            yield (row, a_start)
        row = max(row, a_end)  # Edit: fixed bug here.
    if row < end:
        yield (row, end)


def test():
    (c1, c2, c3) = (Column(1), Column(2), Column(3))
    reg1 = Region(15, [c1, c2])
    reg2 = Region(10, [c3])
    reg3 = Region(10, [c3, c1])
    plan([reg1, reg2, reg3])


if __name__ == "__main__":
    test()
Here's the placement that this algorithm finds for the test case:
   0: Region(15, [Column(1): [(0, 15)], Column(2): [(0, 15)]])
   0: Region(10, [Column(3): [(0, 10)]])
  15: Region(10, [Column(3): [(0, 10), (15, 25)], Column(1): [(0, 15), (15, 25)]])
Hi,
I have the latest XDK and tried compiling, but it doesn't build. Any help would be appreciated.
This optimization allows us to use the Rescue duplex sponge for Fiat–Shamir with a rate of 3 rather than 4 (so, a width of 4 field elements rather than 5). This reduces the number of wires needed for a RESCUE gate. (Actually this optimization is independent of whether we use Rescue or Poseidon.) The code that currently absorbs two curve points, in each round of the polynomial commitment argument, is here:
halo2/src/poly/commitment/verifier.rs, lines 59 to 62 in eeb1b24
halo2/src/poly/commitment/prover.rs, lines 116 to 119 in eeb1b24