aszepieniec / stark-brainfuck Goto Github PK

Tutorial for designing and impementing a STARK-compatible VM, along with a fully functional Brainfark instruction set architecture, virtual machine, prover, and verifier.

License: Apache License 2.0

Python 100.00%

stark-brainfuck's People

Contributors

Stargazers

Watchers

Forkers

andrewmilson zisequkuai whisker17 lurium monyverse hbcbh1999 seanczkm spathidourgos tinychainorg alexiachen

stark-brainfuck's Issues

Merge Input and Output Tables

Currently, InputTable and OutputTable are separate classes. However, their logics are identical. It should be one class, IOTable that generates two objects.

Add AIR Constraints for Extended Tables

Currently, there are only AIR constraints for the AETs before extension.

Reduce Columns for Evaluation Argument

Currently, the evaluation argument uses two columns: one for the running indeterminate, and one for the running evaluation. However, it might be possible with only one extra column.

Move Selection of Challenges to Callers

The Table Extension classes should not have an idea of which challenges to pick; they should be given the challenges and work with those.

Permutation Argument Soundness

The argument in point 1. of the section "Permutation Arguments" appears to be incomplete.

Denote the tables as matrices by A, B and the verifier randomness (a, b, c, ...) by the vector r. What needs to be analyzed is the probability that the components of the vector Ar are a permutation of the components of the vector Br when the rows of A and B are not permutations of each other. That is, what are the odds that for every permutation matrix P, A ≠ PB, but that we sample an r such that Ar = PBr for some P?

The argument given correctly analyzes the probability of sampling an r such that Ar = PBr for a fixed P. However, if Ar = PBr for any permutation matrix P, then the prover can trick the verifier, so only considering a single P is insufficient.

Basically, it's a priori possible that A - PB has a large codimension 1 kernel for a large number of permutations P. If these kernels are distinct they could fill up a significant portion of the space the verifier is sampling from, since the number of permutations P is on par with the field size for tables of even moderate height.

Link Extended Tables

Currently, the extended tables have AIR constraints. The next step is to ensure that the extension columns compute correct values across distinct tables. Concretely, we need:

a way to represent these relation checks
a way to perform them

Make Table-Extension Generic

Currently, there are separate functions computing the extension for each table. Instead, this should be one function with a higher-order function as an argument.

Merge AIR constraints

AIR constraints that have identical supports (= set of points where they should evaluate to zero) can be linearly combined with random coefficients sampled by the verifier. This can cut the computation of quotient codewords down by a large factor.

Represent Boundary Constraints as Polynomials

Currently, boundary constraints are represented as a tuple (register, cycle, value). Unfortunately, this is not compatible with the boundary constraints of table extensions, where the concrete value is not known. Instead, we need a representation in terms of (cycle, polynomial), which is strictly more general.

adapt FRI to extension field

Part 3: the constructino of the Instruction Table

Thank you very much with this excellent guides about zkVM. I am a little confused about the description about a permutation argument (for a subset relation) as shown in the figure. Actually, a permutation argument should be used for the multi-set equality?

Another question is:
In the Instruction Table (IT) section, it says that the Instruction Table contains one row for every row in the Processor Table (PT), in addition to one row for every instruction in the program. Assume an instruction not in the program is included in the Processor Table, then this instruction will still be included in the Instruction Table. Then, the subset relation between PT and IT and sublist relation program and IT will be satisfied. In this way, it seems the soundness is broken.

Actually, I think an easier understanding way could be used to prove that each instruction executed in the Processor Table is belongs to the Program. As follows,

An Instruction Table with three colums (ip, ci, ni) could be publicly constructed based on the Program (ip, ci). The first two colums of the Instruction Table are exactly the same with the two colums of Program. The third colum satisfy the constrain that ci = ni', which means that ci is the same with the next instruction.
Use a lookup argument to prove that each row of the related colums in the Processor Table belongs to the Instruction Table.

Move Extension Logic to Extended Table Classes

Currently, the logic for computing table extensions is in VirtualMachine. It should be in the respective Table Extension class.

Make Table-Verification generic

Too much boilerplate code to apply identical logics to the various tables. Instead, write the logic once, and loop over the various tables.

functional verifier

Create Context Tables by Selecting Columns

Currently, the Context Tables are created by adding one row for every execution step, and sorting after the entire execution is done. This generates problems when the tables are padded to length 2^k, because the paddings must be consistent. Instead, the better way to generate the context tables is by selecting the right columns from the processor table.

It might be the case that more sophisticated context tables cannot be generated from selecting columns, but certainly the instruction and memory tables do satisfy this property.

Prover skeleton

Soundness in relation to `shift`

Let's check if all the shift values in the non-linear combination have the correct value. If it's too low, it will affect soundness.

Fails on `17 * '!'` output

The program fails on a "Hello World!" output. It works on a 16 * '!' output but fails on a 17 * '!' output.

Unsound proof with `is_zero` set erroneously to zero

We made a test that demonstrates the error by implementing an adversarial simulator. To run this test, go to the demonstrate-iszero-error branch and execute these commands from the code directory:

rm proof.dump; pypy3
from test_brainfuck_stark import *
set_adversarial_is_zero_value_test()

https://github.com/aszepieniec/stark-brainfuck/tree/demonstrate-iszero-error

Padding

The tables need to have length 2^k, even if the computation did not run for this number of cycles. As a result, they need to be padded with extra rows and at this point the AIR constraints might become invalid. Each table needs its own padding rule, and it is possible that the AIR constraints need to be modified.

disallow -1 as index

The index rounded_trace_length - 1 could be sampled by FRI. This should be disallowed, because there is no consistent state evolution between states rounded_trace_length - 1 and 0.

New name for "terminal"?

I like to use the abbreviation tq for transition quotient but we calculate both the terminal quotient and the transition quotient. Could we call "terminal" something else, so the abbreviation tq can be reserved for transition quotients?

Labeled lists

To help the debugging process, the quotient codeword lists should be stored along with a label such that e.g. the shift parameter in the non-linear combination can be associated with a specific quotient in debugging output. Currently we have to do some intricate counting to identify shifts with specific quotients.

Verifier skeleton

Wrong constraints in Memory table

Hi, thanks for the really interesting blog.

I found some minor issue in Memory Table paragraph.

(clk - 1 - clk*)(mv -mv*) should be (clk + 1 - clk*)(mv -mv*)
enforce both (mp - mp*) mv* and (mp -mp*)(mv - mv*) imply that also mv should be 0 every time that mp change (and not just mv*). I noted that in the python code you removed the (mp -mp*)(mv - mv*) constrain

Represent all polynomials as codewords

Polynomial operations are slow, but they are much faster when done on Reed-Solomon codewords. For this we need an evaluation domain that is smaller than the FRI domain (AKA. omega domain) but larger than the trace domain (AKA. omicron domain).

Instantiate BrainfuckStark object with computational claim

Currently, the instance (meaning the computational claim) is provided as an explicit argument to the prover and the verifier. However, this information should be properties of the BrainfuckStark object, supplied when it is initialized.