crev-dev / cargo-crev

A cryptographically verifiable code review system for the cargo (Rust) package manager.

License: Apache License 2.0

Nix 2.01% Rust 96.57% Shell 1.33% Dockerfile 0.10%

cargo-crev's Introduction

cargo-crev

A cryptographically verifiable code review system for the cargo (Rust) package manager.

Introduction

Crev is a language and ecosystem agnostic, distributed code review system.

cargo-crev is an implementation of Crev as a command line tool integrated with cargo. This tool helps Rust users evaluate the quality and trustworthiness of their package dependencies.

Features

cargo-crev can already:

warn you about untrustworthy crates and security vulnerabilities,
display useful metrics about your dependencies,
help you identify dependency-bloat,
allow you to review most suspicious dependencies and publish your findings,
use reviews produced by other users,
increase trustworthiness of your own code,
build a web of trust of other reputable users to help verify the code you use,

and many other things with many more to come.

Getting started

Static binaries are available from the releases page.

Follow the cargo-crev - Getting Started Guide (more documentation available on docs.rs).

cargo-crev is a work in progress, but it should be usable at all times. Use discussions to get help, more information and report feedback. Thank you!

Raise awareness

If you're supportive of the cause, we would appreciate your help in raising awareness of the project. Consider putting the note below in the README of your Rust projects:

It is recommended to always use [cargo-crev](https://github.com/crev-dev/cargo-crev)
to verify the trustworthiness of each of your dependencies, including this one.

Thank you!

Changelog

Changelog can be found here: https://github.com/crev-dev/cargo-crev/blob/main/cargo-crev/CHANGELOG.md

cargo-crev's People

dylan-dpc-zz tommilligan tylerlaberge pzmarzly thomasdenh fredrikekre non-jedi ryanwilsonperkin chessai 0xflotus shaunstanislauslau tarsbase kornelski awesome-archive pimotte jguodev pombredanne emuhedo hjh2019 womingyoutian oherrala johntitor ree26er rust-stuff nemo157 canop alexendoo baitcenter maulingmonkey icefoxen vishalsodani afck ffranr sinhasantos phi-gamma odanoburu zoechi dbrgn kamilaborowska gyc567 steadylearner ambiso niklasf katharostech ia0 tokcum heroickatora awfa jnaulty ralpha shadowjonathan neuroradiology isgasho robinkrahl mgeisler kpcyrd zeta1999 axion014 strugee atouchet vlad20012 icodein p-avital jayhill365 pxiaoer chris-morgan frankfanslc sgued mattruzzi janzerebecki tcharding thomasjfox matthiasbeyer golddranks sleiner-forks nasifimtiazohi thealgorythm cyberflamego dpc fenollp 0abquaycongo tommorris orhun b15hi rstkit witcher01

cargo-crev's Issues

Rename `cargo-trust`?

There's a confusion created by the name of the cargo subcommand (trust) which is a verb.

Eg. to trust someone it would have to be:

cargo trust trust <id>

Originally I wanted to name it cargo crev, then switched to cargo trust. If there are no other ideas, I will revert back.

`crev verify`

This should read the WoT and, from the perspective of the user's ID, display the trust status of all files in the project.

Id should be configurable, so it should be possible to do this for other IDs as well.

Discuss compromised identities scenarios.

People are going to get their machines compromised, and CrevIDs stolen.

My plan was that people should just create a self-Trust Proof with distrust set to non-None and publish that. Any client that finds a Trust Proof like that should immediately distrust the whole CrevId. Maybe even include the Proof like that into their own trust db to publish it for others to see. Only for CrevIDs that they considered trusted before, to prevent spamming.

The rest of the problem should be covered by the fact that the default number of reviews required to consider code trusted should be at least 2. This way one compromised or malicious individual cannot compromise anything. For this to work, the graph/trust algorithm will have to get smarter too and consider only non-overlapping paths, so that people can't create a new CrevId, trust it, and have it count as another reviewer.
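The self-distrust idea above can be sketched roughly as follows. This is only an illustration: `TrustProof`, its fields, and both function names are hypothetical, and "distrust set to non-None" is simplified to a boolean flag.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified trust proof: who signed it, whom it is about,
/// and whether it expresses distrust (the real field is not a boolean).
#[derive(Debug, Clone)]
pub struct TrustProof {
    pub from: String,
    pub to: String,
    pub distrust: bool,
}

/// IDs that published a self-signed proof with distrust set.
pub fn revoked_ids(proofs: &[TrustProof]) -> HashSet<String> {
    proofs
        .iter()
        .filter(|p| p.distrust && p.from == p.to)
        .map(|p| p.from.clone())
        .collect()
}

/// Drop every proof signed by a revoked ID, but keep the self-distrust
/// proofs themselves so they can be republished for others to see.
pub fn without_revoked(proofs: &[TrustProof]) -> Vec<TrustProof> {
    let revoked = revoked_ids(proofs);
    proofs
        .iter()
        .filter(|p| !revoked.contains(&p.from) || (p.distrust && p.from == p.to))
        .cloned()
        .collect()
}
```

The sketch does not cover the anti-spam condition (republishing only for CrevIDs that were trusted before); that would need the local trust set as an extra input.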

`cargo crev trust` should check if ids are valid

It is possible to run cargo crev trust <garbage> and generate a Trust Proof out of it.

Split in 3 crates.

data should only abstractly handle data without any IO/files/paths considerations.

lib should handle the core logic, without concern for CLI.

bin should be just simple CLI over lib

Review recursive digest crate.

cargo crev uses a custom recursive digest algorithm to calculate a unique digest of the crate content. It is vitally important that this operation is cryptographically secure and free of bugs: a fix made later would invalidate all previously calculated and signed digests.

I'd like at least one person to go through that fairly small sub-crate and review it.
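To make the idea of a recursive digest concrete, here is a minimal sketch over an in-memory tree. It is not the algorithm from the sub-crate: `Node`, `hash_bytes`, and the `ignored` parameter are assumptions made for illustration, and `DefaultHasher` is used only as a placeholder because it ships with the standard library. It is not cryptographically secure.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

/// In-memory stand-in for a crate directory tree (hypothetical type).
pub enum Node {
    File(Vec<u8>),
    Dir(BTreeMap<String, Node>),
}

/// Placeholder hash. `DefaultHasher` is NOT cryptographically secure;
/// a real implementation would plug in a cryptographic hash here.
fn hash_bytes(parts: &[&[u8]]) -> [u8; 8] {
    let mut h = DefaultHasher::new();
    for part in parts {
        // Length-prefix each part so that ("ab", "c") differs from ("a", "bc").
        h.write(&(part.len() as u64).to_le_bytes());
        h.write(part);
    }
    h.finish().to_le_bytes()
}

/// Digest of a node: files hash their content, directories hash the
/// sorted sequence of (entry name digest, entry digest) pairs.
pub fn recursive_digest(node: &Node, ignored: &[&str]) -> [u8; 8] {
    match node {
        Node::File(content) => hash_bytes(&[&b"F"[..], &content[..]]),
        Node::Dir(entries) => {
            let mut acc: Vec<u8> = Vec::new();
            // BTreeMap iterates in key order, so the result does not
            // depend on the order in which entries were inserted.
            for (name, child) in entries {
                if ignored.contains(&name.as_str()) {
                    continue;
                }
                acc.extend_from_slice(&hash_bytes(&[name.as_bytes()]));
                acc.extend_from_slice(&recursive_digest(child, ignored));
            }
            hash_bytes(&[&b"D"[..], &acc[..]])
        }
    }
}
```

The file/directory tags and the length prefixes are there so that differently shaped trees cannot serialize to the same byte stream, which is the kind of property a reviewer of the real crate would want to check.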

`List` commands

cargo crev
  list
    ids all # all known
    ids trusted
    ids mine # currently `id list`
    reviews <package_name> <package_version>

`cargo crev list-reviews <crate> [<name>]`

This would allow nicely:

discovering people that already reviewed a given crate
and also looking for versions that were already reviewed.

`crev add` and `crev rm` should support directory arguments

Just recursively add all files.

Git remote cache

Clone github repositories to:

~/.cache/crev/remote/<id>/<blake2(id-url)>/...

Negative reviews - how? What values should `trust` field have?

Right now Review Proofs and Trust Proofs come with a trust field (among others). I was planning for every such field to have 4 possible values:

none
some
good
ultimate

The question is: should none be a "negative review" and mean "I know this crate should not be used", or should there be an additional field distrust, with trust: none meaning "I have no trust in this code, but I don't necessarily distrust it"?

In my opinion the fewer potential fields and values the better, to a point. Too many fields and potential values make the user's decision too difficult, and make reasoning about the whole trust model harder.

Too few fields/values might make certain important scenarios impossible to express.
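The two options can be compared in code. The sketch below models the four proposed values as an ordered enum, and shows the second option (a separate distrust flag) as a small struct. `TrustLevel`, `Opinion`, and `is_negative` are hypothetical names, not existing crev types.

```rust
use std::str::FromStr;

/// The four proposed values, ordered from least to most trust.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum TrustLevel {
    None,
    Some,
    Good,
    Ultimate,
}

impl FromStr for TrustLevel {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "none" => Ok(Self::None),
            "some" => Ok(Self::Some),
            "good" => Ok(Self::Good),
            "ultimate" => Ok(Self::Ultimate),
            other => Err(format!("unknown trust level: {other}")),
        }
    }
}

/// Second design under discussion: keep `trust` purely positive and carry
/// an explicit, separate distrust flag (hypothetical field names).
pub struct Opinion {
    pub trust: TrustLevel,
    pub distrust: bool,
}

impl Opinion {
    /// With a separate flag, `trust: none` alone is "no opinion",
    /// not a negative review.
    pub fn is_negative(&self) -> bool {
        self.distrust
    }
}
```

With the single-field design, `is_negative` would instead have to be `self.trust == TrustLevel::None`, which is exactly the ambiguity the question is about.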

`cargo crev change readme`

Edit the README.md inside the proof directory.
Add to git index

`cargo crev review <crate>` should error on version ambiguity

If the project is using multiple versions of same <crate>, the <version> argument must be passed to specify which.

Right now it will create a proof for the "random" one.
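A possible shape for the check, as a sketch: the dependency list is modelled as plain `(name, version)` pairs, and `select_version` and `SelectError` are hypothetical names rather than anything that exists in cargo-crev.

```rust
/// Why a crate could not be resolved to exactly one version.
#[derive(Debug, PartialEq)]
pub enum SelectError {
    NotFound,
    /// More than one version matches; the candidates are listed.
    Ambiguous(Vec<String>),
}

/// Pick the version of `name` to review. If the project uses several
/// versions of the crate, `version` must be given to disambiguate.
pub fn select_version<'a>(
    deps: &'a [(String, String)],
    name: &str,
    version: Option<&str>,
) -> Result<&'a str, SelectError> {
    let mut candidates: Vec<&'a str> = deps
        .iter()
        .filter(|(n, _)| n == name)
        .map(|(_, v)| v.as_str())
        .collect();
    if let Some(wanted) = version {
        candidates.retain(|v| *v == wanted);
    }
    match candidates.as_slice() {
        [] => Err(SelectError::NotFound),
        [only] => Ok(*only),
        many => Err(SelectError::Ambiguous(
            many.iter().map(|v| v.to_string()).collect(),
        )),
    }
}
```

Returning the candidate list in the error lets the CLI print the versions the user can choose from.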

`cargo crev db git ...` to not treat `-args` as its own argument

Right now cargo crev db git -- commit -a has to be used, which is inconvenient. Anything after git can't be an argument to cargo-crev itself.

cargo trust potential design issues

Hello,

I have just come upon https://github.com/dpc/crev/wiki/cargo-trust:-Concept ; and have a few comments about it.

First, ~~$ cargo trust verify~~ $ cargo trust project (EDIT: oops) appears to trigger Updating registry. This means that somehow, signing something triggers a download. If it does, then how do I know that what I'm signing is what I have reviewed? It should definitely not need to download anything, as if it needs to then there's a TOCTOU attack.

Second, trust project and trust id are under the same subcommand. This is one of the design errors of GnuPG (and in a way OpenPGP): putting ownertrust and key validity under the term “sign”. Here, you want to consider project's validity, and other key's trust. As such, I don't think it makes sense to jam the two under the same command, and it will likely lead to the same kind of confusion brought by GnuPG's interface and everyone confusing ownertrust and key validity. Maybe validate project and trust id would be better names.

Finally, one of the big drawbacks of reimplementing one's own crypto (by having one's own keypair) is that it means it can't be put on a secure hardware token (eg. smartcard). Which is not nice from a security point of view, if you assume that identities are supposed to survive computer compromise.

HTH,
Leo

Can future versions of `cargo` break our digests?

cargo does some stuff to the original Cargo.toml and the directory where it downloads the crate. If anything about it changes in the future, the digests that we've calculated could change, and all the existing Project Review Proofs would stop working.

Can we do anything about it? Will the cargo team agree to keep the download directory immutable? rust-lang/cargo#6340

Right now when calculating the checksum cargo-crev will skip the .cargo-ok file.

I wish cargo would just leave the whole directory alone, and any additional files or modifications happened in another place, letting the crate source stay immutable.

Project Review Proof - gather more info from cargo

We should fill in project-name, project-version - for information purposes

If possible to get it reliably from cargo, the revision would be awesome too.

`crev commit` should add instruction in comments

Just like git commit:

<review goes here>

# lines starting with # are ignored
# blah blah
# thoroughness means:
#   none - you don't trust
#   low - you're not sure
#   medium - you trust somewhat
#   high - you trust
# blah blah
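Processing such a template after the editor closes could look like the sketch below. `strip_comment_lines` is a hypothetical helper. Treating indented `#` lines as comments and trimming surrounding blank lines are assumptions of this sketch.

```rust
/// Remove instruction lines (those whose first non-blank character is `#`)
/// from the edited text and trim surrounding whitespace.
pub fn strip_comment_lines(text: &str) -> String {
    text.lines()
        .filter(|line| !line.trim_start().starts_with('#'))
        .collect::<Vec<_>>()
        .join("\n")
        .trim()
        .to_string()
}
```

An empty result after stripping could then be treated as "abort", the same way git handles an empty commit message.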

Implement distrust in `trustdb`

Right now it's just ignored.

`Fetch` commands

Move fetch from cargo crev db to cargo crev command.

cargo crev
  fetch
    url <url>
    trusted
    known

Fetch:

single URL
all trusted IDs (recursively)
all known IDs (recursively?)

This is going to be a primary way for people to discover other reviews and IDs.
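The recursive variants boil down to a graph walk that must not loop when IDs trust each other. A minimal sketch, assuming a caller-supplied `fetch` closure that stands in for "download this ID's proof repository and return the IDs it trusts" (both the function name and that contract are hypothetical):

```rust
use std::collections::{HashSet, VecDeque};

/// Breadth-first fetch starting from `start`. Each ID is fetched at most
/// once, even if the trust graph contains cycles. Returns the IDs in the
/// order they were fetched.
pub fn fetch_trusted<F>(start: &str, mut fetch: F) -> Vec<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    let mut order: Vec<String> = Vec::new();

    seen.insert(start.to_string());
    queue.push_back(start.to_string());

    while let Some(id) = queue.pop_front() {
        for next in fetch(&id) {
            // `insert` returns false if the ID was already scheduled.
            if seen.insert(next.clone()) {
                queue.push_back(next);
            }
        }
        order.push(id);
    }
    order
}
```

A depth or cost limit could be added to the queue entries if fetching the whole reachable set turns out to be too much data.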

Prettier `Id` serialization

Because of serde shortcomings (serde-rs/serde#1410) it doesn't look optimal.

`crev commit` should detect urls from `Cargo.toml`

`crev update` should download updates

Related to #11

The current idea is that we will try to fetch repositories from urls of trusted IDs.

Create some helper commands for `cargo crev db`

cargo crev db commit # git ci -a
cargo crev db push # git push
cargo crev db pull # git pull --rebase (?)

`crev commit` should detect the current revision

Make trust graph consider paths (aka flow).

This is related to #44 .

Right now the WoT graph is built by just a cost-bounded flooding of the graph. This makes it possible for anyone to create a new CrevId, trust it, and this way artificially increase the count of reviews for a given crate.

The algorithm should keep track of the path (the ID(s) that directly trusted this one) on each step back to the root of the trust tree, so that when calculating the count of reviews it's possible to "merge" reviews coming from a common path for the purpose of calculating the total trust count.

Example: If you directly trust only one other CrevId, you can only have trust count equal to 1, for any given crate, no matter how many people reviewed it.
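One simple way to approximate this is to remember, for every reachable ID, which directly trusted ID the walk came through, and to count distinct such "first hops" among the reviewers. The sketch below does that. It is a deliberate simplification (each ID is attributed to the first hop that reaches it first, rather than computing truly disjoint paths), and all names are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// For every ID reachable from `root`, record through which directly
/// trusted ID (first hop) a breadth-first walk reaches it first.
pub fn first_hops(
    trusts: &HashMap<String, Vec<String>>,
    root: &str,
) -> HashMap<String, String> {
    let mut hop: HashMap<String, String> = HashMap::new();
    let mut queue: VecDeque<String> = VecDeque::new();

    // Directly trusted IDs are their own first hop.
    for direct in trusts.get(root).into_iter().flatten() {
        if direct != root && !hop.contains_key(direct) {
            hop.insert(direct.clone(), direct.clone());
            queue.push_back(direct.clone());
        }
    }
    // Everything further away inherits the first hop of its parent.
    while let Some(id) = queue.pop_front() {
        let via = hop[&id].clone();
        for next in trusts.get(&id).into_iter().flatten() {
            if next != root && !hop.contains_key(next) {
                hop.insert(next.clone(), via.clone());
                queue.push_back(next.clone());
            }
        }
    }
    hop
}

/// Number of distinct first hops among the trusted reviewers of a crate.
pub fn trust_count(
    trusts: &HashMap<String, Vec<String>>,
    root: &str,
    reviewers: &[String],
) -> usize {
    let hop = first_hops(trusts, root);
    reviewers
        .iter()
        .filter_map(|r| hop.get(r))
        .collect::<HashSet<_>>()
        .len()
}
```

With a single directly trusted ID the count can never exceed 1, matching the example above. A stricter version would compute the number of vertex-disjoint paths (a max-flow problem) instead of using first hops.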

`cargo crev` doesn't work on Edition 2018 crates

rust-lang/cargo#6113

This will probably solve itself soon, after Edition 2018 is stable.

Using artifact for design

I'm planning to give https://github.com/vitiral/artifact a try. 2.0.0 is just around the corner, and git version is building for me just fine.

`cargo crev id gen` should set-up a tracking branch

Because it isn't right now, the first push has to be:

cargo trust db git -- push  --set-upstream origin master

while the following should just work:

cargo trust db git -- push

Should we use shorter `digest`?

Would it make sense to use a shorter digest, i.e. just make the output shorter? Is the loss in security significant?

Change license

I'm planning to migrate to:

license = "MPL-2.0 OR MIT OR Apache-2.0"

everywhere, just to give people more choice. I already did this in the digest crates before releasing.

@Dylan-DPC @rffrancon . Ack? :)

`crev status`

How should commands be structured?

Moving discussion from #47 to a new thread.

cargo crev <verb> <obj> <args> seems to work quite well.

Edit: I'll be updating this list. Not all implemented yet.

cargo crev
  new
    id
  change
    id
    readme
  review <pkg> <version>
  trust <ids>
  push
  pull
  fetch
    url <url>
    trusted
    all
  query
    id
      current [--urls]
      own [--urls]
      trusted  [--urls]
      distrusted [--urls]
      all [--urls]
    review  [--by <id>] [--trusted] [--distrusted] <pkg> <version>  
    package
      outdated [--trusted]

...

Contributing/Technical documentation

It would be awesome if you could add a CONTRIBUTING.md and/or some documentation on the more technical side. Doesn't need to be much since this is still early and sure to change, but some notes on where everything is/how the code is organized could make it easier for others to contribute :)

Add `ReviewStore` and `TrustStore`

They should be in-memory data structures that allow easy lookup.

So something like:

use std::collections::{BTreeMap, HashMap};

type Id = usize;

struct ReviewStore {
    review_by_id: HashMap<Id, ReviewProof>,
    // or MultiMap<String, Id>
    pubid_to_id: BTreeMap<String, Vec<Id>>,
    // any other "index" for lookup
}

Functionally, it's a one-column table in a relational database plus many indices.

These should be able to (de-)serialize to/from a file.

The mode of operation would be: on every command that requires it, crev scans stuff and builds a small database like that, later used to perform many lookups when verifying trust, traversing graph of trust, etc.

We can start with just loading everything every time, and if it won't scale, we can switch to SQLite or something. Then we can introduce a trait ReviewStore and have many implementations.
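A slightly fuller sketch of the "load everything, index in memory" starting point, with the trait already split out so another backend could be swapped in later. `ReviewProof` is reduced here to three illustrative fields, and the trait methods and `InMemoryReviewStore` are hypothetical. Serialization to a file is left out.

```rust
use std::collections::{BTreeMap, HashMap};

pub type Id = usize;

/// Simplified stand-in for a review proof (illustrative fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct ReviewProof {
    pub pubid: String,
    pub package: String,
    pub version: String,
}

/// Lookup interface, so that an in-memory store can later be replaced
/// by e.g. a database-backed one.
pub trait ReviewStore {
    fn insert(&mut self, proof: ReviewProof) -> Id;
    fn by_pubid(&self, pubid: &str) -> Vec<&ReviewProof>;
    fn by_package(&self, package: &str) -> Vec<&ReviewProof>;
}

/// One "table" of proofs plus one index per lookup key.
#[derive(Default)]
pub struct InMemoryReviewStore {
    review_by_id: HashMap<Id, ReviewProof>,
    pubid_to_id: BTreeMap<String, Vec<Id>>,
    package_to_id: BTreeMap<String, Vec<Id>>,
}

impl InMemoryReviewStore {
    /// Resolve an index entry (if any) to the proofs it points at.
    fn lookup(&self, ids: Option<&Vec<Id>>) -> Vec<&ReviewProof> {
        ids.into_iter()
            .flatten()
            .map(|id| &self.review_by_id[id])
            .collect()
    }
}

impl ReviewStore for InMemoryReviewStore {
    fn insert(&mut self, proof: ReviewProof) -> Id {
        // Proofs are never removed, so the current length is a fresh Id.
        let id = self.review_by_id.len();
        self.pubid_to_id.entry(proof.pubid.clone()).or_default().push(id);
        self.package_to_id.entry(proof.package.clone()).or_default().push(id);
        self.review_by_id.insert(id, proof);
        id
    }

    fn by_pubid(&self, pubid: &str) -> Vec<&ReviewProof> {
        self.lookup(self.pubid_to_id.get(pubid))
    }

    fn by_package(&self, package: &str) -> Vec<&ReviewProof> {
        self.lookup(self.package_to_id.get(package))
    }
}
```

Adding another index is a matter of one more map plus one more line in `insert`, which keeps the "table plus indices" picture intact.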

Want to help? Just try out `cargo crev` and give feedback.

cargo-crev is kind of working already. In a sense it's even quite feature complete (alpha quality though)

See https://github.com/dpc/crev/tree/master/cargo-crev for instructions.

Proof store paths

It should be:

$HOME/.config/crev/<pubid>/{trust,review}/{year}-{month}.crev

This way it's easy to share the whole <pubid> directory, e.g. on GitHub, and year-month is a good balance between too many files and rewriting files that are too big.
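Building such a path could look like the sketch below. `ProofKind` and `proof_file_path` are hypothetical names, and zero-padding the year and month (so that files sort chronologically) is an assumption of this sketch rather than something the layout above specifies.

```rust
use std::path::{Path, PathBuf};

/// Which sub-directory a proof belongs to.
#[derive(Clone, Copy)]
pub enum ProofKind {
    Trust,
    Review,
}

/// Build `<home>/.config/crev/<pubid>/{trust,review}/<year>-<month>.crev`.
pub fn proof_file_path(
    home: &Path,
    pubid: &str,
    kind: ProofKind,
    year: u16,
    month: u8,
) -> PathBuf {
    let dir = match kind {
        ProofKind::Trust => "trust",
        ProofKind::Review => "review",
    };
    home.join(".config")
        .join("crev")
        .join(pubid)
        .join(dir)
        // Zero-padded so that a plain directory listing sorts by date.
        .join(format!("{year:04}-{month:02}.crev"))
}
```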

`ed25519-dalek`, `miscreant`, and other crypto - stabilize

ed25519-dalek, miscreant are both in pre-final release.

ed25519-dalek is almost 1.0.0 but other crypto libraries are not. Is it a bad idea to use them?

When is it OK to release crev to the public?

`crev rm <file>`

Weighing opinions of reviewers

One issue of any review system, is that not all reviewers are created equal:

some are more knowledgeable in certain domains,
some have (possibly unconscious) bias,
some may have agendas (zealots),
...

It can be difficult enough to weigh in the opinion of multiple reviews when one personally knows the reviewers; when they are anonymous and numerous, such as in crev, it is just unwieldy. And unfortunately the "average" only gets you the opinion of the masses, which risks drowning the voices of the experts in the noise.

I have been thinking hard about this problem of scaling trust from a handful of individuals to an unlimited number of them, and my answer is to trust a few founders, and let them delegate (part of) their authority in a hierarchical Web of Trust. Then, associating a weight with a handful of webs seems like "reasonable" homework, and more casual users can just go with the community consensus.

I have described the system, at length, at Scaling Trust: Weighted Webs of Trust. It is relatively complete, as far as I can see, and should support this goal in a scalable way.

The one remaining question, however, is how susceptible the system is to malign individuals. That is, how easy it would be for a determined individual, or group, to subvert the system? I have already tried to imagine multiple attack vectors, their consequences, and the available mitigations and responses... however it only takes one vulnerability to upend all this, so I would appreciate additional eyes.

`cargo crev review <crate> <version>` should work even outside of any project

It should be possible to call it from anywhere.

Trust Proofs: Should there be one field, or two for trust.

At least one user expressed that two are needed: https://www.reddit.com/r/rust/comments/99aiea/idea_for_a_scalable_code_reviewtrust_system_not/e4pc8rh/

So there could be eg.:

trust: some
transitive-trust: none

meaning "I trust code reviews of this person, but I don't necessarily trust people that this person trusts".

If we're going with two fields, things are becoming more complicated. How exactly should transitive trust work (especially that it is supposed to be configurable).
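One possible reading of the two-field variant, sketched with both fields reduced to booleans: an ID counts as trusted if a trusted ID points at it with trust set, but the walk only continues through it if transitive-trust is also set. This is just one interpretation of how transitive trust could work. `TrustEdge` and `trusted_ids` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Simplified Trust Proof with the two fields under discussion.
pub struct TrustEdge {
    pub to: String,
    /// "I trust code reviews of this person."
    pub trust: bool,
    /// "I also trust the people this person trusts."
    pub transitive_trust: bool,
}

/// IDs whose reviews `root` would accept under this interpretation.
pub fn trusted_ids(
    edges: &HashMap<String, Vec<TrustEdge>>,
    root: &str,
) -> HashSet<String> {
    let mut trusted: HashSet<String> = HashSet::new();
    let mut expanded: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();

    expanded.insert(root.to_string());
    queue.push_back(root.to_string());

    while let Some(id) = queue.pop_front() {
        for edge in edges.get(&id).into_iter().flatten() {
            if !edge.trust {
                continue;
            }
            trusted.insert(edge.to.clone());
            // Follow the target's own Trust Proofs only if allowed to.
            if edge.transitive_trust && expanded.insert(edge.to.clone()) {
                queue.push_back(edge.to.clone());
            }
        }
    }
    trusted.remove(root);
    trusted
}
```

In this reading the configurable part would be how deep, or at what cost, the walk is allowed to continue, which the sketch does not model.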

Two versions of `proof::Content` serialization.

There are two, slightly different ways to serialize proofs:

the final one: this one gets signed and becomes a proof
the editing one: when interactively editing the proof, it makes sense to show/hide some fields differently / have different defaults:
- version should not be editable by the user, but be there in the final serialization
- comment should always be there when editing, but hidden if empty in the final version
maybe others
there's probably no point in showing the digest and revision when editing (are we sure?)

The solution here is probably to have two versions of each struct, with different serde annotations, and some methods to conveniently convert them back and forth.
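The shape of that solution, without the serde annotations, might look like the sketch below. `Review`, `ReviewDraft`, and `apply_to` are hypothetical names, and the field set is cut down to what the list above mentions.

```rust
/// Final form: this is what would be serialized and signed.
#[derive(Debug, Clone, PartialEq)]
pub struct Review {
    /// Not editable by the user, but present in the final serialization.
    pub version: u32,
    pub digest: String,
    /// Omitted from the final form when empty.
    pub comment: Option<String>,
}

/// Editing form: only what the user should see and change.
#[derive(Debug, Clone, PartialEq)]
pub struct ReviewDraft {
    /// Always present while editing, even if empty.
    pub comment: String,
}

impl From<&Review> for ReviewDraft {
    fn from(review: &Review) -> Self {
        ReviewDraft {
            comment: review.comment.clone().unwrap_or_default(),
        }
    }
}

impl ReviewDraft {
    /// Merge the edited fields back, keeping non-editable ones from `base`.
    pub fn apply_to(self, base: &Review) -> Review {
        let comment = self.comment.trim().to_string();
        Review {
            version: base.version,
            digest: base.digest.clone(),
            comment: if comment.is_empty() { None } else { Some(comment) },
        }
    }
}
```

Keeping the conversion in one place means the rules for "hidden if empty" and "not editable" live in code rather than being spread over serde attributes on a single struct.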

Better error messages when locked id password was wrong.

`cargo crev id gen` -> `cargo crev id new`

Help fix serde_yaml issue

chyh1990/yaml-rust#107

This should be an easy fix, and these needless quotes are ruining everything.

Prettify `cargo crev verify`

I've improved formatting, but at least some color would help.

Yaml or something else?

Yaml is my favorite:

most readable: looks very simple, yet supports compact nesting
has serde support so no need to write parsers
other languages also should have good support of it

Few problems I've noticed:

chyh1990/yaml-rust#107
chyh1990/yaml-rust#106

`crev trust git <cmd>`

... should run git <cmd> inside a ~/.config/crev/<ownid> so it's easy to initialize git repo and push it somewhere etc.

Circulating WoT updates.

We need to figure out the best way to keep up with proofs being created.

One idea that I have is to embed URLs in both IDs and Review Proofs, which crev would use to fetch updates in the future.

Review Proof is already supposed to have a project_urls field: a sequence of URLs. The main idea was to allow identifying, somehow, which project a given Review Proof is reviewing. But a secondary function could be fetching future Review Proofs.

Eg. maintainers release version 1.2.3. The snapshot at the point of the release (that is uploaded to NPM.org/crates.io) does contain some Review Proofs already, but only after it is released people will find out about it, review it, and hopefully submit reviews as PRs to the project.

If the existing Review Proofs contain URL to the upstream repository, crev can download the up to date revision, and use the Review Proofs that came up after the release.

Similarly, it might be in the user's best interest to keep a public personal git repository with all of their own Review Proofs, all Trust Proofs from other users, and all Trust Proofs of their own. This way all people who trust a given ID, and already have that ID in their local WoT, can fetch the latest version and keep up to date. So we should add a urls field with a sequence of URLs to the user ID (and thus to Trust Proofs).

The only problem that I see here is finding a balance between keeping up to date, and potentially recursively having to download too much data.

The URLs in Trust Proofs have a natural cut-off point: download updates only for IDs you trust (since you trust them).

The repository URLs don't, since "trusting" and "not trusting" is not clearly defined, and changes all the time. But the "the current one and all dependencies" seems OK...