See hyp for similar functionality.
More info on active projects and modules at dat-ecosystem.org
Documentation resources for dat and the surrounding ecosystem [ DEPRECATED - see https://github.com/hypercore-protocol/new-website/tree/master/guides for similar functionality. More info on active projects and modules at https://dat-ecosystem.org/ ]
Home Page: https://dat-ecosystem-archive.github.io/docs/
Trying to log in, but it seems I'm using the wrong password. Is there a way to change it?
$ dat login
Welcome to dat program!
Login to publish your dats.
Dat registry: (datproject.org)
Email: [email protected]
Password: ************
Password incorrect.
$ dat --version
13.8.1
From experimentation, it looks like deletes are encoded by adding a Node entry with no Stat, but with updated children?
If somebody replies to this issue I can submit a PR updating the paper.
The link navigates to a page about how hypercore works (http://docs.dat-data.com/hyperdrive.md#how-hypercore-works), but the browser just resets to showing the top of the How Dat Works page.
bundle.js:144 Uncaught TypeError: Cannot read property 'source' of undefined
bundle.js:377 Uncaught AssertionError: nanoraf: infinite loop detected
In the dat-js documentation, under Under the hood, the code snippet introduces the concept of a signalhub out of the blue:
var DEFAULT_SIGNALHUBS = 'https://signalhub.mafintosh.com'
There is no explanation of what this means. Do you have to run one or more of these servers for the WebRTC solution to work?
Googling signalhub + webrtc turns up the GitHub repo of signalhub, with only this unhelpful description:
Simple signalling server that can be used to coordinate handshaking with webrtc or other fun stuff.
More googling finds webrtc-swarm and:
Creates a new webrtc swarm using signalhub hub for discovery and connection brokering.
Now, the dat-js README makes no mention of signalhub, but it's in the package.json and the code. (Also note that the Docs page has no link to the dat-js project it documents.) As a Dat user, one should not have to go through all this effort and dive into code to find the important concepts.
Recommendation
Update docs with:
Your use case sounds perfect; we've also started to integrate some of our logging and build tools with hypercore, using hyperpipe (apologies: the docs need a bit of updating there) and hypertail. However, I think we still need to update hypertail/hypername to be compatible with the newest hypercore release.
Is the public key the "network id" and the secret key the "password" to access that network?
One of the key elements of Dat privacy is that the public key is never used in any discovery network, only the discovery key is (the hash of the public key). So in a sense, the discovery key is the "network id" and the public key is the "password".
Does it mean that you can track which peers are asking for the same public key, but can't tell/access that data? How does the public key vs discovery key vs secret key fit into this?
You can track which peers are asking for the same discovery key. Once a user has the public key, they can access the data if they can connect to other users sharing that key.
The use of discovery keys is important in networks like DHT:
For example if a Dat client decides to use the BitTorrent DHT to discover peers ... then because of the privacy design of the BitTorrent DHT it becomes public knowledge what key that client is searching for.
BitTorrent DHT is pretty "chatty" by design. Once you start searching for a discovery key, it becomes public knowledge. Some servers may falsely respond to a request and try to connect, but unless they have the public key, they will not be able to connect. However, your IP address may become known in this process.
As long as the public key isn't shared outside of your team, the content will be secure (though the IP addresses + discovery key may become known). In order of easy + less secure to harder + more secure, you can take a few steps further (mostly in discovery-swarm):
We have a bit more on security & privacy as well, which may answer any lingering questions you had. Your questions would probably be good to include there as well, so I may copy some to a new issue in our docs.
in "popular files can be come very expensive"
and then use index.md or maxogden/dat/readme.md for the rendered docs
The dat paper mentions a data format called SLEEP and references "it's own paper", but I can't find that paper or any details on it. Is SLEEP a proprietary format just for the metadata used internally by dat, or for chunked source data, similar to how HDFS stores raw data?
For my message-layer research on hypercore-protocol, I was referred to the dat network protocol specification, but found it outdated in places. For instance, message no. 0 is called Register, but in schema.proto it has become Feed.
I have made a proposal in the Discussions project to auto-generate docs from schema.proto:
Reading the paper from May 2017, I would love to know if it is in sync with the actual protocol as of today. If not, where can I learn about and track the delta?
It moved over to the dat-land org, so we need to change/update the install instructions.
Hey folks,
Just ran the code below from dat's documentation, and when I ran it, it deleted my working directory and the code I was working on. Not sure why yet, but I thought I'd share it here before I forget. Any ideas on what happened? I hope it's just my own error and new users aren't accidentally deleting their work!
var ram = require('random-access-memory')
var hyperdrive = require('hyperdrive')
var discovery = require('hyperdiscovery')
var mirror = require('mirror-folder')
var link = process.argv[2] // user inputs the dat link
var key = link.replace('dat://', '') // extract the key
var dir = process.cwd() // download to cwd
var archive = hyperdrive(ram, key)
archive.ready(function () {
discovery(archive)
var progress = mirror({name: '/', fs: archive}, dir, function (err) {
console.log('done downloading!')
})
progress.on('put', function (src) {
console.log(src.name, 'downloaded')
})
})
I'm having trouble trying the demos under the "Browser Dat" section. I'm sourcing dat-js from https://cdn.jsdelivr.net/dat/6.2.0/dat.min.js and running from Chrome 67.0.3396.87.
Run your own instance of signalhub
node ./node_modules/signalhub/bin.js listen -p 8080
with the dat/clone creation updated to something like
var datOpts = { signalhub: [ 'http://localhost:8080' ] };
// or omitting running your own instance
var datOpts = {
signalhub: [ 'https://signalhub-jccqtwhdwc.now.sh', 'https://signalhub-hzbibrznqa.now.sh' ]
};
var dat = Dat(datOpts);
The function parameter to clone.add() inside the replicate() definition is never called. I see my local signalhub v7.7.1 logging the subscribe and broadcast of the key, but clone.add() never executes. The callbacks to Dat.add(key, repo) aren't triggered using either my local signalhub or the publicly available signalhubs from the GitHub page. Am I not starting signalhub properly? I do see heartbeat traffic when I connect with curl and replay the requests from the Chrome debugger.
I'm happy to take a swing at updating the documentation once I can get a working example. Seeing that the dat-js github project has been archived, maybe the best thing to do is to remove the section from datproject.org?
The http://dat-data.com/docs page is broken.
2:46 PM is there a term reference around? something like this: https://camlistore.org/doc/terms
2:46 PM it would be nice to know specifically what an "archive" means, or what a "dat-key" is vs a "discovery key".
Random brain dump to start below.
Terminology we use across the Dat Project.
dat://<hash>
Hashed key used for connecting to peers over the network.
A module that works with hyperdrive archives and hypercore feeds, e.g. hyperdiscovery, hyperhealth.
some example q's:
is there a history?
how can I add metadata to a dat?
is there a desktop client?
can multiple people write to a dat?
won't data producers (government) be upset that they can't remove data?
https://github.com/datproject/docs/blame/master/docs/cookbook/diy-dat.md#L68
Adding dat:// to the link is throwing errors. Is hyperdrive only expecting the key?
I've been working on a from-scratch implementation of dat/hyperstuff/sleep following the whitepaper/spec. One place I've run into trouble is understanding how directory structures are supposed to be encoded in the metadata register (the "Filename Resolution" section of the whitepaper).
A first small question about indexing conventions: in the whitepaper, examples are given pointing to metadata entry index [0], but by convention this is always the Header/Index message (containing the public key of the paired content register). I assume this means entry numbers should be translated on lookup (i.e., add one to the "hyperdrive index" to get the "hypercore/SLEEP index"), which is the convention I'll use below. I'll note that the dat log output does not do this translation though (it starts with index 1).
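If that convention holds (an assumption on my part), the translation is just a fixed offset:

```javascript
// Assumed convention: SLEEP/hypercore entry 0 is the Header/Index
// message, so hyperdrive metadata entry N lives at hypercore index N+1.
const toSleepIndex = (driveIndex) => driveIndex + 1
const toDriveIndex = (sleepIndex) => sleepIndex - 1

console.log(toSleepIndex(0)) // first real metadata entry sits at 1
```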
To clarify questions around the children/trie encoding, here's an example directory tree:
.
├── Animalia
│   └── Chordata
│       └── Mammalia
│           └── Carnivora
│               ├── Caniformia
│               │   └── Mustelidae
│               │       └── Lutrinae
│               │           └── Enhydra
│               │               └── E_lutris.txt
│               └── Feliformia
│                   └── Felidae
│                       └── Felis
│                           └── F_silvestris.txt
├── datapackage.json
├── Fungi
│   └── Basidiomycota
│       └── Agaricormycetes
│           └── Cantharellales
│               └── Cantharellaceae
│                   └── Cantharellus
│                       ├── C_appalachiensis.txt
│                       └── C_cibarius.txt
└── README.md
Assume these were added in order "README.md", "E_lutris.txt", "F_silvestris.txt", "C_cibarius.txt", "C_appalachiensis.txt". What should the entries look like?
0: name=/README.md
children=[]
stat=<protobuf>
1: name=/Animalia/Chordata/Mammalia/Carnivora/Caniformia/Mustelidae/Lutrinae/Enhydra/E_lutris.txt
children=[[0]]
stat=<protobuf>
2: name=/Animalia/Chordata/Mammalia/Carnivora/Feliformia/Felidae/Felis/F_silvestris.txt
children=[[0]]
stat=<protobuf>
3: name=/Fungi/Basidiomycota/Agaricormycetes/Cantharellales/Cantharellaceae/Cantharellus/C_cibarius.txt
children=[[0]]
stat=<protobuf>
4: name=/Fungi/Basidiomycota/Agaricormycetes/Cantharellales/Cantharellaceae/Cantharellus/C_appalachiensis.txt
children=[[0], [], [], [], [], [], [3]]
stat=<protobuf>
This clearly can't be correct, because there's no way to go from [4] to find any of the Animalia entries. Should [4]'s children look like [[0,1,2], [], [], [], [], [], [3]]? The paper says:
The children field represents a shorthand way of declaring which other files at every level of the directory hierarchy exist alongside the file being added at that revision.
Does that mean that every single file in the whole drive/archive is pointed to from every appended entry? If that's the case, then with, e.g., 100k files in the root directory, an appended entry would be at least ~100 KB, even with the compressed encoding?
Later, "F_silvestris.txt" is edited and updated, then "datapackage.json" gets added. What does its node look like? Given this [6] node, how do you look up the most recent version of "F_silvestris.txt" (node [5]) without getting confused with [2]? Also, what about deletions?
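To make the [[0,1,2], [], [], [], [], [], [3]] interpretation concrete, here's a sketch that reproduces it (a hypothetical childrenFor helper expressing my guess, not append-tree's actual algorithm):

```javascript
// Hypothetical sketch of the children-encoding interpretation asked
// about above: for each directory level along the new entry's path,
// record the index of the latest entry for every other file that
// branches off at exactly that level. NOT append-tree's real algorithm.
function childrenFor (priorNames, newName) {
  const parts = newName.split('/').filter(Boolean)
  const children = parts.map(() => [])
  const latest = new Map()               // file name -> latest entry index
  priorNames.forEach((name, idx) => latest.set(name, idx))
  for (const [name, idx] of latest) {
    const other = name.split('/').filter(Boolean)
    let level = 0                        // deepest shared directory level
    while (level < parts.length - 1 &&
           level < other.length - 1 &&
           other[level] === parts[level]) level++
    children[level].push(idx)
  }
  children.forEach((list) => list.sort((a, b) => a - b))
  return children
}
```

Run on the five-entry example above, this yields [[0,1,2], [], [], [], [], [], [3]] for entry [4], which is the shape I'd expect if every appended entry must be able to reach the whole tree.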
The current state of the whitepaper looks like it has an early version of the multi-writer protocol stuff added, but isn't complete. An older revision seemed more complete and is what I followed in trying to be compatible with stable dat (I'm running 13.8.1).
I took a look at the implementation in https://github.com/mafintosh/append-tree/blob/master/index.js, but there are no comments and I'm not familiar enough with JavaScript for it to be enlightening.
Style should be consistent with main site. Maybe should have same navigation header too?
This is dependent on custom themes in minidocs.
Hi there,
The whitepaper claims you do both e2e encryption and also incremental versioning.
I believe this combination may leave you open to a compression-oracle attack: any time you combine encryption with compression you are open to a chosen-plaintext attack, unless you pad out your messages carefully (defeating the purpose of compression).
Should write this; right now it's only for downloading.
Various cleanup tasks for migrating whitepaper to Dat Protocol
The Download all files to computer example in the Build with DAT section is currently broken (see mafintosh/mirror-folder#16).
In [terms.md](https://github.com/datproject/docs/blob/6abe36a66a77f33142c625a03a4b5af525aa515d/docs/terms.md#L41) line 41 (GitHub doesn't give you links to lines in markdown files, it seems), the reference is relative:
We maintain an open register called [Dat Folder](datfolder.org) which contains public data, and is open to everyone.
and thus doesn't work. I'd fix it, but I'm not sure where it should link to: http://datfolder.org gives me a 503; https://datfolder.org forwards to some other page, so its certificate's name mismatches (the cert says beta.datproject.org).
Thanks!
Steps 1, 2 and 3 succeed, but I get the following on step 4.
(Do I have to run some ecosystem-docs command first?)
[email protected] update:build /Users/hugo/LearningSpace/dat/docs
cat repos.txt | ecosystem-docs sync && cat repos.txt | ecosystem-docs read | node scripts/build.js
Your GitHub username: datproject/dat
Your GitHub password: ********************
/Users/hugo/LearningSpace/dat/docs/node_modules/ecosystem-docs/bin.js:81
if (err) throw err
^
Error: Bad credentials
at BufferList.afterCreateAuthResponse [as _callback] (/Users/hugo/LearningSpace/dat/docs/node_modules/ghauth/ghauth.js:40:19)
at BufferList.end (/Users/hugo/LearningSpace/dat/docs/node_modules/ghauth/node_modules/bl/bl.js:98:10)
at DuplexWrapper.onend (/Users/hugo/LearningSpace/dat/docs/node_modules/hyperquest/node_modules/readable-stream/lib/_stream_readable.js:537:10)
If I scroll down on one page and click a new page, it doesn't go to the top. That's new with the minidocs fork; I don't see the same behavior on example minidocs pages.
I'm using dat 13.9.0 with hypercored on Debian 9.2 and OS X.
If I run dat create in my /data directory, followed by dat share, I can see the data on my local network. However, if instead of dat share I run hypercored, the same dat clone ... command that works with dat share no longer works.
I have questions about how it's supposed to work:
- Do I use the Archiver key as printed out after starting hypercored?
- For dats to work, do we need bi-directional access? Meaning, in order to have a share on a public machine, can I only clone to another public machine?
Thanks!
Dustin
The module docs from ecosystem-docs should go into docs/modules/. This will make the URLs a bit nicer and prevent name collision issues.
We need minidocs to support subdirectories to do this.
I cannot find any mention of the private key.
Get a webrtc cookbook thing going. We get asked about it a lot.
https://gist.github.com/karissa/6c0594ae9fc215d2b750c39e7e4f8973
Since the downloads folder of dat-design is listed in .npmignore, the build scripts in this repo are failing to copy the dat-data-logo.png file.
I tried using the svg logo in the public dir of dat-design instead and that works fine. I could send a PR for that.
(Testing minidocs changes and happened across this.)
The docs are still not clear about what those modules do and how they relate. We can do better :)!
9:23 AM a core instance can contain any number of feeds. a drive is a special core instance that contains feeds in sets of two.
9:23 AM an archive is two feeds, one for the metadata and one for the content. so you can store any number of archives in a regular core or a drive
9:23 AM in dat-node we basically say 1 drive = 1 archive
9:23 AM this is because we want to store the database locally to that folder, in the .dat folder.
9:24 AM so we can't have that drive polluted with other archives or have that archive database stored separately.
publish.md describes functionality that allows users to clone dats using dat.json as described in this feature request. This feature does not seem to be implemented yet (see my comment on feature request). Should publish.md be edited to reflect this?
The hypercore API documentation refers to the options that can be passed to various methods (`options include: ...`). But the documentation only details a subset of the supported options; for the others you have to dive into the code of the hypercore repo (which is not linked from the page).
The full set of options is:
{
createIfMissing: true, // create a new hypercore key pair if none was present in storage
overwrite: false, // overwrite any old hypercore that might already exist
valueEncoding: 'json' | 'utf-8' | 'binary', // defaults to binary
sparse: false, // do not mark the entire feed to be downloaded
secretKey: buffer, // optionally pass the corresponding secret key yourself
storeSecretKey: true, // if false, will not save the secret key
// undocumented options
id: null,
live: null,
maxRequests: null,
writable: null,
allowPush: null,
indexing: null
}
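For illustration, an options object shaped like the one above is typically consumed via a shallow defaults merge (a sketch only; not hypercore's actual internals, and only the documented defaults listed above are shown):

```javascript
// Illustrative defaults merge for an options object like the one
// above. Sketch only; hypercore's real constructor differs.
const DEFAULTS = {
  createIfMissing: true,
  overwrite: false,
  valueEncoding: 'binary',
  sparse: false,
  storeSecretKey: true
}

function normalizeOptions (opts) {
  return Object.assign({}, DEFAULTS, opts) // caller-supplied opts win
}

console.log(normalizeOptions({ valueEncoding: 'json' }))
```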
PS: As I'm creating this issue, I found more methods, such as proof, which are completely undocumented and also take an options argument, and clear, which has an added opts parameter that is undocumented. So I changed the title from 'Update options' to 'Update docs'.
I was looking for docs on how the discovery key and public key are used, but found only a placeholder for the "Basic Privacy" section at https://github.com/datproject/docs/blob/master/docs/hyperdrive_spec.md#basic-privacy
I recently created a dat archive:
bnewbold@orithena$ dat -v
13.8.1
bnewbold@orithena$ tree .dat/
.dat/
├── content.bitfield
├── content.key
├── content.secret_key
├── content.signatures
├── content.tree
├── metadata.bitfield
├── metadata.data
├── metadata.key
├── metadata.latest
├── metadata.ogd
├── metadata.signatures
└── metadata.tree
First, I notice that content.secret_key is in there, but metadata.secret_key is not. I have heard (and previously read somewhere) that secret keys live in ~/.dat/secret_key/, and indeed I see keys in there. @maxogden on IRC seemed concerned that content.secret_key was not in the homedir folder, and indeed, I believe the metadata register does not include any hashes or signatures of the content register (only index and length), so an attacker could inject corrupted/modified data. To be clear, this isn't any kind of remote exploit weakness; it only applies to the case where somebody copies a whole .dat directory around, copying and potentially exposing key material.
The metadata.latest and metadata.ogd files are not mentioned in the whitepaper. On IRC, @maxogden said that metadata.ogd (:wink: :wink:) means "this is the original dat folder", which I assume is supposed to be a flag that the metadata secret key should be available and the dat overall is writable. Is this the canonical flag for this, or should compatible clients also look in ~/.dat/secret_keys/ on their own? What is metadata.latest for?
If somebody replies to this issue, I can send a PR to include answers in the whitepaper.
I found that 2 interesting pages with more in-depth design details are not linked in the docs nor in the TOC; they are not in contents.json.
Page 2 > section 2.1 > sub-section "Dat Links" > para. 3 > sentence 1 (suggested changes in bold):
Every Dat repository has a **corresponding** private key that is kept in your home folder and never shared.
It'd be nice to have a link-icon thing so we can get permalinks to specific questions. Minidocs gives IDs, but there's probably still some PR needed there for the link icon.
I read the page https://docs.datproject.org/browser on the first day I encountered Dat and, not knowing Dat, completely missed the relevance of this casual text in the introduction:
Because dat-js uses webrtc, it can only connect to other browser clients. It is not possible for the dat-js library to connect to the UTP and UDP clients used in the Node.js versions.
If I understand correctly, there is a fundamental choice to make: either go the browser route, or use Node for the "full Dat experience". They are different worlds.
I would highlight this text as a warning so its importance is not missed (e.g. with a yellow background, starting with Important: ...).
It would be great to add a comparison with Secure Scuttlebutt (SSB) in the Dat vs... section of the FAQ. I know that the projects are quite different, but it would still seem relevant to include a basic overview of how they differ and why one would choose one over the other.
My instinct tells me the fundamental difference is that Dat is more file-oriented and SSB more message-oriented, so you would use SSB to create a social network or photo gallery, but Dat to share files.
On Firefox 57, with no content blocked (according to the uBlock extension), the left sidebar links don't work when clicked. Well, the URL changes in the location bar, but the content remains the same. Visiting a page directly properly shows its content.
For instance, open https://docs.datproject.org/ and click Installation in the sidebar. On the other hand, if you click Installation here, it works as expected.
I'm seeing these errors in the js console:
Blocked loading of mixed active content « http://fonts.gstatic.com/s/bitter/v12/zfs6I-5mjWQ3nxqccMoL2A.woff2 »
[Learn more]
docs.datproject.org
21:11:41,323
TypeError: node is null
[Learn more]
bundle.js:7921:1
21:18:11,831 uncaught exception: AssertionError: nanoraf: new frame was created before previous frame finished
Paper's coming along really well! The Other Work section is very thorough.
(Commenting on this version of the paper)
The background section is good, but a bit unfocused; it lists a lot of small grievances, and I don't get a clear picture of which ones matter the most.
This section also doesn't mention existing P2P solutions, which I think it should. The question I'd ask, when reading this paper, is why we need another P2P distribution network.
I suggest you structure this section as follows:
Para 1. HTTP has a problem with broken links and cost. It also does not guarantee the integrity of data.
Para 2. BitTorrent solves cost and broken links, but doesn't have mutability or privacy.
Para 3. IPFS solves cost, broken links, and mutability, but does not maintain version history. It also has no privacy mechanism.
Para 4. Scientists!
LGTM! I might talk more about how Certificate Transparency works, because Dat shares a lot of its design.
I'd talk more about the named versions and history log. I'd consider renaming "Content Integrity" to "Auditable Versioning," or perhaps to "Versioned Content Integrity," and include a subsection about how you can use named checkpoints to refer to old versions. The Certificate-Transparency-style changelog is one of the differentiators from BitTorrent and IPFS, so I think you should give that some strong focus.
I'd then suggest you rename "Efficient Versioning" to "Efficient Sync."
One thing - I'd consider dropping any mention of WebRTC, except as a hypothetical with a working demo. We're moving away from it, and people who read the paper will ask about its support.
FWIW, this is the Intro to Dat page I have on Beaker's site.
Looking good, keep it up.