rangermauve / hypercore-fetch



Implementation of Fetch that uses the Hyper SDK for loading p2p content

License: MIT License

JavaScript 100.00%

hypercore-fetch's People

Contributors

Stargazers

Watchers

Forkers

deltaf1 localhost-international kygost dwebprotocol unmellow bitwebs artisynthz ducksandgoats josephmturner timechain-academy gnostr-org

hypercore-fetch's Issues

Support diffs

Please add support for interfacing with hyperdrive's drive.diff API.

If my public key is hyper://sw8dj5y9cs5nb8dzq1h9tbjt3b4u3sci6wfeckbsch9w3q7amipy/, I am able to successfully send requests to hyper://sw8dj5y9cs5nb8dzq1h9tbjt3b4u3sci6wfeckbsch9w3q7amipb/ (the final character y replaced with b) as if I had sent those requests to my actual public key.

Is this behavior expected?

What characters may be used in a namespace petname?

Replace Non-standard methods with special paths

After playing with this for a while, it seems that using non-standard method names is a huge issue with clients.

For example, Godot only supports GET, HEAD, POST, PUT, DELETE, OPTIONS, TRACE, CONNECT, and PATCH

Node has a bunch more methods, but we're using a bunch that aren't supported there either.

I propose instead using a special path for things that aren't just Hyperdrive data. Maybe under the prefix like hyper://example/$/ or hyper://example/@

Here are the weird method names we'll need to replace, along with some proposals:

DOWNLOAD hyper://example/path/to/thing => GET hyper://example/path/to/thing?seed
CLEAR hyper://example/path/to/thing => DELETE hyper://example/path/to/thing?clear
TAG hyper://example/ TAG_NAME => PUT hyper://example/$/tags/TAG_NAME
TAGS hyper://example/ => GET hyper://example/$/tags/
TAG-DELETE hyper://example/ => DELETE hyper://example/$/tags/TAG_NAME

This could also play well with the need to expose additional data like peers and extensions, e.g. GET hyper://example/$/peers/

Does this seem like a good idea? Any suggestions for the path names?
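
The mapping proposed above could be sketched as a small translation function. This is only an illustration: `translateLegacyRequest` is a hypothetical helper, not part of hypercore-fetch, and the `?seed`, `?clear`, and `/$/tags/` names are the proposals from this issue rather than a shipped API.

```javascript
// Hypothetical helper: rewrites the non-standard methods from this issue
// into standard HTTP methods plus the proposed special paths / query flags.
function translateLegacyRequest (method, url, tagName) {
  const parsed = new URL(url)
  const origin = `${parsed.protocol}//${parsed.host}`
  const tagURL = (name) => `${origin}/$/tags/${encodeURIComponent(name)}`

  switch (method) {
    case 'DOWNLOAD':
      parsed.search = 'seed'
      return { method: 'GET', url: parsed.href }
    case 'CLEAR':
      parsed.search = 'clear'
      return { method: 'DELETE', url: parsed.href }
    case 'TAG':
      return { method: 'PUT', url: tagURL(tagName) }
    case 'TAGS':
      return { method: 'GET', url: `${origin}/$/tags/` }
    case 'TAG-DELETE':
      return { method: 'DELETE', url: tagURL(tagName) }
    default:
      // Standard methods pass through untouched.
      return { method, url }
  }
}
```

A client limited to standard methods (such as the Godot case mentioned above) would then only ever issue the translated request.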

Display entries and prefixes with same path in directory listing

Per #40, hypercore-fetch will allow for entries and prefixes with the same name. Currently, when both an entry and prefix exist inside of a directory, only the entry is displayed. Does it make sense to display both entries and prefixes (distinguished by a trailing slash) when both exist?

Ensure that 'Must create key with POST before reading' error appears

GET requests to a hyperdrive alias which hasn't been created by a POST like the following:

fetch(`hyper://localhost/?key=NOT-CREATED-YET`, { method: 'get' })

should return

{ status: 400, body: 'Must create key with POST before reading' }

but instead I get a TypeError.

I think getDBCoreForName is throwing an error before the if-block can run.
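
One way to guarantee the 400 response is to catch the lookup failure before the check runs. A minimal sketch, assuming a hypothetical `lookupNamedDrive` callback that throws for names that were never created; the real `getDBCoreForName` is likely asynchronous and may behave differently.

```javascript
// Hypothetical sketch: `lookupNamedDrive` stands in for whatever resolves a
// named key. It is assumed to throw when the name was never created via POST.
function readNamedDrive (name, lookupNamedDrive) {
  let drive = null
  try {
    drive = lookupNamedDrive(name)
  } catch (err) {
    // Treat a failed lookup the same as "not created yet".
    drive = null
  }
  if (!drive) {
    return { status: 400, body: 'Must create key with POST before reading' }
  }
  return { status: 200, body: drive }
}
```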

Navigate version history on a per-file basis

One way to explore the history of a particular file:

Loop through hyperdrive version numbers, starting with the current version and going backwards. Check if drive.diff(...) returns non-null (we could use a HEAD request for this?). If so, checkout that version and display the file at that version or inform the user that the file didn't exist before that version.

What about the edge-case where a file was created, deleted, then created again? Can we just continue checking back in time until we get a non-null diff or we get to the beginning of the history?

Do you know of a better approach? I imagine that sending a new HTTP request to hyper-gateway on each loop could be slow.
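
The backwards walk described above can be sketched as follows. The `changedAt` callback is an assumption made for illustration: it is meant to answer "did this file change at this version?", for example by checking whether a diff against the previous version contains the path. The loop does not stop at the first hit, so a file that was created, deleted, and created again yields every change.

```javascript
// Hypothetical sketch: `changedAt(version)` is an assumed callback returning
// true when the file differs between `version - 1` and `version`.
function findFileHistory (currentVersion, changedAt) {
  const versions = []
  // Walk from the newest version back to the start of the history.
  for (let version = currentVersion; version >= 1; version--) {
    if (changedAt(version)) versions.push(version)
  }
  return versions // newest change first
}
```

Running this inside hypercore-fetch or hyper-gateway, rather than issuing one HTTP request per version from the client, would avoid the per-loop round trip mentioned above.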

Think of API for extension messages / peers

We need an API for listing peer metadata as well as sending extension messages to peers.

I propose something like

PEERS / -> [{id, remotePublicKey, remoteAddress}]

EXTENSION-REGISTER /{extension name in the path} -> text/event-stream

EXTENSION-SEND /{extension name in the path}?peer={optional peer id} {extension data in body} -> 200 OK

The PEERS method will list the metadata about the peers you're connected with. Maybe there should be something here for filtering peers by which ones have a certain extension? Maybe we could list which extensions peers are registered on too?

The EXTENSION-REGISTER will register an extension message handler if one doesn't exist. It will then start an event source stream and send new events down the wire. This should be used with the EventSource API in browsers. We need to figure out a way to specify which peer an event has come from. Maybe in the message name? Maybe in the lastMessageId?

Lastly, EXTENSION-SEND will send out an extension message. The body will be used as the extension message contents. You can optionally specify a peer's id in the querystring to send to a specific peer (or multiple specific peers), or leave the querystring blank to broadcast to everyone. This is similar to a POST.

How does that sound? I'm mostly iffy about the peer info.
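
To make the open question about identifying the sending peer concrete, here is a sketch of one option: encode each extension message as a text/event-stream frame whose `id:` field carries the peer id. `formatExtensionEvent` is a hypothetical helper, and using the `id:` field this way is just one of the ideas floated above, not a settled design.

```javascript
// Hypothetical sketch: encodes one extension message as a text/event-stream
// frame, with the extension name as the event type and the peer id as the id.
function formatExtensionEvent (extensionName, peerId, message) {
  // Multi-line payloads need one `data:` line per line of text.
  const dataLines = String(message).split('\n').map((line) => `data: ${line}`)
  // A blank line terminates the event.
  return [`event: ${extensionName}`, `id: ${peerId}`, ...dataLines, '', ''].join('\n')
}
```

With this framing, a browser `EventSource` listener for the extension name would see the peer id as `event.lastEventId` and the message as `event.data`.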

cc @DeltaF1 @calm-rad

Replace petnames with `POST hyper://localhost`

The concept of using petnames for public keys in the hostname conflicts with custom DNS resolvers like handshake which can register top level names.

In js-ipfs-fetch we removed the petname functionality with an explicit call to POST ipns://localhost?key=KEY_NAME instead of auto creating with GET ipns://KEY_NAME/

I think we should do something similar for hyper-fetch and other protocols (like bittorrent).

Expose MOUNT and UNMOUNT methods

There should be a way to mount hyperdrives onto your hyperdrive.

I propose something like

MOUNT /path/to/dir {body: 'hyper://tomount'} => 200 OK
UNMOUNT /path/to/dir => 200 OK

You can specify the path you want to mount at (the parent folders will get auto-created), and specify the hyper:// URL you want mounted in there.

You can remove a mount by using UNMOUNT and the path to the mount.

CREATE method

I think there should be a CREATE method.

An option would be something like creating a drive with name Kopple + (current system time) but this is kinda yuck.

Currently, because of CORS stuff, getting the address isn't possible. This is a different issue though.

What about a new method, not sure how it would work with the URL, perhaps ignore the contents?
Example:

fetch('hyper://new', {method: 'CREATE'}) (returns address)

Beaker uses beaker.hyperdrive.createDrive which returns a drive object which contains the URL of the created drive.

Not finishing with a / on a directory leads to all sorts of funkiness

See AgregoreWeb/agregore-browser#36

Integrate more hyper data structures

Create HYP for the new subtype feature
Add subtype to hyperbee's header
Implement resolve-hyper-structure
Use queryString parameter to specify data structure in the first PUT
Integrate resolve-hyper-structure
Support Hypercore #7
Support Hyperbee #21
Support Hypertrie

Related to #7 and #17

What is the purpose of the special top-level directory?

What is the purpose of this special directory?

hypercore-fetch/index.js

Line 611 in e77372d

if (finalPath === '/') files.unshift('$/')

Should it be possible to put a file at the root url?

Currently, it is possible to put a file at the root of a hyperdrive, so that hyper://PUBLIC-KEY points to a file (while hyper://PUBLIC-KEY/ points to the root directory). I think this is an extension of #40, but this specific case seems highly likely to confuse users.

Set `ETag` to version of drive after file is added (not before)

Currently, the ETag header is set to drive.version (or entry.seq) of the drive before the file is added to the hyperdrive. Would it make more sense to set the ETag header to the version of the hyperdrive after the file is added?

That way, if example.txt was last modified at version 50, HEAD hyper://PUBLIC-KEY/example.txt would give ETag 50.

This would mean that getting the version of a drive before the file was modified would require subtracting 1 from the Etag.

Feature Parity with Gateway

Here's the protocol handler for Gateway: https://gitlab.com/gateway-browser/gateway/-/blob/master/src/reducers/protocols.js

Ideally, dat-fetch should be able to do a lot of the stuff that gateway does so that we could potentially reuse it there and standardize the fetch interface.

Some stuff that needs to be added here:

Version in the hyper:// URL should checkout a tag or version of the hyperdrive
Ability to get site info, related to #5 {key, url, writable, version, title, description}
QUERY method, which does a search against the data in the hyperdrive. Similar to Beaker's query API. Maybe this could be done in userland, though? 🤔
STAT API or some way to get all the stat data
PeerSockets API, this will likely be possible once we have extension messages from #5
Watch API, could probably get it with EventSource

Some stuff which I'm not sure whether it'd work:

Hyper-Options header passed when initializing the Hyperdrive, is this safe?
The application/hyper Content-Type probably won't work in "real" fetch implementations that only allow binary buffers in responses
Diff APIs. Is there a way we could keep this in userland?

I'm guessing this can be worked around within Gateway though since it can use the raw hyperdrive APIs whenever it wants. 😁

With that said, I'd be more than happy to help people through implementing some of these features and landing them in.

Feature request: directory and file stat

Is it possible to get information about a Hyperdrive path (to a file or a directory) without downloading its contents? I am interested in ctime, mtime, x-blocks, x-blocks-downloaded. This information could be quickly loaded to give the user a more complete view of a directory beyond just the names of its contents.

If this idea makes sense, maybe an interface could look like fetch('hyper://NAME/example/?stat', {method: 'GET'})

return the associated public key in the url

with a post request using ipfs, the cid is returned in the url. i think it would be great if we could do the same thing with hypercore-fetch. for example, PUT hyper://test/test.txt should return hyper://somePubKeyHere/test.txt. as of right now, we can get the pubkey from the Link header but returning it with the url seems more direct and easier.

List Directory With Extra info like Etag

Look at stuff like WebDAV or other specs
Add flag to list directory with details

Adjusting HTTP response for various hyperdrive statuses

Hi Mauve,

@josephmturner and I were talking about how to distinguish various cases when requesting information about a hyperdrive, and we came up with this matrix of what would be ideal for the application side of the protocol. Would it be possible for hyper-gateway and/or hypercore-fetch to provide responses like these?

Response matrix

Currently observed behavior.

Request for	Returns	#
URL to valid hyperdrive without content	Etag == 1
URL to unknown (i.e. network-inaccessible) hyperdrive	Etag == 1	!
URL to valid hyperdrive that has ever had content	Etag > 1
URL to valid hyperdrive directory but invalid file	HTTP 404
URL with too-short public key	HTTP 500

Ideals?

What we'd ideally like the behavior to be.

Request for	HTTP	Etag
Obviously malformed URLs	400 Bad Request	N/A
Unknown hyperdrive	404 Not Found	N/A
Known hyperdrive (never had content)	204 No Content	N/A
Known hyperdrive (has or had content)	200 OK	>= 1

To distinguish whether peers are available, ideally we would use another header, something like X-Hyperdrive-Peers.

Glossary

Unknown hyperdrive

An unknown hyperdrive may or may not exist. We don't know whether it does. We have never received any information about it.

Known hyperdrive (never had content)

A hyperdrive that we know exists, and we know it is empty, and it has never yet had any content. Possibly created by us.
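
The "ideal" matrix above can be read as a simple decision function. This is a sketch only: the input shape (`malformed`, `known`, `hasHadContent`, `version`) is an assumption made for illustration and does not correspond to any existing hypercore-fetch structure.

```javascript
// Hypothetical sketch of the ideal response matrix. How each flag would be
// determined (especially `known`) is left open, as in the issue itself.
function idealResponse ({ malformed, known, hasHadContent, version }) {
  if (malformed) return { status: 400, statusText: 'Bad Request' }
  if (!known) return { status: 404, statusText: 'Not Found' }
  if (!hasHadContent) return { status: 204, statusText: 'No Content' }
  // Only drives that have (or had) content carry an ETag.
  return { status: 200, statusText: 'OK', etag: String(version) }
}
```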

Last-Modified header missing

Files created with PUT requests should have a Last-Modified header, but they currently do not.

Add `GET` request header to recursively download a directory

We've talked about a flag like --full-replication for hyper-gateway. A header like this would let users recursively download specific directories without having to download the whole drive. WDYT?

Problem using dat-fetch in Firefox with Browserify

Trying to use dat-fetch in client-side JS

This is my hyper.js file:

async function fetchFunction(){
    const fetch = require('dat-fetch')()
    const someURL = `hyper://blog.mauve.moe`
    const response = await fetch(`${someURL}/index.json`)
    const json = await response.json()
    console.log(json)
}

fetchFunction()

Executing dat-fetch

I execute these commands on the shell to install the required dependencies and check that the script works:

npm install --save dat-fetch
npm install --save dat-sdk@^2.0.0 
node hyper.js

And the script properly runs:

{
  title: 'RangerMauve - Blog',
  description: 'My Blog - Talking about Dat stuff',
  type: [ 'website' ],
  fallback_page: '/index.html'
}

Creating Browserify bundle

If I try to create the bundle with:

 browserify hyper.js -o bundle.js

I get an error:

browserify hyper.js -o bundle.js
Error: Cannot find module 'babelify' from '<my_path>/BrowserifyTest/node_modules/dat-sdk'
    at /usr/local/lib/node_modules/browserify/node_modules/resolve/lib/async.js:115:35
    at processDirs (/usr/local/lib/node_modules/browserify/node_modules/resolve/lib/async.js:268:39)
    at isdir (/usr/local/lib/node_modules/browserify/node_modules/resolve/lib/async.js:275:32)
    at /usr/local/lib/node_modules/browserify/node_modules/resolve/lib/async.js:25:69
    at FSReqCallback.oncomplete (fs.js:176:21)

So I proceed to install the specified "babelify" module:

npm install --save babelify

I get a warning: npm WARN [email protected] requires a peer of @babel/core@^7.0.0 but none is installed. You must install peer dependencies yourself. so I proceed:

npm install --save @babel/core@^7.0.0

I finally get the bundle and include it as a source in an index.html file:

<script src="bundle.js"></script>

When I open the file in a browser and I check the console, rather than getting the JSON from the hyperdrive I get:

Uncaught (in promise) Error: No native build was found for platform=browser arch=javascript runtime=node abi=undefined uv= libc=glibc node=undefined
    loaded from: /node_modules/sodium-native

Can somebody reproduce the error and give me some tips about what should be done in order to execute the dat-fetch API as part of a browser-first JavaScript project?

Interface for hypercore

It'd be cool if we could detect whether a hyper:// URL was a raw hypercore and change what the path means in that case.

Potentially, we could look at the first block in the hypercore and check whether it's a hyperdrive before interpreting it as such; if it isn't, we'd interpret it as a raw hypercore.

I think a hypercore interface would be similar to hyperdrive, but a bit more simple.

// Get a single chunk out
GET /{index} => buffer

// Append a chunk to the log
POST / {body} => 200 OK

// Get data for chunks between a range
GET /{start}...{end} => buffer

// Get the latest chunk in the hypercore
GET /head => buffer

It doesn't need a PUT or DELETE method since hypercores are append-only.

I'm not sure if there's a safe way to specify a content type since it's assumed to just be binary in hypercore unless the application already knows what it is. 🤷

In addition, I think the length should be put into the ETag header.

I think all the extension message stuff could stay the same as in hyperdrive.
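
The three GET shapes proposed above could be distinguished with a small path parser. `parseHypercorePath` is a hypothetical helper written for this sketch; whether the end of a range is inclusive is not specified in the proposal, so the parser only extracts the numbers.

```javascript
// Hypothetical sketch: classifies the path of a request to a raw hypercore
// according to the proposed interface (/head, /{start}...{end}, /{index}).
function parseHypercorePath (pathname) {
  if (pathname === '/head') return { type: 'head' }

  const range = /^\/(\d+)\.\.\.(\d+)$/.exec(pathname)
  if (range) {
    return { type: 'range', start: Number(range[1]), end: Number(range[2]) }
  }

  const single = /^\/(\d+)$/.exec(pathname)
  if (single) return { type: 'block', index: Number(single[1]) }

  // Anything else is not part of the proposed interface.
  return null
}
```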

Append path to link header

In v8.6.1, the link header included the appended path:

const canonical = `hyper://${archive.key.toString('hex')}${path || ''}`
responseHeaders.Link = `<${canonical}>; rel="canonical"`

Now, in v9.0.2, no path is appended. Only drive.core.url is returned.

Think of wording for drive.download() method

It'd be nice to have an HTTP method which enables you to download a directory, to go along with CLEAR for clearing locally stored data for it.

I proposed DOWNLOAD but @DeltaF1 rightfully said it could be confused with GET.

Any other suggestions or input would be appreciated.

cc @calm-rad

Integrate multiwriter hyperdrive

I think it'd be cool to support co-hyperdrive in some capacity so that we can have multiwriter support for p2p websites.

Might want to add some additional methods like AUTHORIZE and UNAUTHORIZE to add / remove writers. Also some way to get a list of writers?

Should `'Range'` header be set somewhere?

What is the purpose of the 'Range' and 'Content-Range' headers? I don't see that the 'Range' header is set anywhere.

Return status `403` error on attempt to modify non-writable hyperdrive

Currently, requests which modify a non-writable hyperdrive return 500 error. Should PUT/POST/DELETE requests return 403 instead?

Unnecessary trailing slash added to non-prefix entries

After a PUT request to a path like hyper://<pubkey>/nested/file.txt, a GET request to hyper://<pubkey>/nested/ yields ['file.txt/'], instead of the expected ['file.txt'].

Add ability to delete local copy of file

It should be possible to send a request to hyper-gateway which deletes locally-stored copies of a particular hyperdrive file. Users should be able to perform this kind of deletion on files in hyperdrives which are not writable.

I don't see a command in the hyperdrive API which does this, though...

Specify version number in url

Will it be possible to specify a version number in a hyper url with the new version of hypercore?

Entry names should not be URI-encoded in directory listings

The items in the JSON object returned from GET hyper://PUBLIC-KEY/ should not be URI-encoded.

Currently, results look like ["path%20with%20spaces/"], instead of expected ["path with spaces/"].
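
Until the listing itself changes, a client can decode the entries after parsing the JSON. A minimal sketch, assuming the listing is an array of percent-encoded strings as in the example above; `decodeListing` is a hypothetical helper name.

```javascript
// Hypothetical client-side workaround: percent-decodes each entry name in a
// directory listing returned by GET hyper://PUBLIC-KEY/.
function decodeListing (entries) {
  return entries.map((entry) => decodeURIComponent(entry))
}
```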

Throw error if hyperdrive doesn't exist

Currently, loading a nonexistent hyperdrive returns an empty array. Instead, I think hypercore-fetch should signal an error indicating that the hyperdrive could not be found.

Change status code for unwriteable hyperdrives

Consider changing the status code from 500 to 405 Method Not Allowed (or 403 Forbidden). A 405 response also tells the client which methods are allowed, via the Allow header.

Dedicated error code for rmdir on non-empty directory

Does the new API make this possible? https://docs.holepunch.to/building-blocks/hyperdrive#await-drive.del-path

Support `redirect` field

Right now we internally resolve files and return responses as though the user loaded the resolved file.

This should optionally be disabled and 302 responses should be sent.

The redirect flag is what handles this in regular fetch.

So if a user sets redirect to manual, we would resolve the file, and return a 302 redirect so that the protocol handler or user can resolve themselves if they so choose.

We should also consider whether we should return a 302 redirect when a request gets made for a DNS address or a named drive. I think DNS doesn't make as much sense since you'd want to preserve the origin, but named drives could make sense so that users are less confused as to how to get the actual URL of the drive they created.

Possible to have file and dir with same name?

It appears that it's possible to create a file and directory with the same name, only distinguished by a trailing slash. Is this expected?

PUT request to malformed url erases Hyperdrive

Leaving out the slash between the public key and the path of a hyper:// url erases the Hyperdrive.

Let's say that I have generated the following public key from an alias:
hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9/

fetch('hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9/example.txt', {method: 'PUT', body: 'Hello World'})

fetch('hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9/example.txt', {method: 'GET'})
// => 'Hello World'

fetch('hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9/', {method: 'GET'})
// => ["$/", "example.txt"]

// Attempt to update `example.txt`, but make a typo (notice the missing "/" before "example.txt"):
fetch('hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9example.txt', {method: 'PUT'})
// This returns an HTTP/1.1 500 Internal Server Error with the following stack trace:
/*
Error: Missing Content-Type
at new Busboy (/snapshot/hyper-gateway/node_modules/busboy/lib/main.js:23:11)
at hyperFetch (/snapshot/hyper-gateway/node_modules/hypercore-fetch/index.js:390:26)
at async fetch (/snapshot/hyper-gateway/node_modules/make-fetch/index.js:37:9)
at async Server.<anonymous> (/snapshot/hyper-gateway/src/index.js:63:24)
*/

fetch('hyper://ef8db02d8e47fcd21b8708adacaadebb682586cde9c9d3257c32ed75d57d02a9/', {method: 'GET'})
// => ["$/"]
// Directory contents are now gone!

Is this behavior expected?

How to check that a node is writable

Besides attempting to write or delete a file from a hyperdrive, is there a way to check that a drive is writable?

How to get next hyperdrive version for an entry

It is currently possible to navigate backwards through the history of a hyperdrive entry by checking the ETag returned in the HEAD/GET response header, decrementing that ETag, then loading the previous version (hyper://PUBLIC-KEY/$/version/ETAG-1/file.txt).

How can we load the next version of a hyperdrive entry? For "directories", this is as simple as incrementing the ETag. IIUC, for files we have to iterate through ETags going forward from current version. Is there a more practical solution?
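
The backwards step described above can be written as a small URL builder. `previousVersionURL` is a hypothetical helper; the `$/version/` path comes from the example in this issue, while treating 1 as the earliest loadable version is an assumption.

```javascript
// Hypothetical sketch: given the ETag from a HEAD/GET response, builds the
// URL for the same path at the previous drive version.
function previousVersionURL (publicKey, etag, path) {
  const version = Number(etag) - 1
  // Assumption: versions below 1 do not exist, so there is nothing to load.
  if (!Number.isInteger(version) || version < 1) return null
  return `hyper://${publicKey}/$/version/${version}${path}`
}
```

Going forward is harder because the ETag only says when the entry last changed, so a client has to probe later versions until the ETag differs.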

can FormData be added

while using hypercore-fetch, i see that using formdata is not possible. it would be good to support FormData just like the bt-fetch pr and js-ipfs-fetch.

Add `X-blocks` and `X-blocks-downloaded` headers

Thank you!

List connected peers

Consider adding another extension inside /$/peers/

Close drives after finishing request

We currently have a memory leak where any drives that get processed with dat-fetch are loaded indefinitely.

To solve this we should call drive.close after the request is finished.

This will mean that applications will need to implement their own methods for seeding drives.

Maybe this should come with a PIN and UNPIN method for keeping a drive online and seeded?

Delete hyperdrive

Perhaps we could use a POST request to delete a hyperdrive, like:

POST hyper://localhost/?delete=true&key=NAME

Use `Content-Type` `inode/directory` for directories

Would you be open to setting the Content-Type header to the mime type inode/directory for directory listings (entry prefix listings)?

URI-encoding question marks

How should clients handle question marks in the path portion of hyper URLs?

Currently, if we don't encode question marks, the portion after the question mark is treated as a search query and removed. If we do encode question marks, we get %3F back from GET requests instead of question marks.

Demonstrated in #70.
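
The two behaviours can be reproduced with the standard URL parser. In this sketch `buildHyperURL` is a hypothetical helper that escapes each path segment, and `example` is a placeholder hostname.

```javascript
// Hypothetical helper: builds a hyper:// URL from raw path segments, escaping
// each one so characters like "?" stay in the path.
function buildHyperURL (host, segments) {
  const path = segments.map((segment) => encodeURIComponent(segment)).join('/')
  return `hyper://${host}/${path}`
}

// Unescaped: everything after "?" is parsed as the search string.
const unescaped = new URL('hyper://example/what?.txt')

// Escaped: the question mark survives as %3F inside the pathname.
const escaped = new URL(buildHyperURL('example', ['what?.txt']))
```

Here `unescaped.pathname` is `/what` with `unescaped.search` equal to `?.txt`, while `escaped.pathname` is `/what%3F.txt`, which a client has to pass through `decodeURIComponent` to display the original name.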

Hyperbee URLs

Based on discussions on Discord

Use a regular hyper:// URL
/ in the URL is pulled through the sub command
Keys can be URI encoded, that's how you can have a literal /
When you GET a key, you get the raw value back
Content-type is determined from the key's file extension
- e.g. `foo/bar.txt` => `text/plain`
Having a / at the end of a path triggers a directory listing
- By default do an HTML listing like in Hyperdrive
- Use Accept header to enable JSON
  - {[urlEncodedKey]: "urlEncodedValue"}?
  - [[urlEncodedKey, urlEncodedValue]]?
  - [{key: urlEncodedKey, value: urlEncodedValue}] ?
  - [{key: urlEncodedKey, value: byteArray}] ?
  - Opt into byte array?
- Enable EventSource via application/octet-stream
- Some sort of binary format with length prefixed streams?
- Use gt and lt query string params for search
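
The path rules in the list above can be sketched as a parser. `parseHyperbeePath` is a hypothetical helper, and the returned shape is an assumption made for illustration. It relies on `decodeURIComponent`, which throws on byte sequences that are not valid UTF-8, so arbitrary binary keys would need a byte-level decoder instead.

```javascript
// Hypothetical sketch: "/" separates subs, "%2F" is a literal slash inside a
// key, and a trailing "/" asks for a listing instead of a single value.
function parseHyperbeePath (pathname) {
  const isListing = pathname.endsWith('/')
  const segments = pathname
    .split('/')
    .filter((segment) => segment.length > 0)
    .map((segment) => decodeURIComponent(segment))

  if (isListing) return { type: 'list', subs: segments }

  return {
    type: 'get',
    subs: segments.slice(0, -1),
    key: segments[segments.length - 1]
  }
}
```

Under these rules `/foo/bar` means key `bar` inside sub `foo`, while `/foo%2Fbar` means the single key `foo/bar` at the top level.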

cc @KyGost @pfrazee

Conversation Log

(05:29:02 PM) Mauve: 
@pfrazee So far, I was thinking we could detect hypercore vs hyperdrive using headers when doing a GET and providing a way to get hyper://hyperbeeaddres/path/to/key.

One question is whether it'd make sense to convert the / in the path to delimiters that'd usually be used in leveldb. Not certain yet.

Also wondering what binary encoding would look like in the URL, somehow detecting hex strings?

Also, what does it mean to "read a folder" in a hyperbee. Should it list all the keys with that prefix?

For prefix searches, I think it could be useful to have querystring parameters for gt and lt when listing.

Finally, I think it'd make sense to use some sort of content type encoding when returning multiple results so you can get the key and value stuff out, maybe take note from the EventSource API?
(05:29:40 PM) Mauve: 
cc @Kyran if you have thoughts on this
(05:31:06 PM) Mauve: 
Also, I was thinking of encouraging people to use file endings in their keys and doing mimetype detection. So if you're storing a JSON key you could use /file.json for the key name to give apps a hint. This is the approach EarthStar is taking IIRC
(05:31:27 PM) Kyran: 
I think that about sums it up. If I think of anything to add I'll add :-)


---


(05:33:42 PM) Kyran: 
@Mauve (if using Pidgin atm) @pfrazee edited message.
(05:34:02 PM) pfrazee: 
ah let me repost
(05:34:09 PM) pfrazee: 
converting the / to the path delim makes sense to me. I guess that might cause problems if / is actually in the path but I feel like that's just something we'll have to live with?
(05:35:15 PM) pfrazee: 
binary key encoding... also a pain. Hmph. Yeah I guess we just have to choose some kind of encoding
(05:35:17 PM) Mauve: 
 @Kyran I can see edits, thankfully. 😁 Yeah, I think the other gotcha with converting slashes is which delimiter we should go with since different projects might use different schemes
(05:36:07 PM) pfrazee: 
there's a PR to standardize it https://github.com/mafintosh/hyperbee/pull/8
(05:36:11 PM) Mauve: 
 @pfrazee What if we don't do anything special for keys to start and use the slashes as plain values, and if you want to search it can only be done on `/` with query strings?
(05:36:54 PM) Mauve: 
And further, what about using hex encoding for binary and treating everything with `0x` as hex keys or something
(05:39:12 PM) pfrazee: 
would there be any merit to only using query parameters?
(05:39:21 PM) pfrazee: 
we might have a little more flexibility
(05:39:32 PM) Mauve: 
Only using query strings for keys in general?
(05:39:55 PM) pfrazee: 
yeah. Would we lose anything by doing that?
(05:39:59 PM) Mauve: 
Hmmm. I dunno. I kinda like the idea of having a key in the pathname since it "makes sense"
(05:40:03 PM) pfrazee: 
right
(05:40:06 PM) Mauve: 
Functionally it should be fine though
(05:40:27 PM) Mauve: 
If you use slashes in your keys, then you can do useful things with the URL API
(05:40:53 PM) Mauve: 
Like `url = `hyper://name/foo/bar`, otherURl = new URL('./fizzbuzz.txt', url)`
(05:41:04 PM) Mauve: 
I've been using this pattern a lot lately
(05:41:07 PM) pfrazee: 
yeah that's true
(05:41:50 PM) pfrazee: 
okay back to your other points...
(05:42:04 PM) pfrazee: 
0x prefixes for hex seems pretty sensible
(05:42:27 PM) pfrazee: 
same with the searches using query strings
(05:43:11 PM) pfrazee: 
yeah for reading "a folder," I feel like listing the keys is pretty sensible
(05:43:38 PM) pfrazee: 
is it possible for a keyname that's a sub's prefix to have a value?
(05:44:16 PM) Mauve: 
One thing that'll be weird is that folder traversal doesn't make sense for hyperbee. Like, you could list everything in the folder and any sub folders, but getting just the sub folder names is hard
(05:44:24 PM) Mauve: 
Yeah that could be possible.
(05:44:34 PM) Mauve: 
It'd be similar to the index.html stuff in hyperdrive
(05:45:11 PM) pfrazee: 
yeah agree on the subfolder thing, that is a little tricky
(05:45:44 PM) Mauve: 
It could be fine to treat it like a recursive directory listing, IMO. 🤷
(05:45:55 PM) Mauve: 
It's just a bit quirky compared to hyperdrive
(05:45:56 PM) pfrazee: 
what I meant with the prefix key having a value thing is, if I have /foo/bar = "hi" and /foo = "hello" then GET /foo probably needs to give "hello" and not list subkeys
(05:46:06 PM) Mauve: 
Yeah agreed
(05:46:32 PM) pfrazee: 
would it make sense to never list subkeys on GET because of that possibility?
(05:46:43 PM) pfrazee: 
or should it list subkeys if it has no value itself?
(05:46:49 PM) Kyran: 
Maybe a different kind of request for subkeys?
(05:46:58 PM) pfrazee: 
that's what it makes me wonder
(05:47:05 PM) Kyran: 
Hmm
(05:47:13 PM) pfrazee: 
could always do a new method name like LIST
(05:47:14 PM) Mauve: 
What about never listing subkeys unless you use something in the querystring params?
(05:47:22 PM) pfrazee: 
or that
(05:47:40 PM) Kyran: 
That makes sense I guess
(05:47:40 PM) Kyran: 
EDIT: That makes sense
(05:48:16 PM) Mauve: 
Like `/prefix/?gt=&lt=foobar`
(05:48:29 PM) Mauve: 
So that'd search between `/prefix/` and `/prefix/foobar`
(05:48:32 PM) Kyran: 
Ah!
(05:48:42 PM) Kyran: 
What if it was if the path is suffixed by / it lists
(05:48:49 PM) Kyran: 
If no suffix it gives value?
(05:48:55 PM) Kyran: 
Does that make any sense?
(05:49:10 PM) Mauve: 
That'd run into the issue pfrazee had above where you might have keys that end in `/` which you'd want to be able to access
(05:49:25 PM) pfrazee: 
aren't we kind of hosed on that either way?
(05:49:45 PM) pfrazee: 
if we substitute / for the delim then you can never have / in a keyname right?
(05:50:00 PM) Mauve: 
I was thinking `/` in this case isn't treated specially
(05:50:06 PM) pfrazee: 
I wonder if we could percent-encode / to differentiate
(05:50:25 PM) pfrazee: 
%2F
(05:50:41 PM) Mauve: 
In the case where you only do lists when you have gt and lt in the querystring, the `/` doesn't mean anything special
(05:50:46 PM) pfrazee: 
so if you did /foo%2Fbar that would lookup foo/bar not sub(foo).get(bar)
(05:51:18 PM) pfrazee: 
@Mauve that's true but we need to solve this for the value-get case anyway, right?
(05:52:02 PM) Mauve: 
Ah, I guess if hyperbee has special sub functionality it'd make sense to use `/` for value getting.
(05:52:29 PM) pfrazee: 
that's pending an unmerged PR but I think they intend to merge that
(05:52:58 PM) Mauve: 
I think it'd make most sense for `/` in the URL to be subs, and `%2F` to be literal slashes in the key
(05:53:06 PM) pfrazee: 
yeah same
(05:53:17 PM) Mauve: 
So maybe for binary data we could stick to URL encoding?
(05:53:33 PM) Mauve: 
No need for the 0x thing
(05:53:42 PM) pfrazee: 
oh that's...huh I have no idea what that would look like
(05:54:23 PM) pfrazee: 
% encode just does a direct byte->number translation right?
(05:54:33 PM) Mauve: 
No clue, actually. 😂
(05:54:40 PM) pfrazee: 
haha yeah I gotta look that up
(05:54:55 PM) Kyran: 
hahaha what encoding would be used if not hex.
Why the need for the 0x prefix btw?
(05:55:04 PM) Mauve: 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURI
(05:55:08 PM) Kyran: 
Hypercore addresses don't use them do they?
(05:55:16 PM) Mauve: 
 @Kyran Sometimes people use non-unicode keys for leveldb
(05:55:47 PM) Kyran: 
Yeah but if you are encoding it from binary to hex that's not an issue?
(05:55:54 PM) pfrazee: 
if that's right then I assume Buffer([0,1,2,3]) would be encoded as /%00%01%02%03
(05:56:31 PM) Kyran: 
O
(05:57:44 PM) Mauve: 
https://www.ietf.org/rfc/rfc2396.txt
(05:57:58 PM) Kyran: 
I'm a bit inexperienced here-- why would you need to separate values? Why wouldn't you be able to reference that as binary or hex?
URI encoding is kinda yuck if not needed
(05:57:58 PM) Mauve:

2.4.1. Escaped Encoding

An escaped octet is encoded as a character triplet, consisting of the percent character "%" followed by the two hexadecimal digits representing the octet code. For example, "%20" is the escaped encoding for the US-ASCII space character.

escaped = "%" hex hex
hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f"


(05:58:30 PM) Mauve: 
 @Kyran We can't have arbitrary binary in a URL key so we need to encode it somehow
(05:58:38 PM) Kyran: 
Oh! Nevermind I'm being silly. This is to facilitate both ascii and binary yeah?
(05:59:03 PM) Mauve: 
Kinda yeah
(05:59:12 PM) Mauve: 
I think this URL escaped encoding would work fine
(05:59:22 PM) pfrazee: 
yeah I think this is probably the "right way" to do it
(05:59:25 PM) Mauve: 
Something custom would make life harder for other implementations
(05:59:26 PM) Mauve: 
Yeah
(05:59:33 PM) Kyran: 
That's fair. I agree.
(05:59:48 PM) pfrazee: 
ok cool, so the question about listing keys vs getting value
(06:00:02 PM) pfrazee: 
the trailing slash idea seems pretty sensible to me
(06:00:22 PM) Mauve: 
Yeah, if the slash is treated specially, I think that's an elegant solution
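A rough sketch of how the path rules discussed so far could fit together: `/` separates subs, `%2F` stays a literal slash inside a segment, and a trailing slash means "list keys" rather than "get value". The function name `parseBeePath` and the shape of its return value are made up for this example.

```javascript
// Hypothetical parser for illustration; hypercore-fetch may do this differently.
// Segments are returned still percent-encoded, so a %2F inside a key is never
// confused with a "/" that separates subs.
function parseBeePath (pathname) {
  const isListing = pathname.endsWith('/')
  const segments = pathname.split('/').filter((segment) => segment.length > 0)
  // With a trailing slash every segment is a sub; otherwise the last one is the key.
  const key = isListing ? null : (segments.pop() ?? null)
  return { subs: segments, key, isListing }
}

console.log(parseBeePath('/users/'))
console.log(parseBeePath('/users/alice'))
console.log(parseBeePath('/files/a%2Fb'))
```

Splitting before decoding is the important design choice here: decoding first would turn `a%2Fb` into `a/b` and then split it into two segments.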
(06:00:57 PM) pfrazee: 
@Mauve what was your point about the encoding and EventSource?
(06:01:36 PM) Mauve: 
 @pfrazee I think that listing as HTML is useful for when browsing, but when an application wants to GET the data, something more structured is useful
(06:01:52 PM) pfrazee: 
yeah agree
(06:02:07 PM) Mauve: 
In dat-fetch you can opt into reading a directory as JSON with the `Accept` header, I wasn't sure what would be best for a hyperbee
(06:02:16 PM) Mauve: 
JSON would be obvious, but it's bad for binary values
(06:02:38 PM) pfrazee: 
I was going to suggest that yeah, so arguably accept=text/html  could also give a UI for rendering it
(06:03:14 PM) pfrazee: 
there's a mimetype for generic binary, application/octet-stream
(06:03:24 PM) Mauve: 
That makes sense when you get a single value
(06:03:34 PM) Mauve: 
But when you list it's key-value pairs
(06:03:51 PM) pfrazee: 
hmm
(06:04:01 PM) Mauve: 
lol
(06:04:04 PM) Mauve: 
URI encoding it?
(06:04:10 PM) pfrazee: 
hah
(06:04:14 PM) Mauve: 
https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events
(06:04:44 PM) Mauve: 
Something loosely based on text/event-stream
(06:05:34 PM) pfrazee: 
hmm. I'm sure people have had to deal with this in the past
(06:05:38 PM) Mauve: 
Like

key: urlencodedkey
value: urlencodedvalue

(06:06:00 PM) Kyran: 
How is the key delimiter stored in binary keys? (once subs are added)
(06:06:18 PM) Mauve: 
Same as the URLs!
(06:06:26 PM) Kyran: 
Huh?
(06:06:45 PM) pfrazee: 
maf has dealt with so much streaming stuff I imagine he'd be a good person to ask about this
(06:06:57 PM) Mauve: 


key: url/encoded/key
value: urlencoded%AAvalue

key: url/encoded/key2
value: urlencoded%AAvalue2
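As a sketch of the `key:` / `value:` listing format above, the following writes and reads records separated by a blank line. The names `formatListing` and `parseListing` are hypothetical, and the code assumes keys and values arrive already percent-encoded, with raw newlines escaped as `%0A`.

```javascript
// Hypothetical helpers for illustration only.
// Assumption: keys and values are already percent-encoded strings, so they
// never contain a raw newline and a blank line is a safe record separator.
function formatListing (entries) {
  return entries
    .map(({ key, value }) => `key: ${key}\nvalue: ${value}\n\n`)
    .join('')
}

function parseListing (text) {
  const entries = []
  for (const record of text.split('\n\n')) {
    if (record.trim() === '') continue
    const entry = {}
    for (const line of record.split('\n')) {
      const separator = line.indexOf(': ')
      if (separator === -1) continue // skip lines that are not "field: data"
      entry[line.slice(0, separator)] = line.slice(separator + 2)
    }
    entries.push(entry)
  }
  return entries
}

const text = formatListing([
  { key: 'url/encoded/key', value: 'urlencoded%AAvalue' },
  { key: 'url/encoded/key2', value: 'urlencoded%AAvalue2' }
])
console.log(parseListing(text))
```

Records can be parsed one at a time as they arrive, which is the property that makes this shape friendlier to streaming reads than a single JSON array.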


(06:07:14 PM) pfrazee: 
@Mauve that is a sensible option for sure
(06:07:21 PM) Kyran: 
Perhaps one could simply respond with an octet-stream with a delimiter between key and value and a double delimiter between key and key
(06:07:36 PM) pfrazee: 
also has a nice upside of human-readability
(06:07:49 PM) Kyran: 
@Mauve would that just be plaintext like that?
(06:07:52 PM) pfrazee: 
@Kyran the question there is whether we can be sure the delim wouldnt show up in the actual data
(06:08:39 PM) Kyran: 
@pfrazee that's why I was wondering how the true delimiter works (the one we can distinguish to be different from %2F)
(06:08:56 PM) pfrazee: 
@Mauve for a text-based encoding, I think either something like what you're proposing or just a JSON array (which is thus not ideal for streaming reads)
(06:09:25 PM) Mauve: 
OMG, if we literally use the event source format we could load a series of events from a hyperbee using EventSource. Like if we could do a live listing we could have it notify us of changes
(06:09:37 PM) Mauve: 
JSON array could be fine too
(06:09:39 PM) pfrazee: 
that's true
(06:09:46 PM) Mauve: 
Like, an array of bytes?
(06:09:49 PM) Mauve: 
We could do both
(06:10:04 PM) Mauve: 
Or neither. 😛
(06:10:08 PM) pfrazee: 
@Kyran if you're doing text encoding then you have a little more control of the output because you're transforming it
(06:11:03 PM) pfrazee: 
@Mauve tbh I dont have any strong opinions here so whatever yall think is best. One other thing to consider is, for binary listing responses using something like protobufs
(06:11:56 PM) pfrazee: 
I sort of get the feeling that SSE is on the way out but it's not a bad option for this
(06:12:01 PM) Mauve: 
Yeah... Protobufs would be great but then it wouldn't be usable in the browser without a library
(06:12:07 PM) pfrazee: 
right
(06:12:21 PM) Mauve: 
I was gonna use SSE for a bunch of stuff in hyperdrive and hypercore actually
(06:12:40 PM) pfrazee: 
yeah seems fine, SSE is pretty cool
(06:13:06 PM) Mauve: 
Electron doesn't give us any hope for custom websocket protocols so it's the only way I can get streaming data out in a friendly API
(06:13:10 PM) pfrazee: 
one other option for binary responses is just to cook up a format where you write lengths and then values
(06:13:20 PM) Mauve: 
Though I suppose we have readable streams in fetch these days...
(06:13:44 PM) Mauve: 
Yeah, length-prefixed streams could be decent for getting data
(06:14:06 PM) pfrazee: 
yeah <varint><key><varint><value><varint><key>... etc
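Here is a small sketch of the `<varint><key><varint><value>` layout. The chat does not say which varint encoding is meant, so this assumes unsigned LEB128 (the variant protobuf uses); all function names are hypothetical.

```javascript
// Hypothetical sketch; assumes unsigned LEB128 varints, which the chat does not specify.
function encodeVarint (n) {
  const out = []
  while (n >= 0x80) {
    out.push((n % 0x80) | 0x80) // low 7 bits, continuation bit set
    n = Math.floor(n / 0x80)
  }
  out.push(n)
  return out
}

function decodeVarint (bytes, offset) {
  let value = 0
  let scale = 1
  while (true) {
    if (offset >= bytes.length) throw new Error('Truncated varint')
    const byte = bytes[offset++]
    value += (byte & 0x7f) * scale
    if ((byte & 0x80) === 0) return { value, offset }
    scale *= 0x80
  }
}

// pairs: array of [keyBytes, valueBytes], each a Uint8Array.
// Fine for small entries; a real implementation would stream instead of buffering.
function encodePairs (pairs) {
  const out = []
  for (const [key, value] of pairs) {
    out.push(...encodeVarint(key.length), ...key)
    out.push(...encodeVarint(value.length), ...value)
  }
  return Uint8Array.from(out)
}

function decodePairs (bytes) {
  const pairs = []
  let offset = 0
  const readChunk = () => {
    const { value: length, offset: start } = decodeVarint(bytes, offset)
    if (start + length > bytes.length) throw new Error('Truncated chunk')
    offset = start + length
    return bytes.slice(start, offset)
  }
  while (offset < bytes.length) {
    const key = readChunk()
    const value = readChunk()
    pairs.push([key, value])
  }
  return pairs
}
```

Because every chunk carries its own length, no byte sequence has to be reserved as a delimiter, so keys and values can hold arbitrary binary data without escaping.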
(06:14:21 PM) Mauve: 
K, what about this: Start with HTML, add JSON with a flag for binary arrays / strings, then figure out event source or some binary thing?
(06:14:34 PM) pfrazee: 
sure sounds good to me
(06:15:00 PM) Kyran: 
That works. What's the default going to be? HTML like for Hyperdrive?
(06:15:48 PM) Mauve: 
For dat-fetch I was thinking of doing the same thing as Hyperdrive with the Accept header
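A minimal sketch of what choosing the listing format from the `Accept` header might look like, with HTML as the default and JSON as the opt-in. The function name `pickListingFormat` is hypothetical, and q-values are deliberately ignored to keep the example short.

```javascript
// Hypothetical helper; real content negotiation would also honour q-values.
function pickListingFormat (acceptHeader = '') {
  const types = acceptHeader
    .split(',')
    .map((part) => part.split(';')[0].trim().toLowerCase())
  if (types.includes('application/json')) return 'json'
  return 'html' // default, matching how directory listings were described
}

console.log(pickListingFormat('application/json'))
console.log(pickListingFormat('text/html,application/xhtml+xml'))
console.log(pickListingFormat())
```

A client would opt in by sending `Accept: application/json` with its request; anything else falls back to the browsable HTML view.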
(06:16:18 PM) Kyran: 
Yup. That makes sense. :-)

(06:24:08 PM) Mauve: 
 @pfrazee Kyran: Mind if I copy this convo to a GitHub issue?
(06:24:24 PM) pfrazee: 
@Mauve go for it
(06:24:26 PM) Kyran: 
@Mauve Sounds good