In this issue I will propose two extensions which both aim to implement the features needed for parallel uploads and non-contiguous chunks (see #3). In the end we have to choose one of them or go with something else.
Before starting I want to clarify that streaming uploads (whose length is unknown at the beginning) are not covered by these thoughts, since you should not use them in conjunction with parallel uploads.
The first solution uses offset ranges. Instead of storing a single offset which starts at 0 and is incremented with each PATCH request, the server would store one or multiple ranges of allowed and free offsets. These ranges are returned in the Offset-Range header of the HEAD response, replacing the Offset response (!) header. The client then uses this information to choose an offset and uploads exactly as it does with the current implementation.
Here is an example of a 300-byte file of which the second 100 bytes (100-199) have been uploaded:
HEAD /files/foo HTTP/1.1
HTTP/1.1 204 No Content
Entity-Length: 300
Offset-Range: 0-99, 200-299
PATCH /files/foo HTTP/1.1
Content-Length: 100
Offset: 200
[100 bytes]
HTTP/1.1 204 No Content
HEAD /files/foo HTTP/1.1
HTTP/1.1 204 No Content
Entity-Length: 300
Offset-Range: 0-99
The range of the last 100 bytes (200-299) has been removed since this buffer has been filled successfully by the upload.
While this solution allows the maximum of flexibility (compared to my second proposal), since you can upload at any offset as long as it is available, it may be a tough extension for servers to implement. The server has to ensure that both the start and the end of the uploaded chunk fall inside a free offset range. Using the example from above, you are not allowed to patch a 150-byte chunk at offset 0 because the bytes starting at 100 have already been written.
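The validation described above could be sketched as follows. This is only an illustration, not part of the proposal: it assumes the server keeps the free ranges as inclusive (start, end) pairs, matching the `Offset-Range: 0-99, 200-299` notation, and the function name is made up.

```python
def accept_chunk(free_ranges, offset, length):
    """Accept a PATCH of `length` bytes at `offset` only if the chunk
    fits entirely inside one free range; return the updated ranges."""
    last = offset + length - 1
    for i, (start, end) in enumerate(free_ranges):
        if start <= offset and last <= end:
            # Split the matched range around the written chunk.
            updated = free_ranges[:i]
            if start < offset:
                updated.append((start, offset - 1))
            if last < end:
                updated.append((last + 1, end))
            updated.extend(free_ranges[i + 1:])
            return updated
    raise ValueError("chunk overlaps already-written bytes")

free = [(0, 99), (200, 299)]
# The PATCH from the example: 100 bytes at offset 200 consume the
# second range entirely, leaving only 0-99 free.
print(accept_chunk(free, 200, 100))
```

With the same state, `accept_chunk(free, 0, 150)` raises, mirroring the rejected 150-byte chunk at offset 0 from the paragraph above.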
The second solution I came up with involves a bit more: when creating a new upload (using the file creation extension or by other means), a blocksize is defined which splits the file into separate blocks. For example, a file of 5KB with a blocksize of 2KB yields two blocks of 2KB and a single one of 1KB. The important point is that each block has its own offset which starts at position 0 relative to the starting position of the block.
Continuing this example, the relative offset 100 within the second block corresponds to the absolute offset 2148: 2048 (the starting position of the second block) + 100 (the relative offset).
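The offset arithmetic above amounts to a simple mapping in both directions. A minimal sketch (the function and parameter names are illustrative, not protocol fields):

```python
def absolute_offset(block_index, relative_offset, blocksize):
    """Map a block-relative offset to an absolute file offset."""
    return block_index * blocksize + relative_offset

def block_position(absolute, blocksize):
    """Map an absolute offset back to (block_index, relative_offset)."""
    return divmod(absolute, blocksize)

# The example above: relative offset 100 in the second block (index 1).
print(absolute_offset(1, 100, 2048))   # 2148
print(block_position(2148, 2048))      # (1, 100)
```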
Only one upload is allowed at the same time per block. In this example a maximum of three parallel uploads is allowed. Each new PATCH request must resume where the last upload of the block stopped; jumps are not allowed.
In the following example we consider a file of 5KB with a blocksize of 2KB. The first block is already fully uploaded (2048 bytes), the second is filled with 100 bytes and the last one has not been written to yet. We are going to upload 100 bytes at the relative offset of 100 into the second block:
HEAD /files/bar HTTP/1.1
HTTP/1.1 204 No Content
Entity-Length: 5120
Blocksize: 2048
Block-Offset: 2048, 100, 0
PATCH /files/bar HTTP/1.1
Content-Length: 100
Offset: 2148
[100 bytes]
HTTP/1.1 204 No Content
HEAD /files/bar HTTP/1.1
HTTP/1.1 204 No Content
Entity-Length: 5120
Blocksize: 2048
Block-Offset: 2048, 200, 0
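The server-side bookkeeping for this proposal could look roughly like the sketch below. It assumes the server keeps the per-block offsets as a list (matching the `Block-Offset: 2048, 100, 0` header) and rejects any PATCH that does not resume exactly where its block stopped or that crosses a block boundary; all names are illustrative.

```python
def validate_patch(block_offsets, blocksize, entity_length, offset, length):
    """Check a PATCH of `length` bytes at absolute `offset` against the
    per-block offsets; return the updated offsets on success."""
    block, relative = divmod(offset, blocksize)
    if block >= len(block_offsets):
        raise ValueError("offset beyond the last block")
    if relative != block_offsets[block]:
        raise ValueError("PATCH must resume where the block stopped")
    # The last block may be shorter than the blocksize.
    block_start = block * blocksize
    block_size = min(blocksize, entity_length - block_start)
    if relative + length > block_size:
        raise ValueError("chunk would cross the block boundary")
    updated = list(block_offsets)
    updated[block] += length
    return updated

# The exchange above: 100 bytes at absolute offset 2148 advance the
# second block from 100 to 200.
print(validate_patch([2048, 100, 0], 2048, 5120, 2148, 100))
```

A PATCH at, say, offset 2300 would be rejected because the second block stopped at relative offset 100, which is the "no jumps" rule from above.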
Please post your opinion about these solutions (I prefer my second proposal) or any additional way we could achieve parallel and non-contiguous uploads. Also take the time to consider the implementation effort for servers and clients.