The following is a description of a possible bottleneck and a proposal which, if I'm right, could shave 9 seconds off of tile merging in my described use case.
I've spent a lot of the evening reading cogeo*
/tiler*
code, and I think I may have discovered the largest current bottleneck, at least with my dataset: a full mercator tile's data image bitmap and mask is created for each asset, regardless of the amount of overlap.
I think that's a big performance hit when you have several underlying images. In my mosaic_tiler
profiling towards the bottom of this post, you can see that the biggest time hit is vrt.read
, i.e. creating the data arrays. It's ~1.8s per asset, regardless of the amount of overlap.
Here's an example of where that really hurts. For a web mercator tile at zoom 12 covering West Hollywood (blue), there are six assets that are required to load (brown). Despite using a simpler pixel selection method, i.e. FirstMethod
, all 6 assets need to be loaded and parsed, even though only 1 covers a majority of the tile.
![image](https://user-images.githubusercontent.com/15164633/77286861-cfc7df00-6c99-11ea-84e9-8ed584b030c0.png)
Note, however, that since there's essentially zero overlap between assets, if rasterio windows
could be used to only read the subset of the tile with valid source , and if vrt.read
is linear with the amount of pixels generated, then you could potentially save 9 seconds, or 83% faster on this tile load. (I.e. there are currently 5 extra sets of 512x512 vrt.read
operations beyond the one that's needed, and 1.8s * 5 = 9.)
6 underlying assets is the median for my MosaicJSON, so this is a very common occurrence. Here's the distribution of how many assets are in each quadkey:
count 142516.000000
mean 5.508160
min 1.000000
25% 4.000000
50% 6.000000
75% 6.000000
max 18.000000
Proposal
The slowest part of mosaic_tiler
is creating data arrays that line up with the mercator tile for each underlying asset. I propose to use rasterio
's windows
to create a data array and mask using the minimal bounding box of valid source data within the mercator tile of interest.
Take my intial contrived/sketched example above, where an asset's bounds overlap just the top left of the mercator tile. In that case, since the overlap is rectangular in mercator coordinates, the window could read just the overlapping asset data and return an object like the following:
data
: a numpy ndarray of size 3 x overlapping mercator width x overlapping mercator height. Pixel values for each band of only the parts of the mercator tile where source data exists.
mask
: a numpy ndarray of size 1 x overlapping mercator width x overlapping mercator height. In the case where the overlap is rectangular in mercator coordinates, this would be entirely True
.
mercator_bounds
: [0, 400, 100, 512]
. The lower left and upper right corners in mercator tile coordinates. This would be used in the pixel merging code once each tile's data has loaded.
I believe finding this intersection of valid data would be possible by intersecting the georeferenced bounds of the asset with the bounds of the mercator tile.
In the more general case (e.g. of Landsat data), where the intersection of the image scene and the mercator tile is not rectangular in mercator coordinates, you could still take the minimal intersecting bounding box in mercator coordinates. In this case, the mask
is used to filter out the pixels without source data.
In addition to potentially taking much less time, it would take much less memory and could be run on a smaller lambda instance.
Code to update
- Update
_tile_read
in rio_tiler.utils
with the option to return a rectangular subset of the desired mercator tile.
- Update
tile
in rio_tiler.main
to find the intersection between the bounds of asset and the bounds of the mercator tile, and pass that to _tile_read
. Note that cogeo-tiler-mosaic
would also need those bounds returned from the function, so for backwards compatibility, creating a separate function might be better.
- Update
rio-tiler-mosaic
's pixel selection methods to work with data arrays of subsets of the tile, instead of full arrays.
I'd be happy to submit PR's for these, because if I'm right it could make cogeo-mosaic-tiler
really fast.
Profiling
Profiling is done using AWS X-Ray, inserting custom timing sections. I changed mosaic_tiler
to run single-threaded for profiling, but it ran about the same speed as when it was using ThreadPoolExecutor
.
Here's a profile of mosaic_tiler
running for a mercator tile with four assets. Each asset takes a total of 2.5 seconds to load, with ~1.8 seconds of that just in vrt.read
here.
![image](https://user-images.githubusercontent.com/15164633/77289999-c5f5aa00-6ca0-11ea-816a-5aaf248a782c.png)