deltares / pyflwdir Goto Github PK
View Code? Open in Web Editor NEWFast methods to work with hydro- and topography data in pure Python.
Home Page: https://deltares.github.io/pyflwdir/latest
License: MIT License
Fast methods to work with hydro- and topography data in pure Python.
Home Page: https://deltares.github.io/pyflwdir/latest
License: MIT License
Adding new functionality
I have seen a slope function 'pyflwdir.dem.slope'. But it seems to me like slope of D8 flow direction. I need to calculate the slope of D-inf flow direction. I guess some changes in the code of 'pyflwdir.dem.slope' can calculate slope of D-inf flow direction. I don't understand how to do. Could someone help me please?
No response
No response
Hi,
I would like to create polygons of the contributing area for > 10,000 points. I have all information that I need for this, including a 30m DEM, a river network (LineString geodataframe file) and derived flow direction raster. All compiled using pyflwdir.
I'm currently using the pyflwdir tool, but this takes around 5 minutes per point.. Has anyone any clues on how to approach this easier and quicker? Also, to insert the streams from a LineString geodataframe (instead of this flow.stream_order() > 4 command). This is my current script:
with rasterio.open(flwdir_fn, "r") as src:
flwdir = src.read(1)
crs = src.crs
flw = pyflwdir.from_array(
flwdir,
ftype="d8",
transform=src.transform,
latlon=crs.is_geographic,
cache=True,
)
for point in geo_df["geometry"].head():
x = point.x
y = point.y
print(x, y)
subbasins = flw.basins(xy=(x, y), streams=flw.stream_order() >= 4)
Many many thanks!
downstream hydromt fails because of what i think is an issue with a new numpy version
https://github.com/Deltares/hydromt/actions/runs/6430182597/job/17460685968?pr=535
I think I have the error narrowed down to how the JIT compiler views either the rasterio object or the parameters sent to pyflwdir.from_dem() function. I have a simple example based on the Quickstart from the documentation. The error occurs when i try to create a flow direction raster from elevation read through rasterio.
https://colab.research.google.com/drive/1engswQ4uaKB0UMMM4zt8YzC8wEAnK1o1?usp=sharing
import numpy as np
import rasterio as rio
import pyflwdir
def accumulate_flow(
flow_direction_filename,
headwaters_filename
):
with rio.open(flow_direction_filename) as src:
data = src.read(1)
nodata = src.nodata
profile = src.profile
temp = np.ndarray(shape=np.shape(data), dtype=np.uint8)
temp[data == 1] = 1
temp[data == 2] = 128
temp[data == 3] = 64
temp[data == 4] = 32
temp[data == 5] = 16
temp[data == 6] = 8
temp[data == 7] = 4
temp[data == 8] = 2
temp[data == nodata] = 247
flw = pyflwdir.from_array(temp, ftype='d8', check_ftype=False)
del temp
# Read the flow direction raster
with rio.open(headwaters_filename) as src:
headwaters = src.read(1)
nodata = src.nodata
flowaccum = flw.accuflux(headwaters, nodata=nodata, direction='up')
...
In trying to accumulate flow (accuflux) on larger rasters generated from 1m LiDAR Data, Segmentation Faults are occurring.
The specifics:
1m LiDAR flow direction file & headwaters file (same size raster)
>>> np.shape(data)
(63708, 79780)
flow_direction_filename is 253M
head_waters_filename is 66M
Rasters generated from 3m LiDAR data will not seg fault, and process successfully.
>>> np.shape(data)
(21236, 26593)
flow_direction_filename is 107M
head_waters_filename is 7.4M
When the Python script is called from a shell script, an Exit Status of 139 is observed. Further debugging:
export PYTHONFAULTHANDLER=1
root@f7f7e3af5d8b:/# python3 $srcDir/accumulate_headwaters.py -fd flowdir_d8_burned_filled_0.tif -wg headwaters_0.tif
Fatal Python error: Segmentation fault
Current thread 0x00007ff9300d8000 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/flwdir.py", line 215 in order_cells
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/pyflwdir.py", line 272 in idxs_seq
File "/usr/local/lib/python3.10/dist-packages/pyflwdir/flwdir.py", line 555 in accuflux
File "/accumulate_headwaters.py", line 73 in accumulate_flow
File "/accumulate_headwaters.py", line 110 in <module>
Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, yaml._yaml, numba.core.typeconv._typeconv, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box, scipy._lib._ccallback_c, _cffi_backend, scipy.linalg._fblas, scipy.linalg._flapack, scipy.linalg._cythonized_array_utils, scipy.linalg._flinalg, scipy.linalg._solve_toeplitz, scipy.linalg._matfuncs_sqrtm_triu, scipy.linalg.cython_lapack, scipy.linalg.cython_blas, scipy.linalg._matfuncs_expm, scipy.linalg._decomp_update, scipy.sparse._sparsetools, _csparsetools, scipy.sparse._csparsetools, scipy.sparse.linalg._isolve._iterative, scipy.sparse.linalg._dsolve._superlu, scipy.sparse.linalg._eigen.arpack._arpack, scipy.sparse.csgraph._tools, scipy.sparse.csgraph._shortest_path, scipy.sparse.csgraph._traversal, scipy.sparse.csgraph._min_spanning_tree, scipy.sparse.csgraph._flow, scipy.sparse.csgraph._matching, scipy.sparse.csgraph._reordering, scipy.special._ufuncs_cxx, scipy.special._ufuncs, scipy.special._specfun, scipy.special._comb, scipy.special._ellip_harm_2, scipy.integrate._odepack, scipy.integrate._quadpack, scipy.integrate._vode, scipy.integrate._dop, scipy.integrate._lsoda, scipy.optimize._minpack2, scipy.optimize._group_columns, scipy._lib.messagestream, scipy.optimize._trlib._trlib, numpy.linalg.lapack_lite, scipy.optimize._lbfgsb, _moduleTNC, scipy.optimize._moduleTNC, scipy.optimize._cobyla, scipy.optimize._slsqp, scipy.optimize._minpack, scipy.optimize._lsq.givens_elimination, scipy.optimize._zeros, scipy.optimize.__nnls, scipy.optimize._highs.cython.src._highs_wrapper, scipy.optimize._highs._highs_wrapper, scipy.optimize._highs.cython.src._highs_constants, scipy.optimize._highs._highs_constants, scipy.linalg._interpolative, scipy.optimize._bglu_dense, scipy.optimize._lsap, scipy.spatial._ckdtree, scipy.spatial._qhull, scipy.spatial._voronoi, scipy.spatial._distance_wrap, scipy.spatial._hausdorff, scipy.spatial.transform._rotation, scipy.optimize._direct, scipy.ndimage._nd_image, _ni_label, scipy.ndimage._ni_label, rasterio._version, rasterio._err, rasterio._filepath, rasterio._env, _brotli, rasterio._transform, rasterio._base, rasterio.crs, rasterio._features, rasterio._warp, rasterio._io (total: 98)
Segmentation fault (core dumped)
Naively, the source code was modified (call to order_cells) hardcoding method="sort"
. This did not work. As seen above, line 215 in order_cells, calls core.idxs_seq which appears to be the root of the problem. No further investigation has been made past this point.
Ideally larger rasters would process without segmentation faults. If not, the exception could potentially be handled a little more elegantly from python with a message stating that the raster is too big to process, or....
Another option might entail providing documentation/examples to users on how to split larger rasters into chunks, and then provide the tool/utility to join/concatenate 'blocked' or 'chunked' rasters back into a single pyflwdir.FlwdirRaster
object once processing (whether it be accuflux
or other FlwdirRaster
methods) is finished.
Memory usage was tracked, and it was observed that less than 1/3 of the available RAM was in use when the segmentation fault occurred.
Dear dev team,
first of all, thank you very much for your great work.
I would like to request the feature to be able to set up a FlwdirRaster that has a circular raster referenced. I apply your tools on a global grid defined in a lat-lon grid with lon ranging from -180 to 180 deg. The discontinuity at the boundary causes basins to be split. It would be great to have a way of defining the grid so that the longitutes are angular.
Thanks!
Dear all,
thanks for this package. I am running conda 4.11 on Win10 and have installed pyflwdir as instructed. When following the user guide instructions for plotting I stumble upon below error.
Python 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:22:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from utils import quickplot, colors, cm
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'utils'
Any hints on how to alleviate the situation are highly welcome. Same counts for install under Ubuntu WSL.
Kind regards
Sebastian
method to give unique ids to branches
a method that maps subgrid cells to river cell (at model resolution) outflow points based on D8, sorts the HAND values (in future also manning?) and returns the contributing area for fixed HAND intervals
Dear all,
I am trying to resemble a functionality similar to GisToSWMM using pyflwdir. In my understanding of this algorithm a subcatchment hierarchy is determined considering landuse. Every (sub-)subcatchment contains only one landuse and routes either to another (sub-)subcatchment or an outlet. Outlets are given in the form of enlarged nodes, thus rasterized sink information derived from outlet nodes and DEM sinks.
Reading through pyflwdir documentation I figure that the Pfafstetter method allows for deriving hierarchical subcatchment information. If understood correctly it permits to figure out a form of subcatchment routing, thus which subcatchment drains to another subcatchment or outlet.
I am seeking ways to mask this routing method using landuse, so that every subcatchment contains only one landuse. Any ideas if and/or how to achieve this goal are highly appreciated.
Kind regards
Sebastian
import os.path
from datetime import datetime
import geopandas as gpd
import numpy as np
import rasterio
import pyflwdir
filename = r"c:\temp\hand\altaisk_krai_30m.tif"
min_sto = 6
start_time = datetime.now()
start_time_global = start_time
print(f"Read file: {filename}")
print(f"Start time: {start_time:%Y-%m-%d %H:%M:%S}")
with rasterio.open(filename, "r") as src:
elevtn = src.read(1)
nodata = src.nodata
transform = src.transform
crs = src.crs
extent = np.array(src.bounds)[[0, 2, 1, 3]]
latlon = src.crs.is_geographic
prof = src.profile
end_time = datetime.now()
print(f"End time: {end_time:%Y-%m-%d %H:%M:%S}")
print(f"Duration: {end_time - start_time}")
start_time = datetime.now()
print(f"Calculate flow directions")
print(f"Start time: {start_time:%Y-%m-%d %H:%M:%S}")
flw = pyflwdir.from_dem(
data=elevtn,
nodata=src.nodata,
transform=transform,
latlon=latlon,
outlets="edge",
)
print(f"End time: {datetime.now():%Y-%m-%d %H:%M:%S}")
print(f"Duration: {datetime.now() - start_time}")
start_time = datetime.now()
print(f"Calculate stream network")
print(f"Start time: {datetime.now():%Y-%m-%d %H:%M:%S}")
feats = flw.streams(min_sto=4)
print(f"End time: {datetime.now():%Y-%m-%d %H:%M:%S}")
print(f"Duration: {datetime.now() - start_time}")
start_time = datetime.now()
print(f"Vectorize stream network")
print(f"Start time: {datetime.now():%Y-%m-%d %H:%M:%S}")
gdf = gpd.GeoDataFrame.from_features(feats, crs=crs)
print(f"End time: {datetime.now():%Y-%m-%d %H:%M:%S}")
print(f"Duration: {datetime.now() - start_time}")
start_time = datetime.now()
print(f"Write stream network to file")
print(f"Start time: {datetime.now():%Y-%m-%d %H:%M:%S}")
gdf.to_file(os.path.splitext(filename)[0] + '.shp')
print(f"End time: {datetime.now():%Y-%m-%d %H:%M:%S}")
print(f"Duration: {datetime.now() - start_time}")
print(f"Total duration: {datetime.now() - start_time_global}")
You cannot use rasters larger than ~44000x20000 on a computer with 16gb of memory or less
Crashed here on this line of my script
feats = flw.streams(min_sto=4)
pyflwdir.pyflwdir.FlwdirRaster.streams
pyflwdir/pyflwdir.py:977
...
if xs is None or ys is None:
idxs0 = np.arange(self.size, dtype=np.intp)
-> xs, ys = gis.idxs_to_coords(idxs0, self.transform, self.shape)
...
pyflwdir/gis_utils.py:222
...
-> xs, ys = transform * transform.translation(coff, roff) * (cols, rows)
...
Do not create the whole array with pixel coordinates at once, but only those that are necessary to form geometry in pyflwdir.gis_utils.features
I tried to specify the coordinate accuracy to float32
pyflwdir/gis_utils.py:207
pyflwdir.gis_utils.xy
rows, cols = np.asarray(rows, dtype=np.float32), np.asarray(cols, dtype=np.float32)
But I also got the error
I tried to specify the coordinate accuracy to float16
pyflwdir/gis_utils.py:207
pyflwdir.gis_utils.xy
rows, cols = np.asarray(rows, dtype=np.float16), np.asarray(cols, dtype=np.float16)
I was able to do the calculations, but the accuracy was very poor
upscale_check
and upscale_error
methodsnp.random.seed(11)
dem = np.random.rand(15, 10)
flwdir = pyflwdir.from_dem(dem, outlets='min')
dem1 = flwdir.dem_adjust(dem)
np.all((dem1 - flwdir.downstream(dem1))>=0)
>> False
we get errors in from_dem method with numba 0.51
see #6 for context
I'm trying to get started with pyflwdir -- it looks very powerful. But the code in the quickstart at https://deltares.github.io/pyflwdir/latest/quickstart.html fails when executed in a newly created Python 3.11.6 virtualenv (on Linux, Debian sid, if that matters) after pip installing pyflwdir and rasterio:
(and sorry for the formatting -- I'm using "code <>" but github's markdown seems to be treating python interpreter output like an email message instead of code):
`
import rasterio
with rasterio.open("n35_w107_1arc_v3.tif", "r") as src:
... elevtn = src.read(1)
... nodata = src.nodata
... transform = src.transform
... crs = src.crs
...
import pyflwdir
flw = pyflwdir.from_dem(
... data=elevtn,
... nodata=src.nodata,
... transform=transform,
... latlon=crs.is_geographic,
... )
Traceback (most recent call last):
File "", line 1, in
File "/home/akkana/Data/demnames/dem-full-env/lib/python3.11/site-packages/pyflwdir/pyflwdir.py", line 97, in from_dem
d8 = dem.fill_depressions(
^^^^^^^^^^^^^^^^^^^^^
File "/home/akkana/Data/demnames/dem-full-env/lib/python3.11/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
error_rewrite(e, 'typing')
File "/home/akkana/Data/demnames/dem-full-env/lib/python3.11/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function() found for signature:
heappush(list(Tuple(int16, uint8, uint32, uint32))<iv=None>, Tuple(int64, uint8, uint32, uint32))
There are 2 candidate implementations:
During: resolving callee type: Function()
During: typing of call at /home/akkana/Data/demnames/dem-full-env/lib/python3.11/site-packages/pyflwdir/dem.py (130)
File "dem-full-env/lib/python3.11/site-packages/pyflwdir/dem.py", line 130:
def fill_depressions(
if ~queued[r, c]: # add to queue
heapq.heappush(q, (z1, np.uint8(0), np.uint32(r), np.uint32(c)))
^
`
This error was mentioned in Issue #10 and there was a request to try a more recent numba, but that was in 2021 and I see this with the numba currently in pip.
The elevation data I'm using is SRTM downloaded four years ago -- maybe there's a newer format that pyflwdir is expecting?
My apologies for opening this issue as I don't know how else to do this, but I was wondering if the pyfldir toolkit can be used to calculate the distance to a river network? I've been desperately looking for an open-source tool to do this. So a method that uses a flow direction raster to calculate the flow distance to the nearest stream.
Subgrid river lengths can locally be very short which slows down the model. Potentially we can redistribute the subgrid river lengths within a single river branch to speed up calculations without making a large computational error.
method in flwdir.py
numba jitted method in streams.py
suggested algorithm
from up- to downstream:
while not at next confluence or less than n cells:
get lengths & indices
average lengths
assert that the total river length is the same
Dear @DirkEilander ,
This is Jiangchao Qiu, a Ph.D. student from Sun Yat-sen University, China. I am major in simulating storm surge using physical model.
I know you by reading your paper : The effect of surge on riverine flood hazard and impact in deltas globally. Great works!
Recently, my collaborator provides me some global modeling result using CaMa-Flood in 15 arcmin resolution. I am trying to analyze these result and make some visualization, especially the river network and river mouths along the coastlines.
Firstly, I used the function of cmf_maps_io.py (in your compound_hotpots repositories) to transform the nextxy.bin file to nextxy.tif file
glb_15min.zip
Secondly, I want to use the pydlwdir package to visualize the river network, but some errors occurred when I parse the nextxy.tif file to
the actionable common format.
the following is the screensnaps
I try to use the function of pyflwdir.FlwdirRaster and pyflwdir.from_array, but both failed, I don't know how to fix it, could you please help me check and give some suggestion about how to solve this problem.
By the way, the environment of my package seems no problem, since all of the example in the notebook folder can be carried out smothly.
Many thanks to you and looking forward to your reply.
Best regards,
Jiangchao
ValueError: The flow direction data with type "d8" is invalid.
Adding new functionality
pyflwdir is the best module I have worked with for watershed delineation. It is very fast and easier to handle for large raster. However, I have not found any function regarding D-inf flow direction which is also used widely. If possible, please add this functionality
No response
No response
Hi @DirkEilander ,
I want to identify all of the river mouth and make a plot using the 3 arcmin river network dataset nextxy.bin.
But it is odd that the figure I plotted here seems reversed in the latitude. Could you please have a look? Many thanks.
Best regards,
Jiangchao
my scripts:
import pyflwdir
import numpy as np
bbox=[-180, -90, 180, 90]
data, transform = pyflwdir.read_nextxy(
fn='glb_03min/nextxy.bin', nrow=3600, ncol=7200, bbox=bbox)
import xarray as xr
xr.DataArray(data[0]).plot()
x = data[0]
lon = np.linspace(-180,180,7200)
lat = np.linspace(-90,90,3600)
lon[np.where(x==-9)[1]],lat[np.where(x==-9)[0]]
import geopandas as gpd
from shapely.geometry import Point
gdf = gpd.GeoDataFrame([Point(x, y) for x,y in zip(lon[np.where(x==-9)[1]], lat[np.where(x==-9)[0]])])
gdf.columns = ['geometry']
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
fig, ax = plt.subplots()
gdf.plot(ax=ax,markersize=0.1, color='red')
world.plot(ax=ax);
Feature request: This may be in the code but I haven't see it in the docs anywhere. The flow path trace function returns a list of pixel indices in the linear index format. A helper function to convert to the coordinate system of the raster would be helpful.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.