lumpr's Introduction

lumpR

Landscape Unit Mapping Program for R

DESCRIPTION

This project deals with an R package called "lumpR". The package provides functions for a semi-automated approach to delineating and describing landscape units and partitioning them into terrain components. It can be used for pre-processing semi-distributed large-scale hydrological and erosion models that use a catena representation (WASA-SED, CATFLOW). It is closely connected to, and uses functionalities of, GRASS GIS. Additional pre-processing tools beyond the scope of the original LUMP algorithm are included.

INSTALLATION

  • command line installation:
install.packages("devtools") 
library(devtools)
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS=TRUE) # tell install_github() to ignore warnings; otherwise installation stops at each warning
install_github("tPilz/lumpR")
  • from zip/tar:
    • download zip/tar from github: >LINK<
    • install via R-GUI

The main branch relies on GRASS 7. The migration of the package to GRASS 8 is underway, but not fully tested: https://github.com/tpilz/lumpR/tree/grass8

MORE INFORMATION

Have a look at our wiki for more detailed information: >LINK<

FEEDBACK and BUGS

Feel free to comment via github issues: >LINK<

LICENSE

lumpR is distributed under the GNU General Public License version 3 or later. The license is available in the file GPL-3 of lumpR's source directory or online: >LINK<

NOTE

This package was formerly known as LUMP and was renamed on Jan 9th, 2017 to distinguish it from the LUMP algorithm published by Francke et al. (2008).

REFERENCES

A paper describing lumpR along with an example study was published in GMD:

Pilz, T., Francke, T., and Bronstert, A.: lumpR 2.0.0: an R package facilitating landscape discretisation for hillslope-based hydrological models, Geosci. Model Dev., 10, 3001-3023, doi: 10.5194/gmd-10-3001-2017, 2017.

See also the accompanying github repository: https://github.com/tpilz/lumpr_paper

For the original LUMP algorithm see:

Francke, T., Güntner, A., Mamede, G., Müller, E. N., and Bronstert, A.: Automated catena-based discretization of landscapes for the derivation of hydrological modelling units, Int. J. Geogr. Inf. Sci., 22, 111-132, doi:10.1080/13658810701300873, 2008.

lumpr's People

Contributors: tillf, tpilz

lumpr's Issues

Actual coordinates of most representative catena

Could you please give guidance on how to add a printout of the actual (x,y) coordinates of the (central line?) of all catenas to rstats.txt, in addition to the id? It is easy to get the ids of the most representative catenas, but to explore them in the field (we do soil profiling) the actual coordinates would be an asset. Thanks!

lumpR & WASA-snow

It would be nice to have lu2.dat, as needed by WASA-snow, produced automatically by lumpR.
The parameters "aspect" and "slope" would be needed if "do_rad_corr"=T in snow_params.crtl (i.e., a radiation correction is executed).

create database based on function output

Function that creates a database (e.g., MySQL) and writes parameters estimated by existing functions into that database. I am not sure if there is a practical way to realise that in an R function.

calc_subbas: Snapping to flowlines can be problematic

Snapping to flow lines (rivers) can result in outlet points that are not intended (see picture "snap2line": circle: original outlet point; arrow: snapped point; red cells: river from flow accumulation; orange: erroneously constructed basin; black line: expected watershed).

parameter database: correcting fractions of entities (normalization)

The current practice in "filter_small_areas" drops the respective tables and uploads the corrected ones.
Fine with me, but this will probably violate constraints/relations. Do you still use them anywhere?
If so, using the SQL queries in my latest version of database.R may be the better option to do the correction.
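An in-place correction could rescale the fractions with a single UPDATE instead of dropping and re-uploading the table. A minimal sketch in SQLite syntax, assuming a table r_lu_contains_svc with columns lu_id and fraction (illustrative names, not necessarily lumpR's actual schema):

```sql
-- Hypothetical schema: r_lu_contains_svc(lu_id, svc_id, fraction).
-- Rescale the fractions within each LU so they sum to 1,
-- leaving the table (and any constraints on it) in place.
UPDATE r_lu_contains_svc
SET fraction = fraction / (
  SELECT SUM(x.fraction)
  FROM r_lu_contains_svc AS x
  WHERE x.lu_id = r_lu_contains_svc.lu_id
);
```

This keeps foreign-key relations intact, since no rows are dropped and re-inserted.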

rainy_season() Fortran code

The external code used by rainy_season() is old Fortran 77 with implicit declarations, hardly any comments, etc. Thus, re-coding it in, e.g., Fortran 90/95 and adding some comments would be a nice improvement, though it is not absolutely necessary.

parameter database: Include reservoir/river tables

Include reservoir table into parameter database.

Possibly affects:

  • db_create()
  • db_update() as this comes with a new database version
  • db_fill() based on output of function wasa_reservoir_par()
  • db_check() ?!

Suggestions for performance improvement

to be updated

Some suggestions for performance improvement:

  • area2catena() / prof_class(): catena_file could be written/imported as an Rdata file (maybe optionally via an argument flag), which would speed up writing/reading operations; relevant for large catchments and/or high resolution -> commit 24e281c

  • prof_class(): source code is still rather messy and could be improved (consider tidyverse philosophy)

to be updated
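The binary-file point could be as simple as switching the I/O backend; a sketch, where use_rdata is a hypothetical argument flag and catena_data stands for the data currently written to catena_file:

```r
# Sketch only: 'use_rdata' and 'catena_data' are illustrative names,
# not actual lumpR arguments.
if (use_rdata) {
  saveRDS(catena_data, file = "catena.rds")   # compact binary write
  catena_data <- readRDS("catena.rds")        # much faster than read.table()
} else {
  write.table(catena_data, file = "catena.txt", row.names = FALSE)
  catena_data <- read.table("catena.txt", header = TRUE)
}
```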

remarks lumpR description & output

2 small remarks:

  1. the do.dat currently produced by lumpR contains spaces after the [ brackets in lines 6 and 7
  2. in soter.dat, frgw_delay[day] is given with 11 decimal places (regarding significant digits, maybe no decimal places are needed)

parameter database: sometimes error occurs when closing sqlite database

Function odbcClose() from package RODBC, which is used internally within the db_* functions, sometimes causes an error when using SQLite 3 (reproduced under openSUSE using unixODBC, and under Windows):

library(RODBC)
# connect to ODBC sqlite database
con <- odbcConnect(dbname, believeNRows=F)
# fetch data from database table
dat <- sqlFetch(con, table)
# close database
odbcClose(con)

Error in odbcGetErrMsg(channel) : 
  first argument is not an open RODBC channel
In addition: Warning messages:
1: In odbcClose(con) : [RODBC] error in SQLDisconnect
2: In odbcClose(con) : [RODBC] error in SQLFreeconnect

However, the channel is still open, as this works:

library(RODBC)
con <- odbcConnect(dbname, believeNRows=F)
dat <- sqlFetch(con, table)
dat2 <- sqlFetch(con, other_table)

And this causes no trouble:

library(RODBC)
con <- odbcConnect(dbname, believeNRows=F)
odbcClose(con)

Using another DBMS works without errors as well.

Classified catena output

Could you please provide txt output with the partitioning class (x-distance, y-relative elevation gain) for each catena of the resulting classes that are currently printed to the final section of plots_prof_class.pdf? This is needed for direct processing in code other than the WASA model.

doMC breaks Windows-compatibility

Afaik, doMC only runs on *nix. It should be moved to the "Suggests" section of the DESCRIPTION file and the related calls made optional. Otherwise, the package cannot be installed under Windows.
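With doMC in "Suggests", the registration could be wrapped in an availability check; a sketch (ncores stands for the user-supplied core count):

```r
# Sketch: doMC is Unix-only, so register it only if it can be loaded;
# otherwise fall back to sequential execution, which also works on Windows.
if (requireNamespace("doMC", quietly = TRUE)) {
  doMC::registerDoMC(cores = ncores)
} else {
  foreach::registerDoSEQ()  # %dopar% loops then run sequentially
}
```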

Revise messaging

The print() command was used to generate informative messages during function execution, which was a mistake. It should be replaced by message(). The whole messaging system of the package should be revised.
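The difference matters for users: message() writes to stderr and can be silenced selectively, while print() clutters stdout. A minimal sketch:

```r
# message() instead of print() for progress information:
message("Processing catenas ...")     # goes to stderr
suppressMessages(message("hidden"))   # users can now mute the package's output
```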

Problem with installation of lumpR

On the latest versions of RGui and Rtools running on a Windows 10 x64 machine, I cannot install the current version of lumpR.
The error message is the following:
Error: package or namespace load failed for 'lumpR' in namespaceExport(ns, exports): undefined exports: db_compute_musleK
I use the standard set of commands from the RGui command line:
install.packages("devtools")
library(devtools)
install_github("tPilz/lumpR")

reduce GRASS messages

When executing GRASS commands within functions, a lot of GRASS messages appear on the screen, which seems a bit messy. This should be reduced.

parameter database: db_check should only check

In the first place, "db_check" should only check and issue a report. Only when setting an argument fix=TRUE should the changes be made. Thus, one may be better able to detect the reason for certain inconsistencies and possibly change them manually.
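The proposed behaviour, sketched with the existing db_check() interface (the fix argument default is an assumption):

```r
# Sketch: report-only by default, modify the database only on explicit request.
db_check(dbname, check = "delete_obsolete", fix = FALSE)  # just report inconsistencies
db_check(dbname, check = "delete_obsolete", fix = TRUE)   # actually apply the corrections
```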

prof_class(): problems when classification method = 1

When classification method = 1 (i.e. specifying the overall number of classes and a weighting factor for each attribute), the calculation of the dissimilarity matrix using function daisy() causes problems. If classification method = -1, the matrix is re-calculated ("quick and dirty computation of distance matrix to allow the rest of the script to be run without problems"). The procedure should be revised.

Performance db-cleaning operations

I went through the db-cleaning operations and noticed some things that could be changed if performance becomes an issue:

  • the removal of obsolete entities could be done with a selective SQL statement instead of dropping and reloading the entire table (or was there a reason for this?)
  • given the former, the obsolete entries could be obtained by an SQL query instead of requesting the entire table
  • the entire routine could be made more generic by iterating through the list of tables (subbas, r_subbas_contains_lu, lu, r_lu_contains_svc, ...) and checking, for each entry in the main table, whether it is contained in the subsequent one.
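The first two points could be sketched in SQL (SQLite syntax; key column names such as pid and lu_id are assumptions, not necessarily lumpR's actual schema):

```sql
-- Delete obsolete LUs directly instead of dropping and re-uploading the table:
DELETE FROM lu
WHERE pid NOT IN (SELECT lu_id FROM r_subbas_contains_lu);

-- Query only the obsolete entries instead of fetching the whole table:
SELECT pid FROM lu
WHERE pid NOT IN (SELECT lu_id FROM r_subbas_contains_lu);
```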

use g.region to focus on core area

Currently, we mainly employ the MASK to define the area of interest.
g.region is used only once, in lump_grass_prep.R, to define the region, but I assume it would be better to save the region under a specific name and set this region explicitly in each of the subsequent steps. This would ensure more consistent behaviour when resuming the workflow in between:
g.region --overwrite zoom=MASK save=saved_region # creates region from MASK
g.region region=saved_region # recalls saved region

RAM limitation observation with area2catena() and large study areas

Dear Tobias and Till,
We found out something about study area size and RAM limitation working with lumpR that might be interesting for you:
My study area is quite large: at 90 x 90 m resolution it has 75,545,028 cells (details further down). area2catena() works with 6 layers, which means that 75,545,028 cells * 6 layers = 453,270,168 cells are used, transferred and processed in the function. That process reaches the limits of my computer's RAM (8 GB). Using the recommended parallelisation (more cores) leads to some kind of overflow of the RAM requirements. Observing a task manager, you can see that the CPU drops to 1-5% while the RAM is full (95%). The function will never end (I interrupted it after 5 days of computing during the long Easter weekend); you need to force-stop the whole machine.
The solution that worked for us is to use only 1 core. With that, the function completes successfully. But: the computation takes about 12 h and the RAM is at its limit (nothing else works anymore, the R session gives the dubious "cannot allocate memory" message for any action you try, and you need to restart the computer).
A possible explanation is that by needing more RAM than your computer offers, processes are transferred to swap (deferred for later). Transferring data to swap and back takes much time, which can explain the long computing duration.

Summary: for study areas larger than ours, you should use a server or a computer that has more than 8 GB of RAM.

Details:
EHAs = ca. 28,000 (with these parameter settings, nearly all of them are included)

Region:
projection: 1 (UTM)
zone: -23
datum: wgs84
ellipsoid: wgs84
north: 8394608.94608946
south: 7573260.73260733
west: 150000
east: 895015.88956751
nsres: 90.00090001
ewres: 89.99950345
rows: 9126
cols: 8278
cells: 75545028

Mask:
Type of Map: raster Number of Categories: 1
Data Type: CELL
Rows: 9126
Columns: 8278
Total Cells: 75545028
Projection: UTM (zone -23)
N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001
E: 895015.88956751 W: 150000 Res: 89.99950345
Range of data: min = 1 max = 1
Data Description:
generated by r.mapcalc
Comments:
if(isnull(elev_riv), null(), 1)
+----------------------------------------------------------------------------+

DATA:

Digital Elevation Model:

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 255 |
| Data Type: FCELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = 299 max = 2076 |
| |
| Data Description: |
| generated by r.mapcalc |
| |
| Comments: |
| if(mask_with_dam == 100, dem_shrink + 100, dem_shrink) |
| |
+----------------------------------------------------------------------------+

Flow Accum

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 255 |
| Data Type: DCELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = 1 max = 7048270 |
| |
| Data Description: |
| generated by r.mapcalc |
| |
| Comments: |
| abs(flow_accum_t) |
| |
+----------------------------------------------------------------------------+

Eha

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 0 |
| Data Type: CELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = 21189 max = 53450 |
| |
| Data Description: |
| generated by r.grow |
| |
| Comments: |
| r.grow input="eha_t2" output="eha" radius=100 metric="euclidean" |
| |
+----------------------------------------------------------------------------+

dist_riv

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 255 |
| Data Type: DCELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = 0 max = 197.989052746315 |
| |
| Data Description: |
| generated by r.mapcalc |
| |
| Comments: |
| dist_riv_t / 90.000202 |
| |
+----------------------------------------------------------------------------+

elev_riv

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 255 |
| Data Type: FCELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = -78 max = 669 |
| |
| Data Description: |
| generated by r.stream.distance |
| |
| |
+----------------------------------------------------------------------------+

svc

|----------------------------------------------------------------------------|
| |
| Type of Map: raster Number of Categories: 960 |
| Data Type: CELL |
| Rows: 9126 |
| Columns: 8278 |
| Total Cells: 75545028 |
| Projection: UTM (zone -23) |
| N: 8394608.94608946 S: 7573260.73260733 Res: 90.00090001 |
| E: 895015.88956751 W: 150000 Res: 89.99950345 |
| Range of data: min = 0 max = 960 |
| |
| Data Description: |
| generated by r.cross |
| |
| |
+----------------------------------------------------------------------------+

Kind regards,
Lisa and Josee

calc_subbas(): Single flow vs. Multiple flow algorithm

At the moment, subbasin delineation using the GRASS function r.watershed uses the single flow direction (SFD) algorithm. In the literature it is usually stated that the multiple flow direction (MFD) algorithm should be superior. In my tests, however, when employing MFD (still using GRASS 6.4.5), there were problems, as the generated flow accumulation map diverges at some points (causing trouble with the identification of river cells and catchment outlets) and the generated subbasins are more variable in size.

One could think a little deeper about that problem. So far, I decided to use the SFD algorithm.

Suggestions for revision

The code in its current form is too messy, and the chances of introducing new bugs with commits are too high. A list of general points that should be considered for code revision:

  • Implement object-oriented programming
    • e.g. object of class lumpr which contains meta-information, which would make it easier to process the chain of lumpR functions (user only needs to pass the lumpr object from function to function)
    • S3 would probably be the most straightforward way; see this general introduction
  • replace the messy tryCatch() calls by on.exit(), where it makes sense
  • re-organise code and package structure following these guidelines
    • let roxygen2 organise NAMESPACE to prevent bugs such as b7e5021
    • more sub-functions
  • improve performance
  • create an example dataset and write vignettes covering different topics
    • use data stored as binaries to avoid license conflicts (and reduce memory)?!
  • add unit tests to reduce the chance of introducing new bugs with commits
  • implement default parameter (function argument) values as much as possible to simplify the workflow for new users (I think at the moment it's just too confusing for them)
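The on.exit() point could look like this; a sketch using rgrass7's execGRASS(), where my_grass_step and the temporary map name are illustrative:

```r
# Sketch: clean-up registered with on.exit() runs on both normal return and
# error, avoiding deeply nested tryCatch() constructs.
my_grass_step <- function(dem) {
  on.exit(execGRASS("g.remove", type = "raster", name = "tmp_map", flags = "f"),
          add = TRUE)
  # ... GRASS processing that may fail goes here ...
}
```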

Test different OS

Package developed on Linux opensuse 13.1 and GRASS 6.4.3.

Should be tested on (at least) Windows and macOS.

db_compute_musleK extra options

Wish list of additional features:

  • option to automatically set musle_p to 1, if desired
  • automatically copy (from "horizons.dat"), column "coarse_frag" for 1st soil horizon into data base table "soil_veg_components", column "coarse_frac"
  • inconsistent nomenclature between "horizons.dat" column "coarse_frag" and "svc.dat" column "coarse_frac"?

Error in data cleaning in db_check delete obsolete

db_check(dbname,
         check=c("delete_obsolete"),
         fix=T,
         verbose=T)

When executing, the following error message is returned:

% Write changes into database and update 'meta_info' (might take a while) ...
Error in [.data.frame(dat_tbl, , key_t[1]) : undefined columns selected

The corresponding data base can be found at: V:\xchange\erwin\4TPilz

sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] lumpR_2.2.0 [...]

Compatibility to GRASS 7

The package needs to be adjusted to be compatible with GRASS 7. GRASS 7, however, is not backward compatible with GRASS 6.4, as many function argument names changed, etc. I.e., I guess it will take a few days to adjust lumpR. Nevertheless, it should be worth it, as GRASS 7 is supposed to be more efficient, and maybe lumpR will run faster.

memory issues with calc_subbas

For large catchments (e.g. Sao Francisco) calc_subbas issues this error:

Calculate drainage and river network...
SECTION 1a (of 4): Initialising memory.
Current region rows: 20217, columns: 14047
ERROR: G_malloc: unable to allocate 2272853112 bytes at init_vars.c:134

According to other users, this is a memory issue of the GRASS GIS module r.watershed.

It is recommended to activate the flag -m (enable the disk swap memory option; operation is slow and only needed if memory requirements exceed the available RAM; see the manual on how to calculate memory requirements: http://grass.osgeo.org/grass70/manuals/r.watershed.html#in-memory-mode-and-disk-swap-mode).
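A call with disk-swap mode enabled might look like this (map names and the threshold are placeholders; the memory option, in MB, is per the GRASS 7 manual):

```shell
# -m trades speed for a bounded memory footprint; 'memory' caps RAM use in MB.
r.watershed -m elevation=dem threshold=100000 \
    accumulation=flow_accum drainage=drain_dir memory=2000
```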

prof_class: cluster analysis

In function prof_class(), the cluster analysis option is deactivated and a variance-based method is used instead. Maybe cluster analysis could be re-activated in a revised form?

Potential simplifications

A list of potential simplifications to make life easier, especially for new users:

-- to be updated --

  • calc_subbas(): determine reasonable parameter values automatically (e.g., use resolution and region size as proxies)
    -- to be updated --

revise parallelisation of area2catena

In its current form, the parallelisation of area2catena() seems to require replicating the large grids. Instead, parallelising the calls using just the data required for single EHAs could improve performance significantly.

include contents of database.R (database interface)

Proposed subroutines:

  • create_db (con, ver_no)
  • update_db(con, ver_no)
  • fill_db(filenames, ...)
  • check_db (probably not a true function, but needs to be completed line by line)
  • export_db(con, dest_dir) (from make_wasa_input minus the checking)

Improve generation of do.dat

Some more parts in do.dat could be set automatically according to content of db:

  • doreservoirs
  • doacudes
  • dosnow (? tables yet to be implemented)

reservoir_lumped(): unify behaviour

Currently, reservoir_lumped() directly creates WASA input files. It would be better if it behaved like the other routines: create output files that can be re-imported into the db using db_fill().

issues grass7_2018

- to be continued -

Remaining issues I observed with my test dataset along with GRASS 7 adaptation:

  • calc_subbas(): message of number of subbasins left after removing spurious subbasins is wrong (i.e. not the actual number found in GRASS) -> resolved by commit 3fc4c67
  • calc_subbas(): differing number of categories in subbas and drainage points might be intentional (e.g. when drain_points and thresh_sub are given) -> remove or adjust warning message -> resolved by commit 3fc4c67
  • calc_subbas(): column subbas_id in output <points_processed>_snap does not correspond to categories in output basin_out -> resolved by commit 3fc4c67
  • calc_subbas(): if column subbas_id is given in input drain_points the categories in the resulting basin_out raster do NOT correspond to this column as promised by the documentation -> resolved by commit 3fc4c67
  • area2catena(): GRASS reclass files produced even with argument grass_files = FALSE; this occurs only if an element in arg supp_qual comes without explicit mapset declaration
  • reservoir_lumped() and reservoir_strategic(): @TillF Adaptation to Windows might be necessary (relates to readVECT() mapset issues)
  • db_create(): When trying to re-create an existing database (v. 26) with overwrite="drop", there is an error: Table 'x_seasons' already exists when updating to version 26. Rename / delete manually, and repeat update. Note: the message naming x_seasons is exemplary; when removing x_seasons, the same error occurs with the next table. I don't understand how I am supposed to overwrite an existing table.
  • db_create(): does not work with overwrite = 'empty' applied to an existing database (error raised by an internal call to db_update(); see post below).

Observed changes in behaviour in comparison to lumpR v2.5.0, latest GRASS 6.4 based version (@TillF check if this could be reasonable):

  • order of subbasins (i.e. their ID) in output files changed (but statistics, i.e. cell counts for specific subbasins, remain unchanged) -> intended and no problem
  • lump_grass_prep(): differences in the EHA map (IDs -> no problem; slightly different EHA sizes, max. 13 grid cells in my test setup) -> I made some tests and it turned out the raw outputs of r.watershed (i.e. argument half.basin/half_basin) are different, even with the same input data, threshold values and the same algorithm (SFD)
  • results from prof_class() deviate slightly in some occasions (TC definition), even with the same seed and the same input files and parameter settings (should not be problematic)
  • reservoir_outlet(): for vector map res_vct, the columns res_id and name are now required. I don't understand the reason; I think this is unnecessary. Maybe an argument should be added where the user can choose which column contains reservoir IDs (the standard should be column cat). -> removed the necessity to contain column name; res_id should be fine
  • reservoir_lumped(): When looking at the source code, I get the feeling that there are superfluous commands resulting in higher computation time (e.g. res_lump <- readVECT(res_vect_class, type="point") in lines 323 and 341).

Function execution times:

  • calc_subbas() now takes twice the execution time in my example -> seems to be case specific; in a further test I made with different data it was faster... I guess, on the one hand, GRASS operations are faster, but on the other hand some recent extensions (specifically, ensuring that drain points are not directly in the middle of a cell, which includes some additional GRASS as well as read and write operations) make it slower
  • lump_grass_prep() takes less than half of the former runtime which is good but might be connected to the issue above? -> GRASS operations are faster, different results caused by different behaviour of r.watershed, see above
  • reservoir_lumped() is considerably slower

- to be continued -

calc_seasonality() performance

The function uses loops in R code which make it very slow for large datasets. These loops could be moved into the underlying Fortran code, which would result in a large improvement in performance.
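Until the loops are moved into Fortran, vectorising them on the R side may already help; an illustrative example of the pattern (dummy data, not the actual calc_seasonality() code):

```r
# Element-wise loop vs. vectorised equivalent on a stations-by-days matrix:
rain <- matrix(runif(100 * 365), nrow = 100)   # dummy data: 100 stations
res <- numeric(nrow(rain))
for (i in seq_len(nrow(rain))) res[i] <- sum(rain[i, ] > 0.5)  # slow R loop
res2 <- rowSums(rain > 0.5)                    # vectorised, same result
```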

R CMD check issues in v1.0.0 (RainySeason.f)

Running R CMD check LUMP for release v1.0.0 produces one warning and one note which still have to be resolved:

Compiler warnings from RainySeason.f:

Found the following significant warnings:
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(8) to REAL(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)
  Warning: Possible change of value in conversion from REAL(4) to INTEGER(4) at (1)

These implicit type conversions might be an issue.

One note regarding calc_seasonality.f90:

* checking compiled code ... NOTE
File '/home/tobias/R/R.checks/LUMP.Rcheck/LUMP/libs/LUMP.so':
  Found '_gfortran_stop_string', possibly from 'stop' (Fortran)
    Object: 'calc_seasonality.o'
Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console.

See 'Writing portable packages' in the 'Writing R Extensions' manual.

database operations are slow

Database operations using RODBC are very slow, which becomes significant when processing large amounts of data. I also noticed that on Linux it is slower than on Windows (for whatever reason), and there are also differences regarding the employed DBMS (e.g., SQLite is slower than MariaDB/MySQL).

There are a few discussions around regarding this issue. However, it seems to be necessary to employ a different R package (and adapt lumpR accordingly) to speed up database processing.

A solution could be the package RJDBC, see http://stackoverflow.com/questions/30943748/r-painfully-slow-read-performance-using-rodbc-sql-server.
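Besides RJDBC, the DBI family of packages would be another candidate; a sketch using DBI with RSQLite (a different backend than the ODBC route, shown only as an alternative; the table name is illustrative):

```r
# Sketch: DBI-based access, bypassing the ODBC layer entirely.
library(DBI)
con <- dbConnect(RSQLite::SQLite(), dbname)  # 'dbname' as in the RODBC examples
dat <- dbReadTable(con, "subbas")            # replaces sqlFetch()
dbDisconnect(con)                            # replaces odbcClose()
```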

dists (column closest_dist in lu.dat) is always zero

This column was intended to indicate the (virtual) distance of the closest (most similar) catena to the LU toposequence. It is not needed for operational purposes.
Still, the column should either be computed correctly or removed completely.
