sdtaylor / grasslandmodels
License: MIT License
Have an open-ended dictionary for storing arbitrary metadata. This will allow for things like fitting params, data, etc.
Add to Base()
__init__
    self.metadata = {}
Have an update_metadata method so the whole dictionary isn't overwritten
def update_metadata(self, **kwargs):
    self.metadata.update(kwargs)
Make sure it's saved by adding it to _get_model_info
def _get_model_info(self):
    return {'model_name': type(self).__name__,
            'parameters': self._fitted_params,
            'metadata': self.metadata}
Read it in from saved files in load_model_parameters
else:
    # For all other ones just need to pass the parameters
    Model = load_model(model_info['model_name'])
    model = Model(parameters=model_info['parameters'])
    model.metadata = model_info['metadata']
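The pieces above can be put together in a minimal, self-contained sketch of the save/load round trip. ToyBase is a hypothetical stand-in for Base(), and JSON is only assumed as the storage format. Reading the metadata with .get() is one possible way to stay compatible with files saved before the metadata key existed.

```python
import json


class ToyBase:
    """Hypothetical stand-in for Base(), showing only the metadata handling."""

    def __init__(self, parameters=None):
        self._fitted_params = parameters or {}
        self.metadata = {}

    def update_metadata(self, **kwargs):
        # Merge new entries into the existing dictionary
        self.metadata.update(kwargs)

    def _get_model_info(self):
        return {'model_name': type(self).__name__,
                'parameters': self._fitted_params,
                'metadata': self.metadata}


m = ToyBase(parameters={'b1': 1.0})
m.update_metadata(optimizer='DE')
m.update_metadata(maxiter=5)

# Round trip through a JSON string, standing in for a saved file
info = json.loads(json.dumps(m._get_model_info()))

restored = ToyBase(parameters=info['parameters'])
# .get() keeps older files without a 'metadata' key loadable
restored.metadata = info.get('metadata', {})
```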
Need to make sure the phenograss numpy methods can handle any number of dimensions as long as time is the first one, which is how it's designed.
Copy the test data onto several other dimensions and make sure the predictions match along them.
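A rough sketch of what such a test could look like. toy_growth is a hypothetical stand-in for the real model methods, not the phenograss code. It only indexes the time axis, so any trailing dimensions should broadcast.

```python
import numpy as np


def toy_growth(precip, b=0.1):
    """Hypothetical stand-in model: a state updated step by step along axis 0 (time)."""
    V = np.zeros_like(precip, dtype=float)
    for i in range(1, precip.shape[0]):
        # Only the time axis is indexed, so extra trailing dimensions broadcast
        V[i] = V[i - 1] + b * precip[i] * (1 - V[i - 1])
    return V


rng = np.random.default_rng(0)
precip_1d = rng.random(50)            # shape (time,)
base = toy_growth(precip_1d)

# Copy the same series onto extra trailing dimensions: (time, 3, 4)
precip_3d = np.tile(precip_1d[:, None, None], (1, 3, 4))
pred_3d = toy_growth(precip_3d)

# Every slice along the added dimensions should match the 1-D prediction
matches = np.allclose(pred_3d, base[:, None, None])
```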
Need a load_fitted_model util to load specific parameter sets,
i.e. load_fitted_model('phenograss-original') for the Hufkens 2016 parameters,
or load_fitted_model('CholerPR1-original').
Or maybe call it load_prefit_model.
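One possible shape for this is a name-keyed registry. Everything here is hypothetical: the registry, the function name load_fitted_model_info, and the info layout. The parameter values are the original phenograss ones used in the test script in this listing.

```python
# Hypothetical registry of pre-fit parameter sets, keyed by name
PREFIT_PARAMETERS = {
    'phenograss-original': {
        'model_name': 'PhenoGrass',
        'parameters': {'b1': 124.502121,
                       'b2': 0.00227958267,
                       'b3': 0.0755224228,
                       'b4': 0.519348383,
                       'L': 2.4991734,
                       'Phmin': 8.14994431,
                       'h': 222.205673,
                       'Topt': 33.3597641,
                       'Phmax': 37.2918091},
    },
}


def load_fitted_model_info(name, registry=PREFIT_PARAMETERS):
    """Return the stored model info for a named parameter set."""
    try:
        return registry[name]
    except KeyError:
        available = ', '.join(sorted(registry))
        raise ValueError(f'Unknown parameter set {name!r}. Available: {available}')


info = load_fitted_model_info('phenograss-original')
```

A real load_fitted_model would presumably pass info['parameters'] on to the matching model class.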
Otherwise they extrapolate to large f values
example outlined here looks very promising with cython
https://ipython-books.github.io/56-optimizing-cython-code-by-writing-less-python-and-more-c/
Soil water holding capacity is not actually used in the PR1 model, but it is used in PR2 and PR3.
If it's set to 1 then V will potentially never decrease, because the senescence part of eq. 2 cancels out:
d * b3 * V[i] * (1-V[i])
Will only happen if V reaches 1.0 to begin with, which is a huge outlier but still possible.
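A quick numeric check of that term, as a sketch. The value of b3 is taken from the original phenograss parameters, and d=1 is assumed to mean senescence is switched on.

```python
def senescence_term(V, d=1, b3=0.0755224228):
    """Senescence part of eq. 2 as written above: d * b3 * V * (1 - V)."""
    return d * b3 * V * (1 - V)


# The term shrinks as V approaches 1 and vanishes exactly at V = 1,
# so senescence can no longer pull V back down
terms = {V: senescence_term(V) for V in (0.5, 0.9, 0.99, 1.0)}
```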
see #2
Need to drop the phenograss parameter b4 and make b1 actually be used instead of it just being set to Wp.
Maybe keep the original phenograss intact so it matches the phenograss.f90 from the original paper? And make this adjustment just to PhenoGrassNDVI
They aren't needed as the scaling associated with these is done on the normal python end.
The python code generates a csv with both the modelled and observed GCC. The R code recreates the timeseries and R2 figures from Hufkens 2016.
Python:
from GrasslandModels import models, utils
import numpy as np
import pandas as pd

GCC, predictor_vars = utils.load_test_data()

original_phenograss_params = {'b1': 124.502121,
                              'b2': 0.00227958267,
                              'b3': 0.0755224228,
                              'b4': 0.519348383,
                              'L': 2.4991734,
                              'Phmin': 8.14994431,
                              'h': 222.205673,
                              'Topt': 33.3597641,
                              'Phmax': 37.2918091}

# Pass the dictionary itself, not its name as a string
m = models.PhenoGrass(parameters=original_phenograss_params)
prediction = m.predict(predictor_vars)

available_sites = ['freemangrass_grass',
                   'ibp_grassland',
                   'kansas_grassland',
                   'lethbridge_grassland',
                   'marena_canopy',
                   'vaira_grass']

# put the modelled GCC back into site files
all_site_data = pd.read_csv('GrasslandModels/data/site_data.csv.gz')
all_site_data['modelled_gcc'] = np.nan

# prediction has one column per site, in the order of available_sites
for site_i, site_name in enumerate(available_sites):
    all_site_data.loc[all_site_data.Site == site_name, 'modelled_gcc'] = prediction[:, site_i]

all_site_data.to_csv('phenograss_test_run_cython.csv', index=False)
R code
library(tidyverse)
phenograss_output = read_csv('~/projects/GrasslandModels/phenograss_test_run_cython.csv',
col_types = cols(gcc=col_double()))
phenograss_output$Site = factor(phenograss_output$Site,
levels = c("marena_canopy","freemangrass_grass","kansas_grassland","vaira_grass","lethbridge_grassland","ibp_grassland"),
labels = c('Marena', 'Freemangrass', 'Kansas', 'Vaira', 'Lethbridge', 'IBP'))
# Timeseries plots
phenograss_output %>%
filter(year>=2012) %>%
select(date, phenocam_gcc = gcc, modelled_gcc, year, site=Site) %>%
# gather(gcc_source, gcc_value, phenocam_gcc, modelled_gcc) %>%
ggplot(aes(x=date)) +
geom_point(aes(y=phenocam_gcc), color='grey40') +
geom_line(aes(y=modelled_gcc), color='red', size=1) +
facet_wrap(~site, ncol=1)
# R2 plots
phenograss_output %>%
filter(year>=2012) %>%
select(date, phenocam_gcc = gcc, modelled_gcc, year, site=Site) %>%
# gather(gcc_source, gcc_value, phenocam_gcc, modelled_gcc) %>%
ggplot(aes(x=phenocam_gcc, y=modelled_gcc)) +
geom_point() +
geom_abline(intercept = 0, slope=1) +
facet_wrap(~site, ncol=1, scales='free')
right now they're declared 3 times, twice in phenograss.py and again in phenograss_cython.pyx
Would be nice to have a spin-up option to save initial conditions. That way the entire timeseries doesn't have to be run on each model iteration.
On the other hand, the entire timeseries would need to be run with each set of parameters anyway...
Causes very confusing fitting errors if a predictor, e.g. Tm, is passed when Tm isn't used in the model.
Just need to take out the scaling stuff I think.
Need to be able to get all state variables out of a model, like W and Dt.
Hopefully there is some way to do it automatically.
Maybe model.state_variables = ['Dt','W','V'] etc.
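One way this could work, sketched with a hypothetical ToyModel: the class declares its state variable names, the run loop keeps them all in a dictionary, and the caller chooses which to get back. The update equations are placeholders, not the phenograss ones.

```python
class ToyModel:
    # Hypothetical: names of the state variables this model tracks
    state_variables = ['W', 'V']

    def run(self, precip, return_vars='V'):
        n = len(precip)
        # One series per declared state variable
        state = {name: [0.0] * n for name in self.state_variables}
        for i in range(1, n):
            # Placeholder dynamics for illustration only
            state['W'][i] = 0.9 * state['W'][i - 1] + precip[i]
            state['V'][i] = min(1.0, state['V'][i - 1] + 0.01 * state['W'][i])
        if return_vars == 'all':
            return state
        return state[return_vars]


m = ToyModel()
everything = m.run([0.0, 1.0, 2.0], return_vars='all')
only_V = m.run([0.0, 1.0, 2.0])
```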
The following wrapper can be used in the optimizer arguments to fit using whatever dask distributed cluster is set up.
from dask.distributed import Client
from GrasslandModels import models, utils

client = Client()

def dask_scipy_mapper(func, iterable, c=client):
    # Submit every evaluation to the cluster, then gather results in order
    futures = c.map(func, iterable)
    return [f.result() for f in futures]

de_fitting_params = {'maxiter': 5,
                     'popsize': 10,
                     'mutation': (0.5, 1),
                     'recombination': 0.25,
                     'workers': dask_scipy_mapper,
                     'disp': True}

GCC, predictor_vars = utils.load_test_data()
m = models.PhenoGrass()
m.fit(GCC, predictor_vars,
      optimizer_params=de_fitting_params,
      debug=True)
The workers argument in differential evolution (and other optimize functions) can take a map-like callable in the form map(func, iterable) and expects a list of results back (since scipy 1.2.0).
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.differential_evolution.html
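The map-like contract itself can be tried out without a dask cluster. The sketch below uses only the standard library. thread_mapper and sphere are hypothetical names, and whether a thread pool actually speeds up a given model is a separate question.

```python
from concurrent.futures import ThreadPoolExecutor


def thread_mapper(func, iterable, max_workers=4):
    """Map-like callable: apply func to every item and return the results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, iterable))


def sphere(x):
    # Toy objective function: sum of squares
    return sum(xi ** 2 for xi in x)


# A small "population" of candidate parameter vectors
population = [(0.0, 0.0), (1.0, 2.0), (3.0, 4.0)]
energies = thread_mapper(sphere, population)
```

Any callable with this (func, iterable) signature should in principle be usable in place of dask_scipy_mapper.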
Need better validation of the shapes when multiple sites are being used, i.e. if site-level predictors have shape (2,), then the timeseries ones need to have shape (xxx, 2).
Also potentially check that axis 0 of the timeseries predictors is actually the time axis (it generally should be the longest, but not always).
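A sketch of the kind of check described, assuming predictors arrive as two dictionaries of arrays. validate_predictor_shapes and its arguments are hypothetical, not part of the package.

```python
import numpy as np


def validate_predictor_shapes(timeseries, site_level):
    """Check timeseries arrays are (time, *sites) and site-level arrays are (*sites)."""
    site_shapes = {name: np.shape(a) for name, a in site_level.items()}
    if len(set(site_shapes.values())) > 1:
        raise ValueError(f'Site-level predictors disagree on shape: {site_shapes}')
    expected = next(iter(site_shapes.values()), None)

    time_lengths = set()
    for name, a in timeseries.items():
        shape = np.shape(a)
        if len(shape) < 1:
            raise ValueError(f'{name} has no time axis')
        time_lengths.add(shape[0])
        # Everything after axis 0 must match the site-level shape
        if expected is not None and shape[1:] != expected:
            raise ValueError(f'{name} has shape {shape}, expected (time, *{expected})')
    if len(time_lengths) > 1:
        raise ValueError(f'Timeseries predictors disagree on length: {time_lengths}')


good_ts = {'precip': np.zeros((100, 2)), 'Tm': np.zeros((100, 2))}
bad_ts = {'precip': np.zeros((100, 3)), 'Tm': np.zeros((100, 2))}
site = {'Wcap': np.zeros(2), 'Wp': np.zeros(2)}

validate_predictor_shapes(good_ts, site)  # passes silently
```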
Eq. 9 not represented anywhere in phenograss.f90 code
The phenograss.f90 b1 is actually set to, and used for, Wp, while the phenograss.f90 params b2, b3, b4 correspond to the paper b1, b2, b3, respectively.
like whether it completed from hitting max iterations or from finding an optimal solution
Models within choler2010.py and choler2011.py were copy-pasted from the original phenograss model (they're all really similar) and then adjusted. Thus there is a lot of commented-out code that should be removed.
Things like m and Sd, which were present in the example fortran code but not actually used. Also things like d which, unlike in fortran, don't need to be declared before they're assigned.
they're fine as just rmse instead of nan_rmse
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/fft/__init__.py:97
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/fft/__init__.py:97: DeprecationWarning: The module numpy.dual is deprecated. Instead of using dual, use the functions directly from numpy or scipy.
from numpy.dual import register_func
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/sparse/sputils.py:17: 15 tests with warnings
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/sparse/sputils.py:17: DeprecationWarning: `np.typeDict` is a deprecated alias for `np.sctypeDict`.
supported_dtypes = [np.typeDict[x] for x in supported_dtypes]
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/special/orthogonal.py:81
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/special/orthogonal.py:81
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/special/orthogonal.py:81: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
from numpy import (exp, inf, pi, sqrt, floor, sin, cos, around, int,
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py:339: 37 tests with warnings
test/test_core_models.py: 2 tests with warnings
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py:339: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
task_str = task.tostring()
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py:360: 16 tests with warnings
test/test_core_models.py: 1 test with warning
/home/shawn/miniconda3/lib/python3.7/site-packages/scipy/optimize/lbfgsb.py:360: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
task_str = task.tostring().strip(b'\x00').strip()
test/test_core_models.py::test_internal_broadcasting[PhenoGrass-fitted_model0]
test/test_core_models.py::test_phenograss_internal_methods
/home/shawn/projects/GrasslandModels/GrasslandModels/models/phenograss.py:325: RuntimeWarning: invalid value encountered in power
g[:] = ((Tmax - Tm[i]) / (Tmax - Topt)) * (((Tm[i] - Tmin) / (Topt - Tmin)) ** (Topt/(Tmax-Topt)))
test/test_core_models.py::test_internal_broadcasting[PhenoGrassNDVI-fitted_model1]
/home/shawn/projects/GrasslandModels/GrasslandModels/models/phenograss.py:594: RuntimeWarning: invalid value encountered in power
g[:] = ((Tmax - Tm[i]) / (Tmax - Topt)) * (((Tm[i] - Tmin) / (Topt - Tmin)) ** (Topt/(Tmax-Topt)))
-- Docs: https://docs.pytest.org/en/latest/warnings.html
instead of just GCC from the phenograss paper.
make sure validation is working correctly, expect errors when site numbers don't match
some known value testing for parameter estimation
drop this from the base method, i.e. here
It's probably not needed since these are not phenology models and definitely slows things down.
The phenograss model especially needs various constraints on the parameters. This is done to some extent in the fitting by specifying parameter ranges, but sometimes more is needed, e.g. see 7787263.
Potentially add a model method for this
def check_constraints(self):
    # return True only if all constraints hold
    return Phmax > Phmin
    ...
# inside the model fitting
if not self.check_constraints():
    # warn, then return a very large value so the optimizer rejects this parameter set
    raise warning
    V[:] = 1e10
    return V
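A runnable sketch of the same idea. constraints_met, predict_with_constraints and toy_run are hypothetical names, and only the Phmax > Phmin constraint from above is included.

```python
import numpy as np

PENALTY = 1e10  # large value that any error metric will score as a very bad fit


def constraints_met(params):
    """True only if the parameter set is internally consistent."""
    return params['Phmax'] > params['Phmin']


def predict_with_constraints(params, n_timesteps, run_model):
    # Skip the model run entirely for invalid parameter sets and
    # return the penalty so the optimizer moves away from them
    if not constraints_met(params):
        return np.full(n_timesteps, PENALTY)
    return run_model(params, n_timesteps)


def toy_run(params, n_timesteps):
    # Placeholder model run for illustration
    return np.zeros(n_timesteps)


valid = predict_with_constraints({'Phmin': 8.0, 'Phmax': 37.0}, 5, toy_run)
invalid = predict_with_constraints({'Phmin': 40.0, 'Phmax': 37.0}, 5, toy_run)
```

Returning a penalty rather than raising keeps a population-based optimizer such as differential evolution running when it samples an invalid combination.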