
flux-data-qaqc's Introduction


flux-data-qaqc

flux-data-qaqc provides a framework to create reproducible workflows for validation and analysis of eddy covariance data. The package is intended for those who need to post-process flux data, particularly for generating daily and monthly evapotranspiration (ET) timeseries estimates with energy balance closure corrections applied. Applications where this software may be useful include analysis of eddy covariance data, hydrologic or atmospheric model validation, and irrigation and water consumption studies.

Key functionalities and tools include:

  • data validation with methods for quality-based filtering
  • time series tools, e.g. gap-filling and temporal aggregation
  • energy balance closure algorithms and other meteorological calculations
  • data provenance, e.g. from metadata management and file structure
  • downloading and management of gridMET meteorological data
  • customizable and interactive visualizations
  • built-in unit conversions and batch processing tools

Documentation

ReadTheDocs

Installation

Using PIP:

pip install fluxdataqaqc

PIP should install the necessary dependencies; however, it is recommended to use conda and first install the provided virtual environment, which avoids changing your local Python environment. Note that flux-data-qaqc has been tested with Python 3.7+, although it may work with versions as early as 3.4.

First, make sure you have the fluxdataqaqc environment file (you can download it here). Next, to create the environment, run,

conda env create -f environment.yml

To activate the environment before using the flux-data-qaqc package run,

conda activate fluxdataqaqc

Now install using PIP:

pip install fluxdataqaqc

Now all package modules and tools should be available in your Python environment PATH and able to be imported. Note that if you did not install the Conda virtual environment above, PIP should install dependencies automatically, but be sure to use Python 3.4 or later. To test that everything has installed correctly, open a Python interpreter or IDE and run the following:

import fluxdataqaqc

and

from fluxdataqaqc import Data, QaQc, Plot

If everything has been installed correctly you should get no errors.
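
The same check can also be scripted, which is handy when several environments are in play. The following is only a minimal sketch: the helper name `check_install` is hypothetical and not part of flux-data-qaqc, and the minimum version comes from the "tested for Python 3.7+" note above.

```python
import importlib.util
import sys

# Minimum version taken from the tested range stated above (Python 3.7+).
MIN_PYTHON = (3, 7)


def check_install(module_name="fluxdataqaqc", min_python=MIN_PYTHON):
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper for illustration, not part of flux-data-qaqc.
    """
    problems = []
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python {wanted}+ expected, found {found}")
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module_name) is None:
        problems.append(f"{module_name} is not importable from {sys.executable}")
    return problems


if __name__ == "__main__":
    for line in check_install() or ["fluxdataqaqc looks importable"]:
        print(line)
```

Printing `sys.executable` in the message helps confirm that the interpreter being used is the one inside the activated conda environment.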

How to cite

Volk et al., (2021). flux-data-qaqc: A Python Package for Energy Balance Closure and Post-Processing of Eddy Flux Data. Journal of Open Source Software, 6(66), 3418, https://doi.org/10.21105/joss.03418

flux-data-qaqc's People

Contributors

dependabot[bot], dgketchum, inkenbrandt, johnvolk, pdebuyl


flux-data-qaqc's Issues

Low code coverage

If I run

❯ pytest --cov=fluxdataqaqc tests/
==================================== test session starts ====================================
platform linux -- Python 3.8.9, pytest-6.2.5, py-1.10.0, pluggy-1.0.0
rootdir: /home/avmo/src/sandbox/flux-data-qaqc, configfile: pytest.ini
plugins: cov-2.12.1, nbval-0.9.6
collected 11 items                                                                          

tests/test_fluxdataqaqc.py ...........                                                [100%]

----------- coverage: platform linux, python 3.8.9-final-0 -----------
Name                       Stmts   Miss  Cover
----------------------------------------------
fluxdataqaqc/__init__.py       8      0   100%
fluxdataqaqc/data.py         488    114    77%
fluxdataqaqc/plot.py         552    290    47%
fluxdataqaqc/qaqc.py         663    619     7%
fluxdataqaqc/util.py          90     60    33%
----------------------------------------------
TOTAL                       1801   1083    40%


==================================== 11 passed in 21.02s ====================================

I notice that the tests mostly target only the data module. If the tutorials cover the rest then it should be good enough and one could check the coverage as follows.

❯ cd examples/Basic_usage
❯ pytest --cov=fluxdataqaqc --nbval Tutorial.ipynb ../../tests/

However, in order to do that, #9 should be fixed first.

Update date_parse to accommodate deprecation

Lib\site-packages\fluxdataqaqc\data.py:1123: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
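
The warning text itself names two ways out: pass `date_format` (a `read_csv` keyword available in pandas 2.0 and later) or read the column as plain text and call `to_datetime` afterwards. Below is a minimal sketch of the second route, which should also run on older pandas. The CSV content, the column names and the format string are made up for illustration and are not taken from `data.py`.

```python
import io

import pandas as pd

# Hypothetical two-row input; column names and format are illustrative only.
CSV_TEXT = "date,LE\n2019-01-01 00:30,12.5\n2019-01-01 01:00,13.0\n"
DATE_FORMAT = "%Y-%m-%d %H:%M"


def read_with_explicit_dates(text, date_col="date", date_format=DATE_FORMAT):
    """Read CSV text, then parse the date column without 'date_parser'."""
    # Keep the date column as text so pandas does no implicit date parsing.
    df = pd.read_csv(io.StringIO(text), dtype={date_col: str})
    # Parse explicitly with a known format instead of a custom parser function.
    df[date_col] = pd.to_datetime(df[date_col], format=date_format)
    return df.set_index(date_col)


df = read_with_explicit_dates(CSV_TEXT)
print(df.index)
```

Whether `data.py` can simply switch to this depends on how the configured date format is passed around there, which the warning alone does not show.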

Basic usage tutorial errors

>>> d.plot(output_type='notebook', plot_width=700)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
/tmp/ipykernel_793706/3470416942.py in <module>
----> 1 d.plot(output_type='notebook', plot_width=700)

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/fluxdataqaqc/data.py in plot(self, ncols, output_type, out_file, suptitle, plot_width, plot_height, sizing_mode, merge_tools, link_x, **kwargs)
    537 
    538         # create aggregrated plot structure from fluxdataqaqc.Plot._plot()
--> 539         ret = self._plot(
    540             self, ncols=ncols, output_type=output_type, out_file=out_file,
    541             suptitle=suptitle, plot_width=plot_width, plot_height=plot_height,

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/fluxdataqaqc/plot.py in _plot(self, FluxObj, ncols, output_type, out_file, suptitle, plot_width, plot_height, sizing_mode, merge_tools, link_x, **kwargs)
   1172             if fig is not None:
   1173                 daily_line.append(fig)
-> 1174             theta_vars = [
   1175                 v for v in variables if theta_re.match(v) and v in\
   1176                     monthly_df.columns

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/fluxdataqaqc/plot.py in <listcomp>(.0)
   1174             theta_vars = [
   1175                 v for v in variables if theta_re.match(v) and v in\
-> 1176                     monthly_df.columns
   1177             ]
   1178             if fig is not None and monthly and len(theta_vars) > 0:

NameError: free variable 'monthly_df' referenced before assignment in enclosing scope
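
This kind of NameError typically appears when a name is assigned on only one branch and is then read inside a comprehension. The sketch below reproduces the pattern in isolation and shows one possible guard. It is not the actual `plot.py` code: the function names, the regular expression and the assumption that `monthly_df` is only created when monthly plotting is enabled are all hypothetical.

```python
import re

# Hypothetical pattern for soil moisture variable names, for illustration only.
THETA_RE = re.compile(r"^theta_\d+$")


def pick_theta_vars_buggy(variables, monthly, columns):
    """Mimics the failure: monthly_cols exists only when monthly is true."""
    if monthly:
        monthly_cols = set(columns)
    # With monthly false, the name below is read before it was ever assigned.
    return [v for v in variables if THETA_RE.match(v) and v in monthly_cols]


def pick_theta_vars(variables, monthly, columns):
    """One possible guard: always bind the name before the comprehension."""
    monthly_cols = set(columns) if monthly else set()
    return [v for v in variables if THETA_RE.match(v) and v in monthly_cols]


try:
    pick_theta_vars_buggy(["theta_1"], monthly=False, columns=[])
except NameError as err:  # UnboundLocalError is a subclass of NameError
    print("reproduced:", err)

print(pick_theta_vars(["theta_1", "LE"], monthly=True, columns=["theta_1"]))
```

Note that the buggy version only fails when at least one variable matches the pattern, because `and` short-circuits before the unassigned name is read.
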
>>> # creating a QaQc instance will automatically convert to daily
>>> d = Data('US-Tw3_config.ini')
>>> q = QaQc(d)
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_793706/3547759801.py in <module>
      1 # creating a QaQc instance will automatically convert to daily
      2 d = Data('US-Tw3_config.ini')
----> 3 q = QaQc(d)

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/fluxdataqaqc/qaqc.py in __init__(self, data, drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night)
    288 
    289             # data will be loaded if it has not yet via Data.df
--> 290             self.temporal_freq = self._check_daily_freq(
    291                 drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night
    292             )

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/fluxdataqaqc/qaqc.py in _check_daily_freq(self, drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night)
    785             self.n_samples_per_day = 1
    786 
--> 787         self._df = df.rename(self.variables)
    788         return freq
    789 

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    322         @wraps(func)
    323         def wrapper(*args, **kwargs) -> Callable[..., Any]:
--> 324             return func(*args, **kwargs)
    325 
    326         kind = inspect.Parameter.POSITIONAL_OR_KEYWORD

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/frame.py in rename(self, mapper, index, columns, axis, copy, inplace, level, errors)
   5032         4  3  6
   5033         """
-> 5034         return super().rename(
   5035             mapper=mapper,
   5036             index=index,

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/generic.py in rename(self, mapper, index, columns, axis, copy, inplace, level, errors)
   1145             # GH 13473
   1146             if not callable(replacements):
-> 1147                 indexer = ax.get_indexer_for(replacements)
   1148                 if errors == "raise" and len(indexer[indexer == -1]):
   1149                     missing_labels = [

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer_for(self, target, **kwargs)
   5274         """
   5275         if self._index_as_unique:
-> 5276             return self.get_indexer(target, **kwargs)
   5277         indexer, _ = self.get_indexer_non_unique(target)
   5278         return indexer

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
   3435         # returned ndarray is np.intp
   3436         method = missing.clean_reindex_fill_method(method)
-> 3437         target = self._maybe_cast_listlike_indexer(target)
   3438 
   3439         self._check_indexing_method(method, limit, tolerance)

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/indexes/datetimelike.py in _maybe_cast_listlike_indexer(self, keyarr)
    599     def _maybe_cast_listlike_indexer(self, keyarr):
    600         try:
--> 601             res = self._data._validate_listlike(keyarr, allow_object=True)
    602         except (ValueError, TypeError):
    603             res = com.asarray_tuplesafe(keyarr)

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/arrays/datetimelike.py in _validate_listlike(self, value, allow_object)
    701         # Do type inference if necessary up front
    702         # e.g. we passed PeriodIndex.values and got an ndarray of Periods
--> 703         value = pd_array(value)
    704         value = extract_array(value, extract_numpy=True)
    705 

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/construction.py in array(data, dtype, copy)
    344         elif inferred_dtype == "string":
    345             # StringArray/ArrowStringArray depending on pd.options.mode.string_storage
--> 346             return StringDtype().construct_array_type()._from_sequence(data, copy=copy)
    347 
    348         elif inferred_dtype == "integer":

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/core/arrays/string_.py in _from_sequence(cls, scalars, dtype, copy)
    345         else:
    346             # convert non-na-likes to str, and nan-likes to StringDtype.na_value
--> 347             result = lib.ensure_string_array(
    348                 scalars, na_value=StringDtype.na_value, copy=copy
    349             )

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/_libs/lib.pyx in pandas._libs.lib.ensure_string_array()

~/src/sandbox/flux-data-qaqc/venv/lib/python3.8/site-packages/pandas/_libs/lib.pyx in pandas._libs.lib.ensure_string_array()

IndexError: too many indices for array
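
For context on the second traceback: `DataFrame.rename` applies a bare positional mapping to the index (axis 0) by default, which is why the string keys end up being cast against the datetime index. If the intent at `qaqc.py` line 787 was to rename columns, which is only an assumption here, passing the mapping as `columns=` makes that explicit. A minimal sketch with made-up column names:

```python
import pandas as pd

# Made-up frame with a datetime index, similar in shape to daily flux data.
idx = pd.date_range("2019-01-01", periods=2, freq="D")
df = pd.DataFrame({"LE_user": [1.0, 2.0], "H_user": [3.0, 4.0]}, index=idx)

# Hypothetical name mapping; the real self.variables mapping may differ.
name_map = {"LE_user": "LE", "H_user": "H"}

# columns= targets column labels only and leaves the datetime index untouched.
renamed = df.rename(columns=name_map)
print(list(renamed.columns))
```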

Can you help please with this TypeError?


TypeError Traceback (most recent call last)
Cell In[33], line 3
1 # creating a QaQc instance will automatically convert to daily
2 d = Data('US-Tw3_config.ini')
----> 3 q = QaQc(d)

File ~/anaconda3/envs/fluxdataqaqc/lib/python3.11/site-packages/fluxdataqaqc/qaqc.py:290, in QaQc.__init__(self, data, drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night)
287 self.inv_map[user_G_name] = 'G'
289 # data will be loaded if it has not yet via Data.df
--> 290 self.temporal_freq = self._check_daily_freq(
291 drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night
292 )
293 # check units, convert if possible for energy balance, ppt, Rs, vp,
294 self._check_convert_units()

File ~/anaconda3/envs/fluxdataqaqc/lib/python3.11/site-packages/fluxdataqaqc/qaqc.py:671, in QaQc._check_daily_freq(self, drop_gaps, daily_frac, max_interp_hours, max_interp_hours_night)
668 sum_cols = list(set(sum_cols).intersection(df.columns))
669 mean_cols = set(df.columns) - set(sum_cols)
--> 671 means = df.loc[:,mean_cols].apply(
672 pd.to_numeric, errors='coerce').resample('D').mean().copy()
673 # issue with resample sum of nans, need to drop first else 0
674 sums = df.loc[:,sum_cols].dropna().apply(
675 pd.to_numeric, errors='coerce').resample('D').sum()

File ~/anaconda3/envs/fluxdataqaqc/lib/python3.11/site-packages/pandas/core/indexing.py:1091, in _LocationIndexer.__getitem__(self, key)
1089 @Final
1090 def __getitem__(self, key):
-> 1091 check_dict_or_set_indexers(key)
1092 if type(key) is tuple:
1093 key = tuple(list(x) if is_iterator(x) else x for x in key)

File ~/anaconda3/envs/fluxdataqaqc/lib/python3.11/site-packages/pandas/core/indexing.py:2618, in check_dict_or_set_indexers(key)
2610 """
2611 Check if the indexer is or contains a dict or set, which is no longer allowed.
2612 """
2613 if (
2614 isinstance(key, set)
2615 or isinstance(key, tuple)
2616 and any(isinstance(x, set) for x in key)
2617 ):
-> 2618 raise TypeError(
2619 "Passing a set as an indexer is not supported. Use a list instead."
2620 )
2622 if (
2623 isinstance(key, dict)
2624 or isinstance(key, tuple)
2625 and any(isinstance(x, dict) for x in key)
2626 ):
2627 raise TypeError(
2628 "Passing a dict as an indexer is not supported. Use a list instead."
2629 )

TypeError: Passing a set as an indexer is not supported. Use a list instead.
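
The error message points at the fix: `mean_cols` on line 669 is a `set`, and pandas 2.0 and later refuse sets as `.loc` indexers. Converting the set to a list before indexing avoids the error. The sketch below uses a made-up frame and is not a patch against `qaqc.py`; `sorted()` is used here only to get a list with a stable column order.

```python
import pandas as pd

# Made-up data; column names are illustrative only.
df = pd.DataFrame({"LE": [1.0, 2.0], "H": [3.0, 4.0], "ppt": [0.0, 1.2]})

sum_cols = ["ppt"]
mean_cols = set(df.columns) - set(sum_cols)  # still a set, as in the traceback

# sorted() returns a list, which .loc accepts, with a deterministic order.
means = df.loc[:, sorted(mean_cols)].apply(pd.to_numeric, errors="coerce").mean()
print(means)
```

Until the package is updated, installing a pandas version older than 2.0 in the environment may also work around the problem, since the set indexer was only deprecated there rather than removed.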

Installation instructions give error

Using an existing conda env:
git clone https://github.com/Open-ET/flux-data-qaqc.git
cd flux-data-qaqc/
then
conda install --file environment.yml gives CondaValueError: could not parse 'name: fluxdataqaqc' in: environment.yml
And from conda base: conda env create -f environment.yml gives CondaValueError: The target prefix is the base prefix. Aborting.

I might also suggest creating a requirements.txt with pip freeze that spells out the package dependencies, as the .yml does not get that specific.
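
As a rough illustration of what such a pinned list contains, the standard library can enumerate installed distributions in `name==version` form. This is only a sketch that approximates `pip freeze` (it ignores editable and VCS installs), the function name is hypothetical, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata


def pinned_requirements():
    """Return sorted 'name==version' lines for the active environment.

    Hypothetical helper that approximates `pip freeze` output.
    """
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.version
        # Skip broken installs whose metadata cannot be read.
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(pinned_requirements()))
```

Run inside the activated fluxdataqaqc environment, the output could be redirected to a requirements.txt file.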

This is using miniconda3 base, with pip list:

Package                Version            
---------------------- -------------------
certifi                2020.4.5.1         
cffi                   1.14.0             
chardet                3.0.4              
conda                  4.8.3              
conda-package-handling 1.7.0              
cryptography           2.9.2              
idna                   2.9                
pip                    20.0.2             
pycosat                0.6.3              
pycparser              2.20               
pyOpenSSL              19.1.0             
PySocks                1.7.1              
requests               2.23.0             
ruamel-yaml            0.15.87            
setuptools             46.4.0.post20200518
six                    1.14.0             
tqdm                   4.46.0             
urllib3                1.25.8             
wheel                  0.34.2  
