
mattwillflood / entropyhub.jl


An open-source toolkit for entropic data analysis

Home Page: https://mattwillflood.github.io/EntropyHub.jl/stable

License: Apache License 2.0

Julia 100.00%
entropy uncertainty statistical-physics nonlinear complexity time-series signal-processing random research matlab

entropyhub.jl's Introduction

Hey! ☺️

My name is Matt and I'm a biomedical engineer / researcher / scientist.

My research primarily focuses on movement disorders and the application of mathematical methods to better understand them. In particular, I use various physiological sensing modalities to record neuromuscular and kinematic activity, and then apply nonlinear signal processing techniques to identify the factors underlying the dysfunction.


ResearchGate · ORCiD · LinkedIn · My Bio · Google Scholar

entropyhub.jl's People

Contributors

mattwillflood


entropyhub.jl's Issues

Join ComplexityMeasures.jl?

Hi there,

with this issue we would like to do three things:

    1. share with you the paper we just wrote for ComplexityMeasures.jl and ask for your feedback and corrections (we compare against EntropyHub),
    2. invite you to join ComplexityMeasures.jl, and
    3. offer an alternative if 2) isn't really an option for you!

1: Comparison

Our paper was just posted on arXiv: https://arxiv.org/abs/2406.05011. The associated code base that performs the performance comparison is here: https://github.com/Datseris/ComplexityMeasuresPaper. We compared the performance of the Julia version, assuming that this would be the most performant implementation of the software. Please provide us with numbers from the Python or MATLAB versions if you believe these should be faster.

Please also let us know whether you think the overall comparison with EntropyHub in our Table 1 is accurate, or whether you believe it is unfair (and if so, how we should fix it). In particular, it was difficult for us to accurately estimate the total number of measures in EntropyHub, mainly because it was unclear to us what "Multivariate" entropies or "Bidimensional" entropies are (some personal feedback here from George Datseris: it would be nice if the docs explained what these quantities actually are, beyond citing the articles).

Bidimensional

In ComplexityMeasures.jl we have spatial outcome spaces, so that one can estimate the entropy of permutation patterns on spatial data (2D or arbitrarily high-dimensional), given a stencil. This is done in the standard way one estimates the permutation entropy, but instead of sliding over a timeseries, the "stencil" (which for ordinary permutation entropy would be a view of length m) is iterated over the 2D image. We were confused by the statements in the docs that one would "run out of memory". In our implementations of spatial complexity measures practically no memory is allocated, and one can estimate the spatial permutation entropy for arbitrarily large matrices. This makes us suspect that perhaps we are not talking about the same thing?
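To make concrete what we mean by iterating a stencil, here is a minimal, self-contained sketch (our own illustration, not EntropyHub's or ComplexityMeasures.jl's implementation) of a 2D permutation entropy with a 2x2 stencil; note that only the pattern counts are stored, regardless of the matrix size:

```julia
using Random

# Slide a 2x2 stencil over a matrix, map each patch to its ordinal
# pattern, and apply the Shannon formula to the pattern frequencies.
function spatial_permen(A::AbstractMatrix; base = 2)
    counts = Dict{NTuple{4,Int},Int}()
    rows, cols = size(A)
    for i in 1:rows-1, j in 1:cols-1
        patch = [A[i, j], A[i, j+1], A[i+1, j], A[i+1, j+1]]
        pattern = Tuple(sortperm(patch))   # ordinal pattern of the patch
        counts[pattern] = get(counts, pattern, 0) + 1
    end
    N = sum(values(counts))
    return -sum((c / N) * log(base, c / N) for c in values(counts))
end

A = rand(MersenneTwister(1), 200, 200)
h = spatial_permen(A)   # at most 4! = 24 distinct patterns are ever stored
```

The entropy is bounded by log2(4!) ≈ 4.58 bits no matter how large the input matrix is, which is why memory does not grow with the data.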

Multivariate

In ComplexityMeasures.jl we make no distinction between univariate and multivariate timeseries. Everything that can be discretized / cast into symbols, from which probabilities are estimated and given to the Shannon formula, is valid input that we simply call a "timeseries". When estimating the permutation entropy, we cast the input timeseries into a sequence of ordinal patterns.
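For instance, the discretization step alone is exposed on its own (a sketch based on our reading of the ComplexityMeasures.jl v3 API; treat the exact call as an assumption):

```julia
using ComplexityMeasures

x = [4.0, 7.0, 9.0, 10.0, 6.0, 11.0, 3.0]
# Map each length-3 window of x to the integer encoding its ordinal
# pattern; the permutation entropy is then the Shannon entropy of
# the frequencies of these symbols.
symbols = codify(OrdinalPatterns(; m = 3), x)
```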

In EntropyHub, we are not sure what "multivariate permutation entropy" means, exactly, from a computational point of view. We are mainly confused by the documentation statement "the permutation entropy for the M multivariate sequences in Data". A multivariate input has only a single multivariate sequence; that is essentially why it is called "multivariate". Otherwise the input would be many univariate timeseries from which one picks M multivariate sequences. Can you please explain what we have misunderstood and whether, or how, the multivariate permutation entropy differs from the univariate permutation entropy?

2: Join ComplexityMeasures.jl

More than 7 years ago, the DynamicalSystems.jl project started with the goal of making nonlinear dynamics and nonlinear timeseries analysis accessible, in the sense of giving people well-tested, well-documented, universal software that is also easy to contribute to and become a developer of, all while following as open a development approach as possible. ComplexityMeasures.jl shares these goals. In our paper we highlight our approach to open development of code, and we take special care to design the source code so that it is very easy for a newcomer to contribute.

There is clearly a lot of overlap between ComplexityMeasures.jl and EntropyHub: most measures are implemented in both. We believe it is best for the wider community of academics and non-academics interested in complexity and entropy applications that a single, overarching software exists: one that includes all the positive aspects of the current disparate software packages while addressing and eliminating as many negative aspects as possible.

EntropyHub clearly had a lot of thought and development effort go into it, as reflected by the numerous measures it provides, some of which are not in ComplexityMeasures.jl yet. Additionally, once the questions of the previous section are clarified, there may be much more functionality from the literature in EntropyHub that does not yet exist in ComplexityMeasures.jl and that we have simply missed. The developer(s) of EntropyHub have clearly spent their fair share of time in entropic/complexity timeseries analysis and have built significant expertise. Having such a developer in ComplexityMeasures.jl would increase productivity for the community as a whole, and generate even more discussions and suggestions for new measures in the already extensive open list of possible additions to the software (https://github.com/JuliaDynamics/ComplexityMeasures.jl/issues).

Lastly, EntropyHub has experience in providing software in multiple languages, which we have little experience with. Unifying the two would have a big positive impact on users of Python and MATLAB. Note that there is currently a large amount of code duplication: not only within the EntropyHub.jl repo itself, but, in essence, the code is duplicated thrice when the Python and MATLAB versions are considered too. In the unified future we envision, the Python version would be a wrapper of the Julia code via PythonCall.jl, and similarly for MATLAB. This would let Python and MATLAB users harness the highly performant Julia code while drastically reducing maintenance effort across three separate libraries (this would be especially impactful for code reliability, as only one code base would need to be tested).

ComplexityMeasures.jl, on the other hand, has the main advantage of its fundamentally novel design of orthogonalizing how a complexity measure is created/computed, which is the main point of our paper, Section 2. It took us about 2 years of intense research to converge on this design. Please do read Section 2, as we believe this design has some truly unique advantages over the traditional approach of one function per estimator.
Some other advantages of ComplexityMeasures.jl are the large performance improvements over EntropyHub, smaller source code (per measure), larger documentation, an extensive test suite, and software development experience (stemming from 7+ years of building JuliaDynamics). For example, not having a plotting dependency makes compilation of the package much faster, and optional plotting can now be managed with package extensions (c.f. #5).
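To illustrate the orthogonal design concretely, here is a sketch of how outcome spaces and information measures combine independently (based on our reading of the ComplexityMeasures.jl v3 API; the exact constructors are assumptions on our part):

```julia
using ComplexityMeasures

x = randn(10_000)
ospace = OrdinalPatterns(; m = 3, τ = 1)   # how to discretize the input

# The same outcome space combines with any information measure ...
h_shannon = information(Shannon(; base = 2), ospace, x)
h_renyi   = information(Renyi(; q = 2.0), ospace, x)

# ... and the same measure combines with any other outcome space.
h_disp = information(Shannon(; base = 2), Dispersion(), x)
```

Every (measure, outcome space) pair is a valid estimator, so N measures and M outcome spaces yield N×M combinations without writing N×M functions.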

Porting all of these advantages over to EntropyHub would take much more effort, and more redesigning, than porting EntropyHub's advantages to ComplexityMeasures.jl, which already has implemented an extendable design.

We firmly believe that unifying our efforts, instead of both of us "reinventing the wheel" by implementing features that already exist in the other software, is the best outcome for the community. That is why we hope you will consider joining ComplexityMeasures.jl rather than continuing to develop EntropyHub separately.

3: The alternative

Being realistic, we understand that the above is unlikely to happen, because EntropyHub already has a publication associated with it. That is why we propose a middle-ground solution that stops the wheel-reinvention problem and lets us join forces, while still allowing you to keep and promote EntropyHub.

Make EntropyHub.jl a wrapper of ComplexityMeasures.jl. All of its source code would be wiped out, and replaced by wrapper functions that have the same name as they currently do, but they call the corresponding ComplexityMeasures.jl implementation. For example, the 200 lines of source code of the permutation entropy would be replaced by

function PermEn(Sig::AbstractArray{T,1} where T<:Real; m::Int=2, tau::Int=1,
        Typex::String="none", tpx::Union{Real,Nothing}=nothing, Logx::Real=2, Norm::Bool=false)

    # Choose the outcome space based on the string argument
    if Typex == "none"
        ospace = OrdinalPatterns(; m, τ = tau)
    else
        # more here, utilizing tpx
        error("Typex = \"$Typex\" is not wrapped in this sketch")
    end
    est = Shannon(; base = Logx)
    if Norm
        return information_normalized(est, ospace, Sig)
    else
        return information(est, ospace, Sig)
    end
end

Additionally, in this way, the actual estimation of the permutation entropy would be tested against our existing test suite.

When you want to add a new method, you implement it normally in ComplexityMeasures.jl following our Developer's Documentation, and then add a wrapper function in EntropyHub. We follow agile development practices: even the tiniest addition to ComplexityMeasures.jl instantly generates a new package version, so any change in ComplexityMeasures.jl would immediately be reflected in EntropyHub.

But despite this possible solution, we still believe that having a single unified software is the best way forward.


We hope you consider our proposal, and we stress again that we want to make the comparison in our paper as fair as possible; if we missed anything, or misrepresented EntropyHub in any way, please do let us know.

best,
George and Kristian (cc @kahaaga)

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Extract entropy functions to a lighter-weight package

There are a lot of dependencies here, like Plots.jl, that are more front-end than the core entropy calculations.

It would be helpful to extract an EntropyMetrics.jl package that this one depends on, so that users could compute entropies without taking on the plotting dependencies.
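An alternative to a full package split would be to make Plots.jl a weak dependency via package extensions (Julia ≥ 1.9). A hypothetical Project.toml fragment (the extension name is made up; the UUID is Plots.jl's registered one):

```toml
[weakdeps]
Plots = "91a5bcdd-55d7-5caf-9e0b-520d859cae80"

[extensions]
# The extension module is loaded only when the user also loads Plots.jl
EntropyHubPlotsExt = "Plots"
```

With this, `using EntropyHub` alone would never pull in the plotting stack.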

Compatibility Issues

Hello,

Is this package still maintained? I'm having these compatibility issues:

Unsatisfiable requirements detected for package FourierAnalysis [e7e9c730]:
 FourierAnalysis [e7e9c730] log:
 ├─possible versions are: 1.0.0-1.2.1 or uninstalled
 ├─restricted to versions * by an explicit requirement, leaving only versions 1.0.0-1.2.1
 ├─restricted by compatibility requirements with DSP [717857b8] to versions: 1.0.0-1.1.2 or uninstalled, leaving only versions: 1.0.0-1.1.2
 │ └─DSP [717857b8] log:
 │   ├─possible versions are: 0.5.1-0.7.5 or uninstalled
 │   ├─restricted to versions * by an explicit requirement, leaving only versions 0.5.1-0.7.5
 │   └─restricted by compatibility requirements with EntropyHub [3938faea] to versions: 0.6.8-0.6.10
 │     └─EntropyHub [3938faea] log:
 │       ├─possible versions are: 0.1.0-0.2.0 or uninstalled
 │       └─restricted to versions * by an explicit requirement, leaving only versions 0.1.0-0.2.0
 └─restricted by compatibility requirements with RecipesBase [3cdcf5f2] to versions: 1.2.0-1.2.1 or uninstalled — no versions left
   └─RecipesBase [3cdcf5f2] log:
     ├─possible versions are: 0.4.0-1.2.1 or uninstalled
     ├─restricted by compatibility requirements with Plots [91a5bcdd] to versions: 0.4.0-1.2.1
     │ └─Plots [91a5bcdd] log:
     │   ├─possible versions are: 0.12.1-1.29.0 or uninstalled
     │   ├─restricted to versions * by an explicit requirement, leaving only versions 0.12.1-1.29.0
     │   ├─restricted by compatibility requirements with EntropyHub [3938faea] to versions: 1.10.3-1.29.0
     │   │ └─EntropyHub [3938faea] log: see above
     │   └─restricted by compatibility requirements with RecipesBase [3cdcf5f2] to versions: 1.0.0-1.23.6 or uninstalled, leaving only versions: 1.10.3-1.23.6
     │     └─RecipesBase [3cdcf5f2] log: see above
     ├─restricted by compatibility requirements with FourierAnalysis [e7e9c730] to versions: [0.7.0, 1.1.2]
     │ └─FourierAnalysis [e7e9c730] log: see above
     └─restricted by compatibility requirements with Plots [91a5bcdd] to versions: 1.0.0-1.2.1, leaving only versions: 1.1.2
       └─Plots [91a5bcdd] log: see above

I tried installing it on Julia v1.7 using Pkg.add("EntropyHub") and also ]add EntropyHub.

Best regards,
Yasir

Entropy for multiple arrays

Hello, thanks for sharing your fantastic work!
I would like to add a measure of parameter dissimilarity to my loss function.
Let's say I have 100 3D arrays that are parameters in a neural network, and I would like to maximize the amount of information and minimize the similarity between them.
As I want to add it to a loss function, the entropy measure needs to be differentiable.
What can I use?
