A framework for gene expression programming (an evolutionary algorithm) in Python

Home Page: https://geppy.readthedocs.io/en/latest/

License: GNU Lesser General Public License v3.0

Python 100.00%
gene-expression-programming genetic-programming evolutionary-algorithm gep symbolic-regression system-identification evolutionary-computation

geppy's Introduction

geppy: a gene expression programming framework in Python

geppy is a computational framework dedicated to Gene Expression Programming (GEP), which was proposed by C. Ferreira in 2001 [1]. geppy is developed in Python 3.

What is GEP?

Gene Expression Programming (GEP) is a popular and established evolutionary algorithm for automatic generation of computer programs and mathematical models. It has found wide applications in symbolic regression, classification, automatic model design, combinatorial optimization and real parameter optimization problems [2].

GEP can be seen as a variant of traditional genetic programming (GP). It uses simple linear chromosomes of fixed length to encode the genetic information. Although each chromosome (and each of its genes) has a fixed length, it can produce expression trees of various sizes thanks to its genotype-phenotype expression system. Many experiments show that GEP is more efficient than GP, and the trees evolved by GEP tend to be smaller than those evolved by GP.
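To make the genotype-phenotype mapping concrete, here is a minimal sketch of how a fixed-length gene can be read level by level (the K-expression reading order) into an expression tree. It is not geppy's implementation: `FUNCTIONS`, `decode` and `evaluate` are hypothetical names, and the two example genes are made up for illustration.

```python
import operator

# Hypothetical function set: symbol -> (arity, implementation)
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
}


def decode(gene):
    """Read a gene level by level and return (tree, number of symbols used).

    A tree node is a list [symbol, child, child, ...].
    """
    root = [gene[0]]
    level = [root]  # nodes of the current level, left to right
    used = 1        # how many leading symbols of the gene are expressed
    while level:
        next_level = []
        for node in level:
            arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
            for _ in range(arity):
                child = [gene[used]]
                used += 1
                node.append(child)
                next_level.append(child)
        level = next_level
    return root, used


def evaluate(node, env):
    """Evaluate a decoded tree with terminal values taken from env."""
    symbol, children = node[0], node[1:]
    if symbol in FUNCTIONS:
        return FUNCTIONS[symbol][1](*(evaluate(c, env) for c in children))
    return env[symbol]


# Both genes have head length 4 and tail length 5 (9 symbols in total).
gene_a = list("+*aabbbbb")  # expresses a*b + a, using 5 symbols
gene_b = list("+*+abbbbb")  # expresses a*b + (b + b), using 7 symbols
```

The two genes have the same length but decode to trees of different sizes; the unused trailing symbols are simply ignored, which is why a point mutation in the head can grow or shrink the tree without changing the chromosome length.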

geppy and DEAP

geppy is built on top of the excellent evolutionary computation framework DEAP for rapid prototyping and testing of ideas with GEP. DEAP provides fundamental support for GP but lacks support for GEP. geppy tries its best to follow the style of DEAP and to maintain compatibility with DEAP's major infrastructure. In other words, geppy may to some degree be considered a DEAP plugin that adds support for GEP. If you are familiar with DEAP, geppy is easy to grasp. Comprehensive documentation is also available.

Features

  • Compatibility with the DEAP infrastructure and easy accessibility to DEAP's functionality including:
    • Multi-objective optimisation
    • Straightforward parallelization of fitness evaluations for speedup
    • Hall of Fame of the best individuals that lived in the population
    • Checkpoints that take snapshots of a system regularly
    • Statistics and logging
  • Core data structures in GEP, including the gene, chromosome, expression tree, and K-expression.
  • Implementation of common mutation, transposition, inversion and crossover operators in GEP.
  • Boilerplate algorithms, including the standard GEP algorithm and advanced algorithms integrating a local optimizer for numerical constant optimization.
  • Support for numerical constant inference with a third Dc domain in genes: the GEP-RNC algorithm.
  • Flexible built-in algorithm interface, which can support an arbitrary number of custom mutation and crossover-like operators.
  • Visualization of the expression tree.
  • Symbolic simplification of a gene, a chromosome, or a K-expression in postprocessing.
  • Examples of different applications using GEP with detailed comments in Jupyter notebook.

Installation

From PyPI (recommended)

pip install geppy

From source

geppy can also be installed from source.

  1. Download or clone this repository:
git clone https://github.com/ShuhuaGao/geppy
  2. Change into the root directory (the one containing the setup.py file) and install geppy with pip:
cd geppy
pip install .

Documentation

Check the geppy documentation for GEP theory, a comprehensive introduction to geppy's API and typical usage, and tutorials with examples.

Examples

A getting started example is presented in the Jupyter notebook Boolean model identification, which infers a Boolean function from given input-output data with GEP. More examples are listed in the following.

Simple symbolic regression

  1. Boolean model identification (Getting started with no constants involved)
  2. Simple mathematical expression inference (Constants finding with ephemeral random constants (ERC))
  3. Simple mathematical expression inference with the GEP-RNC algorithm (Demonstrating the GEP-RNC algorithm for numerical constant evolution)

Advanced symbolic regression

  1. Improving symbolic regression with linear scaling (Use the linear scaling technique to evolve models with continuous real constants more efficiently)

  2. Use the GEP-RNC algorithm with linear scaling on the UCI Power Plant dataset (See how to apply GEP-based symbolic regression to a real machine learning dataset)

Requirements

  • Python 3.6 or later
  • DEAP, which should be installed automatically if you haven't got it when installing geppy.
  • [optional] To visualize the expression tree using the geppy.export_expression_tree method, you need the graphviz module.
  • [optional] Since GEP/GP doesn't simplify the expressions during evolution, its final result may contain many redundancies, and the tree can be very large, like x + 5 * (2 * x - x - x) - 1, which is simply x - 1. You may like to simplify the final model evolved by GEP with symbolic computation to get better understanding of this model. The corresponding geppy.simplify method depends on the sympy package.

Common pitfalls in using GP

Always keep in mind that evolution is random, so any value may be passed into a function. If you encounter issues like "overflow", "nan" ("not a number"), or unreasonably large values, the most likely reason is that you did not protect a potentially dangerous function. For example, if the sqrt function is in the function set, then evaluating an individual evolved by geppy (or GP in general) may well produce a negative input such as sqrt(-1.24).

Refer to issues #28 #26 #4 for more details.
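As an illustration of what "protecting" a function can look like, here is a minimal sketch in plain Python. The function names, the fallback values (0.0 and 1.0) and the `limit` threshold are assumptions chosen for this example, not part of geppy; other conventions are equally valid as long as every input yields a finite real number.

```python
import math


def protected_sqrt(x):
    """Square root of |x|, so negative inputs do not raise."""
    return math.sqrt(abs(x))


def protected_log(x, eps=1e-6):
    """Natural log of |x|; returns 0.0 when |x| is too close to zero."""
    return math.log(abs(x)) if abs(x) > eps else 0.0


def protected_pow(base, exponent, limit=1e6):
    """Power that falls back to 1.0 on overflow, complex or huge results."""
    try:
        result = float(base) ** exponent
        # A negative base with a fractional exponent gives a complex number.
        if isinstance(result, complex) or not math.isfinite(result) or abs(result) > limit:
            return 1.0
    except (OverflowError, ZeroDivisionError):
        return 1.0
    return result
```

Such functions would be added to the primitive set in place of the raw ones, for example pset.add_function(protected_sqrt, 1). Note that the issues quoted further down report that gep.simplify asks for an entry in symbolic_function_map for custom function names such as these.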

Reference

The bible of GEP is definitely Ferreira, C.'s monograph: Ferreira, C. (2006). Gene expression programming: mathematical modeling by an artificial intelligence (Vol. 21). Springer.

You can also get a lot of papers/documents by Googling 'gene expression programming'.

[1] Ferreira, C. (2001). Gene Expression Programming: a New Adaptive Algorithm for Solving Problems. Complex Systems, 13.

[2] Zhong, J., Feng, L., & Ong, Y. S. (2017). Gene expression programming: a survey. IEEE Computational Intelligence Magazine, 12(3), 54-72.

How to cite geppy

If you find geppy useful in your projects, please cite it so that more researchers and engineers will learn about it. A BibTeX entry for geppy is given below.

@misc{geppy_2020,
    author       = {Shuhua Gao},
    title        = {{geppy: a Python framework for gene expression programming }},
    month        = July,
    year         = 2020,
    doi          = {10.5281/zenodo.3946297},
    version      = {0.1},
    publisher    = {Zenodo},
    url          = {https://github.com/ShuhuaGao/geppy}
    }

Alternatively, if you want a more academic citation, you may cite our relevant paper

@ARTICLE{learn_async,
  author={S. {Gao} and C. {Sun} and C. {Xiang} and K. {Qin} and T. H. {Lee}},
  journal={IEEE Transactions on Cybernetics}, 
  title={Learning Asynchronous Boolean Networks From Single-Cell Data Using Multiobjective Cooperative Genetic Programming}, 
  year={2020},
  volume={},
  number={},
  pages={1-15},
  doi={10.1109/TCYB.2020.3022430}}

geppy's People

Contributors

bytesumoltd, minkymorgan, rabbytr, shuhuagao, xiaomocandy

geppy's Issues

question about "minimize"

import geppy as gep
from deap import creator, base, tools

creator.create("FitnessMin", base.Fitness, weights=(-1,)) # to minimize the objective (fitness)
creator.create("Individual", gep.Chromosome, fitness=creator.FitnessMin)


Hi, pardon me. The geppy examples state "Our objective is to minimize the MSE (mean squared error) for data fitting" and use the code above.

"FitnessMin" is used with weights=(-1,). Isn't this maximization? Why does it still minimize the objective (fitness)?

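A small sketch may help with the sign convention. DEAP's Fitness multiplies each fitness value by its weight and then prefers the larger weighted value, so a weight of -1 turns "largest weighted value" into "smallest raw value". The code below imitates that idea in plain Python; `weighted` and the two model names are hypothetical and only illustrate the arithmetic.

```python
def weighted(values, weights):
    """Multiply each fitness value by its weight, as a tuple."""
    return tuple(v * w for v, w in zip(values, weights))


weights = (-1,)  # same weights as FitnessMin above

# Hypothetical MSE values of two candidate models
mse_of = {"model_a": 0.25, "model_b": 4.0}

# Selection always maximizes the weighted value ...
best = max(mse_of, key=lambda name: weighted((mse_of[name],), weights))
# ... which, with weight -1, picks the model with the smallest MSE.
```

Here the weighted values are (-0.25,) and (-4.0,), so the maximum belongs to model_a, the one with the lower error.
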
Linking operators for 3 or more genes in a chromosome

Hi Shuhua,

Thank you for creating this great package!

Do you have any example with the number of genes more than 2 in a chromosome?
I tried to increase the number of genes to 3 in example #2 (Simple mathematical expression inference), however I got "TypeError: add expected 2 arguments, got 3".
I'm new to Python, not sure if I understand the code correctly, but it seems that the "add" operator does not work as a linker for more than 2 genes since it only accepts two parameters.

Regards,
Shaun
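The error message suggests that the linker is called with one positional argument per gene, which a strictly binary function such as operator.add cannot accept once there are three genes. A minimal sketch of linkers that accept any number of gene outputs follows; `add_linker` and `mul_linker` are hypothetical names, and whether gep.simplify handles such custom linkers without an extra entry in symbolic_function_map is not confirmed here.

```python
import operator
from functools import reduce


def add_linker(*gene_outputs):
    """Sum the outputs of any number of genes."""
    return sum(gene_outputs)


def mul_linker(*gene_outputs):
    """Multiply the outputs of any number of genes."""
    return reduce(operator.mul, gene_outputs, 1)
```

Either function could then be passed wherever the example passes operator.add as the linker.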

Modifying the number of genes in GEP code produces error

I am using this code to develop an expression for my research, but I have run into one issue: I cannot change n_genes to any value other than 2.

I have not modified anything in the code except my data and the head size and generation parameters.

Please have a look: how can the number of genes be increased, and what modifications to the code are required to make it work?

I tried 3, but it throws the following error:


TypeError Traceback (most recent call last)
Cell In[325], line 9
6 hof = tools.HallOfFame(3) # only record the best three individuals ever found in all generations
8 # start evolution
----> 9 pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
10 stats=stats, hall_of_fame=hof, verbose=True)

File c:\users\sarmed wahab.desktop-ul8783a\appdata\local\programs\python\python39\lib\site-packages\geppy\algorithms\basic.py:100, in gep_simple(population, toolbox, n_generations, n_elites, stats, hall_of_fame, verbose)
98 invalid_individuals = [ind for ind in population if not ind.fitness.valid]
99 fitnesses = toolbox.map(toolbox.evaluate, invalid_individuals)
--> 100 for ind, fit in zip(invalid_individuals, fitnesses):
101 ind.fitness.values = fit
103 # record statistics and log

Cell In[321], line 5, in evaluate_linear_scaling(individual)
2 """Evaluate the fitness of an individual with linearly scaled MSE.
3 Get a and b by minimizing (a*Yp + b - Y)"""
4 func = toolbox.compile(individual)
----> 5 Yp = np.array(list(map(func, bc, fc, ef, tf, bf, lf)))
7 # special cases: (1) individual has only a terminal
8 # (2) individual returns the same value for all test cases, like 'x - x + 10'. np.linalg.lstsq will fail in such cases.
10 if isinstance(Yp, np.ndarray):

File c:\users\sarmed wahab.desktop-ul8783a\appdata\local\programs\python\python39\lib\site-packages\geppy\tools\parser.py:50, in compile_.<locals>.<lambda>(*x)
48 else:
49 return lambda *x: tuple((f(*x) for f in fs))
---> 50 return lambda *x: linker(*(f(*x) for f in fs))

TypeError: add expected 2 arguments, got 3

Regarding Reproducibility

Hi,
I am running the UCI_power_plant symbolic regression problem. I see you use a seed to reproduce the results. But when I run the case with the same parameters used in the example, each run results in different expressions. Is the seed for reproducing the train/test data or for the actual results?
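A general point about seeding may be useful here. Python's random module and NumPy's random generator are independent, so seeding one does not affect the other, and a seed only fixes the sequence from the moment it is set. The sketch below assumes, without having verified it against the geppy source, that the evolutionary operators draw from Python's random module; `toy_run` is a hypothetical stand-in for an evolution run.

```python
import random


def toy_run(seed):
    """Stand-in for an evolution run: reseed first, then draw numbers."""
    random.seed(seed)  # must happen before the first random draw of the run
    return [random.random() for _ in range(5)]


first = toy_run(10)
second = toy_run(10)  # reseeded with the same value: identical sequence
other = toy_run(11)   # different seed: different sequence
```

In a notebook, rerunning only the evolution cell without rerunning the cell that sets the seed continues the random stream where it left off, which is one common way to get different expressions from "the same" settings.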

Any settings or solutions for 'De-duplication' in the Hall of Fame?

First, thank you for developing this great work. I have one problem in my project. Assume that I set the size of the hof to 100 to get the best 100 individuals ever found; however, many of them are duplicates, and there are only 10 unique ones in the hof. Are there any settings or solutions for de-duplication? Looking forward to your reply.
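One possible post-processing approach is to filter the hall of fame by the printed form of each individual. The sketch below is plain Python and works on any sequence of objects with a meaningful str(); `unique_by_expression` is a hypothetical helper, and the strings stand in for individuals. DEAP's HallOfFame also accepts a `similar` comparison argument, which may be worth checking in the DEAP documentation.

```python
def unique_by_expression(individuals, limit):
    """Keep the first occurrence of each distinct printed expression."""
    seen = set()
    unique = []
    for ind in individuals:
        key = str(ind)
        if key not in seen:
            seen.add(key)
            unique.append(ind)
            if len(unique) == limit:
                break
    return unique


# Strings stand in for individuals, best first
hof_like = ["add(x, 1)", "add(x, 1)", "mul(x, x)", "add(x, 1)", "sub(x, 2)"]
top_unique = unique_by_expression(hof_like, limit=2)
```

This only detects textual duplicates; two individuals that print differently but simplify to the same formula would both be kept.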

After adding the log function, visualization fails

Hello, thank you for sharing this code. I added the log function to the code, but the visualization no longer works as it should. Is there any way to fix it?

Our symbolic regression process found the following equation offers our best prediction:

     -0.0504234833139171 + 0.24544201652149*(d3 + log(d6))*log(-(11*d1 + 2)*(d2 - d6))/d4

which formally is presented as:

Traceback (most recent call last):
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 94, in wrapper
retval = cfunc(*args, **kwargs)
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 1665, in __hash__
raise TypeError(
TypeError: 'Series' objects are mutable, thus they cannot be hashed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 94, in wrapper
retval = cfunc(*args, **kwargs)
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 1665, in __hash__
raise TypeError(
TypeError: 'Series' objects are mutable, thus they cannot be hashed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "GEP-data01.py", line 272, in <module>
predPE = CalculateBestModelOutput(holdout.d1, holdout.d2, holdout.d3, holdout.d4, holdout.d5, holdout.d6, str(symplified_best))
File "GEP-data01.py", line 261, in CalculateBestModelOutput
return eval(model)
File "<string>", line 1, in <module>
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 96, in wrapper
retval = func(*args, **kwargs)
File "C:\Python38\lib\site-packages\sympy\core\function.py", line 465, in __new__
result = super().__new__(cls, *args, **options)
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 96, in wrapper
retval = func(*args, **kwargs)
File "C:\Python38\lib\site-packages\sympy\core\function.py", line 280, in __new__
evaluated = cls.eval(*args)
File "C:\Python38\lib\site-packages\sympy\functions\elementary\exponential.py", line 622, in eval
if arg.is_Number:
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 5136, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'is_Number'

NaN, infinity or a value too large

Hi Shuhua,
When running the example GEP_RNC_for_ML_with_UCI_Power_Plant_dataset, I added a symbolic function x^y:
pset.add_function(operator.pow, 2)
It threw the following error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Please give me some advice.
Thank you

Input contains NaN, infinity or a value too large for dtype('float64')

Hello.

I have the following error when running the evolution: "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')." (I upload the whole text in "error.txt")

The input data do not have any NaN value, and the format is 'float64'. The problem may be in 'y_pred', because when I delete the functions pow, sqrt and log, everything works fine (using any of them, the problem arises):

pset = gep.PrimitiveSet('Main', input_names=inputs[:-1])
pset.add_function(operator.add, 2)
pset.add_function(operator.sub, 2)
pset.add_function(operator.mul, 2)
pset.add_function(protected_div, 2)
pset.add_function(operator.pow, 2)
pset.add_function(np.sqrt, 1)
pset.add_function(np.log, 1)
pset.add_rnc_terminal()

I also upload the implementation file in the .zip

Thank you very much in advance

error_and_py.zip
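Besides protecting pow, sqrt and log themselves, the evaluation function can refuse to pass non-finite predictions on to later steps. The sketch below is a plain-Python illustration, not code from the geppy examples: `safe_mse` and `PENALTY` are hypothetical, and the penalty value is an arbitrary "very bad" fitness for a minimization problem.

```python
import math

PENALTY = 1e12  # hypothetical "very bad" fitness for a minimization problem


def safe_mse(model, xs, ys, penalty=PENALTY):
    """Mean squared error that returns a penalty instead of nan/inf/errors."""
    total = 0.0
    for x, y in zip(xs, ys):
        try:
            pred = model(x)
        except (ValueError, OverflowError, ZeroDivisionError):
            return (penalty,)  # e.g. math domain error from sqrt or log
        if isinstance(pred, complex) or not math.isfinite(pred):
            return (penalty,)
        total += (pred - y) * (pred - y)
    mse = total / len(xs)
    # DEAP-style fitness values are tuples, hence the trailing comma.
    return (mse,) if math.isfinite(mse) else (penalty,)
```

An individual that produces an invalid prediction then simply receives a poor fitness and is unlikely to survive selection, instead of aborting the whole run.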

division function: protected_div

Hi, I am writing a thesis with GEP. Thanks for your project. I found a problem when using the examples.

def protected_div(x1, x2):
    if abs(x2) < 1e-6:
        return 1
    return x1 / x2

This protected division is defined to avoid dividing by zero. But how can it be used when predicting the target value with the function found by GEP? I get many 'inf' values in my predictions, caused by the real division performed when using 'eval' to calculate. I am new to GEP. Do you know how the GEP algorithm handles division?

Save an individual, or how to create an individual knowing its expression

Hello,

I am using geppy to test it on flow control, in the context of symbolic regression. To avoid restarting evolutions, I want to be able to create an individual from a known expression (for instance x+y*z-2).

I launch an evolution, at the end I get my best individual thanks to the hall of fame, then I print it and I get for instance this output :

mul(
mul(protected_div(add(protected_div(sub(y, z), x), y), sub(protected_div(y, -0.9790698363740189), add(z, x))), x),
mul(protected_div(add(protected_div(sub(y, z), y), x), sub(protected_div(y, -0.9790698363740189), add(z, z))), x)
)

I want to make some post processing with matplotlib, and I need to use this best individual without restarting the whole evolution. How can I use the print output to create directly this individual and use it in my postprocessing script ?

Thanks and regards,
Rémi MOCHON
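If only the behaviour of the best individual is needed for post-processing (not a geppy Chromosome object), one option is to save the printed expression as text and turn it back into a callable with the same primitive functions that were used during evolution, including protected_div. The sketch below makes that assumption explicit: `PRIMITIVES`, `model_from_text` and the shortened example expression are hypothetical, and eval should only ever be used on text you produced yourself.

```python
import operator


def protected_div(x1, x2):
    if abs(x2) < 1e-6:
        return 1
    return x1 / x2


# Names as they appear in the printed individual -> implementations
PRIMITIVES = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "protected_div": protected_div,
}


def model_from_text(expression, input_names):
    """Build a callable from a printed expression such as 'add(x, y)'."""
    code = compile(expression.strip(), "<saved individual>", "eval")

    def model(*inputs):
        env = dict(zip(input_names, inputs))
        # Builtins are disabled; only the primitives and inputs are visible.
        return eval(code, {"__builtins__": {}, **PRIMITIVES}, env)

    return model


saved = "mul(protected_div(sub(y, z), x), add(z, x))"
model = model_from_text(saved, ["x", "y", "z"])
```

Because the callable uses protected_div rather than the plain "/" operator, predictions stay consistent with what was evaluated during evolution, including inputs where the divisor is close to zero. Serializing the individual itself (for example with pickle, as DEAP's checkpointing documentation does for populations) may be an alternative when the original object is required.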

how to implement multigenes with different head length?

Hi there,
I'm trying to use GEP to evolve trading rules (EDT-RNC) to detect buy/sell signals, but I have trouble setting the parameters.
I have four genes in one chromosome, with different head lengths and RNC arrays. The example uses a fixed head length and RNC array.
How should I rewrite the code to support such multigenic chromosomes?
Words are not enough to express my gratitude.
[Screenshot 2019-04-02, 3:32 PM]

question: limit function parameter to const integers

For example, function shift(a, N) takes two parameters. The value for 'N' must be in the range of 1 to 5. The value comes from a random integer number directly, not from output of any other functions.

Is it possible to specify this kind of constraints?

gep_simplify can't be used in evaluation

When I tried the example "numerical_expression_inference-RNC", I attempted to modify the evaluation function. I called gep_simplify in the evaluation function:

def evaluate(individual):
    """Evalute the fitness of an individual: MAE (mean absolute error)"""
    func = toolbox.compile(individual)
    Yp = np.array(list(map(func, X)))
    individual_simplified = gep.simplify(individual)
    x_occrrence = str(individual_simplified).count("x")
    
    return np.mean(np.abs(Y - Yp)), x_occrrence

However when I started the evolution I always got error:

 ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-de916872db16> in <module>
      8 # start evolution
      9 pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
---> 10                           stats=stats, hall_of_fame=hof, verbose=True)

~/Projects/geppy/geppy/algorithms/basic.py in gep_simple(population, toolbox, n_generations, n_elites, stats, hall_of_fame, verbose)
     98         invalid_individuals = [ind for ind in population if not ind.fitness.valid]
     99         fitnesses = toolbox.map(toolbox.evaluate, invalid_individuals)
--> 100         for ind, fit in zip(invalid_individuals, fitnesses):
    101             ind.fitness.values = fit
    102 

<ipython-input-9-7da034cb5577> in evaluate(individual)
      3     func = toolbox.compile(individual)
      4     Yp = np.array(list(map(func, X)))
----> 5     individual_simplified = gep.simplify(individual)
      6     x_occrrence = str(individual_simplified).count("x")
      7 

~/Projects/geppy/geppy/support/simplification.py in simplify(genome, symbolic_function_map)
    129             except:
    130                 linker = genome.linker
--> 131             return sp.simplify(linker(*simplified_exprs))
    132     else:
    133         raise TypeError('Only an argument of type KExpression, Gene, and Chromosome is acceptable. The provided '

TypeError: unsupported operand type(s) for +: 'Add' and 'str'

I tested the gep_simplify called at the post-processing part and it worked well there. I tried but could not find the problem. Thank you very much if you can help.

import problem

Hi. I installed geppy following the instructions in readme, however if I cd to another project folder I can't import geppy. The error is

ModuleNotFoundError: No module named 'geppy.core'

Currently after installing geppy it can only be imported from its folder?

installation not working with python 3.7.4

Hi, I wanted to use GEP, so I tried to install your package, but I got errors and was unable to install it:

  • Tried to install it using pip, got error:
ERROR: Could not find a version that satisfies the requirement geppy (from versions: none)
ERROR: No matching distribution found for geppy
  • Tried to install it from Source using:
cd geppy-master
pip install .

This seems to work, but when I leave the folder where geppy is located and use

import geppy as gep

I get the error:

>>> import geppy as gep
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tyker1\AppData\Local\Programs\Python\Python37\lib\site-packages\geppy-0.1.0a0-py3.7.egg\geppy\__init__.py", line 37, in <module>
ModuleNotFoundError: No module named 'geppy.core'

It works only in the directory where the source files are located.

Please provide the symbolic function mapping for 'tan' in symbolic_function_map.

Hi, Shuhua,
Thanks for providing the gene expression programming.
When using the provided example (Use the GEP-RNC algorithm with the UCI Power Plant dataset), I tried to use some functions such as sin, cos, and tan to increase the precision of developed formula.
However, some errors were found.
pset.add_function(math.sin, 1):
Please provide the symbolic function mapping for 'sin' in symbolic_function_map.
pset.add_function(math.tan, 1):
Please provide the symbolic function mapping for 'tan' in symbolic_function_map.
pset.add_function(math.cos, 1):
Please provide the symbolic function mapping for 'cos' in symbolic_function_map.
pset.add_function(math.log, 1):
ValueError Traceback (most recent call last)
ValueError: math domain error
pset.add_function(sp.log, 1):
TypeError Traceback (most recent call last)
TypeError: No loop matching the specified signature and casting was found for ufunc lstsq_n
Could you please give me some suggestions?
Thanks again.

SVD did not converge in Linear Least Squares

Hi Shuhua,

I have modified the example: 'numerical_expression_inference-Linear_scaling.ipynb' by modifying the input function: f(x)=x**1.5. Then adding pset.add_function(operator.pow, 2). However, a problem occurred as numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares.

Please give me some advice on how to overcome this issue.

Thanks,
Hoan Nguyen

Can you explain the nevals output of the gep_simple method?

I don't understand the meaning of nevals.

Also, can you explain how mating happens?

Assume I have a population (n=50), which produces 50 expressions.

How do they mate? The number of possible pairs among 50 expressions is 50*49/2.

I don't think mating is performed that many times, so how are parents selected and offspring generated?

Thanks. I am working on GP, and I think geppy is very friendly.
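On the nevals part of the question: the tracebacks quoted in other issues show that gep_simple collects the individuals whose fitness is not valid and evaluates only those. Assuming nevals reports the size of that list (an assumption, not confirmed by the source shown here), the idea can be sketched in plain Python with a hypothetical `ToyIndividual` class:

```python
class ToyIndividual:
    """Stand-in for an individual; fitness None means 'not evaluated yet'."""

    def __init__(self, fitness=None):
        self.fitness = fitness

    @property
    def fitness_valid(self):
        return self.fitness is not None


def evaluate_invalid(population, evaluate):
    """Evaluate only individuals without a valid fitness; return how many."""
    invalid = [ind for ind in population if not ind.fitness_valid]
    for ind in invalid:
        ind.fitness = evaluate(ind)
    return len(invalid)


# Three unchanged individuals keep their fitness; two were just modified.
population = [ToyIndividual(1.0), ToyIndividual(2.0), ToyIndividual(0.5),
              ToyIndividual(), ToyIndividual()]
nevals = evaluate_invalid(population, lambda ind: 3.0)
```

Under this reading, nevals is smaller than the population size whenever some individuals pass through a generation without being changed by any operator.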

How to generate all the expressions for a given primitive set?

I think that once the primitive set is fixed, the space of expression trees is also fixed (the number of expression trees for a given depth).

How can I generate all of them?

I think genFull and genHalf are random generators. How can I generate every expression without missing any?

Thanks
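Exhaustive enumeration is possible for small depths, independently of geppy, by building every tree of depth d from all combinations of trees of depth d-1. The sketch below does this with plain strings in prefix notation; `all_expressions` and the tiny primitive set are hypothetical. The count grows extremely fast with depth and arity, so this is only practical for very small settings.

```python
from itertools import product


def all_expressions(functions, terminals, max_depth):
    """Return every prefix expression whose tree depth is <= max_depth.

    functions maps a function name to its arity; depth 0 is a lone terminal.
    """
    if max_depth == 0:
        return list(terminals)
    smaller = all_expressions(functions, terminals, max_depth - 1)
    result = list(terminals)
    for name, arity in functions.items():
        # Every combination of smaller subtrees as arguments
        for args in product(smaller, repeat=arity):
            result.append(f"{name}({', '.join(args)})")
    return result


exprs = all_expressions({"add": 2, "neg": 1}, ["x", "1"], max_depth=2)
```

With two terminals, one binary and one unary function, there are 2 expressions of depth 0, 8 of depth at most 1 and 74 of depth at most 2. Note that this enumerates trees (phenotypes); in GEP, many different fixed-length genes can decode to the same tree.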

how to perform multi dimensioned GEP

Hello
I’d like to use geppy to calculate a formula where the output is a tensor and the inputs are several tensors and scalars, is there any tutorial to show how to do that?

Yours Sincerely

cannot add pow or exp function

Hi,
I cannot add a power or exponential function to my program.
If I add
pset.add_function(operator.pow, 2), the following is returned. Could you please help me in this regard?

:1: RuntimeWarning: overflow encountered in double_scalars
Traceback (most recent call last):
File "G:/My Drive/Afifa Tamanna/PhD/AI/GEP/Diffn_Analysis_GEP_model/Analysis/testviscosity.py", line 180, in
pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
File "C:\Program Files (x86)\Python38-32\lib\site-packages\geppy\algorithms\basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "<array_function internals>", line 5, in lstsq
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 2259, in lstsq
x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 109, in _raise_linalgerror_lstsq
raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

Issue with linker function

Hello,

I am using your code for a project involving GEP.
When using the linker function "operator.mul", you can only have two genes in your optimisation.
If I try a different number (for example 4 genes), I get the error: "op_mul expected 2 arguments, got 4".
Am I doing something wrong? Shouldn't I be able to use any number of genes with a given kind of linker?

I can solve this by defining a new linker function that takes four input variables, but this causes problems during simplification.

Thank you very much for this code; it is very helpful.

Questions about using numba to speed up evaluation functions

I see that you are using JIT of the numba library to speed up the evaluation function in your example, but when I used it, I found that numba does not seem to recognize those custom classes, so an error will be reported directly when using JIT. Is there any good solution to this situation?

GPU acceleration

Is it possible to accelerate geppy evolution/evaluation process using GPU? Any tips what I could do to achieve that?

How to set n_genes = 3

Hello!

I want to set n_genes = 3, but when I did, it threw the following error:

image

image

How can I add functions such as SQRT, EXP, X^2, POW, INV, LN in my code of Gene Expression Programming

Hi, I cannot add the functions mentioned above. If I add a function such as
pset.add_function(operator.pow,2)

then the following is returned:
:1: RuntimeWarning: overflow encountered in double_scalars
Traceback (most recent call last):
File "G:/My Drive/Afifa Tamanna/PhD/AI/GEP/Diffn_Analysis_GEP_model/Analysis/testviscosity.py", line 198, in
pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
File "C:\Program Files (x86)\Python38-32\lib\site-packages\geppy\algorithms\basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "<array_function internals>", line 5, in lstsq
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 2259, in lstsq
x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 109, in _raise_linalgerror_lstsq
raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

Please assist me in solving the issue. Many Thanks.

How to add more symbolic functions in pset.add_function()

Hi Shuhua,
I am trying to add more symbolic functions but have trouble with several of them.
I am following your example 'Use the GEP-RNC algorithm with linear scaling on the UCI Power Plant dataset'.
1. For the 'sin()' function
I have set
pset.add_function(math.sin,1)
best_ind=hof[0]
symplified_best=gep.simplify(best_ind)
But after running, when I simplify the best individual, geppy gives me this:
Please provide the symbolic function mapping for 'sin' in symbolic_function_map.
I don't know how to map it.
2. For the 'sqrt()' function
I have set
def protected_sqrt(x3):
    if x3<1e-6:
        return 1
    return math.sqrt(x3)
and
pset.add_function(protected_sqrt,1)
But geppy gives me this:
Please provide the symbolic function mapping for 'protected_sqrt' in symbolic_function_map.
I don't know how to do that either.
Thanks

Deep learning, aka tensor based input and functions

Hey there,

Really great work, been looking for a library like this in Python for a while.

Have you thought or considered integrating your package to use tensor input and operations etc.?

So instead of resolving to a function you may resolve to some deep learning network configuration. I am wondering if this is something you have considered. My interest would be to upgrade your package to support tensor operations etc.

Anyway, fantastic work.

Cheers,

Micheal

multi

Hello Again @ShuhuaGao @bolz213

I am trying geppy to fit N 3X3 tensor Bij (it is actually a symmetrical
tensor, so I wrote it like a vector of size 6)with two scalar list I1,I2 of size N and two 3X3 tensor list V1,V2 with fitness defined as tensorDot(Bij, PBij)/(tensordot(Bij,Bji)*tensordot(PBij, PBji)). but I get errors in my evaluation function like this:
Traceback (most recent call last):
File "testGEPPY_DNS.py", line 109, in
stats=stats, hall_of_fame=hof, verbose=True)
File "/Users/weizhang/software/backup/geppy-master/geppy/algorithms/basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "testGEPPY_DNS.py", line 71, in evaluate
Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
IndexError: too many indices for array

here is my evaluation function:
`def evaluate(individual):
    """Evalute the fitness of an individual: MSE (mean squared error)"""
    func = toolbox.compile(individual)
    Yp = np.array(list(map(func,T1,T2,T3))) # predictions with the GEP model

    #print (np.shape(Yp),Yp)
    a=0
    b=0
    c=0
    for i in range(size):
        Ri=np.array([[bij[i,1],bij[i,2],bij[i,3]],[bij[i,2],bij[i,4],bij[i,5]],[bij[i,3],bij[i,5],bij[i,6]]])
        Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,6]]])

        print (Ri,np.shape(Ri))

        a=a+np.tensordot(Rp_i,Ri)
        b=a+np.tensordot(Ri,Ri.T)
        c=c+np.tensordot(Rp_i,Rp_i.T)

    return a/(b*c),`

I tried to print the PB; it seems to give me an array of shape (N,1) rather than the expected (N,3,3).
Can you please give me some suggestions?
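Two details in the failing line are worth noting independently of geppy: Python and NumPy indices start at 0, so a vector of six components has indices 0 to 5 rather than 1 to 6, and `Yp[i,:6]` is a slice rather than a single element. A minimal sketch of rebuilding a symmetric 3x3 matrix from its six independent components, using plain lists and the same component layout as in the code above, could look like this (`symmetric_from_six` is a hypothetical helper):

```python
def symmetric_from_six(v):
    """Build a symmetric 3x3 matrix from (xx, xy, xz, yy, yz, zz)."""
    if len(v) != 6:
        raise ValueError("expected exactly six components")
    xx, xy, xz, yy, yz, zz = v  # positions 0..5 of the vector
    return [
        [xx, xy, xz],
        [xy, yy, yz],
        [xz, yz, zz],
    ]


matrix = symmetric_from_six([1, 2, 3, 4, 5, 6])
```

This only helps once each prediction really is a vector of six values; if the compiled model returns one scalar per sample, the (N,1) shape reported above is what one would expect.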

fitting for multi dimension vector.

Hello Again Gao & Joach

I am trying geppy to fit N 3X3 tensor Bij with two scalar list I1,I2 of size N and two 3X3 tensor list V1,V2 with fitness defined as tensorDot(Bij, PBij)/(tensordot(Bij,Bji)*tensordot(PBij, PBji)). but I get errors in my evaluation function like this:
Traceback (most recent call last):
File "testGEPPY_DNS.py", line 109, in
stats=stats, hall_of_fame=hof, verbose=True)
File "/Users/weizhang/software/backup/geppy-master/geppy/algorithms/basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "testGEPPY_DNS.py", line 71, in evaluate
Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
IndexError: too many indices for array

here is my evaluation function:
`def evaluate(individual):
    """Evalute the fitness of an individual: MSE (mean squared error)"""
    func = toolbox.compile(individual)
    Yp = np.array(list(map(func,T1,T2,T3))) # predictions with the GEP model

    #print (np.shape(Yp),Yp)
    a=0
    b=0
    c=0
    for i in range(size):
        Ri=np.array([[bij[i,1],bij[i,2],bij[i,3]],[bij[i,2],bij[i,4],bij[i,5]],[bij[i,3],bij[i,5],bij[i,:6]]])
        Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
        # print (Ri,np.shape(Ri))
        a=a+np.tensordot(Rp_i,Ri)
        b=a+np.tensordot(Ri,Ri.T)
        c=c+np.tensordot(Rp_i,Rp_i.T)

    return a/(b*c),
`

I tried to print the PB; it seems to give me an array of shape (N,1) rather than the expected (N,3,3).
Can you please give me some suggestions?

cannot install geppy

Hi, I cannot install geppy, either via pip or from source (cd geppy). I'd like to ask for a solution. Thanks!

[Screenshot 2019-03-31, 7:28 PM]
