shuhuagao / geppy

A framework for gene expression programming (an evolutionary algorithm) in Python

Home Page: https://geppy.readthedocs.io/en/latest/

License: GNU Lesser General Public License v3.0

Python 100.00%

gene-expression-programming genetic-programming evolutionary-algorithm gep symbolic-regression system-identification evolutionary-computation

geppy's Introduction

geppy: a gene expression programming framework in Python

geppy is a computational framework dedicated to Gene Expression Programming (GEP), which was proposed by C. Ferreira in 2001 [1]. geppy is developed in Python 3.

What is GEP?

Gene Expression Programming (GEP) is a popular and established evolutionary algorithm for automatic generation of computer programs and mathematical models. It has found wide applications in symbolic regression, classification, automatic model design, combinatorial optimization and real parameter optimization problems [2].

GEP can be seen as a variant of traditional genetic programming (GP). It uses simple linear chromosomes of fixed length to encode the genetic information. Though a chromosome (and each of its genes) is of fixed length, it can produce expression trees of various sizes thanks to its genotype-phenotype expression system. Many experiments show that GEP is more efficient than GP, and the trees evolved by GEP tend to be smaller than those evolved by GP.
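To make the genotype-phenotype mapping concrete, the following minimal sketch decodes a fixed-length gene into an expression tree by reading its symbols breadth-first, which is how a K-expression is interpreted in GEP. It is not geppy's implementation: the `FUNCTIONS` table and the helper names `tail_length`, `decode`, and `evaluate` are hypothetical and exist only for this illustration.

```python
from collections import deque
import operator

# Hypothetical function set for this sketch: symbol -> (callable, arity).
FUNCTIONS = {"+": (operator.add, 2), "-": (operator.sub, 2), "*": (operator.mul, 2)}


def tail_length(head_length, max_arity):
    """Tail length that lets every head decode to a complete tree."""
    return head_length * (max_arity - 1) + 1


def decode(gene):
    """Decode a gene breadth-first into nested [symbol, children] nodes.

    Only the leading symbols are consumed; the rest of the gene is non-coding.
    """
    symbols = iter(gene)
    root = [next(symbols), []]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        arity = FUNCTIONS[node[0]][1] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, env):
    """Evaluate a decoded tree, looking terminals up in env."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func, _ = FUNCTIONS[symbol]
        return func(*(evaluate(child, env) for child in children))
    return env[symbol]


env = {"a": 2, "b": 3}
# Two genes of the same length (head 3, tail 4) give trees of different sizes.
print(evaluate(decode("+*ababb"), env))  # tree for (b * a) + a
print(decode("a+*babb"))                 # a single terminal node
```

Both genes have seven symbols, but the first uses five of them and the second only one, which is why fixed-length chromosomes can still express trees of various sizes.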

geppy and DEAP

geppy is built on top of the excellent evolutionary computation framework DEAP for rapid prototyping and testing of ideas with GEP. DEAP provides fundamental support for GP but lacks support for GEP. geppy tries its best to follow the style of DEAP and to maintain compatibility with DEAP's major infrastructure. In other words, geppy may to some degree be considered a DEAP plugin that specifically supports GEP. If you are familiar with DEAP, geppy is easy to grasp. Comprehensive documentation is also available.

Features

- Compatibility with the DEAP infrastructure and easy access to DEAP's functionality, including:
    - Multi-objective optimisation
    - Straightforward parallelization of fitness evaluations for speedup
    - Hall of Fame of the best individuals that lived in the population
    - Checkpoints that take snapshots of a system regularly
    - Statistics and logging
- Core data structures in GEP, including the gene, chromosome, expression tree, and K-expression.
- Implementation of common mutation, transposition, inversion and crossover operators in GEP.
- Boilerplate algorithms, including the standard GEP algorithm and advanced algorithms integrating a local optimizer for numerical constant optimization.
- Support for numerical constant inference with a third Dc domain in genes: the GEP-RNC algorithm.
- Flexible built-in algorithm interface, which can support an arbitrary number of custom mutation and crossover-like operators.
- Visualization of the expression tree.
- Symbolic simplification of a gene, a chromosome, or a K-expression in postprocessing.
- Examples of different applications using GEP with detailed comments in Jupyter notebooks.

Installation

From PyPI (recommended)

pip install geppy

From source

You can also install geppy from source.

First download or clone this repository

git clone https://github.com/ShuhuaGao/geppy

Change into the root directory, i.e., the one containing the setup.py file and install geppy using pip

cd geppy
pip install .

Documentation

Check the geppy documentation for GEP theory and tutorials, as well as a comprehensive introduction to geppy's API and typical usage with examples.

Examples

A getting started example is presented in the Jupyter notebook Boolean model identification, which infers a Boolean function from given input-output data with GEP. More examples are listed below.

Simple symbolic regression

Boolean model identification (Getting started with no constants involved)
Simple mathematical expression inference (Constants finding with ephemeral random constants (ERC))
Simple mathematical expression inference with the GEP-RNC algorithm (Demonstrating the GEP-RNC algorithm for numerical constant evolution)

Advanced symbolic regression

Improving symbolic regression with linear scaling (Use the linear scaling technique to evolve models with continuous real constants more efficiently)
Use the GEP-RNC algorithm with linear scaling on the UCI Power Plant dataset (See how to apply GEP-based symbolic regression on a real machine learning dataset)

Requirements

Python 3.6 or later
DEAP, which should be installed automatically if you haven't got it when installing geppy.
[optional] To visualize the expression tree using the geppy.export_expression_tree method, you need the graphviz module.
[optional] Since GEP/GP doesn't simplify the expressions during evolution, the final result may contain many redundancies and the tree can be very large, like x + 5 * (2 * x - x - x) - 1, which is simply x - 1. You may want to simplify the final model evolved by GEP with symbolic computation to understand it better. The corresponding geppy.simplify method depends on the sympy package.

Common pitfalls in using GP

Always keep in mind that evolution is random, so any value may be passed into a function. If you encounter issues like "overflow", "nan" ("not a number"), or unreasonably huge values, the most likely reason is that you did not protect a possibly dangerous function. For example, if the sqrt function is in the function set, then when evaluating an individual evolved by geppy (or GP in general), a negative input such as sqrt(-1.24) is likely to occur.

Refer to issues #28 #26 #4 for more details.
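As an illustration of what "protecting" a function can look like, here is a minimal sketch using only the standard library. The names `protected_sqrt`, `protected_log`, and `protected_pow`, as well as the fallback values they return, are assumptions made for this example; other conventions are equally valid.

```python
import math


def protected_sqrt(x):
    """Square root of |x|, so negative inputs never raise."""
    return math.sqrt(abs(x))


def protected_log(x):
    """Natural log of |x|; returns 0.0 when x is (almost) zero."""
    if abs(x) < 1e-6:
        return 0.0
    return math.log(abs(x))


def protected_pow(base, exponent):
    """Power that falls back to 1.0 on overflow, division by zero,
    complex results, or non-finite results."""
    try:
        result = float(base) ** float(exponent)
    except (OverflowError, ZeroDivisionError):
        return 1.0
    if isinstance(result, complex) or not math.isfinite(result):
        return 1.0
    return result


print(protected_sqrt(-1.24))      # no ValueError for a negative input
print(protected_log(0.0))         # no math domain error at zero
print(protected_pow(-8, 0.5))     # complex result replaced by the fallback
print(protected_pow(10.0, 1000))  # overflow replaced by the fallback
```

Such functions would then be registered in place of the raw ones, for example `pset.add_function(protected_sqrt, 1)`.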

Reference

The bible of GEP is definitely Ferreira, C.'s monograph: Ferreira, C. (2006). Gene expression programming: mathematical modeling by an artificial intelligence (Vol. 21). Springer.

You can also get a lot of papers/documents by Googling 'gene expression programming'.

[1] Ferreira, C. (2001). Gene Expression Programming: a New Adaptive Algorithm for Solving Problems. Complex Systems, 13.
[2] Zhong, J., Feng, L., & Ong, Y. S. (2017). Gene expression programming: a survey. IEEE Computational Intelligence Magazine, 12(3), 54-72.

How to cite geppy

If you find geppy useful in your projects, please cite it so that more researchers and engineers will learn about it. A BibTeX entry for geppy is given below.

@misc{geppy_2020,
    author       = {Shuhua Gao},
    title        = {{geppy: a Python framework for gene expression programming }},
    month        = July,
    year         = 2020,
    doi          = {10.5281/zenodo.3946297},
    version      = {0.1},
    publisher    = {Zenodo},
    url          = {https://github.com/ShuhuaGao/geppy}
    }

Alternatively, if you want a more academic citation, you may cite our relevant paper

@ARTICLE{learn_async,
  author={S. {Gao} and C. {Sun} and C. {Xiang} and K. {Qin} and T. H. {Lee}},
  journal={IEEE Transactions on Cybernetics}, 
  title={Learning Asynchronous Boolean Networks From Single-Cell Data Using Multiobjective Cooperative Genetic Programming}, 
  year={2020},
  volume={},
  number={},
  pages={1-15},
  doi={10.1109/TCYB.2020.3022430}}


geppy's Issues

question about "minimize"

import geppy as gep
from deap import creator, base, tools

creator.create("FitnessMin", base.Fitness, weights=(-1,)) # to minimize the objective (fitness)
creator.create("Individual", gep.Chromosome, fitness=creator.FitnessMin)

Hi, pardon me. The geppy examples say "Our objective is to minimize the MSE (mean squared error) for data fitting" and use the code above.

"FitnessMin" is used with weights=(-1,). As I understand it, this is then a maximization. So why is it still described as minimizing the objective (fitness)?

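A small sketch may help with this question. As far as DEAP's documented behaviour goes, a fitness stores its values multiplied by the weights, and the individual with the larger weighted value is considered better, so a weight of -1 turns "maximize the weighted value" into "minimize the raw value". The helper `weighted` below is hypothetical and only mimics that multiplication; it is not DEAP code.

```python
def weighted(values, weights):
    """Multiply each objective value by its weight (a toy stand-in for DEAP's weighted values)."""
    return tuple(v * w for v, w in zip(values, weights))


weights = (-1,)                # same weights as FitnessMin above
mse_a, mse_b = (0.2,), (0.7,)  # raw MSE of two imaginary individuals

# Selection prefers the larger weighted value: -0.2 > -0.7, so the smaller MSE wins.
best = max([mse_a, mse_b], key=lambda values: weighted(values, weights))
print(best)
```

In other words, the maximization happens on the weighted value, while the raw objective (the MSE) is minimized.
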
Linking operators for 3 or more genes in a chromosome

Hi Shuhua,

Thank you for creating this great package!

Do you have any example with more than 2 genes in a chromosome?
I tried to increase the number of genes to 3 in example #2 (Simple mathematical expression inference), but I got "TypeError: add expected 2 arguments, got 3".
I'm new to Python and not sure if I understand the code correctly, but it seems that the "add" operator does not work as a linker for more than 2 genes since it only accepts two parameters.

Regards,
Shaun
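One plausible reading of this error is that the linker is called with one argument per gene, while `operator.add` accepts exactly two. A minimal sketch of linkers that accept any number of gene outputs follows; `sum_linker` and `product_linker` are hypothetical names, not part of geppy.

```python
import operator


def sum_linker(*gene_outputs):
    """Add the outputs of any number of genes."""
    return sum(gene_outputs)


def product_linker(*gene_outputs):
    """Multiply the outputs of any number of genes."""
    result = 1
    for value in gene_outputs:
        result *= value
    return result


try:
    operator.add(1, 2, 3)  # three genes, but add is strictly binary
except TypeError as error:
    print("operator.add failed:", error)

print(sum_linker(1, 2, 3))      # works for three genes
print(product_linker(2, 3, 4))  # works for three genes
```

Assuming the linker is passed as a plain callable when the individual is created (as with `operator.add` in the examples), such a function could be supplied in its place. Whether symbolic simplification handles a custom linker is not confirmed here.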

Modifying the number of genes in GEP code produces error

I am using this code to develop an expression in my research, but I have run into one issue: I can't change n_genes to a value other than 2.

I have not modified anything in the code except my data and the parameters for head size and generations.

Please have a look at it. How can we increase the number of genes in this code, and what modifications are required to make it work?

I have tried using 3, but it throws the following error:

TypeError Traceback (most recent call last)
Cell In[325], line 9
6 hof = tools.HallOfFame(3) # only record the best three individuals ever found in all generations
8 # start evolution
----> 9 pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
10 stats=stats, hall_of_fame=hof, verbose=True)

File c:\users\sarmed wahab.desktop-ul8783a\appdata\local\programs\python\python39\lib\site-packages\geppy\algorithms\basic.py:100, in gep_simple(population, toolbox, n_generations, n_elites, stats, hall_of_fame, verbose)
98 invalid_individuals = [ind for ind in population if not ind.fitness.valid]
99 fitnesses = toolbox.map(toolbox.evaluate, invalid_individuals)
--> 100 for ind, fit in zip(invalid_individuals, fitnesses):
101 ind.fitness.values = fit
103 # record statistics and log

Cell In[321], line 5, in evaluate_linear_scaling(individual)
2 """Evaluate the fitness of an individual with linearly scaled MSE.
3 Get a and b by minimizing (a*Yp + b - Y)"""
4 func = toolbox.compile(individual)
----> 5 Yp = np.array(list(map(func, bc, fc, ef, tf, bf, lf)))
7 # special cases: (1) individual has only a terminal
8 # (2) individual returns the same value for all test cases, like 'x - x + 10'. np.linalg.lstsq will fail in such cases.
10 if isinstance(Yp, np.ndarray):

File c:\users\sarmed wahab.desktop-ul8783a\appdata\local\programs\python\python39\lib\site-packages\geppy\tools\parser.py:50, in compile_..(*x)
48 else:
49 return lambda *x: tuple((f(*x) for f in fs))
---> 50 return lambda x: linker((f(*x) for f in fs))

TypeError: add expected 2 arguments, got 3

Regarding Reproducibility

Hi,
I am running the UCI_power_plant symbolic regression problem. I see you use a seed to reproduce the results. But when I run the case with the same parameters used in the example, each run results in different expressions. Is the seed for reproducing the train/test data or for the actual results?

Do you support Automatically Defined Functions?

How to declare some operations illegal so that they are not searched?

Hello, I have designed some operators, for example

f(x, n):
....

The second parameter must be a constant integer. How can I define a rule so that the algorithm does not waste time searching invalid combinations?

Any settings or solutions for 'De-duplication' in the Hall of Fame?

First of all, thank you for developing this great work. I have a problem in my project. Suppose I set the size of the hof to 100 to get the best 100 individuals ever found. However, many of them are duplicates: there are only 10 unique ones in the hof. Are there any settings or solutions for de-duplication? Looking forward to your reply.
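One simple post-processing approach, sketched here under the assumption that two individuals count as duplicates when their printed form is identical, is to filter the Hall of Fame after evolution. The helper `unique_individuals` is hypothetical and works on any sequence; plain strings stand in for individuals below.

```python
def unique_individuals(individuals, key=str):
    """Keep the first occurrence of each individual, comparing by key(individual)."""
    seen = set()
    unique = []
    for individual in individuals:
        fingerprint = key(individual)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(individual)
    return unique


# Strings stand in for individuals; order of first appearance is preserved.
hall_of_fame = ["add(x, 1)", "add(x, 1)", "mul(x, 2)", "add(x, 1)"]
print(unique_individuals(hall_of_fame))
```

Different genotypes can still encode the same expression, so a stricter key (for instance the simplified expression) may be needed. DEAP's HallOfFame also takes a `similar` argument for deciding when two individuals count as equal; whether that alone removes the duplicates seen here is not confirmed.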

After adding the log function, problems occurred in visualization

Hello, thank you for sharing this code. I have added the log function to the function set, but visualization no longer works as it should. Is there any way to fix it?

Our symbolic regression process found the following equation offers our best prediction:

     -0.0504234833139171 + 0.24544201652149*(d3 + log(d6))*log(-(11*d1 + 2)*(d2 - d6))/d4

which formally is presented as:

Traceback (most recent call last):
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 94, in wrapper
retval = cfunc(*args, **kwargs)
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 1665, in __hash__
raise TypeError(
TypeError: 'Series' objects are mutable, thus they cannot be hashed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "GEP-data01.py", line 272, in <module>
predPE = CalculateBestModelOutput(holdout.d1, holdout.d2, holdout.d3, holdout.d4, holdout.d5, holdout.d6, str(symplified_best))
File "GEP-data01.py", line 261, in CalculateBestModelOutput
return eval(model)
File "<string>", line 1, in <module>
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 96, in wrapper
retval = func(*args, **kwargs)
File "C:\Python38\lib\site-packages\sympy\core\function.py", line 465, in __new__
result = super().__new__(cls, *args, **options)
File "C:\Python38\lib\site-packages\sympy\core\cache.py", line 96, in wrapper
retval = func(*args, **kwargs)
File "C:\Python38\lib\site-packages\sympy\core\function.py", line 280, in __new__
evaluated = cls.eval(*args)
File "C:\Python38\lib\site-packages\sympy\functions\elementary\exponential.py", line 622, in eval
if arg.is_Number:
File "C:\Python38\lib\site-packages\pandas\core\generic.py", line 5136, in __getattr__
return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'is_Number'

NaN, infinity or a value too large

Hi Shuhua,
When running the example GEP_RNC_for_ML_with_UCI_Power_Plant_dataset, I added a symbolic function x^y:
pset.add_function(operator.pow, 2)
It threw the following error:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Please give me some advice.
Thank you

Input contains NaN, infinity or a value too large for dtype('float64')

Hello.

I have the following error when running the evolution: "ValueError: Input contains NaN, infinity or a value too large for dtype('float64')." (I upload the whole text in "error.txt")

The input data do not have any NaN value, and the format is 'float64'. The problem may be in 'y_pred', because when I delete the functions pow, sqrt and log, everything works fine (using any of them, the problem arises):

pset = gep.PrimitiveSet('Main', input_names=inputs[:-1])
pset.add_function(operator.add, 2)
pset.add_function(operator.sub, 2)
pset.add_function(operator.mul, 2)
pset.add_function(protected_div, 2)
pset.add_function(operator.pow, 2)
pset.add_function(np.sqrt, 1)
pset.add_function(np.log, 1)
pset.add_rnc_terminal()

I also upload the implementation file in the .zip

Thank you very much in advance

error_and_py.zip

Modifying the number of genes in GEP code produces error

I am using this code to develop an expression in my research, but I have run into one issue: I can't change n_genes to a value other than 2.

I have not modified anything in the code except my data and the parameters for head size and generations.

Please have a look at it. How can we increase the number of genes in this code, and what modifications are required to make it work?

I have tried using 3, but it throws the following error:

File c:\programs\python\python39\lib\site-packages\geppy\algorithms\basic.py:100, in gep_simple(population, toolbox, n_generations, n_elites, stats, hall_of_fame, verbose)
98 invalid_individuals = [ind for ind in population if not ind.fitness.valid]
99 fitnesses = toolbox.map(toolbox.evaluate, invalid_individuals)
--> 100 for ind, fit in zip(invalid_individuals, fitnesses):
101 ind.fitness.values = fit
103 # record statistics and log

File c:\programs\python\python39\lib\site-packages\geppy\tools\parser.py:50, in compile_..(*x)
48 else:
49 return lambda *x: tuple((f(*x) for f in fs))
---> 50 return lambda x: linker((f(*x) for f in fs))

TypeError: add expected 2 arguments, got 3

division function: protected_div

Hi, I am writing a thesis with GEP. Thanks for your project. I found a problem when using the examples.

def protected_div(x1, x2):
    if abs(x2) < 1e-6:
        return 1
    return x1 / x2

This protected division is defined to avoid dividing by zero. But how can it be used when predicting the target value with the function obtained by GEP? I get many 'inf' values in my predictions, caused by the real division that is used when evaluating the expression with 'eval'. I am new to GEP. Do you know how the GEP algorithm handles division?
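One way to keep training and prediction consistent, sketched here as an assumption rather than as geppy's recommended workflow, is to evaluate the unsimplified expression string with the same `protected_div` available in the evaluation namespace. The helper `predict` and the example expression are hypothetical.

```python
def protected_div(x1, x2):
    """Division that returns 1 when the denominator is (almost) zero."""
    if abs(x2) < 1e-6:
        return 1
    return x1 / x2


def predict(expression, **inputs):
    """Evaluate an expression string with protected_div and the inputs in scope.

    Only use this on expression strings you produced yourself: eval executes code.
    """
    namespace = {"__builtins__": {}, "protected_div": protected_div}
    namespace.update(inputs)
    return eval(expression, namespace)


# Same guarded behaviour as during evolution, even for a zero denominator.
print(predict("protected_div(x, y)", x=6.0, y=3.0))
print(predict("protected_div(x, y)", x=1.0, y=0.0))
```

If the expression has been simplified so that `protected_div` was replaced by a plain `/`, the guard is gone and zero denominators have to be handled separately.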

Save an individual, or how to create an individual knowing its expression

Hello,

I am testing geppy on flow control, in the context of symbolic regression. To avoid restarting evolutions, I want to be able to create an individual from a known expression (for instance x+y*z-2).

I launch an evolution, at the end I get my best individual thanks to the hall of fame, then I print it and I get for instance this output :

mul(
mul(protected_div(add(protected_div(sub(y, z), x), y), sub(protected_div(y, -0.9790698363740189), add(z, x))), x),
mul(protected_div(add(protected_div(sub(y, z), y), x), sub(protected_div(y, -0.9790698363740189), add(z, z))), x)
)

I want to do some post-processing with matplotlib, and I need to use this best individual without restarting the whole evolution. How can I use the printed output to directly create this individual and use it in my post-processing script?

Thanks and regards,
Rémi MOCHON

How to implement multiple genes with different head lengths?

Hi there,
I'm trying to use GEP to evolve trading rules (EDT-RNC) to detect buy/sell signals, but I have trouble setting the parameters.
I have four genes in one chromosome, with different head lengths and RNC arrays. The example only shows a fixed head length and RNC array.
How should I rewrite the code to support such multi-gene chromosomes?
Words are not enough to express my gratitude.

question: limit function parameter to const integers

For example, function shift(a, N) takes two parameters. The value for 'N' must be in the range of 1 to 5. The value comes from a random integer number directly, not from output of any other functions.

Is it possible to specify this kind of constraints?

gep_simplify can't be used in evaluation

When I tried the example "numerical_expression_inference-RNC", I attempted to modify the evaluation function. I called gep_simplify in the evaluation function:

def evaluate(individual):
    """Evaluate the fitness of an individual: MAE (mean absolute error)"""
    func = toolbox.compile(individual)
    Yp = np.array(list(map(func, X)))
    individual_simplified = gep.simplify(individual)
    x_occrrence = str(individual_simplified).count("x")
    
    return np.mean(np.abs(Y - Yp)), x_occrrence

However when I started the evolution I always got error:

 ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-13-de916872db16> in <module>
      8 # start evolution
      9 pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
---> 10                           stats=stats, hall_of_fame=hof, verbose=True)

~/Projects/geppy/geppy/algorithms/basic.py in gep_simple(population, toolbox, n_generations, n_elites, stats, hall_of_fame, verbose)
     98         invalid_individuals = [ind for ind in population if not ind.fitness.valid]
     99         fitnesses = toolbox.map(toolbox.evaluate, invalid_individuals)
--> 100         for ind, fit in zip(invalid_individuals, fitnesses):
    101             ind.fitness.values = fit
    102 

<ipython-input-9-7da034cb5577> in evaluate(individual)
      3     func = toolbox.compile(individual)
      4     Yp = np.array(list(map(func, X)))
----> 5     individual_simplified = gep.simplify(individual)
      6     x_occrrence = str(individual_simplified).count("x")
      7 

~/Projects/geppy/geppy/support/simplification.py in simplify(genome, symbolic_function_map)
    129             except:
    130                 linker = genome.linker
--> 131             return sp.simplify(linker(*simplified_exprs))
    132     else:
    133         raise TypeError('Only an argument of type KExpression, Gene, and Chromosome is acceptable. The provided '

TypeError: unsupported operand type(s) for +: 'Add' and 'str'

I tested the gep_simplify called at the post-processing part and it worked well there. I tried but could not find the problem. Thank you very much if you can help.

import problem

Hi. I installed geppy following the instructions in readme, however if I cd to another project folder I can't import geppy. The error is

ModuleNotFoundError: No module named 'geppy.core'

Currently after installing geppy it can only be imported from its folder?

installation not working with python 3.7.4

Hi, I wanted to use GEP, so I tried to install your work, but I got errors and was unable to install it:

I tried to install it using pip and got this error:

ERROR: Could not find a version that satisfies the requirement geppy (from versions: none)
ERROR: No matching distribution found for geppy

I tried to install it from source using:

cd geppy-master
pip install .

This seems to work, but when I leave the folder where geppy is placed and use

import geppy as gep

I get the error:

>>> import geppy as gep
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\tyker1\AppData\Local\Programs\Python\Python37\lib\site-packages\geppy-0.1.0a0-py3.7.egg\geppy\__init__.py", line 37, in <module>
ModuleNotFoundError: No module named 'geppy.core'

It works only in the directory where the source files are placed.

Please provide the symbolic function mapping for 'tan' in symbolic_function_map.

Hi, Shuhua,
Thanks for providing the gene expression programming.
When using the provided example (Use the GEP-RNC algorithm with the UCI Power Plant dataset), I tried to use some functions such as sin, cos, and tan to increase the precision of developed formula.
However, some errors were found.
pset.add_function(math.sin, 1):
Please provide the symbolic function mapping for 'sin' in symbolic_function_map.
pset.add_function(math.tan, 1):
Please provide the symbolic function mapping for 'tan' in symbolic_function_map.
pset.add_function(math.cos, 1):
Please provide the symbolic function mapping for 'cos' in symbolic_function_map.
pset.add_function(math.log, 1):
ValueError Traceback (most recent call last)
ValueError: math domain error
pset.add_function(sp.log, 1):
TypeError Traceback (most recent call last)
TypeError: No loop matching the specified signature and casting was found for ufunc lstsq_n
Could you please give me some suggestions?
Thanks again.

How can I run the setup.py

I can't run setup.py in PyCharm successfully.

Input contains NaN, infinity or a value too large for dtype('float64').

When running the example GEP_RNC_for_ML_with_UCI_Power_Plant_dataset, I added a symbolic function:
pset.add_function(operator.pow, 2)
But geppy gives me this:
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

SVD did not converge in Linear Least Squares

Hi Shuhua,

I have modified the example 'numerical_expression_inference-Linear_scaling.ipynb' by changing the target function to f(x)=x**1.5 and then adding pset.add_function(operator.pow, 2). However, a problem occurred: numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares.

Please give me some advice on how to overcome this issue.

Thanks,
Hoan Nguyen
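One plausible cause of this error is that an unprotected `pow` produces NaN, infinite, or complex predictions, which then reach the least-squares fit. Besides protecting the function itself, the evaluation function can reject such predictions before any fitting is attempted. The sketch below uses only the standard library; `is_usable`, `guarded_mse`, and the penalty value are assumptions for illustration and are not taken from the notebook.

```python
import math


def is_usable(value):
    """True only for real, finite numbers."""
    try:
        return math.isfinite(value)
    except (TypeError, OverflowError):  # complex results or huge integers
        return False


def guarded_mse(y_true, y_pred, penalty=1e12):
    """Return the MSE as a one-element tuple, or a large penalty if any
    prediction is unusable or the error itself is not finite."""
    if not all(is_usable(p) for p in y_pred):
        return (penalty,)
    errors = [(float(t) - float(p)) * (float(t) - float(p)) for t, p in zip(y_true, y_pred)]
    mse = sum(errors) / len(errors)
    return (mse,) if math.isfinite(mse) else (penalty,)


print(guarded_mse([1.0, 2.0], [1.0, 4.0]))           # ordinary case
print(guarded_mse([1.0, 2.0], [1.0, float("nan")]))  # NaN prediction is penalized
print(guarded_mse([1.0], [(-8) ** 0.5]))             # complex prediction is penalized
```

The same check could be placed in front of a linear-scaling step so that the solver only ever sees finite inputs.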

How to add more symbolic functions into simplification.py?

Hi Shuhua,

I am trying to add more symbolic functions to geppy but have trouble with several functions, e.g., pow and exp. Could you please give me some directions?

Thanks,
Hoan Nguyen

Can you explain nevals in the gep_simple output?

I don't understand the meaning of nevals.

Also, can you explain how mating happens?

Assume I have a population (n=50), which produces 50 expressions.

How do they mate? The number of possible pairs among 50 expressions is 50*49/2.

I don't think mating is done that many times, so how are parents selected and offspring generated?

Thanks. I am working on GP, and I think geppy is very friendly.

How can I know that geppy converges

Excuse me, but how can I know that geppy has converged?

Is there any metric like fitness or loss that shows that geppy has converged?

Best!

how to generate all the expressions on a given primitive set?

I think that once the primitive set is fixed, the space of expression trees is also fixed (that is, the number of expression trees for a fixed depth).

How can I generate them all?

I think genFull and genHalf are random generators. How can I generate all trees without missing any?

Thanks
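For GEP genes (as opposed to DEAP's randomly generated GP trees), exhaustive enumeration can be sketched with `itertools.product`: every head position takes any function or terminal, and every tail position takes a terminal. The function `all_genes` and its parameters are hypothetical, and the approach is only practical for very small heads because the count grows exponentially.

```python
from itertools import product


def all_genes(functions, terminals, head_length, max_arity):
    """Yield every gene (head + tail) as a tuple of symbols."""
    tail_length = head_length * (max_arity - 1) + 1
    head_symbols = list(functions) + list(terminals)
    for head in product(head_symbols, repeat=head_length):
        for tail in product(terminals, repeat=tail_length):
            yield head + tail


genes = list(all_genes(functions=["+"], terminals=["a", "b"], head_length=1, max_arity=2))
print(len(genes))  # (1 + 2)**1 head choices times 2**2 tail choices
print(genes[0])
```

Many of these genes decode to the same expression tree, since symbols after the coding region are ignored, so the number of distinct trees is smaller than the number of genes.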

If I change your source code a little, is it right to re-install from source again?

Sorry to bother you again. I cloned this repo with git and changed some code following your advice. Is it then right to just run 'pip install .' again to use the modified repo?

How to perform multi-dimensional GEP

Hello
I’d like to use geppy to calculate a formula where the output is a tensor and the inputs are several tensors and scalars, is there any tutorial to show how to do that?

Yours Sincerely

cannot add pow or exp function

Hi,
I cannot add the power or exponential function in my program.
If I add
pset.add_function(operator.pow, 2), the following is returned. Could you please help me in this regard?

:1: RuntimeWarning: overflow encountered in double_scalars
Traceback (most recent call last):
File "G:/My Drive/Afifa Tamanna/PhD/AI/GEP/Diffn_Analysis_GEP_model/Analysis/testviscosity.py", line 180, in
pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
File "C:\Program Files (x86)\Python38-32\lib\site-packages\geppy\algorithms\basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "<array_function internals>", line 5, in lstsq
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 2259, in lstsq
x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 109, in _raise_linalgerror_lstsq
raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

Issue with linker function

Hello,

I am using your code for a project involving GEP.
When using the linker function "operator.mul", you can only have two genes in your optimisation.
If I try a different number (for example 4 genes), I get the error: "op_mul expected 2 arguments, got 4".
Am I doing something wrong? Shouldn't I be able to use any number of genes with a given kind of linker?

I can solve this by defining a new linker class that takes four input variables, but it causes problems during simplification.

Thank you very much for this code, it is very helpful.

raise TypeError, ("Both weights and assigned values must be a "

I just tried to import geppy and it shows raise TypeError, ("Both weights and assigned values must be a "

Questions about using numba to speed up evaluation functions

I see that you are using JIT of the numba library to speed up the evaluation function in your example, but when I used it, I found that numba does not seem to recognize those custom classes, so an error will be reported directly when using JIT. Is there any good solution to this situation?

GPU acceleration

Is it possible to accelerate geppy evolution/evaluation process using GPU? Any tips what I could do to achieve that?

How to set n_genes = 3

Hello!

I want to set n_genes = 3, but when I did, it threw the following error:

possible to use computer programming language to modify gene of human

How can I add functions such as SQRT, EXP, X^2, POW, INV, LN in my code of Gene Expression Programming

Hi, I cannot add the above-mentioned functions. If I add a function such as
pset.add_function(operator.pow,2)

then the following is returned:
:1: RuntimeWarning: overflow encountered in double_scalars
Traceback (most recent call last):
File "G:/My Drive/Afifa Tamanna/PhD/AI/GEP/Diffn_Analysis_GEP_model/Analysis/testviscosity.py", line 198, in
pop, log = gep.gep_simple(pop, toolbox, n_generations=n_gen, n_elites=1,
File "C:\Program Files (x86)\Python38-32\lib\site-packages\geppy\algorithms\basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "<array_function internals>", line 5, in lstsq
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 2259, in lstsq
x, resids, rank, s = gufunc(a, b, rcond, signature=signature, extobj=extobj)
File "C:\Users\atam0001\AppData\Roaming\Python\Python38\site-packages\numpy\linalg\linalg.py", line 109, in _raise_linalgerror_lstsq
raise LinAlgError("SVD did not converge in Linear Least Squares")
numpy.linalg.LinAlgError: SVD did not converge in Linear Least Squares

Please assist me in solving the issue. Many Thanks.

Support types of input and output of functions

How can I specify the input types for a custom function like PrimitiveSetTyped in deap? @ShuhuaGao

What is the difference between GP and GEP in engineering (running)?

Is it that running GEP requires less RAM than GP?

How to add more symbolic functions in pset.add_function()

Hi Shuhua,
I am trying to add more symbolic functions but have trouble with several functions.
I am following your example 'Use the GEP-RNC algorithm with linear scaling on the UCI Power Plant dataset'.
1. For the 'sin()' function
I have set
pset.add_function(math.sin, 1)
best_ind = hof[0]
symplified_best = gep.simplify(best_ind)
But after running, when I simplify the best individual, geppy gives me this:
Please provide the symbolic function mapping for 'sin' in symbolic_function_map.
I don't know how to map it.
2. For the 'sqrt()' function
I have set
def protected_sqrt(x3):
    if x3 < 1e-6:
        return 1
    return math.sqrt(x3)
and
pset.add_function(protected_sqrt, 1)
But geppy gives me this:
Please provide the symbolic function mapping for 'protected_sqrt' in symbolic_function_map.
I don't know how to do this either.
Thanks

Deep learning, aka tensor based input and functions

Hey there,

Really great work, been looking for a library like this in Python for a while.

Have you thought or considered integrating your package to use tensor input and operations etc.?

So instead of resolving to a function you may resolve to some deep learning network configuration. I am wondering if this is something you have considered. My interest would be to upgrade your package to support tensor operations etc.

Anyway, fantastic work.

Cheers,

Micheal

multi

Hello Again @ShuhuaGao @bolz213

I am trying to use geppy to fit N 3x3 tensors Bij (each is actually a symmetric tensor, so I wrote it as a vector of size 6) with two scalar lists I1, I2 of size N and two lists of 3x3 tensors V1, V2, with fitness defined as tensorDot(Bij, PBij)/(tensordot(Bij,Bji)*tensordot(PBij, PBji)). But I get errors in my evaluation function like this:
Traceback (most recent call last):
File "testGEPPY_DNS.py", line 109, in
stats=stats, hall_of_fame=hof, verbose=True)
File "/Users/weizhang/software/backup/geppy-master/geppy/algorithms/basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "testGEPPY_DNS.py", line 71, in evaluate
Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
IndexError: too many indices for array

Here is my evaluation function:
def evaluate(individual):
    """Evaluate the fitness of an individual: MSE (mean squared error)"""
    func = toolbox.compile(individual)
    Yp = np.array(list(map(func,T1,T2,T3))) # predictions with the GEP model

    #print (np.shape(Yp),Yp)
    a=0
    b=0
    c=0
    for i in range(size):
        Ri=np.array([[bij[i,1],bij[i,2],bij[i,3]],[bij[i,2],bij[i,4],bij[i,5]],[bij[i,3],bij[i,5],bij[i,6]]])
        Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,6]]])

        # print (Ri,np.shape(Ri))

        a=a+np.tensordot(Rp_i,Ri)
        b=a+np.tensordot(Ri,Ri.T)
        c=c+np.tensordot(Rp_i,Rp_i.T)

    return a/(b*c),

I tried to print the PB; it seems to give me an array of shape (N,1) rather than the expected (N,3,3).
Can you please give me some suggestions?

How to change the fitness function

Hi Shuhua Gao,

I want to change the objective function. How can I do that?

Best regards,
Hoan

fitting for multi dimension vector.

Hello Again Gao & Joach

I am trying to use geppy to fit N 3x3 tensors Bij with two scalar lists I1, I2 of size N and two lists of 3x3 tensors V1, V2, with fitness defined as tensorDot(Bij, PBij)/(tensordot(Bij,Bji)*tensordot(PBij, PBji)). But I get errors in my evaluation function like this:
Traceback (most recent call last):
File "testGEPPY_DNS.py", line 109, in
stats=stats, hall_of_fame=hof, verbose=True)
File "/Users/weizhang/software/backup/geppy-master/geppy/algorithms/basic.py", line 100, in gep_simple
for ind, fit in zip(invalid_individuals, fitnesses):
File "testGEPPY_DNS.py", line 71, in evaluate
Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
IndexError: too many indices for array

#print (np.shape(Yp),Yp)
a=0
b=0
c=0
for i in range(size):
   Ri=np.array([[bij[i,1],bij[i,2],bij[i,3]],[bij[i,2],bij[i,4],bij[i,5]],[bij[i,3],bij[i,5],bij[i,:6]]])
   Rp_i=np.array([[Yp[i,1],Yp[i,2],Yp[i,3]],[Yp[i,2],Yp[i,4],Yp[i,5]],[Yp[i,3],Yp[i,5],Yp[i,:6]]])
#   print (Ri,np.shape(Ri))
   a=a+np.tensordot(Rp_i,Ri)
   b=a+np.tensordot(Ri,Ri.T) 
   c=c+np.tensordot(Rp_i,Rp_i.T) 

return a/(b*c),

I tried to print the PB; it seems to give me an array of shape (N,1) rather than the expected (N,3,3).
Can you please give me some suggestions?

cannot install geppy

Hi, I can install geppy neither with pip nor from source (cd geppy). Could you suggest a solution? Thanks!
