theslimvreal / pse---la-meets-ml

Numerical Linear Algebra meets Machine Learning - PSE Project

License: BSD 2-Clause "Simplified" License

TeX 52.49% Python 43.41% C++ 3.26% Shell 0.85%
python neural-network keras ginkgo linear-algebra matrix suitesparse machine-learning classification matrix-calculations

pse---la-meets-ml's People

Contributors

annaric, eoli-an, fabiankof, theslimvreal, yannickfunk

pse---la-meets-ml's Issues

Save standard names

When a dataset is saved, it should get a fixed name together with the current date and time.
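
A minimal sketch of how such a name could be built. The prefix "dataset", the ".hdf5" extension, and the function name are assumptions for illustration, not the project's actual conventions.

```python
from datetime import datetime


def dataset_filename(now=None, prefix="dataset", extension=".hdf5"):
    """Build a file name from a fixed prefix plus the current date and time."""
    now = now or datetime.now()
    # Use only characters that are safe in file names on all platforms.
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return f"{prefix}_{stamp}{extension}"
```

Putting the year first makes the files sort chronologically in a directory listing.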

Test edge cases on command parsing

The command parsing still has some bugs, such as an error when entering 2 spaces instead of 1.
There should be more tests covering these edge cases.
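
One possible way to make the parser tolerant of repeated spaces, shown as a sketch. The function name and the example command are hypothetical; the real parser may be structured differently.

```python
def parse_command(line):
    """Split an input line into (command, arguments), tolerating extra whitespace."""
    # str.split() without a separator treats any run of whitespace as one
    # delimiter and ignores leading/trailing whitespace.
    tokens = line.split()
    if not tokens:
        return None, []
    return tokens[0], tokens[1:]
```

Tests for the edge cases could then compare inputs such as "collect  -n  test" and "collect -n test" and expect the same result.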

Collect required technologies

Make a list of all required/preferred technologies along with some documentation or tutorial links that help to get started.

Find important features of matrices

Find out which features of a matrix really matter for the efficiency of an algorithm (e.g. percentage of zeros, diagonal arrangement of the values, etc.)
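
To experiment with candidate features, something like the following sketch could be a starting point. It works on a dense matrix given as a list of rows; the feature names are assumptions, and which of them actually matter is exactly what this issue needs to find out.

```python
def matrix_features(matrix):
    """Compute simple structural features of a dense matrix given as a list of rows."""
    total = sum(len(row) for row in matrix)
    # Positions of all nonzero entries.
    nonzeros = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value != 0
    ]
    on_diagonal = sum(1 for i, j in nonzeros if i == j)
    return {
        # Share of entries that are zero.
        "zero_fraction": 1 - len(nonzeros) / total,
        # Share of the nonzero entries that lie on the main diagonal.
        "diagonal_fraction": on_diagonal / len(nonzeros) if nonzeros else 0.0,
        # Largest distance of a nonzero entry from the main diagonal.
        "bandwidth": max((abs(i - j) for i, j in nonzeros), default=0),
    }
```

For large sparse matrices the same quantities would have to be computed from a sparse format instead of a dense list of rows.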

Create Wiki

At the moment all of our guides and documentation are in the /utils folder.
This is not very self-explanatory for third-party users.
Therefore we should start using the wiki of our repository. This will also give our guides and documentation a uniform style.
What needs to be done is to move the information contained in the /utils folder to the wiki of our repository.

Improve Doxygen documentation

The documentation created by Doxygen can now be found here.
The documentation of the code is quite good, but the surrounding page doesn't look very pretty.
It would be great if someone could dig into the Doxygen configuration to find out what can be done here.

global test cases

Collect ideas and create global test cases for the specification sheet

Implement size in collector module

The collector doesn't use the size the user entered. This should be changed. The default size should be 128 and be written into the config file. The user can enter a different size.
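
A sketch of the intended precedence: a size entered by the user wins, otherwise the value from the config file is used, otherwise 128. The INI format, the "collector" section, and the "size" key are assumptions; the project's config file may look different.

```python
import configparser

DEFAULT_SIZE = 128  # default from this issue


def resolve_size(user_size=None, config_text=""):
    """Return the size to use: user input first, then config value, then 128."""
    config = configparser.ConfigParser()
    config.read_string(config_text)
    # fallback is returned when the section or the option is missing.
    configured = config.getint("collector", "size", fallback=DEFAULT_SIZE)
    return user_size if user_size is not None else configured
```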

Run travis on lsdf-cluster

In order to test our code that uses the Ginkgo library, we need the Ginkgo environment.
But because it's very difficult to set up Ginkgo, we could try to run our code on the lsdf cluster, where we have a working Ginkgo installation.

In order to do this, we would have to set up the Travis build process differently:

  • Travis needs to use ssh to connect to the cluster
  • to do this, it either needs to be logged into the KIT VPN
  • or use the server that the ATIS provides for each student (ssh into this server, then ssh into the cluster)
  • after that it copies all the files to the lsdf server
  • runs pytest
  • uploads coverage to CodeClimate
  • creates documentation
  • uploads documentation to GitHub-Pages

This will result in a slower build process but gives us the opportunity to run tests using the Ginkgo library, which would be very good to have.
We would also no longer have to install SSGet ourselves every time.

Whether this approach is good still needs to be tested and evaluated.

Fix no input crash

When entering no input the program crashes.
The problem is that we try to pop an element from an empty array.
This needs to be fixed.
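
The fix could be as small as checking the list before popping. A sketch with a hypothetical function name:

```python
def next_command(tokens):
    """Return the first token, or None when the user entered nothing."""
    if not tokens:
        # list.pop() on an empty list raises IndexError, which caused the crash.
        return None
    return tokens.pop(0)
```

The caller can then treat None as "no command" and simply show the prompt again.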

Define first steps

Define what needs to be done soon

e.g. what are the main goals that we want to achieve, ...

Suppress collector warnings

The collector sometimes prints error messages to the command line when checking the regularity of some matrices. These warning messages should not be displayed but handled internally.
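
If the messages are emitted through Python's warnings module (an assumption; numerical libraries often do this for badly conditioned matrices), they could be silenced locally as sketched below. The function name and the check callable are hypothetical.

```python
import warnings


def is_regular(matrix, check):
    """Run check(matrix) with warnings suppressed and return its result as a bool."""
    with warnings.catch_warnings():
        # Only affects warnings raised inside this block; the previous
        # filters are restored when the block is left.
        warnings.simplefilter("ignore")
        return bool(check(matrix))
```

If the messages are instead printed directly to stdout or stderr, this will not help and the output would have to be redirected.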

Prevent program crashes when input is missing

Running the collector without a provided name crashes the program. This should not happen. The other commands should also be checked for similar behavior.
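
One way to check all commands in the same place is a table of required argument counts, sketched here. The command names and counts are made up for illustration.

```python
# Hypothetical commands and the number of arguments each one requires.
REQUIRED_ARGS = {"collect": 1, "label": 1, "train": 2}


def validate(command, args):
    """Return an error message if arguments are missing, otherwise None."""
    required = REQUIRED_ARGS.get(command, 0)
    if len(args) < required:
        return f"'{command}' needs {required} argument(s), got {len(args)}"
    return None
```

The view can print the returned message instead of letting the module crash.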

system model

work out scenarios and use cases for the specification sheet

Increase test coverage in view

The view still has quite low test coverage. This can be increased with more unit tests, and possibly also with integration tests that involve other modules.

Command for balanced data set

At the moment, most of the matrices in a dataset will be labeled with the same solver.
This is not very good for the learning process of the neural network.
It would be good to give the user the opportunity to create a dataset in which each solver occurs the same number of times.

One way to accomplish this is to create a new command that connects the collector module directly to the labeling module. With this you could label each matrix immediately after fetching and allow only (number of matrices / number of solvers) matrices to be labeled with the same solver, discarding the rest. This might be very difficult and/or slow.

An easier way would be to give the labeler a new flag -b/--balanced. This would make the labeler trim its dataset of labeled matrices so that each solver occurs the same number of times. This approach is only sensible with a huge number of matrices, because some solvers occur close to never. It also results in a nondeterministic size of the labeled dataset.
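
The trimming behind -b/--balanced could look roughly like this sketch. It assumes the labeled dataset is a list of (matrix, solver) pairs; the function name and this representation are assumptions.

```python
import random
from collections import defaultdict


def balance(samples, seed=0):
    """Trim (matrix, solver) pairs so that every solver occurs equally often."""
    by_solver = defaultdict(list)
    for matrix, solver in samples:
        by_solver[solver].append(matrix)
    if not by_solver:
        return []
    # The rarest solver decides how many matrices each solver may keep.
    keep = min(len(matrices) for matrices in by_solver.values())
    rng = random.Random(seed)
    balanced = []
    for solver, matrices in by_solver.items():
        for matrix in rng.sample(matrices, keep):
            balanced.append((matrix, solver))
    return balanced
```

The result has (count of the rarest solver) times (number of solvers present) entries, which is why the final size cannot be known before labeling.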

create a small test dataset for the unit tests to use

Creation of pattern images

Find out how we can create a pattern image from a given matrix so we can pass it to Keras.
There is some information in the paper we got from Markus.
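
One plausible approach, sketched under assumptions and not necessarily the method from the paper: divide the matrix into size x size blocks and use the number of nonzeros per block as the grey value. The input format (a list of nonzero positions) and the function name are hypothetical.

```python
def pattern_image(entries, n, size=128):
    """Downsample the sparsity pattern of an n x n matrix to a size x size grid.

    entries is an iterable of (row, column) positions of nonzero values.
    The result holds grey values from 0 (empty block) to 255 (densest block).
    """
    counts = [[0] * size for _ in range(size)]
    for row, col in entries:
        # Map the matrix position to the image pixel covering it.
        counts[row * size // n][col * size // n] += 1
    peak = max(max(line) for line in counts)
    if peak == 0:
        return counts
    return [[255 * value // peak for value in line] for line in counts]
```

The nested list can be converted to an array before being passed on to Keras.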

Random thoughts

I made a list of things that stood out to me while working on the specification sheet. I'd love it if we could discuss them.

  • Is it useful to let the user decide how to split the training and test data? Since the user cannot change any of the other hyperparameters, it seems out of place.
  • Why is the user able to specify a path to save the neural network to, but not a path to save the matrices and labeled greyscale pattern images? (Maybe a default results folder for every module.)
  • If the user only wants to use our classifier, it might be reasonable to let them use our neural network without having to specify where it is in the file system.
  • Both the classifier and the labeling module have to convert matrices to greyscale pattern images. Would it make sense to create a separate module for the greyscale pattern image conversion?
  • We could add a feasibility analysis ("Durchführbarkeitsanalyse"); similar work has been done by our supervisors, and there are many preexisting libraries.
  • Since the "tips" include explaining our product, we might want to add a page explaining the problem we want to solve (what iterative solvers are, why our work is useful, ...).
  • We should each review the work someone else has done.

command line (system model)

Gather ideas for command line usage of our software.
Later on, work out an entire command set for our system.

Object orientation in Python

We need to figure out how far we can go with object orientation in Python (e.g. polymorphism). Someone needs to collect information about this and, if useful, present to the team what is possible and how things are done in Python.
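
As a starting point for that discussion, here is a small sketch of polymorphism with an abstract base class from the standard library's abc module. The class names are only illustrative and do not reflect our actual design.

```python
from abc import ABC, abstractmethod


class Module(ABC):
    """Common interface; cannot be instantiated directly."""

    @abstractmethod
    def run(self, args):
        """Execute the module with the given arguments."""


class Collector(Module):
    def run(self, args):
        return f"collecting {args}"


class Labeler(Module):
    def run(self, args):
        return f"labeling {args}"


def dispatch(module, args):
    # Works for any object that provides run(); no type check is required.
    return module.run(args)
```

Python also supports duck typing, so the base class is optional. It mainly documents the interface and makes a subclass that forgets to implement run() fail at instantiation.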

Write Readme

The README.md file is the first thing a user sees when visiting our repository.
At the moment this file only contains the most basic information.
This should be extended by more explanations of our project together with some examples on how to use our project.
That means, first the user wants to know what our program does.
The user needs to know how he can run our code (setup & installation).
The user needs to know how he can use our program (interaction examples).
For more specific information you can reference our wiki.

Error handling on loader/saver

When using the loader or saver, they should raise an IOException if something goes wrong.
Every usage of them should therefore be wrapped in a dedicated try/except block to make sure the file was loaded correctly.
This step is important so that incorrect inputs or config file settings don't crash the program.
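
A sketch of the pattern. Python has no built-in IOException, so it is defined here as a hypothetical project-specific exception; the JSON config format and the function names are assumptions as well.

```python
import json


class IOException(Exception):
    """Hypothetical project-specific error raised by the loader and saver."""


def load_config(path):
    """Load a JSON config file, turning low-level errors into IOException."""
    try:
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    except (OSError, ValueError) as error:
        # OSError: missing or unreadable file; ValueError: invalid JSON.
        raise IOException(f"could not load {path}: {error}") from error


def load_or_default(path, default):
    """Example caller: fall back to a default instead of crashing."""
    try:
        return load_config(path)
    except IOException:
        return default
```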
