
PSDR: Parameter Space Dimension Reduction Toolbox


Author: Jeffrey M. Hokanson, Postdoctoral Fellow at the University of Colorado Boulder ([email protected])

License: GNU Affero General Public License v3.0

Introduction

Given a function mapping some subset of an m-dimensional space to a scalar value,

f : D ⊂ R^m → R,

parameter space dimension reduction seeks to identify a low-dimensional manifold of the inputs along which this function varies the most. Frequently we choose a linear manifold, and consequently identify linear combinations of the input variables along which the function varies the most; we call this subspace-based dimension reduction. There are also techniques that identify a set of active variables (coordinate-based dimension reduction) and methods that identify low-dimensional nonlinear manifolds of the inputs (nonlinear dimension reduction).
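
For example, in subspace-based dimension reduction we seek a matrix U ∈ R^{m×n} with n ≪ m such that

f(x) ≈ g(U^T x)

for some g : R^n → R; the simple example below recovers the leading column of such a U for the Borehole test problem.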

We emphasize that this library is for parameter space dimension reduction as the term 'dimension reduction' often appears in other contexts. For example, model reduction is often referred to as dimension reduction because it reduces the state-space dimension of a set of differential equations, yielding a smaller set of differential equations.

Simple example

One basic use of the library is to identify an active subspace using the outer product of gradients:

import psdr, psdr.demos
fun = psdr.demos.Borehole()    # load a test problem
X = fun.domain.sample(1000)    # sample points from the domain with uniform probability
grads = fun.grad(X)            # evaluate the gradient at the points in X
act = psdr.ActiveSubspace()    # initialize a class to find the Active Subspace
act.fit(grads)                 # estimate the active subspace using these Monte-Carlo samples
print(act.U[:,0])              # print the most important linear combination of variables

>>> array([ 9.19118904e-01, -2.26566967e-03,  2.90116247e-06,  2.17665629e-01,
        2.78485430e-03, -2.17665629e-01, -2.21695479e-01,  1.06310937e-01])

We can then create a shadow plot showing the projection of the inputs to this function onto the one-dimensional subspace spanned by the important linear combination identified above:

import matplotlib.pyplot as plt
fX = fun(X)                    # evaluate the function at the points X
act.shadow_plot(X, fX)         # generate the shadow plot
plt.show()                     # draw the results

[Figure: a shadow plot for the Borehole function]

We say this function has low-dimensional structure since its output is well described by the value of this one linear combination of its input parameters.

Documentation

For further documentation, please see our page on Read the Docs.

Similar Software

Contributing

I welcome contributions to this library, particularly of test functions similar to those in psdr.demos. Please submit a pull request along with unit tests for the proposed code. If you are submitting a complex test function that requires calling code outside of Python, please submit a Docker image along with the Dockerfile generating that image (see the OpenAeroStruct demo function for an example of how to do this).

Contributors

  • Zach Grey
  • Lakshya Sharma


psdr's Issues

Polynomial kernel approximation for large samples

Hi @jeffrey-hokanson, I found this library last week and I think it's great. I would like to compare the polynomial approximation to machine learning approaches like neural networks. However, I tried running the library and got an error when trying to use 50,000 samples. The SVD in PolynomialRidgeApproximation._finish requires too much memory and I was wondering if it's really necessary. Could it be replaced with linalg.lstsq? I don't fully understand what is going on there, but from the description it's trying to find the c coefficients that match the optimal U, isn't it?
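
A sketch of the coefficient solve being asked about (not psdr's actual implementation): assuming that, with the subspace U fixed, the coefficients c solve an ordinary least-squares problem, scipy.linalg.lstsq only factors the tall-skinny basis matrix and never forms an N-by-N factor. The monomial basis below is a hypothetical stand-in for psdr's internal basis classes.

import numpy as np
from itertools import combinations_with_replacement
from scipy.linalg import lstsq

def ridge_coefficients(X, fX, U, degree):
    """Fit c so that fX ≈ V @ c, where V is a polynomial basis in U^T x."""
    Y = X @ U                                  # project samples onto the ridge subspace
    terms = [t for d in range(degree + 1)
             for t in combinations_with_replacement(range(Y.shape[1]), d)]
    V = np.array([[np.prod(y[list(t)]) for t in terms] for y in Y])
    c = lstsq(V, fX)[0]                        # thin SVD of an N x len(terms) matrix only
    return c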

Printer Actuator

A test problem with six variables based on an actuator for a dot-matrix printer. See Sec. 6 of:

Michael Stein, "Large Sample Properties of Simulations Using Latin Hypercube Sampling," Technometrics, Vol. 29, No. 2 (May 1987), pp. 143-151.

Investigate Halfspace for Bounded Voronoi Vertex Computation

See documentation from scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.HalfspaceIntersection.html#scipy.spatial.HalfspaceIntersection

Idea: The structure of LinIneqDomain yields a set of halfspaces that define the domain. The equidistant constraints of a Voronoi diagram similarly define halfspaces. Thus, enumerating over the points x, we can construct the bounded Voronoi diagram cell by cell. This is likely not feasible beyond a few dimensions.
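
A minimal sketch of this idea (a hypothetical helper, assuming the domain is available explicitly as A x <= b and that X[i] lies strictly inside it): the bisector against each other point contributes one halfspace, and scipy enumerates the vertices of the intersection.

import numpy as np
from scipy.spatial import HalfspaceIntersection

def bounded_voronoi_vertices(X, A, b, i):
    """Vertices of the Voronoi cell of X[i] inside {x : A x <= b}."""
    xi = X[i]
    rows = [np.hstack([A, -b[:, None]])]       # domain: A x - b <= 0, stacked [normal | offset]
    for j, xj in enumerate(X):
        if j == i:
            continue
        # ||x - xi|| <= ||x - xj||  <=>  2 (xj - xi) . x + (xi.xi - xj.xj) <= 0
        rows.append(np.hstack([2 * (xj - xi), xi @ xi - xj @ xj])[None, :])
    hs = HalfspaceIntersection(np.vstack(rows), xi)   # xi is interior to its own cell
    return hs.intersections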

Add support for constraints in ConvexHullDomain

In the current design of ConvexHullDomain, we cannot add constraints of the form:

  • linear inequality constraints
  • linear equality constraints
  • quadratic constraints

This should be simple to support by extending the constraints when solving the LPs for each of the main problems.
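
As a sketch of the linear inequality case (not the library's implementation; hull_extent is a hypothetical helper): substituting x = V^T λ for hull vertices V turns the extra constraint A x <= b into additional rows of the LP over the convex weights λ.

import numpy as np
from scipy.optimize import linprog

def hull_extent(V, p, A=None, b=None):
    """Maximize p^T x over x in conv(rows of V), optionally subject to A x <= b."""
    N = V.shape[0]
    c = -(V @ p)                               # maximize p^T V^T lam
    A_ub = A @ V.T if A is not None else None  # extra inequality in the lam coordinates
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  A_eq=np.ones((1, N)), b_eq=np.ones(1),
                  bounds=[(0, None)] * N)      # lam >= 0, sum(lam) == 1
    return V.T @ res.x                         # the maximizing point x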

Faster Minimax Design

For the minimax design algorithm based on clustering [MJ18], we could replace the approximate max-norm center computed using the q-norm with an exact max-norm center computed using CVXPY. To improve performance when there are a large number of points, we could incorporate Blitz [JG15] to estimate a working set and use Ritter's bounding sphere algorithm to initialize the feasible point.
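
A sketch of the exact max-norm center as a second-order cone program in CVXPY (minimax_center is a hypothetical helper, not part of psdr):

import cvxpy as cp

def minimax_center(P):
    """Center x minimizing max_i ||x - P[i]||_2 over the points in P."""
    x = cp.Variable(P.shape[1])
    t = cp.Variable()
    constraints = [cp.norm(x - p, 2) <= t for p in P]
    cp.Problem(cp.Minimize(t), constraints).solve()
    return x.value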

Unicode in domain string descriptions

When printing information about a domain, the code currently renders it in ASCII; e.g.,

>>> print(dom)
<LinIneqDomain on R^8; 1 linear equality constraints>

However, it would be nice if we could use Unicode to render a nicer text description; e.g.,

>>> print(dom)
<LinIneqDomain on ℝ⁸; 1 linear equality constraints>

Pros:

  • Similar to mathematical notation
  • Looks nicer

Cons:

  • Increases complexity
  • May not render on all text terminals
  • Unsure how to fall back to ASCII for limited terminals (one possible approach is sketched below)
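
One possible approach for the fallback (a sketch, not psdr's API): translate the dimension's digits to Unicode superscripts and fall back to ASCII when the output stream cannot encode the result.

import sys

SUPERSCRIPTS = str.maketrans('0123456789', '⁰¹²³⁴⁵⁶⁷⁸⁹')

def render_space(m):
    text = 'ℝ' + str(m).translate(SUPERSCRIPTS)   # e.g., ℝ⁸
    try:
        text.encode(sys.stdout.encoding or 'ascii')
        return text
    except UnicodeEncodeError:
        return 'R^%d' % m                          # ASCII fallback for limited terminals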

Implement test problems from Hoburg and Abbeel

See paper [HA14]: Geometric Programming for Aircraft Design Optimization by Warren Hoburg and Pieter Abbeel in AIAA J 2014.

Although all their examples are geometric programs, and hence convex after a change of coordinates, they are useful test problems when treated as non-convex programs.

Improved quadrature rules for generic domains

The current quadrature rule is based on tensor product Gauss rules or a simple Monte-Carlo rule. It would be nice to have better rules, perhaps invoking the Lipschitz matrix to weight directions appropriately.
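
For reference, the simple Monte-Carlo rule mentioned above amounts to equal weights at uniform samples (a sketch using the Borehole demo from the example earlier; it estimates the mean of the function with respect to the uniform measure on the domain):

import numpy as np
import psdr.demos

fun = psdr.demos.Borehole()
X = fun.domain.sample(10000)          # nodes: uniform samples from the domain
w = np.full(len(X), 1.0 / len(X))     # equal Monte-Carlo weights
mean_estimate = w @ fun(X)            # approximates the mean of fun over the domain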

PRA - degree 1 polynomial broken

Resulting fit looks incorrect. See /PSDR/docs/source/ASAP.ipynb

pra = psdr.PolynomialRidgeApproximation(degree=1, subspace_dimension=1)
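
A minimal reproduction in the style of the README example (assuming PolynomialRidgeApproximation follows the same X, fX fitting convention and exposes a shadow_plot like ActiveSubspace does):

import matplotlib.pyplot as plt
import psdr, psdr.demos

fun = psdr.demos.Borehole()
X = fun.domain.sample(500)
fX = fun(X)
pra = psdr.PolynomialRidgeApproximation(degree=1, subspace_dimension=1)
pra.fit(X, fX)                 # the resulting degree-1 fit looks incorrect
pra.shadow_plot(X, fX)         # compare the fit against the samples
plt.show()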

Reduce size of MULTIF docker image

The current MULTI-F Docker image is larger than 5 GB. This makes testing slow, since travis-ci re-downloads the image on every run, and it will be a barrier to other users. The goal is to reduce the size of this image.

Compare methods to UQLab

UQLab is a Matlab-based toolbox for uncertainty quantification developed at ETH Zurich. Of particular interest are comparisons of its methods for reliability-based design optimization (RBDO) against those based on the ridge approximations implemented in this library.

Improve initial_sample for dimension >3

The function initial_sample tries to provide points sampled uniformly in the space defined by the metric L. In an earlier implementation this was done by constructing samples uniformly at random from convex combinations. Unfortunately, numerical experiments suggested this performed worse than simple random sampling. However, the fix I have adopted (sampling on the convex hull of these low-dimensional points) scales exponentially in dimension. Hence, the heuristic instead simply chooses randomly in this case.

Create new tutorials and update old ones

Several of the tutorial notebooks use an older style of handling functions and constraints. Update these examples to reference the more modern approach (e.g., fun.domain) in which normalization is handled transparently.

Alternative bandwidth selection methods for local_linear

The performance of OuterProductGradient depends heavily on the accuracy of local_linear, which in turn depends heavily on the choice of bandwidth for the kernel weighting the importance of the other samples by their distance in the ambient space. The perplexity-based method works better than the static bandwidth recommended by [Li18], but there may be better approaches. There is a wide literature on bandwidth selection techniques; see, e.g., the kedd documentation (an R package) for a brief summary of many. Unfortunately, most of these are quadratic in the number of samples and do not seem to offer a per-sample bandwidth.
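
For context, a sketch of the kind of locally weighted fit local_linear performs (a standalone stand-in, not psdr's implementation), showing where the bandwidth h enters through the kernel weights:

import numpy as np

def local_linear_fit(X, fX, x0, h):
    """Weighted least-squares fit f(x) ≈ a + g . (x - x0) near x0 with bandwidth h."""
    d2 = np.sum((X - x0) ** 2, axis=1)
    w = np.exp(-d2 / (2 * h ** 2))             # Gaussian kernel weights
    sw = np.sqrt(w)
    Amat = np.hstack([np.ones((X.shape[0], 1)), X - x0])
    coef, *_ = np.linalg.lstsq(sw[:, None] * Amat, sw * fX, rcond=None)
    return coef[0], coef[1:]                   # local value and gradient estimates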

NACA0012 with Free-Form Deformation shape perturbation

Modify the NACA0012 test problem to include this parameterization. I'd like to use piecewise linear boxes and encode a negative curvature constraint by requiring that, for each set of three adjacent points, the finite-difference approximation of the second derivative is negative for the upper surface and positive for the lower surface.
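
A sketch of the proposed constraint as linear inequalities (an assumption about the eventual parameterization): if y holds the upper-surface control heights at uniformly spaced stations, each second difference must be non-positive, giving rows of A y <= 0.

import numpy as np

def concavity_constraints(n):
    """Rows A such that A y <= 0 enforces negative curvature of y (upper surface)."""
    A = np.zeros((n - 2, n))
    for i in range(1, n - 1):
        A[i - 1, i - 1:i + 2] = [1.0, -2.0, 1.0]   # second-difference stencil
    return A

# For the lower surface, negate: -A y <= 0 enforces positive curvature.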
