
PSDR: Parameter Space Dimension Reduction Toolbox


Author: Jeffrey M. Hokanson, Postdoctoral Fellow at the University of Colorado Boulder ([email protected])

License: GNU Affero General Public License v3.0

Introduction

Given a function mapping some subset of an m-dimensional space to a scalar value,

f : D ⊂ R^m → R,

parameter space dimension reduction seeks to identify a low-dimensional manifold of the inputs along which this function varies the most. Frequently we choose a linear manifold, and consequently identify linear combinations of the input variables along which the function varies the most; we call this subspace-based dimension reduction. There are also techniques that identify a set of active variables (coordinate-based dimension reduction) and methods that identify low-dimensional nonlinear manifolds of the inputs (nonlinear dimension reduction).
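
For example, in subspace-based dimension reduction we seek a matrix U ∈ R^{m×n} with n ≪ m such that

f(x) ≈ g(U^T x)

for some g : R^n → R; the simple example below recovers the leading column of such a U for the Borehole test problem.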

We emphasize that this library is for parameter space dimension reduction as the term 'dimension reduction' often appears in other contexts. For example, model reduction is often referred to as dimension reduction because it reduces the state-space dimension of a set of differential equations, yielding a smaller set of differential equations.

Simple example

One basic use of the library is to identify an active subspace using the outer product of gradients:

import psdr, psdr.demos
fun = psdr.demos.Borehole()    # load a test problem
X = fun.domain.sample(1000)    # sample points from the domain with uniform probability
grads = fun.grad(X)            # evaluate the gradient at the points in X
act = psdr.ActiveSubspace()    # initialize a class to find the Active Subspace
act.fit(grads)                 # estimate the active subspace using these Monte-Carlo samples
print(act.U[:,0])              # print the most important linear combination of variables

>>> array([ 9.19118904e-01, -2.26566967e-03,  2.90116247e-06,  2.17665629e-01,
        2.78485430e-03, -2.17665629e-01, -2.21695479e-01,  1.06310937e-01])

We can then create a shadow plot showing the projection of the inputs to this function onto the one-dimensional subspace spanned by the important linear combination identified above:

import matplotlib.pyplot as plt
fX = fun(X)                    # evaluate the function at the points X
act.shadow_plot(X, fX)         # generate the shadow plot
plt.show()                     # draw the results

[Figure: a shadow plot for the Borehole function]

We say this function has low-dimensional structure since its output is well described by the value of this one linear combination of its input parameters.

Documentation

For further documentation, please see our page on Read the Docs.

Similar Software

Contributing

I welcome contributions to this library, particularly of test functions similar to those in psdr.demos. Please submit a pull request along with unit tests for the proposed code. If you are submitting a complex test function that requires calling code outside of Python, please submit a Docker image along with the Dockerfile generating that image (see the OpenAeroStruct demo function for an example of how to do this).

Contributors

  • Zach Grey
  • Lakshya Sharma


psdr's Issues

Polynomial kernel approximation for large samples

Hi @jeffrey-hokanson, I found this library last week and I think it's great. I would like to compare the polynomial approximation to machine learning approaches like neural networks. However, I tried running the library and got an error when trying to use 50,000 samples. The SVD in PolynomialRidgeApproximation._finish requires too much memory and I was wondering if it's really necessary. Could it be replaced with linalg.lstsq? I don't fully understand what is going on there, but from the description it's trying to find the c coefficients that match the optimal U, isn't it?
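
A sketch of the coefficient solve being asked about (not psdr's actual implementation): assuming that, with the subspace U fixed, the coefficients c solve an ordinary least-squares problem, scipy.linalg.lstsq only factors the tall-skinny basis matrix and never forms an N-by-N factor. The monomial basis below is a hypothetical stand-in for psdr's internal basis classes.

import numpy as np
from itertools import combinations_with_replacement
from scipy.linalg import lstsq

def ridge_coefficients(X, fX, U, degree):
    """Fit c so that fX ≈ V @ c, where V is a polynomial basis in U^T x."""
    Y = X @ U                                  # project samples onto the ridge subspace
    terms = [t for d in range(degree + 1)
             for t in combinations_with_replacement(range(Y.shape[1]), d)]
    V = np.array([[np.prod(y[list(t)]) for t in terms] for y in Y])
    c = lstsq(V, fX)[0]                        # thin SVD of an N x len(terms) matrix only
    return c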

Printer Actuator

A test problem with six variables based on an actuator for a dot-matrix printer. See Sec. 6 of:

Michael Stein, "Large Sample Properties of Simulations Using Latin Hypercube Sampling," Technometrics, Vol. 29, No. 2 (May 1987), pp. 143-151.

Investigate Halfspace for Bounded Voronoi Vertex Computation

See documentation from scipy: https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.HalfspaceIntersection.html#scipy.spatial.HalfspaceIntersection

Idea: The structure of LinIneqDomain yields a set of halfspaces that define the domain. The equidistant constraints of a Voronoi diagram similarly define halfspaces. Thus, enumerating over the points x, we can construct the bounded Voronoi diagram cell by cell. This is likely not feasible beyond a few dimensions.
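
A minimal sketch of this idea (a hypothetical helper, assuming the domain is available explicitly as A x <= b and that X[i] lies strictly inside it): the bisector against each other point contributes one halfspace, and scipy enumerates the vertices of the intersection.

import numpy as np
from scipy.spatial import HalfspaceIntersection

def bounded_voronoi_vertices(X, A, b, i):
    """Vertices of the Voronoi cell of X[i] inside {x : A x <= b}."""
    xi = X[i]
    rows = [np.hstack([A, -b[:, None]])]       # domain: A x - b <= 0, stacked [normal | offset]
    for j, xj in enumerate(X):
        if j == i:
            continue
        # ||x - xi|| <= ||x - xj||  <=>  2 (xj - xi) . x + (xi.xi - xj.xj) <= 0
        rows.append(np.hstack([2 * (xj - xi), xi @ xi - xj @ xj])[None, :])
    hs = HalfspaceIntersection(np.vstack(rows), xi)   # xi is interior to its own cell
    return hs.intersections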

Add support for constraints in ConvexHullDomain

In the current design of ConvexHullDomain, we cannot add constraints of the form:

  • linear inequality constraints
  • linear equality constraints
  • quadratic constraints

This should be simple to support by extending the constraints when solving the LPs for each of the main problems.
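
As a sketch of the linear inequality case (not the library's implementation; hull_extent is a hypothetical helper): substituting x = V^T λ for hull vertices V turns the extra constraint A x <= b into additional rows of the LP over the convex weights λ.

import numpy as np
from scipy.optimize import linprog

def hull_extent(V, p, A=None, b=None):
    """Maximize p^T x over x in conv(rows of V), optionally subject to A x <= b."""
    N = V.shape[0]
    c = -(V @ p)                               # maximize p^T V^T lam
    A_ub = A @ V.T if A is not None else None  # extra inequality in the lam coordinates
    res = linprog(c, A_ub=A_ub, b_ub=b,
                  A_eq=np.ones((1, N)), b_eq=np.ones(1),
                  bounds=[(0, None)] * N)      # lam >= 0, sum(lam) == 1
    return V.T @ res.x                         # the maximizing point x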

Faster Minimax Design

For the minimax design algorithm based on clustering [MJ18], we could replace the approximate max-norm center computed using the q-norm with an exact max-norm center computed using CVXPY. To improve performance when there are a large number of points, we could incorporate Blitz [JG15] to estimate a working set and use Ritter's bounding sphere algorithm to initialize the feasible point.
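
A sketch of the exact max-norm center as a second-order cone program in CVXPY (minimax_center is a hypothetical helper, not part of psdr):

import cvxpy as cp

def minimax_center(P):
    """Center x minimizing max_i ||x - P[i]||_2 over the points in P."""
    x = cp.Variable(P.shape[1])
    t = cp.Variable()
    constraints = [cp.norm(x - p, 2) <= t for p in P]
    cp.Problem(cp.Minimize(t), constraints).solve()
    return x.value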

Unicode in domain string descriptions

When printing information about a domain, the code currently renders it in ASCII; e.g.,

>>> print(dom)
<LinIneqDomain on R^8; 1 linear equality constraints>

However, it would be nice if we could use Unicode to render a nicer text description; e.g.,

>>> print(dom)
<LinIneqDomain on ℝ⁸; 1 linear equality constraints>

Pros:

  • Similar to mathematical notation
  • Looks nicer

Cons:

  • Increases complexity
  • May not render on all text terminals
  • Unsure how to fall back to ASCII for limited terminals (one possible approach is sketched below)
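
One possible approach for the fallback (a sketch, not psdr's API): translate the dimension's digits to Unicode superscripts and fall back to ASCII when the output stream cannot encode the result.

import sys

SUPERSCRIPTS = str.maketrans('0123456789', '⁰¹²³⁴⁵⁶⁷⁸⁹')

def render_space(m):
    text = 'ℝ' + str(m).translate(SUPERSCRIPTS)   # e.g., ℝ⁸
    try:
        text.encode(sys.stdout.encoding or 'ascii')
        return text
    except UnicodeEncodeError:
        return 'R^%d' % m                          # ASCII fallback for limited terminals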

Implement test problems from Hoburg and Abbeel

See paper [HA14]: Geometric Programming for Aircraft Design Optimization by Warren Hoburg and Pieter Abbeel in AIAA J 2014.

Although all their examples are geometric programs, and hence convex after a change of coordinates, they are useful test problems when treated as non-convex programs.

Improved quadrature rules for generic domains

The current quadrature rule is based on tensor product Gauss rules or a simple Monte-Carlo rule. It would be nice to have better rules, perhaps invoking the Lipschitz matrix to weight directions appropriately.
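
For reference, the simple Monte-Carlo rule mentioned above amounts to equal weights at uniform samples (a sketch using the Borehole demo from the example earlier; it estimates the mean of the function with respect to the uniform measure on the domain):

import numpy as np
import psdr.demos

fun = psdr.demos.Borehole()
X = fun.domain.sample(10000)          # nodes: uniform samples from the domain
w = np.full(len(X), 1.0 / len(X))     # equal Monte-Carlo weights
mean_estimate = w @ fun(X)            # approximates the mean of fun over the domain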

PRA - degree 1 polynomial broken

Resulting fit looks incorrect. See /PSDR/docs/source/ASAP.ipynb

pra = psdr.PolynomialRidgeApproximation(degree=1, subspace_dimension=1)
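
A minimal reproduction in the style of the README example (assuming PolynomialRidgeApproximation follows the same X, fX fitting convention and exposes a shadow_plot like ActiveSubspace does):

import matplotlib.pyplot as plt
import psdr, psdr.demos

fun = psdr.demos.Borehole()
X = fun.domain.sample(500)
fX = fun(X)
pra = psdr.PolynomialRidgeApproximation(degree=1, subspace_dimension=1)
pra.fit(X, fX)                 # the resulting degree-1 fit looks incorrect
pra.shadow_plot(X, fX)         # compare the fit against the samples
plt.show()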

Reduce size of MULTIF docker image

The current MULTI-F Docker image is larger than 5 GB. This makes testing slow, since travis-ci re-downloads the image on every run, and it will be a barrier to other users. The goal is to reduce the size of this image.

Compare methods to UQLab

UQLab is a Matlab-based toolbox for uncertainty quantification developed at ETH Zurich. Of particular interest are comparisons of its methods for reliability-based design optimization (RBDO) against those based on the ridge approximations implemented in this library.

Improve initial_sample for dimension >3

The function initial_sample tries to provide points sampled uniformly in the space defined by the metric L. In an earlier implementation this was done by constructing samples uniformly at random from convex combinations. Unfortunately, numerical experiments suggested this performed worse than simple random sampling. However, the fix I have adopted (sampling on the convex hull of these low-dimensional points) scales exponentially in dimension. Hence, the heuristic instead simply chooses randomly in this case.

Create new tutorials and update old ones

Several of the tutorial notebooks use an older style of handling functions and constraints. Update these examples to reference the more modern approach (e.g., fun.domain) in which normalization is handled transparently.

Alternative bandwidth selection methods for local_linear

The performance of OuterProductGradient depends heavily on the accuracy of local_linear, which in turn depends heavily on the choice of bandwidth for the kernel weighting the importance of the other samples by their distance in the ambient space. The perplexity-based method works better than the static bandwidth recommended by [Li18], but there may be better approaches. There is a wide literature on bandwidth selection techniques; see, e.g., the kedd documentation (an R package) for a brief summary of many. Unfortunately, most of these are quadratic in the number of samples and do not seem to offer a per-sample bandwidth.
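
For context, a sketch of the kind of locally weighted fit local_linear performs (a standalone stand-in, not psdr's implementation), showing where the bandwidth h enters through the kernel weights:

import numpy as np

def local_linear_fit(X, fX, x0, h):
    """Weighted least-squares fit f(x) ≈ a + g . (x - x0) near x0 with bandwidth h."""
    d2 = np.sum((X - x0) ** 2, axis=1)
    w = np.exp(-d2 / (2 * h ** 2))             # Gaussian kernel weights
    sw = np.sqrt(w)
    Amat = np.hstack([np.ones((X.shape[0], 1)), X - x0])
    coef, *_ = np.linalg.lstsq(sw[:, None] * Amat, sw * fX, rcond=None)
    return coef[0], coef[1:]                   # local value and gradient estimates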

NACA0012 with Free-Form Deformation shape perturbation

Modify the NACA0012 test problem to include this parameterization. I'd like to use piecewise linear boxes and encode a negative curvature constraint by requiring that, for each set of three adjacent points, the finite-difference approximation of the second derivative is negative for the upper surface and positive for the lower surface.
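
A sketch of the proposed constraint as linear inequalities (an assumption about the eventual parameterization): if y holds the upper-surface control heights at uniformly spaced stations, each second difference must be non-positive, giving rows of A y <= 0.

import numpy as np

def concavity_constraints(n):
    """Rows A such that A y <= 0 enforces negative curvature of y (upper surface)."""
    A = np.zeros((n - 2, n))
    for i in range(1, n - 1):
        A[i - 1, i - 1:i + 2] = [1.0, -2.0, 1.0]   # second-difference stencil
    return A

# For the lower surface, negate: -A y <= 0 enforces positive curvature.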
