Giter Site home page Giter Site logo

Proposal for linalg about stdlib HOT 23 OPEN

fortran-lang avatar fortran-lang commented on May 25, 2024 3
Proposal for linalg

from stdlib.

Comments (23)

marshallward avatar marshallward commented on May 25, 2024 2

@certik Mostly just agreement on my part. I am not a big user of linear algebra so I'm reluctant to weigh in here too heavily. But since you asked...

  1. I agree subroutine is the best option. You could use a type to bundle eigenvalues and eigenvectors, but then the user would have to use both the subroutine and the type which is clunky.

    1. A related consideration is ordering of arguments: output first or input first? I only ask because the decision will likely propagate into later functions. (And to tell the truth, I am not sure which I prefer...)

    2. I think @ivan-pi's suggestion of a solver object is interesting, but could also be designed to support the simple subroutine API.

  2. Agreed that allocation should not be used for output, and internally stack should be used if possible. Users need to control OS resources.

  3. I think in-place does need to be considered at some level; linalg operations are great at blowing away memory (again, users should control resources). Not sure if it should be in a separate function, but I think it ought to be at least provisionally resolved before development starts.

  4. Agree that users ought to have algorithm control, e.g. stiff matrices, or reproducibility (#12). But I also think it can be addressed later in development an optional argument.

    1. Should sparse arrays be considered here? Or separately? (This can also be handled later, btw!)

Mostly though I agree with @milancurcic - make the 🍌 !

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024 1

@milancurcic, @ivan-pi I think you just demonstrated that we need both "copy" and "inplace" API.

At best, I'd say we may need both. Let's focus on one, and consider another later.

I'd caution against involving derived types or OOP here from the get-go. Obviously there are implementations with procedures only. I suggest taking an incremental approach here:

  • Implement the most basic building block, say, eig that doesn't modify in-place (makes a copy);
  • Evaluate and hear from community if in-place variant is really needed. It's likely to be more computationally efficient, albeit slightly less elegant. Does the community want it? Great! Let's do it;
  • Evaluate the basic 2 implementations and discuss if higher-level abstractions (derived types) are needed.

In summary, let's make a banana first, and jungle later. Gorilla will come for the banana sooner or later. Let's worry about it then.

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024 1

Example of using generic interface to specific implementations:

interface :: eig
  module procedure :: deig_copy, deig_inplace
  ...
end interface eig

contains

subroutine deig_copy(A, lam, c)
  real(dp), intent(in) :: A(:, :)  ! matrix A
  complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
  complex(dp), intent(out) :: c(:, :)
  ...

subroutine deig_inplace(A, lam)
  real(dp), intent(in out) :: A(:, :)  ! matrix A and result c
  complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
  ...

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024 1

It could be done with optional argument instead. I don't know if there are performance penalties. I like specific procedures + generic interface better as its more explicit and clear what it does, even if more verbose. With optional arguments, A would be intent(in out) in both modes, and it's modified only in one.

from stdlib.

certik avatar certik commented on May 25, 2024

Let's discuss the API for eig. Fortran cannot return more than one result in a function, and eig must return both eigenvalues and eigenvectors. So the only solution is a subroutine. The first argument should be the input matrix, to be consistent with NumPy and Matlab. The second argument can be an optional B matrix for the generalized eigenvector problem. The next argument can be eigenvalues and the last argument eigenvectors (NumPy, Julia style) or they can be switched (Matlab style). For efficiency reasons the eigenvalues should be just a vector (NumPy, Julia style) not a diagonal matrix (Matlab style: the default is a matrix, but the eigvalOption input option can specify vector to return a vector as well).

That reasoning leads to the following API:

  interface eig
     module procedure seig
     module procedure seig_generalized
     module procedure deig
     module procedure deig_generalized
     module procedure zeig
     module procedure zeig_generalized
  end interface eig

Here s means single precision, d double, z complex (double). Here is an example of a double precision interface:

    subroutine deig(A, lam, c)
    ! Solves a standard eigenproblem A*c = lam*c
    real(dp), intent(in) :: A(:, :)  ! matrix A
    complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
    complex(dp), intent(out) :: c(:, :)  ! eigenvectors: A c = lam c; c(i,j) = ith component of jth vec.
    ...
    end subroutine

    subroutine deig_generalized(A, B, lam, c)
    ! Solves a generalized eigenproblem A*c = lam*B*c
    real(dp), intent(in) :: A(:, :)  ! matrix A
    real(dp), intent(in) :: B(:, :)  ! matrix B
    complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
    complex(dp), intent(out) :: c(:, :)  ! eigenvectors: A c = lam c; c(i,j) = ith component of jth vec.
    ...
    end subroutine

from stdlib.

certik avatar certik commented on May 25, 2024

Here are some common points that should be discussed:

  • Should the return arrays be allocatable, intent(out)? I would say no, for efficiency reasons.

  • Should there be some optional options such as algorithm in the Matlab's API? I think there could.

  • We need to implement all combinations of types. The above proposal assumes real input matrices. We should also allow a complex input matrix.

  • Modify inplace? Lapack sometimes returns the result by overwriting the input matrix, although in the case of eig that does not seem to be the case (it is the case for cholesky), but the input matrix is destroyed, so internally there is an extra copy happening. The simple solution is the above proposal, that requires an extra copy, but the input matrix is never modified. Another solution is to have two kinds of eig subroutines, the one above, and another one called eig_inplace which will modify the matrix A and return the eigenvalues there.

from stdlib.

certik avatar certik commented on May 25, 2024

@cmacmackin, @milancurcic, @septcolor what do you think of this proposal? I chose it so that it's something we might be able to agree quickly, and then we can figure out the proper workflow.

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024

I'm not experienced with linalg stuff, but all looks reasonable. Procedure names are short and intuitive.

I agree that allocatable for intent(out) dummy arguments should be avoided unless necessary.

I don't like APIs that modify variables in-place, so I suggest we make an extra copy in default implementation, but if desired by the community we can also have a variant with an intent(in out) argument as you suggested.

Otherwise, full steam ahead as far as I'm concerned.

from stdlib.

ivan-pi avatar ivan-pi commented on May 25, 2024

From my experience with using LAPACK routines in some fluid-solvers (immersed boundary method, lattice boltzmann equation, etc) and PDE routines (orthogonal collocation), the right hand-side is rarely required after the solution of the system. If it is needed the user can always make a copy beforehand. Would the A matrix be returned factorize or would there be another copy?

real, allocatable(:) :: x, b
real, allocatable(:) :: A
A = reshape([1,3,2,4],[2,2])
b = [5, 6]
x = b  ! user creates the copy
call solve(A,x) ! solve A*x=b

If you follow a more functional style you could write

x = solve(A,b)

The annoying thing is that a subroutine and a function cannot be in the same generic interface block (perhaps something to discuss in the J3 Fortran proposal), meaning we would need two routines with different names.

While I like the idea to have convenience routines, I think it is also worth considering a more object-oriented interface, similar to the one in the Eigen library: http://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra.html. A major difference compared to Eigen is that in Fortran we would like to use native real arrays and not some abstract matrix data types, meaning the solvers would likely become the abstract data type, e.g.:

type(linalg_solver) :: solver
! ... create A and b ...
call solver%init(A,factorization="LU") 
x = solver%solve(b)

from stdlib.

certik avatar certik commented on May 25, 2024

@milancurcic, @ivan-pi I think you just demonstrated that we need both "copy" and "inplace" API. Julia does that, they append ! after the name of functions that modify arguments inplace, e.g. cholesky vs cholesky!. Fortran does not allow ! as part of the name, so we need some other convention. Any ideas?

Finally, @ivan-pi also has a good point that besides the direct "copy" and "inplace" API, some people might also prefer an object oriented API. So I think we might need 3 APIs.

@milancurcic so I can submit a PR for eig. But I would like new code to land in some kind of "experimental" namespace in stdlib. So that we can start using it and trying it out, but to reserve a right to change the API, if we discover issues. Only after some time and real usage in real codes, we should consider moving a functionality from experimental to the actual stdlib, after which we must support the API in a backwards compatible way.

from stdlib.

certik avatar certik commented on May 25, 2024

@jacobwilliams, @marshallward do you have any thoughts on this workflow? Let's just concentrate on eig for now. We'll tackle solve and others later.

from stdlib.

cmacmackin avatar cmacmackin commented on May 25, 2024

Personally, my preference would be for using derived types to store the matrix and/or its factorisation. With LAPACK, the factorisation requires more information than can be stored in the original matrix (e.g., pivot information), requiring additional arrays to be passed in. There are use-cases that reuse the factorisation, so we don't want that information to be lost. An example of an OOP approach can be found here: https://github.com/cmacmackin/OOP-Fortran-Examples/blob/master/04-lapack-wrapper.f90.

In terms of returning multiple results, another approach, aside from using a subroutine, would be to define a transparent derived type with these results as components and just to return that.

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024

Fortran does not allow ! as part of the name, so we need some other convention. Any ideas?

In case of eig, it seems that both variants would need to be subroutines. The number of arguments will be different for the "copy" and "in-place" specific procedures. Can we then use a generic interface for both?

from stdlib.

certik avatar certik commented on May 25, 2024

I agree with your plan to start with eig, then possibly an inplace variant, and then possibly an OO interface.

from stdlib.

certik avatar certik commented on May 25, 2024

Regarding argument ordering: NumPy, Matlab and Julia just return the result. So we do not have a prior implementation. To me the natural thing is to do output arguments at the end of the argument list. So the API is then very similar to NumPy: one simply calls eig() like you would in NumPy, and then one looks up the output arguments and appends them.

Which ordering do people prefer?

from stdlib.

ivan-pi avatar ivan-pi commented on May 25, 2024
1. Should sparse arrays be considered here?  Or separately?  (This can also be handled later, btw!)

Since sparse arrays and sparse matrix solvers use quite different data structures than dense arrays and factorizations, I would prefer to see them in a separate module. That is also how it is done in Scipy (https://docs.scipy.org/doc/scipy/reference/sparse.html).

from stdlib.

certik avatar certik commented on May 25, 2024

@milancurcic I like your idea. My only worry is if this can work for all our routines or not. If not, then for consistency reason I would prefer to be explicit. Some ideas to use for eig!:

  • eig_inplace
  • eig_i or eig_I
  • eigI

From these I like the eigI version the most. It's visually similar to eig!, and I is for inplace.

from stdlib.

jvdp1 avatar jvdp1 commented on May 25, 2024
1. Should sparse arrays be considered here?  Or separately?  (This can also be handled later, btw!)

Since sparse arrays and sparse matrix solvers use quite different data structures than dense arrays and factorizations, I would prefer to see them in a separate module. That is also how it is done in Scipy (https://docs.scipy.org/doc/scipy/reference/sparse.html).

I agree with @ivan-pi: I would also prefer to see sparse matrices in a separate module, especially that there are mutiple ways to store sparse matrices.
Are sparse matrices something of high interest by the users? We could start with simple formats (e.g., COO and CRS3) and associate them to e.g., sparse BLAS?

from stdlib.

epagone avatar epagone commented on May 25, 2024

Example of using generic interface to specific implementations:

interface :: eig
  module procedure :: deig_copy, deig_inplace
  ...
end interface eig

contains

subroutine deig_copy(A, lam, c)
  real(dp), intent(in) :: A(:, :)  ! matrix A
  complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
  complex(dp), intent(out) :: c(:, :)
  ...

subroutine deig_inplace(A, lam)
  real(dp), intent(in out) :: A(:, :)  ! matrix A and result c
  complex(dp), intent(out) :: lam(:)  ! eigenvalues: A c = lam c
  ...

Why not defining only one deig subroutine with optional argument c? Are there efficiency penalties? Or it is to keep the implementation more streamlined?

from stdlib.

milancurcic avatar milancurcic commented on May 25, 2024

Some ideas to use for eig!:

  • eig_inplace
  • eig_i or eig_I`
  • eigI

From these I like the eigI version the most. It's visually similar to eig!, and I is for inplace.

I like eig_inplace and eig_i best, but eigI is okay. I'm not the likely user of this procedure, so happy to defer the vote to others. :)

from stdlib.

certik avatar certik commented on May 25, 2024

We can do eig_inplace. Presumably it will not be used as often as just eig, and so it's ok to make it a bit more verbose to know what it is doing.

from stdlib.

certik avatar certik commented on May 25, 2024

One thing we should try to do is to keep the serial API consistent with the possible parallel API in #67.

from stdlib.

certik avatar certik commented on May 25, 2024

I split the in-place discussion into its own issue #177.

from stdlib.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.