Comments (23)
@certik Mostly just agreement on my part. I am not a big user of linear algebra so I'm reluctant to weigh in here too heavily. But since you asked...
-
I agree subroutine is the best option. You could use a type to bundle eigenvalues and eigenvectors, but then the user would have to
use
both the subroutine and the type which is clunky.-
A related consideration is ordering of arguments: output first or input first? I only ask because the decision will likely propagate into later functions. (And to tell the truth, I am not sure which I prefer...)
-
I think @ivan-pi's suggestion of a solver object is interesting, but could also be designed to support the simple subroutine API.
-
-
Agreed that allocation should not be used for output, and internally stack should be used if possible. Users need to control OS resources.
-
I think in-place does need to be considered at some level; linalg operations are great at blowing away memory (again, users should control resources). Not sure if it should be in a separate function, but I think it ought to be at least provisionally resolved before development starts.
-
Agree that users ought to have algorithm control, e.g. stiff matrices, or reproducibility (#12). But I also think it can be addressed later in development an optional argument.
- Should sparse arrays be considered here? Or separately? (This can also be handled later, btw!)
Mostly though I agree with @milancurcic - make the 🍌 !
from stdlib.
@milancurcic, @ivan-pi I think you just demonstrated that we need both "copy" and "inplace" API.
At best, I'd say we may need both. Let's focus on one, and consider another later.
I'd caution against involving derived types or OOP here from the get-go. Obviously there are implementations with procedures only. I suggest taking an incremental approach here:
- Implement the most basic building block, say,
eig
that doesn't modify in-place (makes a copy); - Evaluate and hear from community if in-place variant is really needed. It's likely to be more computationally efficient, albeit slightly less elegant. Does the community want it? Great! Let's do it;
- Evaluate the basic 2 implementations and discuss if higher-level abstractions (derived types) are needed.
In summary, let's make a banana first, and jungle later. Gorilla will come for the banana sooner or later. Let's worry about it then.
from stdlib.
Example of using generic interface to specific implementations:
interface :: eig
module procedure :: deig_copy, deig_inplace
...
end interface eig
contains
subroutine deig_copy(A, lam, c)
real(dp), intent(in) :: A(:, :) ! matrix A
complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c
complex(dp), intent(out) :: c(:, :)
...
subroutine deig_inplace(A, lam)
real(dp), intent(in out) :: A(:, :) ! matrix A and result c
complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c
...
from stdlib.
It could be done with optional
argument instead. I don't know if there are performance penalties. I like specific procedures + generic interface better as its more explicit and clear what it does, even if more verbose. With optional arguments, A
would be intent(in out)
in both modes, and it's modified only in one.
from stdlib.
Let's discuss the API for eig
. Fortran cannot return more than one result in a function, and eig
must return both eigenvalues and eigenvectors. So the only solution is a subroutine. The first argument should be the input matrix, to be consistent with NumPy and Matlab. The second argument can be an optional B
matrix for the generalized eigenvector problem. The next argument can be eigenvalues and the last argument eigenvectors (NumPy, Julia style) or they can be switched (Matlab style). For efficiency reasons the eigenvalues should be just a vector (NumPy, Julia style) not a diagonal matrix (Matlab style: the default is a matrix, but the eigvalOption
input option can specify vector
to return a vector as well).
That reasoning leads to the following API:
interface eig
module procedure seig
module procedure seig_generalized
module procedure deig
module procedure deig_generalized
module procedure zeig
module procedure zeig_generalized
end interface eig
Here s
means single precision, d
double, z
complex (double). Here is an example of a double precision interface:
subroutine deig(A, lam, c)
! Solves a standard eigenproblem A*c = lam*c
real(dp), intent(in) :: A(:, :) ! matrix A
complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c
complex(dp), intent(out) :: c(:, :) ! eigenvectors: A c = lam c; c(i,j) = ith component of jth vec.
...
end subroutine
subroutine deig_generalized(A, B, lam, c)
! Solves a generalized eigenproblem A*c = lam*B*c
real(dp), intent(in) :: A(:, :) ! matrix A
real(dp), intent(in) :: B(:, :) ! matrix B
complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c
complex(dp), intent(out) :: c(:, :) ! eigenvectors: A c = lam c; c(i,j) = ith component of jth vec.
...
end subroutine
from stdlib.
Here are some common points that should be discussed:
-
Should the return arrays be
allocatable, intent(out)
? I would say no, for efficiency reasons. -
Should there be some optional options such as
algorithm
in the Matlab's API? I think there could. -
We need to implement all combinations of types. The above proposal assumes real input matrices. We should also allow a complex input matrix.
-
Modify inplace? Lapack sometimes returns the result by overwriting the input matrix, although in the case of
eig
that does not seem to be the case (it is the case forcholesky
), but the input matrix is destroyed, so internally there is an extra copy happening. The simple solution is the above proposal, that requires an extra copy, but the input matrix is never modified. Another solution is to have two kinds ofeig
subroutines, the one above, and another one calledeig_inplace
which will modify the matrixA
and return the eigenvalues there.
from stdlib.
@cmacmackin, @milancurcic, @septcolor what do you think of this proposal? I chose it so that it's something we might be able to agree quickly, and then we can figure out the proper workflow.
from stdlib.
I'm not experienced with linalg stuff, but all looks reasonable. Procedure names are short and intuitive.
I agree that allocatable
for intent(out)
dummy arguments should be avoided unless necessary.
I don't like APIs that modify variables in-place, so I suggest we make an extra copy in default implementation, but if desired by the community we can also have a variant with an intent(in out)
argument as you suggested.
Otherwise, full steam ahead as far as I'm concerned.
from stdlib.
From my experience with using LAPACK routines in some fluid-solvers (immersed boundary method, lattice boltzmann equation, etc) and PDE routines (orthogonal collocation), the right hand-side is rarely required after the solution of the system. If it is needed the user can always make a copy beforehand. Would the A matrix be returned factorize or would there be another copy?
real, allocatable(:) :: x, b
real, allocatable(:) :: A
A = reshape([1,3,2,4],[2,2])
b = [5, 6]
x = b ! user creates the copy
call solve(A,x) ! solve A*x=b
If you follow a more functional style you could write
x = solve(A,b)
The annoying thing is that a subroutine and a function cannot be in the same generic interface block (perhaps something to discuss in the J3 Fortran proposal), meaning we would need two routines with different names.
While I like the idea to have convenience routines, I think it is also worth considering a more object-oriented interface, similar to the one in the Eigen library: http://eigen.tuxfamily.org/dox/group__TutorialLinearAlgebra.html. A major difference compared to Eigen is that in Fortran we would like to use native real arrays and not some abstract matrix data types, meaning the solvers would likely become the abstract data type, e.g.:
type(linalg_solver) :: solver
! ... create A and b ...
call solver%init(A,factorization="LU")
x = solver%solve(b)
from stdlib.
@milancurcic, @ivan-pi I think you just demonstrated that we need both "copy" and "inplace" API. Julia does that, they append !
after the name of functions that modify arguments inplace, e.g. cholesky vs cholesky!. Fortran does not allow !
as part of the name, so we need some other convention. Any ideas?
Finally, @ivan-pi also has a good point that besides the direct "copy" and "inplace" API, some people might also prefer an object oriented API. So I think we might need 3 APIs.
@milancurcic so I can submit a PR for eig
. But I would like new code to land in some kind of "experimental" namespace in stdlib
. So that we can start using it and trying it out, but to reserve a right to change the API, if we discover issues. Only after some time and real usage in real codes, we should consider moving a functionality from experimental to the actual stdlib, after which we must support the API in a backwards compatible way.
from stdlib.
@jacobwilliams, @marshallward do you have any thoughts on this workflow? Let's just concentrate on eig
for now. We'll tackle solve
and others later.
from stdlib.
Personally, my preference would be for using derived types to store the matrix and/or its factorisation. With LAPACK, the factorisation requires more information than can be stored in the original matrix (e.g., pivot information), requiring additional arrays to be passed in. There are use-cases that reuse the factorisation, so we don't want that information to be lost. An example of an OOP approach can be found here: https://github.com/cmacmackin/OOP-Fortran-Examples/blob/master/04-lapack-wrapper.f90.
In terms of returning multiple results, another approach, aside from using a subroutine, would be to define a transparent derived type with these results as components and just to return that.
from stdlib.
Fortran does not allow ! as part of the name, so we need some other convention. Any ideas?
In case of eig
, it seems that both variants would need to be subroutines. The number of arguments will be different for the "copy" and "in-place" specific procedures. Can we then use a generic interface for both?
from stdlib.
I agree with your plan to start with eig
, then possibly an inplace variant, and then possibly an OO interface.
from stdlib.
Regarding argument ordering: NumPy, Matlab and Julia just return the result. So we do not have a prior implementation. To me the natural thing is to do output arguments at the end of the argument list. So the API is then very similar to NumPy: one simply calls eig()
like you would in NumPy, and then one looks up the output arguments and appends them.
Which ordering do people prefer?
from stdlib.
1. Should sparse arrays be considered here? Or separately? (This can also be handled later, btw!)
Since sparse arrays and sparse matrix solvers use quite different data structures than dense arrays and factorizations, I would prefer to see them in a separate module. That is also how it is done in Scipy (https://docs.scipy.org/doc/scipy/reference/sparse.html).
from stdlib.
@milancurcic I like your idea. My only worry is if this can work for all our routines or not. If not, then for consistency reason I would prefer to be explicit. Some ideas to use for eig!
:
eig_inplace
eig_i
oreig_I
eigI
From these I like the eigI
version the most. It's visually similar to eig!
, and I
is for inplace.
from stdlib.
1. Should sparse arrays be considered here? Or separately? (This can also be handled later, btw!)
Since sparse arrays and sparse matrix solvers use quite different data structures than dense arrays and factorizations, I would prefer to see them in a separate module. That is also how it is done in Scipy (https://docs.scipy.org/doc/scipy/reference/sparse.html).
I agree with @ivan-pi: I would also prefer to see sparse matrices in a separate module, especially that there are mutiple ways to store sparse matrices.
Are sparse matrices something of high interest by the users? We could start with simple formats (e.g., COO and CRS3) and associate them to e.g., sparse BLAS?
from stdlib.
Example of using generic interface to specific implementations:
interface :: eig module procedure :: deig_copy, deig_inplace ... end interface eig contains subroutine deig_copy(A, lam, c) real(dp), intent(in) :: A(:, :) ! matrix A complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c complex(dp), intent(out) :: c(:, :) ... subroutine deig_inplace(A, lam) real(dp), intent(in out) :: A(:, :) ! matrix A and result c complex(dp), intent(out) :: lam(:) ! eigenvalues: A c = lam c ...
Why not defining only one deig
subroutine with optional argument c
? Are there efficiency penalties? Or it is to keep the implementation more streamlined?
from stdlib.
Some ideas to use for eig!:
eig_inplace
eig_i
or eig_I`eigI
From these I like the
eigI
version the most. It's visually similar toeig!
, andI
is for inplace.
I like eig_inplace
and eig_i
best, but eigI
is okay. I'm not the likely user of this procedure, so happy to defer the vote to others. :)
from stdlib.
We can do eig_inplace
. Presumably it will not be used as often as just eig
, and so it's ok to make it a bit more verbose to know what it is doing.
from stdlib.
One thing we should try to do is to keep the serial API consistent with the possible parallel API in #67.
from stdlib.
I split the in-place discussion into its own issue #177.
from stdlib.
Related Issues (20)
- Support sorting arrays of bitsets HOT 2
- bitset count performance HOT 10
- A runtime error occurs when assigning a variable of `bitset_large` HOT 7
- Incorrect result for gamma functions of pure imaginary argument HOT 5
- Test: string_intrinsic (move) failed built with GCC 13.2 HOT 3
- Adding a procedure getting all keys in the hashmaps HOT 2
- Improve fpm build in README for faster getting started HOT 1
- test fault by string_intrisic HOT 1
- 'Sleep' not working when compiled in Win32 (x86) HOT 9
- Compilation error on Linux. Disabling hash/hashmap files compiles rest of the project successfully. HOT 6
- `string_intrinsic (Subprocess aborted)` failed HOT 7
- `bits()` in `bitset_type` returned a non-zero value even though no value is set. HOT 1
- Support for linear algebra HOT 6
- Hashtable write error on exit HOT 7
- use stdlib_math, only : gcd HOT 4
- The file src/stdlib_hashmap_chaining.f90 and two others are set as executable HOT 1
- Support for I/O of standard formats HOT 10
- Memory mapped features HOT 9
- Add `.gitignore` to `stdlib-fpm` deployment for generated test files HOT 1
- hash_functions test fails on i386: `Segmentation fault - invalid memory reference`
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from stdlib.