cgarling / StarFormationHistories.jl

Fitting astrophysical star formation histories via CMD modelling. License: MIT License.
With dense isochrone grids (> 1000 isochrones), the primary compute time for SFH optimization is spent in the `composite!` function, which does the linear algebra to build the composite model. In `fixed_amr`, the relative weights due to age-metallicity relations and metallicity distribution functions are fixed at runtime, so the only optimization parameters are effectively star formation rates. As such, we could exclude isochrones with very low relative weights from the computation at runtime with (hopefully) minimal impact on convergence and fit quality. I'm thinking of adding a keyword argument, something like `relweightcut`, such that when considering the set of all relative weights `aweights` for isochrones of age `a`, isochrones with `aweights[i] <= relweightcut * maximum(aweights)` would be excluded from the composite model. With grids that are quite wide in metallicity space this could lead to speedups of ~2-5x. The default should be for this behavior to be disabled, but it's a nice feature for applications where you might run `fixed_amr` many times in a loop (e.g., Monte Carlo applications), as long as it doesn't impact accuracy significantly.

In principle this could additionally be implemented in the methods that fit the age-metallicity relation, but you would have to redefine the selection in every loop iteration, and since the metallicity parameters are then being fit I worry that truncating the isochrone list could lead to instability or inaccuracy, so this application is low priority.
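The proposed `relweightcut` selection could be as simple as the following sketch (the function name is hypothetical; only the cut rule above is from the proposal):

```julia
# Sketch of the proposed `relweightcut` filtering. Isochrones with
# aweights[i] <= relweightcut * maximum(aweights) are excluded;
# with relweightcut = 0 every isochrone with nonzero weight is kept.
function keep_indices(aweights::AbstractVector{<:Real}, relweightcut::Real=0)
    wmax = maximum(aweights)
    return findall(w -> w > relweightcut * wmax, aweights)
end

keep_indices([1.0, 0.05, 0.4, 1e-4], 0.01)  # drops only the 1e-4 isochrone
```

The kept indices would then select the subset of models passed into `composite!`, so the cost of the linear algebra scales with the surviving isochrones only.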
Should they be `exp.(sqrt.(diag(Optim.trace(result.mle.result)[end].metadata["~inv(H)"])) .* Optim.minimizer(result.mle.result))`? Because you calculate the standard deviations in the transformed (log) space as `sqrt.(diag(Optim.trace(result.mle.result)[end].metadata["~inv(H)"]))`, then you have to scale them by the means in the transformed space, which are `Optim.minimizer(result.mle.result)`, and then transform them. Realistically, I don't think you can directly transform standard errors like this anyway, so the more robust approach is probably to compute a confidence interval in the transformed (log) space and then transform that back to the original space. This is somewhat low priority for now.
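The confidence-interval approach is straightforward because interval endpoints transform monotonically under `exp` even though standard errors do not. A minimal sketch, where `mu_log` and `sigma_log` stand in for `Optim.minimizer(...)` and the `sqrt.(diag(...))` expression above:

```julia
# Form the interval in the transformed (log) space, then exponentiate
# the endpoints. The resulting interval is asymmetric about exp.(mu_log),
# which is why rescaling the log-space standard errors is not equivalent.
function transformed_ci(mu_log, sigma_log; z = 1.96)  # z = 1.96 → ~95% CI
    lo = exp.(mu_log .- z .* sigma_log)
    hi = exp.(mu_log .+ z .* sigma_log)
    return lo, hi
end

lo, hi = transformed_ci([0.0, 1.0], [0.1, 0.2])
```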
If the templates are normalized to total stellar mass (not SFR), then setting uniform priors on their coefficients *does not* set uniform priors on their SFRs. These priors are not yet properly dealt with. I think the easiest way might be to allow passing `template_logAge` through to the methods that use priors, namely `fit_templates`, `fit_templates_mdf`, `hmc_sample`, and `hmc_sample_mdf`. You could then derive the timestep for each template and include a prior that accounts for the non-uniform timesteps.
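Deriving the timesteps might look like the sketch below, assuming `template_logAge` holds sorted `log10(age / yr)` values and each template covers the bin up to the next grid point; how to close the final bin is a choice, and here the last logarithmic step is simply repeated:

```julia
# Convert a sorted log10(age/yr) grid into linear-time bin widths.
# The final bin is closed by repeating the last logarithmic step,
# which is an assumption, not established package behavior.
function bin_widths(logAge::AbstractVector{<:Real})
    Δ = logAge[end] - logAge[end-1]
    edges = exp10.(vcat(logAge, logAge[end] + Δ))
    return diff(edges)  # width of each template's age bin in years
end
```

A mass coefficient `c_i` then corresponds to an SFR of `c_i / bin_widths(template_logAge)[i]`, which is what a uniform-in-SFR prior would need to weight by.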
For large isochrone grids, the vast majority of the computation time is spent in the computational kernels in `fitting/fitting_base.jl`, doing things like constructing the composite model from a linear combination of single-population models and computing the associated loglikelihood and gradient. The loglikelihood and gradient contain slightly more complex logic, but the `composite!` method is just linear algebra and could be gainfully accelerated on GPUs. Would be worth trying with Metal.jl, probably through a package extension: see the dev docs and the discussion on Julia < 1.9.
A main source of heap allocations seems to be that when interpolants from Interpolations.jl are constructed with vector-valued data, they return `Vector`s that allocate on the heap, and there does not seem to be an in-place evaluation method. If we instead (probably in `ingest_mags`?) convert the mags from `Vector{Vector{<:Real}}` to `Vector{StaticArrays.SVector{N, <:Real}}`, we can greatly reduce heap allocation and potentially improve speed as well; an initial test with 2 filters showed a change from 256 bytes of heap and 105 ns per evaluation to 48 bytes and 26 ns. Our entire loop iteration for the Chabrier2001LogNormal IMF takes ~200 ns, so this could be a significant savings.
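The allocation difference can be illustrated with stdlib `NTuple`s, which, like `StaticArrays.SVector`, are fixed-size and stack-allocated; this is only an illustration of the principle, not the package code:

```julia
# A Vector-valued result must be heap-allocated on every call, while a
# fixed-size NTuple (or an SVector) lives on the stack / in registers.
lerp_vec(a::Vector{Float64}, b::Vector{Float64}, t) =
    (1 - t) .* a .+ t .* b                         # allocates a new Vector
lerp_tup(a::NTuple{2,Float64}, b::NTuple{2,Float64}, t) =
    ntuple(i -> (1 - t) * a[i] + t * b[i], 2)      # no heap allocation

lerp_vec([1.0, 2.0], [3.0, 4.0], 0.5)        # == [2.0, 3.0]
lerp_tup((1.0, 2.0), (3.0, 4.0), 0.5)        # == (2.0, 3.0)
```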
LoopVectorization.jl is being deprecated for Julia 1.11+, so we should consider moving away from it now. We should fall back to basic `@simd`, or even remove the implementations that use these custom loops in favor of the `mul!` calls.
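If the custom loops go away, the composite construction reduces to a single matrix-vector product. A minimal `mul!` sketch, assuming the per-population Hess diagrams can be stacked as columns of a (precomputable) matrix; the function name is hypothetical:

```julia
using LinearAlgebra

# Build the composite model as a matrix-vector product: stack each model's
# pixels as a column of M, so composite .= Σᵢ coeffs[i] * models[i].
function composite_mul!(composite::Matrix{T}, models::Vector{Matrix{T}},
                        coeffs::Vector{T}) where T
    M = reduce(hcat, vec.(models))   # (npixels, nmodels); precompute in practice
    mul!(vec(composite), M, coeffs)  # vec(composite) shares memory, so this
    return composite                 # writes the result in place
end
```

Beyond simplicity, the `mul!` form hands the work to BLAS (or a GPU BLAS via an extension), which is the point of the GPU note above.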
Most of the methods assume one-based indexing but accept abstract array types (e.g., `AbstractVector`). Prior to release we should either check this assumption with `Base.require_one_based_indexing` on the input arrays or generalize the indexing, but I suspect the latter will not be worth the work.
For CMDs with significant contamination, fitting age-metallicity relation variables (i.e., with what is now `fit_templates_mdf`) can produce very unphysical results. We should include some weak default priors to hopefully prevent these errors. These priors would ideally be user-facing via keyword arguments so they could be overridden if desired. The problem with accepting an arbitrary user-provided function is that it needs to be differentiable for use in the solvers. I suppose we could rely on autodiff, and for analytic derivatives I believe there are standard APIs (maybe `rrule` and `frule` from ChainRulesCore.jl)?
Using `threadid` to index per-thread state is not safe due to the possibility of task yields; see this blog post.
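The usual fix is to give each spawned task its own state (one task per chunk of work) instead of indexing a shared buffer by `threadid()`. A stdlib-only sketch:

```julia
# Instead of accumulating into buffers[Threads.threadid()], spawn one task
# per chunk; a task that migrates between threads after a yield is then
# harmless because it owns its accumulator.
function threaded_sum(xs::Vector{Float64}; ntasks = Threads.nthreads())
    n = max(1, cld(length(xs), ntasks))                 # chunk length
    tasks = [Threads.@spawn sum(chunk)
             for chunk in Iterators.partition(xs, n)]   # one task per chunk
    return sum(fetch.(tasks))
end
```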
Pathfinder.jl has some interesting utilities for obtaining inverse Hessians (and draws from multivariate normal approximations) using BFGS and LBFGS from Optim.jl and other packages. We might be able to find some code to adapt, or perhaps we could use some of the Pathfinder functions directly. I think `fit_mvnormals` might be a good function to look at.
A more complete description of observational errors and completeness requires all magnitudes at once, rather than the per-magnitude implementation we have currently. The current implementation is meant to be simple, but this generalization could make it much more useful. The case where the provided `errfuncs` and `completefuncs` are vectors could then be handled by a special-case method using an alternate dispatch signature.
The initial implementation of MCMC sampling with free distance in `fitting/mcmc_sampling_distance.jl` looks good, but it does not yet have a convenience method like `mcmc_sample`, any documentation, or any tests; these will need to be added. Constructing the `x0` for the walkers is a bit tricky, so remember to explain that.

We should probably start a new section in the documentation for alternative SFH implementations with extra parameters, so we can keep these specialized methods separate from the main, simple implementations.

Next we will try MCMC + distance + fixed MDF. We might also want to do MCMC + distance + linear AMR model.
Wondering if the `@turbo` macro will work better (in `composite!`, for example) if `models` is a 3-D array rather than a 1-D vector of 2-D matrices. Worth checking out.
The `Binaries` type currently implements random pairing of stars sampled from the provided IMF. This is not actually a very realistic binary model. Perhaps we should rename it to `RandomBinaryPairs` or something, and create another sampler that implements a uniform mass-ratio model. In that case the provided IMF would actually give the distribution of system masses; we would then sample the mass ratio for binary systems uniformly (see, for example, https://academic.oup.com/mnrasl/article/430/1/L6/1181335).
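The uniform mass-ratio sampler could look like the following sketch, where `rand_system_mass` stands in for a draw from the user-provided IMF (all names here are hypothetical):

```julia
using Random

# Uniform mass-ratio binary model: the IMF gives the *system* mass M,
# and q = m2/m1 is drawn uniformly on [0, 1). Then
# m1 + m2 == M and m2/m1 == q by construction.
function sample_binary(rand_system_mass; rng = Random.default_rng())
    M = rand_system_mass()
    q = rand(rng)              # mass ratio m2/m1 ~ Uniform(0, 1)
    m1 = M / (1 + q)           # primary
    m2 = M * q / (1 + q)       # secondary
    return m1, m2
end
```

This differs from random pairing in that the component masses are derived from the system mass rather than drawn independently from the IMF.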
Optimizations using Optim.jl's BFGS implementation seem to spend a lot of time in BLAS calls. By default BLAS is multi-threaded, but for problems of our size this threading is not efficient. Worth profiling and examining more closely.
We might be able to increase the efficiency of the loglikelihood gradient computation in `fitting.jl` by defining a custom method that derives the full gradient (every partial) simultaneously and builds a dense representation of `(data ./ composite)`. As it stands, calculating each partial separately recomputes this for every partial. Our solves are still quite fast, but I think this would be an improvement.
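Assuming the Poisson loglikelihood used elsewhere in the package, each partial is `∂LL/∂cₖ = Σᵢⱼ (data/composite − 1) * modelₖ`, so the ratio term can be built once and shared. A minimal sketch (the function name is hypothetical):

```julia
using LinearAlgebra

# Build R = data ./ composite .- 1 once; every partial of the Poisson
# loglikelihood is then just a dot product against one model's pixels.
function full_gradient(data::Matrix{Float64}, composite::Matrix{Float64},
                       models::Vector{Matrix{Float64}})
    R = data ./ composite .- 1          # dense representation, built once
    return [dot(vec(R), vec(m)) for m in models]
end
```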
Currently going with Jupyter notebooks because they're easy and render natively on GitHub, but they don't lend themselves well to inclusion in the docs pages. It might be better to go with something like Literate.jl or Weave.jl, which are explicitly designed to support conversion to easily hostable formats.
Would like to write the linear AMR as [equation] rather than [equation] as it currently is. This will make the presentation clearer in the paper, as the logarithmic AMR is [equation], so the two have a common form. This also allows us to enforce [constraint].
Currently this only considers spacing along a single magnitude (the y-axis mag); it should consider the overall distance metric on the CMD, i.e., both the y-axis magnitude and the x-axis color. Checking only the y-axis spacing can cause problems for horizontal or diagonal features like the horizontal branch or blue loop.
Would like to support masking of the Hess diagrams (models, data) in the basic functions in `src/fitting/fitting_base.jl`, to support situations like masking the horizontal branch. Adding masking to `composite!` hurts performance, but we could probably do it more performantly in the loglikelihood calculations.
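Masking inside the loglikelihood could be as simple as skipping bins during the reduction, which leaves `composite!` untouched. A sketch, assuming the Poisson loglikelihood (up to a data-dependent constant); the function name is hypothetical:

```julia
# Masked bins (mask[i] == false) simply don't contribute to the sum, so
# the composite model construction itself needs no changes.
function masked_loglikelihood(composite::Matrix{Float64}, data::Matrix{Float64},
                              mask::BitMatrix)
    ll = 0.0
    for i in eachindex(composite, data, mask)
        mask[i] || continue
        ll += data[i] * log(composite[i]) - composite[i]  # Poisson, up to const
    end
    return ll
end
```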
I would like to be able to import an isochrone table during the GitHub Actions CI run, to test some of the functions whose behavior with test / random data may be hard to predict. I should be able to run `wget` or `curl` commands in the action, so if I can stage the file somewhere accessible, I should be able to download it during the action by putting the URL / authentication details in a repository secret. This may work for actions, but it would complicate running the tests locally. It may simply be best to distribute a small isochrone table for testing purposes, or to have the test script auto-download it if it's not present. Not sure what the best practice is here.
In the `Gaussian2D-partials` branch I have some code that uses properly covariant kernels for the case where the filter on the `y` axis of the CMD also appears in the color on the `x` axis. The implementation was poor and it didn't make much difference to the results at the time, so only some of that code remains in the main branch. The main problem was that you had to manually specify which type of covariance to use, which was annoying.

We can deal with this another way by keeping the current `partial_cmd_smooth` (which does not include covariance) but adding a new method that requires passing a table/dictionary/dataframe etc. along with the entries in that table that give the x- and y-axis filters. This is similar to the approach taken in `simulate.jl`, where you pass `mags`, `mag_names`, and `mag_lim_name` to determine which magnitude should be used to calculate the limit. In this case you would pass all the necessary single-band mags (rather than just the `x` and `y` coordinates); we can then internally determine whether the `y`-axis mag appears in the `x`-axis color and use the correct covariant kernel. When it doesn't, we can just call the current `partial_cmd_smooth`, which would be easy.
It seems we get reasonable scaling from setting `thread=true` for `LoopVectorization.@turbo` in the basic functions `loglikelihood`, `∇loglikelihood`, and `∇loglikelihood!`, but it currently breaks the parallel `hmc_sample` method when used with matplotlib, due to a GIL error on the Python side.
Currently using the Poisson likelihood ratio (PLR), but I think we can just use the simple Poisson loglikelihood instead. In the PLR the factorial in the denominator cancels with the numerator, but when taking the logarithm of the likelihood the factorial becomes a log-factorial, which has an efficient implementation in SpecialFunctions.jl that we can use, so it's not clear what benefit the PLR has. We should at least look at using the Poisson likelihood directly.
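The direct Poisson loglikelihood could look like the sketch below. SpecialFunctions.jl's `logfactorial` would be the efficient choice; `logfact` here is a naive stdlib stand-in. Note that the log-factorial term depends only on the data, so it is constant during optimization and can be dropped there anyway:

```julia
# Direct Poisson loglikelihood over Hess-diagram bins:
#   LL = Σᵢ [ dᵢ log(mᵢ) − mᵢ − log(dᵢ!) ]
# `logfact` is a naive stand-in for SpecialFunctions.logfactorial.
logfact(n::Integer) = sum(log, 1:n; init = 0.0)

poisson_loglikelihood(model::AbstractArray, data::AbstractArray{<:Integer}) =
    sum(d * log(m) - m - logfact(d) for (m, d) in zip(model, data))
```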
Worth looking into utilities or models for creating foreground MW contamination predictions. I just saw a reference in Dan Weisz's 2012 paper on Leo T, but there are surely more recent results (e.g., TRILEGAL's recent update). It would be nice to have a model we could program ourselves rather than having to make web queries or something like that.
Given the way that we are structuring the age-metallicity relation and using a mix of metallicities at each unique age to simulate imperfect ISM mixing, it should be straightforward to derive a mass-weighted metallicity distribution function (MDF) for the entire system. Deriving the number-weighted MDF, which is more relevant for comparison to observational studies, may be as simple as calculating the average number of stars per stellar mass formed for each population to translate the mass weighting to a number weighting. Would like a method for each AMR model that takes in the AMR coefficients and the SFR coefficients and returns something like a histogram or KDE giving the mass and/or number weighted MDF.
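The mass-weighted version of this could be a simple aggregation over the per-template coefficients, assuming they are proportional to the stellar mass formed in each (age, [M/H]) template; the number weighting would additionally rescale each bin by stars-per-unit-mass, which is omitted here. All names are hypothetical:

```julia
# Mass-weighted MDF: sum the per-template stellar-mass coefficients over
# age at each unique metallicity. `MH` and `coeffs` are parallel vectors
# over all (age, [M/H]) templates.
function mass_weighted_mdf(MH::Vector{Float64}, coeffs::Vector{Float64})
    mdf = Dict{Float64,Float64}()
    for (mh, c) in zip(MH, coeffs)
        mdf[mh] = get(mdf, mh, 0.0) + c       # accumulate mass per [M/H]
    end
    mhs = sort!(collect(keys(mdf)))
    weights = [mdf[mh] for mh in mhs]
    return mhs, weights ./ sum(weights)       # normalized histogram
end
```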