cgarling / StarFormationHistories.jl

Fitting astrophysical star formation histories via CMD modelling. License: MIT License.
With dense isochrone grids (> 1000 isochrones), the primary compute time for SFH optimization is spent in the `composite!` function, which does the linear algebra to build the composite model. In `fixed_amr`, the relative weights due to age-metallicity relations and metallicity distribution functions are fixed at runtime, so the only optimization parameters are effectively star formation rates. As such, we could exclude isochrones with very low relative weights from the computation at runtime with (hopefully) minimal impact on convergence and fit quality. I'm thinking of adding a keyword argument, something like `relweightcut`, such that when considering the set of all relative weights `aweights` for isochrones of age `a`, isochrones with `aweights[i] <= relweightcut * maximum(aweights)` would be excluded from the composite model. With grids that are quite wide in metallicity space this could lead to speedups of ~2-5x. The default should be for this behavior to be disabled, but it's a nice feature for applications where you might run `fixed_amr` many times in a loop (e.g., Monte Carlo applications), as long as it doesn't impact accuracy significantly.

In principle this could additionally be implemented in the methods that fit the age-metallicity relation, but you would have to redefine the selection in every loop iteration, and since the metallicity parameters are then being fit I worry that truncating the isochrone list could lead to instability or inaccuracy, so this application is low priority.
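The proposed `relweightcut` selection could be as simple as the following sketch (the function name is hypothetical; only the cut rule above is from the proposal):

```julia
# Sketch of the proposed `relweightcut` filtering. Isochrones with
# aweights[i] <= relweightcut * maximum(aweights) are excluded;
# with relweightcut = 0 every isochrone with nonzero weight is kept.
function keep_indices(aweights::AbstractVector{<:Real}, relweightcut::Real=0)
    wmax = maximum(aweights)
    return findall(w -> w > relweightcut * wmax, aweights)
end

keep_indices([1.0, 0.05, 0.4, 1e-4], 0.01)  # drops only the 1e-4 isochrone
```

The kept indices would then select the subset of models passed into `composite!`, so the cost of the linear algebra scales with the surviving isochrones only.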
Should they be `exp.(sqrt.(diag(Optim.trace(result.mle.result)[end].metadata["~inv(H)"])) .* Optim.minimizer(result.mle.result))`? Because you calculate the standard deviations in the transformed (log) space as `sqrt.(diag(Optim.trace(result.mle.result)[end].metadata["~inv(H)"]))`, then you have to scale them by the means in the transformed space, which are `Optim.minimizer(result.mle.result)`, and then transform them. Realistically, I don't think you can directly transform standard errors like this anyway, so the more robust approach is probably to compute a confidence interval in the transformed (log) space and then transform that back to the original space. This is somewhat low priority for now.
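The confidence-interval approach is straightforward because interval endpoints transform monotonically under `exp` even though standard errors do not. A minimal sketch, where `mu_log` and `sigma_log` stand in for `Optim.minimizer(...)` and the `sqrt.(diag(...))` expression above:

```julia
# Form the interval in the transformed (log) space, then exponentiate
# the endpoints. The resulting interval is asymmetric about exp.(mu_log),
# which is why rescaling the log-space standard errors is not equivalent.
function transformed_ci(mu_log, sigma_log; z = 1.96)  # z = 1.96 → ~95% CI
    lo = exp.(mu_log .- z .* sigma_log)
    hi = exp.(mu_log .+ z .* sigma_log)
    return lo, hi
end

lo, hi = transformed_ci([0.0, 1.0], [0.1, 0.2])
```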
If the templates are normalized to total stellar mass (not SFR), then setting uniform priors on their coefficients *does not* set uniform priors on their SFRs. These priors are not yet properly dealt with. I think the easiest way might be to allow passing `template_logAge` through to the methods that use priors, namely `fit_templates`, `fit_templates_mdf`, `hmc_sample`, and `hmc_sample_mdf`. You could then derive the timestep for each template and include a prior that accounts for the non-uniform timesteps.
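Deriving the timesteps might look like the sketch below, assuming `template_logAge` holds sorted `log10(age / yr)` values and each template covers the bin up to the next grid point; how to close the final bin is a choice, and here the last logarithmic step is simply repeated:

```julia
# Convert a sorted log10(age/yr) grid into linear-time bin widths.
# The final bin is closed by repeating the last logarithmic step,
# which is an assumption, not established package behavior.
function bin_widths(logAge::AbstractVector{<:Real})
    Δ = logAge[end] - logAge[end-1]
    edges = exp10.(vcat(logAge, logAge[end] + Δ))
    return diff(edges)  # width of each template's age bin in years
end
```

A mass coefficient `c_i` then corresponds to an SFR of `c_i / bin_widths(template_logAge)[i]`, which is what a uniform-in-SFR prior would need to weight by.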
For large isochrone grids, the vast majority of the computation time is spent in the computational kernels in `fitting/fitting_base.jl`, doing things like constructing the composite model from a linear combination of single-population models and computing the associated loglikelihood and gradient. The loglikelihood and gradient contain slightly more complex logic, but the `composite!` method is just linear algebra and could be gainfully accelerated on GPUs. Would be worth trying with Metal.jl, probably through a package extension: see the dev docs and the discussion on Julia < 1.9.
A main source of heap allocations seems to be that when interpolants from Interpolations.jl are constructed with vector-valued data, they return `Vector`s that allocate on the heap, and there does not seem to be an in-place evaluation method. If we instead (probably in `ingest_mags`?) convert the mags from `Vector{Vector{<:Real}}` to `Vector{StaticArrays.SVector{N, <:Real}}`, we can greatly reduce heap allocation and potentially improve speed as well; an initial test with 2 filters showed a change from 256 bytes of heap and 105 ns per evaluation to 48 bytes and 26 ns. Our entire loop iteration for the Chabrier2001LogNormal IMF takes ~200 ns, so this could be a significant savings.
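The allocation difference can be illustrated with stdlib `NTuple`s, which, like `StaticArrays.SVector`, are fixed-size and stack-allocated; this is only an illustration of the principle, not the package code:

```julia
# A Vector-valued result must be heap-allocated on every call, while a
# fixed-size NTuple (or an SVector) lives on the stack / in registers.
lerp_vec(a::Vector{Float64}, b::Vector{Float64}, t) =
    (1 - t) .* a .+ t .* b                         # allocates a new Vector
lerp_tup(a::NTuple{2,Float64}, b::NTuple{2,Float64}, t) =
    ntuple(i -> (1 - t) * a[i] + t * b[i], 2)      # no heap allocation

lerp_vec([1.0, 2.0], [3.0, 4.0], 0.5)        # == [2.0, 3.0]
lerp_tup((1.0, 2.0), (3.0, 4.0), 0.5)        # == (2.0, 3.0)
```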
LoopVectorization.jl is being deprecated for Julia 1.11+, so we should consider moving away from it now. We should fall back to basic `@simd`, or even remove the implementations that use these custom loops in favor of the `mul!` calls.
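If the custom loops go away, the composite construction reduces to a single matrix-vector product. A minimal `mul!` sketch, assuming the per-population Hess diagrams can be stacked as columns of a (precomputable) matrix; the function name is hypothetical:

```julia
using LinearAlgebra

# Build the composite model as a matrix-vector product: stack each model's
# pixels as a column of M, so composite .= Σᵢ coeffs[i] * models[i].
function composite_mul!(composite::Matrix{T}, models::Vector{Matrix{T}},
                        coeffs::Vector{T}) where T
    M = reduce(hcat, vec.(models))   # (npixels, nmodels); precompute in practice
    mul!(vec(composite), M, coeffs)  # vec(composite) shares memory, so this
    return composite                 # writes the result in place
end
```

Beyond simplicity, the `mul!` form hands the work to BLAS (or a GPU BLAS via an extension), which is the point of the GPU note above.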
Most of the methods assume one-based indexing but accept abstract array types (e.g., `AbstractVector`). Prior to release we should either check this assumption with `Base.require_one_based_indexing` on the input arrays or generalize the indexing, but I suspect the latter will not be worth the work.
For CMDs with significant contamination, fitting age-metallicity relation variables (i.e., with what is now `fit_templates_mdf`) can produce very unphysical results. We should include some weak default priors to hopefully prevent these errors. These priors would ideally be user-facing via keyword arguments so they could be overridden if desired. The problem with accepting an arbitrary user-provided function is that it needs to be differentiable for use in the solvers. I suppose we could rely on autodiff, and for analytic derivatives I believe there are standard APIs (maybe `rrule` and `frule` from ChainRulesCore.jl)?
Using `threadid` to index per-thread state is not safe due to the possibility of task yields; see this blog post.
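The usual fix is to give each spawned task its own state (one task per chunk of work) instead of indexing a shared buffer by `threadid()`. A stdlib-only sketch:

```julia
# Instead of accumulating into buffers[Threads.threadid()], spawn one task
# per chunk; a task that migrates between threads after a yield is then
# harmless because it owns its accumulator.
function threaded_sum(xs::Vector{Float64}; ntasks = Threads.nthreads())
    n = max(1, cld(length(xs), ntasks))                 # chunk length
    tasks = [Threads.@spawn sum(chunk)
             for chunk in Iterators.partition(xs, n)]   # one task per chunk
    return sum(fetch.(tasks))
end
```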
Pathfinder.jl has some interesting utilities for obtaining inverse Hessians (and draws from multivariate normal approximations) using BFGS and LBFGS from Optim.jl and other packages. We might be able to find some code to adapt, or perhaps we could use some of the Pathfinder functions directly. I think `fit_mvnormals` might be a good function to look at.
A more complete description of observational errors and completeness requires all magnitudes at once, rather than the per-magnitude implementation we have currently. The current implementation is meant to be simple, but this generalization could make it much more useful. The case where the provided `errfuncs` and `completefuncs` are vectors could then be handled by a special-case method using an alternate dispatch signature.
The initial implementation of MCMC sampling with free distance in `fitting/mcmc_sampling_distance.jl` looks good, but it does not yet have a convenience method like `mcmc_sample`, any documentation, or any tests; these will need to be added. Constructing the `x0` for the walkers is a bit tricky, so remember to explain that.

We should probably start a new section in the documentation for alternative SFH implementations with extra parameters, so we can keep these specialized methods separate from the main, simple implementations.

Next we will try MCMC + distance + fixed MDF. We might also want to do MCMC + distance + linear AMR model.
Wondering if the `@turbo` macro will work better (in `composite!`, for example) if `models` is a 3-D array rather than a 1-D vector of 2-D matrices. Worth checking out.
The `Binaries` type currently implements random pairing of stars sampled from the provided IMF. This is not actually a very realistic binary model. Perhaps we should rename it to `RandomBinaryPairs` or something, and create another sampler that implements a uniform mass-ratio model. In that case the provided IMF would actually give the distribution of system masses; we would then sample the mass ratio for binary systems uniformly (see, for example, https://academic.oup.com/mnrasl/article/430/1/L6/1181335).
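The uniform mass-ratio sampler could look like the following sketch, where `rand_system_mass` stands in for a draw from the user-provided IMF (all names here are hypothetical):

```julia
using Random

# Uniform mass-ratio binary model: the IMF gives the *system* mass M,
# and q = m2/m1 is drawn uniformly on [0, 1). Then
# m1 + m2 == M and m2/m1 == q by construction.
function sample_binary(rand_system_mass; rng = Random.default_rng())
    M = rand_system_mass()
    q = rand(rng)              # mass ratio m2/m1 ~ Uniform(0, 1)
    m1 = M / (1 + q)           # primary
    m2 = M * q / (1 + q)       # secondary
    return m1, m2
end
```

This differs from random pairing in that the component masses are derived from the system mass rather than drawn independently from the IMF.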
Optimizations using Optim.jl's BFGS implementation seem to spend a lot of time in BLAS calls. By default BLAS is multi-threaded, but for problems of our size this threading is not efficient. Worth profiling and examining more closely.
We might be able to increase the efficiency of the loglikelihood gradient computation in `fitting.jl` by defining a custom method that derives the full gradient (every partial) simultaneously and builds a dense representation of `(data ./ composite)`. As it stands, calculating each partial separately recomputes this for every partial. Our solves are still quite fast, but I think this would be an improvement.
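Assuming the Poisson loglikelihood used elsewhere in the package, each partial is `∂LL/∂cₖ = Σᵢⱼ (data/composite − 1) * modelₖ`, so the ratio term can be built once and shared. A minimal sketch (the function name is hypothetical):

```julia
using LinearAlgebra

# Build R = data ./ composite .- 1 once; every partial of the Poisson
# loglikelihood is then just a dot product against one model's pixels.
function full_gradient(data::Matrix{Float64}, composite::Matrix{Float64},
                       models::Vector{Matrix{Float64}})
    R = data ./ composite .- 1          # dense representation, built once
    return [dot(vec(R), vec(m)) for m in models]
end
```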
Currently going with Jupyter notebooks because they're easy and render natively on GitHub, but they don't lend themselves well to inclusion in the docs pages. It might be better to go with something like Literate.jl or Weave.jl, which are explicitly designed to support conversion to easily hostable formats.
Would like to write the linear AMR as [equation] rather than [equation] as it currently is. This will make the presentation clearer in the paper, as the logarithmic AMR is [equation], so the two have a common form. This also allows us to enforce [constraint].
Currently this only considers spacing along a single magnitude (the y-axis mag); it should consider the overall distance metric on the CMD, i.e., both the y-axis magnitude and the x-axis color. Checking only the y-axis spacing can cause problems for horizontal or diagonal features like the horizontal branch or blue loop.
Would like to support masking of the Hess diagrams (models, data) in the basic functions in `src/fitting/fitting_base.jl`, to support situations like masking the horizontal branch. Adding masking to `composite!` hurts performance, but we could probably do it more performantly in the loglikelihood calculations.
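Masking inside the loglikelihood could be as simple as skipping bins during the reduction, which leaves `composite!` untouched. A sketch, assuming the Poisson loglikelihood (up to a data-dependent constant); the function name is hypothetical:

```julia
# Masked bins (mask[i] == false) simply don't contribute to the sum, so
# the composite model construction itself needs no changes.
function masked_loglikelihood(composite::Matrix{Float64}, data::Matrix{Float64},
                              mask::BitMatrix)
    ll = 0.0
    for i in eachindex(composite, data, mask)
        mask[i] || continue
        ll += data[i] * log(composite[i]) - composite[i]  # Poisson, up to const
    end
    return ll
end
```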
I would like to be able to import an isochrone table during the GitHub Actions CI run, to test some of the functions whose behavior with test / random data may be hard to predict. I should be able to run `wget` or `curl` commands in the action, so if I can stage the file somewhere accessible, I should be able to download it during the action by putting the URL / authentication details in a repository secret. This may work for actions, but it would complicate running the tests locally. It may simply be best to distribute a small isochrone table for testing purposes, or to have the test script auto-download it if it's not present. Not sure what the best practice is here.
In the `Gaussian2D-partials` branch I have some code that uses properly covariant kernels for the case where the filter on the `y` axis of the CMD also appears in the color on the `x` axis. The implementation was poor and it didn't make much difference to the results at the time, so only some of that code remains in the main branch. The main problem was that you had to manually specify which type of covariance to use, which was annoying.

We can deal with this another way by keeping the current `partial_cmd_smooth` (which does not include covariance) but adding a new method that requires passing a table/dictionary/dataframe etc. along with the entries in that table that give the x- and y-axis filters. This is similar to the approach taken in `simulate.jl`, where you pass `mags`, `mag_names`, and `mag_lim_name` to determine which magnitude should be used to calculate the limit. In this case you would pass all the necessary single-band mags (rather than just the `x` and `y` coordinates); we can then internally determine whether the `y`-axis mag appears in the `x`-axis color and use the correct covariant kernel. When it doesn't, we can just call the current `partial_cmd_smooth`, which would be easy.
It seems we get reasonable scaling from setting `thread=true` for `LoopVectorization.@turbo` in the basic functions `loglikelihood`, `∇loglikelihood`, and `∇loglikelihood!`, but it currently breaks the parallel `hmc_sample` method when used with matplotlib, due to a GIL error on the Python side.
Currently using the Poisson likelihood ratio (PLR), but I think we can just use the simple Poisson loglikelihood instead. In the PLR the factorial in the denominator cancels with the numerator, but when taking the logarithm of the likelihood the factorial becomes a log-factorial, which has an efficient implementation in SpecialFunctions.jl that we can use, so it's not clear what benefit the PLR has. We should at least look at using the Poisson likelihood directly.
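The direct Poisson loglikelihood could look like the sketch below. SpecialFunctions.jl's `logfactorial` would be the efficient choice; `logfact` here is a naive stdlib stand-in. Note that the log-factorial term depends only on the data, so it is constant during optimization and can be dropped there anyway:

```julia
# Direct Poisson loglikelihood over Hess-diagram bins:
#   LL = Σᵢ [ dᵢ log(mᵢ) − mᵢ − log(dᵢ!) ]
# `logfact` is a naive stand-in for SpecialFunctions.logfactorial.
logfact(n::Integer) = sum(log, 1:n; init = 0.0)

poisson_loglikelihood(model::AbstractArray, data::AbstractArray{<:Integer}) =
    sum(d * log(m) - m - logfact(d) for (m, d) in zip(model, data))
```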
Worth looking into utilities or models for creating foreground MW contamination predictions. I just saw a reference in Dan Weisz's 2012 paper on Leo T, but there are surely more recent results (e.g., TRILEGAL's recent update). It would be nice to have a model we could program ourselves rather than having to make web queries or something like that.
Given the way that we are structuring the age-metallicity relation and using a mix of metallicities at each unique age to simulate imperfect ISM mixing, it should be straightforward to derive a mass-weighted metallicity distribution function (MDF) for the entire system. Deriving the number-weighted MDF, which is more relevant for comparison to observational studies, may be as simple as calculating the average number of stars per stellar mass formed for each population to translate the mass weighting to a number weighting. Would like a method for each AMR model that takes in the AMR coefficients and the SFR coefficients and returns something like a histogram or KDE giving the mass and/or number weighted MDF.
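The mass-weighted version of this could be a simple aggregation over the per-template coefficients, assuming they are proportional to the stellar mass formed in each (age, [M/H]) template; the number weighting would additionally rescale each bin by stars-per-unit-mass, which is omitted here. All names are hypothetical:

```julia
# Mass-weighted MDF: sum the per-template stellar-mass coefficients over
# age at each unique metallicity. `MH` and `coeffs` are parallel vectors
# over all (age, [M/H]) templates.
function mass_weighted_mdf(MH::Vector{Float64}, coeffs::Vector{Float64})
    mdf = Dict{Float64,Float64}()
    for (mh, c) in zip(MH, coeffs)
        mdf[mh] = get(mdf, mh, 0.0) + c       # accumulate mass per [M/H]
    end
    mhs = sort!(collect(keys(mdf)))
    weights = [mdf[mh] for mh in mhs]
    return mhs, weights ./ sum(weights)       # normalized histogram
end
```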