joshspeagle / frankenz
License: MIT License
A photometric redshift monstrosity
This is part of a larger initiative to reorganize the workflow so that most of the computation is internal. This way a user should just be able to do something like:
import object
obj = object(data, params)
obj.fit(new_data)
obj.compute_pdf(params)
zpdfs = obj.zpdfs
This should enable a very clean workflow for general users, while advanced users can still access internal quantities for more advanced analyses.
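A minimal sketch of what such an object could look like, assuming a brute-force Euclidean k-nearest-neighbor fit and Gaussian kernel smoothing in redshift. The class name `PhotoZ`, its constructor arguments, and the toy data are all hypothetical and are not part of frankenz; only the `fit` / `compute_pdf` / `zpdfs` names come from the workflow above.

```python
import numpy as np


class PhotoZ:
    """Hypothetical wrapper object (not the frankenz API)."""

    def __init__(self, train_phot, train_z, zgrid, k=100):
        self.train_phot = np.asarray(train_phot, dtype=float)
        self.train_z = np.asarray(train_z, dtype=float)
        self.zgrid = np.asarray(zgrid, dtype=float)
        self.k = k

    def fit(self, new_phot):
        # Brute-force squared Euclidean distances to every training object.
        new_phot = np.asarray(new_phot, dtype=float)
        d2 = ((new_phot[:, None, :] - self.train_phot[None, :, :]) ** 2).sum(axis=2)
        # Keep the k closest neighbors; these stay accessible for advanced users.
        self.neighbors = np.argsort(d2, axis=1)[:, : self.k]
        self.dist2 = np.take_along_axis(d2, self.neighbors, axis=1)
        return self

    def compute_pdf(self, smooth=0.05):
        # Neighbor weights from the distances (assumed form), normalized per object.
        w = np.exp(-0.5 * (self.dist2 - self.dist2.min(axis=1, keepdims=True)))
        w /= w.sum(axis=1, keepdims=True)
        self.weights = w
        # Stack one Gaussian kernel per neighbor redshift onto the grid.
        z_nb = self.train_z[self.neighbors]
        kern = np.exp(-0.5 * ((self.zgrid[None, None, :] - z_nb[:, :, None]) / smooth) ** 2)
        pdf = (w[:, :, None] * kern).sum(axis=1)
        dz = self.zgrid[1] - self.zgrid[0]
        self.zpdfs = pdf / (pdf.sum(axis=1, keepdims=True) * dz)
        return self


# Toy data: photometry is a noisy linear function of redshift (illustrative only).
rng = np.random.default_rng(0)
train_z = rng.uniform(0.0, 2.0, size=1000)
train_phot = train_z[:, None] * np.array([1.0, 2.0, 3.0]) + rng.normal(0, 0.05, (1000, 3))
true_z = np.array([0.5, 1.0, 1.5])
new_phot = true_z[:, None] * np.array([1.0, 2.0, 3.0])
zgrid = np.linspace(0.0, 2.0, 201)

obj = PhotoZ(train_phot, train_z, zgrid, k=100)
obj.fit(new_phot)
obj.compute_pdf(smooth=0.05)
zpdfs = obj.zpdfs
```

The intermediate quantities (`neighbors`, `dist2`, `weights`) are kept as attributes so that the simple three-call workflow and more detailed inspection can share one object.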
In line with the SOM and KMCkNN objects (and their utilities), I should also establish an object to deal with a grid-based approach.
I'd imagine that my pre-existing SOM code should actually be incorporated into this package as a helpful way to visualize/analyze data. There should be associated plotting utilities, as well as an easy way to turn those results into photo-z predictions for users who are interested in a comparison.
Internal tests have found that the biases from using a simple k-nearest-neighbor search (rather than a radius nearest-neighbor search) can be more severe than I had thought for objects with noisy photometry (i.e. when objects have a lot of potential neighbors). Tests that switch to a ball tree and use the Mahalanobis distance when searching appear to perform much better, so that should be the method of choice.
Also, I should increase the default K to ~100, since 25 appears to be too small when comparing results for individual objects.
This will also require moving over to sklearn for the nearest neighbor searches.
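A rough sketch of what the sklearn-based search might look like, assuming a single global covariance estimated from the training set (a per-object covariance would need a different setup). The arrays `train` and `query` are made-up placeholders, not frankenz data structures.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Placeholder photometry: 2000 training objects and 5 query objects in 4 bands.
rng = np.random.default_rng(0)
train = rng.normal(size=(2000, 4)) * np.array([1.0, 2.0, 0.5, 3.0])
query = rng.normal(size=(5, 4))

# Assumption: one shared covariance; the metric takes its inverse as `VI`.
cov = np.cov(train, rowvar=False)
VI = np.linalg.inv(cov)

K = 100  # the larger default suggested above
nn = NearestNeighbors(
    n_neighbors=K,
    algorithm="ball_tree",
    metric="mahalanobis",
    metric_params={"VI": VI},
)
nn.fit(train)

# dist[i, j] is the Mahalanobis distance from query i to its j-th neighbor,
# and idx[i, j] is that neighbor's row in `train`.
dist, idx = nn.kneighbors(query)
```

The same fitted object also exposes `radius_neighbors`, which would be the route to a radius-based search instead of a fixed K.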
Either spin this off as a standalone package or integrate it into frankenz.
One of the results I really liked deriving was the nice way that hard selection boundaries modify the naive Gaussian PDF. I'd like to add in some way to account for this effect in the code.
The process of doing so will probably entail adding some type of functionality to deal with overlap integrals for arbitrary PDFs. This should be doable (in theory) using path sampling if we phrase the problem as starting from
q_0 = N(F_g|F,C_g) N(F_h|F,C_h)
with
z_0 = \int q_0 dF = N(F_g|F_h,C_g+C_h)
and evolving to
q_1 = P(F_g|F,C_g) P(F_h|F,C_h) P(F)
with
z_1 = \int q_1 dF
since we know z_0 and path sampling gives us a way to estimate z_1/z_0. The big challenge is getting an MCMC sampler to draw independent samples along the path, although that could be a fun problem to solve.
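As a sanity check on the setup above, here is a 1-D toy sketch. It assumes the likelihoods stay Gaussian and only a Gaussian prior P(F) is switched on along the path, so that every intermediate distribution can be sampled exactly and no MCMC is needed. All numerical values and the helper names are made up for illustration. It checks the closed form for z_0 on a grid and then estimates log(z_1/z_0) with the path sampling identity d log z_t / dt = E_t[d log q_t / dt] along the geometric path q_t = q_0 P(F)^t.

```python
import numpy as np


def norm_pdf(x, mean, var):
    """1-D Gaussian density with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


# Illustrative values (assumptions, not from any real catalog).
F_g, C_g = 1.0, 0.3
F_h, C_h = 1.5, 0.5
mu_p, S_p = 0.0, 2.0  # toy Gaussian prior P(F)

# Grid-based reference values for z_0 and z_1.
F = np.linspace(-20.0, 20.0, 200001)
dF = F[1] - F[0]
q0 = norm_pdf(F_g, F, C_g) * norm_pdf(F_h, F, C_h)
q1 = q0 * norm_pdf(F, mu_p, S_p)
z0_grid = q0.sum() * dF
z1_grid = q1.sum() * dF
z0_exact = norm_pdf(F_g, F_h, C_g + C_h)  # closed form quoted above


def sample_path(t, n, rng):
    """Exact draws from p_t proportional to q_0(F) * P(F)**t (Gaussian here)."""
    prec = 1.0 / C_g + 1.0 / C_h + t / S_p
    mean = (F_g / C_g + F_h / C_h + t * mu_p / S_p) / prec
    return rng.normal(mean, np.sqrt(1.0 / prec), size=n)


# Along this path, d log q_t / dt = log P(F); average it under p_t at each t.
rng = np.random.default_rng(1)
ts = np.linspace(0.0, 1.0, 21)
means = np.array(
    [np.mean(np.log(norm_pdf(sample_path(t, 20000, rng), mu_p, S_p))) for t in ts]
)

# Trapezoid rule over t gives the path sampling estimate of log(z_1 / z_0).
log_ratio_ps = np.sum(0.5 * (means[1:] + means[:-1]) * np.diff(ts))
log_ratio_grid = np.log(z1_grid / z0_grid)
```

In the general case with arbitrary PDFs, `sample_path` is exactly the piece that would have to be replaced by an MCMC sampler, which is the hard part noted above.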
I currently don't have one implemented even though it should be straightforward to add in theory. I should probably add this in to whatever I code up for #5.
Currently, I have a lot of plots which I'm making by hand. I should add a lot of these into the plotting module so I don't have to do that as much anymore. These should include:
Might be worth it so they can work within the BPZ-style prior, even if it is ultimately arbitrary.
Add in outlier modeling to the samplers, where the outlier model is/is not allowed to float. This should allow users to have 5 possible combinations of options:
I want to implement a growing neural gas to deal with some of the issues the SOM has. I'll need a way to de-project it though so users can plot it in 2 dimensions. Maybe I can use other sklearn manifold-learning algorithms for that.
Right now the scheme I've constructed for turning KMCkNN results into predictions involves mostly manual operations. I should probably automate this so that users can just call a function and convert their results into redshift PDFs. This should ideally take place through some type of object so that other intermediate quantities such as posterior values, etc. actually get saved internally.
Currently there is no way to impute missing data. I need to re-add this feature.