
discsim / frank


1D, super-resolution brightness profile reconstruction for interferometric sources

Home Page: https://discsim.github.io/frank/

License: GNU Lesser General Public License v3.0

Python 100.00%
Topics: fourier-methods, gaussian-process, interferometry


frank's Issues

Repo permissions

Marco, is it easy to make Richard and me additional owners on the repo? I expect that could be helpful down the line; e.g., we currently can't see the repo 'Settings' on GitHub.

Add tests

We would need to add a few tests, at least:

  • to test the package can be imported
  • to test the objects can be initialised correctly

Ideally:

  • to test that frank produces the same result on the same dataset (perhaps by fixing the random seed we can obtain results reproducible to numerical precision; see the sketch below)

These tests would be run automatically on every push.
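A minimal pytest sketch of these tests; the reproducibility check below uses plain numpy as a stand-in for a frank fit on a small mock dataset, and running the tests on every push could be wired up with a CI service:

```python
import numpy as np


def test_import():
    # The package can be imported
    import frank  # noqa: F401


def test_reproducibility():
    # With a fixed random seed, repeating the same computation gives results
    # that agree to numerical precision; the computation here is a placeholder
    # for a frank fit on a small mock dataset.
    np.random.seed(42)
    result_a = np.random.normal(size=100)
    np.random.seed(42)
    result_b = np.random.normal(size=100)
    np.testing.assert_allclose(result_a, result_b, rtol=1e-12)
```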

scrub multiple copies of large UVTables from repo to reduce its size

How-to: https://help.github.com/en/github/authenticating-to-github/removing-sensitive-data-from-a-repository

The repo size is currently ~90 MB (see https://api.github.com/repos/discsim/frank --> "size" (in kB)). It can be reduced if we delete redundant copies of the AS 209 dataset from the repo's history.

I'd like to keep one copy of the dataset though so that someone can reproduce the exact results shown with it in the tutorial notebooks.

  • Also check that the size of the PyPI package is small

Determining weights of mock data

CASA simulations for ALMA do not produce a good estimate of the weights. Should we include the utility routine we use to estimate them?
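For reference, one common approach (a sketch only, not necessarily what our routine does) is to estimate the weights from the scatter of Re(V) and Im(V) within baseline bins:

```python
import numpy as np


def estimate_weights(u, v, vis, nbins=300):
    """Return per-visibility weights ~ 1/sigma^2 from the binned scatter of V."""
    q = np.hypot(u, v)
    edges = np.linspace(q.min(), q.max(), nbins + 1)
    idx = np.clip(np.digitize(q, edges) - 1, 0, nbins - 1)

    weights = np.zeros_like(q)
    for i in range(nbins):
        in_bin = idx == i
        if in_bin.sum() < 2:
            continue
        # Noise variance estimated from the real and imaginary scatter in the bin
        var = 0.5 * (np.var(vis.real[in_bin], ddof=1) +
                     np.var(vis.imag[in_bin], ddof=1))
        if var > 0:
            weights[in_bin] = 1.0 / var
    return weights
```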

Clean requirements

Clean up the requirements (e.g., sphinx is only needed to build the docs; it is not necessary for running frankenstein).
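One way to do this (a sketch; the dependency lists are assumptions, not the actual requirements) is to keep docs-only packages in an extras_require group, installed with `pip install frank[docs]`:

```python
from setuptools import setup, find_packages

setup(
    name="frank",
    packages=find_packages(),
    install_requires=["numpy", "scipy", "matplotlib"],        # assumed runtime deps
    extras_require={"docs": ["sphinx", "sphinx_rtd_theme"]},  # only for building docs
)
```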

Fix __init__.py

We need to decide what belongs in __init__.py. Marco, can you take a look?

extend plotting scripts

To include:

  • using dist to make a second x-axis for I(r) (see the sketch after this list)
  • adding brightness profile integral to legend
  • verify cut_data and cut_baseline work
  • plotting comparison_profile (CLEAN profile)
  • running fits in loop / overplotting profiles
  • adding other useful metrics based on work coming out of review
  • figure that makes the unconvolved and convolved, deprojected 2D sweep (using a user-provided CLEAN beam)
  • bootstrap functions/figure
  • plot DHT of an input profile
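A minimal sketch of the second x-axis item: adding an axis in au from a user-supplied distance (the profile and distance below are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt

dist = 140.0                     # [pc], placeholder source distance
r = np.linspace(0.0, 2.0, 100)   # [arcsec]
intensity = np.exp(-r / 0.5)     # placeholder brightness profile

fig, ax = plt.subplots()
ax.plot(r, intensity)
ax.set_xlabel('r ["]')
ax.set_ylabel('I(r) [arbitrary units]')

# Second x-axis: r [au] = r [arcsec] * dist [pc]
secax = ax.secondary_xaxis('top', functions=(lambda x: x * dist, lambda x: x / dist))
secax.set_xlabel('r [au]')
plt.show()
```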

Copyright notice at top of each source file

Add this to the top of each source file

Frankenstein: <DESC>
Copyright (C) 2019-2020  R. Booth, J. Jennings, M. Tazzari

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <https://www.gnu.org/licenses/>.

where for <DESC> I suggest something like: 1D Fourier reconstruction algorithm with Gaussian Processes

Logging

It would be useful to have some output telling users what the code is doing, e.g. when the geometry is being fit, what kind of model is being fit to the data, the important parameters used, etc.
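A minimal sketch of what this could look like (the logger name and messages are illustrative, not frank's actual output):

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("frank")

logger.info("Fitting the disc geometry (Gaussian fit to the visibilities)")
logger.info("Using geometry: inc = %.2f deg, PA = %.2f deg", 34.0, 86.0)
logger.info("Fitting brightness profile: N = %d collocation points, Rmax = %.1f arcsec",
            100, 2.0)
```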

Test the whole pipeline inc. figures and data

See discussion in #7

There are already tests for the geometry fitting and the brightness profile fitting.
What's missing is a test of the full pipeline, which would additionally check that figures and data files can be generated (i.e. that np.genfromtxt and plt.savefig are working).
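A sketch of what such a test could exercise, here reduced to the I/O steps mentioned above (np.genfromtxt and plt.savefig); wiring in the actual fit.py call is the remaining step:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # no display needed when run in CI
import matplotlib.pyplot as plt


def test_pipeline_io(tmp_path):
    # Data file round-trip (np.genfromtxt)
    uvtable = tmp_path / "mock_uvtable.txt"
    data = np.random.normal(size=(50, 5))
    np.savetxt(uvtable, data)
    loaded = np.genfromtxt(uvtable)
    assert loaded.shape == (50, 5)

    # Figure generation (plt.savefig)
    fig, ax = plt.subplots()
    ax.plot(loaded[:, 0], loaded[:, 2], ".")
    figfile = tmp_path / "fit.png"
    fig.savefig(figfile)
    plt.close(fig)
    assert figfile.exists()
```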

Decide where to store binary files

Where should we store large binary files e.g. .npz, .dat etc?
They are useful to carry out examples or tests, but they increase the repo size considerably.
Idea:

  • store them in another repo, e.g. discsim/data, and download them whenever needed
  • store them on Zenodo

RuntimeWarning in BinUVData

Warning output:

/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/useful_funcs.py:117: RuntimeWarning: invalid value encountered in sqrt
  bin_vis_err.real = np.sqrt(bin_vis_err.real - bin_vis.real**2)
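The difference under the square root can dip slightly below zero through floating-point cancellation. One possible fix (a suggestion only, written as a drop-in replacement for the quoted line and reusing its variables; the imaginary part would be handled the same way) is to clip the argument at zero before taking the root:

```python
# Clip at zero so round-off cannot produce a negative argument to sqrt
bin_vis_err.real = np.sqrt(np.maximum(bin_vis_err.real - bin_vis.real**2, 0.0))
```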

Frank is slower with multithreading enabled

By default, modern builds of numpy/scipy use OpenBLAS or MKL, which use multiple threads to evaluate the matrix operations. In frank this actually makes the code slower, at least with OpenBLAS.

There is, however, an opportunity to exploit parallelism when bootstrapping, which is currently missed. (It's necessary to take care with numpy's random numbers, though, which are not thread-safe.)
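A sketch of what the bootstrap parallelism could look like, using process-based workers and independent per-trial generators spawned from a SeedSequence, which sidesteps the non-thread-safe global numpy random state; the trial body is a placeholder for resampling the visibilities and running a frank fit:

```python
import numpy as np
from multiprocessing import Pool


def one_bootstrap_trial(seed_seq):
    # Each trial gets its own, independent generator
    rng = np.random.default_rng(seed_seq)
    # Placeholder: resample the visibilities with rng, run a frank fit,
    # and return the fitted brightness profile
    resample_idx = rng.integers(0, 1000, size=1000)
    return resample_idx.mean()


if __name__ == "__main__":
    n_trials = 8
    child_seeds = np.random.SeedSequence(12345).spawn(n_trials)
    with Pool() as pool:
        profiles = pool.map(one_bootstrap_trial, child_seeds)
    print(profiles)
```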

plot.py

  • uv binning: @rbooth200 Can you drop your code for the faster uv binning into plot.py in the docs_and_runner_jj branch?

  • Interfacing with _HankelRegressor:

    • Accessing the predicted visibilities: @rbooth200 Are you open to _HankelRegressor returning the (predicted) deprojected visibilities directly in sol (rather than having to call sol.predict_deprojected(sol.q))? I assume you didn't do this so a user could call either sol.predict or sol.predict_deprojected, but since predict_deprojected already corrects the visibility amplitudes for the inclination, I think it would be good to return these directly in sol, as they're as much a component of the fit as the reconstructed I(r_k).

    • Accessing the observed visibilities: @rbooth200

      # Deproject the visibilities:

      Can you pop l.412-420 out of _build_matrices and into another function that returns the deprojected (but amplitude-shifted) observed visibilities and weights? I know you already have self._geometry.apply_correction, but a function that does this and also corrects the amplitude of V for the inclination would be useful, so that a single call in plot.py yields the deprojected observed visibilities (a sketch of such a helper follows below). The deprojection and amplitude-correction functionality also seems outside the scope of _build_matrices.
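For concreteness, a sketch of the kind of helper being asked for; only self._geometry.apply_correction is mentioned above, so the deproject call and the exact signature here are assumptions:

```python
def get_deprojected_visibilities(geometry, u, v, vis, weights):
    """Return the deprojected baselines and the geometry-corrected observed visibilities."""
    vis_corr = geometry.apply_correction(u, v, vis)  # phase/amplitude correction (assumed behaviour)
    up, vp = geometry.deproject(u, v)                # deprojected baselines (method name assumed)
    return up, vp, vis_corr, weights
```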

bad parameter choices warning

Add a warning if, e.g., hyperpriors : n is too small for hyperpriors : rout (because in this case the fit can be bad without it being immediately clear why). This should be based on whether the last collocation point is at a shorter baseline than the longest baseline in the data.
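A minimal sketch of such a check (the function and variable names are illustrative):

```python
import logging
import numpy as np


def check_collocation_points(q_collocation, u, v):
    """Warn if the fit's collocation points do not reach the longest baseline."""
    q_max_data = np.hypot(u, v).max()
    if q_collocation.max() < q_max_data:
        logging.warning(
            "The largest collocation point (%.2e lambda) is below the longest "
            "baseline in the data (%.2e lambda); the fit may be poor. Consider "
            "increasing n or decreasing rout.", q_collocation.max(), q_max_data)
```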

Add diagnostic plots for convergence

Following the discussion on Wednesday afternoon, add two diagnostic plots to the automatically generated figures.
The purpose of these figures is to inform the user of the convergence status of the fit.

change repo name to 'frank'

Marco suggested waiting to do this until the repo is frozen.

Also change all internal 'frankenstein' mentions to 'frank'.
NOTE: Only use find-and-replace in the code itself (.py files), not anywhere else! The docs and other repo files refer to the code as 'Frankenstein (frank)' ... we don't want to overwrite those.

Option to run frank just to produce UVTables

Add an option to just generate the UVTables without refitting (even though in most cases the fitting will be very fast), plus the call syntax and function for it. Something like
python -m frank.fit -uv uvtable --generate_uvtables sol.obj
would be good.

update geom pars, iter pars in json and fit.py

When not fitting for the geometry, but instead deprojecting with a supplied geometry, I get this error.

Traceback:

  Determining disc geometry
    Using your provided geometry for deprojection
    Using: inc  = -33.93 deg,
           PA   = 86.47 deg,
           dRA  = 8.63e+02 mas,
           dDec = -2.29e+02 mas
  Fitting for brightness profile
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:121: RuntimeWarning: divide by zero encountered in true_divide
  p1 = np.where(p > 0, 1. / p, 0)
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:732: RuntimeWarning: divide by zero encountered in true_divide
  beta = (self._p0 + 0.5 * (Tr1 + Tr2)) / pi - \
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:734: RuntimeWarning: divide by zero encountered in log
  pi_new = np.exp(sparse_solve(Tij_pI, beta + np.log(pi)))
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:734: RuntimeWarning: invalid value encountered in add
  pi_new = np.exp(sparse_solve(Tij_pI, beta + np.log(pi)))
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:121: RuntimeWarning: invalid value encountered in greater
  p1 = np.where(p > 0, 1. / p, 0)
/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py:719: RuntimeWarning: invalid value encountered in greater
  while (np.any(np.abs(pi - pi_old) > self._tol * np.abs(pi)) and
Traceback (most recent call last):
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/fit.py", line 434, in <module>
    main()
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/fit.py", line 411, in main
    model['input_output']['iteration_diag']
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/fit.py", line 276, in perform_fit
    sol = FF.fit(u, v, vis, weights)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py", line 755, in fit
    self._ps_cov = self._ps_covariance(fit, Tij, rho)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/frank/radial_fitters.py", line 801, in _ps_covariance
    hess_chol = scipy.linalg.cho_factor(hess)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/scipy/linalg/decomp_cholesky.py", line 155, in cho_factor
    check_finite=check_finite)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/scipy/linalg/decomp_cholesky.py", line 19, in _cholesky
    a1 = asarray_chkfinite(a) if check_finite else asarray(a)
  File "/Users/morgan/miniconda3/envs/jj/lib/python3.7/site-packages/numpy/lib/function_base.py", line 498, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

Imports

We have a mix of relative and absolute imports. We should change to absolute imports (PEP 8).
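For example (the class name below is used only for illustration):

```python
# Relative import, currently used in places:
# from .geometry import SourceGeometry

# Absolute import, preferred per PEP 8:
from frank.geometry import SourceGeometry  # noqa: F401
```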

add ADS links

Update README and index.rst (including sidebar link to papers citing frank) once we have links.

Copyright notice for interactive usage

Consider adding this when starting frankenstein, as suggested by the GPL instructions:

    Frankenstein  Copyright (C) 2019-2020  R. Booth, J. Jennings, M. Tazzari
    This program comes with ABSOLUTELY NO WARRANTY; for details type <<<>>>.
    This is free software, and you are welcome to redistribute it
    under certain conditions; see the full LICENSE attached to this package.

Perform geometry fit inside the fitter

Allow the user to run the geometry fit inside the fit command.
This simplifies the user experience and prepares for future releases in which we might implement other ways of fitting the geometry at the same time as the Hankel fit.
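A hedged sketch of the intended user experience; the class names follow the repo's modules, but the exact constructor arguments and call signatures are assumptions:

```python
import numpy as np
from frank.geometry import FitGeometryGaussian   # names follow the repo's modules,
from frank.radial_fitters import FrankFitter     # exact signatures are assumptions

u, v, vis_re, vis_im, weights = np.genfromtxt("uvtable.txt").T
vis = vis_re + 1j * vis_im

geom = FitGeometryGaussian()                       # geometry to be fit inside the fitter
FF = FrankFitter(Rmax=2.0, N=200, geometry=geom)   # constructor arguments assumed
sol = FF.fit(u, v, vis, weights)                   # geometry fit + Hankel fit in one call
```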

Refactor docstrings

Slightly refactor multi-line docstrings, e.g. following numpy standards (an example is sketched after the snippets below).
From:

def func():
   """This function...
   
   text text text
   """

   code...

to:

def func():
   """ (start from a new line)
   This function...
   
   text text text
   (leave empty line)
   """
   code... (no empty lines between """ and the first line of code)
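For reference, a docstring following the numpy standard could look like this (the function and its parameters are illustrative):

```python
def deproject(u, v, geometry):
    """
    Deproject the observed baselines for the source geometry.

    Parameters
    ----------
    u, v : array, unit = lambda
        Observed baseline coordinates
    geometry : SourceGeometry object
        Inclination, position angle and phase offset of the source

    Returns
    -------
    up, vp : array, unit = lambda
        Deprojected baseline coordinates

    """
    pass  # code...
```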

Some reading confusion in radial_fitters

@rbooth200

q = np.hypot(u, v)

If I'm reading this right, for q in this line and V in the line above it, you're using these variables to denote the observed baselines and observed Re(V). But (I think) everywhere else q is the spatial frequency collocation points and V the visibility amplitudes at q.

Can you change q here and anywhere else I'm missing to bls or something distinct to denote where it's the observed baselines, and change V here and anywhere else I'm missing to vis (as it is in geometry.py) to denote where it's the observed visibilities?

Update inline function calls and check notebooks in docs

Code call signatures have changed since I wrote the docs. Update:

  • quickstart
  • tutorial: "What’s fit.py doing? The fitting procedure in detail"
  • tutorial: "Running fits in a loop"
  • tutorial: "Assessing a fit's uncertainty and sensitivity to hyperpriors"

Optional figure to produce 2D fit images

The plotting should optionally produce a figure that makes:

  • the unconvolved, deprojected 2D sweep; the unconvolved, reprojected 2D sweep; and the unconvolved, reprojected residuals using the residual UVTable (this last one will require additional functionality; a minimal sweep sketch follows this list)

  • the same but convolved with a beam, if the user passes in the beam
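A minimal sketch of the basic building block, sweeping a 1D profile I(r) into a face-on (unconvolved, deprojected) 2D image; reprojection and beam convolution would build on this:

```python
import numpy as np


def sweep_profile(r, intensity, npix=512, rmax=None):
    """Azimuthally sweep a 1D profile I(r) into a face-on 2D image."""
    if rmax is None:
        rmax = r.max()
    x = np.linspace(-rmax, rmax, npix)
    xx, yy = np.meshgrid(x, x)
    rr = np.hypot(xx, yy)
    # Interpolate the 1D profile onto the 2D radius grid; zero outside r.max()
    return np.interp(rr, r, intensity, right=0.0)
```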

integrate logging across scripts

And check the logging behaviour when using frank as a module (if you don't run fit.py, the log file would have to be set up somewhere else).
