ULX Population Study
====================

Dependencies
------------
sqlite3 - (sudo apt-get install libsqlite3-dev)
cfitsio - https://heasarc.gsfc.nasa.gov/fitsio/
Python 3.8+
pandas 1.1+

What is this?
-------------
Ultraluminous X-ray sources (ULXs) are a subset of X-ray binaries
whose lightcurves often display long-term modulations on timescales
of months.

One explanation for these modulations is that they are caused by
some form of precession of the accretion flow near the compact object.

One model created to describe these modulations is ULXLC:
http://www.sternwarte.uni-erlangen.de/~dauser/research/ulxlc/
It allows the creation of lightcurves for a variety of system parameters.

This code uses a modified version of ULXLC coupled with an
artificially created population of binary systems in order
to answer the question of how precession affects
the observed population of ULXs.


Input Data
----------
The code makes use of the output from the population synthesis code
STARTRACK (https://ui.adsabs.harvard.edu/abs/2008ApJS..174..223B).

The input files may be found at:
Z=0.02   | https://universeathome.pl/universe/pub/z02_data1.dat
Z=0.002  | https://universeathome.pl/universe/pub/z002_data1.dat
Z=0.0002 | https://universeathome.pl/universe/pub/z0002_data1.dat

These files are ~30 GB each; for our analysis we have extracted only the
systems with active mass transfer (mt=1).

[Once downloaded, these files may be processed using src/process_startrack.py]

The result is ~9.3 million rows of data, which are not provided in this repo.
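
Because each input file is far larger than typical memory, the mt=1 selection
has to be done as a stream. The following is a minimal sketch of that idea and
is not the contents of src/process_startrack.py. It assumes a
whitespace-separated text file in which the mt flag is written as the integer
1; the position of that column is not stated in this README, so it is passed
in as the hypothetical parameter mt_column.

```python
import io
from typing import Iterable, Iterator, TextIO


def iter_mass_transfer_rows(lines: Iterable[str], mt_column: int) -> Iterator[str]:
    """Yield only the rows whose mt flag equals 1.

    mt_column is the zero-based index of the mt flag. Its real position in
    the STARTRACK output is an assumption the caller has to supply.
    """
    for line in lines:
        fields = line.split()
        if len(fields) <= mt_column:
            continue  # skip blank or truncated lines
        if fields[mt_column] == "1":
            yield line


def filter_file(src: TextIO, dst: TextIO, mt_column: int) -> int:
    """Stream src into dst one line at a time; return the number of rows kept."""
    kept = 0
    for line in iter_mass_transfer_rows(src, mt_column):
        dst.write(line)
        kept += 1
    return kept


# Tiny in-memory stand-in for a STARTRACK file (made-up values, mt in column 2).
demo = io.StringIO("1 10.5 1\n2 11.0 0\n3 9.7 1\n")
out = io.StringIO()
n_kept = filter_file(demo, out, mt_column=2)
```

With real data, src and dst would be file objects opened with open(); only one
line is held in memory at a time.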


General Thoughts About the state of the code
--------------------------------------------
    - Main population is still stored as CSV; might want to move it to SQL.
    - Literally 0 unit tests.
    - One big file is just way too much.
    - All the git commits are about 1 word long, e.g. 'updates'.
    - Lots of redundant files (see src/old).
    - The SQL stuff could definitely be a lot better.
    - Because the code is split between C and Python,
      it's pretty hard to parallelise.


The three main files for the simulation are the following:
1.  src/populations.py
2.  src/results_processor.py
3.  src/ulxlc/ulxlcmod.c

==============
populations.py
==============
This file contains code for reading in the processed input data from STARTRACK,
calculating secondary quantities, and various functions for calculating
sampling weights and sub-populations.

Currently almost all the code is held in one class called 'Population'.
This is a poor design: the class does not correspond to a single
population as you would expect, but also contains all possible
sub-populations within it as various Pandas dataframes. This is highly
confusing and likely not a good idea.


Improvements
------------
- Old unused population functions at the start
    Population Class
    ----------------
    - Calc columns has a lot.
    - calc_sub_populations() selects the rows that are above the ULX luminosity
        - df_ulx should probably be its own object/population,
          along with the other sub-populations
    - A unique binary system corresponds to a unique combination of the
      columns ['iidum_run', 'iidd_old'] and contains rows specifying its
      time evolution; maybe these could be separated into their own
      'System' class?
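
As a rough illustration of the 'System' idea, the sketch below groups a
dataframe by the two key columns named above and wraps each group in a small
class. Only 'iidum_run' and 'iidd_old' come from this README; the 'time' and
'Lx' column names, the made-up rows, and the 1e39 erg/s threshold (the
conventional ULX luminosity) are assumptions made for the example.

```python
import pandas as pd

SYSTEM_KEYS = ["iidum_run", "iidd_old"]  # key columns named in this README
L_ULX = 1e39  # erg/s; conventional ULX threshold, assumed for this sketch

# Made-up rows; 'time' and 'Lx' are hypothetical column names.
df = pd.DataFrame({
    "iidum_run": [1, 1, 1, 2],
    "iidd_old":  [7, 7, 8, 7],
    "time":      [1.0, 0.0, 0.0, 0.0],
    "Lx":        [2e39, 5e38, 1e37, 3e39],
})


class System:
    """Time evolution of one binary, i.e. all rows sharing the same key."""

    def __init__(self, key, rows):
        self.key = key
        self.rows = rows.sort_values("time")  # keep the evolution in time order

    def is_ever_ulx(self, threshold=L_ULX):
        """True if the system exceeds the threshold at any time step."""
        return bool((self.rows["Lx"] > threshold).any())


# One System per unique (iidum_run, iidd_old) pair.
systems = {key: System(key, rows) for key, rows in df.groupby(SYSTEM_KEYS)}

# Keys of the systems that are a ULX at some point in their evolution.
ulx_keys = sorted(key for key, system in systems.items() if system.is_ever_ulx())
```

Keeping the ULX selection as a method on each system, rather than as an extra
dataframe stored on 'Population', is one way to stop the main class from
holding every sub-population at once.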


====================
results_processor.py
====================
One would assume that this file contains code for processing results; however,
it also, for some reason, contains literally all the other simulation code.

The main class 'ResultsProcessor' includes functions for:
    - Creating, reading and writing several SQLite tables
    - Performing maps, joins and various other transformations
      of these tables outside of SQLite using Pandas
    - Running the Monte-Carlo routine via subprocess.run() calls to
      the modified ULXLC written in C
    - All the plotting code
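
For reference, the SQLite/Pandas round trip described in the list above can be
sketched as follows. The table and column names are hypothetical and the data
are made up; the sketch also shows that a join done inside SQLite gives the
same rows as a Pandas merge, which may be one way to tidy up the SQL side.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")  # in-memory database for the demo

# Hypothetical tables and columns, for illustration only.
systems = pd.DataFrame({"system_id": [1, 2, 3], "Lx": [2e39, 5e38, 4e39]})
runs = pd.DataFrame({"system_id": [1, 3], "n_alive": [12, 30]})

# Write both dataframes to SQLite tables.
systems.to_sql("systems", conn, index=False, if_exists="replace")
runs.to_sql("runs", conn, index=False, if_exists="replace")

# Join inside SQLite and read the result back as a dataframe.
joined = pd.read_sql_query(
    "SELECT s.system_id, s.Lx, r.n_alive "
    "FROM systems AS s JOIN runs AS r ON r.system_id = s.system_id "
    "ORDER BY s.system_id",
    conn,
)
conn.close()

# The equivalent join done outside SQLite with Pandas.
merged = systems.merge(runs, on="system_id").sort_values("system_id")
```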

The other two classes 'Ulx' and 'ErassTransientSampler' are used
for sampling 


Improvements
------------
- Too many similar functions, e.g. the "table_" functions, make things way too
convoluted and repetitive.
    - I think refactoring these could be useful.
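
One possible direction for that refactoring is to replace the per-table
functions with a couple of generic helpers driven by a small registry. This is
only a sketch: the table names and columns below are invented for the example
and are not the tables used in results_processor.py.

```python
import sqlite3

# Hypothetical registry: table name -> column definitions.
TABLES = {
    "transient": "system_id INTEGER, n_transient INTEGER",
    "classifications": "system_id INTEGER, label TEXT",
}


def _check(name):
    # Table names cannot be bound as SQL parameters, so whitelist them instead.
    if name not in TABLES:
        raise KeyError(f"unknown table: {name}")


def create_table(conn, name):
    """Create one registered table, instead of one function per table."""
    _check(name)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({TABLES[name]})")


def insert_rows(conn, name, rows):
    """Insert an iterable of equal-length tuples; return how many were inserted."""
    _check(name)
    rows = list(rows)
    if not rows:
        return 0
    placeholders = ", ".join("?" * len(rows[0]))
    conn.executemany(f"INSERT INTO {name} VALUES ({placeholders})", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
for table_name in TABLES:
    create_table(conn, table_name)
n_inserted = insert_rows(conn, "transient", [(1, 4), (2, 0)])
n_stored = conn.execute("SELECT COUNT(*) FROM transient").fetchone()[0]
```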

================
ulxlc/ulxlcmod.c
================
This file contains the code for ULXLC. In addition, its main() contains
analysis that I have written for generating lightcurves for
a specified population, as well as for analysing several properties in C.

Currently, this file is used by being called from
'results_processor.py' using subprocess.run(), with several parameters
passed via the command line (e.g. "./a.out param1 param2"). The passed
parameters tell the program which rows to retrieve from the SQLite database.
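
The calling pattern described above can be sketched like this. The helper
names and the meaning of the arguments are hypothetical; in the demo the
Python interpreter stands in for the compiled C program so that the example
runs without ./a.out being built.

```python
import subprocess
import sys


def build_command(base_cmd, row_ids):
    """Return the argument list: the base command followed by one string per id."""
    return [*base_cmd, *(str(row_id) for row_id in row_ids)]


def run_simulation(base_cmd, row_ids, timeout=60):
    """Run the program once and return its standard output as text."""
    result = subprocess.run(
        build_command(base_cmd, row_ids),
        capture_output=True,  # collect stdout and stderr
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,      # seconds before the child process is killed
    )
    return result.stdout


# Stand-in for the compiled binary: a one-liner that echoes its arguments.
stand_in = [sys.executable, "-c", "import sys; print(' '.join(sys.argv[1:]))"]
output = run_simulation(stand_in, [3, 7])

# With the real binary the call would look like:
#     run_simulation(["./a.out"], [3, 7])
```

Passing the arguments as a list, rather than as one shell string, avoids
quoting problems, and check=True makes a crash in the C code visible on the
Python side.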


Improvements
------------
- There are almost no functions in the C code I have written; it's just one
huge block.