stadelmanma / netl-ap-map-flow
A fracture flow modeling package utilizing a modified local cubic law approach with OpenFoam and ParaView compatibility.
License: GNU General Public License v3.0
I have a very common pattern in my Python code where I check whether a file already exists at a given path, raise an exception if it does (unless overwriting is allowed), and otherwise create all of the parent directories. Factoring this out will help DRY out my code and enforce consistent file-creation logic throughout the code base.
I could call it prepare_output_path, possible code below, and it would go in the __core__.py file.
import os

def prepare_output_path(filename, overwrite=False, err_msg='{} file already exists'):
    r"""
    Checks if a file already exists and should not be overwritten, creating
    intermediate directories as needed if the file will be created.
    """
    if os.path.exists(filename) and not overwrite:
        raise FileExistsError(err_msg.format(filename))
    #
    # create any missing intermediate directories
    os.makedirs(os.path.split(filename)[0], exist_ok=True)
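A minimal usage sketch of how call sites would use it (save_data and its arguments are hypothetical, just to show the pattern):

def save_data(data, path, overwrite=False):
    prepare_output_path(path, overwrite=overwrite)
    with open(path, 'w') as fh:
        fh.write(data)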
These are both relatively independent of the main program, so I want to test them on their own.
I want to test the numerical validity of the solver module using a small coefficient matrix and various BCs, and do similar tests for the map module.
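A sketch of the kind of solver test I have in mind; spsolve stands in here for whatever the solver module actually exposes, and the matrix and expected values are made up:

import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import spsolve

def test_solver_small_system():
    # tiny coefficient matrix with a known solution; boundary conditions would
    # enter through the right-hand side vector in the real test
    A = sps.csr_matrix(np.array([[ 4.0, -1.0,  0.0],
                                 [-1.0,  4.0, -1.0],
                                 [ 0.0, -1.0,  4.0]]))
    x_expected = np.array([1.0, 2.0, 3.0])
    b = A @ x_expected
    x = spsolve(A, b)  # placeholder for the project's own solver call
    assert np.allclose(x, x_expected)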
This is a simple change and will offer a significant speed-up, because creating a large DataField carries a significant amount of extra overhead we don't need here.
The change will not affect the external API at all.
This will entail documenting all arguments, the return types and possibly an example.
It would be good to implement something using Read the Docs and an automatic documentation builder like Sphinx. OpenPNM likely has a lot of the logic I would need already figured out. Also, Sphinx has examples and links to projects that use their system.
I would want to go through and thoroughly document each function's parameters and purpose and provide an example. I figure there is a PEP standard or something I can use as a guideline to enforce consistency.
Ideally I could incorporate my current examples as well somehow.
I'll need to make them an empty string if the program is being run in cmd.exe. They would make things look a lot prettier though.
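Assuming these are decorative non-ASCII characters (my reading of it), one sketch would be to test whether stdout can actually encode a character rather than sniffing for cmd.exe directly (safe_char is a hypothetical helper):

import sys

def safe_char(char, fallback=''):
    # keep the character only if the current console encoding can print it;
    # cmd.exe typically uses a non-UTF-8 code page, so it falls back to ''
    try:
        char.encode(sys.stdout.encoding or 'ascii')
        return char
    except (UnicodeEncodeError, LookupError):
        return fallback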
Looking through coverage reports made me realize I never check the validity of the output units chosen until the very end of the simulation. I know I would find it extremely annoying if I waited on results for 15 minutes or even longer on a very large map just to have it die because of that.
This will involve changing both the model code that reads the input file and the InputFile class in the module, as well as any related docs.
The primary advantage is that the parameter names become exactly the same as the text before the colon, instead of the "first field prior to any delimiters". Additionally, I can just use yaml.load to get all my data (granted, I'll want it to be an ordered load) instead of using a custom parser. Then I could just loop over the dict items to generate my actual ArgInput objects.
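A rough sketch of the ordered load plus the generation loop (the data filename is a placeholder and ArgInput's real constructor may differ):

import yaml
from collections import OrderedDict

def ordered_load(stream, Loader=yaml.SafeLoader):
    # construct mappings as OrderedDict so parameter order is preserved
    class OrderedLoader(Loader):
        pass
    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        lambda loader, node: OrderedDict(loader.construct_pairs(node)))
    return yaml.load(stream, OrderedLoader)

with open('model_inputs.yaml') as fh:
    params = ordered_load(fh)

# ArgInput(name, value) is assumed here; the real signature may differ
arg_inputs = {name: ArgInput(name, value) for name, value in params.items()}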
This is a more logical way to think about adding tortuosity into the simulations. Additionally, tracing the fracture's lower surface in the image only produces a different result than tracking the mid-surface when there are significant bifurcations. In the presence of significant bifurcations that can alter flow (i.e. not a dead-end splay), the simple aperture map representation is automatically a poor representation anyway.
Using a (mean, geometric, or harmonic) averaged mid-surface location also allows direct implementation of the tortuosity correction presented in Brush, Thompson 2003 in addition to the Stokes Tapered Plate correction already in place. Lastly, an average is less sensitive to stray pixels or the aforementioned splay fractures than explicitly saving the lowest voxel.
Additionally, I would like to add three "modes" to offset map generation:
Another good route to smooth out sharp transitions would be to add a group=boolean parameter to the create_offset_map function where, if True, the value of a single offset is augmented by the surrounding 8 columns. I'll need to determine how I want to handle regions of zero aperture where no offset technically exists. I'll also only want to consider 'real' columns when doing any averaging; I don't want to calculate my averages based on previous averages, although I could use averages to smooth the edges around zero-aperture regions.
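A rough sketch of the grouping behaviour, assuming a plain mean over the 3x3 neighbourhood (the function name and arguments are illustrative, not the real create_offset_map signature):

import numpy as np

def smooth_offsets(offset_map, aperture_map):
    # average each offset with its surrounding 8 columns, reading only the
    # original values and skipping zero-aperture columns entirely
    smoothed = offset_map.astype(float).copy()
    nonzero = aperture_map > 0
    rows, cols = offset_map.shape
    for i, j in zip(*np.where(nonzero)):
        i0, i1 = max(i - 1, 0), min(i + 2, rows)
        j0, j1 = max(j - 1, 0), min(j + 2, cols)
        window = offset_map[i0:i1, j0:j1]
        mask = nonzero[i0:i1, j0:j1]
        smoothed[i, j] = window[mask].mean()
    return smoothed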
Turns out that module is being deprecated, so it seems I'll want to switch over to importlib.
I have some odd and overly complex logic in play in UnitDecomposition as well as some extra stuff in some of the conversion classes that doesn't need to be there.
If I go the pint route then I'll need to update my instructions appropriately. If I require people to use pip install --user -r requirements.txt then I'll need to split my requirements file into conda_requirements.txt and pip_requirements.txt, since I don't want pip trying to install stuff like numpy on Windows.
Right now it just prints out that the input file was saved; it would be good to transition to logging. Make that message a debug message and then add additional info and debug messages to track progress.
Tracking wall clock time and total CPU time would also be cool, i.e. did 50 simulations in 25 minutes using 20 cores, with total CPU time over all cores being 500 minutes.
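A rough sketch of the logging plus timing I have in mind (the logger name, helper names, and message wording are placeholders; for multi-process runs the per-worker CPU times would need to be summed rather than relying on process_time alone):

import logging
import time

logger = logging.getLogger('ap_map_flow')

def run_simulations(cases):
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for i, case in enumerate(cases, 1):
        logger.debug('starting simulation %d of %d', i, len(cases))
        run_single_case(case)  # placeholder for the actual model invocation
        logger.info('finished simulation %d of %d', i, len(cases))
    wall_min = (time.perf_counter() - wall_start) / 60.0
    cpu_min = (time.process_time() - cpu_start) / 60.0
    logger.info('ran %d simulations in %.1f minutes (%.1f CPU-minutes)',
                len(cases), wall_min, cpu_min)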
I could provide guidance on how to get Anaconda working unless there is a more formal guide somewhere else.
This will entail documenting every arg, return type and adding an example for public APIs
https://guides.github.com/activities/citable-code/
To do this you have to make a release, so it will be the last stage. Then I'll add the reference to my thesis.
In terms of parsing, the YAML files are superior and there is no reason to keep the StatFile class hanging around. It is just a maintenance burden.
I need to rework these because they are just too complex or too specialized to work with. I'll remove the init-case-dir script from standard use and put it with some other misc scripts since it is both overly complex and very specialized.
apm-parallel-mesh-gen just needs its CLI tweaked a little, but in general it is good to go, I think.
open-foam-export needs to be greatly simplified so it works using a standard set of defaults, pulling everything it needs from the model input file. This can include commented-out directives to set the number of cores, end time, etc. Ideally I could formulate a 'general' method to update any parameter using a directive, not just ones I expect. A possible formulation in the input file could be something like #system.controlDict.end_time: 750, where the code splits the key based on . and first checks for a file named the same as the given segment or a property on the current OpenFoam object. The last segment is what will get set to the value after the colon.
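A rough sketch of that directive parsing (names are illustrative, not the real open-foam-export internals):

def parse_directive(line):
    # e.g. '#system.controlDict.end_time: 750'
    key, value = line.lstrip('#').split(':', 1)
    *path, param = key.strip().split('.')
    return path, param, value.strip()

path, param, value = parse_directive('#system.controlDict.end_time: 750')
# path == ['system', 'controlDict'], param == 'end_time', value == '750';
# the caller would walk `path`, checking for a file or property named after each
# segment on the current OpenFoam object, then set `param` to `value`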
According to what I have read this is the preferred method since it will handle Windows gracefully. This also gets around the issue of my scripts being copied and installed even in development mode.
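If this is referring to setuptools console_scripts entry points (my reading of it), the setup() stanza would look roughly like the following; the package and module paths below are placeholders:

from setuptools import setup, find_packages

setup(
    name='netl-ap-map-flow',
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            # 'command = package.module:function'; the path here is a placeholder
            'apm-bulk-run = apmapflow.scripts.apm_bulk_run:main',
        ],
    },
)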
Could be useful and possibly not too difficult because there are only so many possible program execution paths. Just not sure how exactly to set it up.
Need to add some tests that use git diff --no-index on the current outputs against a set of static 'ground-truth' files to ensure consistent model outputs as well as numerical validity (if it's numerically valid now, it should be numerically valid tomorrow as long as the output doesn't change).
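A sketch of such a test (run_model and the ground-truth directory are placeholders for the real entry point and fixture location):

import subprocess

def test_outputs_match_ground_truth(tmp_path):
    run_model(output_dir=str(tmp_path))  # placeholder for the real model call
    result = subprocess.run(
        ['git', 'diff', '--no-index', 'tests/ground_truth', str(tmp_path)],
        capture_output=True, text=True)
    # git diff --no-index exits non-zero when the two trees differ
    assert result.returncode == 0, result.stdout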
I might be able to do it through conda-forge https://conda-forge.github.io/
The hydraulic aperture can be relatively easily back-calculated from the transmissivity, but given the simple calculation I really should just calculate it within the model to make direct comparisons easier.
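For reference, a minimal sketch assuming the standard cubic-law definition where transmissivity T = b**3 / 12 (if the model folds viscosity or gravity into T, the constant changes accordingly):

def hydraulic_aperture(transmissivity):
    # invert T = b**3 / 12 to recover the equivalent hydraulic aperture
    return (12.0 * transmissivity) ** (1.0 / 3.0)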
Image -> Show Info
File -> Save As -> Raw Data
Initial class layout
Can't inherit from PIL.Image because that will put me in an awkward situation when I need to manipulate the data.

class FractureImageStack:

    def __init__(self, image_filename, invert=False):
        r""" Reads image data; fracture is assumed to be truthy (i.e. > 0) and rock falsy (i.e. == 0) """
        # pull this code from the load_image_data function in the gen ap map script
        # allow instantiation from just image data

    def invert(self):
        r""" Flips the image data map """

    def create_aperture_map(self, axis=1, dtype=int):
        r""" Flattens the 3-D image data along the specified axis using scipy.sum """
        return data_map  # data_map will be a 2-D ndarray

    @staticmethod
    def save_image_stack(filename, image, format=None, overwrite=False, **kwargs):
        r""" Saves a multi-frame image from the supplied image data, using Image.save
        from the PIL library """
        # https://github.com/python-pillow/Pillow/blob/master/PIL/Image.py#L1651
        # need to figure out the best way to take advantage of im.save when just using data
        # writing to a temporary BytesIO object might be the only way unless I can create an image from data
        # https://github.com/python-pillow/Pillow/blob/master/Tests/test_file_tiff.py#L464
        im.save(filename, format=format, save_all=True, **kwargs)

    def save(self, filename, overwrite=False, **kwargs):
        r""" Saves a TIFF stack under the desired filename; PIL must be >= 4.0.0 """
        # check PIL version
        self.save_image_stack(filename, self.image_data, format='TIFF', overwrite=overwrite, **kwargs)

    def export_vtk(self, filename, cell_data=None, overwrite=False, **kwargs):
        r""" Exports an ASCII legacy format VTK file to be read by ParaView.
        The X, Y, Z coordinates are inferred from the image data dimensions.
        Cell data is a list of ndarrays and must be the same length as the number
        of voxels in the fracture. """
Possibly add in some methods found in generate_offset_map; they might be useful for VTK generation:
* locate_nonzero_data
* generate_index_map
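A hypothetical usage example for the layout above (the filenames are placeholders):

stack = FractureImageStack('fracture-stack.tif')
aperture_map = stack.create_aperture_map(axis=1)
stack.save('copy-of-stack.tif', overwrite=True)
stack.export_vtk('fracture-stack.vtk')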
This will largely replace the code in generate_aperture_map, as well as several methods in generate_offset_map, although I'll need to be considerably more memory-conscious there and maybe kill the class instance at times. The only method that will be replaced in fracture_df is load_image_data.
This will essentially involve making everything lowercase and purging the usage of __Name__.py modules.
This will entail documenting every arg, return type and adding an example for public APIs. I may want to do this after #85
This goes a long way towards both convenience and easier babun compatibility, and it's more portable than trying to symlink the files.
Rewrite this into a python3 script:
#!/bin/bash
DIR="$(pwd)"
for script in "$DIR"/scripts/apm_*; do
    # pulling the filename out of the path, the runner needs no extension
    runner=${script##*/}
    runner=${runner%.*}
    file=$(python -c "import os; print(os.path.abspath('$script'), end='')")
    # creating the runner file and setting it as executable
    echo "Creating runner for script $runner.py"
    touch ~/bin/"$runner"
    chmod ug+x ~/bin/"$runner"
    # echoing content into the file
    echo "#!/bin/bash" > ~/bin/"$runner"
    echo "python3 \"$file\" \"\$@\"" >> ~/bin/"$runner"
done
echo "Added all scripts to ~/bin"
I'll also need to remove the NC parameter in R_JUST and update the unit tests as well as code usages to cover the changes to those subroutines. I should change the unit tests first, that way I can assess whether they still properly cover the module.
Solid integration tests still need to be created to test the files in the 'scripts' directory. There may be some other areas integration tests could be useful but that is the primary need.
This would be an addition to the cluster processing script where, after cluster removal, I calculate the shortest inlet-to-outlet path length through every voxel, then calculate the standard deviation of all the path lengths and remove voxels whose paths are more than N sigma above the mean.
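A conceptual sketch of that filtering step (a pure-Python BFS, so it is illustrative rather than performant; treating the first and last planes along an axis as inlet and outlet is an assumption):

import numpy as np
from collections import deque

def _geodesic_distance(fracture, seeds):
    # BFS distance (in voxels) from the seed voxels through the fracture space
    dist = np.full(fracture.shape, -1, dtype=int)
    queue = deque(seeds)
    for idx in seeds:
        dist[idx] = 0
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in offsets:
            nbr = (x + dx, y + dy, z + dz)
            if all(0 <= n < s for n, s in zip(nbr, fracture.shape)) \
                    and fracture[nbr] and dist[nbr] < 0:
                dist[nbr] = dist[x, y, z] + 1
                queue.append(nbr)
    return dist

def remove_long_path_voxels(fracture, n_sigma=3, axis=0):
    voxels = list(zip(*np.where(fracture)))
    inlet = [idx for idx in voxels if idx[axis] == 0]
    outlet = [idx for idx in voxels if idx[axis] == fracture.shape[axis] - 1]
    d_in = _geodesic_distance(fracture, inlet)
    d_out = _geodesic_distance(fracture, outlet)
    # shortest inlet-to-outlet path length constrained to pass through each voxel
    path_len = np.where((d_in >= 0) & (d_out >= 0), d_in + d_out, -1)
    valid = path_len[path_len >= 0]
    cutoff = valid.mean() + n_sigma * valid.std()
    cleaned = fracture.copy()
    cleaned[path_len > cutoff] = False
    return cleaned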
At the least, simply move the create_offset_map function from apm_process_image_stack.py into the FractureImageStack class to complement the create_aperture_map method. The patch_holes and filter_high_gradients functions will remain in apm_process_image_stack.py.
Moved the new features into #58
This would allow adding and adjusting unit conversions without recompiling the source code. Not sure of the best way to do it yet, though. Maybe just use two vectors: one storing the "key" of the unit (e.g. "m" for "meters") and the other storing the to-SI conversion factor. Currently the unit conversion module implies dimensionality checks; this change would remove them unless a third vector is added that specifies the DMT (distance, mass, time) exponents. The DMT exponents could be left as a string so things like 001 aren't lost.
Voxel size could still be used as an input and just appended onto the end of the unit vectors.
Temperature would still need its own conversion.
I really should add these to test coverage to make sure they work as intended, since not having them tested has already bitten me a few times. Thorough testing wouldn't be required; just a basic integration test that covers as much of the main execution path as possible and then checks that the expected files were created.
This is just another small adjustment I should have made a while back but didn't.
As the title says. I think it is because the DX and DZ values are being converted but not the point map. In addition, I could refactor my code to generate VTK points given any point map and set of DX, DZ values.
By storing the entire file in a 2-D character array and looping over it, I can remove the order dependence when reading in filenames. All commented-out lines will still be ignored and not stored in the array.
Since I use the same makefile for everything, I think I will add a Windows clean target that preserves the actual dist/ directory, or at least re-creates it after deletion; currently it is impossible to do a debug build on Windows because the clean step can't re-create the dist directory. Alternatively, I could make it work by adding a Windows target that creates the dist directory as well.