stadelmanma / netl-ap-map-flow
A fracture flow modeling package utilizing a modified local cubic law approach with OpenFoam and ParaView compatibility.
License: GNU General Public License v3.0
I have a very common pattern in my Python code where I check whether a file already exists at a given path, raise an exception if it does (unless overwriting is allowed), and otherwise create all of the parent directories. Factoring this out will help DRY out my code and enforce consistent file-creation logic throughout the code base.
I could call it prepare_output_path, possible code below, and it would go in the __core__.py file.
import os

def prepare_output_path(filename, overwrite=False, err_msg='{} file already exists'):
    r"""
    Checks if a file already exists and should not be overwritten, creating
    intermediate directories as needed if the file will be created.
    """
    if os.path.exists(filename) and not overwrite:
        raise FileExistsError(err_msg.format(filename))
    #
    # create any missing intermediate directories
    os.makedirs(os.path.split(filename)[0], exist_ok=True)
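A minimal usage sketch of how call sites would use it (save_data and its arguments are hypothetical, just to show the pattern):

def save_data(data, path, overwrite=False):
    prepare_output_path(path, overwrite=overwrite)
    with open(path, 'w') as fh:
        fh.write(data)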
These are both relatively independent of the main program, so I want to test them on their own.
I want to test the numerical validity of the solver module using a small coefficient matrix and various BCs, and do similar tests for the map module.
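A sketch of the kind of solver test I have in mind; spsolve stands in here for whatever the solver module actually exposes, and the matrix and expected values are made up:

import numpy as np
import scipy.sparse as sps
from scipy.sparse.linalg import spsolve

def test_solver_small_system():
    # tiny coefficient matrix with a known solution; boundary conditions would
    # enter through the right-hand side vector in the real test
    A = sps.csr_matrix(np.array([[ 4.0, -1.0,  0.0],
                                 [-1.0,  4.0, -1.0],
                                 [ 0.0, -1.0,  4.0]]))
    x_expected = np.array([1.0, 2.0, 3.0])
    b = A @ x_expected
    x = spsolve(A, b)  # placeholder for the project's own solver call
    assert np.allclose(x, x_expected)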
This is a simple change and will offer a significant speed-up, because creating a large DataField carries a significant amount of extra overhead we don't need here.
The change will not affect the external API at all.
This will entail documenting all arguments, the return types and possibly an example.
It would be good to implement something using Read the Docs and an automatic documentation builder like Sphinx. OpenPNM likely has a lot of the logic I would need already figured out. Also, Sphinx has examples and links to projects that use their system.
I would want to go through and thoroughly document each function's parameters and purpose and provide an example. I figure there is a PEP standard or something I can use as a guideline to enforce consistency.
Ideally I could incorporate my current examples as well somehow.
I'll need to make them an empty string if the program is being run in cmd.exe. They would make things look a lot prettier though.
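Assuming these are decorative non-ASCII characters (my reading of it), one sketch would be to test whether stdout can actually encode a character rather than sniffing for cmd.exe directly (safe_char is a hypothetical helper):

import sys

def safe_char(char, fallback=''):
    # keep the character only if the current console encoding can print it;
    # cmd.exe typically uses a non-UTF-8 code page, so it falls back to ''
    try:
        char.encode(sys.stdout.encoding or 'ascii')
        return char
    except (UnicodeEncodeError, LookupError):
        return fallback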
Looking through coverage reports made me realize I never check the validity of the output units chosen until the very end of the simulation. I know I would find it extremely annoying if I waited on results for 15 minutes or even longer on a very large map just to have it die because of that.
This will involve changing both the model code that reads the input file and the InputFile class in the module, as well as any related docs.
The primary advantage is that the parameter names become exactly the same as the text before the colon, instead of the "first field prior to any delimiters". Additionally, I can just use yaml.load to get all my data (granted, I'll want it to be an ordered load) instead of using a custom parser. Then I could just loop over the dict items to generate my actual ArgInput objects.
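A rough sketch of the ordered load plus the generation loop (the data filename is a placeholder and ArgInput's real constructor may differ):

import yaml
from collections import OrderedDict

def ordered_load(stream, Loader=yaml.SafeLoader):
    # construct mappings as OrderedDict so parameter order is preserved
    class OrderedLoader(Loader):
        pass
    OrderedLoader.add_constructor(
        yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG,
        lambda loader, node: OrderedDict(loader.construct_pairs(node)))
    return yaml.load(stream, OrderedLoader)

with open('model_inputs.yaml') as fh:
    params = ordered_load(fh)

# ArgInput(name, value) is assumed here; the real signature may differ
arg_inputs = {name: ArgInput(name, value) for name, value in params.items()}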
This is a more logical way to think about adding tortuosity into the simulations. Additionally, tracing the fracture's lower surface in the image only produces a different result than tracking the mid-surface when there are significant bifurcations. In the presence of significant bifurcations that can alter flow (i.e. not a dead-end splay), the simple aperture map representation is automatically a poor representation anyway.
Using a (mean, geometric, or harmonic) averaged mid-surface location also allows direct implementation of the tortuosity correction presented in Brush, Thompson 2003 in addition to the Stokes Tapered Plate correction already in place. Lastly, an average is less sensitive to stray pixels or the aforementioned splay fractures than explicitly saving the lowest voxel.
Additionally, I would like to add three "modes" to offset map generation:
Another good route to smooth out sharp transitions would be to add a group=boolean parameter to the create_offset_map function where, if True, the value of a single offset is augmented by the surrounding 8 columns. I'll need to determine how I want to handle regions of zero aperture where no offset technically exists. I'll also only want to consider 'real' columns when doing any averaging; I don't want to calculate my averages based on previous averages, although I could use averages to smooth the edges around zero-aperture regions.
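A rough sketch of the grouping behaviour, assuming a plain mean over the 3x3 neighbourhood (the function name and arguments are illustrative, not the real create_offset_map signature):

import numpy as np

def smooth_offsets(offset_map, aperture_map):
    # average each offset with its surrounding 8 columns, reading only the
    # original values and skipping zero-aperture columns entirely
    smoothed = offset_map.astype(float).copy()
    nonzero = aperture_map > 0
    rows, cols = offset_map.shape
    for i, j in zip(*np.where(nonzero)):
        i0, i1 = max(i - 1, 0), min(i + 2, rows)
        j0, j1 = max(j - 1, 0), min(j + 2, cols)
        window = offset_map[i0:i1, j0:j1]
        mask = nonzero[i0:i1, j0:j1]
        smoothed[i, j] = window[mask].mean()
    return smoothed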
Turns out that module is being deprecated, so it seems I'll want to switch over to importlib.
I have some odd and overly complex logic in play in UnitDecomposition as well as some extra stuff in some of the conversion classes that doesn't need to be there.
If I go the pint route then I'll need to update my instructions appropriately. If I require people to use pip install --user -r requirements.txt then I'll need to split my requirements file into conda_requirements.txt and pip_requirements.txt, since I don't want pip trying to install stuff like numpy on Windows.
Right now it just prints out that the input file was saved; it would be good to transition to logging. Make that message a debug message and then add additional info and debug messages to track progress.
Tracking wall clock time and total CPU time would also be cool, i.e. did 50 simulations in 25 minutes using 20 cores, with total CPU time over all cores being 500 minutes.
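A rough sketch of the logging plus timing I have in mind (the logger name, helper names, and message wording are placeholders; for multi-process runs the per-worker CPU times would need to be summed rather than relying on process_time alone):

import logging
import time

logger = logging.getLogger('ap_map_flow')

def run_simulations(cases):
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for i, case in enumerate(cases, 1):
        logger.debug('starting simulation %d of %d', i, len(cases))
        run_single_case(case)  # placeholder for the actual model invocation
        logger.info('finished simulation %d of %d', i, len(cases))
    wall_min = (time.perf_counter() - wall_start) / 60.0
    cpu_min = (time.process_time() - cpu_start) / 60.0
    logger.info('ran %d simulations in %.1f minutes (%.1f CPU-minutes)',
                len(cases), wall_min, cpu_min)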
I could provide guidance on how to get Anaconda working unless there is a more formal guide somewhere else.
This will entail documenting every arg, return type and adding an example for public APIs
https://guides.github.com/activities/citable-code/
To do this you have to make a release, so it will be the last stage. Then I'll add the reference to my thesis.
In terms of parsing, the YAML files are superior and there is no reason to keep the StatFile class hanging around. It is just a maintenance burden.
I need to rework these because they are just too complex or too specialized to work with. I'll remove the init-case-dir script from standard use and put it with some other misc scripts since it is both overly complex and very specialized.
apm-parallel-mesh-gen just needs its CLI tweaked a little, but in general it is good to go, I think.
open-foam-export needs to be greatly simplified so it works using a standard set of defaults, pulling everything it needs from the model input file. This can include commented-out directives to set the number of cores, end time, etc. Ideally I could formulate a 'general' method to update any parameter using a directive, not just ones I expect. A possible formulation in the input file could be something like #system.controlDict.end_time: 750, where the code splits the key based on . and first checks for a file named the same as the given segment or a property on the current OpenFoam object. The last segment is what will get set to the value after the colon.
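A rough sketch of that directive parsing (names are illustrative, not the real open-foam-export internals):

def parse_directive(line):
    # e.g. '#system.controlDict.end_time: 750'
    key, value = line.lstrip('#').split(':', 1)
    *path, param = key.strip().split('.')
    return path, param, value.strip()

path, param, value = parse_directive('#system.controlDict.end_time: 750')
# path == ['system', 'controlDict'], param == 'end_time', value == '750';
# the caller would walk `path`, checking for a file or property named after each
# segment on the current OpenFoam object, then set `param` to `value`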
According to what I have read this is the preferred method since it will handle Windows gracefully. This also gets around the issue of my scripts being copied and installed even in development mode.
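If this is referring to setuptools console_scripts entry points (my reading of it), the setup() stanza would look roughly like the following; the package and module paths below are placeholders:

from setuptools import setup, find_packages

setup(
    name='netl-ap-map-flow',
    packages=find_packages(),
    entry_points={
        'console_scripts': [
            # 'command = package.module:function'; the path here is a placeholder
            'apm-bulk-run = apmapflow.scripts.apm_bulk_run:main',
        ],
    },
)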
Could be useful and possibly not too difficult because there are only so many possible program execution paths. Just not sure how exactly to set it up.
Need to add some tests that use git diff --no-index on the current outputs against a set of static 'ground-truth' files to ensure consistent model outputs as well as numerical validity (if it's numerically valid now, it should be numerically valid tomorrow as long as the output doesn't change).
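A sketch of such a test (run_model and the ground-truth directory are placeholders for the real entry point and fixture location):

import subprocess

def test_outputs_match_ground_truth(tmp_path):
    run_model(output_dir=str(tmp_path))  # placeholder for the real model call
    result = subprocess.run(
        ['git', 'diff', '--no-index', 'tests/ground_truth', str(tmp_path)],
        capture_output=True, text=True)
    # git diff --no-index exits non-zero when the two trees differ
    assert result.returncode == 0, result.stdout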
I might be able to do it through conda-forge https://conda-forge.github.io/
The hydraulic aperture can be relatively easily back-calculated from the transmissivity, but given the simple calculation I really should just calculate it within the model to make direct comparisons easier.
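For reference, a minimal sketch assuming the standard cubic-law definition where transmissivity T = b**3 / 12 (if the model folds viscosity or gravity into T, the constant changes accordingly):

def hydraulic_aperture(transmissivity):
    # invert T = b**3 / 12 to recover the equivalent hydraulic aperture
    return (12.0 * transmissivity) ** (1.0 / 3.0)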
Image -> Show Info
File -> Save As -> Raw Data
Initial class layout
Can't inherit from PIL.Image because that will put me in an awkward situation when I need to manipulate the data.

class FractureImageStack:

    def __init__(self, image_filename, invert=False):
        r""" Reads image data; fracture is assumed to be truthy (i.e. > 0) and rock falsy (i.e. == 0) """
        # pull this code from the load_image_data function in the gen ap map script
        # allow instantiation from just image data

    def invert(self):
        r""" Flips the image data map """

    def create_aperture_map(self, axis=1, dtype=int):
        r""" Flattens the 3-D image data along the specified axis using scipy.sum """
        return data_map  # data_map will be a 2-D ndarray

    @staticmethod
    def save_image_stack(filename, image, format=None, overwrite=False, **kwargs):
        r""" Saves a multi-frame image from the supplied image data, using Image.save
        from the PIL library """
        # https://github.com/python-pillow/Pillow/blob/master/PIL/Image.py#L1651
        # need to figure out the best way to take advantage of im.save when just using data
        # writing to a temporary BytesIO object might be the only way unless I can create an image from data
        # https://github.com/python-pillow/Pillow/blob/master/Tests/test_file_tiff.py#L464
        im.save(filename, format=format, save_all=True, **kwargs)

    def save(self, filename, overwrite=False, **kwargs):
        r""" Saves a TIFF stack under the desired filename; PIL must be >= 4.0.0 """
        # check PIL version
        self.save_image_stack(filename, self.image_data, format='TIFF', overwrite=overwrite, **kwargs)

    def export_vtk(self, filename, cell_data=None, overwrite=False, **kwargs):
        r""" Exports an ASCII legacy format VTK file to be read by ParaView.
        The X, Y, Z coordinates are inferred from the image data dimensions.
        Cell data is a list of ndarrays and must be the same length as the number
        of voxels in the fracture. """
Possibly add in some methods found in generate_offset_map; they might be useful for VTK generation:
* locate_nonzero_data
* generate_index_map
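A hypothetical usage example for the layout above (the filenames are placeholders):

stack = FractureImageStack('fracture-stack.tif')
aperture_map = stack.create_aperture_map(axis=1)
stack.save('copy-of-stack.tif', overwrite=True)
stack.export_vtk('fracture-stack.vtk')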
This will largely replace the code in generate_aperture_map, as well as several methods in generate_offset_map, although I'll need to be considerably more memory-conscious there and maybe kill the class instance at times. The only method that will be replaced in fracture_df is load_image_data.
This will essentially involve making everything lowercase and purging the usage of __Name__.py modules.
This will entail documenting every arg, return type and adding an example for public APIs. I may want to do this after #85
This goes a long way towards both convenience and easier babun compatibility, and it's more portable than trying to symlink the files.
Rewrite this into a python3 script:
#!/bin/bash
DIR="$(pwd)"
for script in "$DIR"/scripts/apm_*; do
    # pulling the filename out of the path, the runner needs no extension
    runner=${script##*/}
    runner=${runner%.*}
    file=$(python -c "import os; print(os.path.abspath('$script'), end='')")
    # creating the runner file and setting it as executable
    echo "Creating runner for script $runner.py"
    touch ~/bin/"$runner"
    chmod ug+x ~/bin/"$runner"
    # echoing content into the file
    echo "#!/bin/bash" > ~/bin/"$runner"
    echo "python3 \"$file\" \"\$@\"" >> ~/bin/"$runner"
done
echo "Added all scripts to ~/bin"
I'll also need to remove the NC parameter in R_JUST and update the unit tests as well as code usages to cover the changes to those subroutines. I should change the unit tests first, that way I can assess whether they still properly cover the module.
Solid integration tests still need to be created to test the files in the 'scripts' directory. There may be some other areas integration tests could be useful but that is the primary need.
This would be an addition to the cluster processing script where, after cluster removal, I calculate the shortest inlet-to-outlet path length through every voxel, then calculate the standard deviation of all the path lengths and remove voxels whose paths are more than N sigma above the mean.
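A conceptual sketch of that filtering step (a pure-Python BFS, so it is illustrative rather than performant; treating the first and last planes along an axis as inlet and outlet is an assumption):

import numpy as np
from collections import deque

def _geodesic_distance(fracture, seeds):
    # BFS distance (in voxels) from the seed voxels through the fracture space
    dist = np.full(fracture.shape, -1, dtype=int)
    queue = deque(seeds)
    for idx in seeds:
        dist[idx] = 0
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in offsets:
            nbr = (x + dx, y + dy, z + dz)
            if all(0 <= n < s for n, s in zip(nbr, fracture.shape)) \
                    and fracture[nbr] and dist[nbr] < 0:
                dist[nbr] = dist[x, y, z] + 1
                queue.append(nbr)
    return dist

def remove_long_path_voxels(fracture, n_sigma=3, axis=0):
    voxels = list(zip(*np.where(fracture)))
    inlet = [idx for idx in voxels if idx[axis] == 0]
    outlet = [idx for idx in voxels if idx[axis] == fracture.shape[axis] - 1]
    d_in = _geodesic_distance(fracture, inlet)
    d_out = _geodesic_distance(fracture, outlet)
    # shortest inlet-to-outlet path length constrained to pass through each voxel
    path_len = np.where((d_in >= 0) & (d_out >= 0), d_in + d_out, -1)
    valid = path_len[path_len >= 0]
    cutoff = valid.mean() + n_sigma * valid.std()
    cleaned = fracture.copy()
    cleaned[path_len > cutoff] = False
    return cleaned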
At the least, simply move the create_offset_map function from apm_process_image_stack.py into the FractureImageStack class to complement the create_aperture_map method. The patch_holes and filter_high_gradients functions will remain in apm_process_image_stack.py.
Moved the new features into #58
This would allow adding and adjusting unit conversions without recompiling the source code. Not sure of the best way to do it yet, though. Maybe just use two vectors: one storing the "key" of the unit (e.g. "m" for "meters") and the other storing the to-SI conversion factor. Currently the unit conversion module implies dimensionality checks; this change would remove them unless a third vector is added that specifies the DMT (distance, mass, time) exponents. The DMT exponents could be left as a string so things like 001 aren't lost.
Voxel size could still be used as an input and just appended onto the end of the unit vectors.
Temperature would still need its own conversion.
I really should add these to test coverage to make sure they work as intended, since not having them tested has already bitten me a few times. Thorough testing wouldn't be required; just a basic integration test that covers as much of the main execution path as possible and then checks that the expected files were created.
This is just another small adjustment I should have made a while back but didn't.
As the title says. I think it is because the DX and DZ values are being converted but not the point map. In addition, I could refactor my code to generate VTK points given any point map and set of DX, DZ values.
By storing the entire file in a 2-D character array and looping over it, I can remove the order dependence when reading in filenames. All commented-out lines will still be ignored and not stored in the array.
Since I use the same makefile for everything, I think I will add a Windows clean target that preserves the actual dist/ directory, or at least re-creates it after deletion; currently it is impossible to do a debug build on Windows because the clean step can't re-create the dist directory. Alternatively, I could make it work by adding a Windows target that creates the dist directory as well.