Open geoscience is even more awesome, so we made a list. This list is curated from repositories that make our lives as geoscientists, hackers and data wranglers easier or just more awesome. In accordance with the awesome manifesto, we add awesome repositories. We are open to contributions, of course; this is a community effort after all.
If you are interested in being a maintainer of this repository, see the maintainer role file.
Awesome software projects sub-categorized by focus.
Seismic and Seismology
Auralib – Python package to support investigation of geoscience problems including geophysics, rock physics, petrophysics, and data read/write in common formats.
GemPy – 3-D structural geological modelling software with implicit modelling and support for stochastic modelling.
GeoPhyInv – Julia Toolbox for Geophysical Modeling and Inverse Problems.
HyVR – 3-D anisotropic subsurface models based on geological concepts that can be used with groundwater flow simulators (e.g., ModFlow).
Landlab – Simulate surface processes using a large suite of existing interoperable process components (landscape evolution, sediment dynamics, surface hydrology, ecohydrology), extensible with your own modules.
LoopStructural – an open-source 3D structural geological modelling library.
modelr.io – Web app for simple synthetic seismic forward modelling.
ModFlow – Flow modelling software distributed by the USGS to simulate and predict groundwater conditions and groundwater/surface-water interactions with additional variants and add-ons.
OccamyPy – an object-oriented optimization framework for small- and large-scale problems.
PyFWI – Full-waveform inversion (FWI) and time-lapse FWI of seismic data.
pyGeoPressure – Pore pressure prediction using well log data and seismic velocity data.
pyGIMLi – Multi-method library for solving inverse and forward tasks related to geophysical problems.
PyGMI – Modelling and interpretation suite aimed at magnetic, gravity and other datasets.
PyLops – Linear Operators with some geophysics/seismic modules (e.g., pre- and post-stack AVO inversion, deconvolution, Marchenko redatuming, Radon filtering).
libres – Tool for managing an ensemble of reservoir models.
MRST – Rapid prototyping and demonstration of new simulation methods in reservoir modelling and simulation.
ResInsight – ResInsight is a powerful open source, cross-platform 3D visualization, curve plotting, and post processing tool for reservoir models and simulations.
SHEMAT-Suite – Simulator for flow, heat and species transport in porous media including stochastic and deterministic parameter estimation.
Pyinterpolate – Kriging, Poisson Kriging, Semivariogram Deconvolution, Areal Kriging and other spatial interpolation methods in Python for Earth, Ecology and Social Sciences.
SamGIS – Machine-learning-based image segmentation (Segment Anything by Meta) applied to GIS and geospatial data. A HuggingFace demo is available.
Geochemistry
GeoPyTool – Application with geochemical plotting capabilities.
PhreeQC – Reactions in water and between water and rocks and sediments (speciation, batch-reaction, one-dimensional transport, and inverse geochemical calculations).
pyrolite – Geochemical transformation and visualisation.
Reaktoro – Unified framework for modelling chemically reactive systems.
Thermobar – Thermobarometry, chemometry and mineral equilibrium tool.
CHNOSZ – Thermodynamic calculations and diagrams for geochemistry. See also R Packages for Geochemistry: CHNOSZ and logKcalc.
GeoChemical Data toolkit – GCDKit – System for handling and recalculation of whole-rock analyses from igneous rocks: Standard geochemical calculations and many of the common plots (binary, ternary, spider diagrams).
Geodynamics
Underworld – Computational tools for the geodynamics community.
Magnetotellurics
MATE – A Python program for interpreting magnetotelluric models of the mantle.
MTPy – A Python toolbox for magnetotelluric data processing, analysis, modelling and visualization.
Razorback – A Python library for robust magnetotelluric processing.
Structural Geology
apsg – Advanced structural geology analysis and visualisation based on Matplotlib.
mplStereonet – Stereonets for Python based on Matplotlib.
OpenStereo – An open source, cross-platform structural geology analysis software.
Stress_state_plot – An open source structural geology package for visualisation of a given stress state via Matplotlib.
Visualization
cmocean – Matplotlib collection of perceptual colormaps for oceanography.
Geologic Patterns – Entire FGDC pattern library extracted to SVG and PNG for use in geologic maps and stratigraphic columns.
ipyleaflet – 2D interactive maps and GIS visualization in the Jupyter Notebook.
localtileserver – A Python package for serving tiles from large raster files in the Slippy Maps standard (i.e., /zoom/x/y.png) for visualization in Jupyter with ipyleaflet or folium.
PyVista – 3D plotting and mesh analysis through a streamlined interface for the Visualization Toolkit (VTK).
PVGeo – Data and model visualization in ParaView and Visualization Toolkit (VTK) via PyVista.
GeoVista – Cartographic rendering and mesh analytics powered by PyVista.
Platforms
GRASS-GIS – GIS platform for vector and raster geospatial data management, geoprocessing, spatial modelling and visualization; source code available on GitHub.
OpendTect – Seismic interpretation package; source code available on GitHub.
OpenGeode – Representation and manipulation of geological models.
Pangeo – A community platform for Big Data geoscience built on top of the open source scientific python ecosystem.
QGIS – GIS platform to visualize, manage, edit, analyse data, and compose printable maps.
Webviz – Webviz is a wrapper on top of Dash from Plotly which encourages making reusable data visualisation components and dashboards.
Webviz-subsurface – Webviz-subsurface contains subsurface specific standard webviz containers, which are used as plugins in webviz-config.
Natural Language Processing
geoVec – "Word embeddings for application in geosciences: development, evaluation and examples of soil-related concepts" and an implementation.
Geochronology
IsoplotR – A free and open-source substitute for Kenneth Ludwig's popular Isoplot add-in to Microsoft Excel.
pychron – Data acquisition and processing framework for Ar-Ar geochronology and noble gas mass spectrometry.
Digital Rocks Portal – Powerful data portal for images of varied porous micro-structures.
Geoscience Australia Portal – Comprehensive map-based Australian data portal across multiple geoscience domains.
GSQ Open Data Portal – Petroleum, coal, and mineral geoscience data from the Queensland resource industry and government, with supporting information from GSQ GitHub Repository for Data Models, RDF Vocabularies, and system design. Use of VPN may result in 403 error.
ICGEM – Hosts gravity field spherical harmonic models and provides a webservice for generating grids of gravity functionals (geoid, gravity anomaly, vertical derivatives, etc).
NOPIMS – Open petroleum geoscience data from Western Australia made available by the Australian Government.
Poseidon NW Australia – Interpreted 3D seismic (32bit) including reports and well logs.
Quantarctica – User-configurable QGIS basemap for Antarctica with high-quality, peer-reviewed, free and open Antarctic scientific data.
SARIG – South Australian Resources and Information Gateway providing map-based statewide geoscientific and geospatial data with over 600 datasets.
SEG Open Data Catalog – Catalog of "geophysical data that is readily available for download from the internet, via mail, or through special request", maintained by the Society of Exploration Geophysicists.
TerraNubis – The new Open Seismic Repository, includes the classic F3 and Penobscot seismic volumes (which both also have wells and other data assets).
UK National Data Repository – Open petroleum geoscience data from the UK Government (free registration required).
Volve data village – A complete set of data from a North Sea oil field available for research, study and development purposes.
World Stress Map – A global compilation of information on the crustal present-day stress field.
Macrostrat – A multiscale, harmonized, and globally-defined geologic map dataset and stratigraphic API.
Costa Model – A hierarchical carbonate reservoir benchmarking case study.
EarthChem – Community-driven preservation, discovery, access, and visualization of geochemical, geochronological, and petrological data.
Would it be worthwhile to add a sub-category to datasets along the lines of "data in bulk", "bulk download available", or "machine-learning ready"?
A lot of the open datasets out there take a lot of human typing to extract large enough datasets to do something with. Wells, especially, have this problem.
Alternatively, should we limit "awesome" datasets to those that don't take a lot of prep or extraction to get working with?
Would it make sense to add a ground penetrating radar (GPR) category? I see no category under which open GPR processing software like GPRPy, RGPR, or readgssi can fit (disclaimer: readgssi is my own software, so whether or not it's awesome is arguably not my place to decide).
I would love to see a place where open GPR software is enumerated and this repository seems perfect for that purpose.
Maybe we should abandon the current CI methods and move to a GitHub Action that runs a link check every month and opens an issue if a check fails. The current set-up tends to slow merging of pull requests for little benefit.
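A minimal sketch of the kind of link check a scheduled job like this could run, assuming the list is a Markdown README with inline links. The names `extract_links`, `check_link`, and `find_broken_links` are hypothetical, and only the Python standard library is used. The GitHub Actions workflow that would schedule the script and open an issue is not shown.

```python
import http.client
import re
import urllib.request

# Matches Markdown inline links such as [name](https://example.org).
LINK_PATTERN = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")


def extract_links(markdown_text):
    """Return the unique http(s) URLs found in Markdown inline links, in order."""
    seen = []
    for url in LINK_PATTERN.findall(markdown_text):
        if url not in seen:
            seen.append(url)
    return seen


def check_link(url, timeout=10):
    """Return True if a HEAD request to the URL gets a status below 400."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-check-sketch"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors, DNS failures, timeouts, and SSL problems.
        return False


def find_broken_links(markdown_text):
    """Return the links in the text that fail the check."""
    return [url for url in extract_links(markdown_text) if not check_link(url)]
```

Some servers reject HEAD requests, so a real check would probably need a GET fallback and a retry before reporting a link as broken.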
PR #76 removed the Digital Rocks Portal data repository because their SSL certificate is messed up. We should add it back eventually when the site is back up. Here is the entry:
- [Digital Rocks Portal](https://www.digitalrocksportal.org/) – Powerful data portal for images of varied porous micro-structures
The "Simulation and Modelling" category is quite broad at the moment. There is lots of overlap with the "Geophysics" category: e.g. emsig, Fatiando, ResIPy, SimPEG, ... would all fit under "Geophysics". Would it help with discoverability of packages to try to move project names out of "Simulation and Modelling" which might be a bit of a catch-all at the moment?
Summary of our potential maintainer sustainability problem:
The contributions graph data at https://github.com/softwareunderground/awesome-open-geoscience/graphs/contributors confirms that I've become the main source of pull request approvals, though @amoodie and @banesullivan have also done some approvals. For a couple of reasons, I would like to see more diverse pull request approvals. One is the bus factor. The term bus factor refers to what happens if someone gets hit by a bus tomorrow.
Hypothesized Social Dynamics:
I suspect there are a couple of things going on to cause this pattern.
I can sometimes be quick to jump on easy pull-request approvals as I'm on github for work and non-work reasons most days. This unfortunately takes away opportunities from others to make quick approvals.
It can be less than obvious whether to approve inclusion on the list for early projects that lack a long record of community engagement as shown by stars, forks, etc. To figure out whether to approve those projects often takes installing them and going through an example or two to make sure they work and do what they say they do.
For newbies, they might think doing this work requires someone more experienced?
For more experienced folks, they can sometimes be less interested in messing with code they're not going to use?
Maintainer Tasks Before Pull Request Approval:
For clarity, I'll summarize the typical maintainer tasks related to pull requests.
Go through the checklist as the user fills it out in the pull-request and see if there is anything not-checked there that is critical.
Go to the README or project page of the project and see if there are pre-existing high levels of engagement as demonstrated by forks, stars, issues, and pull-requests that seem to come from outside the core owners group. If so, you can probably go ahead and approve.
If the project is new and lacks an existing community, it might still be awesome, but this requires more work. At a minimum, the maintainer should (1) make sure the code is installable (2) run a few examples (3) determine if there's a reasonable chance the functionality of the code is elegant and broad enough to be useful to others.
Rough Proposed Solution:
Move to have at least 3 listed "Current Maintainers" on the README.
Add a "How to be a Maintainer" markdown file linked to from the CONTRIBUTING.md file that explains the tasks of the maintainers and sets some base expectations in terms of:
X days after a pull-request is submitted, there should be some response in the thread.
Maintainers can let each other know they're going to be "off" for a month or quarter due to other responsibilities.
Any maintainer who responds to more than 75% of the pull requests should perhaps consider making space for others and how to go about doing that.
Plus some suggestions on setting up GitHub notifications such that PRs on the awesome-open-geoscience repository do not get lost in the shuffle.
Pitch being a listed maintainer as:
Ideal for newbie and intermediate coders.
A learning opportunity where you get to try out new code.
A learning opportunity for what is involved in being a project maintainer, but with relatively low stakes and low complexity.
Something people can list on their resume if they'd like.
Source listed maintainers from:
Mention the need in CONTRIBUTING file and README.
SoftwareUnderground Slack.
Ping previous contributors in last 3 years in an announcement issue.
NOTE: This is a quick proposal. Feel free to offer modifications, additions, or completely different takes.
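The 75% guideline in the proposal could be checked with a few lines of code. This is only a sketch: the input is a hypothetical list naming the maintainer who responded to each pull request, and `response_shares` and `over_threshold` are illustrative helpers, not existing tooling.

```python
from collections import Counter


def response_shares(responders):
    """Map each maintainer to their fraction of pull-request responses."""
    counts = Counter(responders)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


def over_threshold(responders, threshold=0.75):
    """Return maintainers whose share of responses exceeds the threshold."""
    shares = response_shares(responders)
    return sorted(name for name, share in shares.items() if share > threshold)


# Hypothetical data: one entry per pull request.
responders = ["alice"] * 8 + ["bob", "carol"]
print(over_threshold(responders))  # ['alice'], since 8/10 = 0.8 > 0.75
```

The comparison is strict, so a maintainer at exactly 75% is not flagged, matching the "more than 75%" wording.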
Meanwhile I’m struggling to find these aspects of the project in #74, which proposes adding MB-System. This software has a difficult-to-read README, no API documentation (or at least, after clicking around their website, I did not immediately see it), and no(?) existing examples for new users to get their hands dirty with. It does, however, appear to be generally accessible (installable using Homebrew).
Your list is heavy on the solid-earth side of geosciences. Are packages for climate / ocean / atmospheric science in scope? If so, I can make a PR with some suggestions?
I'd like to suggest some small set of icons that we could use to indicate what kinds of data files are contained within the various open data collections.
This would require a little bit of graphic design work. Any takers?
Icons might include:
Standard formats:
stacked seismic (SEGY)
prestack seismic (SEGY)
VSP (SEGY)
petrophysical data (LAS)
petrophysical data (DLIS)
Shapefiles (lines, points, and polygons)
etc.
Non-standard formats (this will be harder to contain)
horizon files (ASCII, csv, etc)
Deviation surveys (ASCII, csv)
Time-depth tables / checkshots (ASCII)
Well tops markers (ASCII, CSV)
etc.
Perhaps such a feature could be used to incorporate the notion of a standardized summary page for each dataset. Consistently documented and curated. Could even write a script to build such a summary directly from the data set itself – as a first order entrance exam to test the data quality of the dataset.
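A first pass at such a summary script might simply tally files by extension. The sketch below assumes a dataset is a directory tree on disk. The extension-to-type mapping is an assumption for illustration (`.las`, for instance, is also used for lidar point clouds), and `classify` and `summarize_dataset` are hypothetical names. Extensions alone cannot separate stacked from prestack seismic, or horizons from well tops in ASCII files, so those would need inspection of file contents.

```python
from collections import Counter
from pathlib import Path

# Assumed mapping from file extension to data type; extend as needed.
EXTENSION_TYPES = {
    ".sgy": "seismic (SEGY)",
    ".segy": "seismic (SEGY)",
    ".las": "petrophysical data (LAS)",
    ".dlis": "petrophysical data (DLIS)",
    ".shp": "shapefile",
}


def classify(filename):
    """Return the assumed data type for a file name, or 'other'."""
    suffix = Path(filename).suffix.lower()
    return EXTENSION_TYPES.get(suffix, "other")


def summarize_dataset(root):
    """Count the files under root by assumed data type."""
    return Counter(
        classify(path.name) for path in Path(root).rglob("*") if path.is_file()
    )
```

Calling `summarize_dataset("path/to/dataset")` returns a `Counter` keyed by data type, which could feed both the icons and a standardized summary page.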
This is a nice collection. Maybe our GeoStat-Framework fits in here.
We provide a set of Python packages for geostatistical simulations:
GSTools: GeoStatTools provides geostatistical tools for various purposes:
random field generation
conditioned field generation
incompressible random vector field generation
simple and ordinary kriging
variogram estimation and fitting
many readily provided and even user-defined covariance models
plotting and exporting routines
ogs5py: A python-API for the OpenGeoSys 5 scientific modeling package. OGS5 is an open-source, finite-element solver for thermo-hydro-mechanical-chemical processes in porous and fractured media.
welltestpy: A python-package for handling well based field campaigns with a special focus on estimating parameters of aquifer heterogeneity from standard pumping test data.
AnaFlow: A python-package containing (semi-)analytical solutions for the groundwater flow equation with a focus on effective type curves for heterogeneous aquifers.
Walks: The Ministry of Random Walks provides methods for:
particle tracking
Lagrangian transport
travel time estimations
All these packages are working nicely together to generate workflows for simulating geostatistical setups.
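For readers unfamiliar with variogram estimation, the idea can be sketched without any library. The function below is a plain-Python illustration for a regularly spaced 1-D series; it is not the GSTools API, and `empirical_semivariogram` is a hypothetical name.

```python
def empirical_semivariogram(values, max_lag):
    """Return {lag: gamma} for a regularly spaced 1-D series.

    gamma(h) = sum((z[i + h] - z[i]) ** 2) / (2 * N(h)),
    where N(h) is the number of sample pairs separated by lag h.
    """
    gammas = {}
    for lag in range(1, max_lag + 1):
        pairs = [(values[i], values[i + lag]) for i in range(len(values) - lag)]
        if not pairs:
            break  # the series is too short for this lag
        squared_sum = sum((b - a) ** 2 for a, b in pairs)
        gammas[lag] = squared_sum / (2 * len(pairs))
    return gammas


print(empirical_semivariogram([1, 2, 3, 4], max_lag=2))  # {1: 0.5, 2: 2.0}
```

A covariance model would then be fitted to these lag and gamma pairs, which is the fitting step the list above refers to.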
I enjoy this list and frequently use it as reference. I’m relatively new to SWUNG and want to get more involved. I also want to learn maintainer skills.
How have you participated
I’m relatively new to SWUNG (about 1 year). I participate in discussions on the Slack/Mattermost channels. I frequently use the resources in this repo for my daily work as a geo data scientist @ThinkOnward (formerly Studio X).
I’m a geoscientist by education, with a BS and MS from Kansas State. I worked as a geophysicist at APC/OXY from 2014-2021 in the GoM. I left to pursue an opportunity in data science.
This also raises the question of whether links here should point to the code repository (proving it's open-source) or to the documentation/homepage of the listed project.