PnetCDF-python
Overview
PnetCDF-python is a Python interface to PnetCDF, a high-performance parallel I/O library for accessing netCDF files. The Python integration makes it easy to manipulate, analyze, and visualize netCDF data with Python's scientific computing libraries, which makes it a valuable tool for Python-based applications that require high-performance access to netCDF files.
More about PnetCDF-python
At a granular level, PnetCDF-python is a library that consists of the following components:
Component | Description |
---|---|
File | pncpy.File is a high-level object representing a netCDF file. It provides a Pythonic interface to create, read, and write a netCDF file. A File object serves as the root container for dimensions, variables, and attributes. Together they describe the meaning of the data and the relations among data fields stored in a netCDF file. |
Attribute | In the library, netCDF attributes can be created, accessed, and manipulated using Python dictionary-like syntax. A Pythonic interface for metadata operations is provided both in the File class (for global attributes) and the Variable class (for variable attributes). |
Dimension | A Dimension defines the shape and structure of variables and stores coordinate data for multidimensional arrays. The Dimension object, which is also a key component of the File class, provides an interface to create, access, and manipulate dimensions. |
Variable | A Variable is a core component of a netCDF file. It represents an array of data values organized along one or more dimensions, with associated metadata in the form of attributes. The Variable object in the library provides operations to read and write the data and metadata of a variable within a netCDF file. In particular, data mode operations have a flexible interface: reads and writes can be done through either explicit function-call style methods or indexer-style (numpy-like) syntax. |
Dependencies
- Python 3.9 or above
- PnetCDF C library
- Python libraries mpi4py, numpy
- To work with the in-development version, you need to install Cython
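Before installing, it can help to confirm that these prerequisites are visible from the Python environment you plan to use. The following is a minimal sketch, not part of PnetCDF-python itself: the helper name check_prerequisites is hypothetical, and it only checks that modules can be located and that mpicc and pnetcdf-config (the tools the installation instructions expect) are on the search path. It does not verify versions of the C libraries.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False.

    Hypothetical helper for illustration only; it is not shipped with
    PnetCDF-python.
    """
    return {
        # Interpreter version required by the package.
        "python>=3.9": sys.version_info >= (3, 9),
        # Python libraries, located without importing them.
        "numpy": importlib.util.find_spec("numpy") is not None,
        "mpi4py": importlib.util.find_spec("mpi4py") is not None,
        # Only needed for the in-development version.
        "Cython (development only)": importlib.util.find_spec("Cython") is not None,
        # External tools expected on the search path.
        "mpicc": shutil.which("mpicc") is not None,
        "pnetcdf-config": shutil.which("pnetcdf-config") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        status = "ok     " if found else "MISSING"
        print(f"{status} {name}")
```

A missing pnetcdf-config entry is not necessarily fatal, since the PnetCDF location can also be supplied through PNETCDF_DIR or setup.cfg as described in the installation steps.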
Installation
Currently our PyPI wheels don't cover all systems. If you already have a working MPI installation with the mpicc compiler wrapper on your search path, and a PnetCDF C installation, you can install with pip:
CC=mpicc PNETCDF_DIR=/path/to/pnetcdf/dir/ pip install pncpy==0.0.3
Development installation
- Clone the GitHub repository.
- Make sure numpy, mpi4py and Cython are installed and that you have Python 3.9 or newer.
- Make sure a working MPI implementation and the PnetCDF C library are installed with shared libraries (--enable-shared), and that the pnetcdf-config utility is in your Unix $PATH (or specify the pnetcdf-config file path in setup.cfg).
- (Optional) Create a Python virtual environment and activate it.
- Run
CC=mpicc PNETCDF_DIR=/path/to/pnetcdf/dir/ pip install -v .
Current build status
The project is under active development. The table below summarizes the current implementation status.
Component | Implemented | To be implemented next (w/ priority*) |
---|---|---|
File API | ncmpi_strerror, ncmpi_strerrno, ncmpi_create, ncmpi_open/close, ncmpi_enddef/redef, ncmpi_sync, ncmpi_begin/end_indep_data, ncmpi_inq_path, ncmpi_inq, ncmpi_wait, ncmpi_wait_all, ncmpi_inq_nreqs, ncmpi_inq_buffer_usage/size, ncmpi_cancel, ncmpi_set_fill, ncmpi_set_default_format, ncmpi_inq_file_info, ncmpi_inq_put/get_size | ncmpi_inq_libvers (2), ncmpi_delete (2), ncmpi_sync_numrecs (2), ncmpi__enddef (2), ncmpi_abort (3), ncmpi_inq_files_opened (2), ncmpi_inq (3) |
Dimension API | ncmpi_def_dim, ncmpi_inq_ndims, ncmpi_inq_dimlen, ncmpi_inq_dim, ncmpi_inq_dimname, ncmpi_rename_dim | |
Attribute API | ncmpi_put/get_att_text, ncmpi_put/get_att, ncmpi_inq_att, ncmpi_inq_natts, ncmpi_inq_attname, ncmpi_rename_att, ncmpi_del_att | ncmpi_copy_att (2) |
Variable API | ncmpi_def_var, ncmpi_def_var_fill, ncmpi_inq_varndims, ncmpi_inq_varname, ncmpi_put/get_vara, ncmpi_put/get_vars, ncmpi_put/get_var1, ncmpi_put/get_var, ncmpi_put/get_varn, ncmpi_put/get_varm, ncmpi_put/get_vara_all, ncmpi_put/get_vars_all, ncmpi_put/get_var1_all, ncmpi_put/get_var_all, ncmpi_put/get_varn_all, ncmpi_put/get_varm_all, ncmpi_iput/iget_var, ncmpi_iput/iget_vara, ncmpi_iput/iget_var1, ncmpi_iput/iget_vars, ncmpi_iput/iget_varm, ncmpi_iput/iget_varn, ncmpi_bput_var, ncmpi_bput_var1, ncmpi_bput_vara, ncmpi_bput_vars, ncmpi_bput_varm, ncmpi_bput_varn, ncmpi_fill_var_rec | All type-specific put/get functions, e.g. ncmpi_put_var1_double_all (3), all put/get_vard functions (3), all mput/mget_var functions (3) |
Inquiry API | ncmpi_inq, ncmpi_inq_ndims, ncmpi_inq_dimname, ncmpi_inq_varnatts, ncmpi_inq_nvars, ncmpi_inq_vardimid, ncmpi_inq_var_fill, ncmpi_inq_buffer_usage, ncmpi_inq_buffer_size, ncmpi_inq_natts, ncmpi_inq_malloc_max_size, ncmpi_inq_malloc_size, ncmpi_inq_format, ncmpi_inq_file_format, ncmpi_inq_num_rec_vars, ncmpi_inq_num_fix_vars, ncmpi_inq_unlimdim, ncmpi_inq_varnatts, ncmpi_inq_varndims, ncmpi_inq_varname, ncmpi_inq_vartype, ncmpi_inq_varoffset, ncmpi_inq_header_size, ncmpi_inq_header_extent, ncmpi_inq_recsize, ncmpi_inq_version, ncmpi_inq_striping | ncmpi_inq_dimid (3), ncmpi_inq_dim (3), ncmpi_inq_malloc_list (2), ncmpi_inq_var (3), ncmpi_inq_varid (3) |
\*priority level 1/2/3 maps to first/second/third priority
Testing
- To run all the existing tests, execute
./test_all.csh [test_file_output_dir]
- To run a specific single test, execute
mpiexec -n [num_process] python3 test/tst_program.py [test_file_output_dir]
The optional test_file_output_dir argument tells the test program to save the generated test files in that directory.
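To repeat a single test with several process counts, the mpiexec command above can be assembled in a small script. This is only a sketch under stated assumptions: build_test_command is a hypothetical helper, not part of the test suite, and the output directory /tmp/pncpy_tests is a placeholder. The script prints the commands rather than running them, so it works without an MPI installation.

```python
import shlex


def build_test_command(test_script, num_procs, output_dir=None):
    """Build the mpiexec argument list for one test program.

    Hypothetical helper mirroring the command line shown above:
    mpiexec -n [num_process] python3 <test_script> [test_file_output_dir]
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    cmd = ["mpiexec", "-n", str(num_procs), "python3", test_script]
    # The output directory is optional, as in the original command.
    if output_dir is not None:
        cmd.append(output_dir)
    return cmd


if __name__ == "__main__":
    # Placeholder script path and output directory, for illustration only.
    for n in (1, 2, 4):
        cmd = build_test_command("test/tst_program.py", n, "/tmp/pncpy_tests")
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to actually launch the test once MPI is available.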