From @jhkennedy on April 23, 2015 22:47
CISM (dev and public) builds, albeit with warnings, on my Ubuntu systems; however, some of the test cases in the CISM dev branch do not run (details below).
Tested on both Ubuntu 14.04 LTS and 12.04 LTS, each a fresh install.
Build output:
```
[ 1%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglimmer/parallel_mpi.F90.o
...
[ 40%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglint/glint_mbal.F90.o
/home/fjk/Documents/Code/cism-dev/libglint/glint_mbal.F90:37:0: warning: extra tokens at end of #ifdef directive [enabled by default]
#ifdef USE_ENMABAL ! This option is *not* suppported
^
[ 41%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglint/glint_mbal_coupling.F90.o
...
[ 78%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglimmer-solve/SLAP/dgmres.f.o
/home/fjk/Documents/Code/cism-dev/libglimmer-solve/SLAP/dgmres.f:2620.59:
$ DZ, SX, JSCAL, JPRE, MSOLVE, NMSL, RWORK, IWORK,
1
Warning: Type mismatch in argument 'ipar' at (1); passed REAL(8) to INTEGER(4)
/home/fjk/Documents/Code/cism-dev/libglimmer-solve/SLAP/dgmres.f:1881.36:
IF (ISDGMR(N, B, X, XL, NELT, IA, JA, A, ISYM, MSOLVE,
1
Warning: Type mismatch in argument 'ia' at (1); passed INTEGER(4) to REAL(8)
/home/fjk/Documents/Code/cism-dev/libglimmer-solve/SLAP/dgmres.f:1976.38:
IF (ISDGMR(N, B, X, XL, NELT, IA, JA, A, ISYM, MSOLVE,
1
Warning: Type mismatch in argument 'ia' at (1); passed INTEGER(4) to REAL(8)
[ 79%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglimmer-solve/SLAP/dcg.f.o
...
[ 86%] Building Fortran object CMakeFiles/glimmercismfortran.dir/libglimmer-solve/SLAP/xersla.f.o
/home/fjk/Documents/Code/cism-dev/libglimmer-solve/SLAP/xersla.f:245.21:
call xerabt('xerror -- invalid input',23)
1
Warning: Type mismatch in argument 'messg' at (1); passed CHARACTER(1) to INTEGER(4)
/home/fjk/Documents/Code/cism-dev/libglimmer-solve/SLAP/xersla.f:325.18:
call xerabt(messg,lmessg)
1
Warning: Type mismatch in argument 'messg' at (1); passed CHARACTER(1) to INTEGER(4)
[ 86%] Building C object CMakeFiles/glimmercismfortran.dir/libglimmer/writestats.c.o
[ 87%] Building Fortran object CMakeFiles/glimmercismfortran.dir/fortran_autogen_srcs/glimmer_vers.F90.o
Linking Fortran static library lib/libglimmercismfortran.a
[ 92%] Built target glimmercismfortran
[ 93%] Building CXX object libglimmer-trilinos/CMakeFiles/glimmercismcpp.dir/trilinosNoxSolver.cpp.o
[ 94%] Building CXX object libglimmer-trilinos/CMakeFiles/glimmercismcpp.dir/trilinosGlissadeSolver.cpp.o
[ 95%] Building CXX object libglimmer-trilinos/CMakeFiles/glimmercismcpp.dir/trilinosModelEvaluator.cpp.o
Linking CXX static library ../lib/libglimmercismcpp.a
[ 96%] Built target glimmercismcpp
Scanning dependencies of target cism_driver
[ 97%] Building Fortran object cism_driver/CMakeFiles/cism_driver.dir/cism_external_dycore_interface.F90.o
Warning: Nonexistent include directory "/home/fjk/Documents/Code/cism-dev/builds/linux-gnu/build/include"
[ 98%] Building Fortran object cism_driver/CMakeFiles/cism_driver.dir/cism_front_end.F90.o
Warning: Nonexistent include directory "/home/fjk/Documents/Code/cism-dev/builds/linux-gnu/build/include"
[ 99%] Building Fortran object cism_driver/CMakeFiles/cism_driver.dir/gcm_to_cism_glint.F90.o
Warning: Nonexistent include directory "/home/fjk/Documents/Code/cism-dev/builds/linux-gnu/build/include"
[100%] Building Fortran object cism_driver/CMakeFiles/cism_driver.dir/gcm_cism_interface.F90.o
Warning: Nonexistent include directory "/home/fjk/Documents/Code/cism-dev/builds/linux-gnu/build/include"
[100%] Building Fortran object cism_driver/CMakeFiles/cism_driver.dir/cism_driver.F90.o
Warning: Nonexistent include directory "/home/fjk/Documents/Code/cism-dev/builds/linux-gnu/build/include"
Linking CXX executable cism_driver
[100%] Built target cism_driver
```
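One easy cleanup: the `glint_mbal.F90` warning comes from a trailing Fortran comment on a preprocessor line. The C preprocessor does not understand `!`, so everything after the macro name is read as extra tokens. A minimal sketch of the fix (only the directive itself is taken from the log; the comment placement is the suggested change):

```fortran
! This option is *not* supported
#ifdef USE_ENMABAL
```

Moving the comment onto its own line keeps the directive clean for cpp while preserving the note for readers.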
With a parallel build, the following tests work (run both serially and in parallel, where applicable):
- Halfar
- glint-example
- higher-order (all except slab)
And these don't (errors detailed below):
- EISMINT-1 (all)
- EISMINT-2 (all)
- higher-order/slab
Typical EISMINT-1 and EISMINT-2 output:
```
$ ./cism_driver e1-fm.1.config
CISM dycore type (0=Glide, 1=Glam, 2=Glissade, 3=AlbanyFelix, 4 = BISICLES) = 0
g2c%which_gcm (1 = data, 2 = minimal) = 0
call cism_init_dycore
Setting halo values: nhalo = 0
WARNING: parallel dycores tested only with nhalo = 2
Layout(EW,NS) = 31 31 total procs = 1
Global idiag, jdiag: 1 1
Local idiag, jdiag, task: 1 1 0
*** Error in `cism_driver': free(): invalid pointer: 0x00000000014cc8b0 ***
Program received signal SIGABRT: Process abort signal.
Backtrace for this error:
#0 0x7F4BE47A27D7
#1 0x7F4BE47A2DDE
#2 0x7F4BE3DF0D3F
#3 0x7F4BE3DF0CC9
#4 0x7F4BE3DF40D7
#5 0x7F4BE3E2D393
#6 0x7F4BE3E3966D
#7 0x5D3A91 in __glimmer_sparse_slap_MOD_slap_solve at glimmer_sparse_slap.F90:210 (discriminator 1)
#8 0x545407 in __glimmer_sparse_MOD_sparse_solve at glimmer_sparse.F90:237
#9 0x5457A8 in __glimmer_sparse_MOD_sparse_easy_solve at glimmer_sparse.F90:373 (discriminator 1)
#10 0x487A6D in thck_evolve at glide_thck.F90:561
#11 0x48AB5F in __glide_thck_MOD_thck_lin_evolve at glide_thck.F90:170
#12 0x473ACA in __glide_MOD_glide_tstep_p2 at glide.F90:862
#13 0x439344 in __cism_front_end_MOD_cism_run_dycore at cism_front_end.F90:302
#14 0x439986 in __gcm_cism_interface_MOD_gci_run_model at gcm_cism_interface.F90:118
#15 0x438D03 in cism_driver at cism_driver.F90:49
```
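The `free(): invalid pointer` abort happens inside the SLAP solver, and the argument type mismatches flagged in `dgmres.f` above (REAL(8) passed where INTEGER(4) is expected, and vice versa) are a plausible way to corrupt the heap. One possible way to narrow it down, assuming gfortran and valgrind are available (flags and paths are a sketch, not verified against this build):

```
# Rebuild with debug symbols and runtime checks (flag choice is an assumption)
cmake -DCMAKE_Fortran_FLAGS="-g -fbacktrace -fcheck=all" ..
make

# Run one failing case under valgrind to locate the first bad heap access
valgrind --track-origins=yes ./cism_driver e1-fm.1.config
```

This is only a debugging sketch; it does not confirm the warnings are the cause.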
higher-order/slab output (serial):
```
$ /slab.py
Using Scientific.IO.NetCDF for netCDF file I/O
Writing slab.nc
Running CISM for the confined-shelf experiment
==============================================
Executing serial run with: ./cism_driver slab.config
CISM dycore type (0=Glide, 1=Glam, 2=Glissade, 3=AlbanyFelix, 4 = BISICLES) = 2
g2c%which_gcm (1 = data, 2 = minimal) = 0
call cism_init_dycore
* FATAL ERROR : ice limit (thklim) is too small for Glissade dycore
Fatal error encountered, exiting...
PARALLEL STOP in /home/fjk/Documents/Code/cism-dev/libglimmer/glimmer_log.F90 at line 178
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1001.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
```
higher-order/slab output (parallel):
```
$ /slab.py -m 2
Using Scientific.IO.NetCDF for netCDF file I/O
Writing slab.nc
Running CISM for the confined-shelf experiment
==============================================
Executing parallel run with: mpirun -np 2 ./cism_driver slab.config
CISM dycore type (0=Glide, 1=Glam, 2=Glissade, 3=AlbanyFelix, 4 = BISICLES) = 2
g2c%which_gcm (1 = data, 2 = minimal) = 0
call cism_init_dycore
* FATAL ERROR : ice limit (thklim) is too small for Glissade dycore
Fatal error encountered, exiting...
PARALLEL STOP in /home/fjk/Documents/Code/cism-dev/libglimmer/glimmer_log.F90 at line 178
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 1001.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 1 with PID 1903 on
node pc0101123 exiting improperly. There are two reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[pc0101123:01901] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[pc0101123:01901] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
```
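For the slab failures, the message points at configuration rather than a crash: the Glissade dycore refuses to start because the ice limit (`thklim`) in the test's config is too small for it. If someone wants to unblock the test locally, a hypothetical edit to `slab.config` might look like this (section name and value are assumptions inferred from the error text, not taken from the actual file, so the proper fix in the repo may differ):

```
[parameters]
# raise the minimum ice thickness so Glissade accepts it (value is a guess)
thklim = 1.
```

Whether the test should instead be marked Glide-only is a separate question for the developers.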
Copied from original issue: E3SM-Project/cism-piscees#28