
underwoo avatar underwoo commented on July 17, 2024

More information from @hguo-gfdl. Some is included above, but this helps clarify the issue:

Email from @hguo-gfdl

I ran into problematic regional output for "vmo" (Ocean Mass Y Transport) along the Caribbean Windward Passage after changing the ocean layout. Originally I used around 8100 PEs, as below:

<freInclude name="resourceparams_A1152x2_O5791_production">
   <resources jobWallclock="16:00:00" segRuntime="2:30:00">
      <atm ranks="1152" threads="2"   layout = "8,24"   io_layout = "1,4" />
      <lnd                            layout = "8,24"   io_layout = "1,4" />
      <ice                            layout = "144,8"  io_layout = "1,4" />
      <ocn ranks="5791" threads="1"   layout = "90,90"  io_layout = "1,5"  mask_table="mask_table.2309.90x90"/>
   </resources>
</freInclude>

After reducing to 6408 PEs, "vmo" at the Windward Passage reached an unreasonably large value:

<freInclude name="resourceparams_A864x2_O4671_production">
   <resources jobWallclock="16:00:00" segRuntime="2:40:00">
      <atm ranks="864" threads="2"    layout = "6,24"   io_layout = "1,4" />
      <lnd                            layout = "6,24"   io_layout = "1,4" />
      <ice                            layout = "72,12"  io_layout = "1,4" />
      <ocn ranks="4671" threads="1"   layout = "90,72"  io_layout = "1,4" mask_table="mask_table.1809.90x72"/>
   </resources>
</freInclude>
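As a side check on the two configurations above, the ocean rank counts are consistent with the layout and mask table names. The sketch below assumes the number embedded in each `mask_table` file name is the count of masked-out (all-land) domains, so that ranks = layout product minus masked domains. The helper name `ocean_ranks` is hypothetical and only for illustration.

```python
def ocean_ranks(layout, masked_domains):
    """Return the number of active ocean ranks for a 2D domain layout.

    Assumption: masked_domains is the integer in the mask_table file name,
    i.e. the number of domains that are skipped because they hold no ocean.
    """
    nx, ny = layout
    return nx * ny - masked_domains


# Original run: layout "90,90" with mask_table.2309.90x90
original = ocean_ranks((90, 90), 2309)

# Reduced run: layout "90,72" with mask_table.1809.90x72
reduced = ocean_ranks((90, 72), 1809)

print(original, reduced)  # matches ocn ranks="5791" and ranks="4671"
```

Both values agree with the `ranks` attributes in the XML, which suggests the layout change itself was specified consistently.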

The xml for the above run is on gaea:

/ncrc/home2/oar.gfdl.cmip6/xml/CM4/CM4Bling_am4p0c96L33_OM4p25_piControl_C.xml

The untarred history file is at gfdl:

/archive/oar.gfdl.cmip6/CM4/warsaw_201710_om4_v1.0.1/CM4_piControl_C/gfdl.ncrc4-intel16-prod-openmp/history/tmp_650/06500101.ocean_Windward_Passage.nc

The stdout is in gfdl:

/archive/oar.gfdl.cmip6/CM4/warsaw_201710_om4_v1.0.1/CM4_piControl_C/gfdl.ncrc4-intel16-prod-openmp/ascii/06500101.ascii_out.tar

from fms.

underwoo avatar underwoo commented on July 17, 2024

@hguo-gfdl could you please point me to a file that has the correct (or what looks like correct) output for the vmo variable? I grabbed two random files from /archive/oar.gfdl.cmip6/CM4/warsaw_201710_om4_v1.0.1/CM4_piControl_C/gfdl.ncrc4-intel16-prod-openmp/history and see no difference. It would be easier if you could point me to two versions of the variable: one with the expected output and one with the bad output.


hguo-gfdl avatar hguo-gfdl commented on July 17, 2024

@underwoo The expected output:
gfdl:/archive/oar.gfdl.cmip6/CM4/warsaw_201710_om4_v1.0.1/CM4_piControl_C/gfdl.ncrc4-intel16-prod-openmp/history/tmp_293/02930101.ocean_Windward_Passage.nc

The bad output:
gfdl:/archive/oar.gfdl.cmip6/CM4/warsaw_201710_om4_v1.0.1/CM4_piControl_C/gfdl.ncrc4-intel16-prod-openmp/history/tmp_650/06500101.ocean_Windward_Passage.nc


underwoo avatar underwoo commented on July 17, 2024


hguo-gfdl avatar hguo-gfdl commented on July 17, 2024

@underwoo I just added read permission on these files. Please let me know if there are any problems.

Thanks,


underwoo avatar underwoo commented on July 17, 2024

@hguo-gfdl this might be related to another issue: #52. That issue points to an inconsistency in how diag_manager_mod names axes. I will need to run a test case, both to see if I can reproduce the error and to see what information is in the uncombined history files. Hopefully I will have some information next week.


underwoo avatar underwoo commented on July 17, 2024

@hguo-gfdl I was able to reproduce the issue. It does appear to be a diag_manager issue, but it is most likely not related to what we see in #52.

In your case, with this layout, diag_manager produces two files (since this is a region). One of those files has the variables volcello, thetao and so. The other file has the variables vmo and vo. Uncombined, the variable outputs look correct. When combined, only the variables vmo and vo exist in the combined file, but they hold the data for volcello and thetao, respectively.

We will continue to look into this issue, and try to come up with a fix.

Thank you for reporting this.


underwoo avatar underwoo commented on July 17, 2024

We think we have discovered the issue, and it is something we do not know how to fix at this time. We believe what is happening is that at different resolutions, some of the MPI processes may not have access to all the same variables within small regions, due to the different grids used in the model. The regional output process writes one file per MPI process and simply dumps the data to the file. If a variable doesn't exist within the grid on a particular MPI process, the variable is unregistered on that process. Note that this particular case will only happen with certain variables on the ocean, ice and land component grids.

We will update the workflow, in a future patch, to not combine files that do not have the same set of variables. This will keep the workflow from overwriting data. However, it is not a fix to this issue.
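The kind of guard described above can be sketched as follows. This is only an illustration of the idea, not the actual workflow patch: the function names, the file names, and the use of a plain dict (file name mapped to its variable names) are all assumptions. In practice the variable lists would be read from the headers of the uncombined per-process files.

```python
from collections import defaultdict


def group_by_variables(file_vars):
    """Group file names by the exact set of variables each file contains.

    file_vars: dict mapping a file name to an iterable of variable names
    (hypothetical input, standing in for what a header reader would return).
    """
    groups = defaultdict(list)
    for name, variables in file_vars.items():
        groups[frozenset(variables)].append(name)
    return {key: sorted(names) for key, names in groups.items()}


def safe_to_combine(file_vars):
    """Return True only if every file holds the same set of variables."""
    return len(group_by_variables(file_vars)) <= 1


# Hypothetical per-process files mirroring the case reported in this thread.
files = {
    "ocean_Windward_Passage.nc.0000": ["volcello", "thetao", "so"],
    "ocean_Windward_Passage.nc.0001": ["vmo", "vo"],
}

if not safe_to_combine(files):
    for variables, names in group_by_variables(files).items():
        print(sorted(variables), "->", names)
```

With a check like this, the two files would be left uncombined (or combined only within their own group) rather than merged into one file with mislabeled data.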

If we discover a method to correct this behavior, we will let you know. For now, I will close this as something we will not fix.
