`performance_ex1` with triangles (mfem issue, open)

aschaf commented on July 20, 2024
`performance_ex1` with triangles

from mfem.

Comments (6)

v-dobrev commented on July 20, 2024

Your suggested fixes look good to me. Thanks @aschaf!

aschaf commented on July 20, 2024

I opened #4206 to resolve the issue.

A related question: if I want to further speed up the assembly (e.g. on my laptop), should I use the MPI-parallel version to utilize all available cores, or would it make sense to extend `TBilinearForm` to use e.g. OpenMP for the assembly loop?

v-dobrev commented on July 20, 2024

> A related question: if I want to further speed up the assembly (e.g. on my laptop), should I use the MPI-parallel version to utilize all available cores, or would it make sense to extend `TBilinearForm` to use e.g. OpenMP for the assembly loop?

The MPI-parallel version should be easier to try: just look at how miniapps/performance/ex1p.cpp differs from miniapps/performance/ex1.cpp. Implementing an OpenMP version will require some effort and in the end may actually be slower.

aschaf commented on July 20, 2024

I noticed another issue that I missed in the pull request. With SIMD turned on, it crashes again, this time at line 103 in teltrans.hpp:

mfem/fem/teltrans.hpp

Lines 94 to 106 in 7c296d0

   template <typename vint_t, int NE>
   inline MFEM_ALWAYS_INLINE
   void SetAttributes(int el, vint_t (&attrib)[NE]) const
   {
      const int vsize = sizeof(vint_t)/sizeof(attrib[0][0]);
      for (int i = 0; i < NE; i++)
      {
         for (int j = 0; j < vsize; j++)
         {
            attrib[i][j] = elements[el+j+i*vsize]->GetAttribute();
         }
      }
   }

The crash happens because at some point `el+j+i*vsize` reaches or exceeds the number of elements, so the index runs past the end of `elements`. A quick and dirty fix was to replace line 103 with:

attrib[i][j] = ((el+j+i*vsize) < fes.GetNE()) ? elements[el+j+i*vsize]->GetAttribute() : 1;
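
To make the overrun concrete, here is a small standalone sketch (not MFEM code). The function name `OutOfRangeIndices` and the parameter `num_elements` are hypothetical and only used for illustration. It applies the same index expression as line 103 and collects the indices that a batch would read past the end when the element count is not a multiple of `NE*vsize`.

```cpp
#include <vector>

// Standalone illustration, not MFEM code. Returns the element indices that a
// batch starting at `el` would read but that lie at or beyond `num_elements`.
// `num_elements` is a hypothetical stand-in for the mesh element count.
std::vector<int> OutOfRangeIndices(int el, int NE, int vsize, int num_elements)
{
    std::vector<int> bad;
    for (int i = 0; i < NE; i++)
    {
        for (int j = 0; j < vsize; j++)
        {
            const int idx = el + j + i*vsize; // same index expression as line 103
            if (idx >= num_elements) { bad.push_back(idx); }
        }
    }
    return bad;
}
```

For example, with 10 elements, `NE = 1` and `vsize = 4`, the batch starting at `el = 8` touches indices 8 through 11, so the last two lanes are out of range.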

v-dobrev commented on July 20, 2024

@aschaf, you are right -- this is a bug. In the corresponding situation in VectorExtract, we stop copying values when we reach the last element; see

void VectorExtract(const vec_layout_t &vl,

A better approach, both here and in VectorExtract, is probably to replicate the last element's attribute or values, so that all SIMD entries hold meaningful values.
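
One way the replication idea could look, as a minimal standalone sketch rather than the actual MFEM change: the function `BatchAttributes` and the plain `attributes` vector are hypothetical stand-ins for `SetAttributes` and the values returned by `elements[k]->GetAttribute()`. Indices past the last element are clamped, so the trailing SIMD lanes repeat the last element's attribute instead of reading out of bounds.

```cpp
#include <algorithm>
#include <vector>

// Standalone illustration, not MFEM code: `attributes` stands in for the
// per-element attributes that elements[k]->GetAttribute() would return.
// Fills NE*vsize entries for the batch starting at `el`; indices past the
// last element are clamped so the last element's attribute is replicated.
// Assumes `attributes` is non-empty.
std::vector<int> BatchAttributes(const std::vector<int> &attributes,
                                 int el, int NE, int vsize)
{
    const int last = static_cast<int>(attributes.size()) - 1;
    std::vector<int> out(NE*vsize);
    for (int i = 0; i < NE; i++)
    {
        for (int j = 0; j < vsize; j++)
        {
            // Clamp instead of branching to a default attribute value.
            const int idx = std::min(el + j + i*vsize, last);
            out[j + i*vsize] = attributes[idx];
        }
    }
    return out;
}
```

Compared with substituting a fixed attribute such as 1, clamping keeps every lane tied to an attribute that actually occurs in the mesh, which matters if later code looks up coefficients by attribute.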

aschaf commented on July 20, 2024

@v-dobrev Thank you for the explanation, I will open a PR once this is fixed.
