peterrum / dealii

This project forked from dealii/dealii

The development repository for the deal.II finite element library.

Home Page: https://www.dealii.org/

License: Other

dealii's Issues

LinearAlgebra::SharedMPI::Vector

Goal

Introduce a new vector (LinearAlgebra::SharedMPI::Vector) in deal.II in such a way that it can be used in deal.II-based DG applications (DG, ghost cells), in hyper.deal (DG, ghost faces), and in the CEED benchmarks (CG).

Related to dealii#10872, hyperdeal/hyperdeal#18, and https://github.com/peterrum/ceed_benchmarks_dealii/tree/sm_vector_mf

Layered approach

  • Partitioner provides the following functions:
        /**
         * Return position of shared cell: cell -> (owner, offset)
         */
        const std::map<dealii::types::global_dof_index,
                       std::pair<unsigned int, unsigned int>> &
        get_maps() const; // cells

        /**
         * Return position of ghost face: (cell, no) -> (owner, offset)
         */
        const std::map<std::pair<dealii::types::global_dof_index, unsigned int>,
                       std::pair<unsigned int, unsigned int>> &
        get_maps_ghost() const; // ghost faces in the context of hyper.deal
  • DoFInfo (owned by MatrixFree) pre-computes and stores the (owner, offset) pair for each lane of a macro cell, using the partitioner functions above -> dof_indices_contiguous_ptr
  • FEEvaluation uses DoFInfo::dof_indices_contiguous_ptr
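
The pre-computation step could look roughly like the sketch below. It is not deal.II code: `CellMap` is a hypothetical stand-in for the return type of get_maps(), `rank_data` is an assumed table with one base pointer per rank of the shared-memory group, and `compute_lane_pointers()` is a made-up helper name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in for the result of get_maps():
// global cell index -> (owner rank in the shared-memory group, offset).
using CellMap = std::map<std::size_t, std::pair<unsigned int, unsigned int>>;

// Resolve one pointer per SIMD lane of a macro cell.
// Assumption: rank_data[r] points to the first entry of the data owned by
// rank r and is readable by the calling process (shared memory).
// Lanes that are not filled keep a nullptr.
template <std::size_t n_lanes>
std::array<const double *, n_lanes>
compute_lane_pointers(const CellMap &                         maps,
                      const std::vector<const double *> &     rank_data,
                      const std::array<std::size_t, n_lanes> &cells,
                      const unsigned int                      n_filled_lanes)
{
  std::array<const double *, n_lanes> ptrs{}; // value-initialized: all nullptr
  for (unsigned int v = 0; v < n_filled_lanes; ++v)
    {
      const auto it = maps.find(cells[v]);
      assert(it != maps.end()); // cell must be known to the partitioner
      const auto [owner, offset] = it->second;
      ptrs[v]                    = rank_data[owner] + offset;
    }
  return ptrs;
}
```

Storing the resolved pointers once (the role suggested for dof_indices_contiguous_ptr) would keep the map lookup out of the cell and face loops.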

Initialize partitioner

  • by DoFInfo?
  • DoFInfo::vector_partitioner_sm (x1), DoFInfo::vector_partitioner_face_variants_sm (x5)?
  • Where should the shared-memory (sm) communicator come from? MatrixFree::AdditionalData?
  • CG -> IndexSet/dealii::Utilities::MPI::Partitioner vs. DG -> global_cell_index

Get partitioner

  • MatrixFree::get_vector_partitioner()

Initialize vector

  • MatrixFree::initialize_dof_vector() -> remove sm comm as argument?

Update vector

  • via VectorDataExchange::update_ghost_values_start(), update_ghost_values_finish(), compress_start(), compress_finish(), reset_ghost_values()
  • VectorDataExchange::zero_vector_region() -> use normal partitioner?
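
To make the intended semantics of these calls concrete, here is a single-process mock, assuming the usual meaning of the names: update_ghost_values imports owner values into ghost entries, compress adds ghost contributions back to the owner and zeroes the ghosts. `MockRank` and its members are hypothetical and do not reflect the actual VectorDataExchange interface; the empty `*_start()` functions mark where a real implementation would post its communication.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical mock of one rank that ghosts some entries of a single peer.
struct MockRank
{
  std::vector<double>      owned;         // locally owned entries
  std::vector<double>      ghost;         // ghost copies of peer entries
  std::vector<std::size_t> ghost_indices; // ghost[i] mirrors peer->owned[...]
  MockRank *               peer = nullptr;

  // Import: owner -> ghost. The start phase is a placeholder here.
  void update_ghost_values_start() {}
  void update_ghost_values_finish()
  {
    for (std::size_t i = 0; i < ghost.size(); ++i)
      ghost[i] = peer->owned[ghost_indices[i]];
  }

  // Export with addition: ghost -> owner; ghost entries are zeroed afterwards.
  void compress_start() {}
  void compress_finish()
  {
    for (std::size_t i = 0; i < ghost.size(); ++i)
      {
        peer->owned[ghost_indices[i]] += ghost[i];
        ghost[i] = 0.0;
      }
  }

  // Clear the ghost entries without communicating.
  void reset_ghost_values()
  {
    for (double &g : ghost)
      g = 0.0;
  }
};
```

In this mock the rank holding the ghost entry writes directly into the owner's storage, which is only meant to illustrate the data flow; whether the shared-memory vector should do the same is exactly the kind of question this issue leaves open.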

Accessing vector during cell/face integrals

Access happens via read_dof_values(), distribute_local_to_global(), gather_evaluate(), and integrate_scatter().

  • read_dof_values()/distribute_local_to_global() use VectorReader together with read_write_operation() (CG/DG) or read_write_operation_contiguous() (DG; to be specialized? rule out interleaved_contiguous? See the code snippets below)
  if (n_filled_lanes == VectorizedArrayType::size() &&
      n_lanes == VectorizedArrayType::size())
    {
      if (this->dof_info->index_storage_variants[ind][this->cell] ==
          internal::MatrixFreeFunctions::DoFInfo::IndexStorageVariants::
            contiguous)
        {
          if (n_components == 1 || n_fe_components == 1)
            for (unsigned int comp = 0; comp < n_components; ++comp)
              operation.process_dofs_vectorized_transpose(
                this->data->dofs_per_component_on_cell,
                dof_indices,
                *src[comp],
                values_dofs[comp],
                vector_selector);
          else
            operation.process_dofs_vectorized_transpose(
              this->data->dofs_per_component_on_cell * n_components,
              dof_indices,
              *src[0],
              &values_dofs[0][0],
              vector_selector);
        }

https://github.com/dealii/dealii/blob/4cf8f26cbf26f12e630aff124cd112fb3f24180e/include/deal.II/matrix_free/fe_evaluation.h#L4618-L4640

    for (unsigned int comp = 0; comp < n_components; ++comp)
      {
        for (unsigned int i = 0; i < this->data->dofs_per_component_on_cell;
             ++i)
          operation.process_empty(values_dofs[comp][i]);
        if (this->dof_info->index_storage_variants[ind][this->cell] ==
            internal::MatrixFreeFunctions::DoFInfo::IndexStorageVariants::
              contiguous)
          {
            if (n_components == 1 || n_fe_components == 1)
              {
                for (unsigned int v = 0; v < n_filled_lanes; ++v)
                  if (mask[v] == true)
                    for (unsigned int i = 0;
                         i < this->data->dofs_per_component_on_cell;
                         ++i)
                      operation.process_dof(dof_indices[v] + i,
                                            *src[comp],
                                            values_dofs[comp][i][v]);
              }
            else
              {
                for (unsigned int v = 0; v < n_filled_lanes; ++v)
                  if (mask[v] == true)
                    for (unsigned int i = 0;
                         i < this->data->dofs_per_component_on_cell;
                         ++i)
                      operation.process_dof(
                        dof_indices[v] + i +
                          comp * this->data->dofs_per_component_on_cell,
                        *src[0],
                        values_dofs[comp][i][v]);
              }
          }

https://github.com/dealii/dealii/blob/4cf8f26cbf26f12e630aff124cd112fb3f24180e/include/deal.II/matrix_free/fe_evaluation.h#L4712-L4745
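
A specialized contiguous read for the shared-memory case might replace `dof_indices[v] + i` into a single vector by one base pointer per lane. The following is a minimal sketch under that assumption; `read_contiguous_from_pointers()`, `Lanes`, and the fixed `n_lanes` are hypothetical names for illustration, and plain `double` arrays stand in for VectorizedArrayType.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t n_lanes = 4; // assumed SIMD width for this sketch

// One value per lane, standing in for a VectorizedArray.
using Lanes = std::array<double, n_lanes>;

// Read dofs_per_cell contiguous values for each filled lane.
// lane_ptrs[v] is assumed to point to the first DoF of the cell in lane v,
// possibly inside memory owned by another process of the same node.
void read_contiguous_from_pointers(
  const std::array<const double *, n_lanes> &lane_ptrs,
  const unsigned int                         n_filled_lanes,
  std::vector<Lanes> &                       values_dofs)
{
  // Zero all lanes first so that unfilled lanes hold defined values.
  for (Lanes &val : values_dofs)
    val.fill(0.0);

  // Per-lane loop over the contiguous DoFs, as in the non-vectorized branch
  // quoted above, but with pointer arithmetic instead of a global index.
  for (unsigned int v = 0; v < n_filled_lanes; ++v)
    for (std::size_t i = 0; i < values_dofs.size(); ++i)
      values_dofs[i][v] = lane_ptrs[v][i];
}
```

The sketch covers only the plain contiguous layout; an interleaved layout would need a different access pattern, which may be one reason to rule out interleaved_contiguous.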

  • For cells, gather_evaluate()/integrate_scatter() call read_dof_values()/distribute_local_to_global().

  • For faces, gather_evaluate()/integrate_scatter() call fe_face_evaluation_process_and_io():

    • MatrixFreeFunctions::tensor_symmetric_hermite + do_gradients: n_face_orientations/n_vectorization_lanes_filled
    • nodal + do_gradients: n_face_orientations/n_vectorization_lanes_filled
            // case 4: contiguous indices without interleaving
            else if (n_face_orientations > 1 ||
                     dof_info.index_storage_variants[dof_access_index][cell] ==
                       MatrixFreeFunctions::DoFInfo::IndexStorageVariants::
                         contiguous)
              {
                const unsigned int *indices =
                  &dof_info.dof_indices_contiguous[dof_access_index]
                                                  [cell *
                                                   VectorizedArrayType::size()];
                Number2_ *vector_ptr =
                  global_vector_ptr + comp * static_dofs_per_component +
                  dof_info
                    .component_dof_indices_offset[active_fe_index]
                                                 [first_selected_component];


                if (do_gradients == true &&
                    data.element_type ==
                      MatrixFreeFunctions::tensor_symmetric_hermite)
                  {
                    if (n_face_orientations == 1 &&
                        dof_info.n_vectorization_lanes_filled[dof_access_index]
                                                             [cell] ==
                          VectorizedArrayType::size())
                      for (unsigned int i = 0; i < dofs_per_face; ++i)
                        {
                          const unsigned int ind1 =
                            index_array_hermite[0][2 * i];
                          const unsigned int ind2 =
                            index_array_hermite[0][2 * i + 1];
                          const unsigned int i_ = reorientate(0, i);


                          proc.function_2a(temp1[i_],
                                           temp1[i_ + dofs_per_face],
                                           vector_ptr + ind1,
                                           vector_ptr + ind2,
                                           grad_weight,
                                           indices,
                                           indices);
                        }
                    else if (n_face_orientations == 1)
                      for (unsigned int i = 0; i < dofs_per_face; ++i)
                        {
                          const unsigned int ind1 =
                            index_array_hermite[0][2 * i];
                          const unsigned int ind2 =
                            index_array_hermite[0][2 * i + 1];
                          const unsigned int i_ = reorientate(0, i);


                          const unsigned int n_filled_lanes =
                            dof_info
                              .n_vectorization_lanes_filled[dof_access_index]
                                                           [cell];


                          for (unsigned int v = 0; v < n_filled_lanes; ++v)
                            proc.function_3a(temp1[i_][v],
                                             temp1[i_ + dofs_per_face][v],
                                             vector_ptr[ind1 + indices[v]],
                                             vector_ptr[ind2 + indices[v]],
                                             grad_weight[v]);


                          if (integrate == false)
                            for (unsigned int v = n_filled_lanes;
                                 v < VectorizedArrayType::size();
                                 ++v)
                              {
                                temp1[i_][v]                 = 0.0;
                                temp1[i_ + dofs_per_face][v] = 0.0;
                              }
                        }
                    else
                      {
                        Assert(false, ExcNotImplemented());


                        const unsigned int n_filled_lanes =
                          dof_info
                            .n_vectorization_lanes_filled[dof_access_index]
                                                         [cell];


                        for (unsigned int v = 0; v < n_filled_lanes; ++v)
                          for (unsigned int i = 0; i < dofs_per_face; ++i)
                            proc.function_3a(
                              temp1[reorientate(v, i)][v],
                              temp1[reorientate(v, i) + dofs_per_face][v],
                              vector_ptr[index_array_hermite[v][2 * i] +
                                         indices[v]],
                              vector_ptr[index_array_hermite[v][2 * i + 1] +
                                         indices[v]],
                              grad_weight[v]);
                      }
                  }
                else
                  {
                    if (n_face_orientations == 1 &&
                        dof_info.n_vectorization_lanes_filled[dof_access_index]
                                                             [cell] ==
                          VectorizedArrayType::size())
                      for (unsigned int i = 0; i < dofs_per_face; ++i)
                        {
                          const unsigned int ind = index_array_nodal[0][i];
                          const unsigned int i_  = reorientate(0, i);


                          proc.function_2b(temp1[i_],
                                           vector_ptr + ind,
                                           indices);
                        }
                    else if (n_face_orientations == 1)
                      for (unsigned int i = 0; i < dofs_per_face; ++i)
                        {
                          const unsigned int ind = index_array_nodal[0][i];
                          const unsigned int i_  = reorientate(0, i);


                          const unsigned int n_filled_lanes =
                            dof_info
                              .n_vectorization_lanes_filled[dof_access_index]
                                                           [cell];


                          for (unsigned int v = 0; v < n_filled_lanes; ++v)
                            proc.function_3b(temp1[i_][v],
                                             vector_ptr[ind + indices[v]]);


                          if (integrate == false)
                            for (unsigned int v = n_filled_lanes;
                                 v < VectorizedArrayType::size();
                                 ++v)
                              temp1[i_][v] = 0.0;
                        }
                    else
                      for (unsigned int i = 0; i < dofs_per_face; ++i)
                        {
                          for (unsigned int v = 0;
                               v < VectorizedArrayType::size();
                               ++v)
                            if (cells[v] != numbers::invalid_unsigned_int)
                              proc.function_3b(
                                temp1[reorientate(v, i)][v],
                                vector_ptr[index_array_nodal[v][i] +
                                           dof_info.dof_indices_contiguous
                                             [dof_access_index][cells[v]]]);
                        }
                  }
              }
            else
              {
                // case 5: default vector access
                AssertDimension(n_face_orientations, 1);


                // for the integrate_scatter path (integrate == true), we
                // need to only prepare the data in this function for all
                // components to later call distribute_local_to_global();
                // for the gather_evaluate path (integrate == false), we
                // instead want to leave early because we need to get the
                // vector data from somewhere else
                proc.function_5(temp1, comp);
                if (integrate)
                  accesses_global_vector = false;
                else
                  return false;
              }
          }

https://github.com/dealii/dealii/blob/4cf8f26cbf26f12e630aff124cd112fb3f24180e/include/deal.II/matrix_free/evaluation_kernels.h#L2980-L3139

      template <typename T0, typename T1, typename T2, typename T3>
      void
      function_2a(T0 &      temp_1,
                  T0 &      temp_2,
                  const T1  src_ptr_1,
                  const T1  src_ptr_2,
                  const T2 &grad_weight,
                  const T3 &indices_1,
                  const T3 &indices_2)
      {
        // case 2a)
        do_vectorized_gather(src_ptr_1, indices_1, temp_1);
        do_vectorized_gather(src_ptr_2, indices_2, temp_2);
        temp_2 = grad_weight * (temp_1 - temp_2);
      }


      template <typename T0, typename T1, typename T2>
      void
      function_2b(T0 &temp, const T1 src_ptr, const T2 &indices)
      {
        // case 2b)
        do_vectorized_gather(src_ptr, indices, temp);
      }


      template <typename T0, typename T1, typename T2>
      void
      function_3a(T0 &      temp_1,
                  T0 &      temp_2,
                  const T1 &src_ptr_1,
                  const T2 &src_ptr_2,
                  const T2 &grad_weight)
      {
        // case 3a)
        temp_1 = src_ptr_1;
        temp_2 = grad_weight * (temp_1 - src_ptr_2);
      }


      template <typename T1, typename T2>
      void
      function_3b(T1 &temp, const T2 &src_ptr)
      {
        // case 3b)
        temp = src_ptr;
      }

https://github.com/dealii/dealii/blob/4cf8f26cbf26f12e630aff124cd112fb3f24180e/include/deal.II/matrix_free/evaluation_kernels.h#L3310-L3353
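
The relation between the vectorized (2a) and per-lane (3a) read variants can be checked with a scalar emulation. In the sketch below, `gather()` is a plain-loop stand-in for do_vectorized_gather, and `Lanes`, `Indices`, `hermite_gather_2a()`, and `hermite_gather_3a()` are hypothetical names; only the arithmetic is taken from the functions quoted above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t n_lanes = 4; // assumed SIMD width for this sketch

using Lanes   = std::array<double, n_lanes>;
using Indices = std::array<unsigned int, n_lanes>;

// Scalar emulation of a vectorized gather: out[v] = src[indices[v]].
Lanes gather(const double *src, const Indices &indices)
{
  Lanes out{};
  for (std::size_t v = 0; v < n_lanes; ++v)
    out[v] = src[indices[v]];
  return out;
}

// Case 2a (read path): all lanes at once, one offset per lane.
void hermite_gather_2a(Lanes &        temp_1,
                       Lanes &        temp_2,
                       const double * src_ptr_1,
                       const double * src_ptr_2,
                       const Lanes &  grad_weight,
                       const Indices &indices)
{
  temp_1 = gather(src_ptr_1, indices);
  temp_2 = gather(src_ptr_2, indices);
  for (std::size_t v = 0; v < n_lanes; ++v)
    temp_2[v] = grad_weight[v] * (temp_1[v] - temp_2[v]);
}

// Case 3a (read path): one lane at a time, same arithmetic.
void hermite_gather_3a(double &      temp_1,
                       double &      temp_2,
                       const double &src_1,
                       const double &src_2,
                       const double &grad_weight)
{
  temp_1 = src_1;
  temp_2 = grad_weight * (temp_1 - src_2);
}
```

Since 2a relies on a single base pointer plus per-lane offsets, it presumably cannot be used unchanged when the lanes refer to memory of different processes, whereas the per-lane form of 3a carries over directly.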

      template <typename T0, typename T1, typename T2, typename T3>
      void
      function_2a(const T0 &temp_1,
                  const T0 &temp_2,
                  T1        dst_ptr_1,
                  T1        dst_ptr_2,
                  const T2 &grad_weight,
                  const T3 &indices_1,
                  const T3 &indices_2)
      {
        // case 2a)
        const VectorizedArrayType val  = temp_1 - grad_weight * temp_2;
        const VectorizedArrayType grad = grad_weight * temp_2;
        do_vectorized_scatter_add(val, indices_1, dst_ptr_1);
        do_vectorized_scatter_add(grad, indices_2, dst_ptr_2);
      }


      template <typename T0, typename T1, typename T2>
      void
      function_2b(const T0 &temp, T1 dst_ptr, const T2 &indices)
      {
        // case 2b)
        do_vectorized_scatter_add(temp, indices, dst_ptr);
      }


      template <typename T0, typename T1, typename T2>
      void
      function_3a(const T0 &temp_1,
                  const T0 &temp_2,
                  T1 &      dst_ptr_1,
                  T1 &      dst_ptr_2,
                  const T2 &grad_weight)
      {
        // case 3a)
        const Number val  = temp_1 - grad_weight * temp_2;
        const Number grad = grad_weight * temp_2;
        dst_ptr_1 += val;
        dst_ptr_2 += grad;
      }


      template <typename T0, typename T1>
      void
      function_3b(const T0 &temp, T1 &dst_ptr)
      {
        // case 3b)
        dst_ptr += temp;
      }

https://github.com/dealii/dealii/blob/4cf8f26cbf26f12e630aff124cd112fb3f24180e/include/deal.II/matrix_free/evaluation_kernels.h#L3586-L3632
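
The write path can be emulated in the same way. The sketch below reproduces the arithmetic of the integrate-side function_2a with plain loops; `scatter_add()` stands in for do_vectorized_scatter_add, and all type and function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t n_lanes = 4; // assumed SIMD width for this sketch

using Lanes   = std::array<double, n_lanes>;
using Indices = std::array<unsigned int, n_lanes>;

// Scalar emulation of a vectorized scatter-add: dst[indices[v]] += vals[v].
void scatter_add(const Lanes &vals, const Indices &indices, double *dst)
{
  for (std::size_t v = 0; v < n_lanes; ++v)
    dst[indices[v]] += vals[v];
}

// Case 2a (write path): combine value and gradient contributions,
// then add them into the destination vector.
void hermite_scatter_2a(const Lanes &  temp_1,
                        const Lanes &  temp_2,
                        double *       dst_ptr_1,
                        double *       dst_ptr_2,
                        const Lanes &  grad_weight,
                        const Indices &indices)
{
  Lanes val{}, grad{};
  for (std::size_t v = 0; v < n_lanes; ++v)
    {
      val[v]  = temp_1[v] - grad_weight[v] * temp_2[v];
      grad[v] = grad_weight[v] * temp_2[v];
    }
  scatter_add(val, indices, dst_ptr_1);
  scatter_add(grad, indices, dst_ptr_2);
}
```

The emulation assumes that the per-lane offsets address disjoint entries of the destination, so the additions of different lanes do not interfere.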
