References (in code)
- http://eigen.tuxfamily.narkive.com/EJttLl1E/tensor-execution-on-gpus
- https://eigen.tuxfamily.org/dox/TopicCUDA.html
- https://www.tensorflow.org/versions/r1.3/extend/adding_an_op#gpu_kernels
- https://bitbucket.org/eigen/eigen/src/94a7dc5d6049149ad474e828a07b9e0484c7760f/unsupported/Eigen/CXX11/src/Tensor/TensorDeviceCuda.h?at=default&fileviewer=file-view-default
References (build)
- "-std=c++11" does not work with FindCUDA: https://stackoverflow.com/questions/34960818/compiling-cu-using-nvcc-in-cmake
- CMake 3.8+: https://stackoverflow.com/questions/36551469/triggering-c11-support-in-nvcc-with-cmake
- Eigen compatibility and pre-processor flags: https://eigen.tuxfamily.org/dox/TopicCUDA.html
- old example but has some helpful hints: https://codeyarns.com/2013/09/13/how-to-build-cuda-programs-using-cmake/
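The CMake 3.8+ route noted above can be sketched as follows (a minimal example, not the project's actual build file; the project and target names are hypothetical). Enabling CUDA as a first-class language sidesteps FindCUDA and its -std=c++11 problems:

```cmake
# CMake 3.8+ treats CUDA as a first-class language, so FindCUDA
# (and its trouble with -std=c++11) can be avoided entirely.
cmake_minimum_required(VERSION 3.8)
project(evonet_gpu LANGUAGES CXX CUDA)  # project name is hypothetical

set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CUDA_STANDARD 11)  # forwards -std=c++11 to nvcc

add_executable(gpu_test main.cpp kernels.cu)  # file names are hypothetical
```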
References (CUDA compilation)
- CUDA compiler flags: https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#compilation-phases
from evonet.
Docker with GPU support is not available on Windows when using CUDA
Will need a dual-boot setup to install the NVIDIA drivers directly on Linux
References:
- https://github.com/NVIDIA/nvidia-docker
- https://hub.docker.com/r/nvidia/cuda/
- https://stackoverflow.com/questions/33834463/getting-access-to-gpu-on-docker-on-windows-10
from evonet.
WINDOWS 10, VS2017, and CUDA 9.2 compilation
Windows SDK
- error: "MSB8036 The Windows SDK version 10.0.15063.0 was not found"
- fix:
- installed Windows universal app SDK https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
- retargeted solution for Windows SDK Version 10.0.17 (if necessary)
- reference: https://social.msdn.microsoft.com/Forums/vstudio/en-US/a739a8db-4e6e-478f-99c2-1348fc031985/compilation-error-with-windows-sdk-version-100150630?forum=visualstudiogeneral
VS2017 and CUDA 9.2 compatibility
- error: "unsupported Microsoft Visual Studio version! Only the versions 2012, 2013, 2015 and 2017 are supported!"
- fix:
- modified file "c:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include\crt\host_config.h"
- replaced line 131 "#if _MSC_VER < 1600 || _MSC_VER > 1913" with "#if _MSC_VER < 1600 || _MSC_VER > [insert very large number]"
- reference: https://devtalk.nvidia.com/default/topic/1022648/cuda-9-unsupported-visual-studio-version-error/
Force VS2017 to build x64 and target x64 architecture
- error:
- errors about not being able to compile CUDA project for 32 bit platforms
- errors about using an x64 compiler to target an x86 machine
- fix: use the CMake "-G" and "-T" options to select the 64-bit generator and the x64 host toolset, respectively
- reference: https://cmake.org/cmake/help/v3.8/generator/Visual%20Studio%2015%202017.html
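For example (a sketch following the CMake 3.8 generator documentation referenced above; the build directory layout is assumed):

```
# "Win64" selects the x64 target platform; "-T host=x64" selects the
# 64-bit host compiler (supported since CMake 3.8).
cmake -G "Visual Studio 15 2017 Win64" -T host=x64 ..
```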
Linking to Boost in VS2017
- error: "LNK1104 cannot open file 'libboost_unit_test_framework-vc141-mt-gd-x64-1_67.lib'"
- fix: multi-select all projects, update the "library" directories to include the Boost "lib" folder
- reference: https://docs.microsoft.com/en-us/cpp/ide/vcpp-directories-property-page
int conversion in Eigen library
- error:
- "C2397 conversion from 'unsigned __int64' to '__int64' requires a narrowing conversion"
- "C2397 conversion from 'unsigned __int64' to 'int' requires a narrowing conversion"
- fix: ensured that all integer types used when allocating the size of Tensors (e.g., Eigen::Tensor<float, 2> tensor(int_type, int_type); ) are the same
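An Eigen-free sketch of the narrowing issue and the fix (TensorIndex and tensor_size are illustrative names, not EvoNet code). Eigen::Tensor dimensions use Eigen::Index, a signed 64-bit type, so mixing it with std::size_t (unsigned __int64 on MSVC) in list-initialization triggers C2397:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using TensorIndex = std::int64_t;  // stands in for Eigen::Index

// Flat element count for a (rows x cols) tensor, with every
// dimension held in the same signed type.
inline TensorIndex tensor_size(TensorIndex rows, TensorIndex cols) {
  // TensorIndex dims[]{rows, std::size_t{0}};  // C2397 on MSVC:
  // unsigned -> signed narrowing inside braces. Cast explicitly
  // (or use one index type throughout) instead.
  return rows * cols;
}
```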
use of POSIX sleep in Eigen library
- error: call to POSIX sleep() does not compile with MSVC
- fix: commented out line 91 ("sleep(1);") in TensorDeviceCuda.h
Missing "math_functions.hpp" in Cuda 9.2
- error: C1083 Cannot open include file: 'math_functions.hpp': No such file or directory
- fix: copied "math_functions.hpp" from \include\crt to \include directory
- reference: https://stackoverflow.com/questions/43113508/math-functions-hpp-not-found-when-using-cuda-with-eigen
MSVC compiler "quirk" and Error in Eigen Macros.h file
- error: C1017 invalid integer constant expression
- fix: use the develop branch of Eigen: https://stackoverflow.com/questions/48341389/error-while-compiling-eigen-library-v3-3-4-with-vs2017-nvcc-cuda-9-0
- references:
- https://stackoverflow.com/questions/26959188/fatal-error-c1017-invalid-integer-constant-expression-when-using-if-false
- https://groups.google.com/forum/#!topic/theano-users/8rAixiyot2w
- https://msdn.microsoft.com/en-us/library/h5sh3k99.aspx
- https://cmake.org/cmake/help/v3.8/manual/cmake-generators.7.html#command-line-build-tool-generators
Cygwin64 linking errors using CUDA
- errors:
- c++: error: /subsystem:console: No such file or directory
- c++: error: opengl32.lib: No such file or directory
- fix: none found
Cygwin64 linking errors to the Boost library
- error: undefined reference to `boost::unit_test::framework::master_test_suite()`
- fix: none found
- references: https://stackoverflow.com/questions/49699013/undefined-reference-to-boostunit-testframeworkmaster-test-suite
from evonet.
Results of CPU profiling
- Bottleneck 1: pruning of nodes/links/weights
  - save node/link/weight pruning until the very end
- Bottleneck 2: forwardPropogateLayerNetInput
  - make a "cache" of all layers and steps of operation before the first epoch of training
  - use the "cache" to allocate memory for the needed tensors
  - update node output/derivative/error values only when requested by the user
- Bottleneck 3: MapValuesToNodes
  - refactor to "materialize" node values on the fly as requested by the user (i.e., retrieve actual values from tensors)
- Bottleneck 4: backwardPropogateLayerError
  - same as Bottleneck 2, but for the back-propagation steps
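The pre-epoch caching idea for Bottleneck 2 could be sketched as follows (an Eigen-free illustration; LayerCache and its members are assumptions, not EvoNet code):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Sketch of the pre-epoch "cache": allocate each layer's net-input
// buffer once, before the first training epoch, then reuse it at
// every FP step instead of re-allocating per step.
struct LayerCache {
  std::map<std::string, std::vector<float>> net_input;  // layer -> buffer

  // Return the layer's buffer, allocating it on first use.
  std::vector<float>& get(const std::string& layer, std::size_t size) {
    auto& buf = net_input[layer];
    if (buf.size() != size) buf.assign(size, 0.0f);
    return buf;
  }
};
```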
Code changes
LossFunction
- base class for LossFunctionOp and LossFunctionGradOp
- ... operator()(..., Eigen::ThreadPoolDevice& device) const = 0;
- ... operator()(..., Eigen::GpuDevice& device) const = 0;
CalculateActivation
- refactor to use Eigen::Tensor<float, 0>
- ... operator()(..., Eigen::ThreadPoolDevice& device) const = 0;
- ... operator()(..., Eigen::GpuDevice& device) const = 0;
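The device-overloaded operator() pattern above can be sketched without Eigen (the stub device tags stand in for Eigen::ThreadPoolDevice and Eigen::GpuDevice, and MSELossOp is a hypothetical example, not EvoNet code):

```cpp
#include <cassert>

// Stub device tags standing in for Eigen::ThreadPoolDevice and
// Eigen::GpuDevice (the real types carry thread pools / CUDA streams).
struct ThreadPoolDevice {};
struct GpuDevice {};

// Base class with one pure-virtual operator() per device type, so the
// same loss-functor interface can be dispatched to CPU or GPU.
struct LossFunctionOp {
  virtual ~LossFunctionOp() = default;
  virtual float operator()(float predicted, float expected,
                           ThreadPoolDevice& device) const = 0;
  virtual float operator()(float predicted, float expected,
                           GpuDevice& device) const = 0;
};

// Hypothetical concrete functor: squared error on either device.
struct MSELossOp : LossFunctionOp {
  float operator()(float p, float e, ThreadPoolDevice&) const override {
    const float d = p - e;
    return d * d;
  }
  float operator()(float p, float e, GpuDevice&) const override {
    const float d = p - e;
    return d * d;  // real code would evaluate on the CUDA device here
  }
};
```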
Model
- allocateForwardPropogationLayerTensors and allocateBackwardPropogationLayerTensors
  - alternatively convertNodesToTensors
- refactor backPropogateLayerError to another class to implement parallelism
- refactor forwardPropogateLayerNetInput to another class to implement parallelism
Graph of operations
FP:
- sequence 1:
- source nodes (outputs), links (weights) -> (MatMul) net input -> (SplitByActivation) split input -> (PerElement) sink nodes (output) -> (PerElement) sink nodes (derivative)[Do this during BP]
- subsequence 1:
- source nodes (outputs), links (weights) -> (MatMul) net input -> (PerElement) sink nodes (output) -> (PerElement) sink nodes (derivative)
- repeat for all subsequences
- repeat for all sequences
Error:
- output nodes (output), expected output -> (Custom) output nodes (error)
BP:
- sequence 1:
- source nodes (error), links (weights) -> (MatMul) sink nodes (tmp) -> (SplitByTime) sink nodes (tmp), sink nodes (derivative) -> (DotProd) sink nodes (errors)
- subsequence 1:
- source nodes (error), links (weights) -> (MatMul) sink nodes (tmp), sink nodes (derivative) -> (DotProd) sink nodes (errors)
- repeat for all sequences
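The "(MatMul) net input" step in the sequences above amounts to contracting source-node outputs with the weight matrix. A loop-based sketch (plain C++ in place of the Eigen contraction; names are assumptions):

```cpp
#include <cassert>
#include <vector>

// net_input = outputs x weights for one layer step.
// outputs:  [n_src] source-node output values
// weights:  [n_src * n_sink] link weights, row-major (src x sink)
// returns:  [n_sink] net input per sink node
std::vector<float> mat_mul(const std::vector<float>& outputs,
                           const std::vector<float>& weights,
                           int n_src, int n_sink) {
  std::vector<float> net_input(n_sink, 0.0f);
  for (int j = 0; j < n_sink; ++j)
    for (int i = 0; i < n_src; ++i)
      net_input[j] += outputs[i] * weights[i * n_sink + j];
  return net_input;
}
```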
Structures
Tensors:
- Output tensors (batch x time [same as nodes])
- Derivative tensors (batch x time [same as nodes])
- Error tensors (batch x time [same as nodes])
- Link tensors [same as Weight]
Node ids:
- Matching node ID vectors and Link ID vectors
Tensor container:
- std::vector to hold output, derivative, and error tensors in order
Structure to hold Argument
- tensor type
- time-step
- tensor index
enum to hold Operation type
- MatMul
- Dot
- PerElement
- None
Structure to hold single Instruction
- argument 0: Argument
- argument 1: Argument
- operation: Operation
ExecutionGraph
- std::vector<Instruction> of operations
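The structures above might look like the following sketch (field and enum names are assumptions based on the notes):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Which tensor container an argument refers to.
enum class TensorType { Output, Derivative, Error, Weight };

// One operand of an instruction: a tensor, a time slice, and the
// tensor's position in its container.
struct Argument {
  TensorType tensor_type;
  int time_step;
  std::size_t index;
};

// The operation applied to the two arguments.
enum class Operation { MatMul, Dot, PerElement, None };

// A single step of the FP/BP graph.
struct Instruction {
  Argument arg0;
  Argument arg1;
  Operation operation;
};

// The execution graph is just an ordered list of instructions.
using ExecutionGraph = std::vector<Instruction>;
```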
from evonet.
Results of CPU profiling
Results after adding thread support for PopulationTrainer, plus thread support and node caching for the Model FP and BP steps
Bottlenecks
- calculateNetNodeInput_
- saveCurrentOutput() and all other "save..." methods called during FPTT
- Low priority others: BPTT and updateWeights
from evonet.