hw08's Introduction

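High-Performance Parallel Programming and Optimization - Homework for Lecture 08

Submit your homework via pull request. It will be graded, but:

There is no completion certificate. Homework is only a way to assess your progress and consolidate what you learned, so there is no need to be anxious about scores :) Do what you can. As long as this course teaches you something that yesterday's you did not know, that is a win; there is no need to compare yourself with others. Please do not peek at other people's homework!

There is no submission deadline :) Even after the course has ended I will still look at late submissions~ Ideally, though, finish before the next lecture airs.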

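Grading rules

  • Meeting the basic homework requirements: 50 points (see "Homework requirements" below)
  • Explaining your work in your own words in the PR description: 25 points
  • Well-formatted, cross-platform code: 5 points
  • An original innovation of your own: 20 points
  • Obvious plagiarism: -100 points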

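Homework requirements

Modify main.cpp: improve each of its kernel functions, answer the questions in the comments, and pass the basic tests in main(). You can write the test results and your optimization approach either directly in the comments or in the PR description.

Tip: if you use an IDE, remember to build in Release mode consistently when comparing performance.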

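About over-competing

If you rewrite filter_positive on top of a BLS-optimized exclusive_scan, or use thrustvector and template functions: as long as this builds on meeting the homework requirements, it is a good thing! The instructor will award extra points at their discretion under "original innovation", capped at 20 points.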

hw08's People

Contributors

archibate

hw08's Issues

CudaAllocator fails to compile on Windows with MSVC

Hello, I found your CudaAllocator to be a handy tool for CUDA memory management. I copied CudaAllocator.h into my own project and wrote a very simple test function:

#include <string>
#include <vector>
#include <iostream>
#include <cuda_runtime.h>
#include "CudaAllocator.h"

__global__ void TestVectorKernel(int *x, int n)
{
	int idx = threadIdx.x + blockIdx.x * blockDim.x;
	if (idx < n)
	{
		x[idx] = idx + 1;
	}
}

Then, in another function, I create a std::vector that uses CudaAllocator and launch the kernel on it:

int main()
{
	
	int N = 3;
	std::vector<int, CudaAllocator<int>> arr(N);
	TestVectorKernel <<<1, 4 >>> (arr.data(), N);
	checkCudaErrors(cudaDeviceSynchronize());
	for (size_t i = 0; i < N; i++)
	{
		std::cout << arr[i] << std::endl;
	}
	return 0;
}

However, compilation fails with the following errors:

Build started...
1>------ Build started: Project: Project2, Configuration: Debug x64 ------
1>Compiling CUDA source file Source.cpp...
1>
1>C:\Users\user\source\repos\cuda_test\Project2>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc.exe" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static -allow-unsupported-compiler -g   -D_DEBUG -D_CONSOLE -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64\Debug\vc143.pdb /FS /Zi /RTC1 /MDd " -o x64\Debug\Source.cpp.obj "C:\Users\user\source\repos\cuda_test\Project2\Source.cpp"
1>C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\vector(1908): error : no suitable user-defined conversion from "std::_Rebind_alloc_t<CudaAllocator<int>, int>" to "std::_Rebind_alloc_t<std::_Rebind_alloc_t<CudaAllocator<int>, int>, std::_Container_proxy>" exists
1>          detected during:
1>            instantiation of "void std::vector<_Ty, _Alloc>::_Construct_n(std::vector<_Ty, _Alloc>::size_type, _Valty &&...) [with _Ty=int, _Alloc=CudaAllocator<int>, _Valty=<>]"
1>(669): here
1>            instantiation of "std::vector<_Ty, _Alloc>::vector(std::vector<_Ty, _Alloc>::size_type, const _Alloc &) [with _Ty=int, _Alloc=CudaAllocator<int>]"
1>C:/Users/user/source/repos/cuda_test/Project2/Source.cpp(34): here
1>
1>C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\vector(793): error : no suitable user-defined conversion from "std::_Rebind_alloc_t<CudaAllocator<int>, int>" to "std::_Rebind_alloc_t<std::_Rebind_alloc_t<CudaAllocator<int>, int>, std::_Container_proxy>" exists
1>          detected during instantiation of "std::vector<_Ty, _Alloc>::~vector() [with _Ty=int, _Alloc=CudaAllocator<int>]"
1>C:/Users/user/source/repos/cuda_test/Project2/Source.cpp(34): here
1>
1>C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\include\vector(794): error : no instance of function template "std::_Delete_plain_internal" matches the argument list
1>            argument types are: (<error-type>, std::_Container_proxy *)
1>          detected during instantiation of "std::vector<_Ty, _Alloc>::~vector() [with _Ty=int, _Alloc=CudaAllocator<int>]"
1>C:/Users/user/source/repos/cuda_test/Project2/Source.cpp(34): here
1>
1>3 errors detected in the compilation of "C:/Users/user/source/repos/cuda_test/Project2/Source.cpp".
1>Source.cpp
1>C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 11.1.targets(785,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\nvcc.exe" -gencode=arch=compute_52,code=\"sm_52,compute_52\" --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.32.31326\bin\HostX64\x64" -x cu -rdc=true  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\include"  -G   --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart static -allow-unsupported-compiler -g   -D_DEBUG -D_CONSOLE -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64\Debug\vc143.pdb /FS /Zi /RTC1 /MDd " -o x64\Debug\Source.cpp.obj "C:\Users\user\source\repos\cuda_test\Project2\Source.cpp"" exited with code 1.
1>Done building project "Project2.vcxproj" -- FAILED.

My environment:

  • Windows 11 x64
  • Visual Studio 2022
  • CUDA 11.1

I can't figure out what is going wrong. Could you take a look, or could any classmates who have hit the same problem help out?
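A possible explanation, offered as a guess since CudaAllocator.h itself is not shown here: the errors complain about converting the allocator to its rebound form for std::_Container_proxy. MSVC's debug-mode containers appear to need that conversion, which normally means the allocator must provide a converting constructor template. The sketch below assumes that constructor is what is missing. SketchAllocator and sum_of_filled are hypothetical names, and std::malloc/std::free stand in for cudaMallocManaged/cudaFree so the example builds without the CUDA toolkit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical stand-in for CudaAllocator (not the course's actual code).
// std::malloc/std::free replace cudaMallocManaged/cudaFree here.
template <class T>
struct SketchAllocator {
    using value_type = T;

    SketchAllocator() noexcept = default;

    // Converting constructor: builds an allocator for T from an allocator
    // for U. A container uses this after rebinding the allocator to one of
    // its internal types (assumed here to be the piece MSVC asks for).
    template <class U>
    SketchAllocator(const SketchAllocator<U> &) noexcept {}

    T *allocate(std::size_t n) {
        void *ptr = std::malloc(n * sizeof(T));
        if (!ptr && n != 0) throw std::bad_alloc();
        return static_cast<T *>(ptr);
    }

    void deallocate(T *ptr, std::size_t) noexcept {
        std::free(ptr);
    }
};

// The allocator is stateless, so any two instances are interchangeable.
template <class T, class U>
bool operator==(const SketchAllocator<T> &, const SketchAllocator<U> &) noexcept {
    return true;
}

template <class T, class U>
bool operator!=(const SketchAllocator<T> &, const SketchAllocator<U> &) noexcept {
    return false;
}

// Mirrors the test above on the host: fill the vector with 1..n, then sum it.
inline int sum_of_filled(int n) {
    std::vector<int, SketchAllocator<int>> arr(n);
    for (int i = 0; i < n; i++) {
        arr[i] = i + 1;
    }

    // Explicitly exercise the cross-type conversion a container may perform.
    SketchAllocator<int> a;
    SketchAllocator<long> b(a);
    assert(b == a);

    int sum = 0;
    for (int v : arr) {
        sum += v;
    }
    return sum;
}
```

If this guess is right, adding an equivalent default constructor, converting constructor and comparison operators to CudaAllocator should let the Debug build compile. Building in Release mode may also avoid the error, since the proxy allocation seems to be tied to MSVC's iterator debugging.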
