
wolf1981's Projects

autodiff

Reverse mode automatic differentiation.

avx_mathfun

AVX-optimized sin(), cos(), exp() and log() functions

bigmpi

Implementation of MPI that supports large counts

blocksparse

Efficient GPU kernels for block-sparse matrix multiplication and convolution

caffe

Caffe: a fast open framework for deep learning.

caffe2

Caffe2 is a cross-platform framework made with expression, speed, and modularity in mind.

catamount

Catamount is a compute graph analysis tool to load, construct, and modify deep learning models and to symbolically analyze their compute requirements

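ccompiler

A C compiler. Lexical and syntax analysis are done with lex and yacc, which produce a syntax tree. C++ code walks the syntax tree and generates intermediate code, with error detection performed during generation. The intermediate code is then optimized in C++. Finally, Python processes the intermediate code and emits MIPS assembly that runs successfully on PCSpim (a MIPS simulator).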

chainer

A flexible framework of neural networks for deep learning

cnmem

A simple memory manager for CUDA designed to help Deep Learning frameworks manage memory

cntk

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

cpufp

A CPU tool for benchmarking peak floating-point performance

cuda-half2

Convert CUDA programs from float data type to half or half2 with SIMDization

cugemmprof

A simple tool to profile the performance of multiple cuBLAS GEMM combinations

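data-structure

C++ implementations of: sequential list, linked list, static linked list, queue, univariate polynomial, Tower of Hanoi, train scheduling problem, operating system scheduling problem, knapsack problem, maximum contiguous subsequence sum, KMP algorithm, sparse matrix, generalized list, union-find (disjoint set), adjacency list for undirected graphs, adjacency list for directed graphs, Kruskal's algorithm, Prim's algorithm, Dijkstra's shortest-path algorithm, Bellman-Ford shortest-path algorithm, Floyd's shortest-path algorithm, topological sort, critical path, optimized bubble sort, quicksort, straight insertion sort, binary insertion sort, closed hashing, open hashing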

deepcore

A lightweight, high-performance computing library for CNN batch training, based on CUDA. It does not depend on any third-party libraries. This project is still under development.

deepfloat

An exploration of log domain "alternative floating point" for hardware ML/AI accelerators.

exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs

falcon

Library for fast image convolution in neural networks on Intel Architecture

fmath

Fast log and exp functions for x86/x64 SSE

folly

An open-source C++ library developed and used at Facebook.

fp6_llm

Efficient GPU support for LLM inference with 6-bit quantization (FP6).
