Comments (6)
Did you build in Release? What do the apply_ functions do? The ggml convolution operations are for sure not very optimal, but a 100x difference is too much.
from ggml.
Thanks for your reply. I built on the master branch. The apply_ functions are wrappers around the conv operations, as follows:
static ggml_tensor * apply_conv2d_no_clamp(ggml_context * ctx, ggml_tensor * input, const conv2d_layer & layer)
{
    // plain 2-D convolution, no activation
    ggml_tensor * result = ggml_conv_2d(ctx, layer.weights, input,
                                        layer.stride,   layer.stride,
                                        layer.padding,  layer.padding,
                                        layer.dilation, layer.dilation);
    return result;
}

static ggml_tensor * apply_conv2d(ggml_context * ctx, ggml_tensor * input, const conv2d_layer & layer)
{
    // 2-D convolution followed by a clamp to [0, 6]
    ggml_tensor * result = ggml_conv_2d(ctx, layer.weights, input,
                                        layer.stride,   layer.stride,
                                        layer.padding,  layer.padding,
                                        layer.dilation, layer.dilation);
    result = ggml_clamp(ctx, result, 0.0f, 6.0f);
    return result;
}

static ggml_tensor * apply_conv_depthwise_2d(ggml_context * ctx, ggml_tensor * input, const conv2d_layer & layer)
{
    // depthwise 2-D convolution followed by a clamp to [0, 6]
    ggml_tensor * result = ggml_conv_depthwise_2d(ctx, layer.weights, input,
                                                  layer.stride,   layer.stride,
                                                  layer.padding,  layer.padding,
                                                  layer.dilation, layer.dilation);
    result = ggml_clamp(ctx, result, 0.0f, 6.0f);
    return result;
}
I tested MobileNetV2 inference with the release-branch code, and the inference time was about the same.
By Release I mean building with -O3 optimization flags. What hardware are you running on?
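For reference, one common way to get those flags with a CMake-based ggml build (a sketch, not the thread author's exact commands; the build directory name is an assumption, and -march=native additionally enables AVX2/FMA where the host CPU supports them):

```shell
# Configure an out-of-tree Release build: CMAKE_BUILD_TYPE=Release implies -O3,
# and -march=native lets the compiler emit AVX2/FMA code for the host CPU.
cmake -B build -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_C_FLAGS="-march=native" -DCMAKE_CXX_FLAGS="-march=native"
cmake --build build -j
```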
I built with -O3 flags and inference did get faster, but it is still not ideal: about 15x slower than onnxruntime. I tested on my PC; CPU: Intel(R) Core(TM) i7-7560U @ 2.40GHz.
Make sure you are building with AVX2 support and ramp up the threads a bit:
const int n_threads = 4;
ggml_graph_compute_with_ctx(ctx0, gf, n_threads);