
Comments (3)

ltj2013 commented on August 16, 2024

Steps to reproduce

./pplnn-build/tools/pplnn --in-shapes 32_3_224_224 --dims 32_3_224_224 --warmuptimes 200 --runningtimes 200 --onnx-model vgg16.onnx
[INFO][2021-07-05 08:31:30.885][pplnn.cc:683] ppl.nn version: v0.1.0-dirty
[INFO][2021-07-05 08:31:32.207][pplnn.cc:88] ***** register CudaEngine *****
[INFO][2021-07-05 08:31:32.940][simple_graph_partitioner.cc:90] total partition(s) of graph[torch-jit-export]: 1.
[INFO][2021-07-05 08:31:33.295][opt_graph.cc:187] Create 71 TensorImpl
[INFO][2021-07-05 08:31:33.295][opt_graph.cc:299] added 56 new bridge kernels
[INFO][2021-07-05 09:46:30.989][opt_graph.cc:461] deleted 52 bridge kernels
[INFO][2021-07-05 09:46:46.325][pplnn.cc:523] ----- input info -----
[INFO][2021-07-05 09:46:46.326][pplnn.cc:526] input[0]:
[INFO][2021-07-05 09:46:46.326][pplnn.cc:527]     name: input.1
[INFO][2021-07-05 09:46:46.326][pplnn.cc:534]     dim(s): 32 3 224 224
[INFO][2021-07-05 09:46:46.326][pplnn.cc:536]     DataType: FLOAT32
[INFO][2021-07-05 09:46:46.326][pplnn.cc:537]     DataFormat: NDARRAY
[INFO][2021-07-05 09:46:46.326][pplnn.cc:538]     NumBytesIncludePadding: 19267584
[INFO][2021-07-05 09:46:46.326][pplnn.cc:539]     NumBytesExcludePadding: 19267584
[INFO][2021-07-05 09:46:46.326][pplnn.cc:542] ----- output info -----
[INFO][2021-07-05 09:46:46.326][pplnn.cc:545] output[0]:
[INFO][2021-07-05 09:46:46.326][pplnn.cc:546]     name: 70
[INFO][2021-07-05 09:46:46.326][pplnn.cc:553]     dim(s): 32 1000
[INFO][2021-07-05 09:46:46.326][pplnn.cc:555]     DataType: FLOAT32
[INFO][2021-07-05 09:46:46.326][pplnn.cc:556]     DataFormat: NDARRAY
[INFO][2021-07-05 09:46:46.326][pplnn.cc:557]     NumBytesIncludePadding: 128000
[INFO][2021-07-05 09:46:46.326][pplnn.cc:558]     NumBytesExcludePadding: 128000
[INFO][2021-07-05 09:46:46.326][pplnn.cc:561] ----------------------
[INFO][2021-07-05 09:46:46.326][pplnn.cc:791] Run() costs: 9175.929688 ms.
[INFO][2021-07-05 09:46:46.326][pplnn.cc:799] Run ok

As shown in the log, the run starts at 08:31 and inference starts at 09:46, so preparation took 75 minutes. Is that normal? The model was imported from torchvision and exported to ONNX:

import torch
import torchvision

# Dummy input fixes the exported input shape to batch size 32.
dummy_input = torch.randn(32, 3, 224, 224)
model = torchvision.models.vgg16(pretrained=True)
model.eval()
torch.onnx.export(model, dummy_input, "vgg16.onnx", opset_version=11)

I also tested with batch size = 1, and the preparation time is pretty normal.

# ./pplnn-build/tools/pplnn --onnx-model vgg16.onnx --in-shapes 1_3_224_224 --dims 1_3_224_224 --warmuptimes 100 --runningtimes 100
[INFO][2021-07-05 05:21:44.428][pplnn.cc:683] ppl.nn version: v0.1.0-dirty
[INFO][2021-07-05 05:21:46.437][pplnn.cc:88] ***** register CudaEngine *****
[INFO][2021-07-05 05:21:47.230][simple_graph_partitioner.cc:90] total partition(s) of graph[torch-jit-export]: 1.
[INFO][2021-07-05 05:21:47.511][opt_graph.cc:187] Create 71 TensorImpl
[INFO][2021-07-05 05:21:47.511][opt_graph.cc:299] added 56 new bridge kernels
[INFO][2021-07-05 05:24:30.634][opt_graph.cc:461] deleted 52 bridge kernels
[INFO][2021-07-05 05:24:31.300][pplnn.cc:523] ----- input info -----
[INFO][2021-07-05 05:24:31.300][pplnn.cc:526] input[0]:
[INFO][2021-07-05 05:24:31.300][pplnn.cc:527]     name: input.1
[INFO][2021-07-05 05:24:31.300][pplnn.cc:534]     dim(s): 1 3 224 224
[INFO][2021-07-05 05:24:31.300][pplnn.cc:536]     DataType: FLOAT32
[INFO][2021-07-05 05:24:31.300][pplnn.cc:537]     DataFormat: NDARRAY
[INFO][2021-07-05 05:24:31.300][pplnn.cc:538]     NumBytesIncludePadding: 602112
[INFO][2021-07-05 05:24:31.300][pplnn.cc:539]     NumBytesExcludePadding: 602112
[INFO][2021-07-05 05:24:31.300][pplnn.cc:542] ----- output info -----
[INFO][2021-07-05 05:24:31.300][pplnn.cc:545] output[0]:
[INFO][2021-07-05 05:24:31.300][pplnn.cc:546]     name: 70
[INFO][2021-07-05 05:24:31.300][pplnn.cc:553]     dim(s): 1 1000
[INFO][2021-07-05 05:24:31.300][pplnn.cc:555]     DataType: FLOAT32
[INFO][2021-07-05 05:24:31.300][pplnn.cc:556]     DataFormat: NDARRAY
[INFO][2021-07-05 05:24:31.300][pplnn.cc:557]     NumBytesIncludePadding: 4000
[INFO][2021-07-05 05:24:31.300][pplnn.cc:558]     NumBytesExcludePadding: 4000
[INFO][2021-07-05 05:24:31.300][pplnn.cc:561] ----------------------
[INFO][2021-07-05 05:24:31.300][pplnn.cc:791] Run() costs: 344.269989 ms.
[INFO][2021-07-05 05:24:31.300][pplnn.cc:799] Run ok

Actually, it may take hours to select the fastest algorithm for conv or gemm ops in the prepare stage, especially when the batch size is large.


Si-XU commented on August 16, 2024

The preparation time for batch = 32 is reasonable. The algorithm selection process runs each candidate on the real tensor sizes and picks the fastest one from over 6000 kernels. Thus, preparation for the batch-32 model takes approximately 32 times longer than for the single-batch model.
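As a rough check of that estimate against the two logs posted above, here is a minimal Python sketch that measures the gap between the "added 56 new bridge kernels" and "deleted 52 bridge kernels" lines in each run. Treating that gap as the algorithm selection time is an assumption based on where the long pause appears in the logs; the helper name `elapsed_seconds` is made up for this example.

```python
from datetime import datetime

# Timestamp layout used by the pplnn log lines quoted in this thread.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Assumption: the pause between these two log lines is dominated by
# algorithm selection. Timestamps are copied from the logs above.
batch32 = elapsed_seconds("2021-07-05 08:31:33.295", "2021-07-05 09:46:30.989")
batch1 = elapsed_seconds("2021-07-05 05:21:47.511", "2021-07-05 05:24:30.634")

print(f"batch 32: {batch32:.1f} s")
print(f"batch 1:  {batch1:.1f} s")
print(f"ratio:    {batch32 / batch1:.1f}x")
```

For these two runs the gap is about 4498 s for batch 32 and about 163 s for batch 1, a ratio of roughly 27.6, which is in line with the "approximately 32 times" estimate.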

If more than an hour of preparation for batch 32 is too long, there are two ways to reduce the time cost of the preparing stage:
1. Use '--quick-select' to skip algo selection.
2. Reduce the dim size with '--dims', like '--dims 3_3_224_224'.
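The two options can be sketched as command lines. The Python helper below only assembles the argument list and prints it; it does not run pplnn. The flags and the binary path are copied from this thread, while the function name `pplnn_command` and its parameters are hypothetical. Whether pplnn accepts these exact flag combinations is an assumption that should be checked against the tool's own help output.

```python
import shlex


def pplnn_command(model, in_shapes, dims=None, quick_select=False,
                  binary="./pplnn-build/tools/pplnn"):
    """Build an argv list for the pplnn tool (hypothetical helper)."""
    argv = [binary, "--onnx-model", model, "--in-shapes", in_shapes]
    if dims is not None:
        # Option 2 from the thread: pass a smaller shape via --dims.
        argv += ["--dims", dims]
    if quick_select:
        # Option 1 from the thread: skip algo selection.
        argv.append("--quick-select")
    return argv


# Option 1: skip algo selection entirely.
quick = pplnn_command("vgg16.onnx", "32_3_224_224", quick_select=True)

# Option 2: keep selection but use a smaller --dims value.
small = pplnn_command("vgg16.onnx", "32_3_224_224", dims="3_3_224_224")

print(shlex.join(quick))
print(shlex.join(small))
```

Skipping selection trades a short preparation stage for kernels that were not chosen by measurement, so the reported Run() time may differ from the fully tuned run.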


zerollzeng commented on August 16, 2024

Thanks for the explanation.
