Documentation for NVDLA.
License: Other
Thanks so much for sharing NVDLA.
I read the register description carefully and found only a brief description of the key points. What should I do if I want to integrate NVDLA into my design?
where can i find the details about "loadable image"?
I think the addresses in hwarch.rst do not agree with the code. Table 10 looks mostly correct, except for a couple of end addresses, but tables 11-30 have an extra 0x3000 added.
Added as pull request.
Hi,
It shows "this file is invalid so it cannot be displayed". Can you please have a look? Thanks.
We have the following inquiries about NVDLA. Would appreciate feedback.
How do we access the full training infrastructure when we build our own ConvNet model?
Can the NVDLA parser (compilation tool) read Caffe2, TensorFlow, and PyTorch frameworks?
It seems that running parallel multiple independent layers is not feasible for “Headless Implementation”. Is “Fused (Pipelined) operation” possible in this environment?
If there is a companion microcontroller, does KMD run on this companion microcontroller (and UMD run on the main system CPU)?
If the companion microcontroller runs on RTOS (say, Nucleus), do we need to reprogram the entire Kernel Mode Driver that is currently accessing Linux Kernel?
Can NVDLA’s 2D convolution support “1x1 convolution” (for ResNet implementation)?
If we need a softmax activation in the output layer, is there a way to implement it in DLA (using an application program)?
Are there any KMD (or UMD) APIs available for this implementation?
In the following paragraph on http://nvdla.org, does "multiple DLA devices" mean multiple DLA IP cores, or multiple independent layers in one DLA core?
“Runtime driver supports submitting inference jobs to multiple DLA devices”
Should the application program be written in C/C++, not Java?
How does the application program submit an inference job to KMD?
Does UMD also include Linux Kernel Driver?
What are the portability layers supposed to do in UMD and KMD, respectively? Licensee is supposed to program these?
Is KMD stack mainly composed of the firmware (scheduler) and Linux Kernel Driver?
Does the KMD stack have system calls for power management in the Linux kernel?
Suppose that runtime environment is going to be running on Android (instead of Linux), what do we need to do in this case?
Regarding the newly added data format document, I have a question about the winograd weight data format: in figure 17, why is the original weight kernel on the left of the figure assumed to be 5x5 x 48 bytes with a stride of 2? As I understand it, DLA currently supports winograd only for kernel size 3x3 with stride 1, so I do not think DLA can apply the winograd optimization in this situation.
http://nvdla.org/_images/format_channel_extension_and_conversion_for_wingorad.svg
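For context, my reading of the data-format document is that the channel (post-)extension shown in that figure folds the stride into the channel dimension, which would be why figure 17 starts from a 5x5 stride-2 kernel: after extension it becomes a stride-1 kernel that the winograd path can handle. A rough sketch of that arithmetic (the formula is my interpretation, not verified against the RTL):

```python
import math

def channel_extend(kernel, stride, channels):
    """Fold a strided convolution into a stride-1 one by moving the
    stride factor into the channel dimension (space-to-depth style).
    Returns (new_kernel, new_stride, new_channels)."""
    new_kernel = math.ceil(kernel / stride)
    return new_kernel, 1, channels * stride * stride

# A 5x5 stride-2 kernel over 48 channels becomes a 3x3 stride-1
# kernel over 192 channels, which winograd can then consume.
print(channel_extend(5, 2, 48))  # (3, 1, 192)
```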
Dear all, I am using PDFTeXify (from the CTeX distribution) in a Windows environment to compile the LaTeX files, and the .svg files prevent the compilation from succeeding (reporting errors like 'Unknown graphics extension: .svg').
Are you meeting with the same problem?
How do you fix it?
Thank you!
This link on scalability configuration shows the differences in parameters between the two configurations:
http://nvdla.org/hw/v2/scalability.html
However, I would like to know what impact these parameter differences have from an application standpoint.
Is it possible to measure power consumption in the virtual simulator? We're just starting out and aren't sure if the NVDLA framework works for our project.
Hi
Could you please update the documentation to say which NVIDIA hardware currently has DLA cores? Xavier is the only one I can find.
Regards
I'm getting the error:
adding /home/travis/build/nvdla/doc/.env to cache
creating directory /home/travis/build/nvdla/doc/.env
sh: 1: sw_vers: not found
on build #16 for pull request #5. I see that appium/appium#1580
had the same error when iOS checks were done on Ubuntu. Could this be the same problem? I'm running Ubuntu.
Or am I doing something wrong?
The compilation phase is responsible for converting (i.e., compiling) a deep neural network into a sequence of hardware layers that are optimized for a given NVDLA configuration. Having a compiled network optimized for a specific hardware configuration improves performance by reducing model size, load time, and run time. Compilation is a two-step process consisting of parsing and compiling.
But where is it?
What does MAX_BUSY_CYCLE (mentioned in NVDLA_OpenSource_Performance.xls) mean? Does it include the zero-valued calculations or not?
The NVDLA unit description (http://nvdla.org/hw/v1/ias/unit_description.html) mentions an upper limit length of 32 for a Stripe Operation:
"The upper limit is 32 due to buffer size in the accumulator"
However, this seems to contradict the buffer size as mentioned in the "Convolution Accumulator" chapter. Let me explain why:
Every atomic operation results in 16 partial sums (see chapter "Atomic Operation"), so we will have 32 x 16 = 512 elements in total after a maximum-sized stripe operation.
Each of these elements is saved as an INT48 (when using INT16 in the previous steps) in the assembly SRAM group (see table 49).
This results in 512 x 6 B = 3 KiB.
According to the chapter "Convolution Accumulator", the buffer size is 96 B x 128 = 12 KiB.
So, in theory, the length of a stripe operation could be 128 instead of 32.
Is there any reason why this is not the case or are the calculations wrong?
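The arithmetic above can be checked directly (the numbers are taken from the question; the 32-element limit itself is quoted from the unit description):

```python
# Numbers from the question above.
atomic_partial_sums = 16   # partial sums per atomic operation
stripe_len_doc = 32        # documented stripe-length upper limit
bytes_per_element = 6      # one INT48 accumulator element

used = stripe_len_doc * atomic_partial_sums * bytes_per_element
print(used)                # 3072 bytes = 3 KiB

buffer_size = 96 * 128     # assembly SRAM group: 96 B x 128
print(buffer_size)         # 12288 bytes = 12 KiB

# The buffer would in theory allow a stripe length of 128, not 32.
max_stripe = buffer_size // (atomic_partial_sums * bytes_per_element)
print(max_stripe)          # 128
```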
Hello,
when executing YOLO on NVDLA, I realized that something is currently hard-coded somewhere, either in the HW implementation or in KMD. When I pool a 448x448 image, the result is completely wrong, but when I reduce the image size to half (224x224), the pooling engine works fine. From what it looks like, for 448x448 images there is basically a duplication of the original image in the pooling result, and it starts a little to the right of the center of the image.
Pooling result and original image: (screenshots attached)
I suspect the problem starts around the 256th pixel, because I have a feeling that somewhere in the code a uint8_t is used, and it is not large enough for indexing 448x448 images. Has anyone encountered the same problem? Or does anyone know whether there is indeed a hard-coded section? I am very curious about this.
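That suspicion is easy to illustrate: if some width or coordinate is held in a uint8_t, anything past 255 wraps around, so a 448-wide row would alias back to column 192. I have not confirmed this is what the KMD or hardware actually does; it is just the failure mode an 8-bit counter would produce:

```python
# Illustration of 8-bit wraparound, not confirmed against NVDLA code.
width = 448
wrapped = width & 0xFF   # the value a uint8_t would actually store
print(wrapped)           # 192: column 448 aliases to column 192

# 224 still fits in 8 bits, which would explain why 224x224 works.
print(224 & 0xFF)        # 224
```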
Best
Tim