
Comments (10)

martinling commented on July 19, 2024

We've actually already made the change to bring those I/Os to the outside of the enclosure in the form of two PMOD ports. See this update on CrowdSupply for more info.

It would certainly be possible to enable using those I/Os as a simple logic analyzer in exactly the way you describe. Writing the software and gateware to do that isn't on our priority list at the moment, as we need to focus on the USB features we've committed to, but it would be a fairly straightforward project.

from luna.

symdeb commented on July 19, 2024

Thank you, that is really great. I have a few other questions. I'm having difficulty getting through the Discord phone verification process (it seems to be locked up somehow), so I'm posting them here.

  1. I could not find "technical specifications" for LUNA. What is the internal sample rate? Other products use 1 to 2GHz (DSLogic/Zeroplus). What would the rate be for the user GPIO, and what speed of input/output clocks could be created?

  2. I read that the memory is 8MB. Other products support 64~128MB. For USB HS, how many seconds of data could be stored in the buffer? Could the memory be expanded, and would streamed transfer be a viable option through the USB 3.0 host?


martinling commented on July 19, 2024

What is the internal sample rate? Other products use 1 to 2GHz (DSLogic/Zeroplus). What would the rate be for the user GPIO, and what speed of input/output clocks could be created?

LUNA doesn't work by sampling the USB data lines directly at high speed. Instead, there are USB 2.0 PHYs for each port, which are used as the high speed interface.

The FPGA is connected to the ULPI interface of each PHY, which includes an 8-bit parallel data bus. Because of that 8-bit wide bus, the logic for capturing USB packets, buffering them and delivering them to the host only needs to run at 60MHz (8 x 60 = 480Mbit/s).

The simplest way to integrate GPIO capture would be to sample the PMOD I/Os at the 60MHz clock, and add the resulting data to the capture buffer. With a bit more work, they could potentially also be sampled/driven from another clock domain at a higher rate. The ECP5 I/O buffers are limited to 200MHz in / 150MHz out for LVCMOS, but can go up to 400MHz for LVDS. Buffer space and USB throughput for the logic data will be the main limiting factors though, especially if trying to capture USB at the same time.
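To put rough numbers on the throughput concern, here is a minimal Python sketch of the raw data rate produced by sampling GPIO pins at a fixed clock. The channel counts are illustrative assumptions (the thread does not say how many PMOD pins would be captured), the ~40MB/s figure is the approximate host streaming rate mentioned elsewhere in this thread, and no packing, compression or header overhead is modelled.

```python
def raw_sample_rate_bytes(channels: int, sample_hz: float) -> float:
    """Bytes per second produced by sampling `channels` one-bit inputs at `sample_hz`."""
    return channels * sample_hz / 8


STREAM_RATE = 40e6  # approximate host streaming rate in bytes/s (figure from this thread)

for channels in (4, 8, 16):  # hypothetical channel counts
    rate = raw_sample_rate_bytes(channels, 60e6)
    verdict = "fits within" if rate <= STREAM_RATE else "exceeds"
    print(f"{channels} channels @ 60MHz: {rate / 1e6:.0f} MB/s ({verdict} ~40 MB/s streaming)")
```

Under these assumptions, even 8 channels of uncompressed samples at 60MHz would outrun the streaming link on their own, before any USB capture data is added.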

I read that the memory is 8MB. Other products support 64~128MB. For USB HS, how many seconds of data could be stored in the buffer? Could the memory be expanded, and would streamed transfer be a viable option through the USB 3.0 host?

Since we identify and capture USB packets on the FPGA, rather than just raw samples of the data lines, the duration we can store in memory depends on the USB traffic involved. In the most demanding scenario, where the bus is fully utilized at HS, an 8MB buffer provides roughly 150ms of capture.
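As a sanity check on that order of magnitude, a crude bound can be sketched in Python. This assumes the buffer fills at the full 60MB/s ULPI rate with nothing drained to the host. Real traffic differs: per-packet headers add bytes, while inter-packet gaps and bit stuffing on the wire reduce the packet data rate, so this is only a ballpark. The function name is hypothetical.

```python
ULPI_BYTES_PER_SEC = 60e6  # 8-bit bus at 60MHz, i.e. 480Mbit/s
BUFFER_BYTES = 8e6         # "8MB" as stated; use 8 * 2**20 if it is really 8MiB


def capture_seconds(buffer_bytes: float, fill_rate: float, drain_rate: float = 0.0) -> float:
    """Time until the buffer overflows; infinite if draining keeps up with filling."""
    net = fill_rate - drain_rate
    return float("inf") if net <= 0 else buffer_bytes / net


print(f"{capture_seconds(BUFFER_BYTES, ULPI_BYTES_PER_SEC) * 1e3:.0f} ms")
```

This gives about 133ms, in the same range as the figure quoted above.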

We can stream buffered packets to the host at around 40MB/s, so if the target bus utilization is moderate, or limited to short bursts of high traffic, it is possible to capture indefinitely. Due to the overhead involved in streaming a HS capture over HS, however, it's not possible to maintain continuous capture if the target bus is more heavily utilized.

It may be possible to achieve unlimited HS capture by a channel bonding approach using both the 'host' and 'sideband' ports for streaming, but this would require two cables to the host which would have to be attached to independent buses. Implementing this is not currently on our roadmap.

The current hardware does not have any support for USB 3.0. There is partial support for USB 3.0 in the LUNA gateware library, but utilising it will require a future product with different hardware.


symdeb commented on July 19, 2024

Thank you very much for the explanation. Ideally the user I/O could be used to test MII/RMII/GMII for Ethernet, but those run at 25, 50 and 125MHz respectively.

For USB, the use case I'm facing is as follows:
1. Using a libusb-based tool to send a control request from a host PC to retrieve an interface descriptor from a device does not show anything going out in Wireshark (retrieving device descriptors works fine and shows up in Wireshark).
2. Adding breakpoints/debug code in the USB device does not show receipt of such a packet.
3. The creator of the USB device firmware library would like to see "proof" that the request did go over USB before putting the library into question.

Thus the idea was to use an analyzer to capture whether there really was a transfer on the bus:

  1. At full speed, the buffer might be large enough for several seconds of USB data, which can be transferred to a USB 2.0 host.
  2. At high speed, longer capture requires streaming, and requires a USB 3.0 connection to the host to keep up with the data transfer.

Approaches:
A. Put the device in full speed and use a low-cost USB analyzer. The result would probably be the same as at high speed, or
B. At high speed, use a high-cost capture device and a USB 3.0 PC host with sufficient speed to keep up with the USB 3.0 data.

So this is where LUNA came in as a low-cost option for HS. The only question now is whether the USB PC host is fast enough.
That is why a buffer of, say, 256MB would come in handy for several seconds of buffered data (perhaps compressed).
Please correct me if this line of thinking is wrong, or if there is a better approach.


martinling commented on July 19, 2024

The other way around the USB 2.0 bottleneck is to do some packet filtering on the FPGA, to exclude things that aren't of interest. E.g. you could discard all traffic that's not on endpoint 0, which would keep the data rate down whilst still ensuring that you see the transfer you're looking for if it's there.

At the moment we don't have frontend features for doing that, but with our Amaranth workflow, an ad-hoc hack to add a filter on the FPGA side can be as simple as editing luna/gateware/usb/analyzer.py and re-running things.
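To illustrate the filtering logic itself, here is a minimal host-side Python model of an "endpoint 0 only" filter. It is not the actual gateware and does not reflect the contents of analyzer.py; the function names are hypothetical. It assumes well-formed packets given as raw bytes starting with the PID byte, ignores CRC checking, and does not handle PING or SPLIT tokens. Because data and handshake packets carry no endpoint number, the filter has to remember the endpoint of the most recent token.

```python
# USB token PID bytes (low nibble = PID, high nibble = its complement).
PID_OUT, PID_IN, PID_SOF, PID_SETUP = 0xE1, 0x69, 0xA5, 0x2D
TOKEN_PIDS = {PID_OUT, PID_IN, PID_SETUP}


def token_endpoint(packet: bytes) -> int:
    """Extract the 4-bit endpoint number from an OUT/IN/SETUP token packet."""
    # Bytes 1-2 hold ADDR[6:0], ENDP[3:0], CRC5[4:0], least significant bit first.
    return ((packet[1] >> 7) & 0x1) | ((packet[2] & 0x07) << 1)


def filter_endpoint(packets, endpoint=0):
    """Yield only packets belonging to transactions on `endpoint`."""
    keep = False
    for packet in packets:
        pid = packet[0]
        if pid == PID_SOF:
            keep = False  # start-of-frame is not part of any transaction; drop it
            continue
        if pid in TOKEN_PIDS:
            # A new transaction starts; decide once, then apply to what follows.
            keep = token_endpoint(packet) == endpoint
        if keep:
            yield packet


# Example traffic (CRC fields left as zero, since this sketch does not check them).
setup_ep0 = bytes([PID_SETUP, 0x05, 0x00])  # address 5, endpoint 0
in_ep1 = bytes([PID_IN, 0x85, 0x00])        # address 5, endpoint 1
data0 = bytes([0xC3]) + bytes(8) + bytes(2)  # DATA0 with 8 payload bytes + CRC16
data1 = bytes([0x4B]) + bytes(4) + bytes(2)  # DATA1 with 4 payload bytes + CRC16
ack = bytes([0xD2])                          # ACK handshake

traffic = [setup_ep0, data0, ack, in_ep1, data1, ack]
kept = list(filter_endpoint(traffic, endpoint=0))
print(len(kept), "of", len(traffic), "packets kept")
```

A gateware version would express the same state machine in Amaranth, operating on the packet stream before it reaches the capture buffer.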


symdeb commented on July 19, 2024

That is a great feature for custom data manipulation and triggering code. Looking at the schematic, the upstream USB is high speed. Even though the ULPI runs at 60MHz, all the data has to be moved upstream again to the host (PC). If that is uncompressed, the amount of data would be about 1:1 (if not more, given overhead such as metadata), correct?
Since the FPGA needs to cope with the "slow" upstream USB 2.0 ULPI interface, would that not cause congestion in the FPGA?
In other words, was any consideration given to using a USB 3.0 transceiver instead, which could reduce the risk of such bottlenecks when streaming data to the host (PC)? The FPGA would need added functionality to interface to such a USB 3.0 device controller.
Key question: have any experiments been done to check whether USB HS streaming to the PC host can be sustained over longer periods, say 5 to 10 seconds?


zyp commented on July 19, 2024

We're using the LUNA stack in Orbtrace, and I can share some performance numbers from our testing.

First of all, the theoretical max HS USB bulk capacity is 13 packets per microframe, and at 512B per packet and 8000 microframes per second, that comes out to 425.984 Mb/s (53.248 MB/s). We have however not been able to reach this in our testing, probably due to practical limitations with the host side scheduling and reserved capacity for other devices on the bus.

We've however been able to reach 12 packets per microframe which comes out to around 393 Mb/s (49 MB/s), and can sustain this as long as there's not a lot of other traffic on the bus fighting for capacity. We're doing this with only 8kB+1kB of buffering on the device side.
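The arithmetic behind both figures can be reproduced with a short Python sketch. The function name is hypothetical; the constants are the ones stated above (512-byte bulk packets, 8000 microframes per second).

```python
MICROFRAMES_PER_SEC = 8000
BULK_MAX_PACKET = 512  # bytes per HS bulk packet


def bulk_throughput(packets_per_microframe: int) -> tuple[float, float]:
    """Return bulk payload throughput as (MB/s, Mb/s)."""
    bytes_per_sec = packets_per_microframe * BULK_MAX_PACKET * MICROFRAMES_PER_SEC
    return bytes_per_sec / 1e6, bytes_per_sec * 8 / 1e6


for n in (13, 12):
    mbytes, mbits = bulk_throughput(n)
    print(f"{n} packets/microframe: {mbits:.3f} Mb/s ({mbytes:.3f} MB/s)")
```

This prints 425.984 Mb/s (53.248 MB/s) for 13 packets and 393.216 Mb/s (49.152 MB/s) for 12 packets.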

I figure more buffering is mainly useful if you've got bursty traffic with a reasonably large difference between peak and average data rates.


martinling commented on July 19, 2024

Overhead depends somewhat on the nature of the traffic but let's work through a simple high-throughput case - capturing a single target device making one continuous flat-out bulk IN transfer with full 512 byte packets. Assume for the sake of simplicity that we're filtering out SOF packets and other traffic, and that nothing gets NAKed.

In that scenario, for each 512 bytes of payload data sent by the target device, LUNA will currently put 525 bytes into the capture buffer for the IN transaction:

  • 2 bytes length header + 3 bytes IN packet
  • 2 bytes length header + 3 bytes DATA0/1 packet fields + 512 bytes payload
  • 2 bytes length header + 1 byte ACK packet

As @zyp notes, the realistic limit for one device is 12 of those transactions per microframe and there are 8000 microframes per second, so LUNA has to buffer 12 * 8000 * 525 B/s for a total of 50.4MB/s, and could stream that buffer to the host at 12 * 8000 * 512 B/s or 49.152MB/s. So the buffer would fill up at (50.4 - 49.152) = 1.248MB/s. With an 8MiB buffer that would correspond to 6.72s of capture time. But if we can compress the buffered data on the FPGA side by even just 3%, then it becomes possible to capture indefinitely.
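The same calculation as a small Python sketch, using only the numbers given above. Variable names are illustrative, and the break-even compression ratio is derived here rather than stated in the thread.

```python
TRANSACTIONS_PER_SEC = 12 * 8000  # 12 bulk transactions per microframe
PAYLOAD = 512                     # payload bytes per transaction
HEADER = 2                        # length header per captured packet

# Captured bytes per IN transaction: IN token, DATA0/1 packet, ACK handshake.
captured = (HEADER + 3) + (HEADER + 3 + PAYLOAD) + (HEADER + 1)

fill_rate = TRANSACTIONS_PER_SEC * captured   # bytes/s into the capture buffer
drain_rate = TRANSACTIONS_PER_SEC * PAYLOAD   # bytes/s streamed to the host
buffer_bytes = 8 * 2**20                      # 8MiB

seconds = buffer_bytes / (fill_rate - drain_rate)
min_compression = 1 - drain_rate / fill_rate  # reduction needed to break even

print(f"captured per transaction: {captured} bytes")
print(f"fill {fill_rate / 1e6:.3f} MB/s, drain {drain_rate / 1e6:.3f} MB/s")
print(f"buffer lasts {seconds:.2f} s; break-even compression {min_compression:.2%}")
```

The break-even point works out to a little under 2.5%, which is why a 3% reduction would be enough to capture indefinitely in this scenario.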

In most practical scenarios, HS throughput to the host should be sufficient to get things done, especially when combined with all the filtering and triggering logic that's possible through customising the analyzer gateware. Adding a USB 3.0 transceiver would have significantly increased the cost of the device. For scenarios where you really want to capture every detail of a completely saturated bus, there's always the possibility of using the second USB 2.0 port to increase throughput.


symdeb commented on July 19, 2024

6 seconds would be more than enough to gather data. Looking at the schematic, the USB3343 is used. Does LUNA support USB low/full speed as well? It would need to get the USB data, but not via ULPI, correct?


martinling commented on July 19, 2024

Yes, low and full speed capture are supported. At the moment you have to select which speed you want to capture, but on-the-fly speed detection is on the todo list.

