
License: MIT License


lucid's Introduction

The Lucid / DPT (data plane threads) language

Lucid is a data plane programming language that focuses on simple, general, and modular abstractions. This makes it easier to express a range of data-plane algorithms and data structures, like specialized hash tables (e.g., cuckoo hashing), sketches, and custom telemetry caches. Lucid programs often take 10x fewer lines of code than equivalent P4 implementations, and read much more like Go, Python, or C.

Lucid also has a type-and-effect system that guarantees the ordering of operations on global state (state that persists across packets). Programs that pass Lucid's ordering check can be laid out as a pipeline (important for targets like the Tofino) and also have a convenient memory model: each packet's updates to all the global state can be viewed as an atomic transaction, and a stream of packets can be viewed as a serial sequence of such atomic transactions that executes in the order of packet arrivals. In other words, you don't need to worry about concurrency or race conditions at the packet level for Lucid programs that pass the ordering check.
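
The memory model described above can be pictured with a small Python sketch. This is not Lucid code, and the names `GlobalState`, `handle_packet`, and `run_serially` are hypothetical. It only illustrates the idea that each packet's updates to global state act as one transaction, and that transactions run one after another in arrival order.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalState:
    """Hypothetical stand-in for global state that persists across packets."""
    pkt_count: int = 0
    bytes_by_src: dict = field(default_factory=dict)


def handle_packet(state, src, length):
    """All updates for one packet complete before the next packet starts."""
    state.pkt_count += 1
    state.bytes_by_src[src] = state.bytes_by_src.get(src, 0) + length


def run_serially(packets):
    """Apply one transaction per packet, strictly in arrival order."""
    state = GlobalState()
    for src, length in packets:
        handle_packet(state, src, length)
    return state


final = run_serially([(1, 128), (2, 64), (1, 32)])
print(final.pkt_count, final.bytes_by_src)
```

Because no two handlers interleave in this model, the final state depends only on the order of arrivals, which is the property the ordering check is described as providing.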

There are 3 implementations of Lucid:

The Lucid interpreter. This defines the semantics of the language in a target-independent way. It is relatively fast and works on simple JSON events, either from a file given at startup or piped from stdin.
The Lucid-Tofino compiler. This backend compiles a Lucid program to a P4_16 program for the Tofino 1. It applies many target-specific optimizations drawn from our many years of programming the Tofino.
A DPDK-C compiler. This backend produces a C program that uses DPDK for packet I/O. It is a work in progress and is currently single-threaded.

Installation

The easiest way to run Lucid is with the pre-built binaries in the release directory. This should work for recent macOS and Ubuntu/Debian systems.

git clone https://github.com/PrincetonUniversity/lucid/
cd lucid
./release/unpack.sh
./release/dpt.sh -h

Note: there is only a pre-built binary for the Interpreter at this time.

Docker

There is also a docker image that can run the Lucid interpreter and Tofino compiler.

1. Install Docker

If you are on a laptop/desktop, just install the Docker Desktop app.
If you are on a server... you can probably figure out how to install Docker.

2. Clone this repository and pull the lucid docker container

Run this in your terminal:

git clone https://github.com/PrincetonUniversity/lucid/
cd lucid
./docker_lucid.sh pull

This will download about 400MB of data and should take < 5 minutes.

Once finished, you can run ./docker_lucid.sh interpret to run the interpreter or ./docker_lucid.sh compile to run the Tofino compiler. The docker_lucid script takes care of forwarding all arguments, files, and directories to / from the docker image.

Building from source

Finally, you can also build Lucid from source. Its main dependencies are OCaml and Z3. On macOS or Linux, you should be able to do:

./install_dependencies.sh
make

to build the Lucid interpreter (dpt) and Tofino compiler (dptc).

Syntax highlighting

There is a VSCode syntax highlighter for Lucid here: (https://github.com/benherber/Lucid-DPT-VSCode-Extension)

Lucid also renders okay as C (besides polymorphic size arguments).

Run the interpreter

Using the docker container, you can run the interpreter with ./docker_lucid.sh interpret <lucid program name>. The interpreter type checks your program, then runs it in a simulated network defined by a specification file.

Try it out with one of the tutorial programs (monitor.dpt):

jsonch@jsonchs-MBP lucid % ./docker_lucid.sh interp tutorials/interp/01monitor/monitor.dpt
# ... startup output elided ...
t=0: Handling event eth_ip(11,22,2048,1,2,128) at switch 0, port 0
t=600: Handling event prepare_report(11,22,2048,1,2,128) at switch 0, port 196
sending report about packet {src=1; dst=2; len=128} to monitor on port 2
dpt: Final State:

Switch 0 : {

 Pipeline : [ ]

 Events :   [ ]

 Exits :    [
    eth_ip(11,22,2048,1,2,128) at port 1, t=600
    report(1,2,128) at port 2, t=1200
  ]

 Drops :    [ ]

 packet events handled: 0
 total events handled: 2

}
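
If you want to post-process the interpreter's trace, the "Handling event" lines above have a regular shape. The following is a minimal sketch, assuming the line format stays exactly as printed in this example and that all event arguments are integers. `parse_trace_line` is a hypothetical helper, not part of Lucid.

```python
import re

# Matches lines such as:
#   t=0: Handling event eth_ip(11,22,2048,1,2,128) at switch 0, port 0
TRACE_RE = re.compile(
    r"t=(?P<time>\d+): Handling event (?P<name>\w+)\((?P<args>[^)]*)\)"
    r" at switch (?P<switch>\d+), port (?P<port>\d+)"
)


def parse_trace_line(line):
    """Return a dict describing one handled event, or None if no match."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    # Assumption: every argument is a decimal integer.
    args = [int(a) for a in m.group("args").split(",") if a.strip()]
    return {
        "time": int(m.group("time")),
        "name": m.group("name"),
        "args": args,
        "switch": int(m.group("switch")),
        "port": int(m.group("port")),
    }


line = "t=600: Handling event prepare_report(11,22,2048,1,2,128) at switch 0, port 196"
print(parse_trace_line(line))
```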

Run the compiler

Finally, to compile Lucid programs to P4 for the Intel Tofino, run the compiler with ./docker_lucid.sh compile <lucid program name> -o <build directory name>.

The compiler translates a Lucid program into P4, optimizes it, and produces a build directory with a P4 program, Python control plane, and a directory that maps control-plane modifiable Lucid object names to their associated P4 object names. Try it out with the tutorial application:

jsonch@jsonchs-MBP lucid % ./docker_lucid.sh compile tutorials/interp/01monitor/monitor.dpt -o monitor_build
build dir: monitor_build
build dir: /Users/jsonch/Desktop/gits/lucid/monitor_build
# ... output elided ...
[coreLayout] placing 41 atomic statement groups into pipeline
.........................................
[coreLayout] final pipeline
--- 41 IR statements in 3 physical tables across 2 stages ---
stage 0 -- 2 tables: [(branches: 3, IR statements: 25, statements: 25),(branches: 4, IR statements: 13, statements: 13)]
stage 1 -- 1 tables: [(branches: 4, IR statements: 3, statements: 3)]
Tofino backend: -------Layout for egress: wrapping table branches in functions-------
Tofino backend: -------Layout for egress: deduplicating table branch functions-------
Tofino backend: -------Translating to final P4-tofino-lite IR-------
Tofino backend: -------generating python event parsing library-------
Tofino backend: -------generating Lucid name => P4 name directory-------
Tofino backend: -------printing P4 program to string-------
Tofino backend: -------printing Python control plane to string-------
compiler: Compilation to P4 finished. Writing to build directory:/app/build
local p4 build is in: /Users/jsonch/Desktop/gits/lucid/monitor_build
jsonch@jsonchs-MBP lucid % ls monitor_build
eventlib.py     libs            lucid.p4        manifest.txt
globals.json    logs            lucid.py        scripts
layout_info.txt lucid.cpp       makefile        src
jsonch@jsonchs-MBP lucid % cat monitor_build/manifest.txt 
Lucid-generated tofino project folder
Contents: 
lucid.p4 -- P4 data plane program
lucid.py -- Python script to install multicast rules after starting lucid.p4
eventlib.py -- Python event parsing library
globals.json -- Globals name directory (maps lucid global variable names to names in compiled P4)
makefile -- simple makefile to build and run P4 program
lucid.cpp -- c control plane (currently unused)
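
The manifest uses a simple "file -- description" layout, so a build script can read it with a few lines of Python. This is a sketch that assumes the " -- " separator shown above is used consistently. `parse_manifest` is a hypothetical helper, and the sample text is copied from the output above.

```python
def parse_manifest(text):
    """Map each listed file name to its description; skip lines without ' -- '."""
    entries = {}
    for line in text.splitlines():
        name, sep, desc = line.partition(" -- ")
        if sep:
            entries[name.strip()] = desc.strip()
    return entries


sample = """lucid.p4 -- P4 data plane program
lucid.py -- Python script to install multicast rules after starting lucid.p4
eventlib.py -- Python event parsing library
makefile -- simple makefile to build and run P4 program
"""

manifest = parse_manifest(sample)
for name, desc in manifest.items():
    print(f"{name}: {desc}")
```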

What to look at next

There are many example programs in our test suite, including some for the Tofino.

Lucid has been used to implement a number of interesting applications.

Lucid's wiki documents all of Lucid's well-supported features.

There are also a number of publications about Lucid, listed below.

Publications

About Lucid or its components:

Lucid, a Language for Control in the Data Plane (SIGCOMM 2021 -- The SIGCOMM 2021 artifact is in the sigcomm21_artifact branch).
Safe, modular packet pipeline programming (POPL 2022)
Lucid, a high-level easy-to-use dataplane programming language (Devon K. Loehr's PhD Thesis)
Automated Optimization of Parameterized Data-Plane Programs with Parasol (arXiv preprint)

Using Lucid:

SwitchLog: A Logic Programming Language for Network Switches (PADL 2023)


lucid's Issues

Near-term TODO

In no particular order:

Parsers
Tuples in surface syntax
Smooth out entry/exit/whatever event designations
Actually get partial evaluation working
Make module interfaces detachable

Constant propagation

Probably a big project with a few levels, but there should be as much constant propagation as possible due to hardware constraints. This comes up a lot with polymorphic sizes and in other compilation passes. Ideas:

simple constant prop (adding of two constants/bitshifts with constant size/etc.)
Eliminating branches of match tables/if statements if they switch on a constant
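
The two ideas above can be sketched on a toy expression tree. This Python example is only an illustration: the tuple-based AST and the `fold` function are hypothetical and unrelated to the compiler's real IR, and the sketch ignores bit widths (a real pass would have to wrap results to the operand size).

```python
def fold(expr):
    """Fold constant subexpressions and drop branches with constant conditions.

    Expressions are tuples:
      ("const", n), ("var", name), ("add", a, b), ("shl", a, b),
      ("if", cond, then_expr, else_expr)
    """
    kind = expr[0]
    if kind in ("const", "var"):
        return expr
    if kind in ("add", "shl"):
        left, right = fold(expr[1]), fold(expr[2])
        if left[0] == "const" and right[0] == "const":
            # Both operands are known, so compute the result now.
            value = left[1] + right[1] if kind == "add" else left[1] << right[1]
            return ("const", value)
        return (kind, left, right)
    if kind == "if":
        cond = fold(expr[1])
        if cond[0] == "const":
            # The condition is known, so keep only the branch that is taken.
            return fold(expr[2] if cond[1] else expr[3])
        return ("if", cond, fold(expr[2]), fold(expr[3]))
    raise ValueError(f"unknown expression kind: {kind}")


print(fold(("add", ("var", "x"), ("add", ("const", 1), ("const", 2)))))
```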

New memops not yet implemented in the backend!

I followed the instructions, but when I try to run make compile, the system reports: Failure "New memops not yet implemented in the backend!". Can someone tell me why?

Error in chain_prob_stateful_firewall.dpt

When compiling chain_prob_stateful_firewall.dpt from examples/popl21, the hash units generated in the resulting P4 file have incorrect type variables. Specifically, in the code snippet below,

    CRCPolynomial<bit<32>>(44, true, false, false, 0, 0) dpt_12073_poly;
    Hash<bit<32>>(HashAlgorithm_t.CUSTOM, dpt_12073_poly) dpt_6329_hasher_12072_alu_0_opstmt;
    action dpt_6329_alu_0_opstmt() {
        hdr.BloomFilter_clear_helper_filter.idx = dpt_6329_hasher_12072_alu_0_opstmt.get({hdr.add_to_firewall.args_0, hdr.add_to_firewall.args_1});
        }

the hash unit should output <bit<4>> instead of <bit<32>>.
One thing to note is that this error only occurs on hash units for clearing indexes, i.e. not on the hash units for add_to_filter and in_filter.
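
To make the width mismatch concrete: a 4-bit index addresses 16 slots, so a hash used as that index has to be reduced to 4 bits. The Python sketch below shows the general relationship only. `zlib.crc32` is a stand-in for the custom CRC in the generated P4, and `index_width` and `hash_index` are hypothetical names.

```python
import zlib


def index_width(num_slots):
    """Bits needed to index num_slots entries (power-of-two sizes only)."""
    if num_slots <= 0 or num_slots & (num_slots - 1):
        raise ValueError("num_slots must be a positive power of two")
    return num_slots.bit_length() - 1


def hash_index(key, num_slots):
    """Hash key (bytes) and keep only the low bits needed for the index."""
    width = index_width(num_slots)
    return zlib.crc32(key) & ((1 << width) - 1)


print(index_width(16))            # a 16-slot array needs a 4-bit index
print(hash_index(b"\x01\x02", 16))
```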

The expression wire event appears in the source code. Does this refer to a non-background event?

Put Types into tuples and records instead of Raw Types

Right now the type of a record/tuple includes a list of raw types, rather than a list of types. Is there a reason for this? The issue came up when trying to add information into the type but this information cannot be properly propagated during inference because of the raw_ty rather than ty in a record/tuple type. If there is a good reason for the raw_ty then I will find something else, but if not then we should change it.

An error occurred while executing the buildbox.sh interpreter command

Hello, thank you very much for your excellent work.

However, I encountered the following error when executing the buildbox.sh interpreter command as described in the section To build a lucid only VM:

default: setting up for compiler
default: ***** installing requirements to run: lucid compiler *****
default: p4 studio cannot install: either /home/vagrant/vm or /home/vagrant/vm not found

The SSH command responded with a non-zero exit status. Vagrant assumes that this means the command failed. The output for this command should be in the log above. Please read the output to determine what went wrong.

I chose the Lucid-only VM, so why does it say p4 studio can't be installed?

I have no clue about this and look forward to your reply.

Bug Report: Same field name in different header types leads to "Cannot unify" error

The attached program has two header types, ip_hdr_prefix and udp_hdr, and both have a field named len. When accessing ip#len, the compiler fails with:

error: bugreport_fieldname.dpt: Cannot unify
 {int<<16>> sport; int<<16>> dport; int<<16>> len; int<<16>> csum}
 with
 {int<<8>> v_ihl; int<<8>> tos; int<<16>> len; int<<16>> id; int<<16>> flags_frag; int<<8>> ttl; int<<8>> proto; int<<16>> csum}

Looks like the second "len" in udp_hdr caused the problem; changing it to some other field name will fix the issue.

bugreport_fieldname.dpt

Feature Wishlist

These are in no particular order, but are split into "good first issues" (might require some design work but should be doable), and everything else. The wilder ideas are not necessarily feasible, and should be discussed and thought about before we decide if they're worth pursuing. Feel free to suggest more.

Good First Issues

Global primitives (e.g. you could write global int x. Currently you have to create a 1-element array to achieve this).
A separate location type for switch IDs. Currently these are represented as ints. We'd probably want some way to cast between them.
Partial evaluation and variable substitution pass -- get rid of intermediate variables and static computations where possible.
Assignment to individual vector elements (e.g. vec[3] = 7). Likely related to the next point.
Allow use of the underscore identifier _ to indicate a variable/argument that won't be used.
Add a packet keyword to indicate events which should carry the underlying packet's payload with them. Should be easy to add to the frontend, possibly more work to add to the backend.
Restrict events to allow only first-order (no event-type arguments) and second-order (only first-order event-type arguments) events, rather than allowing arbitrary higher-order events as we do now.
Allow the return keyword to be used in a handler to terminate computation so the user doesn't have to write big if/else branches.
Make the interpreter interactive (let people pause it, insert their own packets in the middle of execution, and maybe inspect the network while it's paused)

Wilder ideas

Add a mod operator (or something like it) to implement wrapping around.
Allow users to define "size lists", which can be used to construct vectors where the elements have different sizes. Syntax might be something like sizelist foo = [4; 6; 8]; type vec = vector<foo>; vec x = [1<4>; 10<6>; 17<8>];.
Better type errors when unification fails
Better ordering errors that explain which line(s) and global variable(s) are relevant.
Better analysis of when it's okay for entry events to generate non-exit events. Might require user annotations.
Built-in security, e.g. events that can't be spoofed.
Integration with the control plane. It's there, might as well use it.
Integration with P4All. Related to the next point.
Automatically finding "good" values for certain abstract parameters, e.g. the size of the array. Could use our simulator, which is much faster than the Tofino's (albeit at a much higher level of abstraction).
Functors -- could be used for e.g. chain replication, or adding fault tolerance to certain events.
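
On the mod operator item in the list above: wrapping around is often done without a general modulo, which may matter on hardware targets. The Python sketch below shows two common substitutes. It is an illustration, not a proposal from the original list, and the function names are hypothetical.

```python
def wrap_pow2(x, size):
    """x mod size for a power-of-two size, using only a bitwise AND."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    return x & (size - 1)


def wrap_increment(idx, size):
    """Advance an index by one and wrap to 0 at the end; works for any size."""
    nxt = idx + 1
    return 0 if nxt == size else nxt


print(wrap_pow2(17, 16))
print(wrap_increment(4, 5))
```

The first form only covers power-of-two sizes, and the second only covers steps of one, so a real mod operator (or something like it) would still be more general.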

Problems with match statement

There is an error from the match statement below:

  match in_port, out_port with
  | 0, 1 -> {   int<<32>> src_ip = (int<<32>>)src_ip;
                int<<32>> dst_ip = (int<<32>>)dst_ip;
                BloomFilter.add_to_filter(filter, [src_ip; dst_ip]); }
  | _, _ -> {   int<<32>> src_ip = (int<<32>>)src_ip;
                int<<32>> dst_ip = (int<<32>>)dst_ip;
                bool in_filter = BloomFilter.in_filter(filter, [dst_ip; src_ip]); }

Error message when "typing again":
error: chain_prob_stateful_firewall.dpt: Unbound variable src_ip~602
which seems to be a bug.

Function inlining and comprehension unrolling must be done concurrently

Currently function calls are expressions which become statements when inlined. This causes a problem during function inlining, because we might have function calls inside vector comprehensions. We can't inline the function first because comprehensions take an expression, and we can't unroll the comprehension prior to inlining because we might not know how long it will be. Currently we get around this by doing both simultaneously, but this is both complicated and prevents us from running inlining as early as we'd like to. Some examples:

When integrating P4All, we can't compute stage estimates as precisely as we would like because we can't inline functions beforehand.
We can't pass globally-typed function arguments to events (see https://github.com/PrincetonUniversity/lucid/blob/main/examples/regression/reg4.dpt)

We can work around both of these (taking less precise estimates, or duplicating function bodies the same way we do events), but it might be nicest to find a way to decouple inlining and comprehension unrolling. We might do this by creating a new type of expression, e.g. EStmt of stmt * id, which has the semantics of evaluating the statement and then returning the value of the id variable (which may be initialized within the statement). Alternatively, we could create a new type of comprehension which basically inlines this, e.g. EComp2 of (stmt * id) * id * size.
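
The proposed EStmt of stmt * id form can be prototyped in a few lines. The sketch below is in Python rather than OCaml and uses a hypothetical tuple-based mini language. It shows only the stated semantics: run the statement, then return the value of the named variable. That lets an inlined function body sit inside a comprehension that is unrolled later.

```python
def exec_stmt(stmt, env):
    """Statements: ("assign", name, expr) or ("seq", stmt1, stmt2)."""
    kind = stmt[0]
    if kind == "assign":
        env[stmt[1]] = eval_expr(stmt[2], env)
    elif kind == "seq":
        exec_stmt(stmt[1], env)
        exec_stmt(stmt[2], env)
    else:
        raise ValueError(f"unknown statement kind: {kind}")


def eval_expr(expr, env):
    """Expressions: const, var, add, and estmt (statement plus result id)."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return eval_expr(expr[1], env) + eval_expr(expr[2], env)
    if kind == "estmt":
        _, stmt, result_id = expr
        exec_stmt(stmt, env)      # evaluate the statement first
        return env[result_id]     # then return the named variable's value
    raise ValueError(f"unknown expression kind: {kind}")


def unroll_comprehension(body, index_var, size, env):
    """Unroll [body for index_var < size] once size is finally known."""
    out = []
    for i in range(size):
        env[index_var] = i
        out.append(eval_expr(body, env))
    return out


# An already-inlined function body "ret = i + 10" wrapped as an expression.
inlined = ("estmt", ("assign", "ret", ("add", ("var", "i"), ("const", 10))), "ret")
print(unroll_comprehension(inlined, "i", 3, {}))
```

Here inlining (building `inlined`) happens before the size is known, and unrolling happens afterwards, which is the decoupling the paragraph above asks for.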

Give a proper error if module interface has too general of a type

Currently if we have something like

module Foo {
  fun void ('a x);
} {
  fun void ('a x) {
    int z = x + 1; // Oops, x actually has to be an int, not 'a
  }
}

The type system won't properly catch the fact that the interface type is too general.
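
One way to think about the missing check: the type inferred for the implementation must be at least as general as the type declared in the interface. The Python sketch below illustrates that idea with a one-way matching function. It is an assumption about what a proper check could look like, not a description of Lucid's type checker, and all names are hypothetical. Type variables are strings starting with a quote, and function argument lists are tuples.

```python
def is_var(t):
    """Type variables are written like 'a (a string starting with a quote)."""
    return isinstance(t, str) and t.startswith("'")


def instantiates_to(general, specific, subst=None):
    """True if substituting for variables in `general` yields `specific`.

    Variables appearing in `specific` are treated as rigid (not substitutable).
    """
    subst = {} if subst is None else subst
    if is_var(general):
        if general in subst:
            return subst[general] == specific
        subst[general] = specific
        return True
    if isinstance(general, tuple) and isinstance(specific, tuple):
        return len(general) == len(specific) and all(
            instantiates_to(g, s, subst) for g, s in zip(general, specific)
        )
    return general == specific


def check_interface(interface_ty, inferred_ty):
    """Reject an interface type that promises more generality than the body has."""
    if not instantiates_to(inferred_ty, interface_ty):
        raise TypeError(
            f"interface type {interface_ty} is too general for body type {inferred_ty}"
        )


# Interface says the argument is 'a, but the body forces it to be int.
try:
    check_interface(("'a",), ("int",))
except TypeError as err:
    print(err)
```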