We currently have some FIXMEs involving structs that we should address:
Here is a minimal Copilot example involving structs to use as a test case:
{-# LANGUAGE DataKinds #-}
module Main where

import Language.Copilot
import Copilot.Compile.C99
import Copilot.Verifier (verify)

newtype Battery = Battery
  { temp :: Field "temp" Word16
  }

instance Struct Battery where
  typename _ = "battery"
  toValues battery = [Value typeOf (temp battery)]

instance Typed Battery where
  typeOf = Struct (Battery (Field 0))

spec :: Spec
spec = do
  let battery :: Stream Battery
      battery = extern "battery" Nothing

  trigger "testfun" true [arg battery]

main :: IO ()
main = reify spec >>= verify mkDefaultCSettings [] "structs"
Before we can even get to the FIXMEs above, however, there is a more fundamental issue: LLVM compiles structs in somewhat surprising ways. Consider this C program with three different structs:
struct s1 {
  int x1;
  double y1;
};

struct s2 {
  int x2;
  double y2;
  char z2;
};

struct s3 {
  int x3;
  int y3;
};

void f1(struct s1 ss) {}
void f2(struct s2 ss) {}
void f3(struct s3 ss) {}
Although f1, f2, and f3 all look quite similar, they are compiled very differently. Here is the bitcode that results from compiling this program:
%struct.s2 = type { i32, double, i8 }
; Function Attrs: norecurse nounwind readnone uwtable
define dso_local void @f1(i32 %ss.coerce0, double %ss.coerce1) local_unnamed_addr #0 {
entry:
ret void
}
; Function Attrs: norecurse nounwind readnone uwtable
define dso_local void @f2(%struct.s2* nocapture byval(%struct.s2) align 8 %ss) local_unnamed_addr #0 {
entry:
ret void
}
; Function Attrs: norecurse nounwind readnone uwtable
define dso_local void @f3(i64 %ss.coerce) local_unnamed_addr #0 {
entry:
ret void
}
Three somewhat surprising things to note:
- The bitcode for f1 takes two arguments rather than a single %struct.s1 argument. In fact, there is no %struct.s1 type at all in the bitcode! Instead, LLVM effectively unpacks the x1 and y1 fields into f1's arguments.
- The bitcode for f2 has a %struct.s2* argument (a pointer type) rather than a %struct.s2 argument. That is, at the bitcode level it receives its argument through a pointer marked byval, not as a struct value.
- The bitcode for f3 takes an i64 as an argument rather than a %struct.s3. Again, there is no %struct.s3 type at all in the bitcode. Instead, LLVM combines the x3 and y3 fields into a single 64-bit argument.
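To make the f3 case concrete, the following C sketch mimics by hand what the single i64 argument carries: the eight bytes of a struct s3 copied into one 64-bit integer. The helper names pack_s3 and unpack_s3 are hypothetical and exist only for illustration, and the sketch assumes a target where struct s3 occupies exactly eight bytes, as on x86-64.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct s3 {
    int x3;
    int y3;
};

/* Hypothetical helper: copy the bytes of a struct s3 into one 64-bit
 * integer, which is the shape of the %ss.coerce argument of f3. */
static uint64_t pack_s3(struct s3 ss) {
    uint64_t bits = 0;
    /* Assumption: the struct is exactly eight bytes (true on x86-64). */
    assert(sizeof ss == sizeof bits);
    memcpy(&bits, &ss, sizeof bits);
    return bits;
}

/* Hypothetical helper: recover the struct from the packed integer. */
static struct s3 unpack_s3(uint64_t bits) {
    struct s3 ss;
    memcpy(&ss, &bits, sizeof ss);
    return ss;
}
```

On a little-endian target, x3 ends up in the low 32 bits of the integer and y3 in the high 32 bits.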
These oddities are all explained by the System V ABI (see section 3.2.3 of this document), which has very particular requirements for structs that are passed by value. There are similar oddities that arise when returning structs, such as in these examples:
struct s1 g1() {
struct s1 ss = { .x1 = 0, .y1 = 0 };
return ss;
}
struct s2 g2() {
struct s2 ss = { .x2 = 0, .y2 = 0, .z2 = 0 };
return ss;
}
struct s3 g3() {
struct s3 ss = { .x3 = 0, .y3 = 0 };
return ss;
}
These give rise to the following LLVM bitcode:
; Function Attrs: norecurse nounwind readnone uwtable
define dso_local { i32, double } @g1() local_unnamed_addr #0 {
entry:
ret { i32, double } zeroinitializer
}
; Function Attrs: argmemonly nounwind willreturn
declare void @llvm.memset.p0i8.i64(i8* nocapture writeonly, i8, i64, i1 immarg) #1
; Function Attrs: nounwind uwtable
define dso_local void @g2(%struct.s2* noalias nocapture sret %agg.result) local_unnamed_addr #2 {
entry:
%0 = bitcast %struct.s2* %agg.result to i8*
tail call void @llvm.memset.p0i8.i64(i8* nonnull align 8 dereferenceable(24) %0, i8 0, i64 24, i1 false)
ret void
}
; Function Attrs: norecurse nounwind readnone uwtable
define dso_local i64 @g3() local_unnamed_addr #0 {
entry:
ret i64 0
}
The treatment of g1 and g3 closely mirrors that of f1 and f3. g2 is perhaps the strangest of all, as it takes the struct s2 return type and converts it into a pointer argument (%agg.result, marked sret)!
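The g2 case can be read as if the C function had been written with an explicit out-parameter. The sketch below is a hand-written C analogue, not compiler output. The name g2_lowered is hypothetical, and the correspondence with sret is an informal reading of the bitcode above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct s2 {
    int x2;
    double y2;
    char z2;
};

/* The original function: returns the struct by value. */
struct s2 g2(void) {
    struct s2 ss = { .x2 = 0, .y2 = 0, .z2 = 0 };
    return ss;
}

/* Hypothetical analogue of the lowered form: the caller allocates the
 * result and passes a pointer to it, mirroring the sret argument
 * %agg.result. The callee fills it in and returns nothing. */
void g2_lowered(struct s2 *result) {
    assert(result != NULL);
    /* Zero the whole struct, as the memset call in the bitcode does. */
    memset(result, 0, sizeof *result);
}
```

A caller of g2_lowered declares a struct s2 on its own stack and passes its address, which is roughly what the compiler arranges behind the scenes for every call to g2.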
All of these quirks will make copilot-verifier's job more challenging, as it has to map the Copilot types, which closely correspond to the C source language, to the types that they are compiled to in the LLVM target language. This means that we will either need to:
- Implement all of the corner cases of the System V ABI involving structs in copilot-verifier, or
- Change copilot-c99's codegen such that it passes and returns structs by reference, not by value, in the functions that copilot-verifier overrides (e.g., the trigger functions).
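As a rough indication of what the first option involves, the sketch below applies only the coarsest rule from the x86-64 System V ABI: structs larger than two eightbytes (16 bytes) are passed in memory, while smaller ones are split into at most two eightbyte-sized pieces. This is a deliberate simplification (it ignores field classes, alignment corner cases, and vector types), and the names pass_kind, classify_by_size, and eightbytes are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct s1 { int x1; double y1; };
struct s2 { int x2; double y2; char z2; };
struct s3 { int x3; int y3; };

/* Hypothetical, simplified outcome of ABI classification. */
enum pass_kind { PASS_IN_REGISTERS, PASS_IN_MEMORY };

/* Simplified rule: anything larger than 16 bytes goes in memory.
 * The real ABI also inspects the class of every field. */
static enum pass_kind classify_by_size(size_t size) {
    assert(size > 0);
    return size > 16 ? PASS_IN_MEMORY : PASS_IN_REGISTERS;
}

/* Number of eightbyte-sized pieces needed to hold `size` bytes. */
static size_t eightbytes(size_t size) {
    return (size + 7) / 8;
}
```

With the usual x86-64 layout, struct s1 is 16 bytes (two eightbytes, matching the two arguments of f1), struct s2 is 24 bytes (passed in memory, matching the byval pointer and the 24-byte memset), and struct s3 is 8 bytes (one eightbyte, matching the i64).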
The second option certainly sounds less tricky, although it would require coordination with the upstream copilot repo.
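For the second option, the difference at the C level might look as follows. This is a sketch only: the struct layout is an assumption about what code generated for the Battery example would resemble, and testfun_by_value, testfun_by_ref, and last_temp are hypothetical names, not actual copilot-c99 output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed C counterpart of the Copilot Battery struct. */
struct battery {
    uint16_t temp;
};

/* Records the last temperature seen, so the sketch has observable
 * behavior. */
static uint16_t last_temp = 0;

/* By-value trigger: how the argument appears in the bitcode depends on
 * the ABI rules discussed above. */
void testfun_by_value(struct battery b) {
    last_temp = b.temp;
}

/* By-reference trigger: the parameter is a plain pointer in C, so it
 * should appear as a pointer in the bitcode regardless of the struct's
 * size. */
void testfun_by_ref(const struct battery *b) {
    assert(b != NULL);
    last_temp = b->temp;
}
```

With the by-reference form, copilot-verifier would only ever need to relate a Copilot struct to a pointer to the corresponding LLVM struct type, rather than to whichever shape the ABI picks for a given struct.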