
llvm / clangir


A new (MLIR based) high-level IR for clang.

Home Page: https://clangir.org

License: Other

clang codegeneration compiler-frontend compilers cpp intermediate-code-generation intermediate-language intermediate-representation llvm

clangir's Introduction

ClangIR (CIR)

Check https://clangir.org for general information, build instructions and documentation.

clangir's People

Contributors

arsenm, chandlerc, chapuni, d0k, ddunbar, douggregor, dwblaikie, echristo, espindola, fhahn, isanbard, jdevlieghere, kazutakahirata, klausler, labath, lattner, lebedevri, lhames, maskray, nico, nikic, preames, rksimon, rnk, rotateright, rui314, tkremenek, topperc, vitalybuka, zygoloid


clangir's Issues

Revise usage of `const_struct` and `const_array` to enhance codegen parity

// TODO(cir): constant arrays are currently just pushed into the stack using
// the store instruction, instead of being stored as global variables and
// then memcopyied into the stack (as done in Clang).
else if (auto arrTy = op.getType().dyn_cast<mlir::cir::ArrayType>()) {
// Fetch operation constant array initializer.
auto constArr = op.getValue().dyn_cast<mlir::cir::ConstArrayAttr>();
if (!constArr)
return op.emitError() << "array does not have a constant initializer";
// Lower constant array initializer.
auto denseAttr = lowerConstArrayAttr(constArr, typeConverter);
if (!denseAttr.has_value()) {
op.emitError()
<< "unsupported lowering for #cir.const_array with element type "
<< arrTy.getEltType();
return mlir::failure();
}
attr = denseAttr.value();
} else

Note: This is indeed an interesting difference. If we forget about LLVM and think only about CIR for a moment, we're also not being uniform right now (my fault): const_struct attrs are used inline within cir.const while others go through globals. At some point we need to migrate more.

Originally posted by @bcardosolopes in #171 (review)

I wonder if this approach is actually better than what clang does, given that no memcpy/globals are involved. OTOH we might be losing uniquing from unnamed address tagged globals. Any thoughts @htyu ?

Originally posted by @bcardosolopes in #171 (comment)

An explicit assignment supports later optimization passes better than a memcpy does, since a memcpy always requires an additional analysis step. However, from a performance point of view, an explicit assignment would be less efficient than a memcpy, but I guess we can always push the conversion to memcpy down to LLVM.

Originally posted by @htyu in #171 (comment)
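
For readers following the trade-off above, here is a minimal source-level sketch of the two strategies. It is only an analogy in plain C++, not ClangIR code; the function names and the kInitializer global are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Illustration only: neither function is ClangIR code.

// Strategy A, analogous to the current CIR lowering: one store per element.
std::array<int, 4> initWithStores() {
  std::array<int, 4> a{};
  a[0] = 1;
  a[1] = 2;
  a[2] = 3;
  a[3] = 4;
  return a;
}

// Strategy B, analogous to what the TODO attributes to Clang: the
// initializer lives in a constant global and is copied into the local.
constexpr int kInitializer[4] = {1, 2, 3, 4};

std::array<int, 4> initWithMemcpy() {
  std::array<int, 4> a{};
  std::memcpy(a.data(), kInitializer, sizeof(kInitializer));
  return a;
}

// Both strategies must leave the same contents in the array.
void checkStrategiesAgree() {
  assert(initWithStores() == initWithMemcpy());
}
```

The individual stores in strategy A are directly visible to later passes, while strategy B keeps a single shared constant that can be uniqued.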

C/C++ atomic types and operations

There are multiple ways to go about this. Ideally, member functions would be idiomatically recognized and operations implemented on top of first-class CIR types.

Fix CIR parsing problem with struct/classes

For something as simple as:

struct String {
  long size;
  long capacity;

  String() : size{0}, capacity{0} {}
  String(char const *s) : size{strlen(s)}, capacity{size} {}
};

We can emit CIR but not read it back:

error: expected '=' in type alias definition
!22struct2EString22 = !cir.struct<"struct.String", i64, i64>
     ^

We probably need some CIR specific prefix here or something.
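
A rough sketch of the "CIR specific prefix" idea, assuming the parser rejects the alias because its name starts with a digit (MLIR's bare identifiers begin with a letter or underscore). The helper makeAliasName and the ty_ prefix are hypothetical and do not exist in ClangIR.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of ClangIR: make a mangled name usable as an
// alias by prefixing it when it does not start with a letter or '_'.
std::string makeAliasName(const std::string &mangled) {
  const std::string prefix = "ty_"; // assumed prefix, chosen for illustration
  if (mangled.empty())
    return prefix;
  const unsigned char first = static_cast<unsigned char>(mangled.front());
  if (std::isalpha(first) || first == '_')
    return mangled; // already a valid start character
  return prefix + mangled;
}

// The alias name from the report gets a prefix; a plain name is untouched.
void checkAliasNames() {
  assert(makeAliasName("22struct2EString22") == "ty_22struct2EString22");
  assert(makeAliasName("String") == "String");
}
```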

Return int8Ty instead of i1 when function defined return type `bool`

While trying to add the LNot operation, I noticed that the type converter translates cir.bool to i8; I think i1 may be more appropriate.
For example:

bool ulnot() {
  unsigned a = 0;
  return ~a;
}

./bin/clang -emit-cir t.cpp -o -

module attributes {cir.sob = #cir.signed_overflow_behavior<undefined>} {
  cir.func @_Z5ulnotv() -> !cir.bool {
    %0 = cir.alloca !cir.bool, cir.ptr <!cir.bool>, ["__retval"] {alignment = 1 : i64} loc(#loc2)
    %1 = cir.alloca i32, cir.ptr <i32>, ["a", init] {alignment = 4 : i64} loc(#loc9)
    %2 = cir.cst(0 : i32) : i32 loc(#loc4)
    cir.store %2, %1 : i32, cir.ptr <i32> loc(#loc9)
    %3 = cir.load %1 : cir.ptr <i32>, i32 loc(#loc5)
    %4 = cir.unary(not, %3) : i32, i32 loc(#loc6)
    %5 = cir.cast(int_to_bool, %4 : i32), !cir.bool loc(#loc10)
    cir.store %5, %0 : !cir.bool, cir.ptr <!cir.bool> loc(#loc11)
    %6 = cir.load %0 : cir.ptr <!cir.bool>, !cir.bool loc(#loc11)
    cir.return %6 : !cir.bool loc(#loc11)
  } loc(#loc8)
}

./bin/clang -emit-cir t.cpp -o - | ./bin/cir-tool -cir-to-llvm -o -

module {
  llvm.func @_Z5ulnotv() -> i8 {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x i8 {alignment = 1 : i64} : (i64) -> !llvm.ptr<i8>
    %2 = llvm.mlir.constant(1 : index) : i64
    %3 = llvm.alloca %2 x i32 {alignment = 4 : i64} : (i64) -> !llvm.ptr<i32>
    %4 = llvm.mlir.constant(0 : i32) : i32
    llvm.store %4, %3 : !llvm.ptr<i32>
    %5 = llvm.load %3 : !llvm.ptr<i32>
    %6 = llvm.mlir.constant(-1 : i32) : i32
    %7 = llvm.xor %6, %5  : i32
    %8 = llvm.mlir.constant(0 : i32) : i32
    %9 = llvm.icmp "ne" %7, %8 : i32
    %10 = llvm.zext %9 : i1 to i8
    llvm.store %10, %1 : !llvm.ptr<i8>
    %11 = llvm.load %1 : !llvm.ptr<i8>
    llvm.return %11 : i8
  }
}
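
To make the lowered sequence easier to follow, the sketch below mirrors it step by step in C++. The function ulnotLowered is made up for illustration; it only shows that the value is computed as a one-bit comparison result and then widened to 8 bits, which is the part this issue questions.

```cpp
#include <cstdint>

// The function from the report, unchanged apart from constexpr.
constexpr bool ulnot() {
  unsigned a = 0;
  return ~a;
}

// Hypothetical mirror of the lowered LLVM dialect output above.
constexpr std::uint8_t ulnotLowered() {
  std::uint32_t a = 0;
  std::uint32_t flipped = 0xFFFFFFFFu ^ a;   // llvm.xor with -1
  bool nonZero = flipped != 0;               // llvm.icmp "ne" yields i1
  return static_cast<std::uint8_t>(nonZero); // llvm.zext i1 to i8
}

static_assert(ulnot(), "~0u converts to true");
static_assert(ulnotLowered() == 1, "the widened result is the byte value 1");
```

Both versions agree on the value; they differ only in the width of the returned type.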

Proper support for primitive types

Right now we use MLIR's ints, floats, etc. Just as we have cir.bool and cir.ptr, we should have a cir.int-like type that tracks more language-level info (signedness, etc.).

MLIR based diagnostics (e.g. LifetimeChecker) should be defined using tablegen

Currently CIR uses the diagnostic engine provided by MLIR.

I wonder if CIR will migrate to the diagnostic engine of clang itself in the future? I ask because the existing clang diagnostics are defined declaratively using TableGen and have their own diagnostic groups and settings (e.g. enabled by default or not).

Example:

D << "use of invalid pointer '" << varName << "'";

Link: https://mlir.llvm.org/docs/Diagnostics/
Link: https://clang.llvm.org/docs/InternalsManual.html#the-diagnostics-subsystem

Support for `Decl::PragmaComment`

Is support for pragmas in scope for ClangIR? I was wondering if the content of a pragma could be exposed as a ClangIR attribute, maybe analogously to ASTFunctionDeclAttr by wrapping PragmaCommentDecl?

I'd be happy to add this if it's a good first issue.

Create a common interface/trait for CIR global values

In the original codegen, Clang has a parent class GlobalValue that can handle different types of child classes homogeneously (e.g. global variables, functions, etc.)

class GlobalValue : public Constant {

We often run into cases where we need to perform unnecessary casts to distinguish between global variables and functions because, despite both being global values, they do not share a common interface. Some examples:

// TODO(cir): can a tentative definition come from something other than a
// global op? If not, the assertion below is wrong and should be removed. If
// so, getGlobalValue might be better of returining a global value interface
// that alows use to manage different globals value types transparently.
if (GV)
assert(isa<mlir::cir::GlobalOp>(GV) &&
"tentative definition can only be built from a cir.global_op");

// TODO(cir): create a global value trait that allow us to uniformly handle
// global variables and functions.
if (auto Gv = dyn_cast<mlir::cir::GetGlobalOp>(Op)) {
auto *result =
mlir::SymbolTable::lookupSymbolIn(getModule(), Gv.getNameAttr());
if (auto globalOp = dyn_cast<mlir::cir::GlobalOp>(result))
if (!globalOp.isDeclaration())
return;
}

Ideally, the CIRGenModule::getGlobalValue methods should return something akin to a CIRGlobalValueInterface that allows us to handle these globals transparently for actions common to any global value.
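
As a rough picture of what such a shared interface buys, here is a plain C++ sketch. The real ops are MLIR operations and would use an MLIR interface or trait; every class name below is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical common interface for anything that is a global value.
struct GlobalValueLike {
  virtual ~GlobalValueLike() = default;
  virtual const std::string &name() const = 0;
  virtual bool isDeclaration() const = 0;
};

// Stand-in for a global variable: a declaration if it has no initializer.
class GlobalVariableLike final : public GlobalValueLike {
public:
  GlobalVariableLike(std::string name, bool hasInitializer)
      : name_(std::move(name)), hasInitializer_(hasInitializer) {}
  const std::string &name() const override { return name_; }
  bool isDeclaration() const override { return !hasInitializer_; }

private:
  std::string name_;
  bool hasInitializer_;
};

// Stand-in for a function: a declaration if it has no body.
class FunctionLike final : public GlobalValueLike {
public:
  FunctionLike(std::string name, bool hasBody)
      : name_(std::move(name)), hasBody_(hasBody) {}
  const std::string &name() const override { return name_; }
  bool isDeclaration() const override { return !hasBody_; }

private:
  std::string name_;
  bool hasBody_;
};

// Callers that only ask "is this defined?" no longer need a cast.
bool isDefined(const GlobalValueLike &value) { return !value.isDeclaration(); }

void checkUniformHandling() {
  assert(isDefined(GlobalVariableLike("g", /*hasInitializer=*/true)));
  assert(!isDefined(FunctionLike("f", /*hasBody=*/false)));
}
```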

`Dialect/Math/canonicalize.mlir` is failing on macOS

Not sure if this is macOS specific, but the test fails due to floating point representation.
The test case around @sin_fold

func.func @sin_fold() -> f32 {
  %c = arith.constant 1.0 : f32
  %r = math.sin %c : f32
  return %r : f32
}

expects 0.84{{[0-9]+}} but canonicalizer produces 8.414710e-01.

The following diff fixes the test for me, but I'm not sure if it breaks something on other platforms:

diff --git a/mlir/test/Dialect/Math/canonicalize.mlir b/mlir/test/Dialect/Math/canonicalize.mlir
index d7c4bb712992..1b9eac808f0b 100644
--- a/mlir/test/Dialect/Math/canonicalize.mlir
+++ b/mlir/test/Dialect/Math/canonicalize.mlir
@@ -449,7 +449,7 @@ func.func @trunc_fold_vec() -> (vector<4xf32>) {
 }

 // CHECK-LABEL: @sin_fold
-// CHECK-NEXT: %[[cst:.+]] = arith.constant 0.84{{[0-9]+}} : f32
+// CHECK-NEXT: %[[cst:.+]] = arith.constant 8.4{{[0-9]+}}e-01 : f32
 // CHECK-NEXT:   return %[[cst]]
 func.func @sin_fold() -> f32 {
   %c = arith.constant 1.0 : f32
@@ -458,7 +458,7 @@ func.func @sin_fold() -> f32 {
 }

 // CHECK-LABEL: @sin_fold_vec
-// CHECK-NEXT: %[[cst:.+]] = arith.constant dense<[0.000000e+00, 0.84{{[0-9]+}}, 0.000000e+00, 0.84{{[0-9]+}}]> : vector<4xf32>
+// CHECK-NEXT: %[[cst:.+]] = arith.constant dense<[0.000000e+00, 8.4{{[0-9]+}}e-01, 0.000000e+00, 8.4{{[0-9]+}}e-01]> : vector<4xf32>
 // CHECK-NEXT:   return %[[cst]]
 func.func @sin_fold_vec() -> (vector<4xf32>) {
   %v1 = arith.constant dense<[0.0, 1.0, 0.0, 1.0]> : vector<4xf32>
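
The mismatch can be reproduced outside FileCheck with ordinary regular expressions. The sketch below assumes the constant is printed with the equivalent of printf's %e, which matches the string quoted in this report; formatScientific and matches are helper names made up here.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>
#include <regex>
#include <string>

// Format a float in scientific notation with six fractional digits. That the
// printer behaves like "%e" is an assumption made to reproduce the report.
std::string formatScientific(float value) {
  char buffer[32];
  std::snprintf(buffer, sizeof(buffer), "%e", static_cast<double>(value));
  return buffer;
}

// True if the pattern matches anywhere in the text.
bool matches(const std::string &text, const std::string &pattern) {
  return std::regex_search(text, std::regex(pattern));
}

void checkSinFoldPatterns() {
  const std::string printed = formatScientific(std::sin(1.0f));
  assert(printed == "8.414710e-01");
  assert(!matches(printed, "0\\.84[0-9]+"));   // old CHECK pattern
  assert(matches(printed, "8\\.4[0-9]+e-01")); // pattern from the diff
}
```

A fixed-notation string such as 0.841471 would match the old pattern but not the new one, which is why the diff may not be portable as written.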

Add LLVM lowering support for cir.scope

This should be done using the MLIR rewriters in clang/lib/CIR/CodeGen/LowerToLLVM.cpp. At some point we probably want to explore some incremental lowering for a mixed structured/unstructured control-flow, but for now:

  • Eliminate all cir.scopes
  • Make sure allocas are properly moved to the function entry block

Patch remaining undesired usages of MLIR's built-in integer type

There are still some scenarios where we should be using the custom !cir.int type.

Most of these scenarios derive from the CIRGenBuilder.h helper class:

mlir::Type getInt8Ty() { return typeCache.Int8Ty; }
mlir::Type getInt32Ty() { return typeCache.Int32Ty; }
mlir::Type getInt64Ty() { return typeCache.Int64Ty; }

In some cases, the use of built-in integer types makes sense, like in the alignment attribute:

auto alignIntAttr = CGM.getSize(alignment);

In others, like function pointers, we should probably change to cir.int:

auto vtablePtrTy = builder.getVirtualFnPtrType(/*isVarArg=*/true);

Verification failure for recursive struct

The following code

struct ListNode {
  struct ListNode *next;
};

struct List {
  struct ListNode head;
};

void foo(struct List *lst) {
  lst->head.next = &lst->head;
}

currently fails verification due to

$ clang -fclangir-enable -S tmp.c
...
loc(fused["tmp.c":10:3, "tmp.c":10:26]): error: 'cir.store' op failed to verify that type of 'value' matches pointee type of 'addr'
fatal error: error in backend: CIR codegen: module verification error before running CIR passes

The problem is in the assignment operation which expands to

cir.store %2, %5 : !cir.ptr<!cir.struct<"struct.ListNode", !cir.ptr<!cir.struct<"struct.ListNode", incomplete, #cir.recdecl.ast>>, #cir.recdecl.ast>>, cir.ptr <!cir.ptr<!cir.struct<"struct.ListNode", incomplete, #cir.recdecl.ast>>>

As we can see, the type of the stored value is complete:

!cir.ptr<!cir.struct<"struct.ListNode", !cir.ptr<!cir.struct<"struct.ListNode", incomplete, #cir.recdecl.ast>>, #cir.recdecl.ast>>

but the type of the address is not:

!cir.ptr<!cir.ptr<!cir.struct<"struct.ListNode", incomplete, #cir.recdecl.ast>>>

This causes a verification failure due to StoreOp's TypesMatchWith constraint.

What would be the best way to fix this? My understanding is that MLIR does not allow recursive types...

Mismatched bitfield types in tests.

Current main does not pass check-clang because of mismatched types in bitfields:

loc("./clang/test/CIR/CodeGen/bitfields.c":24:7): error: member type mismatch
loc("./clang/test/CIR/CodeGen/bitfields.c":24:7): error: member type mismatch

The problem arises in GetMemberOp, which returns a type different from the actual bitfield type:

getResultTy().getPointee() == !cir.int<s, 32>

while the actual bitfield type is:

recordTy.getMembers()[getIndex()] == !cir.int<u, 32>

Lowering of nested break/continue in loops

It looks like CIRLoopOpLowering currently does not lower nested break/continue statements within the lowered loop. For example, this program

#include <stdio.h>

int foo(int n) {
  int s = 0, i;
  for (i = 1; i < n; ++i) {
    if (i == 1)
      break;
    ++s;
  }
  return s;
}

int main() {
  int s = foo(10);
  printf("%d\n", s);
  return 0;
}

prints 9 instead of the expected 0.

Probably the best way to fix this is to scan the body region inside CIRLoopOpLowering, replacing break/continue yields with BrOps?
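
To illustrate what "replacing break yields with BrOps" means for the program above, here is foo() lowered by hand, with goto standing in for branch operations. The label names are illustrative and do not come from the lowering pass.

```cpp
#include <cassert>

// Hand-lowered form of foo() from the report.
int fooLowered(int n) {
  int s = 0;
  int i = 1;
cond:
  if (!(i < n))
    goto loopExit;
  // Body block.
  if (i == 1)
    goto loopExit; // 'break' becomes a branch to the exit block
  ++s;
  // Step block.
  ++i;
  goto cond;
loopExit:
  return s;
}

void checkBreakIsHonoured() {
  assert(fooLowered(10) == 0); // the value the original program should print
}
```

If the branch for break is dropped, the body falls through to ++s on every iteration, which gives the 9 reported above.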

Move struct field name from callsite to type

This is great, glad you beat me to it, we needed this to happen sooner or later.

Brain dump: keeping the name at the "callsite" is a bit silly, we should probably store these names into the cir.struct themselves and when we create a pretty printer for struct_element_addr we could print the names for convenience, by just looking at the type.

Originally posted by @bcardosolopes in #148 (review)

Use readable representation for char types

Basically decide whether char types should be represented using an alias (!schar = !cir.int<s, 8>) or a custom type.

It should be taken into consideration that we want to differentiate between char (char) types and built-in C99 types (int8_t) in the IR.
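
The distinction is visible at the language level too. The checks below are a small sketch of the facts the IR would need to preserve; they say nothing about how CIR should spell the types.

```cpp
#include <cstdint>
#include <type_traits>

// char, signed char and unsigned char are three distinct types, even though
// char shares its representation with one of the other two.
static_assert(!std::is_same<char, signed char>::value,
              "char is not signed char");
static_assert(!std::is_same<char, unsigned char>::value,
              "char is not unsigned char");

// int8_t has the same size as char. On many platforms it is a typedef for
// signed char, but that is not guaranteed, so it is not asserted here.
static_assert(sizeof(std::int8_t) == sizeof(char), "both are one byte");
```

An alias such as !schar = !cir.int<s, 8> would, by itself, print identically for signed char and for int8_t wherever the latter is a typedef of the former.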

Harden operation constraints from AnyType to CIRType

Now that #81 is fixed, we need to change operation constraints to only accept CIR types. A simple example:

...
let arguments = (ins Arg<CmpOpKind, "cmp kind">:$kind,
                       AnyType:$lhs, AnyType:$rhs);

Should become:

...
let arguments = (ins Arg<CmpOpKind, "cmp kind">:$kind,
                       CIRType:$lhs, CIRType:$rhs);

Additional changes to CIRDialect.cpp and related files will also be needed.

Incorrect gep indexes

It looks like something broke after #271.
Reproducer:

typedef struct {
    int x1;
    int x2;
    int x3;
} A;

void init(A* a) {
    a->x1 = 1;
    a->x2 = 2;
    a->x3 = 3;
}

int main() {
    A a;
    init(&a);
    return 0;
}

Compilation fails with the following error:
llvm::Type *llvm::checkGEPType(llvm::Type *): Assertion `Ty && "Invalid GetElementPtrInst indices for type!"' failed.

GetMemberOp verification failure on recursive types

Let's discuss recursive types again.

Long story short: I have a verification failure in GetMemberOp: index out of bounds.

The code to reproduce is the following:

typedef struct SomeStruct {
    struct SomeStruct* prev;
    struct SomeStruct* next;
    int x;
} A;

typedef struct {
    A some;
} B;

void foo(B* b, A* a) {    
    a->x = 42;     // <---- getMemberOp failed
}

Note that if we swap the arguments, as in foo(A* a, B* b), everything works.

I spent a huge amount of time trying to understand the reason (which probably doesn't reflect well on me :) ).
What I came up with is approximately the following.

First of all, CIRGenTypes.cpp and CIRRecordLayoutBuilder.cpp have an implementation similar to the one in clang/Codegen, but the behaviour is different. I'm sure the difference is caused by the CIR StructType implementation and by the fact that instances of its llvm::Type counterpart are stored by pointer and are mutable.
For example, this is how the type is updated in CGRecordLayoutBuilder:
Ty->setBody(Builder.FieldTypes....
And this is how it's done in CIR:
*Ty = Builder.getStructTy(builder.fieldTypes, ... ,
i.e. we create a new instance of mlir::Type here. This matters for the TypeCache in CIRGenTypes: the type entry is stored by value, so when we create a new instance, nothing is reflected anywhere else.

As I said, once we swap foo's arguments, everything is fine, i.e. it depends on the order in which we create types and recursively traverse them.

There are several solutions I can think of:

  1. Relax the verification. Maybe it will help, since my test code worked with StructElementAddr before, where no verification was used.
  2. Add some kind of mutability to StructType, so that we can actually change something during the recursive calls of ConvertType. We would also need to be careful when passing types as arguments, i.e. pass them by reference.
  3. Something else, e.g. don't cache incomplete types.

So, what are your thoughts? Maybe it's a known issue and the answer is a simple one?
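
The by-value versus by-pointer difference described above can be reproduced in miniature. The sketch below uses made-up ValueType and MutableType structs; it is not how MLIR or LLVM types are implemented, only a model of the caching behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable type, copied around by value (the CIR-like case).
struct ValueType {
  std::string name;
  std::vector<std::string> fields; // empty means "incomplete"
};

// Hypothetical mutable type, shared by pointer (the llvm::Type-like case).
struct MutableType {
  std::string name;
  std::vector<std::string> fields;
  void setBody(std::vector<std::string> body) { fields = std::move(body); }
};

// By value: a user that captured the incomplete type never sees the update.
std::size_t staleFieldCount() {
  std::map<std::string, ValueType> cache;
  cache["SomeStruct"] = ValueType{"SomeStruct", {}};
  ValueType captured = cache["SomeStruct"]; // e.g. used as a pointee type
  cache["SomeStruct"] = ValueType{"SomeStruct", {"prev", "next", "x"}};
  return captured.fields.size();
}

// By pointer: completing the type in place is visible to every holder.
std::size_t sharedFieldCount() {
  std::map<std::string, std::shared_ptr<MutableType>> cache;
  cache["SomeStruct"] =
      std::make_shared<MutableType>(MutableType{"SomeStruct", {}});
  std::shared_ptr<MutableType> captured = cache["SomeStruct"];
  cache["SomeStruct"]->setBody({"prev", "next", "x"});
  return captured->fields.size();
}

void checkCachingBehaviour() {
  assert(staleFieldCount() == 0);  // index 2 ("x") would be out of bounds
  assert(sharedFieldCount() == 3); // all three members are visible
}
```

In this model the stale copy has zero members, so asking for member index 2 fails, which is the same shape as the index-out-of-bounds error in GetMemberOp.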

Leverage `#cir.zero` in large constant strings initializers

The initializer for the null strings should leverage #cir.zero in some way. The reason for this is to improve performance when facing large zero/prefix-initialized strings. For example:

struct {
  char r[100000];
  char g[100000];
  char b[100000];
} image = {"", "1234", "\0"};

The resulting CIR code for this example consists of three #cir.const_arrays with thousands of \00 hexadecimal chars each, making them unnecessarily large.

Originally posted by @sitio-couto in #244 (comment)
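
One way to picture the saving: only the bytes before the trailing run of zeros need to be written out, and the rest could be described as zero fill. The helper below is a hypothetical sketch of that split and is not part of ClangIR.

```cpp
#include <cstddef>

// Hypothetical helper: number of leading bytes that must be stored
// explicitly. Everything after them is zero.
constexpr std::size_t explicitPrefixLength(const char *data,
                                           std::size_t size) {
  std::size_t end = size;
  while (end > 0 && data[end - 1] == '\0')
    --end; // drop one trailing zero byte
  return end;
}

// Small stand-ins for the 100000-byte arrays in the example above.
constexpr char kEmpty[8] = "";
constexpr char kDigits[8] = "1234";

static_assert(explicitPrefixLength(kEmpty, sizeof(kEmpty)) == 0,
              "an empty string is all zero fill");
static_assert(explicitPrefixLength(kDigits, sizeof(kDigits)) == 4,
              "only the four digits need explicit storage");
```

For the image struct in the example, that would mean 0, 4 and 0 explicit bytes out of 100000 for r, g and b respectively.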

Invalid order of global variables lowering

Compiling this code

static const char *p = "123";

const char *foo() {
  return p;
}

results in a crash:

$ /home/huawei/src/clangir/build/bin/clang -fclangir-enable -S tmp.c 
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
...

The root cause is in CIRGlobalOpLowering::matchAndRewrite:

    else if (auto attr = init.value().dyn_cast<mlir::FlatSymbolRefAttr>()) {
      setupRegionInitializedLLVMGlobalOp(op, rewriter);

      // Fetch global used as initializer.
      auto sourceSymbol =
          dyn_cast<mlir::LLVM::GlobalOp>(mlir::SymbolTable::lookupSymbolIn(
              op->getParentOfType<mlir::ModuleOp>(), attr.getValue()));

Here sourceSymbol is @".str", and it has not yet been lowered to LLVM when we start to lower @p, so the dyn_cast fails:

  cir.global "private" internal @p = @".str": !cir.ptr<!cir.int<s, 8>>
  cir.func no_proto @foo() -> !cir.ptr<!cir.int<s, 8>> extra( {inline = #cir.inline<no>} ) {
    ...
  }
  cir.global "private" constant internal @".str" = #cir.const_array<"123\00" : !cir.array<!cir.int<s, 8> x 4>> : !cir.array<!cir.int<s, 8> x 4> {alignment = 1 : i64}
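
One possible way to think about the ordering problem is to lower a global only after the symbol its initializer refers to. The sketch below models that with plain strings; it is an assumption about a possible fix, not what the lowering pass does, and all names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a global to the symbol its initializer references, if any.
using InitializerRefs = std::map<std::string, std::string>;

// Depth-first visit: schedule the referenced symbol before the global itself.
void visitGlobal(const std::string &name, const InitializerRefs &refs,
                 std::set<std::string> &visited,
                 std::vector<std::string> &order) {
  if (!visited.insert(name).second)
    return; // already scheduled; this also stops on reference cycles
  auto it = refs.find(name);
  if (it != refs.end())
    visitGlobal(it->second, refs, visited, order);
  order.push_back(name);
}

// Reorder the module's globals so that dependencies come first.
std::vector<std::string>
loweringOrder(const std::vector<std::string> &moduleOrder,
              const InitializerRefs &refs) {
  std::set<std::string> visited;
  std::vector<std::string> order;
  for (const std::string &name : moduleOrder)
    visitGlobal(name, refs, visited, order);
  return order;
}

void checkStringLoweredFirst() {
  // Module order from the report: @p appears before @".str".
  const std::vector<std::string> order =
      loweringOrder({"p", ".str"}, {{"p", ".str"}});
  assert((order == std::vector<std::string>{".str", "p"}));
}
```

An alternative that avoids reordering would be to not require the referenced symbol to be an already-lowered LLVM global at all.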

Consider changing cir.br block targets to have operands.

In #83, the cir::TernaryOp lowering generates a cir.br operation with operands, while BrOp::getSuccessorOperands says "Current block targets do not have operands".

As a result, the MLIR parser reports an error for the following input.

// cir-tool t.mlir
cir.func @test_br() -> !s32i {
    %0 = cir.const(#cir.int<0>: !s32i) : !s32i
    cir.br ^bb1(%0 : !s32i)
  ^bb1(%x: !s32i):
    cir.return %x : !s32i
}

Although the cir.br description in the .td file says "Used to represent C/C++ goto's and general block branching", I think its usage is closer to that of the cf.br operation.

So the options are:

  1. Support operands on cir.br block targets.
  2. Or lower cir::TernaryOp using cf.br instead of cir.br, and add cf-to-llvm lowerings to the pass.
    But that seems to conflict with DirectToLLVM.

cc @sitio-couto @bcardosolopes

Ambiguity between symbol visibility and linkage type in `cir.func` assembly format

The following cir.func is valid and can be either a function with private linkage or a function with private symbol visibility.

cir.func private @func() -> ()

Both linkage and visibility tags are optional. In the example, only one is specified; however, it is impossible to know if private refers to linkage or symbol visibility, since the keyword private is a valid value for both of these attributes.

// Default to external linkage if no keyword is provided.
state.addAttribute(getLinkageAttrNameString(),
GlobalLinkageKindAttr::get(
parser.getContext(),
parseOptionalCIRKeyword<GlobalLinkageKind>(
parser, GlobalLinkageKind::ExternalLinkage)));
::llvm::StringRef visAttrStr;
if (parser.parseOptionalKeyword(&visAttrStr, {"private", "public", "nested"})
.succeeded()) {
state.addAttribute(visNameAttr,
parser.getBuilder().getStringAttr(visAttrStr));
}
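
The effect of parsing two optional keywords in sequence can be shown with a toy parser. The keyword sets below are reduced and partly assumed; the only detail taken from this report is that private is valid for both attributes. ParsedPrefix and parsePrefix are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

struct ParsedPrefix {
  std::string linkage = "external"; // default when no keyword is given
  std::string visibility;           // empty when no keyword is given
};

// Toy model of "optional linkage keyword, then optional visibility keyword".
ParsedPrefix parsePrefix(const std::vector<std::string> &tokens) {
  // Assumed, reduced keyword sets for illustration only.
  static const std::set<std::string> linkageKeywords = {"private", "internal",
                                                        "external"};
  static const std::set<std::string> visibilityKeywords = {"private", "public",
                                                           "nested"};
  ParsedPrefix result;
  std::size_t next = 0;
  if (next < tokens.size() && linkageKeywords.count(tokens[next]))
    result.linkage = tokens[next++];
  if (next < tokens.size() && visibilityKeywords.count(tokens[next]))
    result.visibility = tokens[next++];
  return result;
}

void checkPrivateIsTakenAsLinkage() {
  const ParsedPrefix parsed = parsePrefix({"private"});
  assert(parsed.linkage == "private"); // consumed by the first rule
  assert(parsed.visibility.empty());   // never reaches the second rule
}
```

In this model a lone private can never be read as visibility, so a function printed with default linkage and private visibility would not round-trip.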

CIR Dialect Struct Type Recursive Reference

The struct type is defined in CIRTypes.td, and its parameters are shown below:

let parameters = (ins
    ArrayRefParameter<"mlir::Type", "members">:$members,
    "mlir::StringAttr":$typeName,
    "bool":$body,
    "bool":$packed,
    "std::optional<::mlir::cir::ASTRecordDeclAttr>":$ast
  );

An array of mlir::Type (ArrayRefParameter<"mlir::Type", "members">) is used to describe the contents of a struct type.

But how do I represent a recursive reference using CIR_StructType? A recursive reference would be lowered to the LLVM dialect like the code below:

!llvm.struct<"a", ptr<struct<"a">>>  // example of recursive reference

Support of symbol references in CIRGenExprConst

Some parts of CIRGenExprConst.cpp assume that the attributes generated for the RHS of assignments are TypedAttrs, e.g.

  auto typedC = llvm::dyn_cast<mlir::TypedAttr>(C);
  if (!typedC)
    llvm_unreachable("this should always be typed");
  return typedC;

But in some cases a SymbolRefAttr could be generated as well, e.g. in

$ cat tmp.c
typedef struct {
  char *name;
} A;

A foo = {"1"};

$ /home/huawei/src/clangir/build/bin/clang -fclangir-enable -S tmp.c
this should always be typed
UNREACHABLE executed at /home/huawei/src/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1428!
...
#13 0x00007fa8faecf572 cir::ConstantEmitter::tryEmitPrivate(clang::Expr const*, clang::QualType) /home/huawei/src/clangir/clang/lib/CIR/CodeGen/CIRGenExprConst.cpp:1429:10

Perhaps GlobalViewAttr should be used instead of SymbolRefAttr in such cases?

Optimizations in clangir?

We have some projects that are based on Polygeist and would like to move to clangir if possible. But the default cir output is unoptimized compared to Polygeist's. For instance, cgeist runs mem2reg and code motion optimizations, which are very helpful for downstream tools to analyze the IR.

Is there any plan to add common optimizations to clangir before LLVM lowering pass?

LLVM 17.0.X tagged release.

As I mentioned at the last meeting, would it be possible to carve out a tagged CIR release compatible with the LLVM 17.0.x release?

Crash when lowering endless for loop

yieldToCont operation in CIRLoopOpLowering may not always be present. E.g. in the following case

int foo() {
  int s = 0;
  for (;;)
    ++s;
  return s;
}

the generated condition region is just

{
  cir.yield continue loc(#loc12)
}

Currently compilation of this code crashes with failed to fetch yields in cond region.

Invalid lowering of for-loops

There seems to be a trivial bug in the lowering of for-loops in LowerToLLVM.cpp.

The following simple code

#include <stdio.h>

int main() {
  int i, n = 3;
  for (i = 0; i < n; ++i)
    printf("%d\n", i);
}

results in

$ clang -fclangir-enable tmp.c
$ ./a.out
1
2
3

instead of expected

0
1
2

Currently CFG generated in CIRLoopOpLowering looks like

cond -> step -> body -> cond

but it should instead be

cond -> body -> step -> cond
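
The two block orders can be simulated directly to confirm the outputs shown above. The function runLoop is a made-up model of the loop in main, not code from LowerToLLVM.cpp.

```cpp
#include <cassert>
#include <vector>

// Simulates `for (i = 0; i < n; ++i) print(i)` with either block order and
// returns the values that would be printed.
std::vector<int> runLoop(int n, bool stepBeforeBody) {
  std::vector<int> printed;
  int i = 0;
  while (i < n) { // cond
    if (stepBeforeBody) {
      ++i;                  // step
      printed.push_back(i); // body
    } else {
      printed.push_back(i); // body
      ++i;                  // step
    }
  }
  return printed;
}

void checkBlockOrders() {
  // cond -> step -> body -> cond reproduces the wrong output.
  assert((runLoop(3, /*stepBeforeBody=*/true) == std::vector<int>{1, 2, 3}));
  // cond -> body -> step -> cond gives the expected output.
  assert((runLoop(3, /*stepBeforeBody=*/false) == std::vector<int>{0, 1, 2}));
}
```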

Unfortunately we do not yet have permission to submit PRs, but the fix is trivial anyway.

clang frontend command failed with exit code 134

Hi, I just ran into a problem when generating CIR for a simple hello-world C file.
Here is the error message:

zhangzhao@zhangzhao-G3-3579:~/tst_cir$ /tmp/install-llvm/bin/clang -v -emit-cir main.c
clang version 16.0.0 (https://github.com/llvm/clangir.git 07c3d1ed6f639a705e8ca16bb96f7dce2eec9f9e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/install-llvm/bin
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64
 (in-process)
 "/tmp/install-llvm/bin/clang-16" -cc1 -triple x86_64-unknown-linux-gnu -fclangir-enable -emit-cir -disable-free -clear-ast-before-backend -main-file-name main.c -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=all -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -v -fcoverage-compilation-dir=/home/zhangzhao/tst_cir -resource-dir /tmp/install-llvm/lib/clang/16 -internal-isystem /tmp/install-llvm/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include -internal-externc-isystem /usr/include/x86_64-linux-gnu -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdebug-compilation-dir=/home/zhangzhao/tst_cir -ferror-limit 19 -fgnuc-version=4.2.1 -fcolor-diagnostics -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o main.cir -x c main.c
clang -cc1 version 16.0.0 based upon LLVM 16.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/usr/lib/gcc/x86_64-linux-gnu/12/../../../../x86_64-linux-gnu/include"
ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
 /tmp/install-llvm/lib/clang/16/include
 /usr/local/include
 /usr/include/x86_64-linux-gnu
 /usr/include
End of search list.
NYI
UNREACHABLE executed at /home/zhangzhao/llvm-project/clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp:334!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: /tmp/install-llvm/bin/clang -v -emit-cir main.c
1.	<eof> parser at end of file
2.	main.c:3:5: LLVM IR generation of declaration 'main'
 #0 0x0000564774ba9ee7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/tmp/install-llvm/bin/clang+0x23d7ee7)
 #1 0x0000564774ba7efe llvm::sys::RunSignalHandlers() (/tmp/install-llvm/bin/clang+0x23d5efe)
 #2 0x0000564774b304b8 CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f757823bcf0 (/lib/x86_64-linux-gnu/libc.so.6+0x3bcf0)
 #4 0x00007f757829226b __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007f757829226b __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007f757829226b pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007f757823bc46 raise ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007f75782227fc abort ./stdlib/abort.c:81:7
 #9 0x0000564774b35dbf (/tmp/install-llvm/bin/clang+0x2363dbf)
#10 0x0000564775dd09b7 cir::CIRGenFunction::buildBuiltinExpr(clang::GlobalDecl, unsigned int, clang::CallExpr const*, cir::ReturnValueSlot) (/tmp/install-llvm/bin/clang+0x35fe9b7)
#11 0x0000564775da923d cir::CIRGenFunction::buildCallExpr(clang::CallExpr const*, cir::ReturnValueSlot) (/tmp/install-llvm/bin/clang+0x35d723d)
#12 0x0000564775dbcf65 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) CIRGenExprScalar.cpp:0:0
#13 0x0000564775db60fc cir::CIRGenFunction::buildScalarExpr(clang::Expr const*) (/tmp/install-llvm/bin/clang+0x35e40fc)
#14 0x0000564775da8230 cir::CIRGenFunction::buildAnyExpr(clang::Expr const*, cir::AggValueSlot, bool) (/tmp/install-llvm/bin/clang+0x35d6230)
#15 0x0000564775da9b6f cir::CIRGenFunction::buildIgnoredExpr(clang::Expr const*) (/tmp/install-llvm/bin/clang+0x35d7b6f)
#16 0x0000564775dca62f cir::CIRGenFunction::buildStmt(clang::Stmt const*, bool) (/tmp/install-llvm/bin/clang+0x35f862f)
#17 0x0000564775dca59f cir::CIRGenFunction::buildCompoundStmtWithoutScope(clang::CompoundStmt const&) (/tmp/install-llvm/bin/clang+0x35f859f)
#18 0x0000564775dc31bc cir::CIRGenFunction::generateCode(clang::GlobalDecl, mlir::cir::FuncOp, cir::CIRGenFunctionInfo const&) (/tmp/install-llvm/bin/clang+0x35f11bc)
#19 0x0000564775d8e205 cir::CIRGenModule::buildGlobalFunctionDefinition(clang::GlobalDecl, mlir::Operation*) (/tmp/install-llvm/bin/clang+0x35bc205)
#20 0x0000564775d91425 cir::CIRGenModule::buildTopLevelDecl(clang::Decl*) (/tmp/install-llvm/bin/clang+0x35bf425)
#21 0x0000564775d8b1cf cir::CIRGenerator::HandleTopLevelDecl(clang::DeclGroupRef) (/tmp/install-llvm/bin/clang+0x35b91cf)
#22 0x0000564775d19c31 cir::CIRGenConsumer::HandleTopLevelDecl(clang::DeclGroupRef) (/tmp/install-llvm/bin/clang+0x3547c31)
#23 0x0000564777367509 clang::ParseAST(clang::Sema&, bool, bool) (/tmp/install-llvm/bin/clang+0x4b95509)
#24 0x0000564775632df0 clang::FrontendAction::Execute() (/tmp/install-llvm/bin/clang+0x2e60df0)
#25 0x00005647755a3ddf clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/tmp/install-llvm/bin/clang+0x2dd1ddf)
#26 0x0000564775703662 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/tmp/install-llvm/bin/clang+0x2f31662)
#27 0x000056477389f3be cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/tmp/install-llvm/bin/clang+0x10cd3be)
#28 0x000056477389b70a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#29 0x0000564775418362 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_1>(long) Job.cpp:0:0
#30 0x0000564774b301fc llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/tmp/install-llvm/bin/clang+0x235e1fc)
#31 0x0000564775417b8f clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/tmp/install-llvm/bin/clang+0x2c45b8f)
#32 0x00005647753d8c4f clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/tmp/install-llvm/bin/clang+0x2c06c4f)
#33 0x00005647753d8efe clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/tmp/install-llvm/bin/clang+0x2c06efe)
#34 0x00005647753f7450 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/tmp/install-llvm/bin/clang+0x2c25450)
#35 0x000056477389aa5e clang_main(int, char**) (/tmp/install-llvm/bin/clang+0x10c8a5e)
#36 0x00007f7578223510 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#37 0x00007f75782235c9 call_init ./csu/../csu/libc-start.c:128:20
#38 0x00007f75782235c9 __libc_start_main ./csu/../csu/libc-start.c:368:5
#39 0x0000564773897935 _start (/tmp/install-llvm/bin/clang+0x10c5935)
clang-16: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 16.0.0 (https://github.com/llvm/clangir.git 07c3d1ed6f639a705e8ca16bb96f7dce2eec9f9e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /tmp/install-llvm/bin
clang-16: note: diagnostic msg: Error generating preprocessed source(s).

The source file main.c is quite simple:

#include <stdio.h>

int main(int argc, char** argv) {
    int a = 123;
    printf("123");
    return 0;
}

The CIR clang was built by following the guide found here:
https://llvm.github.io/clangir/GettingStarted/build-install.html

Could anyone help me figure out what's wrong here? Thanks!

Fix LLVM lowering for clang/test/CIR/CIRToLLVM/goto.cir

After the latest round of rebasing, lowering to LLVM (via -cir-to-llvm) stopped working with the message failed to legalize operation 'cf.br', and the test got XFAILed. This smells like a problem related to (not) registering ControlFlowDialect.
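
For context, the contents of goto.cir are not reproduced in this report. As an assumption, a C function like the hypothetical count_down below is the kind of source whose goto statements turn into unconditional branches during lowering, which is where an operation such as cf.br would come from. It is only a sketch of a reproducer input, not the actual test case.

```c
/* Hypothetical reproducer input: counts down with goto instead of a loop.
   The name count_down and its body are illustrative, not taken from goto.cir. */
int count_down(int n) {
  int steps = 0;
again:
  if (n <= 0)
    goto done;   /* forward jump out of the "loop" */
  n--;
  steps++;
  goto again;    /* backward jump, an unconditional branch */
done:
  return steps;
}
```

Calling count_down(3) walks the again/done labels three times before returning.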

Invalid IR when accessing union

This code

typedef union {
  char x;
  struct {
    short h, l;
  } b;
} T;

void foo(T reg) {
  reg.b.l;
}

currently generates invalid IR:

$ ~/src/clangir/build/bin/clang -fclangir-enable -emit-llvm -S tmp.c
loc("tmp.c":4:14): error: 'llvm.getelementptr' op index 1 indexing a struct is out of bounds

It seems that we are missing a bitcast in buildLValueForField, and the code even has a comment about this:

  // TODO(CIR): CodeGen requires a bitcast here for unions or for structs where
  // the LLVM type doesn't match the desired type. No idea when the latter might
  // occur, though.

Indeed adding

  if (rec->isUnion()) {
    auto memTy = getTypes().convertTypeForMem(FieldType);
    addr = builder.createElementBitCast(getLoc(field->getSourceRange()), addr, memTy);
  }

fixes the problem.
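
To illustrate why a reinterpretation of the storage is needed, here is a minimal C sketch using the same union. Every member of a union starts at offset 0 of shared storage, so reading reg.b.l means first viewing that storage as the type of b and only then indexing into it. The helper read_l_by_hand is hypothetical and only mimics that idea at the C level; it is not how CIR codegen is implemented.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

typedef union {
  char x;
  struct {
    short h, l;
  } b;
} T;

/* Both members share the same starting address inside the union. */
static_assert(offsetof(T, x) == 0, "x starts at offset 0");
static_assert(offsetof(T, b) == 0, "b starts at offset 0");

/* Hypothetical helper: reads reg.b.l by hand. The raw storage is first
   viewed as bytes (the C-level analogue of the bitcast), and only then
   indexed with the offset of l inside b. */
short read_l_by_hand(const T *reg) {
  const unsigned char *bytes = (const unsigned char *)reg;
  short l;
  memcpy(&l, bytes + offsetof(T, b.l), sizeof l);
  return l;
}
```

For any T value, read_l_by_hand(&reg) should agree with the direct access reg.b.l.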

LLVM install failure `clang/CIR/Dialect/IR/CIROpsEnums.h.inc: No such file or directory`

Following GettingStarted:

$ git show 
commit 333ecc5b42ee6358afa60412371fc02543daf156 (HEAD -> main, origin/main, origin/HEAD)
Author: redbopo <[email protected]>
Date:   Thu May 25 22:01:21 2023 +0800
...

$ cmake -G "Unix Makefiles" -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=${INSTALLDIR}  -DLLVM_ENABLE_ASSERTIONS=ON  \
  -DLLVM_TARGETS_TO_BUILD="host"  -DLLVM_ENABLE_PROJECTS="clang;mlir;cir" ../
  
$ make install
~~ 7 hours later ~~
[ 85%] Building CXX object tools/clang/lib/CIR/Lowering/ThroughMLIR/CMakeFiles/obj.clangCIRLoweringThroughMLIR.dir/LowerCIRToMLIR.cpp.o
In file included from /X/llvm-clangir/clang/include/clang/CIR/Dialect/IR/CIRAttrs.h:19,
                 from /X/llvm-clangir/clang/include/clang/CIR/Dialect/IR/CIRDialect.h:28,
                 from /X/llvm-clangir/clang/lib/CIR/Lowering/ThroughMLIR/LowerCIRToMLIR.cpp:38:
/X/llvm-clangir/clang/include/clang/CIR/Dialect/IR/CIROpsEnums.h:18:10: fatal error: clang/CIR/Dialect/IR/CIROpsEnums.h.inc: No such file or directory
   18 | #include "clang/CIR/Dialect/IR/CIROpsEnums.h.inc"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [tools/clang/lib/CIR/Lowering/ThroughMLIR/CMakeFiles/obj.clangCIRLoweringThroughMLIR.dir/build.make:76: tools/clang/lib/CIR/Lowering/ThroughMLIR/CMakeFiles/obj.clangCIRLoweringThroughMLIR.dir/LowerCIRToMLIR.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:115851: tools/clang/lib/CIR/Lowering/ThroughMLIR/CMakeFiles/obj.clangCIRLoweringThroughMLIR.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

Missing function attributes and data layout information

Hi,
I noticed that the target triple and function attributes are missing when we lower C code to Clang IR and then to LLVM IR. Do you know how I can preserve this information?

Example:
C code:
int foo() { return 0; }
When I run the command clang -S -emit-llvm test.c -o - I get the following LLVM IR:

source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p(...)"
target triple = "x86_64-unknown-linux-gnu"
define dso_local i32 @foo() #0 {
entry:
   ret i32 0
}
attributes #0 = { noinline nounwind optnone uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse, +sse2,+x87" "tune-cpu"="generic" }

When I run the MLIR codegen path, the function attributes and target triple are missing:
bin/clang -fclangir-enable t.c -emit-cir -o - | bin/cir-tool -cir-to-llvm -o - | bin/mlir-translate -mlir-to-llvmir -o -
The output is as follows:

source_filename = "LLVMDialectModule"
;No target triple information
define i32 @foo() {
;some LLVM IR instructions
}
;no function attributes for foo function

Fix clang/test/CIR/CodeGen/lambda.cpp testcase

Somehow this got missed in some previous feature work and is now hitting an assert, possibly due to the added machinery for building implicit assignments. Add the necessary support during clang CIR codegen and fix the test; the assert is likely a good place to start.
