csmith-project / csmith

Csmith, a random generator of C programs

Home Page: http://embed.cs.utah.edu/csmith/

License: Other

Shell 1.14% Perl 4.55% C 5.00% C++ 85.60% M4 1.96% CMake 1.74% Python 0.01%

csmith's People

Contributors

alishuja, antonblanchard, bentley, chenyang78, dcb314, dwightguth, eeide, elfring, gergo-, grimreaper, iwamatsu, jensgerlach, jibsen, jryans, jxyang, karineek, kren1, marxin, mdrafiqulrabin, mortior, natgla, orestisfl, regehr, sebastianboe, shubhamnarlawar77, tahina-pro, uniqp, xchy


csmith's Issues

How to use --reduce command option

Hi, I am having trouble using the --reduce option. It requires a configuration file, but I am not sure which file it should be or which constraints I can use. Is there a guide for the configuration file for the reduce command? Thanks!

Dereferencing a non-volatile pointer to a Union/struct which contains a volatile field

The following code illustrates the idea:

struct S0 {
   uint8_t  f0;
   const volatile uint64_t  f1;
   const uint8_t  f2;
};

union U1 {
   struct S0  f0;
   uint32_t  f1;
   uint64_t  f2;
   uint32_t  f3;
   uint32_t  f4;
};

static union U1 g_881 = {{0x88L,18446744073709551609UL,0UL}};/* VOLATILE GLOBAL g_881 */
static union U1 *g_1998 = &g_881;     // illegal?
...
... = *g_1998                         // will compiler optimize the read?

It seems to me that even though neither U1 nor S0 is declared as volatile, the fact that field f1 of S0 is volatile makes KCC think it's illegal to take the address of g_881 and pass it to a non-volatile pointer g_1998. I am not sure whether this is a truthful interpretation of the C standard. @john Regehr , what do you think?

The underlying question is: when g_1998 is dereferenced, does the compiler know that it should not optimize the read even though g_1998 itself is declared as non-volatile? If the compiler is smart enough to look into the declarations of U1 and, recursively, S0, it should know the read is not to be optimized. Otherwise, it's a bug in Csmith.

Thanks Radu for the test case.

Rewrite test scripts in Python

Many scripts under scripts and driver were written in Perl. We should move towards Python, with consolidation of the numerous script files.

Add the MAC value when getting the seed, to avoid getting the same seed on different machines

diff --git a/src/platform.cpp b/src/platform.cpp
index d1ef1f9..c85e613 100644
--- a/src/platform.cpp
+++ b/src/platform.cpp
@@ -47,6 +47,10 @@
 #include "platform.h"
 #include <stdlib.h>
 #include <sys/time.h>
+#include <net/if.h>
+#include <stdio.h>
+#include <sys/ioctl.h>
+#include <unistd.h>
 #if HAVE_BSD_STDLIB_H
 #  include <bsd/stdlib.h>
 #endif
@@ -110,12 +114,62 @@ unsigned long platform_gen_seed()
        return seed;
 }
 #else
+unsigned long getMacValue(void) {
+    int fd;
+    int interfaceNum = 0;
+    struct ifreq buf[16];
+    struct ifconf ifc;
+    struct ifreq ifrcopy;
+
+    if ((fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0) {
+        perror("socket");
+        close(fd);
+        return (unsigned long)1;
+    }
+
+    ifc.ifc_len = sizeof(buf);
+    ifc.ifc_buf = (caddr_t)buf;
+    if (!ioctl(fd, SIOCGIFCONF, (char *)&ifc)) {
+        interfaceNum = ifc.ifc_len / sizeof(struct ifreq);
+        while (interfaceNum-- > 0) {
+            // ignore the interface that not up or not runing
+            ifrcopy = buf[interfaceNum];
+            if (ioctl(fd, SIOCGIFFLAGS, &ifrcopy)) {
+                continue;
+            }
+
+            // get the mac of this interface
+            if (!ioctl(fd, SIOCGIFHWADDR, (char *)(&buf[interfaceNum]))) {
+                unsigned long value = 0;
+                int j=0;
+                for(int i=5; i>=0; i--) {
+                    value += ((unsigned long)(buf[interfaceNum].ifr_hwaddr.sa_data[i]) << (j*8));
+                    j++;
+                }
+                if( value == 0 ) {
+                    continue;
+                } else {
+                    return value;
+                }
+            } else {
+                close(fd);
+                return (unsigned long)2;
+            }
+        }
+    } else {
+        close(fd);
+        return (unsigned long)3;
+    }
+    close(fd);
+    return (unsigned long)0;
+}
+
 unsigned long platform_gen_seed()
 {
        //return (long) read_time();
        struct timeval tp;
        gettimeofday(&tp, nullptr);
-       return tp.tv_sec * 1000000 + tp.tv_usec;
+       return getMacValue() + (unsigned long)(tp.tv_sec * 1000000 + tp.tv_usec);
 }
 #endif

mac.patch.txt

Compilation warning with GCC 7.1

Thanks for the shiny new release. There's a new warning:

StatementFor.cpp: In static member function ‘static const Variable* StatementFor::make_iteration(CGContext&, StatementAssign*&, Expression*&, StatementAssign*&, unsigned int&)’:
StatementFor.cpp:258:85: warning: ‘incr_op’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   incr = new StatementAssign(cg_context.get_current_block(), *lhs1, *c_incr, incr_op);
                                                                                     ^
StatementFor.cpp:254:47: warning: ‘incr_n’ may be used uninitialized in this function [-Wmaybe-uninitialized]
  Constant * c_incr = Constant::make_int(incr_n);
                                               ^

Can you please take a look?

Includes

Function.cpp (line 59): duplicate include (FactMgr.h)
Lhs.cpp (line 47): duplicate include (CGOptions.h)
StatementGoto.cpp (line 46): duplicate include (Type.h)

--help output: should say --no-int8, not --no-int

Hello,

I just noticed that --no-int isn't a valid option, even though the help output says so:
--int8 | --no-int: enable | disable int8_t (enabled by default).
This should be --no-int8 instead, which works just fine.

Best,
Michael

Packaging friendliness: misc support utilities unsuitable for use as installed

(I may have accidentally filed an issue here that I didn't mean to, so I'm salvaging things by turning this into a real issue ;))

Lots of minor issues, thought I'd record them in case they can be fixed. These are issues that would be encountered in packaging for any distribution, but were encountered while packaging for NixOS:

https://github.com/dtzWill/nixpkgs/blob/5516a2b6a14f30788575913fbf998b379b17fc2c/pkgs/development/tools/misc/csmith/default.nix

  • Headers are installed, but utilities still try to find them in a source directory pointed to by CSMITH_HOME environment variable.
  • compiler_test.pl doesn't find csmith relative to itself, instead looks for it relative to CSMITH_HOME
  • compiler_test.in is installed to $binDir, which is strange since it isn't intended for direct execution (so it shouldn't be on a user's PATH by default) AFAICT. I put it in /share/csmith/compiler-test.in but perhaps a better place would be libexec?
  • launchn.pl looks for 'config files' in a parent directory, which doesn't work so well in an installation.

The dependency on Perl, and in particular Sys::CPU, is not mentioned in the documentation; it might be worth adding.

While I'm at it, looks like Debian has some patches they maintain for packaging csmith that might be suitable for inclusion upstream:

Debian csmith package page
Debian tarball with patches and such

"csmith -s 1606008166461888" fail of Statement.cpp:936: void Statement::post_creation_analysis(std::vector<const Fact*>&, const Effect&, CGContext&) const: Assertion `0' failed.

$ csmith -s 1606008166461888
/*
 * This is a RANDOMLY GENERATED PROGRAM.
 *
 * Generator: csmith 2.4.0
 * Git version: 90a7638
 * Options:   -s 1606008166461888
 * Seed:      1606008166461888
 */

#include "csmith.h"


static long __undefined;

csmith: Statement.cpp:936: void Statement::post_creation_analysis(std::vector<const Fact*>&, const Effect&, CGContext&) const: Assertion `0' failed.
Aborted (core dumped)




$ csmith -v
csmith 2.4.0
Git version: 90a7638

Even stricter conformance to C99 w.r.t. accessing union fields

This is related to #70.

With #70 fixed, we disabled reading from a union field which was not the last-written field in most cases. However, there is still a back door like the following:

struct S1 {
    int16_t f1;
    int64_t f2;    // there is compiler-generated padding between f1 and f2
};

union U1 {
    char * f1;
    struct S1 * f2;
};

union U1 var;
var.f2.f1 = 0;
var.f2.f2 = 1;
for (int i=0; i<sizeof(union U1); i++) { 
// Csmith allows reading from a Union field of char* even though the last-written field is of struct*
   printf("char = %c\n", var.f1[i]);           
}

We need to disable this as well, for the same reason as in #70: it's unsafe because of the padding between struct fields.

For loop variables over/underflow

When for loops are generated with '-=' or '+=' operations on the loop variable, code like the following can be output:

for(i = 6; i != -6; i -= 5)

This relies on underflow, and can result in a huge number of loop iterations. On x86 that is fine, but the test can take a very long time to run on simulated targets.

To prevent such wraparounds, the increment should exactly divide the difference between the loop variable's initial value and loop termination value, when the not-equal operator is used.
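
The divisibility rule above can be sketched as a small predicate. This is only an illustration: `reaches_limit_exactly` is a hypothetical name, not a Csmith function, and the sketch assumes all three values fit in 32 bits so that the 64-bit subtraction cannot overflow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of Csmith): returns true when a loop of the
 * form  for (i = init; i != limit; i += step)  hits limit exactly, without
 * relying on wraparound. Inputs are assumed to fit in 32 bits. */
bool reaches_limit_exactly(int64_t init, int64_t limit, int64_t step)
{
    int64_t diff = limit - init;

    if (diff == 0)
        return true;             /* condition is false on entry */
    if (step == 0)
        return false;            /* i never changes */
    if ((diff > 0) != (step > 0))
        return false;            /* stepping away from limit */
    return diff % step == 0;     /* step must divide the distance */
}
```

For the loop quoted above (`i = 6; i != -6; i -= 5`), the step is -5 and the distance is -12, so the predicate rejects it; a step of -4 or -6 would be accepted.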

Include (style)

In platform.c, line 96, the include of errno.h can be moved after the #ifdef cases.

aliasing problem in generated code

csmith (built from git revision gee30f58) and run as

csmith --seed 1311854514 --bitfields --packed-struct --output bug.c

generates a program containing aliasing problems that make the test produce different results for gcc -m32 -O1 and gcc -m32 -O2 using the trunk gcc, r244285.

The problematic code can be cut down to

union U0 {
   int32_t  f0;
   uint16_t  f2;
};

static union U0 g_38[3][7];
static const uint16_t *g_160 = &g_38[1][2].f2;
static const uint16_t **g_429 = &g_160;
int g_108;
static int32_t g_331;

static void func_76(void)
{
  const uint16_t * p_80 = *g_429;
  for (g_108 = 0; g_108 < 2; g_108++)
    {
      int32_t *l_166 = &g_38[1][2].f0;
      *l_166 = 1;
      g_331 = *p_80;
    }
}

This writes a 32-bit value to g_38[1][2] through *l_166 and then reads it as a 16-bit value through *p_80. Both point to elements with the correct type in a union, but that does not help under the GCC developers' reading of the standard. The GCC manual says:

Pay special attention to code like this:

union a_union {
 int i;
 double d;
};

int f() {
 union a_union t;
 t.d = 3.0;
 return t.i;
}

The practice of reading from a different union member than the one most recently written to (called “type-punning”) is common. Even with ‘-fstrict-aliasing’, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected. However, this code might not:

int f() {
 union a_union t;
 int* ip;
 t.d = 3.0;
 ip = &t.i;
 return *ip;
}

This reading of the standard seems correct to me, as C11 6.5/7 says that the lvalue expression that accesses the value must have “an aggregate or union type that includes one of the aforementioned types among its members”; that is, it is the expression accessing the value that must be using the union.
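
As a minimal sketch of the pattern the GCC manual describes as allowed, the function below writes one member and reads the other while both accesses go through the union type. `write_f0_read_f2` is a hypothetical name for illustration; the value read back depends on the platform's representation.

```c
#include <stdint.h>

union U0 {
    int32_t  f0;
    uint16_t f2;
};

/* Hypothetical helper: both the write and the read name the union object
 * (u->f0, u->f2) instead of going through separate int32_t* / uint16_t*
 * pointers, which is the form the quoted GCC text says is permitted. */
uint16_t write_f0_read_f2(union U0 *u, int32_t value)
{
    u->f0 = value;
    return u->f2;   /* type-punned read through the union type */
}
```

By contrast, the reduced test case above performs the write through an `int32_t *` and the read through a `const uint16_t *`, so neither access expression mentions the union.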

when the seed is 1602040978652172, the checksum is 0

image

$ cat test.c 
/*
 * This is a RANDOMLY GENERATED PROGRAM.
 *
 * Generator: csmith 2.4.0
 * Git version: 0bd545f
 * Options:   -s 1602040978652172
 * Seed:      1602040978652172
 */

#include "csmith.h"


static long __undefined;

/* --- Struct/Union Declarations --- */
/* --- GLOBAL VARIABLES --- */


/* --- FORWARD DECLARATIONS --- */
static const uint64_t  func_1(void);


/* --- FUNCTIONS --- */
/* ------------------------------------------ */
/* 
 * reads :
 * writes:
 */
static const uint64_t  func_1(void)
{ /* block id: 0 */
    const uint32_t l_2 = 0x64A53AA9L;
    return l_2;
}




/* ---------------------------------------- */
int main (int argc, char* argv[])
{
    int print_hash_value = 0;
    if (argc == 2 && strcmp(argv[1], "1") == 0) print_hash_value = 1;
    platform_main_begin();
    crc32_gentab();
    func_1();
    platform_main_end(crc32_context ^ 0xFFFFFFFFUL, print_hash_value);
    return 0;
}

/************************ statistics *************************
XXX max struct depth: 0
breakdown:
   depth: 0, occurrence: 1
XXX total union variables: 0

XXX non-zero bitfields defined in structs: 0
XXX zero bitfields defined in structs: 0
XXX const bitfields defined in structs: 0
XXX volatile bitfields defined in structs: 0
XXX structs with bitfields in the program: 0
breakdown:
XXX full-bitfields structs in the program: 0
breakdown:
XXX times a bitfields struct's address is taken: 0
XXX times a bitfields struct on LHS: 0
XXX times a bitfields struct on RHS: 0
XXX times a single bitfield on LHS: 0
XXX times a single bitfield on RHS: 0

XXX max expression depth: 1
breakdown:
   depth: 1, occurrence: 1

XXX total number of pointers: 0

XXX times a non-volatile is read: 1
XXX times a non-volatile is write: 0
XXX times a volatile is read: 0
XXX    times read thru a pointer: 0
XXX times a volatile is write: 0
XXX    times written thru a pointer: 0
XXX times a volatile is available for access: 0
XXX percentage of non-volatile access: 100

XXX forward jumps: 0
XXX backward jumps: 0

XXX stmts: 1
XXX max block depth: 0
breakdown:
   depth: 0, occurrence: 1

XXX percentage a fresh-made variable is used: 100
XXX percentage an existing variable is used: 0
FYI: the random generator makes assumptions about the integer size. See platform.info for more details.
********************* end of statistics **********************/

Assertion "invalid size!" failed.

I am running Csmith 2.2.0, Git version: dcef523 on CentOS 7.

$ ./csmith --no-argc --arrays --no-bitfields --checksum --comma-operators --compound-assignment --no-consts --divs --no-embedded-assigns --no-pre-incr-operator --no-pre-decr-operator --post-incr-operator --no-post-decr-operator --no-unary-plus-operator --jumps --no-longlong --int8 --no-uint8 --float --math64 --no-inline-function --muls --no-safe-math --packed-struct --paranoid --no-pointers --no-structs --unions --no-volatiles --no-volatile-pointers --no-const-pointers --builtins

Csmith returns:
csmith: SafeOpFlags.cpp:264: void SafeOpFlags::OutputSize(std::ostream&) const: Assertion !"invalid size!" failed.

invalid option: --global-variables

Hi,

The "csmith --help" shows the following instruction for controlling global variables:

--global-variables | --no-global-variables: enable | disable global variables (enabled by default).

However, "csmith --global-variables" doesn't work; it shows the following error:

$ ./csmith  --global-variables
invalid option --global-variables at: 1

After checking the source code, I found that there is an extra 's' after "--global-variables" in the main function of RandomProgramGenerator.cpp:

if (strcmp (argv[i], "--global-variabless") == 0) {
	CGOptions::global_variables(true);
	continue;
}

Therefore, changing "--global-variabless" to "--global-variables" could be a possible fix.
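
A minimal sketch of the corrected comparison, isolated into a hypothetical helper (`is_global_variables_flag` is not a Csmith function; the real code calls strcmp inline in main):

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: matches the option exactly as --help documents it,
 * without the stray trailing 's'. */
bool is_global_variables_flag(const char *arg)
{
    return strcmp(arg, "--global-variables") == 0;
}
```

With this comparison the documented spelling is accepted and the misspelled "--global-variabless" is rejected.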

Thanks.

Couple of minor performance tweaks

In response to

[trunk/src/StringUtils.cpp:177]: (performance) Inefficient usage of string::find() in condition; string::compare() would be faster.
[trunk/src/StringUtils.cpp:198]: (performance) Inefficient usage of string::find() in condition; string::compare() would be faster.

I made the following changes:

$ git diff src/StringUtils.cpp
diff --git a/src/StringUtils.cpp b/src/StringUtils.cpp
index 203982d..f6d2601 100644
--- a/src/StringUtils.cpp
+++ b/src/StringUtils.cpp
@@ -174,7 +174,8 @@ StringUtils::str2int(const std::string &s)
        }
        stringstream ss(s);
        int i = -1;
-       if (s.find("0x")==0) {
+       // if (s.find("0x")==0) {
+       if (s.compare(0, 2, "0x") == 0) {
                ss >> std::hex >> i;
        } else {
                ss >> i;
@@ -195,7 +196,8 @@ StringUtils::str2longlong(const std::string &s)
 {
        INT64 i = 0;
        size_t j;
-       if (s.find("0x")==0) {
+       // if (s.find("0x")==0) {
+       if (s.compare(0, 2, "0x") == 0) {
                for (j=2; j<s.length(); j++) {
                        int v = 0;
                        if (s[j] >= '0' && s[j] <= '9') {

$
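
For comparison, here is a plain C analog of the same prefix test. It only illustrates the idea of comparing the first two characters instead of searching the whole string; `has_hex_prefix` is a hypothetical name and is not taken from StringUtils.cpp.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true when s starts with "0x". strncmp stops at the
 * first mismatch or at a terminating NUL, so short strings are safe. */
bool has_hex_prefix(const char *s)
{
    return strncmp(s, "0x", 2) == 0;
}
```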

Request for adding generation of 'switch' statements

This is a feature request for adding generation of switch statements.
I am adding it because:

  1. I would find this feature useful
  2. I would like to know if there is any work in progress towards implementing this feature, in case I decide to implement it myself

decltype() with random expressions

I'm interested in doing csmith stress tests of C++'s decltype()

#include <csmith_generated_header>
decltype( 
#include <csmith_generated_expr>
) my_declytype_variable;
#include<csmith_generated_footer>

Any advice on how to do this?

Also to be extra nasty, does csmith have a way to bias towards emitting more function pointers?

Frequently raising Floating point exception under restricted options

I am running Csmith 2.3.0 (+libcsmith-dev 2.3.0), Git version: 30dccd7 on Ubuntu 18.04 with the following version of production level compilers:
gcc version: 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04).
clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)

$ csmith --no-argc --no-arrays --no-jumps --no-longlong --no-int8 --no-uint8 --no-safe-math --no-pointers --no-structs --no-unions --no-builtins

Although the random programs are generated and compile without problems, they raise Floating point exception (core dumped) when executed. I observed this quite often: about three out of five trials raise an FPE, a 60% failure rate. It happens with both gcc and clang.

I was able to cut down one of the programs (raising FPE):

#include "csmith.h"
a;
b(int16_t d, uint32_t f, uint32_t g) {
  if (f / 0) {
    uint16_t h = 5;
  }
}
c() {}
main() {
  int32_t e;
  b(c, e, a);
}

As one can easily see, f / 0 on line 4 results in a division by zero. I wonder whether this is an expected property of random programs generated by Csmith, or whether we are misusing conflicting Csmith options.

Side-note: I saw the latest commit by @chenyang78 and wonder if it is the patch for this issue.


P.S. My colleagues and I are using Csmith to fuzz our own C compiler, KECC, which is written in Rust (still under construction!). It is for educational purposes, especially in this course. We really appreciate your well-made fuzzer and reducer (Creduce) as well.
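
The command line above includes --no-safe-math, so divisions such as f / 0 are presumably emitted as raw operators. As a hedged sketch of what a guarded division looks like, the hypothetical helper below checks the divisor first. The name `checked_div_u32` and the choice to return the numerator on a zero divisor are assumptions of this sketch, not a statement about Csmith's own wrappers.

```c
#include <stdint.h>

/* Hypothetical guarded division: never executes a division by zero.
 * Returning the numerator when the divisor is zero is an arbitrary but
 * deterministic choice made for this sketch. */
uint32_t checked_div_u32(uint32_t numerator, uint32_t divisor)
{
    return divisor == 0 ? numerator : numerator / divisor;
}
```

With such a guard, the condition in the reduced program would evaluate to f itself instead of trapping.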

Generate libcsmith.so

Can you also generate a shared (versioned) library for libcsmith? I need it for packaging Csmith for Fedora. Thanks!

Unsigned loop variable underflow

We found another issue related to #32. Loops such as the following can be generated, where g_1475 is unsigned.

for (g_1475 = 8; (g_1475 >= 1); g_1475 = safe_sub_func_uint32_t_u_u(g_1475, 7))

Such underflow and wraparound can result in long-running tests on a simulator.
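
A minimal sketch of a check for this unsigned case, assuming a loop of the form `for (i = init; i >= limit; i -= step)`. The helper name is hypothetical and the check is not taken from Csmith.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does  for (i = init; i >= limit; i -= step)  exit
 * before the unsigned counter wraps around? */
bool countdown_exits_without_wrap(uint32_t init, uint32_t limit, uint32_t step)
{
    if (init < limit)
        return true;             /* condition is false on entry */
    if (step == 0 || limit == 0)
        return false;            /* i can never drop below limit */
    /* The smallest value reachable before wrapping is init % step. */
    return init % step < limit;
}
```

For the loop above (init 8, limit 1, step 7) the counter goes 8, 1 and then wraps, and the check reports that accordingly.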

Mismatch between constant values and their types

csmith 2.2 generates things like this:

static int32_t g_4 = 0xA7275995L;

There are several issues here:

  • the constant is too large for int32_t
  • the constant is suffixed L for long but the variable type is probably not long
  • the variable type uses stdint style but the initializer uses old style

Using stdint style throughout would mean the csmith generated programs were portable across different systems, e.g. ILP32 vs. LP64. So the portable way to generate this would be

static int32_t g_4 = INT32_C(0xA7275995);

It would then be very clear at the point of generating the initializer, that this particular value isn't a valid int32_t.
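
A small sketch of the range check a generator could perform before emitting such an initializer. `fits_in_int32` is a hypothetical helper, not a Csmith function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when value is representable as int32_t. */
bool fits_in_int32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}
```

The constant from the example, 0xA7275995, is 2804373909 in decimal, which exceeds INT32_MAX (2147483647), so the check fails for it.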

Seeds are all the same when generating test cases concurrently

test command:

seq 5 | xargs -i -n 1 -P 5 bash -c "csmith | grep -w Seed"

result:

image

csmith version:

$ csmith -v
csmith 2.4.0
Git version: 5039909

fix method:

$ git diff
diff --git a/src/platform.cpp b/src/platform.cpp
index ff69361..d1ef1f9 100644
--- a/src/platform.cpp
+++ b/src/platform.cpp
@@ -46,6 +46,7 @@
 
 #include "platform.h"
 #include <stdlib.h>
+#include <sys/time.h>
 #if HAVE_BSD_STDLIB_H
 #  include <bsd/stdlib.h>
 #endif
@@ -111,7 +112,10 @@ unsigned long platform_gen_seed()
 #else
 unsigned long platform_gen_seed()
 {
-       return (long) read_time();
+       //return (long) read_time();
+       struct timeval tp;
+       gettimeofday(&tp, nullptr);
+       return tp.tv_sec * 1000000 + tp.tv_usec;
 }
 #endif

image
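
The patch above relies on microsecond resolution to separate concurrent runs. As an alternative sketch, and not the approach taken in the patch, a per-process value such as the process ID could also be mixed into the time-based seed. `mix_seed` and its parameters are hypothetical names; the caller is assumed to supply the timestamp and the pid.

```c
#include <stdint.h>

/* Hypothetical seed mixer: combines a microsecond timestamp with a
 * per-process value. Multiplying by an odd constant is a bijection
 * modulo 2^64, so two different pid values always yield different seeds
 * for the same timestamp. */
uint64_t mix_seed(uint64_t sec, uint64_t usec, uint64_t pid)
{
    uint64_t time_part = sec * UINT64_C(1000000) + usec;
    return time_part ^ (pid * UINT64_C(0x9E3779B97F4A7C15));
}
```

This only separates processes on one machine. It does not by itself address identical seeds across machines, which is what the MAC-address patch in the other issue is aimed at.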

Reducer.cpp

Hi,

there is a duplicate include of Lhs.h on line 60.

Enhance the ci/cd tests to cover more command line options

The current CI/CD tests are very primitive: they simply invoke csmith with all default options and a seed number from 1 to 100. We should enhance them so that most command line options are covered. We have a ton of command line options, many of them introduced recently during the attributes work.

The easiest way to implement this is to invoke csmith in a loop, reading the command line options for each iteration from a line in a file. For example:

csmith --bitfields --no-arrays
csmith --no-comma-operators --no-consts

What code can't csmith generate?

Hi, I am considering generating some special programs to test a C compiler. The csmith tool is very cool. I want to know whether there are specific kinds of programs (recursive functions, etc.) that csmith has no way to generate but that could effectively test the compiler.

Any suggestions are welcome, thanks.

Cannot get the hacked lcov

When I tried to fetch the hacked lcov through:
svn co svn+ssh://shell.cs.utah.edu/uusoc/res/embed/users/regehr/embedded_code_repo/yang/coverage/lcov-1.8 lcov-1.8
it requires authorization.

Csmith generally does not produce ISO-C-compliant code

Most of the programs I have had csmith generate do not seem to be valid ISO C code. In particular, when I compile the generated code with -pedantic-errors, indicating that gcc should emit an error if an ISO C constraint is violated, many programs generated by csmith no longer compile. When I inspect the diagnostics, the errors reported fall primarily into two categories:

  1. error: comparison of distinct pointer types lacks a cast
  2. error: pointer targets in assignment differ in signedness

Issue 1 violates 6.5.9/2, which states that equality over pointer types requires either that "both operands are pointers to qualified or unqualified versions of compatible types" or "one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void" or "one operand is a pointer and the other is a null pointer constant".

Issue 2 violates 6.5.16.1/1, which states that assignment to pointers must be such that either "(considering the type the left operand would have after lvalue conversion) both operands are pointers to qualified or unqualified versions of compatible types" or "(considering the type the left operand would have after lvalue conversion) one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void" or "the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant".

An example of a program that exhibits both errors can be seen by running csmith -s 104 --no-argc --no-arrays --no-unions --no-structs, but both errors are actually quite common and it is rather difficult to find seeds that do not exhibit this problem.

Misaligned memory access in csmith output

I'm using csmith as code generator to generate test cases to test my implementation of a RISC-V processor core. Initially I thought I had found a compiler bug, but after closer investigation it looks to me as if csmith generates code with undefined runtime behavior.

This is the test case generated by csmith: http://scratch.clifford.at/csmith_1062360517.c

This fails at runtime because of unaligned memory accesses. I have reduced that test case to the following code:

#include <stdint.h>
#include <stdio.h>

#pragma pack(push)
#pragma pack(1)
struct S0 {
   int8_t f0;
   int64_t f3;
};
#pragma pack(pop)

struct S0 g_189 = {0L,-6L};
int64_t *g_1957 = &g_189.f3;
int64_t **g_1956 = &g_1957;

int main ()
{
    printf("(1) %p\n", g_1957);

    int v = *g_1957; // <- BUS ERROR HERE

    printf("(2) %d\n", v);

    **g_1956 = 1;

    printf("(3) DONE\n");
    return 0;
}

Please correct me if I'm mistaken, but I think assigning pointers to unaligned pointees like that has undefined behavior in C. (Some compilers add special pointer attributes for pointers like that, like __packed in the Keil ARM compiler.)

As a work-around I'm now using csmith with --no-packed-struct.
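
Another option, shown here only as a hedged sketch, is to copy the bytes of the packed field out with memcpy instead of dereferencing an `int64_t *` that points into the packed struct. `struct PackedS0` and `read_f3` are hypothetical names for illustration, and `#pragma pack` is assumed to be supported by the compiler, as in the reduced test case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#pragma pack(push)
#pragma pack(1)
struct PackedS0 {
    int8_t  f0;
    int64_t f3;   /* at offset 1, so not 8-byte aligned */
};
#pragma pack(pop)

/* Hypothetical helper: reads f3 byte-wise, so no misaligned 8-byte load
 * is ever requested through a plain int64_t pointer. */
int64_t read_f3(const struct PackedS0 *s)
{
    int64_t value;
    memcpy(&value,
           (const unsigned char *)s + offsetof(struct PackedS0, f3),
           sizeof value);
    return value;
}
```

This does not change what Csmith generates. It only illustrates why taking the address of a packed member and dereferencing it later is the problematic step.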

Avoid generating dead functions

Dead functions are functions that are never called. Csmith occasionally generates them, giving compilers (and Creduce) an easy pass. It is generally bad when the generated code does not exercise the compiler.

Packed structs with 0-sized bit fields (still open?)

Hello,

Unless I am on the wrong track, please find at https://gist.github.com/tautschnig/4c5026bf9933a46a1ed9 an example generated by CSmith that does rely on the values of struct fields beyond a 0-sized bit field while padding is in effect. According to the thread at http://www.flux.utah.edu/listarchives/csmith-dev/msg00208.html this was fixed in 9a88a2, but it seems to be an observable problem still (or again). Note that in my case I noticed the issue while using the same compiler (clang-503.0.40), but comparing #pragma(pack) vs. __attribute__((packed)).

Best,
Michael

Building under Cygwin fails (`srand48`, 'lrand48' not declared)

I tried to build the current HEAD via

cmake .
make

under the latest Cygwin (CMake 3.14.5, GCC 7.4.0), but it fails:

/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp: In static member function ‘static void AbsRndNumGenerator::seedrand(long unsigned int)’:
/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp:98:2: error: ‘srand48’ was not declared in this scope
  srand48(seed);
  ^~~~~~~
/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp:98:2: note: suggested alternative: ‘_rand48’
  srand48(seed);
  ^~~~~~~
  _rand48
/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp: In member function ‘virtual long unsigned int AbsRndNumGenerator::genrand()’:
/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp:131:9: error: ‘lrand48’ was not declared in this scope
  return lrand48();
         ^~~~~~~
/cygdrive/e/Files/GitHub/csmith/src/AbsRndNumGenerator.cpp:131:9: note: suggested alternative: ‘_rand48’
  return lrand48();
         ^~~~~~~
         _rand48
make[2]: *** [src/CMakeFiles/csmith.dir/build.make:89: src/CMakeFiles/csmith.dir/AbsRndNumGenerator.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:260: src/CMakeFiles/csmith.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

One problem is that https://github.com/csmith-project/csmith/blob/master/src/AbsRndNumGenerator.cpp includes the header cstdlib, but in https://github.com/csmith-project/csmith/blob/master/CMakeLists.txt lrand48 is searched in stdlib.h. At least under Cygwin it seems that srand48 and lrand48 are only directly available when included via #include <stdlib.h> but not when included via #include <cstdlib>.

I am not sure how best to fix this issue, since there are other places where stdlib.h is included. So simply changing it in CMakeLists.txt does not really fix it, I guess, or at least not in a clean way.
With that change make can compile AbsRndNumGenerator.cpp, but then the next issue appears, with arc4random_buf in platform.cpp:

/cygdrive/e/Files/GitHub/csmith/src/platform.cpp: In function ‘long unsigned int platform_gen_seed()’:
/cygdrive/e/Files/GitHub/csmith/src/platform.cpp:108:2: error: ‘arc4random_buf’ was not declared in this scope
  arc4random_buf(&seed, sizeof seed);
  ^~~~~~~~~~~~~~
make[2]: *** [src/CMakeFiles/csmith.dir/build.make:947: src/CMakeFiles/csmith.dir/platform.cpp.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:260: src/CMakeFiles/csmith.dir/all] Error 2
make: *** [Makefile:130: all] Error 2

Looking at the stdlib.h I can see that __BSD_VISIBLE needs to be defined for the arc4random* functions to be available.

Maybe completely switching to C++ headers where it is possible and use check_cxx_symbol_exists instead of check_symbol_exists would be a solution?

Documentation for csmith

Hi,
I tried documenting the functions in some files, which may help new developers when they start working on csmith.
I have created a new pull request for this.
Thanks.

slow test when using programs generated by csmith with random probability configuration

I am trying to test a compiler using csmith. As pointed out in the swarm testing paper [1], random probability configurations help generate diverse programs. However, testing is very slow when I use swarm testing to generate programs. First, with random probability configurations, csmith is slow to generate programs. Second, I can only test a small number of programs generated by swarm testing in a test period, whereas I can test a large number of programs generated by csmith with the default probability configuration. It seems that programs generated with random probability configurations need more time to compile and execute.

I am curious why testing is so slow when I use swarm testing to generate programs.

Thanks to the developers for their amazing work.

[1] A. Groce, C. Zhang, E. Eide, Y. Chen, and J. Regehr, “Swarm testing,” in Proceedings of the 2012 International Symposium on Software Testing and Analysis. ACM, 2012, pp. 78–88.

Build failure on ARM/AARCH64

I wanted to try csmith on ARM but it doesn't compile:

platform.cpp: In function ‘long unsigned int platform_gen_seed()’:
platform.cpp:71:10: error: impossible constraint in ‘asm’
         );
          ^

Has nobody tried building on ARM platforms before?

csmith generates a test case for which the sanitizer reports undefined behavior: store to misaligned address

csmith source version:

commit 0bd545feec35013caa8fa8732f12f8da28fa35be
Merge: 1d86f7c 2841ae1
Author: Xuejun Yang <[email protected]>
Date:   Sun Jul 19 16:58:36 2020 -0700

    Merge pull request #95 from mortior/remove-platform-info
    
    Add command line swithces for platform info and remove creation of platform.info file

Seed=1670029788714550

test command

clang -w -O0 -Wall -fwrapv -ftrapv -fsanitize=undefined,address -I ~/software/csmith/include/csmith-2.4.0 -w test.c && ./a.out

result

test.c:532:75: runtime error: store to misaligned address 0x0000005bad8d for type 'int16_t' (aka 'short'), which requires 2 byte alignment
0x0000005bad8d: note: pointer points here
 97 04 00 00 f7 ff fb  ff ff ff 37 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00
             ^ 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior test.c:532:75 in 
test.c:849:253: runtime error: load of misaligned address 0x0000005bd86d for type 'int64_t' (aka 'long'), which requires 8 byte alignment
0x0000005bd86d: note: pointer points here
 ff ff ff 01 38 7e 4a  43 08 dc 28 ae fb 01 00  00 f9 13 00 00 d6 00 00  00 18 00 00 00 4e 09 00  00
             ^ 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior test.c:849:253 in 
test.c:916:179: runtime error: store to misaligned address 0x0000005bad8d for type 'int16_t' (aka 'short'), which requires 2 byte alignment
0x0000005bad8d: note: pointer points here
 97 04 00 00 c4 10 fb  ff ff ff 37 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00
             ^ 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior test.c:916:179 in 
test.c:321:215: runtime error: store to misaligned address 0x0000005bd92d for type 'int64_t' (aka 'long'), which requires 8 byte alignment
0x0000005bd92d: note: pointer points here
 00 00 00 39 fe ff ff  ff ff ff ff ff f1 01 00  00 ac 1f 00 00 bc 01 00  00 bc 00 00 00 29 1b 00  00
             ^ 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior test.c:321:215 in 
test.c:321:765: runtime error: load of misaligned address 0x0000005b7b8f for type 'uint32_t' (aka 'unsigned int'), which requires 4 byte alignment
0x0000005b7b8f: note: pointer points here
 ff 03 01 00 17  68 8f f2 08 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00
             ^ 
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior test.c:321:765 in 
checksum = D6F7BB67



clang version 10.0.1 (http://git.linaro.org/toolchain/jenkins-scripts.git a4a126627ddd5ee3ead2bb9dec4867ca8ad04ad8)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/yansendao/software/llvm10/bin
Found candidate GCC installation: /usr/lib/gcc/aarch64-linux-gnu/7.3.0
Selected GCC installation: /usr/lib/gcc/aarch64-linux-gnu/7.3.0
Candidate multilib: .;@m64
Selected multilib: .;@m64

test.c.zip

Don't allow reading a union field which was not the one last written into

In Csmith, we have a more liberal rule than the C standard: reading union field f1 is allowed even though the last written field is f2, provided f2 is wider than f1. For example:

union U1 {
  short f1;
  int f2;
};

union U1 var;
var.f2 = 1;
printf("%i", var.f1);

While this is safe when the fields are scalars, structures introduce a hazard to this rule because of their internal padding. For example:

struct S1 {
   short f1;
   long f2;                /* <---- padding between f1 and f2 */
};

union U1 {
   struct S1 f1;
   int f2;
};

union U1 var;
var.f1.f1 = 0;
var.f1.f2 = 3;
printf("%i", var.f2);      /* <--- indeterminate value */

Here, even though field f1 of the union is wider than field f2, the padding between the two struct fields leads to an indeterminate value when we read through f2, which is clearly not acceptable.

The quick, and correct, solution to this problem is to disallow the liberal rule. The generated code will read unions less often, but that is a lesser concern than generating safe code.
