Giter Site home page Giter Site logo

Comments (18)

delcypher avatar delcypher commented on May 31, 2024

The simplest (and best) fix for this is to remove the *.i (preprocessed files) files completely and have the build system just build the *.c source files.

The purported reasons for having preprocessed files is that

  • It makes the benchmarks completely standalone (i.e. no header file
    dependencies)
  • It prevents differences in the benchmarks appearing which could
    result from having different versions of system header files (e.g.
    different versions of glibc might lead to differences in the
    pre-processed code due to differences in the glibc header files)

Having worked a little bit with the *.i files I can list quite a
number of disadvantages that I have come across:

  • It is a huge waste of space. We are effectively doubling (at least)
    the size of the repository by having these pre-processed files under
    source control
  • It doubles the maintenance effort when fixing benchmarks files
    because both the *.i and corresponding *.c files need to be
    fixed by hand! I dare not regenerate the *.i files automatically
    because I have no way of knowing what build environment the original
    creator of the benchmark used (this is why we need to define a
    "canonical" build environment).
  • The line pragmas in the *.i files make debugging benchmarks
    extremely difficult. When I found compilation issues I had to remove
    all the line pragmas to get a readable error message from the
    compiler.
  • It's really easy to make an architecture mistake when pre-processing
    a source file (i.e. what I reported on in my first e-mail in this
    thread).

from sv-benchmarks.

dbeyer avatar dbeyer commented on May 31, 2024

For each verification task, it is specified whether it is meant for 32 bit or for 64 bit.
If you find verification tasks that were incorrectly preprocessed, please fix those.
Removing the preprocessed files is not an option.

from sv-benchmarks.

PhilippWendler avatar PhilippWendler commented on May 31, 2024

Which directories contain the incorrectly preprocessed files that you have found? Is it only heap-manipulation?

from sv-benchmarks.

delcypher avatar delcypher commented on May 31, 2024

@PhilippWendler I'm afraid this is too long ago for me to remember. Using a recent clang with -Werror=incompatible-library-redeclaration will probably allow you to pick up which directories have at least some of the problems.

from sv-benchmarks.

PhilippWendler avatar PhilippWendler commented on May 31, 2024

I did this, the list of affected directories is this:

array-examples
array-memsafety
bitvector-regression
ldv-commit-tester
ldv-consumption
ldv-linux-3.12-rc1
ldv-linux-3.16-rc1
ldv-linux-3.4-simple
ldv-linux-4.2-rc1
ldv-memsafety
ldv-races
ldv-regression
ldv-validator-v0.6
ldv-validator-v0.8
list-ext-properties
list-properties
loops
memory-alloca
memsafety
memsafety-ext
ntdrivers
product-lines
pthread
pthread-lit
seq-pthread
ssh
termination-numeric

The full list of files is in incompatible-library-redeclaration.txt.

Maybe this is also the cause for some of the undefined behavior reported in https://github.com/runtimeverification/evaluation/tree/master/svcomp-benchmark

In some files, the incompatible library definition is not that bad, because it only affects irrelevant calls to snprintf. In other files, however, it is clearly a sign of incompatible preprocessings.

As there are so many files, we should discuss a general approach to fixing this problem. Do we simply regenerate the preprocessed files with -m32? Or do we prefer to relabel some of these categories as 64bit?

from sv-benchmarks.

delcypher avatar delcypher commented on May 31, 2024

In some files, the incompatible library definition is not that bad, because it only affects irrelevant calls to snprintf. In other files, however, it is clearly a sign of incompatible preprocessings.

Oh wow. It's more serious than I thought. I would consider this yet another reason to avoid preprocessed source files or at the very least avoid letting humans do the preprocessing.

As there are so many files, we should discuss a general approach to fixing this problem. Do we simply regenerate the preprocessed files with -m32? Or do we prefer to relabel some of these categories as 64bit?

As this effects so many benchmarks I would recommend this is discussion is moved to the mailing list as this potentially affects many benchmarks.

from sv-benchmarks.

zvonimir avatar zvonimir commented on May 31, 2024

Hi all. Are there any plans for this to be fixed?
We are experiencing these issues in the memory safety category. We could create a patch that will fix just the benchmarks we are observing problems with, but that is clearly not the best solution. Having said that, if SVCOMP-wide solution is not feasible at this point, then we will work on submitting targeted fixes.

Oh, one more thing. I am pretty sure that I have seen patches to benchmarks being issues where only the .i file was fixed. So one has to be careful when regenerating benchmarks to make sure that these patches will not be overridden. Alternatively, we can just hope for the best and patch again if/when needed :).

from sv-benchmarks.

PhilippWendler avatar PhilippWendler commented on May 31, 2024

@zvonimir Unfortunately, only very few people seem to care about this because there have been few contributions towards this, and the people that work on this have only limited man power. So if you want to get this fixed, please help and encourage others to help.

Since e2a1165 the build system shows which benchmarks potentially have problems with incorrect pre-processing because clang issues a warning about incompatible-library-redeclaration. You can see a list of directories where this warning currently occurs by grepping for this string in */Makefile. The current list of potentially affected directories (excluding ldv-*) is

  • array-industry-pattern (#220)
  • array-memsafety
  • bitvector-regression (#208)
  • list-ext-properties
  • list-properties
  • loop-industry-pattern (#219)
  • loops (#206)
  • memory-alloca
  • memsafety-ext
  • memsafety
  • ntdrivers (#207)
  • product-lines (#205)
  • pthread-lit
  • pthread
  • seq-pthread
  • ssh (#207)

Note that not all warnings about incompatible-library-redeclaration come from incorrect pre-processings, sometimes people just manually re-declare some standard function like log or memcpy with a different signature.

Because as you noticed there might have been changes to only the .i files somewhere, it is important that any changes with updated pre-processing need to be reviewable, despite their potentially large size. When I fixed the pre-processing of the directory heap-manipulation, I separated all trivial (82ce8c6) and whitespace-only (81ce220, bf24e97) changes from the rest of the changes (743bb49). This made the latter change much smaller and easier to review. I suggest do follow the same procedure when fixing the pre-processings for other directories.

from sv-benchmarks.

PhilippWendler avatar PhilippWendler commented on May 31, 2024

I created proposed fixes for the directories that I know: bitvector-regression (#208), loops (#206), ntdrivers and ssh (#207), and product-lines (#205). In all these cases there was a better solution than doing a new pre-processing: I fixed manually-written declarations manually, and replaced a few preprocessed files with their non-preprocessed sources.

In ntdrivers, there are actually a few remaining cases that implement malloc in some weird way. I am not sure whether these should be replaced with an extern malloc declaration or have their malloc renamed.

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

@PhilippWendler could we please document (in the README file in the long run, but as first step just in here) what the system is we should be preprocessing on, and what the exact command to use is? At present, the README just says use "cpp -P or something similar". I'd be very happy to help, but frankly I don't know what system to use so as to avoid massive changes, or having to review massive changes before git add -p-ing just those few that we actually want.

Though, see also my comments on the mailing list today, which would mean we could just close this one.

from sv-benchmarks.

PhilippWendler avatar PhilippWendler commented on May 31, 2024

Currently there are no such requirements, except that no line markers are generated and that the architecture matches. In practice almost all benchmarks have been preprocessed with GCC and its standard library so far. I'm not in the position to impose any requirements, but I would suggest to use gcc -P -E -m<arch> on a recent Ubuntu LTS.

I am afraid there is no way to avoid generating massive changes when re-preprocessing old files (at least I do no know any way). I guess one could attempt to replicate the old preprocessing environment as closely as possible (sometimes this was documented in the commit message, or try the latest Ubuntu LTS at that time), but I guess that if you change the architecture compared to the old preprocessing, you will get massive changes even with the same environment. But maybe you find some way? What I did was to try making the changes as easily reviewable as possible (cf. description here).

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

If anyone has Ubuntu 16.04 at hand, running the below script should hopefully yield the most reasonable diff. Neither 14.04 nor 18.04 seemed like the best candidates.

#!/bin/bash

set -e

for d in \
 array-memsafety \
 ldv-races \
 ldv-regression \
 list-ext-properties \
 list-properties \
 memory-alloca \
 memsafety-ext \
 memsafety \
 pthread-atomic \
 pthread-ext \
 pthread-lit \
 pthread-nondet \
 pthread-wmm \
 pthread \
 seq-pthread \
 termination-libowfat
do
  echo $d
  cd $d
  for f in *.c
  do
    ff=$(basename $f .c)
    if [ ! -s $ff.i ]
    then
      if [ ! -s $f.i ]
      then
        echo "preproc file $ff.i not found"
        continue
      else
        ff=$f
      fi
    fi
    gcc -m32 -E -P $f -o $ff.i
  done
  cd ..
done

cd ldv-memsafety/memleaks-notpreprocessed/
for f in *.c
do
  gcc -E -P -m32 $f -o ../$(basename $f .c).i
done

Or maybe we should just finally get rid of preprocessed files altogether.

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

Just for the record: Ubuntu 16.04 doesn't help much either. @kfriedberger What OS have you been working on when preparing e76bcd9? I would then re-try processing on that OS.

from sv-benchmarks.

kfriedberger avatar kfriedberger commented on May 31, 2024

The commit e76bcd9 is quite old (over 4 years). At that time I worked with Ubuntu 12.04 or 14.04 or Arch Linux (rolling release, no version number), all of them in x64_86. None of the machines has survived ;-)

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

@kfriedberger Thanks, I'll try to set up a machine with one of those.

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

I have tried both Ubuntu 12.04 and Arch Linux (though a very recent one for the latter), and neither provide particularly good results. I'll now simply go for Ubuntu 18.04 as it is no worse than the others and matches the competition environment.

from sv-benchmarks.

tautschnig avatar tautschnig commented on May 31, 2024

I believe #765 finally fixes this problem for all remaining folders.

from sv-benchmarks.

MartinSpiessl avatar MartinSpiessl commented on May 31, 2024

#765 is merged, does this mean this issue can be closed?

from sv-benchmarks.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.