ecraven / r7rs-benchmarks

Benchmarks for various Scheme implementations. Taken with kind permission from the Larceny project, based on the Gabriel and Gambit benchmarks.

Makefile 0.02% Shell 1.77% TeX 3.41% Scheme 64.63% HTML 28.27% JavaScript 1.90%

r7rs-benchmarks's People

Contributors

arnebab, codemac, dcurrie, ecraven, jeffbezanson, jobol, justinethier, leppie, michaellenaghan, svenha, vyzo, wasamasa

r7rs-benchmarks's Issues

CPU limit penalizes parallel GC

The CPU limit in the bench script is 300 seconds of CPU time (5 minutes). Compilation gets 5 minutes, then running gets 5 minutes. Unfortunately this penalizes parallelization, whose goal is usually to minimize wall-clock time rather than CPU time, and the end result of the benchmarking process is wall-clock time. The penalty grows with the number of cores you have.
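
As a rough illustration of the effect described above, here is a minimal Python sketch. It assumes an idealized program that keeps a fixed number of cores fully busy, so CPU time accrues at busy_cores seconds per wall-clock second. The function name and the core counts are hypothetical; only the 300-second limit comes from the issue.

```python
CPU_LIMIT_SECONDS = 300  # CPU-time limit quoted in the issue


def wall_clock_budget(cpu_limit_s, busy_cores):
    """Wall-clock seconds available before a CPU-time limit trips.

    Idealized assumption: `busy_cores` cores are fully busy the whole
    time, so CPU time grows `busy_cores` times faster than wall time.
    """
    return cpu_limit_s / busy_cores


for cores in (1, 2, 4, 8):
    budget = wall_clock_budget(CPU_LIMIT_SECONDS, cores)
    print(f"{cores} busy core(s): {budget:.1f} s of wall-clock time")
```

Under this model a collector that keeps 4 cores busy would exhaust the limit after 75 seconds of wall-clock time, even though wall-clock time is what the benchmark reports.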

I think Guile has hit this limit in ctak at run-time, and now that some optimizations allow for better visibility into compile, we hit it there at compile-time as well.

I think the limit should be raised to 15 minutes. It's a decent trade-off between the concern of wanting short benchmark runs and allowing for parallelism in GC.

Where is graph.scm?

The Makefile references graph.scm, but it doesn't seem to be committed. Is it available?

Compute more repetitions for chudnovsky

chudnovsky is a bignum benchmark, as is pi---they both compute up to 500 digits of $\pi$.

Some chudnovsky benchmark times are so small that one can't have confidence in their accuracy (compared to startup times, for example).

One could increase computation times by increasing the number of computed digits of $\pi$ (currently limited to 500) or by increasing the number of times each length of $\pi$ is computed (currently twice).

I think a general benchmark suite should test performance on smallish bignums, as they are more useful in practice, so I would recommend increasing the number of repetitions instead of increasing the size of the bignums involved.
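
To make the accuracy argument concrete, here is a small Python sketch of how a fixed overhead (startup, timer resolution) shrinks relative to the measured time as repetitions increase. The overhead and per-repetition times are hypothetical placeholders, not measurements from any implementation.

```python
def relative_overhead(fixed_overhead_s, time_per_repetition_s, repetitions):
    """Fixed overhead as a fraction of the total measured compute time."""
    return fixed_overhead_s / (time_per_repetition_s * repetitions)


# Hypothetical numbers, chosen only for illustration.
overhead = 0.01   # seconds of startup/timer noise
per_rep = 0.015   # seconds for one pass over the digit lengths

for reps in (2, 20):
    share = relative_overhead(overhead, per_rep, reps)
    print(f"{reps:2d} repetitions: overhead is {share:.1%} of compute time")
```

Going from 2 to 20 repetitions cuts the relative weight of any fixed overhead by a factor of 10, without changing the size of the bignums involved.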

So here's a proposed diff:

diff --git a/inputs/chudnovsky.input b/inputs/chudnovsky.input
index c310e06..46b9135 100644
--- a/inputs/chudnovsky.input
+++ b/inputs/chudnovsky.input
@@ -1,4 +1,4 @@
-2
+20
 50
 500
 50

Brad

r7rs compatibility libraries for chez

Thanks for maintaining this set of benchmarks.

In the 'bench' script, the chez command refers to '--libdirs /home/nex/scheme/chez'

Could you please describe how you set up the compatibility libraries for chez?

Thanks,
Jan Erik

Add support for S7 scheme

It would be interesting to have S7 Scheme in the benchmark. It is an interpreted Scheme, a successor of TinyScheme, but much faster. S7 has a compatibility layer for R7RS; see below for a possible prelude.

I've performed some preliminary tests on my machine (MacBookAir 2019) and the results are the following (comparison with Guile 1.8 and 3.0.4).

test         S7        Guile 1.8.8  Guile 3.0.4
browse       24.27     80.32        12.060597
deriv        25.194    61.39        18.581995
destruc      52.077    TIMELIM      7.143701
diviter      9.685     77.85        15.453743
divrec       11.803    78.55        17.41294
puzzle       27.716    191.35       18.086531
triangl      33.931    98.16        8.519252
tak          12.925    134.2        4.757643
takl         20.968    TIMELIM      9.456034
ntakl        17.073    TIMELIM      9.516082
cpstak       103.358   221.03       59.444873
ctak         44.139    TIMELIM      TIMELIM
fib          10.218    195.78       12.090909
fibc         25.799    TIMELIM      TIMELIM
fibfp        1.885     45.98        22.001634
sum          6.637     281.63       6.866215
sumfp        2.499     105.1        42.058511
fft          32.198    TIMELIM      7.685201
mbrot        24.403    TIMELIM      50.086067
mbrotZ       18.556    TIMELIM      67.011491
nucleic      19.946    67.46        15.347245
pi           NO        TIMELIM      0.564552
pnpoly       17.981    TIMELIM      24.886723
ray          20.455    TIMELIM      18.51229
simplex      46.344    TIMELIM      13.895531
ack          10.572    TIMELIM      8.413945
array1       11.483    160.88       9.241778
string       1.714     1.82         1.872806
sum1         0.47      1.63         4.427402
cat          1.187     TIMELIM      28.396944
tail         1.188     TIMELIM      9.821691
wc           8.266     57.91        16.963138
read1        406       0.95         5.804979
compiler     41.155    TIMELIM      5.149011
conform      51.031    TIMELIM      10.508732
dynamic      22.736    69.58        7.374259
earley       TIMELIM   TIMELIM      9.489885
graphs       127.611   TIMELIM      23.026826
lattice      139.275   292.7        15.937364
matrix       72.073    TIMELIM      9.881781
maze         23.258    TIMELIM      4.70391
mazefun      19.51     129.61       9.664338
nqueens      55.11     TIMELIM      19.372148
paraffins    31.424    TIMELIM      4.24542
parsing      39.443    TIMELIM      10.687959
peval        29.677    98.91        15.644764
primes       7.73      39.33        7.521318
quicksort    93.996    TIMELIM      13.252736
scheme       71.462    TIMELIM      15.142413
slatex       32.069    48.96        45.047143
chudnovsky   NO        TIMELIM      0.306648
nboyer       39.274    151.42       5.10214
sboyer       31.537    168.81       4.755798
gcbench      20.54     TIMELIM      3.511493
mperm        173.33    TIMELIM      10.650118
equal        781       TIMELIM      TIMELIM
bv2string    10.782    TIMELIM      4.489627

chudnovsky and pi fail on S7, but that should be easy to fix.

This is the s7.prelude I'm using:

(define (this-scheme-implementation-name) "s7")
(define exact-integer? integer?)        
(define (exact-integer-sqrt i) (let ((sq (floor (sqrt i)))) (values sq (- i (* sq sq)))))
(define inexact exact->inexact)
(define exact inexact->exact)
(define (square x) (* x x))
(define (vector-map f v) (copy v)) ; for quicksort.scm
(define-macro (import . args) #f)
(define (jiffies-per-second) 1000)
(define (current-jiffy) (round (* (jiffies-per-second) (*s7* 'cpu-time))))
(define (current-second) (floor (*s7* 'cpu-time)))

(define read-u8 read-byte)
(define write-u8 write-byte) 
(define u8-ready? char-ready?) 
(define peek-u8 peek-char)
(define* (utf8->string v (start 0) end) 
  (if (string? v)
      v
      (substring (byte-vector->string v) start (or end (length v)))))
(define* (string->utf8 s (start 0) end) 
  (if (byte-vector? s)
      s
      (string->byte-vector (utf8->string s start end))))
(define write-simple write)


(define* (string->vector s (start 0) end)
  (let ((stop (or end (length s)))) 
    (copy s (make-vector (- stop start)) start stop)))

(define vector-copy string->vector)
(define* (vector-copy! dest at src (start 0) end) ; end is exclusive
  (let ((len (or end (length src))))
    (if (or (not (eq? dest src))
            (<= at start))
        (do ((i at (+ i 1))
             (k start (+ k 1)))
            ((= k len) dest)
          (set! (dest i) (src k)))
        (do ((i (- (+ at len) start 1) (- i 1))
             (k (- len 1) (- k 1)))
            ((< k start) dest)
          (set! (dest i) (src k))))))

(define make-bytevector make-byte-vector)
(define bytevector-ref byte-vector-ref)
(define bytevector-set! byte-vector-set!)
(define bytevector-copy! vector-copy!)
(define bytevector-u8-ref byte-vector-ref)
(define bytevector-u8-set! byte-vector-set!)

;; records
(define-macro (define-record-type type make ? . fields)
  (let ((obj (gensym))
        (args (map (lambda (field)
                     (values (list 'quote (car field))
                             (let ((par (memq (car field) (cdr make))))
                               (if (pair? par) (car par) #f))))
                   fields)))
    `(begin
       (define (,? ,obj)
         (and (let? ,obj)
              (eq? (let-ref ,obj 'type) ',type)))
       
       (define ,make 
         (inlet 'type ',type ,@args))

       ,@(map
          (lambda (field)
            (when (pair? field)
              (if (null? (cdr field))
                  (values)
                  (if (null? (cddr field))
                      `(define (,(cadr field) ,obj)
                         (let-ref ,obj ',(car field)))
                      `(begin
                         (define (,(cadr field) ,obj)
                           (let-ref ,obj ',(car field)))
                         (define (,(caddr field) ,obj val)
                           (let-set! ,obj ',(car field) val)))))))
          fields)
       ',type)))

chez: precompile before execution

I think that, for a fairer comparison, the Chez benchmark should precompile the program and the compatibility libraries before execution, as is done with Gambit.

Add a few reference implementation in other language.

Well, I know this might be off-topic, but I would really like to see how Schemes that claim performance comparable to statically compiled languages like C (Chez, for example) actually compare with C. In my own unscientific benchmarks Chez was 2x to 10x slower than C, and Racket was of course slower still.
I've seen people on Reddit and StackOverflow ask for benchmarks comparing Scheme and C. Adding C implementations would of course require work, but not that much. If this is considered helpful, I may start translating some of the benchmarks into C.

Update Ypsilon to 2.x

Ypsilon is now developed at https://github.com/fujita-y/ypsilon and has become an LLVM-based R6RS/R7RS compiler.

The author claims that its performance is comparable to Guile 3.x on his machine. It would be great if Ypsilon could come back to the benchmarking matrix officially.

mperm run failures

Looks like the mperm benchmark is failing on all Schemes. I checked on both Chez and Racket, and it appears the function run-benchmark is defined twice at the end of the file.
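
One way to catch this kind of problem across the suite is to scan each source file for repeated top-level definitions. The following Python sketch is only an illustration: the regular expression is a simplification that looks at definitions starting in column 0, and the sample source is made up, not the actual contents of mperm.scm.

```python
import re
from collections import Counter

# Matches "(define name ..." and "(define (name ..." at the start of a line.
TOP_LEVEL_DEFINE = re.compile(r"^\(define\s+\(?([^\s()]+)", re.MULTILINE)


def duplicate_definitions(source):
    """Return the sorted names that are defined more than once at top level."""
    counts = Counter(TOP_LEVEL_DEFINE.findall(source))
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up sample, standing in for a benchmark file.
sample = """\
(define (helper x) (* x 2))
(define (run-benchmark) (helper 1))
(define (run-benchmark) (helper 2))
"""

print(duplicate_definitions(sample))
```

Some implementations accept a redefinition silently while others reject it, so a check like this is a hint about where to look rather than a verdict.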

racket run is dead

I was having the same problems until I did the following:

% raco pkg uninstall r7rs
% raco pkg install -i r7rs

Now things (fib* so far) run for me.

Chicken 5

Hello!
Version 5 of Chicken Scheme was released in November 2018. Would you mind bumping the version used in the benchmark?

No license file

I have some patches, but am not allowed to contribute them since there is no license file.

Can I send you a PR to add one? Which license would you want?

Chicken Questions

I was wondering about the reasoning for the current Chicken flags (in particular C5). For example, according to the wiki, -O2 already includes -optimize-leaf-routines and -inline, so specifying these seems redundant.

I was also wondering if there's any particular reason not to compile with -O3, or even -C -O3, which passes the -O3 flag to the C compiler. I'm not entirely certain if or how these would work or break existing tests; the question is more exploratory.

Consistently set initial heap sizes

Quite a number of these tests have a high garbage collection component. It's well known that allocation-heavy benchmarks will run faster with larger heap sizes, and different implementations may be tuned to different heap sizes relative to live data. For consistency, it would be good to tune all implementations to have the same heap size for each benchmark -- i.e. for each benchmark, determine the minimum heap size at which the benchmark runs on any implementation, and then run all implementations at, say, 2.5x that heap size.

For Guile you can do this by setting the GC_INITIAL_HEAP_SIZE and GC_MAXIMUM_HEAP_SIZE environment variables. Say you want to determine the minimum heap size for chudnovsky: you run GC_INITIAL_HEAP_SIZE=3m GC_MAXIMUM_HEAP_SIZE=3m ./bench guile chudnovsky to try at 3 megabytes, and you vary the 3m until you find a heap size at which the benchmark doesn't run. You record that size for chudnovsky, then do the same for all the others. For chudnovsky, for example, I find it to be 2700k or so. So if we run at 2.5x that heap size, then when running the tests you do GC_INITIAL_HEAP_SIZE=6750k GC_MAXIMUM_HEAP_SIZE=6750k ./bench guile chudnovsky. But it is better to set GC_INITIAL_HEAP_SIZE only when running the compiled artifact and not the compiler!
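
The procedure above can be sketched in Python as a search for the minimum heap size followed by scaling it. This is a sketch under stated assumptions: runs_ok is a hypothetical predicate standing in for actually running ./bench at a given heap size, and the search assumes that a benchmark which runs at some size also runs at every larger size. Only the two environment variable names, the 2700k figure and the 2.5x factor come from the text.

```python
def minimum_heap_kb(runs_ok, low_kb, high_kb):
    """Smallest size in [low_kb, high_kb] for which runs_ok(size) is True.

    Assumes runs_ok is monotonic and that runs_ok(high_kb) is True.
    """
    while low_kb < high_kb:
        mid = (low_kb + high_kb) // 2
        if runs_ok(mid):
            high_kb = mid        # mid works, so try smaller heaps
        else:
            low_kb = mid + 1     # mid fails, so the minimum is larger
    return low_kb


def heap_env(min_heap_kb, factor=2.5):
    """Environment settings for running at `factor` times the minimum heap."""
    size = f"{round(min_heap_kb * factor)}k"
    return {"GC_INITIAL_HEAP_SIZE": size, "GC_MAXIMUM_HEAP_SIZE": size}


# Hypothetical stand-in: pretend the benchmark needs at least 2700k.
found = minimum_heap_kb(lambda kb: kb >= 2700, 1, 65536)
print(found, heap_env(found))
```

With the 2700k minimum mentioned above, heap_env yields 6750k for both variables, matching the command line in the text.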

Anyway, a thought, just if you were interested :) I will probably do this for Guile at some point for our internal benchmarks.

Is total-accumulated-runtime unintentionally misleading?

total-accumulated-runtime shows Chez with a small lead over Gambit, and Gambit with a small lead over Larceny. But tests-finished shows that Chez is only running 45 benchmarks, while Gambit is running 51, and Larceny is running 54. Are Gambit and Larceny taking a hit in that first graph just because they're doing more work?
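
One possible way to make the totals comparable is to sum only over benchmarks that every implementation finished. The Python sketch below illustrates the idea; the implementation names and timings are invented placeholders, and this is not how the existing graphs are computed.

```python
def comparable_totals(results):
    """Sum runtimes over the benchmarks finished by every implementation.

    `results` maps implementation name -> {benchmark name: seconds},
    containing only the benchmarks that implementation finished.
    """
    common = set.intersection(*(set(times) for times in results.values()))
    totals = {
        impl: sum(times[bench] for bench in common)
        for impl, times in results.items()
    }
    return totals, common


# Invented data: impl-b finishes one more benchmark than impl-a.
results = {
    "impl-a": {"fib": 1.0, "tak": 2.0},
    "impl-b": {"fib": 1.5, "tak": 2.5, "ack": 9.0},
}
totals, common = comparable_totals(results)
print(sorted(common), totals)
```

In the invented data, a naive sum would charge impl-b 13.0 seconds against 3.0 for impl-a, while the comparable totals are 4.0 and 3.0.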

PS Really, really love this!

add marks for racket 7.1 and racket 6.12

can you please add benchmarks for

  • racket 7.1
  • racket 6.12

Benchmarking version 6.12 would be interesting
because Racket 7.0 began the move toward a Chez Scheme 'backend'.

Adding Clojure is out of scope, I guess?

Gambit prelude addition for bv2string

firefly:~/programs/r7rs-benchmarks> git diff src/GambitC-prelude.scm
diff --git a/src/GambitC-prelude.scm b/src/GambitC-prelude.scm
index 51ffde0..73f9cdd 100644
--- a/src/GambitC-prelude.scm
+++ b/src/GambitC-prelude.scm
@@ -35,6 +35,23 @@
 (define write-string write)
 
 (define (this-scheme-implementation-name) (string-append "gambitc-" (system-version-string)))
+
+(define (string->utf8 s)
+  (with-output-to-u8vector
+   '()
+   (lambda ()
+     (display s))))
+
+(define (utf8->string v)
+  (call-with-input-u8vector
+   v
+   (lambda (p)
+     (list->string (read-all p read-char)))))
+
+(define make-bytevector make-u8vector)
+
+(define bytevector-u8-set! u8vector-set!)
+
 ;; TODO: load syntax-case here, to get syntax-rules.
 ;; google says (load "~~/syntax-case"), but that doesn't work on my machine :-/
 

I'm not sure that these lines in the femtolisp prelude:

+(define utf8->string identity)
+(define string->utf8 identity) 

are really true---are utf8-encoded strings really just byte-vectors in femtolisp? It does make the benchmark faster, though!

Avoid unsafe optimizations

In PR #15, a change was made to ensure all implementations were run in "safe mode". However this was reverted in b196000. Currently the benchmarks compare safe and unsafe implementations. What's the goal here?

My expectation would be that all Schemes should be compiled in such a way that they don't use unsafe optimizations.

pi.scm

r7rs has exact-integer-sqrt.

In Gambit, if you replace square-root with integer-sqrt, and quartic root with (lambda (x) (integer-sqrt (integer-sqrt x))), then the results on my machine go from

+!CSVLINE!+gambitc-v4.8.5,pi:50:500:50:2,.7766382694244385

to

+!CSVLINE!+gambitc-v4.8.5,pi:50:500:50:2,.03497314453125

I.e., it's about 22 times as fast. Perhaps it would be a better benchmark if a similar replacement were made. (In other words, perhaps you're spending most of the time in inefficient implementations of width and root, which may not be what you think the CPU is spending its time on.)
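
The nested-square-root trick can be checked independently of any Scheme implementation. In the Python sketch below, math.isqrt stands in for the integer square root (the first value returned by R7RS exact-integer-sqrt). The function name quartic_root is illustrative and is not taken from pi.scm. The last lines recompute the speedup from the two timings quoted above.

```python
import math


def quartic_root(n):
    """Floor of the fourth root of a non-negative integer.

    Taking the integer square root twice gives floor(n ** (1/4)) exactly,
    with no floating-point arithmetic involved.
    """
    return math.isqrt(math.isqrt(n))


# Check the result against the definition of a floored fourth root.
for n in range(10_000):
    r = quartic_root(n)
    assert r**4 <= n < (r + 1) ** 4

# Timings quoted in the issue (seconds).
before = 0.7766382694244385
after = 0.03497314453125
print(f"speedup: {before / after:.1f}x")
```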

add startup time benchmark

It would be nice to see the startup time of each scheme. Those that take very long to run can't be considered for writing unix-like one-shot tools like cat or grep. From my own trials some take extremely long to start.
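
A startup measurement could be as simple as timing a run of a trivial program. The Python sketch below shows one way to do that. The command used here starts the Python interpreter itself as a stand-in, because the exact command line for each Scheme implementation is not given in this issue and would have to be supplied.

```python
import subprocess
import sys
import time


def startup_time(command, runs=5):
    """Best wall-clock time, in seconds, over `runs` executions of `command`.

    Taking the minimum reduces noise from other processes on the machine.
    """
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        best = min(best, time.perf_counter() - start)
    return best


# Stand-in command: replace with an implementation running an empty program.
print(f"{startup_time([sys.executable, '-c', 'pass']):.3f} s")
```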

Gambit compiling "compiler" seems to tickle C toolchain bug

In results.GambitC you find

Testing compiler under GambitC
Including prelude /home/nex/src/r7rs-benchmarks/src/GambitC-prelude.scm
Compiling...
gambitc_comp /tmp/larcenous/GambitC/compiler.scm /tmp/larcenous/GambitC/compiler.exe
{standard input}: Assembler messages:
{standard input}:5355: Warning: end of file not at end of a line; newline inserted
{standard input}:6227: Error: no such instruction: `mo'
{standard input}: Error: open CFI at the end of file; missing .cfi_endproc directive
gcc: internal compiler error: Killed (program cc1)
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://bugs.archlinux.org/> for instructions.
+!CSVLINE!+gambitc,compiler,COMPILEERROR

So it appears that it's not a Gambit bug per se, but rather a problem compiling the C file that Gambit produces.

I looked for this because I had no problem with running compiler on my own Ubuntu box. I don't know why your setup has this problem.

Brad
