Giter Site home page Giter Site logo

Memory leak on context switches about fibers HOT 9 CLOSED

wingo avatar wingo commented on July 18, 2024
Memory leak on context switches

from fibers.

Comments (9)

civodul avatar civodul commented on July 18, 2024 1

The culprit appears to be 84addfb. Reverting it gives us back stable heap usage in the example above.

Instead of reverting it, a possible fix is:

diff --git a/fibers.scm b/fibers.scm
index 6e81f98..11c5389 100644
--- a/fibers.scm
+++ b/fibers.scm
@@ -150,7 +150,9 @@ parameter mutations to the fiber."
         (with-dynamic-state dynamic-state thunk))))
   (define (create-fiber sched thunk)
     (schedule-task sched
-                   (capture-dynamic-state thunk)))
+                   (capture-dynamic-state
+                    (lambda ()
+                      (%start-stack #t thunk)))))
   (cond
    (scheduler
     ;; When a scheduler is passed explicitly, it could be there is no

(That %start-stack call was removed in 84addfb.)

Surprising because %start-stack does very little, and what it does only affects the stacks.c API. In fact, the following patch has the same effect:

diff --git a/fibers.scm b/fibers.scm
index 6e81f98..23d07fd 100644
--- a/fibers.scm
+++ b/fibers.scm
@@ -150,7 +150,11 @@ parameter mutations to the fiber."
         (with-dynamic-state dynamic-state thunk))))
   (define (create-fiber sched thunk)
     (schedule-task sched
-                   (capture-dynamic-state thunk)))
+                   (capture-dynamic-state
+                    (let ((f (make-fluid)))
+                      (lambda ()
+                        (with-fluids ((f 42))     ;WTF?
+                          (thunk)))))))
   (cond
    (scheduler
     ;; When a scheduler is passed explicitly, it could be there is no

So I suspect this whole thing exhibits a bug in dynamic stack capture or something.

from fibers.

civodul avatar civodul commented on July 18, 2024

With Fibers 1.0.0 (replacing yield-current-task with yield-current-fiber from (fiber internal)), the problem doesn't seem to occur.

from fibers.

civodul avatar civodul commented on July 18, 2024

Heap size becomes stable if we change the second fiber above to:

   (spawn-fiber
    (lambda ()
      (let loop ()
        ;; (yield-current-task)
        (suspend-current-task schedule-task)
        (loop))))

The difference between (yield-current-task) and (suspend-current-task schedule-task) is that the former returns #t after its call to abort-to-prompt. And indeed, if we add a continuation to the schedule-task call, the leak is back:

   (spawn-fiber
    (lambda ()
      (let loop ()
        ;; (yield-current-task)
        (suspend-current-task
         (lambda (sched k)
           ;; When the 'schedule-task' call is not in tail position, memory is leaked. 
           (->bool (schedule-task sched k))))
        (loop))))

from fibers.

civodul avatar civodul commented on July 18, 2024

The program below has a stable heap size when built with -O1, but not when built with -O1 -Ocps or -O2 (with Guile 3.0.8):

(use-modules (fibers)
             (fibers channels)
             (fibers scheduler)
             (statprof))

(run-fibers
 (lambda ()
   (define channel
     (make-channel))
   (define leave-channel
     (make-channel))

   (spawn-fiber
    (lambda ()
      (sleep 20)
      (put-message leave-channel 'leave)))
   (spawn-fiber
    (lambda ()
      (let loop ()
        (let ((tag ((@ (fibers scheduler) scheduler-prompt-tag) (current-scheduler))))
          (abort-to-prompt tag schedule-task))

        (loop))))
   (spawn-fiber
    (lambda ()
      (let loop ()
        (pk 'heap-size (assoc-ref (gc-stats) 'heap-size))
        (sleep 2)
        (loop))))
   (get-message leave-channel))

 #:parallelism 1                                  ;don't create POSIX threads
 #:hz 0)

from fibers.

civodul avatar civodul commented on July 18, 2024

The culprit appears to be 84addfb. Reverting it gives us back stable heap usage in the example above. (Many thanks to @cwebber for telling me to just bisect — that it didn't occur to me is embarrassing :-))

from fibers.

cwebber avatar cwebber commented on July 18, 2024

Even more than that, it's the very same commit I found was a problem when I bisected #58 !

That commit caused bugs in both space and time it appears. :)

from fibers.

civodul avatar civodul commented on July 18, 2024

Escalating! 👉 https://issues.guix.gnu.org/59021

from fibers.

civodul avatar civodul commented on July 18, 2024

The change in libguile/vm.c discussed in the Guile issue did fix a leak, but I have another one shown by this reproducer:

(use-modules (fibers)
             (fibers channels)
             (fibers scheduler)
             (statprof)
             (ice-9 match))

(include "heap-profiler.scm")

(run-fibers
 (lambda ()
   (define leave-channel
     (make-channel))

   (spawn-fiber
    (lambda ()
      (sleep 60)
      (put-message leave-channel 'leave)))
   (let* ((sock (socket AF_INET
                        (logior SOCK_STREAM SOCK_NONBLOCK SOCK_CLOEXEC)
                        IPPROTO_IP))
          (port 4567))
     (bind sock (make-socket-address AF_INET INADDR_LOOPBACK port))
     (listen sock 4)
     (format #t "listening on port ~a~%" port)
     (format #t "In Bash, run:
   while : ; do echo foo> /dev/tcp/localhost/~a ; done~%~%" port)

     (spawn-fiber
      (lambda ()
        (let loop ()
          (match (accept sock (logior SOCK_NONBLOCK SOCK_CLOEXEC))
            ((connection . peer)
             (close-port connection))
            (_ #f))
          (loop)))))
   (spawn-fiber
    (lambda ()
      (define delay 2)
      (let loop ((previous (assoc-ref (gc-stats) 'heap-total-allocated)))
        (let* ((stats (gc-stats))
               (allocated (assoc-ref stats 'heap-total-allocated)))
          (pk 'heap-size (assoc-ref stats 'heap-size)
              'MiB/s (/ (- allocated previous) (expt 2. 20) delay))
          ;; (profile-heap)
          (sleep delay)
          (loop allocated)))))
   (get-message leave-channel)

   (profile-heap))

 #:parallelism 1                                  ;don't create POSIX threads
 #:hz 0)

It's a relatively slow leak because on my laptop the allocation rate is only ~1 MiB/s, but it is leaky.

With the heap profiler called at the end, I get:

;;; (heap-size 7622656 MiB/s 1.01300048828125)
  %   type                               self    avg obj size
 33.9 program                                931936    32.4
 25.7 pair                                   705696    16.0
  9.0 unknown                                248000   361.0
  5.0 symbol                                 138112    32.0
  4.3 struct                                 117088    97.7
  3.8 vector                                 103219    30.7
  3.0 stringbuf                               82224    48.3
  2.7 pointer                                 75136    16.0
  1.8 string                                  49488    32.0
  1.7 bytevector                              48032    59.2
  1.6 variable                                42816    20.3
  1.4 smob                                    38192    38.3
  1.2 hash-table                              34272    32.0
  1.1 heap-number                             30928    32.0
  0.9 port                                    24992    32.0
  0.8 weak-table                              21088    28.5
  0.7 vm-continuation                         19760    30.9
  0.7 atomic-box                              18160    32.0
  0.3 partial-continuation                     8832    32.0
  0.2 primitive                                4848    16.0
  0.1 weak-vector                              3424    18.9
  0.0 dynamic-state                            1296    16.8
  0.0 primitive-generic                        1280    32.0
  0.0 keyword                                   864    16.0
  0.0 frame                                     272    34.0
  0.0 fluid                                     256    18.3
  0.0 bitvector                                 176    35.2
  0.0 foreign-program                            96    32.0
  0.0 array                                      48    48.0
  0.0 weak-set                                   48    48.0
sampled heap: 2.62316 MiB (heap size: 7.26953 MiB)

... suggesting we've allocated and retained many closures (program here).

from fibers.

civodul avatar civodul commented on July 18, 2024

Turns out the leak was introduced in c25dcb9. Fixed now!

from fibers.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.