Giter Site home page Giter Site logo

Comments (18)

enoperm avatar enoperm commented on May 24, 2024 1

Sorry for "hijacking" this issue, though I do believe what I have observed is not entirely unrelated to the current state of wasmer-go

I have played around with the library, really like the concept, though I have ran into some issues with memory usage.

As an example, the code below demonstrates three separate issues:

  1. Using a single engine instance for a large number of queries will cause memory usage to climb until around 7GiBs or so, at which point all subsquent queries are stuck - I suspect it has to do with the allocator, though I have not verified it. I'm pretty sure a leak is involved somewhere because I do not think any state should be retained from just querying for nl.
  2. Using a single engine instance for a large number of queries that involve host-interop, will cause memory usage to climb until around 2GiBs, at which point I think the interop-related code uses an int32 to slice into WASM-allocated memory, resulting in a panic:
$ go run main.go -- --with-interop
2023/04/20 14:59:30
2023/04/20 15:00:53 trealla: query error: trealla: panic: runtime error: slice bounds out of range [-2130246640:]
exit status 1
  1. The reason I described the issue here lies with this leak. In theory, it is not an issue if a WASM application leaks any memory, since it is very cheap to to just throw it away and allocate one with a fresh slate... except doing so will consume memory infinitely. The runtime does not seem to ever release the instantiated WASM modules/memory, and the result is a far larger, and faster memory leak than just using a persistent instance.
package main

import (
	"context"
	"fmt"
	"log"
	"math/rand"
	"os"
	"time"

	"github.com/trealla-prolog/go/trealla"
)

func newEngine(ctx context.Context) (trealla.Prolog, error) {
	engine, err := trealla.New()
	if err != nil {
		return nil, err
	}

	err = engine.Register(ctx, "invoke", 1, trealla.Predicate(func(pl trealla.Prolog, subquery trealla.Subquery, goal trealla.Term) trealla.Term {
		v := fmt.Sprintf("%v", rand.Float64())
		return trealla.Compound{
			Functor: "invoke",
			Args: []trealla.Term{
				trealla.Atom(v),
			},
		}
	}))
	return engine, err
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	engine, err := newEngine(ctx)
	if err != nil {
		log.Fatal(err)
	}

	log.Print()
	defer func() {
		recover()
		log.Print()
		os.Exit(2)
	}()

	getEngine := func() trealla.Prolog {
		return engine
	}

	// perhaps ugly, but suffices for testing
	query := `nl`
	for _, arg := range os.Args[1:] {
		switch arg {
		case "--with-interop":
			query = `invoke(A)`

		case "--with-new-instance":
			getEngine = func() trealla.Prolog {
				engine, _ := newEngine(ctx)
				return engine
			}
		}
	}

	for {
		func() {
			ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
			defer cancel()
			engine := getEngine()
			doWork(ctx, query, engine)
		}()
	}
}

func doWork(ctx context.Context, query string, engine trealla.Prolog) {
	resultSet := engine.Query(ctx, query)
	defer resultSet.Close()

	for resultSet.Next(ctx) {
	}

	if err := resultSet.Err(); err != nil {
		log.Println(err)
		os.Exit(1)
	}
}

Since the alternative is getting stuck forever, I'm not entirely sure the int32 issue should be fixed just yet - I would personally prefer crashing and restarting to never getting anywhere at all, for one. But I think, assuming I'm not wrong about the source of the leak when each query is served by a new instance, a different WASM runtime might be more beneficial than it seems at first glance.

from go.

guregu avatar guregu commented on May 24, 2024 1

trealla-js looks fine so it shouldn't be a problem with Trealla itself. I think the issue here is we hold a global wasmer.Store which probably just infinitely accumulates memory. So we'd need to instantiate a new module for each Prolog to avoid that, but then we run into the wasmer-go bug you linked. I'll see about using a different library.

BTW the interop bug with the negative pointers should be fixed in the next release, but it can still run out of memory if hits 4GB. Not sure if any of the Go libraries support 64-bit WASM but if they do it should be possible to fix that.

from go.

enoperm avatar enoperm commented on May 24, 2024 1

I have plugged in the wasmtime branch into the toy project where I initially detected the memory leak and I can confirm that it can now serve each request on a new Trealla instance just fine without any apparent leaks. ✔️

from go.

guregu avatar guregu commented on May 24, 2024

There's one blocker for using wasmtime: bytecodealliance/wasmtime-go#34
We need an easy way to capture stdout to receive results from the Prolog interpreter.
We could work around it using files but it's not ideal.

wasmer solves this with a little C shim: https://github.com/wasmerio/wasmer-go/blob/b4462e6583f8d7b964e32bdd8d065cf96fba6c08/wasmer/wasi.go#L18-L54
So it probably wouldn't be too hard to add this to wasmtime.

I think we could also maybe export Go functions into WASM and have Trealla call it via FFI to report results, but I'm not quite sure how that works and if it is possible yet.

I would be especially interested in switching to any WASM runtime with good Windows support. Then we can support all major platforms out of the box.

from go.

guregu avatar guregu commented on May 24, 2024

This looks promising as well: https://github.com/tetratelabs/wazero
It's more in spirit with the library I think. Not sure if it's performance is anywhere close to the others.
wazero struggles with my broken realpath implementation in TPL so I need to fix that and do some experimenting.

from go.

guregu avatar guregu commented on May 24, 2024

Thanks @enoperm, that is very concerning. I’ll look into it 👀

from go.

enoperm avatar enoperm commented on May 24, 2024

Do remember to use a separate cgroup or something if you run the repro code with --with-new-instance. It is a good way to test your OOM killer...

from go.

guregu avatar guregu commented on May 24, 2024

I think the leak around instantiating new Prologs is my fault, not wasmer's, seems like I forgot to add a Close to Prolog. I'll add that and let's see if it helps.

I think it would also not be too hard to move to wasmtime, Trealla has some stuff now to capture stdout/stderr so we can work around the wasmtime-go issue.

from go.

guregu avatar guregu commented on May 24, 2024

To be more specific, the wasmer stuff relies on runtime finalizers to free memory if they aren't explicitly closed so maybe a tight loop could prevent them from being called. Adding a Close to the Prolog interface should help with that.

from go.

enoperm avatar enoperm commented on May 24, 2024

It is possible that I may have been harsh towards wasmer, I suppose. I'm not an expert on the Go runtime, so when I dug into the code and saw that finalizers exist, I just assumed they'd be surely called by the GC before the kernel triggers the OOM killer, and then encountered the previously linked issue and it kind of fit: if the wasm runtime does not free up things anyway, then any finalizers are pointless.

from go.

guregu avatar guregu commented on May 24, 2024

I'm not too sure how Go's runtime handles them either, TBH, so no worries.
I tried adding a Close() method to the Prolog instance that calls wasmer's instance.Close() but it doesn't seem to help. I'll try and figure out whether it's a problem here or with wasmer-go. I'll also check if trealla-js has the same issue or not (I remember measuring it a while back and it was fine) which will rule out Trealla itself.
Thanks again for the test code, it is very helpful.

from go.

guregu avatar guregu commented on May 24, 2024

Good news, I got wasmtime working (in the wasmtime branch if you want to check it out). Running the test program with --with-new-instance and closing the engine in the loop allows it to properly garbage collect the memory. Also, looks like it properly supports 64-bit Wasm so I'll play around with compiling Trealla to that this weekend.
Bad news, it seems like there's still a leak with queries against a single Prolog instance. Looks like this is happening on the Wasm side, probably from the stuff I added to get wasmtime to work ;). Should be able to fix it.
Also the Tak function benchmark seems like twice as slow on wasmtime vs wasmer, not sure what's up with that.

from go.

guregu avatar guregu commented on May 24, 2024

Found the leak! Got the default test stable at around ~64MB. Looks like there's some weirdness with the interop stuff still, so once that gets fixed we should be good and I'll push the new stuff.

from go.

guregu avatar guregu commented on May 24, 2024

I believe this is what was causing the interop memory issue: trealla-prolog/trealla#162
Once we fix that everything should be solid.

from go.

guregu avatar guregu commented on May 24, 2024

More good news: we've fixed all the leaks. Just need to tweak the interop stuff to work better with wasmtime and I can make a new release.

from go.

guregu avatar guregu commented on May 24, 2024

Great! I have some more changes I'm just about to push. Will test it a bit more and make a new release shortly.

from go.

guregu avatar guregu commented on May 24, 2024

Hmm, seems like wasmtime can crash sometimes. Probably my fault, investigating.

from go.

guregu avatar guregu commented on May 24, 2024

Whoops, was a concurrency issue, fixed now.
Made an issue for wasm64 support as well: #8

from go.

Related Issues (6)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.