Giter Site home page Giter Site logo

Comments (10)

kayceesrk avatar kayceesrk commented on June 8, 2024 1

Thanks. caml_stat_alloc raises an OCaml exception on failure, whereas the _noexc version returns a NULL value on failure. Some of these allocations occur before the program even gets into OCaml. So raising OCaml exceptions will not be the right thing to do. You should change your code such that it checks for NULL, and when it fails to allocate memory, you may do caml_fatal_error("not enough memory to startup").

Another point that we should discuss (upstream) is what the default values for Max_minor_heap_def and Max_domains_def should be. How much virtual memory do we want the default OCaml executable to consume? Now it consumes 256GB of virtual memory. My opinion is that the Max_minor_heap_def should be the same as the default minor heap size (2mb?), and possibly Max_domains_def should be 16?

If the program creates more domains than caml_params->max_domains, the error message here may point users at the Max_domains OCAMLRUNPARAM parameter.

For AFL, now there is a test that checks whether the program can run. @jmid introduced -m none option in the manual. Perhaps we may extend the manual section to say something about the new options. This depends on the default choices for Max_minor_heap_def and Max_domains_def. If the sizes are small enough (< 50mb default for AFL?), then we don't need -m none.

For Valgrind, confirm if $ Max_domains=1 Max_minor_heap=256k valgrind my_ocaml_prog.exe does not error out on start up. We may have to include this in the manual somewhere if the chosen defaults are large enough to cause valgrind failure.

It would be best to discuss this on ocaml/ocaml. Can you make a PR to OCaml with the changes?

from ocaml-multicore.

sabine avatar sabine commented on June 8, 2024

Update: I'm tentatively putting the allocation of sampled_gc_stats into caml_init_gc.

--

When I change Max_domains to a parameter passed by OCAMLRUNPARAM, I need to modify the static array sampled_gc_stats, as it can no longer be a fixed-length array when the number of domains is a runtime parameter.

Where's the correct place to put the sampled GC stats?

I could move it into domain_state, but the way sampled_gc_stats is used, we need to iterate over all the domains. There is caml_try_run_on_all_domains which could probably be used to iterate over all the domains... But that's a locking operation, and I guess that the reason why sampled_gc_stats is a static array in the first place is that we want to avoid locks here and that we do not care about being 100% accurate.

It seems like a another option is to turn sampled_gc_stats into a pointer and allocate the array on the heap, but I don't see the right place to perform the allocation.

from ocaml-multicore.

kayceesrk avatar kayceesrk commented on June 8, 2024

@sabine Thanks. Is there a branch where we can follow along this work?

It seems like a another option is to turn sampled_gc_stats into a pointer and allocate the array on the heap, but I don't see the right place to perform the allocation.

I would think that this is the right approach. caml_init_gc ingc_ctrl.c would be a good place to have it; sampled_gc_stats are related to the GC.

from ocaml-multicore.

sabine avatar sabine commented on June 8, 2024

Here's the branch compare: https://github.com/ocaml/ocaml/compare/trunk...sabine:minor_heap_max_domains_OCAMLRUNPARAM_795?expand=1

I placed the allocations for the max_domain-length arrays from runtime/domain.c into the caml_init_domains function.

I don't understand yet, which method of allocating memory should be used here. I can see caml_map_mem being used already in init_domains. Do we have a breakdown of the different memory allocation functions and their semantics somewhere?

In st_posix.h is an array tick_thread_stop: https://github.com/sabine/ocaml/blob/aac0479bc04a193668f9275c50c9a5cad0614042/otherlibs/systhreads/st_posix.h#L35
and in st_posix.c, we have the array thread_table:
https://github.com/sabine/ocaml/blob/aac0479bc04a193668f9275c50c9a5cad0614042/otherlibs/systhreads/st_stubs.c#L107

Not sure where these should go. Considering it's domains-related, they could possibly be initialized during init_domains, but maybe there's a better place somewhere else?

from ocaml-multicore.

kayceesrk avatar kayceesrk commented on June 8, 2024

Don't use caml_mem_map. That goes through mmap on Linux and is really only to be used for the allocation of the minor heaps for the domains. For allocating blocks of memory, where you would typically use malloc, use caml_stat_alloc_noexc from caml/memory.h. It works the same as malloc, but has the ability to optionally track malloced blocks in the runtime so that we can detect memory leaks. See OCAMLRUNPARAM option c: https://ocaml.org/manual/runtime.html and ocaml/ocaml#10865.

caml_init_domains is also fine for allocating these arrays.

from ocaml-multicore.

Engil avatar Engil commented on June 8, 2024

Not sure where these should go. Considering it's domains-related, they could possibly be initialized during init_domains, but maybe there's a better place somewhere else?

The problem with things in systhreads is that these sits in otherlibs, which means that the library may not be linked at all time.
It also makes the setup job for these independent from the "core runtime" setup job.
I guess one could argue about having systhreads intrinsics sits within the runtime itself (and this way we may manage them in a more uniform way, at least for the internal data structures.).

However, maybe another solution would be to just use the systhreads entrypoints:
If you looks at st_stubs.c, you can see caml_thread_initialize_domain called on domain spawn (and at the start of a program, to setup up the first domain).
This could do the trick. One thing that would be required though would be a sister hook to this to also handle tearing down memory on domain shutdown.

from ocaml-multicore.

kayceesrk avatar kayceesrk commented on June 8, 2024

I really don't want more hooks if possible. Callbacks are terrible in general and doubly so in C.

from ocaml-multicore.

Engil avatar Engil commented on June 8, 2024

Then we can move the handling of systhreads data structures within the runtime itself (would mean no hook, would be arguably cleaner.), but it would mean we now have references to systhreads in the runtime itself, sort of defeating the idea of having it sit in otherlibs.
I'm also wondering if there is not alternative path where we could let the GC handle these memory segments in some ways, since we have an assertion that the initial domain thread is always the last thread exiting on any given domain.

from ocaml-multicore.

kayceesrk avatar kayceesrk commented on June 8, 2024

Hooks may be the simplest solution considering the alternatives. However, I don't know when they are supposed to free the memory? If the main domain terminating frees the memory, there may still be other threads (on other domains) that try to access tick_thread_stop and thread_table, and crash? We don't free all_domains for the same reason, because it is not clear when to free it.

Related to ocaml/ocaml#10865 (comment). Linux atexit to free this memory will also not work: https://bugzilla.redhat.com/show_bug.cgi?id=790837. I would suggest going ahead without the free functions for now (so no additional hooks!), but add a comment to the new allocations about them not being freed.

from ocaml-multicore.

sabine avatar sabine commented on June 8, 2024

I added the remaining allocations behind a NULL-check, since it looks like caml_thread_initialize_domain is probably called more than once - clearly we only want to allocate the arrays with length caml_params->max_domains on the first call.

I left comments on all the allocations that link back to this issue and state that they're not freed.

The code compiles and running the tests gives me

Summary:
  3098 tests passed
   66 tests skipped
    0 tests failed
  167 tests not started (parent test skipped or failed)
    0 unexpected errors
  3331 tests considered

How do I go about testing further? E.g. do we already have some code on which to test Valgrind or AFL?

Edit: do we need to use caml_stat_alloc_noexc or caml_stat_alloc?
So far, I've only been using caml_stat_alloc_noexc, but considering how some of these allocations must succeed in order for the program to run, it seems like caml_stat_alloc might be the appropriate allocation function?

from ocaml-multicore.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.