Heap hardening about php-src HOT 7 OPEN

jvoisin commented on May 28, 2024
Heap hardening

from php-src.

Comments (7)

arnaud-lb commented on May 28, 2024

Thank you for creating this issue! I thought a bit more about this since #13943 (comment):

  • Type isolation requires considerable changes to the allocator, as addresses are reused in various scenarios. In particular, zend_mm_gc() must stop releasing address space, and large slots must be allocated in fixed-size bins like small ones; I'm not sure about huge ones. I'm considering the layout used by mimalloc for small/large bins.
  • GigaCages may not be practicable, or may not be effective, for multiple reasons. One is that the maximum string size is SIZE_MAX.
  • ASLR: On Linux the mmap base is randomized by 28 bits by default, but we align chunks to 2MiB and the layout is predictable inside chunks, so we only get 19 bits of randomness in practice. We can improve that without too much effort, so even though ASLR is not a panacea it would still be worth it. Having a different base in every child process, and re-basing from time to time, would also help a bit (literally, according to this paper).
  • Under the threat model of a remote attacker, in some scenarios GET/POST may be the only way to perform heap feng shui. Allocating user inputs in a separate heap would be effective against heap feng shui in this case.
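
The 28-to-19-bit figure in the ASLR point can be reproduced with a little arithmetic. The sketch below assumes 4 KiB pages (so the kernel randomizes at page granularity) and the 2 MiB chunk alignment mentioned above; the function names are illustrative and not part of zend_alloc.

```c
#include <assert.h>
#include <stddef.h>

/* log2 of a power-of-two value. */
static unsigned log2_pow2(size_t v)
{
    unsigned bits = 0;
    assert(v != 0 && (v & (v - 1)) == 0); /* must be a power of two */
    while (v > 1) {
        v >>= 1;
        bits++;
    }
    return bits;
}

/*
 * Entropy left in a chunk base when the kernel randomizes mmap bases at
 * page granularity but the allocator rounds addresses to a coarser
 * alignment. Every extra alignment bit is one random bit lost.
 */
static unsigned effective_aslr_bits(unsigned mmap_rnd_bits,
                                    size_t page_size, size_t chunk_align)
{
    assert(chunk_align >= page_size);
    unsigned lost = log2_pow2(chunk_align) - log2_pow2(page_size);
    return mmap_rnd_bits > lost ? mmap_rnd_bits - lost : 0;
}
```

With `effective_aslr_bits(28, 4096, (size_t)2 << 20)`, the alignment costs 21 - 12 = 9 bits, leaving 19.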

Longer term, we should check whether replacing refcounting + cycle GC with a full tracing GC is practicable, because it would help, although refcounting cannot be removed entirely because CoW semantics rely on it.

jvoisin commented on May 28, 2024
  • mimalloc is pretty neat and performant, and I'd recommend looking at isoalloc as well. Last year I spent some time producing an easily digestible mitigation/design comparison between userland allocators, which might be relevant here, and benchmarking the performance of the different allocators; I even gave a small talk on the topic.
  • Err, indeed, data of size SIZE_MAX won't fit in a GigaCage, sigh.
  • Unfortunately, having a different base means re-executing the process after the fork, which might significantly impact performance with respect to CoW. It's one of the reasons Android's Zygote doesn't do it. Moreover, I think that the threat model here is "an attacker with (limited) PHP code execution", meaning that ASLR can usually be inferred or ignored in some way. Randomizing freelists could help, though.
  • Isolating GET/POST is a great idea indeed!
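
A minimal sketch of the freelist randomization mentioned above: when a run of equally sized slots is carved up, the slots are linked in shuffled order rather than address order, so consecutive allocations are not adjacent in memory. The `free_slot` struct, `build_shuffled_freelist()`, and `demo_rand()` are hypothetical names for illustration, not the zend_alloc layout, and a real allocator would use a cryptographically strong random source instead of the xorshift generator used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical free-slot header; not the actual zend_alloc layout. */
typedef struct free_slot {
    struct free_slot *next;
} free_slot;

/* Demo-only xorshift32 generator. NOT suitable for security purposes. */
static uint32_t demo_rand(void)
{
    static uint32_t state = 2463534242u;
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

/*
 * Carve `count` slots of `slot_size` bytes out of `run` and link them in
 * random order. Returns the list head, or NULL if the scratch array
 * cannot be allocated.
 */
static free_slot *build_shuffled_freelist(void *run, size_t slot_size,
                                          size_t count, uint32_t (*rnd)(void))
{
    assert(count > 0 && slot_size >= sizeof(free_slot));
    assert(slot_size % sizeof(void *) == 0); /* keep slots pointer-aligned */

    size_t *order = malloc(count * sizeof(*order));
    if (order == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < count; i++) {
        order[i] = i;
    }
    /* Fisher-Yates shuffle; the slight modulo bias is ignored in this sketch. */
    for (size_t i = count - 1; i > 0; i--) {
        size_t j = rnd() % (i + 1);
        size_t tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
    /* Push the slots onto the list in shuffled order. */
    free_slot *head = NULL;
    for (size_t i = 0; i < count; i++) {
        free_slot *s = (free_slot *)((char *)run + order[i] * slot_size);
        s->next = head;
        head = s;
    }
    free(order);
    return head;
}
```

Shuffling once per run keeps the allocation fast path unchanged (pop the head of the list); the cost is paid only when a bin is refilled.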

arnaud-lb commented on May 28, 2024

Great, thank you!

  • Unfortunately, having a different base means re-executing the process after the fork, which might significantly impact performance with respect to CoW. It's one of the reasons Android's Zygote doesn't do it. Moreover, I think that the threat model here is "an attacker with (limited) PHP code execution", meaning that ASLR can usually be inferred or ignored in some way. Randomizing freelists could help, though.

Agreed with changing the base entirely. What I had in mind was to use a random mmap hint in zend alloc, and allocate contiguously from that hint (to avoid splitting the address space too much). After that we can randomize bin placement inside chunks (but I feel this can be easily defeated with heap feng shui) and freelists inside bins indeed.
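
The random-hint idea could look roughly like the sketch below. It assumes a POSIX system and the 2 MiB chunk size discussed earlier; `random_chunk_hint()` and `map_chunk_near()` are hypothetical helpers, and the address range is supplied by the caller rather than taken from any real zend_alloc configuration. Without `MAP_FIXED` the address is only a hint, so the kernel may place the mapping elsewhere and the caller still has to check the alignment of the result.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define CHUNK_SIZE ((size_t)2 << 20) /* 2 MiB chunks, as in the discussion */

/*
 * Map a random value onto a chunk-aligned address in [lo, hi) such that a
 * whole chunk fits below hi. `rnd` should come from a strong random source.
 */
static uintptr_t random_chunk_hint(uint64_t rnd, uintptr_t lo, uintptr_t hi)
{
    /* Round lo up to the next chunk boundary. */
    uintptr_t lo_aligned =
        (lo + CHUNK_SIZE - 1) & ~(uintptr_t)(CHUNK_SIZE - 1);
    assert(lo_aligned < hi);
    uintptr_t slots = (hi - lo_aligned) / CHUNK_SIZE;
    assert(slots > 0);
    return lo_aligned + (uintptr_t)(rnd % slots) * CHUNK_SIZE;
}

/*
 * Ask the kernel for one chunk near `hint`. The hint is advisory: the
 * returned address may differ and may not be chunk-aligned.
 */
static void *map_chunk_near(uintptr_t hint)
{
    void *p = mmap((void *)hint, CHUNK_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

To allocate contiguously from the hint, each subsequent chunk would use the previous chunk's address plus `CHUNK_SIZE` as its hint, which keeps the heap in one region of the address space.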

Regarding the threat model, I'm focusing more on the remote attacker model for now, as I feel this is the most critical.

jvoisin commented on May 28, 2024

Agreed with changing the base entirely. What I had in mind was to use a random mmap hint in zend alloc, and allocate contiguously from that hint (to avoid splitting the address space too much). After that we can randomize bin placement inside chunks (but I feel this can be easily defeated with heap feng shui) and freelists inside bins indeed.

Oh, I see. Yes, having a randomized per-child base would help a bit, as an attacker wouldn't be able to use forks to brute-force the randomization, although memory allocated before the fork would still be at the same offset across processes. As for periodic rebasing, I guess having the master process re-execute itself once in a while would be an acceptable hack/tradeoff.

Remote PHP exploitation is pretty exotic: to my knowledge, the only person to do it (publicly) is @cfreal. Local exploitation is much more common, usually to bypass open_basedir and disable_functions.

devnexen commented on May 28, 2024

...

@jvoisin, just curious: would you recommend using the userfaultfd API in that case?

jvoisin commented on May 28, 2024

@jvoisin, just curious: would you recommend using the userfaultfd API in that case?

I'd rather keep things simple and portable: map two pages PROT_NONE and let the process violently crash in case of violation. I'm under the impression that userfaultfd adds a lot of complexity, which is never a good thing for security-related features.
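
One way to read the PROT_NONE approach is a usable region bracketed by two inaccessible guard pages, so that any access running off either end faults immediately. This is a sketch under that assumption, for POSIX systems; `map_with_guards()` and `unmap_with_guards()` are hypothetical names, not existing php-src functions.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve `usable` bytes (rounded up to whole pages) with one PROT_NONE
 * guard page before and one after. Returns a pointer to the usable region
 * and stores the total mapping length in *mapped_len, or returns NULL.
 */
static void *map_with_guards(size_t usable, size_t *mapped_len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && usable > 0 && mapped_len != NULL);
    size_t page = (size_t)ps;
    size_t body = (usable + page - 1) / page * page;
    size_t total = body + 2 * page;

    /* Map everything inaccessible first, then open up only the middle. */
    char *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        return NULL;
    }
    if (mprotect(base + page, body, PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return NULL;
    }
    *mapped_len = total;
    return base + page;
}

/* Release a region obtained from map_with_guards(), guards included. */
static int unmap_with_guards(void *p, size_t mapped_len)
{
    long ps = sysconf(_SC_PAGESIZE);
    return munmap((char *)p - ps, mapped_len);
}
```

A read or write that touches either guard page raises SIGSEGV, which matches the "crash on violation" behaviour described above and needs no fault-handling code.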

devnexen commented on May 28, 2024

Oh, not so much complexity: it allows handling the violation more smoothly than the usual technique you're referring to. But... that's just Linux :)
