Giter Site home page Giter Site logo

Comments (9)

dodexahedron avatar dodexahedron commented on August 27, 2024

I have other problems with that enum that probably won't matter until a future version of unicode, but I'm ignoring those for now because this is fine for V2 and, so long as masks are defined correctly, shifting things left by 1 shouldn't be terribly difficult as a break-fix, if that happens before some other solution is implemented or Microsoft switches to UTF-8 and back-ports it to all windows versions (hey - we all have dreams, right?).

Although......

That being said, .net8 is UTF-8 by default for console encoding. So... Has anyone tried treating stuff as Utf8 natively, yet?

You have to work with Span<byte> more than string, but not having to do all that damn conversion all over the place would kill like half the work the library has to do.

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

I think this belongs in this issue more than #3275 but it's still related to both.

So, here's what I did on this end of things and what the result is:

As with any time I've touched multi-type files, types in CursesDriver and ConsoleDriver got split out to their own files.

That's of course a non-change to the program.

But I ended up looking at CursesDriver.ProcessInput (I don't remember the original reason, but it doesn't matter because the problem is fixed).

There were lots of points where cleanup was possible, but there was also a ton of copying going on.

It looks like the HandleEscSeqResponse method had well-intentioned thoughts on performance applied, given everything was by ref for it (Kudos, whoever did that in the first place).

Unfortunately, that's not how the compiler actually compiled that code, and there were still plenty of places where value types were being copied just for temporary use and then discarded.

So, I created a private ref struct to encapsulate the data relevant to those code paths.

A ref struct is a stack-only type. Neither an instance of a ref struct or any of its properties or fields can ever escape its ref-safe context, which basically means it can never outlive the original referent.

This is how Span<T> is defined, so if you are familiar with that at all, the restrictions are the same. Here's a doc for some details: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct

Anyway...

This allows the method and the methods it calls to operate on the same references to stack memory, resulting in a serious speedup and reduction in allocations.

A ref struct can never exist on the heap, so it's a guarantee that there are no heap allocations if you're working with a ref struct.

They can't even be reference BY things on the heap (which is why they cannot be elements of an array, for example).

So...

I turned the 5 values that ProcessInput handles and passes to that HandleEscSeqResponse method into this ref struct:

/// <summary>
/// A private byRefLike struct to simplify calls and keep everything really on the stack.
/// </summary>
/// <remarks>
/// This is a ref struct. It cannot escape the stack and the compiler will enforce that if you try.
/// </remarks>
private ref struct EscapeSequenceData
{
    public ref int code;
    public ref KeyCode k;
    public ref int wch;
    public Key keyEventArgs;
    public ConsoleKeyInfo [] cki;
}

It has two reference types in it, which is why those two aren't also ref. That would be an unnecessary extra layer of indirection.

Then I refactored the HandleEscSeqResponse into a method that takes one ref to that type and returns the same ref at the end, doing all its work in-place on that struct.

Just to show progression, I also kept the same variable names, initially, in the ProcessInput method, but made them refs to their corresponding fields in the struct, meaning now the old code was already using the new struct.

Once that was done and verified passing, I removed those now-redundant references, one by one, until all that was left was the struct and one strange place there's a separate int (wch2) defined that I didn't want to mess with, in the moment, so it was left alone. I'm sure it can probably get nuked too, but whatev.

Then I did some more cleanup in the ProcessInput, mostly around using switches instead of large if/else constructs.

The end result is that, vs the code before those changes, for the tests covering them, there has been somewhere between a 10-20x speedup and a reduction in heap allocations and copies and such during those methods significant enough to be visible in the profiler, even though they're all small and seemingly innocent values. This basically means the compiler must have been generating wrapper classes for the old handler that took 5 refs.

With the ref struct and the arrays being collection expressions, the compiler can very aggressively optimize that, now.

Now...

Not to oversell that speedup, though... On the test machine, we're talking those tests taking 0.004 seconds instead of 0.07 seconds, for example. It's within the realm of human perception, especially for a fast typist, but it isn't like this made the whole program noticeably faster beyond a few dozen milliseconds of input lag.

But hey, every bit helps, right?

I have plenty more to do in there before I'm actually finished with it and go back to the TextFormatter class.

I've already reduced the number of warnings, suggestions, and tips from the analyzer for CursesDriver from an almost full scroll bar (when I started) to this, at this point:
image

After I finish with this, though, I think I'll put in a PR at a working commit so others can rebase and whatnot, if it isn't too difficult to separate (it might be).

from terminal.gui.

tig avatar tig commented on August 27, 2024

Re HandleEscSeqResponse see

That code is fragile, not well tested, and hard to understand. We need a robust ANSI esc sequence parsing engine...

from terminal.gui.

tig avatar tig commented on August 27, 2024

Oh, as I did all the driver refactoring work in the past 6 months, CursesDriver has been the most neglected. Note that it still does not support TrueColor.

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

Ha yeah there's a lot of original code in there. It's some of the most obviously different.

And the PInvoke mechanism with pre-caching the delegates and whatnot did make sense back then, but is now just needless complexity. I'm finding it hard not to just go ahead and address whatever I run into in there. 😅

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

Honestly, some of the old generated code in there makes some sense, and the intent behind the KeyCode enum also makes some sense (more than the old stuff). Obviously, the class full of constants is a C-ism that the code generator that was used didn't understand makes more sense in C# as an enum. But oddly enough there's value both ways, so I'm torn on that one.

But, with the way the compiler is treating the uint, it's actually treating literals of the enum as values to be cast at run-time, not as constants, and that's not something I expected to see.

I think the best solution lies somewhere in between, honestly.

The way we use that enum all over the place, we need to be able to treat it as a numeric value. But uint enums can't do that without a run-time cast (oddly unless they are subjected to an arithmetic operation, in which case they ARE implicitly numeric).

I'm wondering if it's a caveat of uint (seems likely), so I'm gonna run to the docs to check the spec on that. I know uint and ulong aren't CLS compliant, at least, so I wouldn't be surprised at all if they're less capable than their signed brethren.

If that turns out to be the case, we might be able to just turn it into a long, for now, and isolate both 32-bit halves for each purpose.

But...

I think the best ultimate solution is in a well-defined struct that keeps things as more primitive structures and exposes an enum (with implicit cast operators) for seamless use both as a numeric value (also using cast operators) and as an enumerated value, for the convenience it gives for things like configuration with the built-in behaviors of enums.

Now... Since it's all just 32-bit binary values, in the current code, I'm using Unsafe.As<int,KeyCode>, Unsafe.As<int,uint>, etc to do the equivalent of a c-style static cast of one pointer type to another, which is pretty much the cheapest way to convert a type in dotnet (or anywhere else, really), and is perfectly fine so long as you are sure the values are compatible. Since we know they're compatible, being 32-bit values, it's not "unsafe" at all. :)

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

Hell, it may even be advantageous to store two 32-bit values inside a Vector64 or a wider type, to enable some wicked fast operations over sequences of values (which would end up getting SSE-ified by the compiler at that point). Not to mention we have direct access to all those wide instructions via those classes, which are great for a lot of that bitwise math.

However the data is stored, we can expose it in any way we like, for the public API.

It's probably a good idea for us to abstract away the implementation a little bit, anyway, to protect against breakage around version changes.

Actually...

Thinking about it a little more...

That's actually very likely a case where the strategy of doing multiple possible cases and just selecting the right one is significantly faster than anything else.

What I mean is that, for example, if there's a method like the one I'm working on right now that has a bunch of conditionals based on the same or slightly different values, which then do more math based on that decision, you load the values and constants into a Vector128 or something and then apply all of those operations at once at the start. Then the conditional is just a compare and jump, and the value is already done, all in fewer clock cycles, even though the processor threw away 3/4 of the values it computed.

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

Damn. The compiler behavior around those enums is exactly as the first line describing enum conversions states (emphasis mine):

For any enumeration type, there exist explicit conversions between the enumeration type and its underlying integral type. If you cast an enum value to its underlying type, the result is the associated integral value of an enum member.

That seriously sucks! :(

That even makes things like that Curses class that has the switch to convert from one enum to another even more expensive than it seems. Extremely minor, of course, in the grand scheme of things, but I'm just wondering how I didn't know this (or how I forgot it more likely lol) in the 23+ years I've been using C#. 😆

I suppose I do tend to avoid large enums in favor of formal structs or classes, because of the lack of flexibility in enums such as not being able to write operators. Extension methods are a pity prize. I want operators, damn it!

from terminal.gui.

dodexahedron avatar dodexahedron commented on August 27, 2024

On the topic of the Curses interop....

That old code was from the mono days.

I bet we can do a clean implementation in less work than it would take to untangle that thing.

The majority of that class is PInvokes and some code that is honestly partially redundant to other, newer, better code we have elsewhere.

Maybe I should shift my strategy in that direction, rather than trying too hard to fix this thing up?

from terminal.gui.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.