lordmilko / chaosdbg
.NET/Native Debugger for Windows
Certain commands (bm, x) allow specifying wildcard expressions. It is not yet clear how wildcards are treated in WinDbg's MASM evaluator compared with evaluating normal expressions with ?
Some scenarios we need tests for:
*!*
*!foo
foo!*
foo*!bar
foo!*bar
ntdll!*foo*
ntdll!*foo*1
ntdll!*foo* 1 //In this case, we should see there's a space and know that the 1 is not related to the wildcard expression
MasmParser.ParseIdentifier currently allows an AsteriskToken when looking for a symbol expression, but we need to change that to only consider asterisks valid when we're in "wildcard mode"
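A minimal sketch of how the scenarios listed above could be turned into tests. It rests on several assumptions that are not confirmed WinDbg behavior: the wildcard expression is the first whitespace-delimited token, `!` separates the module pattern from the symbol pattern, matching is case-insensitive, and `*` matches any run of characters. The function names are hypothetical and are not part of ChaosDbg.

```python
from fnmatch import fnmatchcase


def split_wildcard_expression(text):
    """Split 'module!symbol rest' into (module_pattern, symbol_pattern, rest)."""
    # Assumption: a space ends the wildcard expression, so a trailing "1" is unrelated
    token, _, rest = text.strip().partition(" ")
    module, bang, symbol = token.partition("!")
    if not bang:
        # Assumption: with no '!', the token is a symbol pattern for any module
        module, symbol = "*", token
    return module, symbol, rest.strip()


def matches(expression, module, symbol):
    """Return True if module!symbol satisfies the wildcard expression."""
    module_pattern, symbol_pattern, _ = split_wildcard_expression(expression)
    # fnmatchcase also treats '?' and '[...]' as special, which may differ from WinDbg
    return (fnmatchcase(module.lower(), module_pattern.lower())
            and fnmatchcase(symbol.lower(), symbol_pattern.lower()))
```

The two `ntdll!*foo*1` scenarios then differ only in whether the `1` stays inside the symbol pattern or is returned separately as the remainder.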
DbgEng resolves symbols used in expressions via dbgeng!MultiSymFromName
SymTagThunk)
TryGetModuleQualifiedSymbolValue and TryGetSimpleSymbolValue
IDA Pro is able to show the targets of switch statements.
In ntdll!RtlRestoreContext it says "sp analysis failed" on the jmp rdx. Does this mean IDA might be "simulating" values on the stack in order to figure out where the jump might go?
I've seen other things in IDA that indicate there may be a "physical" jump table located elsewhere in the assembly. There may be multiple scenarios we need to support
Once this analysis is working, we then need to implement support for displaying the targets of a switch statement in graph mode, similar to IDA
With u poi 0x00007ffcb6f1a388 we get an error that it's an invalid command
CordbDisasmSymbolResolver?
Neither CordbILFunction.Disassembly nor CordbFunction.ILToDisassembly will know how to handle having more than 1 code chunk
Attempting to set CORDEBUG_JIT_DISABLE_OPTIMIZATION in CordbEngine.LoadModule will cause CORDBG_E_CANT_CHANGE_JIT_SETTING_FOR_ZAP_MODULE to be thrown within mscordbi if it's an NGEN image. We have logic to disable the use of NGEN images, but if a user opts to keep NGEN images (we need to provide support for toggling this setting), we should check whether the loaded module is an NGEN image and, if so, skip setting this flag to improve perf
When executing chaos.exe in a loop 100 times (1..100 | foreach { & "ChaosDbg\src\chaos\bin\Debug\chaos.exe" --minimized pwsh -e interop }) (and I think it might've been configured to exit after breaking in?), when we hit Ctrl+C we tripped over something in CordbEngine.Break and hit a NullReferenceException. It looks like what happened was that if you hit Ctrl+C while we're still waiting on the TargetCreated task to complete, Process will be null. Do we also need to cancel the process creation if you hit Ctrl+C? Not sure
I think there's a race in adding the pause reason in CordbEngine.Break: if we just added this, but had just started processing an event in the time between this and calling EnsureHasStopReason, the event we just received is going to update the last stop reason, and so we'll think we didn't set one, when we did
In CordbProcess.Terminate we call TryContinue once to ensure that the Win32 Event Thread is running to be able to process our termination request (made above), but if we've stopped multiple times, do we need to keep continuing until the debugger is fully running, or will one continue be enough? We need a unit test for this question, and need to test both in interop and non-interop
Notes from CordbProcess.Terminate:
Suppose an unhandled exception occurred on the unmanaged callback thread. As we begin to shut down, the cancellation token of the cordb engine thread will be cancelled, resulting in us trying to terminate the program. But when we try to call terminate here, we'll see that the process isn't synchronized!
I think we need to fix the synchronization issue prior to cancelling the engine token. We need to document, everywhere related to terminal shutdown, where to go in the code to see the full documentation of our shutdown process across managed, unmanaged, the engine loop, etc
And maybe utilize the InProcThread that the unmanaged callback uses to continue to call Stop() and then somehow throw from there in a way that'll get caught?
In CordbSessionInfo, after EngineCancellationTokenSource?.Cancel();, we have the following comment:
If the fatal shutdown thread calls engine.dispose, but the UI thread called engine.dispose, the UI thread disposed will be set to true, and so we'll return immediately from this, despite the fact that shutdown is still in progress
Furthermore, the comments prior to WaitForCriticalFailureThread() talk about how we'll bail out of waiting if we're the engine thread...but we removed that logic from WaitForCriticalFailureThread for some reason. In the if (localCriticalFailureThread != null) check of WaitForCriticalFailureThread, the check if we're on the engine thread was commented out for some reason, don't know why
Comment in CordbEngine.Dispose: if somebody tries to dispose the engine while the critical failure thread is running, that's going to cause an issue!
When the user does intentionally kill their process, how should CordbEngine report the fact it died to the user? Do our existing event handlers cover this?
Comment in CordbEngine.StopAndTerminate: when chaos.exe crashes because it doesn't know how to handle C++ exceptions, it doesn't seem like we're cancelling the engine token. So what causes that to happen when we have an unhandled exception on the unmanaged event thread?
In CordbLauncher.Attach we do Process.Modules, however if the process suddenly exits, we'll crash while trying to enumerate the target's modules
We have an assert that the process is not null in CordbEngine.PreEventCommon, but if there was a critical failure during session init and the Win32 Event Thread didn't stop, this could trigger the assert. Is it possible to make this scenario occur?
In case we need to manually invoke multicast delegate event items again:
//Requires System.Reflection (BindingFlags); s and e are the sender and event args to pass along
var preEventDelegate = (MulticastDelegate) ucb.GetType().GetField(nameof(CordbUnmanagedCallback.OnPreEvent), BindingFlags.Instance | BindingFlags.NonPublic).GetValue(ucb);

//Invoke each subscribed handler individually
foreach (var handler in preEventDelegate.GetInvocationList())
    handler.Method.Invoke(handler.Target, new[] { s, e });
ToString(), our ExceptionData handler hits the NotImplementedException. Should we actually be returning true instead?
new ByteSequence(Int3, Int3, Int3, "*", "0x33C0"), //test eax,eax - this is causing extra false positives because it doesn't just match test eax,eax, and we ignore repeated junk bytes in our xfg version. (Note that 33 C0 is the encoding of xor eax,eax; test eax,eax is 85 C0.)
I'm not sure if we should even use XFG patterns at all
public static ByteSequence[] XfgCompatiblePatterns =
{
    //todo: i think this is needed for "48895C248" but it can cause false positives
    //new ByteSequence(Ret, Int3, "*", Mov_RspDisp_AnyReg) //todo: seems to cause some false positives, e.g. a branch in RtlReleaseActivationContext is erroneously matched

    //4883EC38
    Int3Int3Int3_MARK_SubRspAny,

    //4053
    Int3Int3_MARK_RexPushAny_SubRspAny,

    //Int3Int3Int3_MARK_RexPushAny, //gives a false positive

    /* Matched patterns:
     * 48895C248
     */
    Int3Int3_MARK_MovRspDispAnyReg,

    /*new ByteSequence(Int3, Int3, Int3, "*", MovAnyRsp, RexW_AnyB_AnyR_Mov),*/

    //488BC4
    Int3Int3_MARK_MovAnyRsp,
    RetInt3_MARK_MovAnyRsp,

    //Extra
    //4C8BD1
    Extra1, //mov r10,rcx
    Extra2, //mov r10,rcx

    //4053
    Int3Int3Int3_MARK_RexPushAny_PushAny,

    //new ByteSequence(Int3, Int3, Int3, "*", RexPushAny, RexPushAny),
    //but we DO want to match RtlAreBitsSet

    //todo: i think we should maybe comment these all out (even the ones we have below) and then add unit tests for each pattern
    //asserting that we got a match. not sure if we should embed the bytes we're matching against in our code, or if we should reference ntdll directly,
    //since we can have issues when theres multiple matches of addresses inside a function. e.g. when we have a jmp from function A to B. maybe we should
    //just handle those scenarios specially, by including the bytes for function A and function B that gets jumped to and test the behavior worked as expected
    //for this corner case we found. should probably have a method we can use to generate the bytes, etc and everything for each test case, and comment that
    //the function is from ntdll!foo etc. each test should call upon all patterns, and assert that there were no false positives of things only matched in raw.
};
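The per-pattern unit tests proposed in the todo comment could look roughly like the sketch below, which embeds the bytes under test instead of referencing ntdll directly. The semantics are assumptions and not ChaosDbg's actual ByteSequence implementation: "*" is treated as a marker for the candidate function start, and a hypothetical ANY value matches any single byte. The bytes 48 83 EC 38 (sub rsp,38h) are taken from the 4883EC38 comment above.

```python
INT3 = 0xCC
ANY = None   # hypothetical wildcard: matches any single byte
MARK = "*"   # hypothetical marker: the position reported as the function start


def find_matches(data, pattern):
    """Return the offsets in data where the MARK position of pattern lands."""
    items = [p for p in pattern if p != MARK]
    mark_offset = pattern.index(MARK)  # number of bytes that precede the marker
    hits = []
    for start in range(len(data) - len(items) + 1):
        if all(p is ANY or data[start + i] == p for i, p in enumerate(items)):
            hits.append(start + mark_offset)
    return hits


# Embedded test bytes: ret, three int3 padding bytes, then sub rsp,38h and a nop
sample = bytes([0xC3, 0xCC, 0xCC, 0xCC, 0x48, 0x83, 0xEC, 0x38, 0x90])
int3x3_mark_sub_rsp_any = [INT3, INT3, INT3, MARK, 0x48, 0x83, 0xEC, ANY]
function_starts = find_matches(sample, int3x3_mark_sub_rsp_any)
```

A test for each pattern would assert both the expected offsets and that no other pattern in the list matches the same bytes, which is one way to catch the false positives described above.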
Roslyn has an ExpressionEvaluator that may have some level of integration with Visual Studio's Dkm debugger API
dnSpy had the same idea, and created its own fork of Roslyn's ExpressionCompiler
For Visual Studio, the action begins in Microsoft.VisualStudio.Debugger.Engine.dll!Microsoft.VisualStudio.Debugger.EntryPoint
IDkmLanguageExpressionEvaluator_EvaluateExpression is at the top of the call stack, and after calling into Dkm pops out at IDkmClrExpressionCompiler_CompileExpression
Microsoft.CodeAnalysis.ExpressionEvaluator.ExpressionCompiler.dll!Microsoft.CodeAnalysis.ExpressionEvaluator.ExpressionCompiler (specifically a CSharpExpressionCompiler)
IDkmClrExpressionCompiler.CompileExpression is implemented by the expression compiler
dnSpy doesn't actually seem to utilize ExpressionCompiler; instead, it calls straight into the EvaluationContext (which is what ExpressionCompiler does internally as well). This may explain how dnSpy can potentially get away with not having declared any of the IDkm interfaces (honestly I'm a bit confused as to how it even compiles given these aren't defined anywhere and the relevant DLL isn't referenced or included in the output directory)
Visual Studio seems to be able to bypass doing a funceval for simple things (e.g. 1+1) but when you actually call a function, it'll do a real funceval
The actual act of calling CordbThread::CreateEval is done from managed code. There are 3 modules of interest: vsdebugeng.manimpl, Microsoft.VisualStudio.VIL and Microsoft.VisualStudio.VIL.Host
The following stack trace shows several points of interest:
015af048 58a30a6e Microsoft.VisualStudio.VIL.DebuggerHost.CorThread.CreateEval()
015af05c 58a3033d Microsoft.VisualStudio.VIL.DebuggerHost.RealFuncEval.DoRealFuncEval(Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.CallArgs)
015af0e8 58a3c924 Microsoft.VisualStudio.VIL.VisualStudioHost.InspectionHook.TryRealFuncEval(Microsoft.VisualStudio.VIL.VisualStudioHost.VSGlobalContext, Microsoft.VisualStudio.VIL.VisualStudioHost.LocalContextWrapper, Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.CallArgs)
015af11c 58a3c752 Microsoft.VisualStudio.VIL.VisualStudioHost.InspectionHook.Hook_TryRealFuncEval(Ilrun.VirtualMachine, Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.CallArgs)
015af134 58a3c6cb Microsoft.VisualStudio.VIL.VisualStudioHost.InspectionHook.Hook_CallFromRootFrame(Ilrun.VirtualMachine, Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.CallArgs)
015af148 58d730ea Ilrun.VirtualMachine2.ExecuteHookedMethod(Ilrun.HookedMethod, Ilrun.CallArgs)
015af18c 58d70d94 Ilrun.VirtualMachine2.ForwardLoop()
015af1a0 58d709d6 Ilrun.VirtualMachine2.RunForward()
015af1d4 58d6e84f Ilrun.VirtualMachine2.ExecuteMethodInternal(Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.CallArgs, Ilrun.IVirtualStackFrame)
015af21c 58d6e709 Ilrun.VirtualMachine2.ExecuteInspectionQuery(Microsoft.VisualStudio.Debugger.Metadata.MethodBase, Ilrun.IVirtualStackFrame)
015af24c 58a2f31f Microsoft.VisualStudio.VIL.VisualStudioHost.VilEvaluationServices.InterpretInspectionQuery(Microsoft.VisualStudio.Debugger.Evaluation.DkmInspectionSession, Microsoft.VisualStudio.Debugger.DkmWorkList, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationFlags, Microsoft.VisualStudio.Debugger.Evaluation.DkmFuncEvalFlags, UInt32, Microsoft.VisualStudio.Debugger.CallStack.DkmStackWalkFrame, Microsoft.VisualStudio.Debugger.Metadata.Assembly, System.String, System.String, System.String, Microsoft.VisualStudio.Debugger.Evaluation.ClrCompilation.DkmClrCompilationResultFlags, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultCategory, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultAccessType, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultStorageType, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultTypeModifierFlags, Microsoft.VisualStudio.VIL.VisualStudioHost.VilEvaluationResult, Microsoft.VisualStudio.Debugger.Metadata.Type, Microsoft.VisualStudio.VIL.VisualStudioHost.InspectionQueryUserContext, System.String ByRef, Microsoft.VisualStudio.Debugger.Metadata.MethodInfo ByRef)
015af394 587411a4 VSDebugEngine.ClrInspector.VilHelper.ExecuteQueryInternal(Microsoft.VisualStudio.Debugger.Metadata.Assembly, System.String, System.String, Microsoft.VisualStudio.Debugger.Evaluation.ClrCompilation.DkmClrCompilationResultFlags, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultCategory, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultAccessType, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultStorageType, Microsoft.VisualStudio.Debugger.Evaluation.DkmEvaluationResultTypeModifierFlags, Microsoft.VisualStudio.Debugger.Evaluation.ClrCompilation.DkmClrCustomTypeInfo, Microsoft.VisualStudio.Debugger.Evaluation.DkmInspectionContext, Microsoft.VisualStudio.Debugger.CallStack.DkmStackWalkFrame, System.String, System.String, System.Collections.ObjectModel.ReadOnlyCollection`1, Microsoft.VisualStudio.Debugger.DkmWorkList, VSDebugEngine.ClrInspector.EvaluationResultContinuation)
015af408 5874585c VSDebugEngine.ClrInspector.VilHelper.ExecuteInspectionQuery(Microsoft.VisualStudio.Debugger.Evaluation.ClrCompilation.DkmCompiledClrInspectionQuery, Microsoft.VisualStudio.Debugger.Evaluation.DkmInspectionContext, Microsoft.VisualStudio.Debugger.CallStack.DkmStackWalkFrame, System.String, Microsoft.VisualStudio.Debugger.DkmWorkList, VSDebugEngine.ClrInspector.EvaluationResultContinuation)
015af458 5873bac8 VSDebugEngine.ClrInspector.EntryPoint.Microsoft.VisualStudio.Debugger.ComponentInterfaces.IDkmClrInspectionQueryProcessor.Execute(Microsoft.VisualStudio.Debugger.Evaluation.ClrCompilation.DkmCompiledClrInspectionQuery, Microsoft.VisualStudio.Debugger.DkmWorkList, Microsoft.VisualStudio.Debugger.Evaluation.DkmInspectionContext, Microsoft.VisualStudio.Debugger.Evaluation.DkmILContext, System.String, Microsoft.VisualStudio.Debugger.DkmCompletionRoutine`1)
015af484 078c241a Microsoft.VisualStudio.Debugger.EntryPoint.IDkmClrInspectionQueryProcessor_Execute(IntPtr, IntPtr, IntPtr, IntPtr, IntPtr, IntPtr, IntPtr)
In Microsoft.VisualStudio.VIL it creates a series of DecodedInstruction items for a DecodedMethod
It seems that the expression was already pre-evaluated by the time the IL instructions are passed to the API call that creates the DecodedMethod, indicating the expression was already pre-computed inside Roslyn's expression compiler
CordbDisasmSymbolResolver
In CordbDisasmSymbolResolver, I have a comment that the CLR gives super generic names for thunks. But is this even true? Not sure if we have to test with powershell.exe instead of pwsh.exe
CordbDisasmSymbolResolver. We need to move this into DebuggerSymbolProvider
DbgEngFormatter.TryFormatIndirect is not very robust
lea? In this case the memory base will be rax or something
mov rdx,[7FFC116C2450h]. Right now we hardcode getting Op0Kind. There can be more than 2 operands (somehow) in some scenarios, so we may need to loop over all operands
DbgHelpDisasmSymbolResolver
DbgEngDisasmSymbolResolver
deferredBreakpoint and currentBreakpoint properly
In Add(CORDB_ADDRESS address), consider checking whether the address we're trying to set a breakpoint for is in managed code, and if so throw an error that a managed breakpoint should be used instead (which occurs via ICorDebugCode). IXCLRDataProcess.GetAddressType can potentially be used to check whether an address is part of a managed method or not
FlushInstructionCache ourselves, or whether the CLR handles this for us?
CordbProcess::SetUnmanagedBreakpoint and raw breakpoints we use for setting breakpoints inside the CLR. It's possible mscordbi does call FlushInstructionCache but we're not doing that for our raw breakpoints
In ProcessDeferredBreakpoint, we need to implement special logic so that we don't re-enable the step breakpoint
In RestoreCurrentBreakpoint, we need to implement special logic so that we don't re-enable the step breakpoint
In RestoreCurrentBreakpoint I say "If that instruction doesn't already have a breakpoint on it, we need to insert a sneaky step" - is this true though? Are we in fact just sneaky stepping always?
In RestoreCurrentBreakpoint, do we need special logic if we're on the last instruction of a function? This talks about applying "step-in" logic in certain circumstances
In AddNativeStep, when handling the scenario where we're on an int3, we need to tell the difference between having an int3 on the "next" instruction vs "actually, we just hit an int3, so we were going to increment the IP after it anyway". But having said that, if we're stepping, we want to break on the next instruction anyway, so I guess maybe we do need to set a hard breakpoint here. Maybe update the comment in the int3 case to note this
DataBreakpoint and BreakpointSetError events?
UnmanagedException: how does dbgeng handle queuing up a deferred breakpoint which it'll create a single step out of, but then the user single steps themselves? The deferred breakpoint should just be ignored, right? I would guess the "I can see there's a breakpoint on the next instruction" logic would take care of that
DoContinue: suppose we're at a hardcoded int 3 and we try to step, and we didn't change to another thread. In that case I would guess we need to NOT step since we're just going to increment the IP manually?
TryClearHardInterrupt:
address & (size - 1)
rep stosd instructions doesn't work. Need to understand why and implement special logic for all instructions like it that can cause issues when stepping
ret is executed twice. Not sure if it's because it's a ret, there's a syscall behind it, or it's a bug in our debugger! I think I saw that when we were able to break in before it gets to the normal point where this issue occurs (which I think is in wait for single/multiple objects) it didn't double step, so it may indeed be related to the syscall. It doesn't happen the second time you step through the method
Ghidra has several config files that list noreturn functions
In CordbNativeFrame.GetRegisterRelativeVariable we interpret what register to use when the register is CV_ALLREG_VFRAME. According to dbgeng!MachineInfo::CvRegToMachine the register should be RSP on x64 when we see CV_ALLREG_VFRAME. Need to test this is true. We've seen CV_ALLREG_VFRAME a lot for x86, but haven't seen it at all for x64; symbols tend to say to use RBP explicitly
IDA Pro is able to identify functions even when no symbols are present. Even when symbols are present, sometimes things aren't actually functions after all (e.g. the many bad functions in System.Data)
Ghidra utilizes a series of heuristics for identifying the locations of functions. Consider implementing this functionality as well. Note that we've seen cases where a function began with sub against rsp or something, so we need to make sure not only that such functions are identified, but also that all functions listed in symbols are identified as functions using this heuristic-based approach
Make MemoryDumper more robust. Consider adding nested classes (e.g. MemoryDumper.Dword) for dumping the various memory types
If an unhandled exception occurs on any of our background threads (Cordb, DbgEng, InProcThread, Symbol Resolution Thread, PowerShell Engine Thread, and possibly anywhere else we use DispatcherThread / new Thread()) we need to somehow capture this and notify the user that a fatal error occurred
Depends on: #1
We've applied a heuristic that says that if we run into another known entity while disassembling a function, we've gone too far and therefore this must be a bad function. However, in some cases you can have multiple symbols within a function, e.g. ntdll!RtlpInterlockedPopEntrySList
Related issues:
ntdll!DbgBreakPointWithStatusEnd contains no instructions. The real symbol is ntdll!DbgBreakPointWithStatus
In OLEAUT32!LPSAFEARRAY_UserUnmarshal64 the finally block appears to have its own symbol. That means we're erroneously treating this finally block as being its own "function", which isn't true; it's a part of LPSAFEARRAY_UserUnmarshal64
In Remove(CORDB_ADDRESS baseAddress) we need to remove any managed symbols we generated as well. We don't generate a managed symbol "module", but we do generate symbols that may exist within the module we're removing on demand as required
TargetFrameworkAttribute of the assembly. This isn't a good solution, but this is how dnSpy does it too (see DotNetHelpers.IsDotnetExecutable)
In WithFrameContext, DbgEng appears to have logic that says when a frame is not the top frame in a call stack, rewind the IP by 1 before calling SymSetContext, otherwise symbol resolution won't work on the frame. Is this true? How do we reproduce this issue?
TrySafeManagedSymFromAddr
TryGetCodeHeaderData. Is it possible for it to throw in this particular context however? TrySafeManagedSymFromAddr must never throw!
DeferCalculateModuleInfoAsync
pendingNativeOperations. Not sure what's going on
RemoveManagedModule
CordbDisasmSymbolResolver into DebuggerSymbolProvider, we need logic that considers whether the given IP is already the indirected address, or whether we need to indirect it for the user. When a user queries a given address, they might already be querying the indirected address, but in the case of CordbDisasmSymbolResolver the address hasn't been indirected yet. Our DebuggerSymbolProvider needs to allow being told whether it should try to indirect the address it was given or not.
error file with the same name format that the real symsrv uses and then rename if it downloads successfully. Not sure if we need to automatically clean up failed downloads somehow?
COMIMAGE_FLAGS of the IMAGE_COR20_HEADER to see if a file may be C++/CLI (and also need to test that that is a valid way of detecting C++/CLI images). Do load symbols for C++/CLI files. If a normal .NET file has exports, we should probably list those like unmanaged symbols, and maybe need to make sure we do so regardless of what DIA has to say
SymbolServerW handler is called, we should throw a special exception that will bubble up to the user telling them to use slow SymSrv instead
Iterate over all operands on each instruction disassembled and store the operand values on the DisasmFunctionResolutionContext
. We then need to somehow know when we're finished processing the function stack, and then analyze all of the addresses that we identified to see whether they're within the allowed bounds of the current function. If so, if it's not an address we already know of, add it to the list of known entities for the module
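The final filtering step described above could be sketched as follows. This is only an illustration under stated assumptions: operand values are plain integer addresses, the function's bounds are half-open ([start, end)), and `collect_new_entities` is a hypothetical helper, not an existing ChaosDbg method.

```python
def collect_new_entities(operand_addresses, function_start, function_end, known_entities):
    """Return operand addresses inside the function that are not yet known entities."""
    new_entities = []
    for address in operand_addresses:
        # Assumption: the function occupies the half-open range [start, end)
        in_bounds = function_start <= address < function_end
        if in_bounds and address not in known_entities and address not in new_entities:
            new_entities.append(address)
    return new_entities


# 0x1000 is already known and 0x2000 lies outside the function, so only 0x1010 is new
known = {0x1000}
found = collect_new_entities([0x1000, 0x1010, 0x1010, 0x2000], 0x1000, 0x1100, known)
```

Running this only after the whole function stack has been processed means the bounds are final before any address is promoted to a known entity.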