i#2140: mixing syscall methods #2476
Conversation
Adds a simple test using int 2e for syscall, even if another method is used.
| } | ||
|
|
||
| #ifdef X86 | ||
| if (my_dcontext != NULL && instr_is_syscall(bb->instr) && |
There was a problem hiding this comment.
Why do you check "my_dcontext"? The other code here does not do that. If it is necessary please add a comment explaining why.
There was a problem hiding this comment.
I forgot to remove it after trying another solution.
I am removing it now.
|
|
||
| #ifdef X86 | ||
| if (my_dcontext != NULL && instr_is_syscall(bb->instr) && | ||
| instr_get_opcode(bb->instr) == OP_int && |
There was a problem hiding this comment.
We'd prefer to avoid adding code to the main routine here: either this should be in mangling, if it's something that should happen after a client sees the unchanged code, or it should be in one of the syscall/interrupt handler routines already called during bb building.
There was a problem hiding this comment.
Well, there is a good question that relates to my broader question of how to fix this case.
Should we relax the constraint in arch.c:
/* we assume only single method; else need multiple do_syscalls */
ASSERT(new_method == get_syscall_method());
There was a problem hiding this comment.
You can see that the UNIX case right above what you're quoting does relax it and has do_int_syscall vs the sysenter do_syscall. We can do something similar for Windows.
There was a problem hiding this comment.
I will look into it.
Do you know why this was done for Unix and not for Windows?
There was a problem hiding this comment.
It looks like I should compute both do_syscall methods (int and the other one) at initialization, since the do_syscall method is stored in the .data section and therefore cannot be updated later on Windows.
There was a problem hiding this comment.
.data can be written later by temporarily unprotecting the target page (while trying to avoid races) -- there are some helpers for this like SELF_UNPROTECT_DATASEC
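As a rough illustration of the unprotect, write, re-protect pattern described above, here is a minimal POSIX sketch. It is an analogue only: it uses `mmap`/`mprotect` on an anonymous page rather than DynamoRIO's `SELF_UNPROTECT_DATASEC` helpers, and `prot_init`, `prot_write`, `prot_page`, and `prot_size` are hypothetical names invented for this example. It does no locking, so a caller would still need to serialize writers to avoid the races mentioned.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for a protected data section: one read-only page. */
static unsigned char *prot_page;
static size_t prot_size;

/* Maps one page, then makes it read-only. Returns 0 on success, -1 on error. */
static int prot_init(void)
{
    long sz = sysconf(_SC_PAGESIZE);
    if (sz <= 0)
        return -1;
    prot_size = (size_t)sz;
    void *p = mmap(NULL, prot_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    prot_page = p;
    return mprotect(prot_page, prot_size, PROT_READ);
}

/* Temporarily unprotects the page, copies len bytes in, and re-protects it.
 * No locking is done here: concurrent writers must be serialized by the caller. */
static int prot_write(size_t offset, const void *src, size_t len)
{
    assert(prot_page != NULL);
    if (offset > prot_size || len > prot_size - offset)
        return -1;
    if (mprotect(prot_page, prot_size, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(prot_page + offset, src, len);
    return mprotect(prot_page, prot_size, PROT_READ);
}
```

The window between the two `mprotect` calls is where another thread could also write, which is why the real helpers are paired with synchronization.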
| #else | ||
| pExceptionInfo->ContextRecord->Rip++; | ||
| #endif | ||
| #if 0 |
There was a problem hiding this comment.
Was this "#if 0" left here accidentally? If it is deliberate, it would be better under something descriptive like #ifdef VERBOSE.
There was a problem hiding this comment.
Cool, I am changing it
| pExceptionInfo->ContextRecord->Rip++; | ||
| #endif | ||
| #if 0 | ||
| //prints information about exception |
| get_syscall_method() != SYSCALL_METHOD_INT) { | ||
| instr_t * i2 = INSTR_CREATE_nop(dcontext); | ||
| SYSLOG_INTERNAL(SYSLOG_WARNING, | ||
| "Changing interruption to nop at "PFX"\n", bb->instr_start); |
There was a problem hiding this comment.
I do not understand: why would we turn it into a nop?
There was a problem hiding this comment.
That was the first POC I could find.
But then, it is clearly not transparent to the application.
| if (pExceptionInfo->ExceptionRecord->ExceptionCode == EXCEPTION_ACCESS_VIOLATION) { | ||
| /* Go to next instruction. */ | ||
| #ifndef X64 | ||
| pExceptionInfo->ContextRecord->Eip++; |
There was a problem hiding this comment.
You're assuming the instruction is a single byte? Please explain in a comment.
There was a problem hiding this comment.
Well, judging by the program's verbose output, EIP had already been incremented by one before the exception handler was called.
I get exception code=c0000005 address=d21175 eip=d21176, info = 0, ffffffff
There was a problem hiding this comment.
An access violation will point directly at the faulting instruction. Something is off here, right?
There was a problem hiding this comment.
I do not fully understand what is going on, but the EIP printed from the context in the exception handler does not point to the faulting instruction; it points to that address + 1. I ran the program directly in this case (no DynamoRIO).
This looks like the behavior with int 2d, except that EIP gets incremented instead of ExceptionAddress.
Can you reproduce this behavior by running security-win32.except-int2e.exe?
|
After testing, it looks like the right solution is to prepare multiple routines for the different syscall cases. |
|
Here is another patch. |
|
run aarch64 tests |
|
run aarch64 tests |
|
I do not have an aarch64 platform to run tests. |
|
Ah, sorry for the confusion. I've added a machine to run the precommit suite on AArch64 for pull requests, and a build is triggered by commenting "run aarch64 tests". I only tried to trigger a build because of 91cf77d#diff-e1b5b4d92663976c0db6c7ccc36d578fR110, which might have an impact on ARM/AArch64. |
|
@fhahn -- it's probably best to document that "run aarch64 tests" and other info in the wiki |
|
Haven't looked at the code, but this confuses me:
We can't just ignore int 2e syscalls by the app, right, so I'm not sure what you mean? |
We consider them as interruptions, not syscalls. The code simply replaces in the good spots |
DR must see certain syscalls (control or memory related) for correct operation. Not monitoring some syscalls without knowing the syscall number and whether DR cares about it is unsafe. DR promises that clients will see all syscalls. Something like Dr. Memory or any taint tracker will have false positives and false negatives if it misses syscalls. |
|
Ok, so the test case is not complete enough and the patch is not good enough. |
|
I would think we should leave the app syscalls using their original gateway methods. DR does have support for multiple app syscall gateways on Linux: e.g., see get_do_int81_syscall_entry and EXIT_REASON_NI_SYSCALL_INT_0x81 and ditto for 82, and using both sysenter and int 0x80. Presumably we can do the same thing on Windows. The messy part is what to do when DR needs to fabricate a syscall that wasn't there in the app code. That's where there needs to be a single "primary" syscall method that DR will use. This is the syscall_method global state stuff. We just need to relax the checks there and pick one, like we do on Linux. |
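The split described above, where app syscalls keep their own gateway while fabricated ones use a single primary, might be sketched as follows. This is an illustration under stated assumptions, not DynamoRIO's code: `gateway_t`, `primary_gateway`, and the three functions are hypothetical names, and "first gateway seen becomes primary" is just one possible policy for picking one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical gateway kinds; these are not DynamoRIO's real enum values. */
typedef enum { GW_NONE, GW_INT2E, GW_SYSENTER, GW_SYSCALL, GW_WOW64 } gateway_t;

/* The single "primary" method, used only when a syscall must be fabricated. */
static gateway_t primary_gateway = GW_NONE;

/* Records a gateway seen in app code; the first one seen becomes primary. */
static void gateway_observe(gateway_t seen)
{
    assert(seen != GW_NONE);
    if (primary_gateway == GW_NONE)
        primary_gateway = seen;
}

/* App syscalls keep the gateway they were written with. */
static gateway_t gateway_for_app(gateway_t seen)
{
    gateway_observe(seen);
    return seen;
}

/* Syscalls that were not in the app code fall back to the primary method. */
static gateway_t gateway_for_fabricated(void)
{
    assert(primary_gateway != GW_NONE);
    return primary_gateway;
}
```

With this shape, seeing an int 2e after a sysenter does not change what fabricated syscalls use, and neither app syscall is rewritten.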
Copied code from int 81 handling on Linux. Checks syscall param_base readability before logging. Translates exception address in case exception is caused by int 2e. Fixes issue 2140
| #else | ||
| /* we assume only single method; else need multiple do_syscalls */ | ||
| ASSERT(new_method == get_syscall_method()); | ||
| /* There is usually a single method. */ |
There was a problem hiding this comment.
Check needed to be relaxed
| } | ||
| } | ||
| #ifdef WINDOWS | ||
| if (instr_get_opcode(bb->instr) == OP_int) { |
There was a problem hiding this comment.
Maybe it is not the best style
| cxt->CXT_XIP = (ptr_uint_t) dcontext->asynch_target; | ||
| /* now handle the fault just like RaiseException */ | ||
| DODEBUG({ known_source = true; }); | ||
| } else if ((app_pc) pExcptRec->ExceptionAddress == |
There was a problem hiding this comment.
A syscall can cause an exception (the test case does this).
So we need to translate the address from the do_syscall routine to the application address where the syscall happens.
There was a problem hiding this comment.
I don't follow: the syscall itself should not cause an exception: it would just return a failure return value, not throw an exception. Could you elaborate on why an exception would happen whose PC is the entry to our generated int 0x2e?
There was a problem hiding this comment.
Well, I do not know what it should do, but the syscall does throw an exception; see the test case.
I guess that this happens because the syscall parameters are invalid (0xdeadc0de).
There was a problem hiding this comment.
This doesn't make sense to me: this has never happened before. Invalid parameters result in the kernel returning an appropriate error code, not throwing an exception into user mode. Is this limited to WOW64 and this happens during the syscall argument marshalling in (64-bit) user mode and that's why there's an exception, because it's not yet in the kernel?
There was a problem hiding this comment.
But you're saying this exception is on an int 0x2e, not a WOW64 "syscall" gateway of the call* to the WOW64 library layer.
| sys_param(dcontext, param_base, 7)); | ||
| LOG(THREAD, LOG_SYSCALLS, 3, "\tparam 8: "PFX"\n", | ||
| sys_param(dcontext, param_base, 8)); | ||
| /* Needs to check readability before logging. */ |
There was a problem hiding this comment.
The check is needed because the program can set param_base to an invalid value (an unreadable address in memory), and that happens with the test case.
There was a problem hiding this comment.
Sure, though this method is racy as the memory could become unreadable concurrently with this code. The more robust method is a safe_read or try/except, but since this is just logging code this is ok if it makes the code simpler: but please use a comment stating we're aware of the race.
|
So, here is a new patch, inspired by int 81 handling on Linux. |
derekbruening
left a comment
There was a problem hiding this comment.
Looks pretty good, just a few requests/suggestions
| /* we assume only single method; else need multiple do_syscalls */ | ||
| ASSERT(new_method == get_syscall_method()); | ||
| /* There is usually a single method. */ | ||
| ASSERT_CURIOSITY(new_method == get_syscall_method()); |
There was a problem hiding this comment.
Since we handle it maybe there's no reason to print out a curiosity. But we still only handle combining syscall or sysenter with int 2e: we won't handle both syscall and sysenter (seems really unlikely of course), so maybe this should be a check that either the new or the old is int 2e, right?
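The narrower check suggested here, that a method change is acceptable only when either the old or the new method is int 2e, could look roughly like this. The enum and function names are hypothetical stand-ins for illustration; treating an uninitialized old method as always acceptable is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical method kinds; not DynamoRIO's actual SYSCALL_METHOD_* values. */
typedef enum { M_UNINIT, M_INT, M_SYSENTER, M_SYSCALL, M_WOW64 } sys_method_t;

/* True if new_m may be observed when old_m is already recorded.
 * Only combinations involving the int gateway are allowed to mix. */
static bool methods_can_mix(sys_method_t old_m, sys_method_t new_m)
{
    if (new_m == M_UNINIT)
        return false;
    if (old_m == M_UNINIT)
        return true; /* nothing recorded yet, so nothing to conflict with */
    return old_m == new_m || old_m == M_INT || new_m == M_INT;
}

/* Would replace an equality assert on the two methods. */
static void check_method_change(sys_method_t old_m, sys_method_t new_m)
{
    assert(methods_can_mix(old_m, new_m));
}
```

Mixing syscall with sysenter still trips the assert, which matches the remark that this combination is not handled.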
| /* FIXME i#1551: maybe we should union int80 and svc as both are inline syscall? */ | ||
|
|
||
| #ifdef WINDOWS | ||
| /* Like int 81 on Linux. */ |
There was a problem hiding this comment.
81 is on Mac, not Linux
| #ifdef WINDOWS | ||
| if (instr_get_opcode(bb->instr) == OP_int) { | ||
| if (instr_get_interrupt_number(bb->instr) == 0x2e) { | ||
| /* Special handling of int 2e when it is not the syscall method. */ |
There was a problem hiding this comment.
So the assumption is that syscall is seen first, and is the primary method? So that should be checked in check_syscall_method, right? B/c we won't handle the other order (right?)
There was a problem hiding this comment.
yes.
To be more precise, "int" is the second method. WOW64 can be the first method as well.
| PRE(ilist, instr, | ||
| instr_create_save_immed16_to_dcontext(dcontext, reason, | ||
| EXIT_REASON_OFFSET)); | ||
| } |
There was a problem hiding this comment.
I see several instances of this if..else sequence now for writing the exit reason: how about putting it into a local helper function and calling that from all the cases in this file?
There was a problem hiding this comment.
ok done with mangle_special_exit
| instr_create_save_immed16_to_dcontext(dcontext, reason, | ||
| EXIT_REASON_OFFSET)); | ||
| } | ||
| } |
There was a problem hiding this comment.
assert otherwise (i.e., if num is not 0x2e)
| LOG(THREAD, LOG_SYSCALLS, 3, "\tparam 8: "PFX"\n", | ||
| sys_param(dcontext, param_base, 8)); | ||
| /* Needs to check readability before logging. */ | ||
| if (is_readable_without_exception((byte *) param_base, 8 * sizeof(app_pc))) { |
There was a problem hiding this comment.
The check should be inside a DOLOG to avoid waste when logging is turned off
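A minimal sketch of the two review points on this logging code: gate the readability probe on the log level so it costs nothing when logging is off, and state the race in a comment. Everything here is a stand-in invented for illustration: `log_level`, `probe_calls`, `is_readable`, and `log_params` are hypothetical, and the probe only checks for NULL rather than querying real memory protections.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

static int log_level = 0;   /* hypothetical stand-in for the active log level */
static int probe_calls = 0; /* counts probes, to show when the check runs */

/* Stand-in probe: a real one would query the page protections. */
static bool is_readable(const void *p, size_t n)
{
    probe_calls++;
    return p != NULL && n > 0;
}

/* Logs up to count parameters and returns how many were printed. */
static size_t log_params(const unsigned long *param_base, size_t count)
{
    if (log_level < 3)
        return 0; /* logging is off: skip the probe entirely */
    /* Racy: the memory could become unreadable after this check.
     * Accepted here because this is logging-only code. */
    if (!is_readable(param_base, count * sizeof(*param_base)))
        return 0;
    for (size_t i = 0; i < count; i++)
        printf("\tparam %zu: 0x%lx\n", i, param_base[i]);
    return count;
}
```

A safe read or try/except would close the race, at the cost of more code than a log statement may justify.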
|
Here comes a new patch. The important point is that the syscall in the test case generates an exception (a read access violation), so DynamoRIO does need to translate the address when it catches an exception raised from a syscall routine. |
|
Maybe this exception has something to do with code in syscall.c |
Adds a simple test using int 2e for syscall, even if another method is used.
What is the best fix?
Allowing multiple syscall methods?