TOCTOU: Funny Name for a Serious Bug
What is TOCTOU
Time-of-check to time-of-use (TOCTOU) is a type of software bug that can lead to serious security vulnerabilities. At the time of writing, searching the keyword “TOCTOU” in the Common Vulnerabilities and Exposures (CVE) database returns 94 cases where a TOCTOU bug could be exploited maliciously. These cases include arbitrary code execution, privilege escalation, unintended file deletion, and many other exploits in widely used software.
TOCTOU is a specific type of race condition. For a full technical description, MITRE’s Common Weakness Enumeration (CWE-367) offers the following:
The software checks the state of a resource before using that resource, but the resource’s state can change between the check and the use in a way that invalidates the results of the check. This can cause the software to perform invalid actions when the resource is in an unexpected state.
In simpler terms, the program wants to do something but only if some condition is true.
    if (condition) {
        doSomething();
    }
However, a time-of-check to time-of-use bug may allow doSomething to execute while condition is no longer true. A TOCTOU bug occurs when another thread or process running in parallel changes the value of condition after the if check, but before the call to doSomething. The end result is that doSomething is called when condition is false, and this can lead to disastrous consequences.
Take, for example, a pseudocode program that prevents a checking account from being overdrawn.
    if (accountBalance() > 0) {
        withdrawMoney();
    }
Assume the account starts with $1. The account owner makes two transactions simultaneously, causing this code block to be executed twice in parallel. If both threads check accountBalance before either thread has withdrawn any money, both threads will see $1 in the account. Then both threads will withdraw money, likely causing the account to become overdrawn.
This is a time-of-check to time-of-use bug because the state (accountBalance) is changed between the check (accountBalance() > 0) and the use (withdrawMoney()).
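One way to close this window is to make the check and the withdrawal a single step that no other thread can interrupt. The sketch below is a minimal illustration, not a real banking API: the Account class, tryWithdraw, and concurrentWithdrawals are hypothetical names, and the check compares the balance against the requested amount (an assumption; the pseudocode above only checks for a positive balance).

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical account type for illustration; not part of the original example.
class Account {
public:
    explicit Account(long cents) : balance_(cents) {}

    // The check and the withdrawal happen under one lock, so no other thread
    // can change the balance between the time of check and the time of use.
    bool tryWithdraw(long cents) {
        std::lock_guard<std::mutex> guard(mutex_);
        if (cents <= 0 || balance_ < cents) {
            return false;  // invalid amount or insufficient funds
        }
        balance_ -= cents;
        return true;
    }

    long balance() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return balance_;
    }

private:
    mutable std::mutex mutex_;
    long balance_;
};

// Starts `threads` simultaneous withdrawals of `cents` each and
// returns how many of them succeeded.
int concurrentWithdrawals(Account& account, int threads, long cents) {
    std::vector<int> succeeded(threads, 0);  // one slot per thread, no sharing
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&account, &succeeded, i, cents] {
            succeeded[i] = account.tryWithdraw(cents) ? 1 : 0;
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    int total = 0;
    for (int s : succeeded) {
        total += s;
    }
    return total;
}
```

With an account holding 100 cents, any number of simultaneous 100-cent withdrawals should result in exactly one success and a final balance of zero, because the losing threads re-check the balance only after the winner has finished.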
Classic Example
Wikipedia gives an excellent classic example of a time-of-check to time-of-use bug. This example shows a race condition between multiple processes that allows an attacker to write to a protected file without permission.
Victim

    if (access("file", W_OK) != 0) {
        exit(1);
    }

    fd = open("file", O_WRONLY);
    // Actually writing over /etc/passwd
    write(fd, buffer, sizeof(buffer));

Attacker

    // After the access check
    symlink("/etc/passwd", "file");
    // Before the open,
    // "file" -> password database
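In this case, the vulnerable program intends to check whether the user can write to a file using access and then, if and only if the user has permission, open and write to the file. The access check matters because such a program typically runs with elevated privileges (for example, as a setuid binary), so open alone would succeed on files the user could not normally write.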
However, the permission check and the actual opening of the file are not atomic, so another process can run between the “check” of the permission and its “use”.
An attacker may be able to change the file to point to something that the user does not have permission to write, in this case /etc/passwd, but because the victim program’s call to access has already succeeded, it continues and opens the file anyway.
Assuming file was initially a symlink to a file that the attacker created and has write access to, the chain of events leading to the vulnerability is:
- The victim program calls access to see if the attacker has write permission to file
- The access check succeeds because file currently points to the text file created by the attacker
- In another process, the attacker changes file to point to /etc/passwd, which they do not have permission to write to
- The victim program calls open("file", O_WRONLY) and allows the attacker to write to /etc/passwd
Thus the attacker was able to write to the protected /etc/passwd file.
This bug’s root cause is a race condition involving the filesystem, where the victim program expected the execution to be atomic. Inter-process race conditions like this have led to many security vulnerabilities. However, TOCTOU is possible anywhere parallelism can occur.
Multi-threaded TOCTOU
Although the classic examples of TOCTOU usually show a race on the filesystem between different processes, TOCTOU can be just as dangerous in multi-threaded software.
Null Pointer Dereference
Consider the following example.
    Object *global;

    // Thread 1
    if (global != nullptr) {
        // null dereference
        auto value = *global;
    }

    // Thread 2
    global = nullptr;
In this example, the programmer attempted to avoid a nullptr dereference in thread one by only dereferencing Object *global if it is not null. However, as the check and dereference are not done atomically, thread two can set global to null after the check, but before the dereference. This results in thread one attempting to dereference nullptr.
One potential fix is to ensure that the check and use are made atomic and cannot be interleaved.
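    #include <pthread.h>

    Object *global;
    // Shared lock guarding every access to global
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    // Thread 1
    pthread_mutex_lock(&lock);
    if (global != nullptr) {
        auto value = *global;
    }
    pthread_mutex_unlock(&lock);

    // Thread 2
    pthread_mutex_lock(&lock);
    global = nullptr;
    pthread_mutex_unlock(&lock);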
Now the locks ensure that thread two cannot run between the check and the dereference on thread one, provided every access to global takes the same lock.
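The same idea can be written with the C++ standard library so that the lock is released automatically on every exit path. This is a minimal sketch under a few assumptions: Object is a placeholder struct (the example above does not define it), and readGlobal and setGlobal are hypothetical helper names.

```cpp
#include <mutex>

// Placeholder type; the original example does not define Object.
struct Object {
    int value;
};

std::mutex globalLock;     // guards every access to `global`
Object* global = nullptr;

// Thread 1's side: check and dereference under one lock.
// Copies the object into `out` and returns true if `global` was not null.
bool readGlobal(Object& out) {
    std::lock_guard<std::mutex> guard(globalLock);
    if (global == nullptr) {
        return false;
    }
    out = *global;  // safe: no other thread can change `global` here
    return true;
}

// Thread 2's side: change the pointer under the same lock.
void setGlobal(Object* replacement) {
    std::lock_guard<std::mutex> guard(globalLock);
    global = replacement;
}
```

Because std::lock_guard unlocks in its destructor, an early return cannot leave the mutex held, which is an easy mistake to make with paired lock and unlock calls. Copying the object out while the lock is held also means the caller never touches the shared pointer outside the critical section.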
Although this particular case may seem relatively straightforward, this pattern can lead to serious security vulnerabilities in real software.
TOCTOU in Windows
A time-of-check to time-of-use race triggered a critical use-after-free vulnerability in Windows XP. The vulnerability allowed attackers to crash the system and potentially even execute arbitrary code with elevated privileges.
Based on the description, a rough recreation of what the vulnerability might have looked like is shown below.
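    #include <list>

    struct Procedure {
        bool isProcessing;
        void (*function)(); // function pointer
    };

    std::list<Procedure*> pendingProcedures;

    void* workerThread(void* arg) {
        while (!pendingProcedures.empty()) {
            auto it = pendingProcedures.begin();
            while (it != pendingProcedures.end()) {
                Procedure* proc = *it;
                // Time of check: is another thread already processing proc?
                if (proc->isProcessing) {
                    ++it; // skip it; without this the loop would never advance
                    continue;
                }
                // Time of use: "acquire" proc (not atomic with the check above)
                proc->isProcessing = true;

                // Process Procedure
                proc->function();

                // Update "it" to next procedure in list
                // Remove proc from pendingProcedures
                it = pendingProcedures.erase(it);
                delete proc;
            }
        }
        return nullptr;
    }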
First, notice there is a Procedure struct that contains a flag isProcessing. Next, there is a list of Procedure* called pendingProcedures. Lastly, there is a function called workerThread that loops over the list of pending Procedures and processes them.
The worker thread searches for a procedure that is not processing at the line if (proc->isProcessing). Once a procedure with isProcessing set to false is found, the thread “acquires” the procedure by setting isProcessing to true.
The trouble here is a race on the isProcessing flag. Multiple worker threads can acquire the same procedure through the following chain of events.
    // Thread 1
    auto proc = *it;              // proc = 0xabc123
    if (proc->isProcessing) ...   // flag is still false
    proc->isProcessing = true;
    // Process proc
    // ...
    delete proc;

    // Thread 2
    auto proc = *it;              // proc = 0xabc123
    if (proc->isProcessing) ...   // flag is still false
    proc->isProcessing = true;
    // proc has already been deleted
    // Dangerous Use After Free!
    proc->function();
When both worker threads acquire the same procedure for processing, one thread may process the procedure after another thread has already called delete proc. This causes a use-after-free error, which attackers can potentially exploit to execute arbitrary code.
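The race on the flag itself can be removed by checking and setting it in one atomic operation. The sketch below is an illustration under stated assumptions, not the actual Windows fix: it changes isProcessing to a std::atomic<bool>, and tryClaim and raceToClaim are hypothetical helper names.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Variant of the Procedure struct with an atomic flag
// (an assumption of this sketch, not the original layout).
struct Procedure {
    std::atomic<bool> isProcessing{false};
    void (*function)() = nullptr;
};

// Checks and sets the flag in a single atomic step. exchange returns the
// previous value, so only the one caller that saw `false` gets `true` back.
bool tryClaim(Procedure& proc) {
    return !proc.isProcessing.exchange(true);
}

// Demo helper: `threads` workers race to claim the same procedure.
// Returns how many of them won the claim.
int raceToClaim(Procedure& proc, int threads) {
    std::atomic<int> winners{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&proc, &winners] {
            if (tryClaim(proc)) {
                if (proc.function != nullptr) {
                    proc.function();  // only the winner runs the procedure
                }
                ++winners;
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return winners.load();
}
```

This only fixes the claim step. In the recreated worker loop, a thread could still hold a pointer to a Procedure that another thread has already deleted, and the shared list is modified without synchronization, so the traversal, the claim, and the removal would also need to be protected, for example by a mutex around the list.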
Preventing TOCTOU
Despite the dire consequences, there is no consensus on how to detect and prevent TOCTOU bugs reliably.
For inter-process and filesystem-level TOCTOU race conditions, file locks, transactional operating systems, and other approaches have been proposed, but none has yet emerged as the de facto solution.
Multi-threaded TOCTOU bugs seem to be even more difficult to detect and prevent. Despite a wealth of research, there are very few, if any, production-ready tools for detecting TOCTOU bugs in concurrent programs. Tools like Valgrind or Intel Inspector may be able to detect the side effects of a TOCTOU bug (e.g. a use-after-free, race condition, or double delete), but neither can detect the TOCTOU bug directly. However, newer tools like Coderrect’s code scanner offer some support for detecting TOCTOU bugs directly, as well as other types of concurrency bugs.
Overall, it seems that for now the best approach for preventing TOCTOU is developer awareness. As more developers become familiar with TOCTOU-style race conditions, they will be less likely to inadvertently allow TOCTOU bugs into critical code. Nonetheless, mistakes are inevitable, and for those cases, we can rely on tools like Coderrect, Valgrind, and Intel Inspector to assist developers in detecting problems in their code.