Detecting Races in Redis

Redis is a popular in-memory database that persists on disk. Redis has been heavily adopted and is widely used. The GitHub repository has over 43,000 stars, over 18,000 questions have been tagged with [redis] on StackOverflow, and there are various meetups and conferences for Redis every year.

This tutorial assumes that you have gone through one of the three starter case tutorials and have successfully run Coderrect.


Detect the race

Redis is easy to build. To detect races on redis, simply run the following:

git clone -b 6.0.0 https://github.com/antirez/redis.git
cd redis
coderrect -t -o report -e redis-server,redis-benchmark make -j

The analysis time for each project will be printed to the terminal as is shown below:

Analyzing /path/to/redis/src/redis-benchmark ...
 - ✔ [00m:01s] Loading IR From File                    
 - ✔ [00m:02s] Running Compiler Optimization Passes (Phase I)                                              
 - ✔ [00m:00s] Canonicalizing Loops                    
 - ✔ [00m:00s] Propagating Constants                     
 - ✔ [00m:00s] Running Compiler Optimization Passes (Phase II)                                               
 - ✔ [00m:00s] Running Pointer Analysis                        
 - ✔ [00m:09s] Building Static Happens-Before Graph                                    
 - ✔ [00m:11s] Detecting Races               
 - ✔ [00m:00s] Scanning for additional OpenMP Regions                                                                        

Analyzing /path/to/redis/src/redis-server ...
 - ✔ [00m:04s] Loading IR From File                    
...


----------------------------The summary of races in redis-server------------------------

		12 shared data races

----------------------------The summary of races in redis-benchmark------------------------

		 3 shared data races

Coderrect provide an option to list all potential target binaries and specify which binaries should be analyzed with the following command.

coderrect make

Coderrect will automatically detect which binaries were built using make and list the potential targets to be analyzed as shown below.

The project creates multiple executables. Please select one from the list below
to detect races. 

In the future, you can specify the executable using the option "-e" if you know 
which ones you want to analyze. 

    coderrect -e your_executable_name1,your_executable_name2 your_build_command_line

 1) src/redis-benchmark
 2) src/redis-cli
 3) src/redis-server
 4) deps/lua/src/luac
 5) deps/lua/src/lua

Please select binaries by entering their ordinal numbers (e.g. 1,2,6):1,2,3

In addition, Coderrect allow users to analyze all generated binaries by using -analyzeAllBinaries flag to command line. Coderrect will skip the asking stage and automatically analyze all binaries.

coderrect -analyzeAllBinaries make

Interpret the Results

The coderrect tool reports a quick summary of the most interesting races directly in the terminal for quick viewing when run with -t. The tool also generates a more comprehensive report that can be viewed in the browser.


Terminal Report

Next we will explain an example of a race reported in redis-server, as well as how to debug the race based on the provided information:

==== Found a race between: 
line 888, column 9 in debug.c AND line 885, column 16 in debug.c
Shared variable:
 at line 72 of server.c
 72|struct redisServer server; /* Server global state */
Thread 1:
 886|        serverLogRaw(LL_WARNING|LL_RAW,
 887|        "nn=== REDIS BUG REPORT START: Cut & paste starting from here ===n");
>888|        server.bug_report_start = 1;
 889|    }
 890|}
>>>Stacktrace:
>>>pthread_create [bio.c:123]
>>>  bioProcessBackgroundJobs [bio.c:123]
>>>    lazyfreeFreeObjectFromBioThread [bio.c:209]
>>>      decrRefCount [lazyfree.c:130]
>>>        freeListObject [object.c:365]
>>>          _serverPanic [object.c:291]
>>>            bugReportStart [debug.c:873]
Thread 2:
 883|
 884|void bugReportStart(void) {
>885|    if (server.bug_report_start == 0) {
 886|        serverLogRaw(LL_WARNING|LL_RAW,
 887|        "nn=== REDIS BUG REPORT START: Cut & paste starting from here ===n");
>>>Stacktrace:
>>>pthread_create [networking.c:2942]
>>>  IOThreadMain [networking.c:2942]
>>>    _serverAssert [networking.c:2891]
>>>      bugReportStart [debug.c:791]

At a glance we can quickly see that the race is between the lines 885 and 888 in the debug.c file. Line 888 writes to server.bug_report_start and line 885 reads server.bug_report_start. To confirm this race we need to show 1) both accesses can be made in parallel, and 2) the parallel accesses are being made to the same server.bug_report_start object.

884|void bugReportStart(void) {
885|  if (server.bug_report_start == 0) {
886|    serverLogRaw(LL_WARNING|LL_RAW,
887|    "nn=== REDIS BUG REPORT START: ...n");
888|    server.bug_report_start = 1;
889|  }
890|}

To prove that 1) both accesses can be made in parallel, we need to show that thread 1 and 2 can both call the bugReportStart function at the same time. We can check the stack trace to see if there are any synchronizations between the threads after they are spawned.

Thread 1
>>>Stacktrace:
>>>pthread_create [bio.c:123]
>>>  bioProcessBackgroundJobs [bio.c:123]
>>>    lazyfreeFreeObjectFromBioThread [bio.c:209]
>>>      decrRefCount [lazyfree.c:130]
>>>        freeListObject [object.c:365]
>>>          _serverPanic [object.c:291]
>>>            bugReportStart [debug.c:873]
Thread 2:
>>>Stacktrace:
>>>pthread_create [networking.c:2942]
>>>  IOThreadMain [networking.c:2942]
>>>    _serverAssert [networking.c:2891]
>>>      bugReportStart [debug.c:791]

The first stack trace shows that thread 1 is spawned at bio.c:123 inside of the bioInit function. The second stack trace shows that thread 2 is spawned at networking.c:2942 in the initThreadedIO function. By looking one level up the stack trace, we can see that both bioInit and initThreadedIO are called one after the other from InitServerLast in server.c.

void InitServerLast() {
    bioInit();
    initThreadedIO();

If we check the body of the bioInit function we see there are no synchronizations or joins after thread 1 is spawned. This means thread 1 will continue to execute even after the main thread has returned from bioInit and entered initThreadIO. Then, while in initThreadIO the main thread spawns thread 2 in parallel with thread 1. Thus we have shown that both threads do indeed run in parallel.

Next we must show that 2) the parallel accesses are being made to the same server.bug_report_start object. The report also shows us the creation site of the shared object on which the race occurs (in this case server).

Shared variable:
 at line 72 of server.c

By inspecting server.c:72 we see that server is actually a global variable! This means that that both threads are indeed accessing the same object in memory.

72| struct redisServer server; /* Server global state */

So, by showing that 1) both accesses can be made in parallel and 2) the parallel accesses are being made to the same server.bug_report_start object we have confirmed this race in redis.


HTML Report

The terminal is great to get a quick idea about what races are reported, but a full and more detailed report can be viewed in the browser.

When we initially ran coderrect we included the -o report flag. This created a directory named report and a file named index.html within that directory. To view the full report open the index.html file in a browser.

Depending on your system this can be done through the terminal with some command. E.g. browse report/index.html on Gnome based systems or open report/index.html on Mac.