Detect races in an open source project: Memcached

This tutorial assumes that you have gone through one of the three starter case tutorials and have successfully run Coderrect.

Coderrect detected 3 new races in memcached, 2 of which were confirmed. To detect the reported races yourself using coderrect, clone memcached and check out the affected commit using the following commands.

$ git clone https://github.com/memcached/memcached.git
$ cd memcached
$ git checkout 82029ecc9b3dd0f57b3f9ab9761f44714cceed6f

Detect the race

  1. Build memcached using coderrect
# install dependencies
$ apt install libevent-dev
# configure memcached
$ ./autogen.sh && ./configure
# build memcached using coderrect
$ coderrect -t make

The coderrect -t make command will compile and analyze the program automatically.

  2. Detect races using coderrect

After compilation, coderrect automatically detects and lists all the potential targets to be analyzed, as follows:

1) timedrun
2) sizes
3) memcached-debug
4) memcached
5) testapp
Please select binaries by entering their ordinal numbers (e.g. 1,2,6):

Enter 3 to select memcached-debug as the target, which detects races in the debug version of memcached.


Interpret the Results

The coderrect tool generates a comprehensive report that can be viewed in a browser.

HTML Report

To view the full report, open '.coderrect/report/index.html' in your browser.

The HTML report looks like the following picture.

Terminal Report

To get a quick overview of the detected races, coderrect can also print a summary of the most interesting races in the terminal when run with the -t flag (see the full list of coderrect options for other flags). The terminal report looks like the following:

==== Found a race between: 
line 162, column 5 in crawler.c AND line 1464, column 16 in items.c
Shared variable:
 at line 1577 of items.c
 1577|        calloc(1, sizeof(struct crawler_expired_data));
Thread 1:
 160|    pthread_mutex_lock(&d->lock);
 161|    d->crawlerstats[slab_cls].end_time = current_time;
>162|    d->crawlerstats[slab_cls].run_complete = true;
 163|    pthread_mutex_unlock(&d->lock);
 164|}
>>>Stacktrace:
>>>pthread_create [crawler.c:505]
>>>  item_crawler_thread [crawler.c:505]
>>>    lru_crawler_class_done [crawler.c:378]
>>>      crawler_expired_doneclass [crawler.c:350]
Thread 2:
 1462|        crawlerstats_t *s = &cdata->crawlerstats[i];
 1463|        /* We've not successfully kicked off a crawl yet. */
>1464|        if (s->run_complete) {
 1465|            char *lru_name = "na";
 1466|            pthread_mutex_lock(&cdata->lock);
>>>Stacktrace:
>>>pthread_create [items.c:1703]
>>>  lru_maintainer_thread [items.c:1703]
>>>    lru_maintainer_crawler_check [items.c:1647]

Each reported race starts with a summary of where the race was found.

==== Found a race between: 
line 162, column 5 in crawler.c AND line 1464, column 16 in items.c

Next, the report shows the name and location of the variable on which the race occurs.

Shared variable:
 at line 1577 of items.c
 1577|        calloc(1, sizeof(struct crawler_expired_data));

For example, the above result shows that the race occurs on the variable allocated at line 1577 of items.c.

Next, the tool reports the two unsynchronized accesses to the shared variable. For each access, it shows the code snippet and the stack trace.

Since finding the reported location in the code may be a little tedious, the report shows a preview of the file at that location.

Thread 1:
 160|    pthread_mutex_lock(&d->lock);
 161|    d->crawlerstats[slab_cls].end_time = current_time;
>162|    d->crawlerstats[slab_cls].run_complete = true;
 163|    pthread_mutex_unlock(&d->lock);
 164|}

The code snippet shows that the race is on the variable crawlerstats[slab_cls].run_complete. Coderrect also shows the stack trace that leads to the racing access, which makes the race easier to validate.

>>>Stacktrace:
>>>pthread_create [crawler.c:505]
>>>  item_crawler_thread [crawler.c:505]
>>>    lru_crawler_class_done [crawler.c:378]
>>>      crawler_expired_doneclass [crawler.c:350]
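
To make the reported pattern concrete, here is a minimal C sketch of the same shape of access: one thread sets a flag while holding a mutex (as in Thread 1), and another code path tests the flag before taking the mutex (as in Thread 2). The struct and function names (stats, writer, read_unlocked, read_locked, run_demo) are hypothetical and are not memcached's real definitions. Reading the flag under the same lock is shown only as one possible way to synchronize the access; it is an assumption for illustration, not necessarily the fix adopted upstream.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the crawler stats in the report. */
struct stats {
    pthread_mutex_t lock;
    bool run_complete;
};

/* Mirrors Thread 1: the flag is written while the lock is held. */
static void *writer(void *arg) {
    struct stats *s = arg;
    pthread_mutex_lock(&s->lock);
    s->run_complete = true;
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Mirrors Thread 2: the flag is read WITHOUT the lock. If this ran
 * concurrently with writer(), the read and the write would race. */
bool read_unlocked(struct stats *s) {
    return s->run_complete;
}

/* One possible synchronized variant: read the flag under the same lock. */
bool read_locked(struct stats *s) {
    pthread_mutex_lock(&s->lock);
    bool done = s->run_complete;
    pthread_mutex_unlock(&s->lock);
    return done;
}

/* Runs the writer thread to completion, then reads the flag safely. */
bool run_demo(void) {
    struct stats s;
    pthread_t t;

    s.run_complete = false;
    if (pthread_mutex_init(&s.lock, NULL) != 0)
        return false;
    if (pthread_create(&t, NULL, writer, &s) != 0) {
        pthread_mutex_destroy(&s.lock);
        return false;
    }
    pthread_join(t, NULL);   /* the writer has finished after this point */

    bool done = read_locked(&s);
    pthread_mutex_destroy(&s.lock);
    return done;
}
```

Calling run_demo() from a main function and compiling with the -pthread option exercises only the synchronized path. Replacing read_locked with read_unlocked and removing the pthread_join before the read would reproduce the unsynchronized access pattern that the report describes.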