OpenScan by Coderrect

a project to scan open source software


Use Coderrect Scanner on Open Source Projects


We automatically scan a group of open source projects for multi-threaded issues and report findings to their developers.

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker.

Coderrect Scanner found and subsequently confirmed with the development team a data race in Redis v6.07.

Data race on server.master_repl_offset between the main thread and an IO thread.


To reproduce
==== Found a race between:
line 4988 in server.c AND line 164 in replication.c
Shared variable: at line 72 of server.c
72|struct redisServer server; /* Server global state */

Thread 1:
4986| {
4987| memcpy(server.replid,rsi.repl_id,sizeof(server.replid));
>4988| server.master_repl_offset = rsi.repl_offset;
4989| /* If we are a slave, create a cached master from this
4990| * information, in order to allow partial resynchronizations*/

>>>Stacktrace:
>>>main
>>> loadDataFromDisk [server.c:5272]

Thread 2:
162| unsigned char *p = ptr;
163|
>164| server.master_repl_offset += len;
165|
166| /* This is a circular buffer, so write as much data we can at every*/
>>>Stacktrace:
>>>pthread_create [networking.c:3016]
>>> IOThreadMain [networking.c:3016]
>>> readQueryFromClient [networking.c:2979]
>>> processInputBuffer [networking.c:1996]
>>> processGopherRequest [networking.c:1886]
>>> lookupKeyRead [gopher.c:53]
>>> lookupKeyReadWithFlags [db.c:146]
>>> expireIfNeeded [db.c:101]
>>> propagateExpire [db.c:1311]
>>> replicationFeedSlaves [db.c:1233]
>>> feedReplicationBacklog [replication.c:270]

Inside main in server.c, the function InitServerLast creates the IO threads, which can eventually call feedReplicationBacklog and write to server.master_repl_offset through the call stack shown for thread 2:
IOThreadMain -> readQueryFromClient -> ... -> replicationFeedSlaves -> feedReplicationBacklog

Meanwhile, immediately after the main thread finishes spawning the IO threads and returns from InitServerLast, it calls loadDataFromDisk, which also writes to server.master_repl_offset. Nothing orders the two writes, so they can race.
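The pattern above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions, not the change made in Redis: the startup write is done before the worker thread is created, and every later update takes the same mutex. The names repl_offset, offset_lock, io_worker, and run_offset_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for server.master_repl_offset and a lock guarding it. */
static long long repl_offset = 0;
static pthread_mutex_t offset_lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker thread: adds len to the offset 1000 times, always under the lock. */
static void *io_worker(void *arg) {
    long long len = *(long long *)arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&offset_lock);
        repl_offset += len;
        pthread_mutex_unlock(&offset_lock);
    }
    return NULL;
}

/* Main-thread side. Returns the final offset, or -1 if the thread could not start. */
long long run_offset_demo(long long loaded_offset, long long len) {
    pthread_t tid;
    assert(len >= 0);

    /* Startup write: done before the worker exists, so it cannot race with it. */
    pthread_mutex_lock(&offset_lock);
    repl_offset = loaded_offset;
    pthread_mutex_unlock(&offset_lock);

    if (pthread_create(&tid, NULL, io_worker, &len) != 0)
        return -1;

    /* Later main-thread updates use the same lock as the worker. */
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&offset_lock);
        repl_offset += len;
        pthread_mutex_unlock(&offset_lock);
    }

    pthread_join(tid, NULL);
    return repl_offset; /* safe to read: the worker has been joined */
}
```

Because the assignment happens before pthread_create and both increments share one lock, the result is always loaded_offset plus 2000 times len.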

covid-sim is the CovidSim COVID-19 microsimulation model developed by the MRC Centre for Global Infectious Disease Analysis, hosted at Imperial College London.

Coderrect Scanner found and subsequently confirmed with the development team two data races.

The first race is on nEvents. There is a write guarded by a critical section at Update.cpp:215 and an unguarded read at Update.cpp:145.

==== Found a race between:
line 215, column 3 in src/Update.cpp AND line 145, column 8 in src/Update.cpp

Shared variable: at line 287 of src/SetupModel.cpp
287| if (!(nEvents = (int*)calloc(1, sizeof(int)))) ERR_CRITICAL("Unable to allocate events storage\n");

Thread 1:
213|
214| //increment the index of the infection event
>215| (*nEvents)++;
216| }
217|
>>>Stack Trace:
>>>DoInfect(int, double, int, int) [src/Sweep.cpp:717]
>>> RecordEvent(double, int, int, int, int) [src/Update.cpp:147]

Thread 2:
143| if (P.DoRecordInfEvents)
144| {
>145| if (*nEvents < P.MaxInfEvents)
146| {
147| RecordEvent(t, ai, run, 0, tn); //added int as argument to RecordEvent to record run number: ggilani - 15/10/14
>>>Stack Trace:
>>>DoInfect(int, double, int, int) [src/Sweep.cpp:717]

The OpenMP region this bug occurs:
/git/covid-sim/src/Sweep.cpp
>701|#pragma omp parallel for schedule(static,1) default(none) \
702| shared(t, run, P, StateT, Hosts, ts)
703| for (int j = 0; j < P.NumThreads; j++)
704| {
705| for (int k = 0; k < P.NumThreads; k++)
706| {
Gets called from:
>>>main
>>> RunModel(int) [src/CovidSim.cpp:409]

The second race we found is on State.cumV.
There is a write guarded by a critical section at Update.cpp:1271 and there are unguarded reads at Update.cpp:617 and Update.cpp:1262.

==== Found a race between:
line 1271, column 3 in src/Update.cpp AND line 617, column 59 in src/Update.cpp
Shared variable: State at line 74 of src/CovidSim.cpp
74|popvar State, StateT[MAX_NUM_THREADS];

Thread 1:
1269|
1270|#pragma omp critical (state_cumV)
>1271| State.cumV++;
1272| if (P.VaccDosePerDay >= 0)
1273| {
>>>Stack Trace:
>>>DoFalseCase(int, double, unsigned short, int) [src/Sweep.cpp:710]
>>> DoDetectedCase(int, double, unsigned short, int) [src/Update.cpp:886]
>>> DoVacc(int, unsigned short) [src/Update.cpp:621]

Thread 2:
615| if (P.DoHouseholds)
616| {
>617| if ((!P.DoMassVacc) && (t >= P.VaccTimeStart) && (State.cumV < P.VaccMaxCourses))
618| if ((t < P.VaccTimeStart + P.VaccHouseholdsDuration) && ((P.VaccPropCaseHouseholds == 1) || (ranf_mt(tn) < P.VaccPropCaseHouseholds)))
619| {
>>>Stack Trace:
>>>DoFalseCase(int, double, unsigned short, int) [src/Sweep.cpp:710]
>>> DoDetectedCase(int, double, unsigned short, int) [src/Update.cpp:886]

The OpenMP region this bug occurs:
/covid-sim/src/Sweep.cpp
>697|#pragma omp parallel for private(i,k) schedule(static,1)
698| for (j = 0; j < P.NumThreads; j++)
699| {
700| for (k = 0; k < P.NumThreads; k++)
701| {
702| for (i = 0; i < StateT[k].n_queue[j]; i++)
Gets called from:
>>>main
>>> RunModel(int) [src/CovidSim.cpp:413]
>>> InfectSweep(double, int) [src/CovidSim.cpp:3062]
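
For the State.cumV report above, one possible remedy is to perform the limit check inside the same named critical section as the increment. The following is a sketch under that assumption, not the project's actual code: cum_v, try_vaccinate, and run_vacc_demo are hypothetical names. It is meant to be compiled with an OpenMP flag such as -fopenmp; without one the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>

/* Hypothetical stand-in for State.cumV. */
static int cum_v = 0;

/* Check the cap and count the course inside one named critical section.
 * Returns 1 if a course was counted, 0 if the cap was already reached. */
int try_vaccinate(int max_courses) {
    int given = 0;
    #pragma omp critical (state_cumV)
    {
        if (cum_v < max_courses) {
            cum_v++;
            given = 1;
        }
    }
    return given;
}

/* Hypothetical driver: offers a course to each candidate in parallel and
 * returns how many courses were actually counted. */
int run_vacc_demo(int candidates, int max_courses) {
    assert(candidates >= 0 && max_courses >= 0);
    cum_v = 0;
    int total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < candidates; i++)
        total += try_vaccinate(max_courses);
    return total;
}
```

Since the read and the write share the critical section, the count can never pass max_courses, whatever the thread interleaving.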
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.

Coderrect Scanner found and subsequently confirmed with the development team two data races.

Detailed reports:

Thread 1:
160| pthread_mutex_lock(&d->lock);
161| d->crawlerstats[slab_cls].end_time = current_time;
>162| d->crawlerstats[slab_cls].run_complete = true;
163| pthread_mutex_unlock(&d->lock);
164|}
>>>Stacktrace:
>>>pthread_create [crawler.c:505]
>>> item_crawler_thread [crawler.c:505]
>>> lru_crawler_class_done [crawler.c:378]
>>> crawler_expired_doneclass [crawler.c:350]

Thread 2:
1462| crawlerstats_t *s = &cdata->crawlerstats[i];
1463| /* We've not successfully kicked off a crawl yet. */
>1464| if (s->run_complete) {
1465| char *lru_name = "na";
1466| pthread_mutex_lock(&cdata->lock);
>>>Stacktrace:
>>>pthread_create [items.c:1703]
>>> lru_maintainer_thread [items.c:1703]
>>> lru_maintainer_crawler_check [items.c:1647]

Although the write at crawler.c:162 is protected by the lock, the first read at items.c:1464 is not protected by the same lock (the lock is acquired two lines later, at line 1466).
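
A minimal sketch of the usual remedy for this pattern, assuming a reduced stats record rather than memcached's real structures: the reader takes the lock before its first look at the flag. The type crawl_stats and the three functions are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, reduced version of a per-class crawler stats record. */
typedef struct {
    pthread_mutex_t lock;
    bool run_complete;
    long end_time;
} crawl_stats;

void crawl_stats_init(crawl_stats *s) {
    assert(s != NULL);
    pthread_mutex_init(&s->lock, NULL);
    s->run_complete = false;
    s->end_time = 0;
}

/* Writer side: mark the crawl finished while holding the lock. */
void crawl_mark_done(crawl_stats *s, long now) {
    assert(s != NULL);
    pthread_mutex_lock(&s->lock);
    s->end_time = now;
    s->run_complete = true;
    pthread_mutex_unlock(&s->lock);
}

/* Reader side: take the same lock before reading the flag. If the crawl
 * is complete, copies end_time out and returns true. */
bool crawl_poll_done(crawl_stats *s, long *end_time) {
    assert(s != NULL && end_time != NULL);
    pthread_mutex_lock(&s->lock);
    bool done = s->run_complete;
    if (done)
        *end_time = s->end_time;
    pthread_mutex_unlock(&s->lock);
    return done;
}
```

The cost is one extra lock acquisition per poll, in exchange for a read that is ordered with respect to the writer.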

Similar issues were observed in logger.c:

Thread 1:
171| for (l = logger_stack_head; l != NULL; l=l->next) {
172| pthread_mutex_lock(&l->mutex);
>173| l->eflags = f;
174| pthread_mutex_unlock(&l->mutex);
175| }
>>>Stacktrace:
>>>pthread_create [logger.c:562]
>>> logger_thread [logger.c:562]
>>> logger_thread_read [logger.c:536]
>>> logger_thread_write_entry [logger.c:377]
>>> logger_thread_poll_watchers [logger.c:304]
>>> logger_thread_close_watcher [logger.c:462]
>>> logger_set_flags [logger.c:346]

Thread 2:
1510| break;
1511| }
>1512| LOGGER_LOG(l, LOG_SYSEVENTS, LOGGER_CRAWLER_STATUS, NULL,
1513| CLEAR_LRU(i),
1514| lru_name,
>>>Stacktrace:
>>>pthread_create [items.c:1703]
>>> lru_maintainer_thread [items.c:1703]
>>> lru_maintainer_crawler_check [items.c:1647]

Here a lock is acquired at logger.c:172 before l->eflags is updated, while the LOGGER_LOG macro can read the flag without holding the lock:
#define LOGGER_LOG(l, flag, type, ...) \
    do { \
        logger *myl = l; \
        if (l == NULL) \
            myl = GET_LOGGER(); \
        if (myl->eflags & flag) \
            logger_log(myl, type, __VA_ARGS__); \
    } while (0)
"The Linux kernel, developed by contributors worldwide, is a free and open-source, monolithic, modular (i.e., it supports the insertion and removal at runtime of loadable kernel objects), Unix-like operating system kernel." - from Wikipedia

Coderrect Scanner found and subsequently confirmed with the development team three data races.

There are three races in total, all of which can occur when the settimeofday system call is invoked concurrently:

1. The first race is on firsttime in do_sys_settimeofday64() in kernel/time/time.c. The read of firsttime at line 188 and the write at line 189 are not protected by a lock.

2. The second race is on persistent_clock_is_local in timekeeping_warp_clock() in kernel/time/timekeeping.c. The write at line 1337 can race with itself if settimeofday is called concurrently.

3. The third race is on vdata[CS_HRES_COARSE] in update_vsyscall_tz() in kernel/time/vsyscall.c. The writes at lines 125 and 126 can race with themselves if settimeofday is called concurrently.

Submit your project information to add your open source project to this effort