The most common race condition of that type that I see on this site relates to the handling of process IDs (PIDs) and signalling based on these. Someone may use ps+grep to get the PID of some named process, and then kill to signal it. Between getting the PID and signalling it, the process may have exited, and the PID may even have been reassigned to an unrelated process. Using pkill narrows the window in which this bug can happen, because it looks up and signals the process in a single step.
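As a rough illustration of the difference, here is a minimal sketch. It assumes a pkill that supports the -x and -P options (as the procps and BSD versions do), and it uses a background sleep as a hypothetical stand-in for the named process:

```shell
#!/bin/sh
# Hypothetical stand-in for a long-running named process.
sleep 30 &
child=$!

# Give the background child a moment to exec, so its name is "sleep".
sleep 1

# Racy two-step approach (shown for contrast, not executed):
#   pid=$(ps -e -o pid= -o comm= | awk '$2 == "sleep" { print $1 }')
#   kill "$pid"    # by now the PID may be stale or reused

# pkill does the lookup and the signalling in one invocation and sends
# SIGTERM by default. -x matches the process name exactly; -P "$$" limits
# the match to children of this shell, so unrelated sleeps are left alone.
pkill -x -P "$$" sleep
pkill_status=$?

# Reap the child; a process killed by SIGTERM is reported as 128 + 15.
wait "$child"
wait_status=$?
echo "pkill exit status: $pkill_status, child exit status: $wait_status"
```

Restricting the match with -P (or -u for a user) is a design choice for the sketch: matching by name alone can hit more processes than intended.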
In other scenarios, holding on to a PID (for example, in a "PID file") and expecting it to refer to the same process over an extended length of time could cause signals to be sent to the wrong process. This is because all Unix systems reuse PIDs.
Another common issue is file locking, i.e. using the filesystem to provide a locking mechanism for multi-process synchronization and critical sections. One may, for example, test whether some "lock file" exists, and if it doesn't, create it and thereby "get the lock". Between checking for the file's existence and creating the lock file, there is a window of opportunity for some other process to also see that the lock file doesn't exist, so both processes end up believing that they hold the lock:
while [ -e "$lockfile" ]; do
    sleep 10
done
touch "$lockfile"
echo 'got lock' # or did I?
# do work
rm -f "$lockfile"
The solution in this case is to use a dedicated file-locking tool like flock, or to use a lock directory, since mkdir is atomic:
while ! mkdir "$lockdir" 2>/dev/null; do
    sleep 10
done
echo 'got lock'
# do work
rmdir "$lockdir"
This would still not work reliably on network filesystems such as NFS, as these may not guarantee atomic directory creation.
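For comparison, a flock-based variant might look like the sketch below. It assumes the flock utility from util-linux (so it is Linux-specific), and the lock file path and the choice of file descriptor 9 are placeholders picked for illustration:

```shell
#!/bin/sh
# Hypothetical lock file path; any path that all cooperating scripts
# agree on will do.
lockfile=${TMPDIR:-/tmp}/flock-demo.lock

# Open the lock file on file descriptor 9 (an arbitrary free descriptor).
exec 9>"$lockfile"

# Block until an exclusive lock on descriptor 9 is granted.
flock -x 9
echo 'got lock'

# While we hold the lock, a non-blocking attempt by another process fails.
if flock -n "$lockfile" true; then held=no; else held=yes; fi
echo "lock held by us: $held"

# do work

# Closing the descriptor releases the lock. The same happens when the
# script exits or is killed, so no stale lock is left behind.
exec 9>&-
```

Unlike the lock file and lock directory loops, there is no cleanup step that can be skipped if the script dies: the lock is tied to the open file descriptor, not to the existence of the file.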
There are doubtless many other examples.