Trusted answers to developer questions

Related Tags

c#
communitycreator

A guide to writing real-life thread-safe code

Adora Nwodo

When writing code that will be used by multiple threads simultaneously, it is important to make sure that the code is thread-safe so that your application functions properly. In this shot, I will explain thread-safety by showing real-life, thread-unsafe code and ways to make it thread-safe.

How do we know if code is thread-safe?

We can tell that code is thread-safe if it only uses and updates shared resources in a way that guarantees safe execution by multiple threads at the same time. Shared resources can be a counter variable, an array, or anything else.

What does code look like when it’s not thread-safe?

In this example, I won’t show the conventional counter++ example shown in textbooks. Instead, I’ll show an example that is more relatable and can literally happen to anyone writing production code.

class Store
{
   private List<string> storeProducts;
   public Store()
   {
     storeProducts = new List<string>();
   }
  
   async Task<string> GetOrAddProduct(Product product)
   {
     if(storeProducts.Contains(product.Name))
     {
       return product.Name; // the method returns Task<string>, so return the name
     }
    
     var token = HelperClass.GenerateTokenOrSomething();
     await HelperClass.UploadProductDetails(token, product);
     storeProducts.Add(product.Name);
    
     return product.Name;
   }
}

This example is in C#, but regardless of the programming language you’re working with, the concept remains the same.

The code above looks fine when you’re in a single-threaded environment. However, in a multithreaded or distributed environment, where multiple processes call your code simultaneously, this could actually be very dangerous. Let me explain why.

In the case where we have 3 processes calling GetOrAddProduct simultaneously, the scenario described below could happen:

  • Process A & Process C want to get or add Product A to the list.
  • Process B wants to get or add Product B to the list.
  • All three processes are started simultaneously.
  • Process B reaches the Contains check and sees that Product B doesn’t exist. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process B is still uploading…
  • Process A reaches the Contains check and sees that Product A doesn’t exist. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process A is still uploading…
  • Process B is now done, adds Product B to the list, and exits the method.
  • Process C reaches the Contains check and sees that Product A doesn’t exist, because Process A hasn’t added it yet. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process C is still uploading…
  • Process A is now done – it adds Product A to the list and exits the method.
  • Process C is now done – it adds Product A to the list again and exits the method.

In this scenario, two things can go wrong (and possibly have):

  • Product A has been uploaded twice (or the second upload threw an Exception depending on how your upload logic is set up).
  • Product A has been added to the list twice, so the size of the list is three instead of two.
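The race is easy to reproduce. Below is a minimal, self-contained sketch in the spirit of the snippet above: Product is reduced to a name, and a Task.Delay stands in for the slow token-and-upload step, giving the other callers time to pass the Contains check.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Product
{
    public string Name { get; set; }
}

class Store
{
    private List<string> storeProducts = new List<string>();

    public async Task<string> GetOrAddProduct(Product product)
    {
        if (storeProducts.Contains(product.Name))
        {
            return product.Name;
        }

        // Stand-in for GenerateTokenOrSomething + UploadProductDetails:
        // while one caller is "uploading", the others pass the check above.
        await Task.Delay(100);
        storeProducts.Add(product.Name);

        return product.Name;
    }

    public int Count => storeProducts.Count;
}

class Demo
{
    static async Task Main()
    {
        var store = new Store();
        var a1 = store.GetOrAddProduct(new Product { Name = "Product A" });
        var b  = store.GetOrAddProduct(new Product { Name = "Product B" });
        var a2 = store.GetOrAddProduct(new Product { Name = "Product A" });
        await Task.WhenAll(a1, b, a2);

        // Expect 2 distinct products; this typically prints 3,
        // because "Product A" was added twice.
        Console.WriteLine(store.Count);
    }
}
```

All three calls run synchronously up to the await, so all three see an empty list before any of them adds anything – exactly the interleaving described in the bullet points above.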

This is called a race condition. In scenarios like this, we might be tempted to replace the storeProducts.Add(product.Name); call with:

if(!storeProducts.Contains(product.Name)){
  storeProducts.Add(product.Name);
}

However, this is not really scalable. The extra check only works because this particular operation happens to add to a list, where membership can be tested after the fact – and even then, two processes can both pass the check before either one adds the name. Imagine instead that we had something like the code snippet below:

class Store
{
  private double revenue;
  private List<string> storeProducts;
  
  public Store(RevenueGenerator generator)
  {
    revenue = generator.GenerateCurrentRevenue();
    storeProducts = new List<string>();
  }
  
  async Task UpdateStoreRevenue(Product product)
  {
    if(!storeProducts.Contains(product.Name))
    {
       var token = HelperClass.GenerateTokenOrSomething();
       await HelperClass.UpdateStoreRevenue(token, product);
       revenue += product.Price;
       storeProducts.Add(product.Name);
    }
  }
}

In the code snippet above, we are updating store revenue before adding the product to our list, and there’s no direct way of checking whether we’ve already added a price to the overall revenue. This could be a disaster – imagine a customer’s product worth $400,000,000.00 being added twice. Audio money? Now, that’s a problem.

A more scalable solution

The more scalable solution is to write thread-safe code by adding synchronization to the part of your code that isn’t thread-safe. This helps protect access to shared resources. If a process owns a lock, then it can access the protected shared resource. If a process does not own the lock, then it cannot access the shared resource.

In our previous example, since Process B reaches the unsafe section of the code first, it acquires the lock and keeps executing. When Process B is done executing, it releases the lock for other processes. If Process A or C tries to acquire the lock while Process B still holds it, it has to wait.
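As an aside, for purely synchronous code C# wraps this acquire-and-release pattern in the built-in lock statement. A minimal sketch of making the check-then-add step atomic (the TryAddProduct method is my own illustration, not from the example above); note that an await is not allowed inside a lock block, which is why async code needs a different primitive:

```csharp
class Store
{
  private readonly object gate = new object();
  private List<string> storeProducts = new List<string>();

  public bool TryAddProduct(string name)
  {
    lock (gate) // only one thread can run this block at a time
    {
      if (storeProducts.Contains(name))
      {
        return false; // already present
      }
      storeProducts.Add(name);
      return true;
    }
  }
}
```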

There are a bunch of lockable objects, but I will be explaining a mutex:

Mutex

A mutex (short for MUTual EXclusion) can be owned by only one thread at a time. If we had to use a mutex to fix our code, it would look like this:

class Store
{
  private static Mutex mutex = new Mutex();
  private List<string> storeProducts;
  
  public Store()
  {
    storeProducts = new List<string>();
  }
  
  async Task<string> GetOrAddProduct(Product product)
  {
    mutex.WaitOne(); // controls access to code that isn't thread-safe
    try
    {
      if(storeProducts.Contains(product.Name))
      {
        return product.Name;
      }

      var token = HelperClass.GenerateTokenOrSomething();
      await HelperClass.UploadProductDetails(token, product);
      storeProducts.Add(product.Name);

      return product.Name; // returning from the try block still runs finally
    }
    finally
    {
      mutex.ReleaseMutex();
    }
  }
}

The code is wrapped in a try-finally block because, regardless of what happens in our code, we want the code in the finally block to execute. If we did not use a try-finally block and HelperClass.UploadProductDetails(token, product); threw an exception, the lock would never be released, which could cause a deadlock, another concurrency problem. In basic terms, a deadlock means that processes waiting for a particular resource are blocked indefinitely.
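One caveat worth knowing: a .NET Mutex is owned by a specific thread, and after an await the method may resume on a different thread, in which case ReleaseMutex can throw. For async methods like ours, SemaphoreSlim is the usual lockable object, because WaitAsync and Release are not tied to a thread. A sketch of the same method using it (helper names reused from the snippets above):

```csharp
class Store
{
  // initialCount: 1, maxCount: 1 – one owner at a time, like a mutex
  private static SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);
  private List<string> storeProducts = new List<string>();

  async Task<string> GetOrAddProduct(Product product)
  {
    await semaphore.WaitAsync(); // async-friendly equivalent of mutex.WaitOne()
    try
    {
      if(storeProducts.Contains(product.Name))
      {
        return product.Name;
      }

      var token = HelperClass.GenerateTokenOrSomething();
      await HelperClass.UploadProductDetails(token, product);
      storeProducts.Add(product.Name);

      return product.Name;
    }
    finally
    {
      semaphore.Release(); // safe even if we resumed on another thread
    }
  }
}
```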

Conclusion

There are other ways to write thread-safe code in distributed or multithreaded environments. If you’d like to know more, I found a tutorial series that talks extensively about concurrency problems and fixing thread-safety issues.
