Trusted answers to developer questions

Related Tags

c#
communitycreator

A guide to writing real-life thread-safe code

Adora Nwodo

When writing code that will be used by multiple threads simultaneously, it is important to make sure that the code is thread-safe so that your application functions properly. In this shot, I will explain thread-safety by showing real-life, thread-unsafe code and ways to make it thread-safe.

How do we know if code is thread-safe?

We can tell that code is thread-safe if it only uses and updates shared resources in a way that guarantees safe execution by multiple threads at the same time. Shared resources can be a counter variable, an array, or anything else.

What does code look like when it’s not thread-safe?

In this example, I won’t show the conventional counter++ example shown in textbooks. Instead, I’ll show an example that is more relatable and can literally happen to anyone writing production code.

class Store
{
   private List<string> storeProducts;
   public Store()
   {
     storeProducts = new List<string>();
   }
  
   async Task<string> GetOrAddProduct(Product product)
   {
     if(storeProducts.Contains(product.Name))
     {
       return product.Name; // the method returns Task<string>, so return the name
     }
    
     var token = HelperClass.GenerateTokenOrSomething();
     await HelperClass.UploadProductDetails(token, product);
     storeProducts.Add(product.Name);
    
     return product.Name;
   }
}

This example is in C#, but regardless of the programming language you’re working with, the concept remains the same.

The code above looks fine when you’re in a single-threaded environment. However, in a multithreaded or distributed environment, where multiple processes call your code simultaneously, this could actually be very dangerous. Let me explain why.

In the case where we have 3 processes calling GetOrAddProduct simultaneously, the scenario described below could happen:

  • Process A & Process C want to get or add Product A to the list.
  • Process B wants to get or add Product B to the list.
  • All three processes are started simultaneously.
  • Process B reaches the Contains check and sees that Product B doesn’t exist. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process B is still uploading…
  • Process A reaches the Contains check and sees that Product A doesn’t exist. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process A is still uploading…
  • Process B is now done, adds Product B to the list, and exits the method.
  • Process C reaches the Contains check and sees that Product A doesn’t exist, because Process A hasn’t added it yet. It then generates a token and starts uploading the product. The upload takes a long time, so, while Process C is still uploading…
  • Process A is now done – it adds Product A to the list and exits the method.
  • Process C is now done – it adds Product A to the list again and exits the method.

In this scenario, two things can go wrong (and possibly have):

  • Product A has been uploaded twice (or the second upload threw an Exception depending on how your upload logic is set up).
  • Product A has been added to the list twice, so the size of the list is three instead of two.
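The race is easy to reproduce. Below is a minimal, self-contained sketch in the spirit of the snippet above: Product is reduced to a name, and a Task.Delay stands in for the slow token-and-upload step, giving the other callers time to pass the Contains check.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Product
{
    public string Name { get; set; }
}

class Store
{
    private List<string> storeProducts = new List<string>();

    public async Task<string> GetOrAddProduct(Product product)
    {
        if (storeProducts.Contains(product.Name))
        {
            return product.Name;
        }

        // Stand-in for GenerateTokenOrSomething + UploadProductDetails:
        // while one caller is "uploading", the others pass the check above.
        await Task.Delay(100);
        storeProducts.Add(product.Name);

        return product.Name;
    }

    public int Count => storeProducts.Count;
}

class Demo
{
    static async Task Main()
    {
        var store = new Store();
        var a1 = store.GetOrAddProduct(new Product { Name = "Product A" });
        var b  = store.GetOrAddProduct(new Product { Name = "Product B" });
        var a2 = store.GetOrAddProduct(new Product { Name = "Product A" });
        await Task.WhenAll(a1, b, a2);

        // Expect 2 distinct products; this typically prints 3,
        // because "Product A" was added twice.
        Console.WriteLine(store.Count);
    }
}
```

All three calls run synchronously up to the await, so all three see an empty list before any of them adds anything – exactly the interleaving described in the bullet points above.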

This is called a race condition. In scenarios like this, we might be tempted to replace the storeProducts.Add(product.Name); call with:

if(!storeProducts.Contains(product.Name)){
  storeProducts.Add(product.Name);
}

However, this is not really scalable. The extra check only works because this particular operation happens to add to a list, where membership can be tested after the fact – and even then, two processes can both pass the check before either one adds the name. Imagine instead that we had something like the code snippet below:

class Store
{
  private double revenue;
  private List<string> storeProducts;
  
  public Store(RevenueGenerator generator)
  {
    revenue = generator.GenerateCurrentRevenue();
    storeProducts = new List<string>();
  }
  
  async Task UpdateStoreRevenue(Product product)
  {
    if(!storeProducts.Contains(product.Name))
    {
       var token = HelperClass.GenerateTokenOrSomething();
       await HelperClass.UpdateStoreRevenue(token, product);
       revenue += product.Price;
       storeProducts.Add(product.Name);
    }
  }
}

In the code snippet above, we are updating store revenue before adding the product to our list, and there’s no direct way of checking whether we’ve already added a price to the overall revenue. This could be a disaster – imagine a customer’s product worth $400,000,000.00 being added twice. Audio money? Now, that’s a problem.

A more scalable solution

The more scalable solution is to write thread-safe code by adding synchronization to the part of your code that isn’t thread-safe. This helps protect access to shared resources. If a process owns a lock, then it can access the protected shared resource. If a process does not own the lock, then it cannot access the shared resource.

In our previous example, since Process B reaches the unsafe section of the code first, it acquires the lock and keeps executing. When Process B is done executing, it releases the lock for other processes. If Process A or C tries to acquire the lock while Process B still holds it, it has to wait.
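As an aside, for purely synchronous code C# wraps this acquire-and-release pattern in the built-in lock statement. A minimal sketch of making the check-then-add step atomic (the TryAddProduct method is my own illustration, not from the example above); note that an await is not allowed inside a lock block, which is why async code needs a different primitive:

```csharp
class Store
{
  private readonly object gate = new object();
  private List<string> storeProducts = new List<string>();

  public bool TryAddProduct(string name)
  {
    lock (gate) // only one thread can run this block at a time
    {
      if (storeProducts.Contains(name))
      {
        return false; // already present
      }
      storeProducts.Add(name);
      return true;
    }
  }
}
```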

There are a bunch of lockable objects, but I will be explaining a mutex:

Mutex

A mutex (short for MUTual EXclusion) can be owned by only one thread at a time. If we had to use a mutex to fix our code, it would look like this:

class Store
{
  private static Mutex mutex = new Mutex();
  private List<string> storeProducts;
  
  public Store()
  {
    storeProducts = new List<string>();
  }
  
  async Task<string> GetOrAddProduct(Product product)
  {
    mutex.WaitOne(); // controls access to code that isn't thread-safe
    try
    {
      if(storeProducts.Contains(product.Name))
      {
        return product.Name;
      }

      var token = HelperClass.GenerateTokenOrSomething();
      await HelperClass.UploadProductDetails(token, product);
      storeProducts.Add(product.Name);

      return product.Name; // returning from the try block still runs finally
    }
    finally
    {
      mutex.ReleaseMutex();
    }
  }
}

The code is wrapped in a try-finally block because, regardless of what happens in our code, we want the code in the finally block to execute. If we did not use a try-finally block and HelperClass.UploadProductDetails(token, product); threw an exception, the lock would never be released, which could cause a deadlock, another concurrency problem. In basic terms, a deadlock means that processes waiting for a particular resource are blocked indefinitely.
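One caveat worth knowing: a .NET Mutex is owned by a specific thread, and after an await the method may resume on a different thread, in which case ReleaseMutex can throw. For async methods like ours, SemaphoreSlim is the usual lockable object, because WaitAsync and Release are not tied to a thread. A sketch of the same method using it (helper names reused from the snippets above):

```csharp
class Store
{
  // initialCount: 1, maxCount: 1 – one owner at a time, like a mutex
  private static SemaphoreSlim semaphore = new SemaphoreSlim(1, 1);
  private List<string> storeProducts = new List<string>();

  async Task<string> GetOrAddProduct(Product product)
  {
    await semaphore.WaitAsync(); // async-friendly equivalent of mutex.WaitOne()
    try
    {
      if(storeProducts.Contains(product.Name))
      {
        return product.Name;
      }

      var token = HelperClass.GenerateTokenOrSomething();
      await HelperClass.UploadProductDetails(token, product);
      storeProducts.Add(product.Name);

      return product.Name;
    }
    finally
    {
      semaphore.Release(); // safe even if we resumed on another thread
    }
  }
}
```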

Conclusion

There are other ways to write thread-safe code in distributed or multithreaded environments. If you’d like to know more, I found a tutorial series that talks extensively about concurrency problems and fixing thread-safety issues.
