Distributed Locks with DynamoDB for Python

I was reading this post by AWS and thinking about distributed locks with DynamoDB but for Python instead of Java. Maybe even other languages. I have done distributed system consensus client implementations with Apache Zookeeper before but never with a database like DynamoDB. The post by AWS is compelling and from what they say they use this in production. So be it, I decided to jump in and whip up a Python client for distributed locks. It takes the same concepts as the java version and implements them in Python. There are still more features in the Java version and the Python version has never been run in production but I wanted to get it out into the world to see if it was helpful to any Python developers if nothing else they can pick it up.

The premise is you have a row in the DynamoDB database with a partition key and (optionally) a sort key. That row also has attributes for idle_timeout and record_version_number. Now, when someone acquires the lock they write their idle_timeout and record_version_number to the row. Every other consumer looking to acquire that lock will see it, and wait the idle_timeout. Then after that it will try to upsert on the row and steal the lock if record_version_number they were waiting on is the same as in the database. If it is is acquires the lock through the upsert writing its idle_timeout and record_version_number. If the upsert failed because there is a different record_version_number in the row than we expected then someone else acquired the lock instead of us.

  1. Host A acquires a lock on Moe by writing an item to the lock table on the condition that no item keyed at “Moe” exists yet. Host A acquires the lock with a revision version number (RVN) of UUID1.
  2. Host B tries to get a lock on Moe with a RVN UUID2.
  3. Host B checks to see if a lock already exists with a GetItem call.
  4. In this case, host B finds that host A holds a lock on Moe with a record version number (RVN) of UUID1. The same application runs on hosts A and B. That being so, host B expects host A to heartbeat and renew the lock on Moe in less than 10 seconds, if host A intends to keep the lock on Moe. Host A heartbeats once, and uses a conditional update on the lock keyed at Moe to update the RVN of the lock to UUID3.
  5. Host B checks 10 seconds after the first AcquireLock call to see if the RVN in A’s lock on Moe changed with a conditional UpdateItem call and a RVN of UUID4.
  6. Host A successfully updates the lock. Thus, host B finds the new RVN equal to UUID3 and waited 10 more seconds. Host A died after the first heartbeat, so it never changes the RVN past UUID3. When host B calls tries to acquire a lock on Moe for the third time, it finds that the RVN was still UUID3, the same RVN retrieved on the second lock attempt.
  7. In this case, hosts A and B run the same application. Because host B expects host A to heartbeat if host A is healthy and intends to keep the lock, host B considers the lock on Moe expired. Host B’s conditional update to acquire the lock on Moe succeeds, and your application makes progress!

Thanx =) Joe Stein

https://www.twitter.com/charmalloc

https://www.linkedin.com/in/charmalloc

Leave a Reply

Discover more from Hello BitsNBytes World

Subscribe now to keep reading and get access to the full archive.

Continue reading