02-13-2024, 11:25 PM
The lack of file-level locking in S3 creates some interesting challenges, particularly when you're working with collaborative applications that rely on simultaneous access to files. I’ve seen firsthand how this can impact workflows, especially in environments where multiple users are trying to read and write to the same set of files.
You’re probably aware that S3 operates on an object storage model. Instead of a traditional file system where you can lock and unlock files, S3 gives you immutable objects with metadata. Because there’s no explicit file-locking mechanism, genuinely simultaneous writes to the same key are possible, the last write wins, and that can lead to significant problems.
I remember working on a project where a team was using S3 as the backend for a collaborative document editing application. We all assumed we could just write documents to S3 like we would with a shared file system, but quickly ran into issues. The most straightforward issue was overwrites. If two people were editing the same document at the same time, one person could save their changes, and then the second person’s changes would completely erase the first set of changes. This kind of problem leads to data inconsistency, which is the last thing you want in a collaborative environment.
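The lost-update failure mode is easy to reproduce. Here’s a minimal, self-contained Python sketch using a plain dict as a stand-in for a bucket (no AWS involved; `get` and `put` are hypothetical helpers that mimic unconditional GetObject/PutObject semantics):

```python
# Minimal in-memory sketch of the "lost update" problem described above.
# A plain dict stands in for an S3 bucket: PUT is last-writer-wins, so
# whichever save lands second silently erases the other user's changes.

store = {}  # hypothetical stand-in for a bucket: key -> document body

def get(key):
    return store.get(key, "")

def put(key, body):
    # Like S3's PutObject: unconditional overwrite, no locking, no merge.
    store[key] = body

# Both users start from the same base document.
put("report.txt", "intro\n")

alice = get("report.txt") + "alice's section\n"
bob = get("report.txt") + "bob's section\n"

put("report.txt", alice)  # Alice saves first...
put("report.txt", bob)    # ...then Bob's save overwrites her work.

print(get("report.txt"))  # prints "intro\nbob's section\n" -- Alice's edit is gone
```

The point of the sketch is that nothing failed: both PUTs succeeded, and the data loss is completely silent.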
Imagine if one of your colleagues was working on a presentation file, and they finish and upload it to S3 through their application. But right at that same moment, another colleague is working on their version of the same file and uploads theirs, overwriting the first upload. There’s no way that S3 could prevent this from happening in real-time; it doesn’t have any mechanism to manage these write conflicts. You could easily face situations where changes go missing, and that’s unacceptable in collaborative contexts.
Another problem arises with reading objects, particularly when the application uses versioning. While S3 does offer versioning features, without file locking it can get chaotic. To be clear, S3 writes are atomic at the object level, so a read never returns a half-uploaded object; it returns either the old object or the new one in its entirety. But if you read while another person is mid-update, you can easily fetch a version that is already stale by the time you act on it.
I’ve also looked into applications that implement optimistic concurrency control to address the lack of locking in S3. This approach bets that conflicts are rare and handles them after the fact. For it to work, you need good conflict detection at the application level, which complicates development: you end up writing extra logic to compare state before acting and to retry operations that lose a race. That adds complexity to your application and overhead for your developers.
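As a rough sketch of what that application-level logic looks like, here is a pure-Python simulation of a compare-and-set retry loop. An integer version stands in for the ETag your application would track in its own metadata layer; `VersionedStore` and `put_if_match` are illustrative names, not a real S3 API:

```python
# Hedged sketch of optimistic concurrency control over an S3-like store.
# A simple integer version plays the role of an ETag. A writer re-reads
# and retries whenever the snapshot it modified turns out to be stale.

class PreconditionFailed(Exception):
    pass

class VersionedStore:
    def __init__(self):
        self._data = {}  # key -> (version, body)

    def get(self, key):
        return self._data.get(key, (0, ""))

    def put_if_match(self, key, expected_version, body):
        # Compare-and-set: only write if nobody else changed the object.
        current_version, _ = self.get(key)
        if current_version != expected_version:
            raise PreconditionFailed(key)
        self._data[key] = (current_version + 1, body)

def append_line(store, key, line, max_retries=5):
    # Fetch, modify, attempt a conditional write; on conflict, re-read and retry.
    for _ in range(max_retries):
        version, body = store.get(key)
        try:
            store.put_if_match(key, version, body + line)
            return
        except PreconditionFailed:
            continue  # another writer won the race; start over from fresh state
    raise RuntimeError("too many concurrent writers")

store = VersionedStore()
append_line(store, "report.txt", "intro\n")
append_line(store, "report.txt", "alice's section\n")
append_line(store, "report.txt", "bob's section\n")
print(store.get("report.txt")[1])
```

The retry loop is exactly the "additional logic" mentioned above: every writer pays the cost of re-reading and re-applying its change whenever it loses a race.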
Consider this: if you have a workflow where people are constantly creating and modifying files – maybe you're implementing a Continuous Integration system or working with data pipelines – this can introduce significant delays. Typically you'd want atomic read-modify-write operations for file updates, and S3 doesn't offer them. One caveat worth noting: since December 2020, S3 provides strong read-after-write consistency, so data you've just written is immediately visible to subsequent reads. What's still missing is any way to make a read-then-update sequence atomic, and that's what trips up these workflows.
Since S3 is designed for scale and durability, objects are immutable: every edit, however small, means re-uploading the entire object, and with versioning enabled each PUT creates a new version rather than modifying data in place. While this helps with version control, it doesn't resolve the problem of multiple users trying to edit the same object at the same time. Combining this limitation with the fact that there's no way to coordinate writes means that you have to think strategically about how your application handles file updates.
It’s also worth mentioning that while some foundational cloud services offer centralized management, S3 places you in a decentralized environment. This means you’re ultimately responsible for orchestrating the synchronization between users. You could pull in tools that try to create a locking mechanism at the application layer, but that can lead to further complications, especially as the number of users grows.
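To make the trade-off concrete, here is one way an application-layer lock with a lease (so a crashed client can't hold a file forever) might be sketched. In production the lease table would live in a shared service such as DynamoDB; the in-process dict below only simulates the idea, and all names are hypothetical:

```python
# Illustrative sketch of an application-layer lock with a lease (expiry),
# the kind of coordination S3 itself does not provide. A dict guarded by a
# threading.Lock simulates the shared lock table within a single process.

import threading
import time

class LeaseLock:
    def __init__(self, ttl_seconds=30.0):
        self._ttl = ttl_seconds
        self._leases = {}  # key -> (owner, expiry timestamp)
        self._mutex = threading.Lock()

    def acquire(self, key, owner):
        # Grant the lease if it is free or the previous holder's lease expired.
        now = time.monotonic()
        with self._mutex:
            holder = self._leases.get(key)
            if holder is None or holder[1] < now:
                self._leases[key] = (owner, now + self._ttl)
                return True
            return False

    def release(self, key, owner):
        with self._mutex:
            if self._leases.get(key, (None, 0))[0] == owner:
                del self._leases[key]

locks = LeaseLock(ttl_seconds=30.0)
assert locks.acquire("report.txt", "alice")      # Alice gets the lock
assert not locks.acquire("report.txt", "bob")    # Bob is told to wait
locks.release("report.txt", "alice")
assert locks.acquire("report.txt", "bob")        # now Bob can edit
```

The TTL is the part that grows painful at scale: too short and slow editors lose their lock mid-write, too long and a crashed client blocks everyone, which is exactly the complication mentioned above.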
Bear in mind the performance implications as well. When multiple applications are battling for read/write access to a single object, you start to see latency issues. Ideally, you'll want a responsive application, but as more simultaneous requests pile up, the performance can suffer. I’ve seen it happen; a collaborative tool can turn sluggish if dozens of users are trying to work with a high-volume object at once.
Some developers have started to adopt microservices as a workaround. By breaking down applications into smaller, independently deployable services, I’ve noticed that you can often isolate the problems caused by this lack of locking mechanism. But this doesn't completely alleviate the issue. Each microservice still has to implement its own mechanism to ensure that concurrent updates don’t lead to problems.
You might also want to consider designing your application using queues to sequence writes. By funneling all updates through a managed queue service like SQS, you can at least control the flow of information better. I’ve found that it helps synchronize updates with less chaos, but it introduces its own delays. Waiting for a turn to write might not be what you want in a fast-paced collaborative app.
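The queue pattern can be sketched without AWS at all: the standard-library `queue` module below plays the role of SQS, and a single consumer thread applies writes one at a time so producers never clobber each other (names are illustrative):

```python
# Sketch of funneling all writes through a single consumer so updates are
# applied one at a time, mimicking the SQS pattern with queue.Queue.

import queue
import threading

store = {}              # stand-in for the bucket
writes = queue.Queue()  # stand-in for SQS

def writer():
    # A single writer drains the queue, so concurrent producers never
    # overwrite each other: every update is applied in arrival order.
    while True:
        item = writes.get()
        if item is None:        # sentinel: shut down
            break
        key, line = item
        store[key] = store.get(key, "") + line
        writes.task_done()

worker = threading.Thread(target=writer)
worker.start()

# Producers can enqueue concurrently; the queue serializes them (FIFO).
for line in ["intro\n", "alice's section\n", "bob's section\n"]:
    writes.put(("report.txt", line))

writes.join()           # block until every queued write has been applied
writes.put(None)
worker.join()
print(store["report.txt"])
```

The `writes.join()` call is where the latency mentioned above shows up: every producer that needs read-your-own-write semantics has to wait for the queue to drain.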
Finally, you can’t ignore the impact on user experience. If users encounter issues with the application due to data conflicts or outdated reads, it diminishes their trust. In a collaborative tool, you need your users to feel secure in the actions they take. Any uncertainty can lead to frustration and make them hesitant to contribute.
The absence of file-level locking in S3 puts a lot of onus on developers to think critically about architecture and design choices in collaborative applications. Whether they choose optimistic concurrency, build custom lock layers, or rely on queues, each approach has trade-offs. I’ve found that understanding these limitations and planning around them has saved many projects from turning into a muddled mess of unsynchronized data.