Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Part 6 | Part 7

What happens when you’ve just boarded a train, but your manager is texting you asking for the latest state of the document that countless colleagues have been working on over the past few days? After all, Wi-Fi on trains and planes can be abysmal, and a lack of connectivity can lead to unfortunate outcomes for even the most well-prepared. Fortunately, with the support of the real-time collaboration framework Yjs and IndexedDB, the local database available in browsers, you too can implement robust offline-first applications that enable peer-to-peer collaborative editing and change the ways in which we interact with content management systems (CMS) and web applications at large.

A short time ago, this author (Preston So, Editor in Chief at Tag1 and author of Decoupled Drupal in Practice) conducted a lively discussion with treasured Tag1 colleagues Kevin Jahns (creator of Yjs and Real-Time Collaboration Systems Lead at Tag1), Fabian Franz (Senior Technical Architect and Performance Lead at Tag1), and Michael Meyers (Managing Director at Tag1) about how Yjs and associated projects enable a much more robust approach to building offline-first applications than ever before. In this multi-part blog series, we cover some of the most salient features of Yjs and how it provides for offline editing. In this fifth installment, we spotlight y-webrtc and garbage collection.

Revisiting y-webrtc

I strongly advise readers to return to the first, second, third, and fourth installments of this blog series if they have not yet familiarized themselves with foundational offline editing concepts such as Yjs, IndexedDB, Service Workers, and Web Workers. The offline-first world is rich with new technologies, and it can be difficult to juggle all of the terminology. Fortunately, the previous blog posts in this series introduce the concepts you need to understand to implement offline-enabled applications with Yjs.

Michael asked during our Tag1 Team Talks episode about the viability of using y-webrtc, the adapter for WebRTC, in conjunction with an offline-first application. Kevin admits that one of the only disadvantages of y-webrtc, which we covered in previous blog posts as well as previous Tag1 Team Talks episodes, is the fact that it takes some time to create a connection to other peers and therefore to synchronize with any updates that need applying from those other peers. Recall from our previous explorations of WebRTC that y-webrtc enables you to create a peer-to-peer network between clients for the same document without a centralized server.

Once a WebRTC connection is established, however, it is very fast, even though establishing it in the first place incurs some latency. Nonetheless, such an approach enables us to implement some interesting ideas. For instance, consider a typical video chat application such as Zoom. There are many WebRTC-based chat applications on the web that are usable for everyone and provide some inspiration for offline-first collaboration.

Some WebRTC-based applications also allow users to collaborate on documents directly with one another, though Kevin knows of only one that does so. With the support of IndexedDB, you can store the document in question offline so that it is still present the next time you open the collaboration room. Even better, there is no need for a server, because all of the data is persisted in a decentralized, peer-to-peer fashion among the clients collaborating on the same document.

Still more compelling is the notion that you can synchronize with these other document versions even when the website where collaborative editing is enabled is not open in an active browser tab. This is because Service Workers can execute background jobs that allow synchronization to proceed uninhibited. And this, in turn, enables many fascinating potential applications with the help of the y-webrtc connector.

Garbage collection

But what about garbage collection? And what about the need to occasionally resolve conflicts that cannot be reconciled by a machine? When it comes to garbage collection, there are two important elements to consider. The first is how we actually resolve all of the potential conflicts that may arise over the course of multiple collaborators working on a document together. As a side note, in a previous Tag1 Team Talks episode, we discussed how Yjs is able to figure out many of these issues on its own thanks to effective conflict resolution techniques. An explanation and proof of how this conflict resolution occurs are available on the Yjs GitHub organization.

Kevin conceived Yjs as a framework that does not necessarily require a server connection. Because Yjs is server-agnostic, it does not need a unique sequence of the messages that were issued to the server. Instead, it only listens to document updates such that it can always synchronize that data as soon as your document receives all updates from other peers. As such, there is no central instance that manages how conflicts are resolved, and the mechanism by which all of this operates is completely decentralized.
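To make the idea of synchronizing without a central message sequence more concrete, here is a minimal sketch in TypeScript. It is a toy model and not how Yjs is implemented: the names ToyUpdate and ToyReplica, the numeric client identifiers, and the "highest clock wins" rule are all assumptions made purely for illustration. What it shows is the property described above: two replicas that receive the same updates in different orders, even with duplicates, end up in the same state without any server deciding the order.

```typescript
// Toy model, not Yjs internals: every update has a unique (client, clock) id.
interface ToyUpdate {
  client: number; // hypothetical peer identifier
  clock: number;  // per-peer counter, bumped on every local change
  key: string;
  value: string;
}

class ToyReplica {
  private log: ToyUpdate[] = [];

  // Applying is idempotent: an update that was already seen is ignored.
  apply(update: ToyUpdate): void {
    const known = this.log.some(
      (u) => u.client === update.client && u.clock === update.clock
    );
    if (!known) this.log.push(update);
  }

  // For each key, the update with the highest (clock, client) pair wins,
  // so the result depends only on the set of updates, not on arrival order.
  snapshot(): Record<string, string> {
    const winners: Record<string, ToyUpdate | undefined> = {};
    this.log.forEach((u) => {
      const current = winners[u.key];
      if (
        current === undefined ||
        u.clock > current.clock ||
        (u.clock === current.clock && u.client > current.client)
      ) {
        winners[u.key] = u;
      }
    });
    const result: Record<string, string> = {};
    Object.keys(winners).sort().forEach((key) => {
      const winner = winners[key];
      if (winner !== undefined) result[key] = winner.value;
    });
    return result;
  }
}

const fromAbby: ToyUpdate = { client: 1, clock: 1, key: "title", value: "Draft" };
const fromBilal: ToyUpdate = { client: 2, clock: 1, key: "title", value: "Proposal" };
const fromBilalLater: ToyUpdate = { client: 2, clock: 2, key: "status", value: "review" };

// Replica A receives the updates in one order...
const replicaA = new ToyReplica();
[fromAbby, fromBilal, fromBilalLater].forEach((u) => replicaA.apply(u));

// ...replica B in another order, with one update delivered twice.
const replicaB = new ToyReplica();
[fromBilalLater, fromBilal, fromAbby, fromBilal].forEach((u) => replicaB.apply(u));

const snapshotA = JSON.stringify(replicaA.snapshot());
const snapshotB = JSON.stringify(replicaB.snapshot());
console.log(snapshotA);               // {"status":"review","title":"Proposal"}
console.log(snapshotA === snapshotB); // true
```

The tie-breaking rule here is deliberately crude. The point is only that when every peer applies the same deterministic rule to the same set of updates, no central instance is needed to agree on the outcome.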

A quick reintroduction to WebRTC

During our Tag1 Team Talks episode about WebRTC with Yjs, we extensively discussed the topic of partitioning. When it comes to decentralized collaboration, handling partitions ensures that when we are connected across an ocean and have our connection severed, we can still regain synchronization using features available in Yjs. After all, when we are collaborating across the Atlantic and see our internet cut, we are, from each other’s perspective, offline.

Fabian contends that this is the key reason why Yjs already handles the offline case, because it must. After all, every peer-to-peer application operates in such a way that it is able to synchronize changes successfully once the connection is reestablished. In a peer-to-peer context, if we close all of our connections with other peers, the document is never lost, because every peer on the network has a copy of the information in question.

Suppose, for instance, that we are collaborating on a document with an ocean separating us when the network connection goes offline. Discouraged, we all exit our browsers due to the lack of connectivity. Once we regain connectivity and restart our collaboration workflows, conflicts may arise. If Fabian has his own copy of the content in question as a peer, and Kevin has another copy, we can all collaborate together once again upon reconnection.
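The reconnection scenario can be sketched in a few lines of TypeScript. This is again a toy under stated assumptions rather than the Yjs implementation: ToyPeer, PeerChange, and reconnect are hypothetical names, and each peer's summary of what it already holds (the highest clock it has seen per client) is a simplification chosen for illustration. The sketch shows two peers editing while disconnected, then exchanging only the changes the other side lacks.

```typescript
// Toy model of two peers catching up after a network partition.
interface PeerChange {
  client: string; // hypothetical peer name
  clock: number;  // per-peer counter
  text: string;
}

class ToyPeer {
  readonly name: string;
  private changes: PeerChange[] = [];
  private nextClock = 1;

  constructor(name: string) {
    this.name = name;
  }

  // Record a local edit; this works the same online or offline.
  edit(text: string): void {
    this.changes.push({ client: this.name, clock: this.nextClock, text: text });
    this.nextClock += 1;
  }

  // Highest clock seen per client: a compact summary of what this peer holds.
  stateVector(): Record<string, number | undefined> {
    const vector: Record<string, number | undefined> = {};
    this.changes.forEach((c) => {
      const known = vector[c.client];
      if (known === undefined || c.clock > known) vector[c.client] = c.clock;
    });
    return vector;
  }

  // The changes a remote peer lacks, judged by the summary it sent us.
  missingFor(remote: Record<string, number | undefined>): PeerChange[] {
    return this.changes.filter((c) => {
      const known = remote[c.client];
      return known === undefined || c.clock > known;
    });
  }

  // Store incoming changes, skipping any we already have.
  receive(incoming: PeerChange[]): void {
    incoming.forEach((c) => {
      const have = this.changes.some(
        (x) => x.client === c.client && x.clock === c.clock
      );
      if (!have) this.changes.push(c);
    });
  }

  // Deterministic rendering so that peers with equal changes look identical.
  contents(): string[] {
    return this.changes
      .slice()
      .sort((a, b) =>
        a.clock - b.clock || (a.client < b.client ? -1 : a.client > b.client ? 1 : 0)
      )
      .map((c) => c.text);
  }
}

// Exchange only what each side is missing; returns how many changes moved.
function reconnect(a: ToyPeer, b: ToyPeer): number {
  const forB = a.missingFor(b.stateVector());
  const forA = b.missingFor(a.stateVector());
  b.receive(forB);
  a.receive(forA);
  return forA.length + forB.length;
}

const fabian = new ToyPeer("fabian");
const kevin = new ToyPeer("kevin");

fabian.edit("Intro");
reconnect(fabian, kevin); // both peers now hold "Intro"

// The connection drops: each peer keeps working on its own copy.
fabian.edit("Fabian's offline paragraph");
kevin.edit("Kevin's offline paragraph");

// Connectivity returns: one change travels in each direction.
const exchanged = reconnect(fabian, kevin);
console.log(exchanged); // 2
console.log(JSON.stringify(fabian.contents()) === JSON.stringify(kevin.contents())); // true
```

Note that "Intro" is not sent again on the second reconnection, because each peer's summary tells the other that it already has it.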

Fabian attests during our Tag1 Team Talks episode that this is precisely why IndexedDB so effectively complements the y-webrtc approach described in previous renditions of the show. In a sense, it is a must-have requirement that we not only have the capacity to synchronize data in real time, which is highly effective for nontraditional data such as video assets, but also have the ability to store any in-progress work that is currently available locally in our browsers.

Importantly, however, conflict resolution always produces garbage. Consider, for instance, the following scenario. If Abby has created a large amount of content, and Bilal has subsequently created a large quantity of content, a considerable amount of garbage is inevitably generated, and it is difficult to know how a proper synchronization process can occur. In other words, in Git terms, if there are merge conflicts, how do we resolve them? Consider, for example, a scenario in which Abby makes a change in the document that resolves an issue, but Bilal makes another change while disconnected that also corrects the same problem. This results in a conflict and thus garbage. In the next blog post in this series, we’ll examine how to correct this situation.
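One way to picture the Abby and Bilal scenario is the following TypeScript sketch. It is a toy model resting on assumptions that are not taken from Yjs: ToyItem and mergeItems are hypothetical names, deleted items are kept around as markers rather than removed, and sorting by id stands in for real position tracking. It shows both collaborators fixing the same typo while disconnected, and what the merged copy then contains.

```typescript
// Toy model: a document is a list of items; deletions only mark an item.
interface ToyItem {
  id: string;       // hypothetical unique id, "author:counter"
  text: string;
  deleted: boolean; // a deleted item is kept as a marker, not removed
}

// Merge two copies: keep every item either side knows about,
// and treat an item as deleted if either side deleted it.
function mergeItems(left: ToyItem[], right: ToyItem[]): ToyItem[] {
  const byId: Record<string, ToyItem | undefined> = {};
  left.concat(right).forEach((item) => {
    const existing = byId[item.id];
    byId[item.id] = {
      id: item.id,
      text: item.text,
      deleted: item.deleted || (existing !== undefined && existing.deleted),
    };
  });
  const merged: ToyItem[] = [];
  // Sorting by id is a crude stand-in for real position tracking.
  Object.keys(byId).sort().forEach((id) => {
    const item = byId[id];
    if (item !== undefined) merged.push(item);
  });
  return merged;
}

// Both start from a shared item containing the typo "teh".
// While disconnected, each deletes it and inserts a corrected word.
const abbyCopy: ToyItem[] = [
  { id: "base:1", text: "teh", deleted: true },
  { id: "abby:1", text: "the", deleted: false },
];
const bilalCopy: ToyItem[] = [
  { id: "base:1", text: "teh", deleted: true },
  { id: "bilal:1", text: "the", deleted: false },
];

const mergedCopy = mergeItems(abbyCopy, bilalCopy);
const visibleText = mergedCopy.filter((i) => !i.deleted).map((i) => i.text);
const tombstones = mergedCopy.filter((i) => i.deleted).length;

console.log(visibleText.join(" ")); // the the
console.log(tombstones);            // 1
```

In this toy, the merge succeeds mechanically, but the result carries two kinds of leftovers: the marker for the deleted typo, and a duplicated fix that only a human can recognize as redundant.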

Conclusion

In this day and age, it seems there is no shortage of requests for web developers to implement advanced solutions that enable a decentralized, peer-to-peer, offline approach to real-time collaboration that is not only performant but also handles conflict resolution in a graceful way. Thanks to the use of Yjs and IndexedDB, two emerging web technologies that are gaining prominence in the offline collaboration world, this scenario is not only possible but highly realistic. In addition, combined with the features provided by Service Workers, Web Workers, and WebRTC, you too can produce a formidable architecture for offline-first real-time peer-to-peer collaboration.

In this fifth installment of our multi-part blog series covering how to build offline-first applications with the support of Yjs, we covered how WebRTC and Yjs’ connector for WebRTC, y-webrtc, figure in the equation when it comes to facilitating peer-to-peer offline collaboration. We also began to touch on the subject of garbage collection and how the open-source real-time collaboration framework Yjs is particularly well-positioned to handle all problems of offline collaboration that can arise. In the sixth installment of this series, we dive into garbage collection more deeply and discuss other important performance considerations.

Special thanks to Fabian Franz, Kevin Jahns, and Michael Meyers for their feedback during the writing process.

For more Yjs content, see Yjs - Add real-time collaboration to any application.



Photo by JJ Ying on Unsplash