Table of Contents

What Makes a Collaborative Editing Solution Robust?
Decentralized vs. Centralized Architectures in Collaborative Editing
Operational Transformation and Commutative Replicated Data Types (CRDT)
Why Tag1 Selected Yjs
Conclusion

In today’s editorial landscape, content creators can expect not only to touch a document countless times to revise and update content, but also to work with other writers from around the world, often on distributed teams, to finalize a document collaboratively and in real time. For this reason, collaborative editing, or shared editing, has become one of the most essential and commonly requested features for any content management solution serving a large organization.

Collaborative editing has long existed as a concept outside the content management system (CMS). Consider, for example, Google Docs, a service that many content creators use to write content together before copying and pasting the text into form fields in a waiting CMS. But in today’s highly demanding CMS landscape, shouldn’t collaborative editing be a core feature of all CMSs out of the box? Tag1 Consulting agreed, and the team decided to continue its rich legacy in CMS innovation by making collaborative editing a reality.

Recently, the team at Tag1 Consulting worked with the technical leadership at a top Fortune 50 company to evaluate solutions and ultimately implement Yjs as its collaborative editing solution. The solution needed to govern content updates across tens of thousands of concurrent users and to manage and merge countless modifications so that content remains up to date in the CMS. This process was the subject of our inaugural Tag1 Team Talk, and in this blog post, we’ll dive into some of the common and unexpected requirements of collaborative editing solutions, especially for an organization operating at a large scale with equally large editorial teams with diverse needs.

What Makes a Collaborative Editing Solution Robust?

Collaborative editing, simply put, is the ability for multiple users to edit a single document simultaneously without conflicts arising from concurrent actions: multiple people writing and editing at the same time must not produce a jumbled mess. At minimum, every robust collaborative editing solution needs to merge actions so that every user ends up with the same version of the document, with all changes merged appropriately.

Collaborative editing requires a balancing act between clients (content editors), communication (whether between client and server or peer-to-peer), and concurrency (resolving multiple people’s simultaneous actions). But there are other obstacles that have only emerged with the hyperconnectivity of today’s global economy: The ability to edit content offline or on slow connections, for instance, as well as the ability to resynchronize said content, is a baseline requirement for many distributed teams.

Providing a robust edit history is also uniquely difficult in collaborative editing. In a single-user editor with no real-time collaboration, what happens when an “Undo” or “Redo” button is clicked is relatively easy to define. In a collaborative editor, however, where changes synchronized from multiple users and batch updates from offline editing sessions must be reflected in every user’s content, defining undo and redo becomes considerably more challenging.
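One common way to pin down undo in a shared document is to scope it to the user who clicks the button: undo reverts that user's most recent change and leaves everyone else's work alone. The sketch below illustrates only that design choice. The `SharedLog` class and its methods are hypothetical names, not part of any library discussed here, and a real editor would also need to handle redo, deletions, and formatting.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Change:
    author: str
    text: str
    undone: bool = False


class SharedLog:
    """Hypothetical append-only log of text changes shared by all users."""

    def __init__(self) -> None:
        self.changes: List[Change] = []

    def append(self, author: str, text: str) -> None:
        self.changes.append(Change(author, text))

    def undo(self, author: str) -> bool:
        """Undo the author's latest change that is still in effect."""
        for change in reversed(self.changes):
            if change.author == author and not change.undone:
                change.undone = True
                return True
        return False  # nothing left for this author to undo

    def text(self) -> str:
        return "".join(c.text for c in self.changes if not c.undone)


log = SharedLog()
log.append("alice", "Hello ")
log.append("bob", "world")
log.append("alice", "!")
log.undo("alice")  # reverts "!", Alice's latest change
log.undo("alice")  # reverts "Hello ", skipping over Bob's "world"
print(log.text())
```

A single global undo stack would instead have removed Bob's "world" on the second click, which is rarely what the person pressing the button expects.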

Moreover, real-time collaborative editing solutions also need to emphasize the collaboration element itself and afford users the ability to understand where other users’ cursors are located in documents. Two of the most fundamental features of any collaborative editing solution in today’s landscape are indications of presence and remote cursors, both characteristics of free-to-use collaborative editing solutions such as Google Docs.

Presence indications let users see who else is actively working on the document, similar to the user thumbnails in the upper-right corner of a typical Google Docs document. Remote cursors, meanwhile, indicate the content another user currently has selected or the cursor location at which they last viewed or edited text.
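As a rough illustration of the data involved, presence and remote cursors can be modeled as a small per-user record that each client refreshes periodically. The sketch below is a minimal in-memory version. `PresenceRegistry`, its method names, and the 30-second timeout are assumptions made for the example, not the API of any solution compared in this post.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Presence:
    anchor: int       # where the selection starts
    head: int         # where the caret currently is
    last_seen: float  # time of the latest heartbeat, in seconds


class PresenceRegistry:
    """Hypothetical registry; users who stop sending heartbeats time out."""

    def __init__(self, timeout: float = 30.0) -> None:
        self.timeout = timeout
        self.entries: Dict[str, Presence] = {}

    def heartbeat(self, user: str, anchor: int, head: int, now: float) -> None:
        self.entries[user] = Presence(anchor, head, now)

    def active_users(self, now: float) -> List[str]:
        return sorted(
            user for user, p in self.entries.items()
            if now - p.last_seen <= self.timeout
        )

    def remote_cursors(self, local_user: str, now: float) -> Dict[str, Tuple[int, int]]:
        """Selections of every active user except the local one."""
        return {
            user: (self.entries[user].anchor, self.entries[user].head)
            for user in self.active_users(now)
            if user != local_user
        }


registry = PresenceRegistry(timeout=30.0)
registry.heartbeat("alice", anchor=4, head=9, now=0.0)
registry.heartbeat("bob", anchor=20, head=20, now=10.0)
print(registry.active_users(now=25.0))
print(registry.remote_cursors("alice", now=25.0))
```

Timestamps are passed in explicitly so the behavior is easy to reason about; a real client would read a clock and broadcast each heartbeat to its peers.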

During Tag1’s evaluation of the collaborative editing landscape, the team narrowed the field of potential solutions down to these four: Yjs, ShareDB, CKEditor, and Collab. See below for a comparison matrix of where these real-time collaborative editing solutions stand, with further explanation later in the post.

 

 

| | Yjs | ShareDB | CKEditor | Collab |
|---|---|---|---|---|
| License | MIT | MIT | Proprietary (On-Prem Hosted) | MIT |
| Scalable | (Many servers can handle the same document) | ✔ (Locking via underlying DB) | (Hosted) | (Needs central source of truth: a single host for each doc. Puts additional constraints on how doc updates are propagated to “the right server”.) |
| Currently supported editors | ProseMirror, Quill, Monaco, CodeMirror, Ace | Quill, CodeMirror, Ace | CKEditor | ProseMirror |
| Implementation | CRDT | OT | OT | Reconciliation |

The matrix also compares the four solutions on offline editing, decentralization, network agnosticism, shared cursors, presence (list of active users), commenting, sync after server data loss (noted for two of the solutions as “✖ (sync error)” and “✖ (Unsaved changes are lost)”), the ability to implement other collaborative elements (e.g., drawing), and demos (Editing, Drawing, and 3D model shared state for Yjs; Sync, Editing, Editing, and Editing in Tip Tap across the others).


Decentralized vs. Centralized Architectures in Collaborative Editing

While the features within a collaborative editor are of paramount importance to its users, the underlying architecture also plays a key role in determining a solution’s robustness. For instance, many long-standing solutions require that all document operations ultimately occur at a central server instance, as is the case with ShareDB and Collab.

While a centralized server does confer substantial advantages as a single source of truth for content state, it is also a single point of failure. If the server fails, the most up-to-date state of the content is no longer accessible, and all versions of the content will become stale. For mission-critical content needs where staleness is unacceptable, centralized servers are recipes for potential disaster.

Furthermore, centralized systems are generally much more difficult to scale, which is an understandably critical requirement for a large organization operating at considerable scale. Google Docs, for example, has an upper limit on users who can actively collaborate. With an increasing number of users, the centralized system will start to break down, and this can only be solved with progressively more complex resource allocation techniques.

For these reasons, Tag1 further narrowed the focus to decentralized approaches that allow for peer-to-peer interactions, namely Yjs. Because document copies live on each user’s own instance rather than on a failure-prone central server, documents remain in sync regardless of whether any one server is available, and users can always refer to someone else’s instance in lieu of a single authoritative source that may not be available. Resource allocation is also much easier with Yjs because many servers can store and update the same document. It is significantly easier to scale, as there is essentially no limit on the number of users that can work together.
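To make the idea of syncing through any available peer concrete, here is a minimal sketch in which each peer keeps a log of operations and summarizes what it already holds in a state vector (the highest sequence number seen per author). Two peers sync by sending each other only what the other is missing. The `Peer` class and `sync` function are hypothetical names for illustration, and the sketch assumes every peer holds a gap-free prefix of each author's operations; it is not the wire protocol of Yjs or of any other library.

```python
from typing import Dict, List, Tuple

Op = Tuple[str, int, str]  # (author, per-author sequence number, payload)


class Peer:
    def __init__(self, name: str) -> None:
        self.name = name
        self.ops: List[Op] = []

    def state_vector(self) -> Dict[str, int]:
        """Highest sequence number this peer holds for each author."""
        vector: Dict[str, int] = {}
        for author, seq, _ in self.ops:
            vector[author] = max(vector.get(author, 0), seq)
        return vector

    def edit(self, payload: str) -> None:
        seq = self.state_vector().get(self.name, 0) + 1
        self.ops.append((self.name, seq, payload))

    def missing_for(self, remote_vector: Dict[str, int]) -> List[Op]:
        """Operations the remote peer has not seen yet."""
        return [op for op in self.ops if op[1] > remote_vector.get(op[0], 0)]

    def receive(self, ops: List[Op]) -> None:
        known = {(author, seq) for author, seq, _ in self.ops}
        self.ops.extend(op for op in ops if (op[0], op[1]) not in known)


def sync(a: Peer, b: Peer) -> None:
    """Exchange only the operations each side is missing."""
    to_b = a.missing_for(b.state_vector())
    to_a = b.missing_for(a.state_vector())
    a.receive(to_a)
    b.receive(to_b)


alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.edit("title")
alice.edit("intro")
bob.edit("summary")
sync(alice, bob)   # alice may now go offline
sync(bob, carol)   # carol still receives alice's edits, relayed by bob
print(sorted(carol.ops))
```

No peer plays the role of the authoritative server here: Carol catches up from Bob even though the edits originated with Alice.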

Operational Transformation and Commutative Replicated Data Types

The majority of real-time collaborative editors, such as Google Docs, EtherPad, and CKEditor, use a strategy known as operational transformation (OT) to realize concurrent editing and real-time collaboration. In short, OT facilitates consistency maintenance and concurrency control for plain text documents, including features such as undo/redo, conflict resolution, and tree-structured document editing. Today, it is used to power collaboration features in Google Docs and Apache Wave.

Nonetheless, OT comes with certain disadvantages. Existing OT frameworks are tightly tailored to the specific requirements of a certain application (e.g., rich text editing), whereas Yjs does not assume anything about the communication protocol on which it is implemented and works with a diverse array of applications. Yjs leverages commutative replicated data types (CRDT), used by popular tools like Apple Notes, Facebook’s Apollo, and Redis, among others. As Joseph Gentle, a former engineer on the Google Wave product and creator of ShareDB, once wrote:

“Unfortunately, implementing OT sucks. There’s a million algorithms with different tradeoffs, mostly trapped in academic papers. The algorithms are really hard and time consuming to implement correctly. […] Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time.”

The key distinction between OT and CRDT is as follows: Consider an edit operation in which a user inserts a word at character position 5 in the document. In operational transformation, if another user adds 5 characters to the start of the document, the insertion is moved to position 10. While this is highly effective for simple plain text documents, complex hierarchical trees such as the document object model (DOM) present significant challenges. CRDT, meanwhile, assigns a unique identifier to every character, and all state transformations are applied relative to objects in the distributed system. Rather than identifying the place of insertion based on character count, the character at that place of insertion retains the same identifier regardless of where it is relocated to within the document. As one benefit, this process simplifies resynchronization after offline editing.
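The position-shifting step described above can be sketched in a few lines. This is a deliberately minimal transform for plain-text insertions only, with a hypothetical `site` number added to break ties when two users insert at the same offset. Production OT systems also handle deletions, formatting, and server ordering, which this example leaves out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int   # character offset at which to insert
    text: str
    site: int  # hypothetical replica number, used only to break ties


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


def apply(doc: str, op: Insert) -> str:
    return doc[:op.pos] + op.text + doc[op.pos:]


doc = "quick brown fox"
a = Insert(pos=5, text=" red", site=1)   # one user inserts a word at position 5
b = Insert(pos=0, text="That ", site=2)  # another adds 5 characters at the start

moved = transform(a, b)
print(moved.pos)  # 10

# Each site applies its own edit first, then the transformed remote edit.
site_b = apply(apply(doc, b), moved)
site_a = apply(apply(doc, a), transform(b, a))
print(site_a == site_b, site_a)
```

Both sites end with the same text even though they applied the two edits in opposite orders, which is the consistency property OT is designed to maintain.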

If you want to dive deeper, the Conclave real-time editor (which is no longer maintained and therefore was not considered in our analysis) has another great high-level writeup explaining OT and CRDT. Additionally, you can watch or listen to our deep dive on OT vs. CRDT as part of our recent Tag1 Team Talk, “A Deep Dive into Yjs - Part 1”.

Why Tag1 Selected Yjs

While other solutions such as ShareDB, CKEditor, and ProseMirror Collab are well-supported and very capable solutions in their own right, these technologies didn’t satisfy the specific requirements of our client’s project. For instance, ShareDB relies on the same approach as Google Docs, operational transformation (OT), rather than relying on the comparatively more robust CRDT (at least for our requirements). CKEditor, one of the most capable solutions available today, relies on closed-source and proprietary dependencies. Leveraging an open-source solution was strongly preferred by our client for many reasons, foremost among them to meet any potential need by enhancing the software themselves, and they didn’t want to be tied to a single vendor for what they saw as a core technology to their application. Finally, ProseMirror’s Collab module does not guarantee conflict resolution, which can lead to merge conflicts in documents.

Ultimately, the Tag1 team opted to select Yjs, an implementation of commutative replicated data types (CRDT), due to its network agnosticism and conflict resolution guarantees. Not only can Yjs support offline and low-connectivity editing, it can also store documents in local databases on user devices (such as through IndexedDB) to ensure full availability without a stable internet connection. Because Yjs facilitates concurrent editing on tree structures, not just text, it integrates well with view libraries such as React. Also compelling is its support for use cases beyond simple text editing, including collaborative drawing and state-sharing for 3D models. Going beyond text editing to implement other collaborative features is an important future goal for the project.
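The conflict resolution guarantee and the offline support both follow from giving every character a stable identifier. The sketch below is a small text CRDT in the style of a replicated growable array, written for illustration. It is not Yjs's actual algorithm, it omits deletions, and the `Replica` and `Item` names are hypothetical. Each character records the identifier of the character it was typed after, and concurrent insertions at the same spot are ordered by identifier, so every replica arrives at the same text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Id = Tuple[int, str]  # (Lamport counter, replica name); unique per character


@dataclass
class Item:
    id: Id
    origin: Optional[Id]  # id of the character this one was typed after
    char: str


class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.clock = 0
        self.items: List[Item] = []

    def integrate(self, item: Item) -> None:
        """Apply an insert; its origin must already be known locally."""
        self.clock = max(self.clock, item.id[0])
        pos = 0
        if item.origin is not None:
            pos = next(i for i, it in enumerate(self.items)
                       if it.id == item.origin) + 1
        # Concurrent inserts after the same origin are ordered by id, so
        # every replica skips the same items and picks the same slot.
        while pos < len(self.items) and self.items[pos].id > item.id:
            pos += 1
        self.items.insert(pos, item)

    def insert(self, index: int, char: str) -> Item:
        """Local insert at a character index; returns the item to broadcast."""
        origin = self.items[index - 1].id if index > 0 else None
        self.clock += 1
        item = Item((self.clock, self.name), origin, char)
        self.integrate(item)
        return item

    def text(self) -> str:
        return "".join(it.char for it in self.items)


alice, bob = Replica("alice"), Replica("bob")
for i, ch in enumerate("ac"):  # shared starting text
    bob.integrate(alice.insert(i, ch))

# Both go offline and type at the same position.
alice_ops = [alice.insert(1, "1"), alice.insert(2, "2")]
bob_ops = [bob.insert(1, "x"), bob.insert(2, "y")]

# Back online, each replica applies the other's batch of edits.
for op in bob_ops:
    alice.integrate(op)
for op in alice_ops:
    bob.integrate(op)
print(alice.text(), bob.text())
```

Neither replica had to rewrite offsets for the other's edits; the identifiers alone decide where each character lands, which is what makes resynchronizing a batch of offline edits straightforward.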

Furthermore, because Yjs performs transactions on objects across a distributed system rather than on a centralized server, it avoids the problem of a single point of failure and is extremely scalable, with no limitations on the number of concurrent collaborators. Moreover, Yjs is one of the only stable and fully tested implementations of CRDT available, while many of its counterparts leverage OT instead.

Finally, because Yjs focuses on providing decentralized servers and connector technology rather than prescribing the front-end editor, there is no dependency on a particular rich text editor, and organizations can opt to swap out the editor in the future with minimal impact on other components in the architecture. It also makes it easy to use multiple editors. For instance, our project uses ProseMirror for collaborative rich text editing and CodeMirror for collaborative Markdown editing (and other text formats can be added easily).

Conclusion

Real-time collaborative editing surfaces unique difficulties for any organization seeking to implement content workflows at a large scale. Over the course of the past decade, many new solutions have emerged to challenge the prevailing approaches dependent on operational transformation. Today, for instance, offline editing and effective conflict resolution on slow connections are of paramount importance to content editors and stakeholders alike. These key requirements have led to an embrace of decentralized, peer-to-peer approaches to collaborative editing rather than a failure-prone central server.

Tag1 undertook a wide-ranging evaluation of available solutions for collaborative editing, including Yjs, ProseMirror’s Collab module, ShareDB, and CKEditor. In the end, Yjs emerged as the winner due to its implementation of CRDT, as well as its scalability and emphasis on network agnosticism and conflict resolution, areas where the other solutions sometimes fell short. While any robust evaluation of these solutions takes ample time, we at Tag1 hope that our assessment guides your thinking as you delve into real-time collaborative editing for your organization.

Special thanks to Fabian Franz, Kevin Jahns, Michael Meyers, and Jeffrey Gilbert for their feedback during the writing process.