Replication in MongoDB

June 18, 2021

Many of today's systems, such as social networks and banking platforms, handle large volumes of data and users and must remain operational through events such as power outages or network equipment failures. Imagine such an incident at your bank making all your money disappear, or all your photos on your favorite social network being suddenly erased.

In any environment prone to failures, these situations can and do occur. Users rarely notice them, however, because service providers have implemented replication and high availability systems to absorb such failures.

What is Replication?

Replication is the process of copying and maintaining database objects across multiple databases that together form a distributed system. It improves the availability of applications, and in some cases their read performance, by providing alternative access paths to the same data. Most modern database management systems offer mechanisms for high availability and replication. However, many, if not most, rely on third-party tools to provide a robust and efficient mechanism, which can make setup and testing tedious for programmers.

Fortunately, the creators and contributors of MongoDB have made it relatively simple to achieve high availability and replication. In MongoDB, replication provides high availability and fault tolerance natively and transparently to the applications that use it as a database manager. This means that programmers do not need to manage what happens behind the scenes in order to get a robust and efficient system.

Understanding Replica Sets

Replication in MongoDB is built around a group of instances, or nodes, called a replica set. At least three nodes are recommended for a production replica set, because electing a new primary after a failure requires a majority of the voting members. With only two nodes, the loss of either one leaves the survivor without a majority, so no new primary can be elected and the set stops accepting writes.
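The majority arithmetic behind this recommendation can be sketched in a few lines of plain JavaScript. This is only an illustration: the function names majorityOf and faultTolerance are made up for this example and are not part of MongoDB.

```javascript
// Votes needed to elect a primary among `votingMembers` voters.
function majorityOf(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many voting members can be lost while a majority still remains.
function faultTolerance(votingMembers) {
  return votingMembers - majorityOf(votingMembers);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(
    `${n} member(s): majority = ${majorityOf(n)}, tolerated failures = ${faultTolerance(n)}`
  );
}
```

Note that two members tolerate no failures at all, and that a fourth member adds no fault tolerance over three, which is why odd member counts are the usual choice.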

Types of Nodes

Regular Nodes: These nodes contain the data and can be either primary or secondary.
Arbiter Nodes: These nodes participate only in elections and do not store data. They help in choosing a new primary in case of a failure.
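Delayed Nodes: These nodes deliberately apply operations a configured interval behind the other nodes and are used for disaster recovery, for example after an accidental deletion.
Hidden Nodes: These nodes are invisible to client applications, so they never serve ordinary application reads. They are typically dedicated to analytics, reporting, or backups.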
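As a rough illustration, the node types above map onto per-member options in a replica set configuration document. The sketch below is plain JavaScript: the hostnames are hypothetical, describeMember is a helper invented for this example, and the delay field is named secondaryDelaySecs on MongoDB 5.0 and later (older releases call it slaveDelay).

```javascript
// Hypothetical hosts; the option names follow the replica set configuration document.
const exampleConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb0.rootstack.com:27017" }, // regular
    { _id: 1, host: "mongodb1.rootstack.com:27017" }, // regular
    { _id: 2, host: "mongodb2.rootstack.com:27017", arbiterOnly: true }, // votes only, no data
    // Delayed: stays one hour behind, cannot become primary, hidden from clients.
    { _id: 3, host: "mongodb3.rootstack.com:27017", priority: 0, hidden: true, secondaryDelaySecs: 3600 },
    // Hidden: up to date, but invisible to applications and never primary.
    { _id: 4, host: "mongodb4.rootstack.com:27017", priority: 0, hidden: true },
  ],
};

// Illustrative helper: classify a member by the options set on it.
function describeMember(member) {
  if (member.arbiterOnly) return "arbiter";
  if (member.secondaryDelaySecs > 0) return "delayed";
  if (member.hidden) return "hidden";
  return "regular";
}

for (const member of exampleConfig.members) {
  console.log(`${member.host}: ${describeMember(member)}`);
}
```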

The Replication Process

MongoDB maintains a special capped collection called the "oplog" (operations log) that records every operation that modifies data. Write operations are first executed on the primary node, and the secondary nodes then asynchronously copy and apply these operations from the oplog. Every data-bearing member of the replica set keeps its own copy of the oplog in the collection local.oplog.rs to keep its databases updated. Members also exchange heartbeats (pings) to monitor each other's state, which is how the failure of a primary is detected.

In the case of a failure, if a node "A" rejoins as a secondary after a significant period and the oplog has advanced on the new primary "B", node "A" copies the oplog entries it is missing from "B" to catch up. If "A" has fallen so far behind that the entries it needs are no longer in the oplog, it must be resynchronized from scratch. MongoDB implements two types of synchronization:

Initial Synchronization: This loads new members with all the data in the set.
Replication: This keeps the nodes updated after the initial synchronization.
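The catch-up behavior described above can be pictured with a toy model in plain JavaScript. It is a deliberate simplification, not MongoDB's implementation: the ToyNode class, its sequence numbers, and the "set"/"delete" operations are all invented for this sketch, and real oplog entries are documents stored in local.oplog.rs.

```javascript
// Toy model only: each node keeps an ordered log of operations and the
// data that results from applying them.
class ToyNode {
  constructor(name) {
    this.name = name;
    this.oplog = []; // operations applied on this node, in order
    this.data = {};  // key -> value state built from the oplog
  }

  // Apply one operation to local data and record it in the local oplog.
  apply(op) {
    if (op.type === "set") this.data[op.key] = op.value;
    if (op.type === "delete") delete this.data[op.key];
    this.oplog.push(op);
  }

  // Primary-side write: stamp the operation with the next sequence number.
  write(type, key, value) {
    this.apply({ seq: this.oplog.length + 1, type, key, value });
  }

  // Secondary-side pull: copy every entry newer than the last one applied here.
  syncFrom(source) {
    const missing = source.oplog.slice(this.oplog.length);
    for (const op of missing) this.apply(op);
    return missing.length;
  }
}

const nodeB = new ToyNode("B"); // acts as primary
const nodeA = new ToyNode("A"); // acts as secondary

nodeB.write("set", "balance", 100);
nodeB.write("set", "owner", "ana");
console.log(nodeA.syncFrom(nodeB)); // number of entries copied while in sync

nodeB.write("delete", "owner");     // "A" misses this write while offline
console.log(nodeA.syncFrom(nodeB)); // number of entries copied to catch up
console.log(nodeA.data);
```

A brand-new ToyNode that calls syncFrom receives the whole log, which loosely mirrors the idea of an initial synchronization followed by ongoing replication.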

Write and Acknowledgment Operations

By default, write operations are directed to the primary node. The level of acknowledgment requested for a write, known as its write concern, is set through the w parameter:

0: Requests no acknowledgment of the write, so the operation returns immediately without confirming that it succeeded.
1: Returns a successful status once the primary node acknowledges the write. This was the default before MongoDB 5.0.
majority: Returns a successful status only after a majority of nodes acknowledge the write operation. This is the default since MongoDB 5.0 in most configurations.
n: Returns a successful status only after the specified number of nodes acknowledge the write operation.

It is crucial to note that if there is no primary node, writes cannot be completed. In addition, MongoDB may need to roll back writes when a former primary rejoins the set holding operations that were never replicated to the new primary.
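To make the acknowledgment levels concrete, here is a small model in plain JavaScript of when a write would be reported as successful. It is a sketch under simplifying assumptions: isWriteAcknowledged is a hypothetical function, and it treats every voting member as data-bearing.

```javascript
// `w` is 0, a positive integer, or "majority"; `acks` is how many members
// have confirmed the write; `votingMembers` is the size of the voting set.
// Simplification: every voting member is assumed to hold data.
function isWriteAcknowledged(w, acks, votingMembers) {
  if (w === 0) return true; // no acknowledgment requested
  if (w === "majority") return acks >= Math.floor(votingMembers / 2) + 1;
  return acks >= w;         // numeric w: that many members must confirm
}

// Three-member set where only the primary has confirmed the write so far.
console.log(isWriteAcknowledged(1, 1, 3));
console.log(isWriteAcknowledged("majority", 1, 3));
console.log(isWriteAcknowledged("majority", 2, 3));
```

In the MongoDB shell, the write concern can be passed per operation, for example db.orders.insertOne(doc, { writeConcern: { w: "majority", wtimeout: 5000 } }), where orders and doc are placeholder names.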

Read Preferences

By default, MongoDB reads data from the primary to ensure strong consistency. However, this behavior can be modified according to the application's needs:

primary: Default mode; all read operations are directed to the primary.
primaryPreferred: Reads from the primary if it is available; otherwise, reads from secondary nodes.
secondary: All read operations are directed to secondary nodes.
secondaryPreferred: Reads from secondary nodes if any are available; otherwise, reads from the primary.
nearest: Reads from the member of the replica set with the lowest network latency, regardless of whether it is primary or secondary.
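The five modes can be summarized as a selection function. The sketch below is a simplified model in plain JavaScript, not driver code: pickReadTarget and the member objects are hypothetical, only reachable members are listed, and always taking the single lowest-latency member is an assumption made to keep the example short.

```javascript
// members: [{ host, state: "PRIMARY" | "SECONDARY", latencyMs }]
// Returns the member a read would be routed to, or null if none qualifies.
function pickReadTarget(mode, members) {
  const byLatency = (a, b) => a.latencyMs - b.latencyMs;
  const primary = members.find((m) => m.state === "PRIMARY") ?? null;
  const secondary =
    members.filter((m) => m.state === "SECONDARY").sort(byLatency)[0] ?? null;

  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary ?? secondary;
    case "secondary":
      return secondary;
    case "secondaryPreferred":
      return secondary ?? primary;
    case "nearest":
      return [...members].sort(byLatency)[0] ?? null;
    default:
      throw new Error(`unknown read preference: ${mode}`);
  }
}

const reachableMembers = [
  { host: "mongodb0.rootstack.com", state: "PRIMARY", latencyMs: 20 },
  { host: "mongodb1.rootstack.com", state: "SECONDARY", latencyMs: 5 },
  { host: "mongodb2.rootstack.com", state: "SECONDARY", latencyMs: 12 },
];

for (const mode of ["primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"]) {
  console.log(mode, "->", pickReadTarget(mode, reachableMembers).host);
}
```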

Considerations When Using Replica Sets

When building applications on top of a replica set, several aspects should be considered:

Node Lists: Drivers must know the members of the replica set to function correctly. A list of members is supplied when the driver connects, and the driver uses it to discover the rest of the set.
Read Preferences: Applications that read from secondary nodes should be prepared to handle data that may be outdated, because replication is asynchronous.
Write Acknowledgment: If a write requests acknowledgment from more nodes than are currently available and no timeout is set, the driver might wait indefinitely for a response, which could be critical.
Error Handling: Applications must be equipped to manage various exceptions, including network errors and MongoDB configuration issues.
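Several of these considerations meet in the connection string handed to a driver. The following plain JavaScript sketch assembles one: buildReplicaSetUri is a hypothetical helper and the hostnames are illustrative, while replicaSet, readPreference, w, and wtimeoutMS are standard MongoDB connection string options.

```javascript
// Build a replica set connection string from a seed list and options.
function buildReplicaSetUri(hosts, replicaSet, options = {}) {
  const params = new URLSearchParams({ replicaSet, ...options });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const uri = buildReplicaSetUri(
  [
    "mongodb0.rootstack.com:27017",
    "mongodb1.rootstack.com:27017",
    "mongodb2.rootstack.com:27017",
  ],
  "rs0",
  {
    readPreference: "secondaryPreferred", // reads may return slightly stale data
    w: "majority",                        // wait for a majority to acknowledge writes
    wtimeoutMS: 5000,                     // give up waiting after 5 seconds
  }
);

console.log(uri);
```

Setting wtimeoutMS bounds how long a write waits for acknowledgment, so the application receives an error to handle instead of blocking.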

Setting Up a Replica Set

To create a replica set from the MongoDB console, follow these steps:

Start each member with the same replica set name by running this command on each node:
mongod --replSet "rs0"
Initiate the replica set from the console of one of the members:
rs.initiate();
Check the configuration of the replica set:
rs.conf();
You should see a result similar to:
{ "_id" : "rs0", "version" : 1, "members" : [ { "_id" : 0, "host" : "mongodb0.rootstack.com:27017" } ] }
Add the remaining instances to the replica set:
rs.add("mongodb1.rootstack.com");
rs.add("mongodb2.rootstack.com");
rs.add("mongodbN.rootstack.com");
Verify that the replica set is fully functional by checking its status:
rs.status();
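As an alternative to adding members one at a time, all members can be listed in a single configuration document. The sketch below, in plain JavaScript, builds such a document from a list of hosts: buildInitiateConfig is a hypothetical helper and the hostnames are illustrative.

```javascript
// Build a replica set configuration document from a list of "host:port" strings.
function buildInitiateConfig(setName, hosts) {
  return {
    _id: setName, // must match the name passed to mongod --replSet
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const initiateConfig = buildInitiateConfig("rs0", [
  "mongodb0.rootstack.com:27017",
  "mongodb1.rootstack.com:27017",
  "mongodb2.rootstack.com:27017",
]);

console.log(JSON.stringify(initiateConfig, null, 2));
```

In the MongoDB console, a document of this shape can be passed directly as rs.initiate(initiateConfig), after which rs.status() should list all three members.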

Conclusion

The high availability system in MongoDB is convenient, easy to deploy, robust, and efficient. It provides a distributed environment without the need for complex configurations across numerous components. As the project continues to improve and evolve with the needs of programmers, MongoDB has earned trust as a database management solution for companies across various sectors. In future posts, I will discuss the aggregation framework, a feature that enables SQL-like query operations in MongoDB, a NoSQL database.
