Deploying MongoDB

MongoDB is a NoSQL document-oriented database that uses JSON-like documents with dynamic schemas called BSON. MongoDB provides high availability with replica sets that consist of two or more copies of the data. Each replica set member may act in the role of primary or secondary. All writes and reads are done on the primary replica by default, but since all secondaries contain copy of the data (although it might be stale, since data is only eventually consistent) reads can be load balanced to the secondaries. When a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary.
MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection will be distributed. The data is split into ranges (based on the shard key) and distributed across multiple shards. (A shard is a master with one or more slaves.). [1]
MongoDB provides consistency (if reads are performed on the primary and not the secondaries) and can be roughly placed on the CAP theorem triangle as CP though such classifications are rather rough, and it should only help answering the questoin "what is the default behaviour of the distributed system when a partition happens" [2]:


In this post I'll deploy a three node replica set with sharding.
With the server installed on all nodes, connect to one of the nodes and set up the replica set:
Add the following replication config option to /etc/mongod.conf on all nodes and restart mongo:
With this we now have a simple 3 node replica set. Connect to the master and insert some data:
MongoDB also supports tagged replica sets. Tag sets allow you to target read operations to specific members of a replica set and specifies whether a write operation has succeeded. Write concern allows your application to detect insertion errors or unavailable mongod instances. Read preferences consider the value of a tag when selecting a member to read from.

To enable tags:
In the configuration above we ensure the data gets replicated to at least one server in each DC and the writes propagate to at least two racks in any DC.

The configuration so far does not contain any user authentication or authorization. To enable it let's create an admin user, and two regular users that can read and read/write:
To ensure only trusted servers can join the replica set, let's create a unique key file and reconfigure the replica set, by adding the "security" section in the config:
To set up a sharded cluster I'll use the 3 MongoDB data servers as setup previously, 3 config servers and one mongos (the LB node that will distribute reads and writes to the shards). To setup the 3 config servers:
To configure the mongos server:
Make sure you have the correct users created on all servers:
And finally all configs at a minimum should look something like this:

Resources:

[1]. https://en.wikipedia.org/wiki/MongoDB
[2]. https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html

No comments:

Post a Comment