Linux Administration: Deploying MongoDB

MongoDB is a NoSQL, document-oriented database that stores JSON-like documents with dynamic schemas in a binary format called BSON. MongoDB provides high availability with replica sets, which consist of two or more copies of the data. Each replica set member acts as either the primary or a secondary. By default all reads and writes go to the primary, but since every secondary holds a copy of the data (possibly stale, because secondaries are only eventually consistent), reads can be load balanced across the secondaries. When the primary fails, the replica set automatically holds an election to determine which secondary becomes the new primary.
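
To make the election step a little more concrete: a candidate needs the votes of a strict majority of the voting members to become primary. The helper names below are my own and nothing here talks to MongoDB; it is only a small sketch of the arithmetic, which shows why a three-node set survives the loss of one member while a two-node set survives none.

```python
def votes_needed(voting_members: int) -> int:
    """Strict majority of voting members required to elect a primary."""
    if voting_members < 1:
        raise ValueError("a replica set needs at least one voting member")
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be lost while a primary can still be elected."""
    return voting_members - votes_needed(voting_members)


# Print members, votes needed, and tolerated failures for a few set sizes.
for n in (2, 3, 4, 5):
    print(n, votes_needed(n), fault_tolerance(n))
```

Going from three to four members does not buy any extra fault tolerance, which is one reason odd member counts are the usual choice.
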
MongoDB scales horizontally using sharding. The user chooses a shard key, which determines how the data in a collection is distributed. The data is split into ranges based on the shard key, and the ranges are distributed across multiple shards. (A shard is a master with one or more slaves.) [1]
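
The range-based distribution can be sketched in a few lines. The split points and shard names below are made up for illustration (they are not taken from a real cluster), and the lookup is a plain binary search rather than anything MongoDB-specific. Each range includes its lower bound and excludes its upper bound.

```python
from bisect import bisect_right

# Hypothetical split points on the shard key; N split points define N + 1 ranges.
SPLIT_POINTS = [1000, 5000, 9000]
# Hypothetical owner of each range, listed in key order.
RANGE_OWNERS = ["shard-a", "shard-b", "shard-a", "shard-c"]


def shard_for(key: int) -> str:
    """Return the shard that owns the range containing key."""
    # bisect_right counts the split points <= key, which is the range index.
    return RANGE_OWNERS[bisect_right(SPLIT_POINTS, key)]


for key in (42, 1000, 4999, 5000, 123456):
    print(key, shard_for(key))
```

A shard can own several ranges, which is what lets ranges be moved around to even out the load.
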
MongoDB provides consistency as long as reads are performed on the primary rather than on the secondaries, so it can be placed roughly on the CP side of the CAP theorem triangle. Such classifications are coarse, though, and only help to answer the question "what is the default behaviour of the distributed system when a partition happens?" [2].

In this post I'll deploy a three-node replica set with sharding.

    [mongodb-n01,2,3]$ lsb_release -d
    Description:    Ubuntu 14.04.4 LTS
    [mongodb-n01,2,3]$ apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
    [mongodb-n01,2,3]$ echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
    [mongodb-n01,2,3]$ apt-get update && apt-get install -y mongodb-org && pkill mongo
    [mongodb-n01,2,3]$ sudo -u mongodb /usr/bin/mongod --config /etc/mongod.conf --fork

With the server installed on all nodes, connect to one of the nodes and set up the replica set:

    [mongodb-n01]$ mongo 10.183.2.172:27017
    MongoDB shell version: 3.2.6
    connecting to: 10.183.2.172:27017/test
    Welcome to the MongoDB shell.
    For interactive help, type "help".
    For more comprehensive documentation, see
        http://docs.mongodb.org/
    Questions? Try the support group
        http://groups.google.com/group/mongodb-user
    > cfg = {
    ... '_id':'testReplicaSet',
    ... 'members':[
    ... {'_id':0, 'host': '10.183.2.172:27017'},
    ... {'_id':1, 'host': '10.183.2.198:27017'},
    ... {'_id':2, 'host': '10.183.1.69:27017'}
    ... ]
    ... }
    {
        "_id" : "testReplicaSet",
        "members" : [
            { "_id" : 0, "host" : "10.183.2.172:27017" },
            { "_id" : 1, "host" : "10.183.2.198:27017" },
            { "_id" : 2, "host" : "10.183.1.69:27017" }
        ]
    }
    > rs.initiate(cfg)
    { "ok" : 1 }
    testReplicaSet:OTHER> rs.status()
    {
        "set" : "testReplicaSet",
        "date" : ISODate("2016-05-31T15:09:30.512Z"),
        "myState" : 1,
        "term" : NumberLong(1),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
            {
                "_id" : 0,
                "name" : "10.183.2.172:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 450,
                "optime" : { "ts" : Timestamp(1464707267, 2), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T15:07:47Z"),
                "infoMessage" : "could not find member to sync from",
                "electionTime" : Timestamp(1464707267, 1),
                "electionDate" : ISODate("2016-05-31T15:07:47Z"),
                "configVersion" : 1,
                "self" : true
            },
            {
                "_id" : 1,
                "name" : "10.183.2.198:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 114,
                "optime" : { "ts" : Timestamp(1464707267, 2), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T15:07:47Z"),
                "lastHeartbeat" : ISODate("2016-05-31T15:09:29.107Z"),
                "lastHeartbeatRecv" : ISODate("2016-05-31T15:09:29.356Z"),
                "pingMs" : NumberLong(0),
                "syncingTo" : "10.183.2.172:27017",
                "configVersion" : 1
            },
            {
                "_id" : 2,
                "name" : "10.183.1.69:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 114,
                "optime" : { "ts" : Timestamp(1464707267, 2), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T15:07:47Z"),
                "lastHeartbeat" : ISODate("2016-05-31T15:09:29.106Z"),
                "lastHeartbeatRecv" : ISODate("2016-05-31T15:09:29.356Z"),
                "pingMs" : NumberLong(0),
                "syncingTo" : "10.183.2.172:27017",
                "configVersion" : 1
            }
        ],
        "ok" : 1
    }
    testReplicaSet:PRIMARY>
    testReplicaSet:SECONDARY> rs.printSlaveReplicationInfo()
    source: 10.183.2.198:27017
        syncedTo: Tue May 31 2016 16:39:43 GMT+0000 (UTC)
        0 secs (0 hrs) behind the primary
    source: 10.183.1.69:27017
        syncedTo: Tue May 31 2016 16:39:43 GMT+0000 (UTC)
        0 secs (0 hrs) behind the primary

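If you would rather check the replica set health from a script than by eye, the status document is easy to summarize. The sketch below assumes the status has already been fetched into a plain Python dictionary with `optimeDate` values as `datetime` objects (roughly what a driver hands back for the `replSetGetStatus` command); the `summarize` function and the trimmed sample data are my own and only mirror the fields shown above.

```python
from datetime import datetime


def summarize(status: dict) -> dict:
    """Return the primary's name and each secondary's lag in seconds."""
    members = status["members"]
    # Raises StopIteration if no member currently reports itself as PRIMARY.
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    lag = {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }
    return {"primary": primary["name"], "lag_seconds": lag}


# Trimmed copy of the rs.status() output above.
applied = datetime(2016, 5, 31, 15, 7, 47)
status = {
    "set": "testReplicaSet",
    "members": [
        {"name": "10.183.2.172:27017", "stateStr": "PRIMARY", "optimeDate": applied},
        {"name": "10.183.2.198:27017", "stateStr": "SECONDARY", "optimeDate": applied},
        {"name": "10.183.1.69:27017", "stateStr": "SECONDARY", "optimeDate": applied},
    ],
}
print(summarize(status))
```

This computes the same kind of "seconds behind the primary" figure that rs.printSlaveReplicationInfo() reports.
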
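Note that rs.initiate() only succeeds on a mongod that was started with a replica set name. If it is not there already, add the following replication option to /etc/mongod.conf on all nodes and restart mongod before initiating the set: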

    [mongodb-n01,2,3]$ cat /etc/mongod.conf | grep repl
    replication:
      replSetName: testReplicaSet
    [mongodb-n01,2,3]$ pkill mongo && sudo -u mongodb /usr/bin/mongod --config /etc/mongod.conf --fork

With this we now have a simple three-node replica set. Connect to the primary and insert some data, then read it back from a secondary:

    [mongodb-n01]$ mongo 10.183.2.172
    testReplicaSet:PRIMARY> db.test_db.insert({_id:1, value:'test'})
    WriteResult({ "nInserted" : 1 })
    testReplicaSet:PRIMARY> db.test_db.findOne()
    { "_id" : 1, "value" : "test" }
    testReplicaSet:PRIMARY>
    [mongodb-n02]$ mongo 10.183.2.198
    testReplicaSet:SECONDARY> db.test_db.findOne()
    2016-05-31T15:15:20.486+0000 E QUERY    [thread1] Error: error: { "ok" : 0, "errmsg" : "not master and slaveOk=false", "code" : 13435 } :
    ...
    testReplicaSet:SECONDARY> rs.slaveOk(true)
    testReplicaSet:SECONDARY> db.test_db.findOne()
    { "_id" : 1, "value" : "test" }
    testReplicaSet:SECONDARY>
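
rs.slaveOk() only affects the current shell session. An application would normally ask for secondary reads through its connection string instead, using the standard `replicaSet` and `readPreference` options. The helper below is my own sketch; it only assembles the string and does not open a connection.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a MongoDB connection string for a replica set."""
    query = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + query


hosts = ["10.183.2.172:27017", "10.183.2.198:27017", "10.183.1.69:27017"]
print(replica_set_uri(hosts, "testReplicaSet", "secondaryPreferred"))
```

With `secondaryPreferred` the driver reads from a secondary when one is available and falls back to the primary otherwise, so the stale-read caveat from the introduction applies.
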

MongoDB also supports tagged replica sets. Tag sets let you target read operations at specific members of a replica set and define custom write concerns. A write concern specifies which members must acknowledge a write before it is reported as successful, which allows your application to detect insertion errors or unavailable mongod instances. Read preferences consider the value of a tag when selecting a member to read from.

To enable tags:

    testReplicaSet:PRIMARY> var conf = rs.conf()
    testReplicaSet:PRIMARY> conf.members[0].tags = {'datacenter': 'iad3', 'rack_id': '342543'}
    { "datacenter" : "iad3", "rack_id" : "342543" }
    testReplicaSet:PRIMARY> conf.members[1].tags = {'datacenter': 'iad3', 'rack_id': '342544'}
    { "datacenter" : "iad3", "rack_id" : "342544" }
    testReplicaSet:PRIMARY> conf.members[2].tags = {'datacenter': 'ord1', 'rack_id': '733421'}
    { "datacenter" : "ord1", "rack_id" : "733421" }
    testReplicaSet:PRIMARY> conf.settings = {
    ... 'getLastErrorModes' : {
    ... 'MultiDC':{datacenter : 2},
    ... 'MultiRack':{rack_id : 2}
    ... }
    ... }
    {
        "getLastErrorModes" : {
            "MultiDC" : { "datacenter" : 2 },
            "MultiRack" : { "rack_id" : 2 }
        }
    }
    testReplicaSet:PRIMARY> rs.reconfig(conf)
    { "ok" : 1 }
    testReplicaSet:PRIMARY>
    testReplicaSet:PRIMARY> rs.config()
    {
        "_id" : "testReplicaSet",
        "version" : 2,
        "protocolVersion" : NumberLong(1),
        "members" : [
            {
                "_id" : 0,
                "host" : "10.183.2.172:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : { "datacenter" : "iad3", "rack_id" : "342543" },
                "slaveDelay" : NumberLong(0),
                "votes" : 1
            },
            {
                "_id" : 1,
                "host" : "10.183.2.198:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : { "datacenter" : "iad3", "rack_id" : "342544" },
                "slaveDelay" : NumberLong(0),
                "votes" : 1
            },
            {
                "_id" : 2,
                "host" : "10.183.1.69:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : { "datacenter" : "ord1", "rack_id" : "733421" },
                "slaveDelay" : NumberLong(0),
                "votes" : 1
            }
        ],
        "settings" : {
            "chainingAllowed" : true,
            "heartbeatIntervalMillis" : 2000,
            "heartbeatTimeoutSecs" : 10,
            "electionTimeoutMillis" : 10000,
            "getLastErrorModes" : {
                "MultiRack" : { "rack_id" : 2 },
                "MultiDC" : { "datacenter" : 2 }
            },
            "getLastErrorDefaults" : { "w" : 1, "wtimeout" : 0 },
            "replicaSetId" : ObjectId("574da8b80c314d4ba0a53190")
        }
    }
    testReplicaSet:PRIMARY>

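The number next to a tag in getLastErrorModes is the number of distinct values of that tag that must acknowledge a write. With the configuration above, a write issued with the write concern w: "MultiDC" succeeds only after it has reached members in two different DCs (here, both iad3 and ord1), and a write issued with w: "MultiRack" only after it has reached members in at least two different racks in any DC. The modes apply only to writes that ask for them by name; the default write concern is still w: 1.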
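
To see how such a mode is evaluated, here is a small model of the check. It is only a sketch in plain Python, not MongoDB's implementation: the tags and modes are copied from the reconfiguration above, and `mode_satisfied` is a name I made up. A mode is satisfied once the acknowledging members cover the required number of distinct values for every tag it lists.

```python
# Member tags copied from the reconfiguration above.
MEMBER_TAGS = {
    "10.183.2.172:27017": {"datacenter": "iad3", "rack_id": "342543"},
    "10.183.2.198:27017": {"datacenter": "iad3", "rack_id": "342544"},
    "10.183.1.69:27017": {"datacenter": "ord1", "rack_id": "733421"},
}
MODES = {"MultiDC": {"datacenter": 2}, "MultiRack": {"rack_id": 2}}


def mode_satisfied(mode: str, acknowledged: list) -> bool:
    """True if the acknowledging members cover enough distinct tag values."""
    for tag, required in MODES[mode].items():
        distinct = {MEMBER_TAGS[m][tag] for m in acknowledged if tag in MEMBER_TAGS[m]}
        if len(distinct) < required:
            return False
    return True


same_dc = ["10.183.2.172:27017", "10.183.2.198:27017"]
print(mode_satisfied("MultiDC", same_dc), mode_satisfied("MultiRack", same_dc))
```

The two iad3 members sit in different racks but in the same DC, so together they satisfy MultiRack and not MultiDC.
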

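The configuration so far does not include any user authentication or authorization. To enable it, let's create an admin user and two regular users, one read-only and one read/write. A role given as a plain string applies to the database the user is created in, which in the session below is admin for all three users, so switch with use test first if the regular users are meant for the test database: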

    [mongodb-n01]$ mongo 10.183.2.172
    testReplicaSet:PRIMARY> use admin
    switched to db admin
    testReplicaSet:PRIMARY> db.createUser({
    ... user:'admin', pwd:'supersercretpassword',
    ... customData:{desc:'The admin user for the admin database'},
    ... roles:['readWrite','dbAdmin','clusterAdmin']
    ... })
    Successfully added user: {
        "user" : "admin",
        "customData" : { "desc" : "The admin user for the admin database" },
        "roles" : [ "readWrite", "dbAdmin", "clusterAdmin" ]
    }
    testReplicaSet:PRIMARY> show dbs
    admin  0.000GB
    local  0.000GB
    test   0.000GB
    testReplicaSet:PRIMARY> show collections
    system.users
    system.version
    testReplicaSet:PRIMARY> db.createUser({
    ... user:'read_user', pwd:'readuserpassword',
    ... customData:{desc:'The read only user for the test database'},
    ... roles:['read']
    ... })
    Successfully added user: {
        "user" : "read_user",
        "customData" : { "desc" : "The read only user for the test database" },
        "roles" : [ "read" ]
    }
    testReplicaSet:PRIMARY> db.createUser({
    ... user:'write_user', pwd:'writeuserpassword',
    ... customData:{desc:'The read/write user for the test database'},
    ... roles:['readWrite']
    ... })
    Successfully added user: {
        "user" : "write_user",
        "customData" : { "desc" : "The read/write user for the test database" },
        "roles" : [ "readWrite" ]
    }

To ensure that only trusted servers can join the replica set, let's create a key file, copy it to every node, and add a "security" section to the config:

    [mongodb-n01,2,3]$ cat /etc/mongod.conf | grep -A4 security
    security:
      keyFile: /etc/mongo.key
      clusterAuthMode: keyFile
      authorization: enabled
    [mongodb-n01]$ openssl rand -hex 10 > /etc/mongo.key   # copy to node 2 and 3
    # mongod refuses a key file that is readable by group or others, and it must be readable by the mongodb user
    [mongodb-n01,2,3]$ chown mongodb:mongodb /etc/mongo.key && chmod 600 /etc/mongo.key
    [mongodb-n01,2,3]$ pkill mongo && sudo -u mongodb /usr/bin/mongod --config /etc/mongod.conf --fork
    [mongodb-n01,2,3]$ mongo 10.183.2.172:27017 -u admin -p supersercretpassword --authenticationDatabase admin
    MongoDB shell version: 3.2.6
    connecting to: 10.183.2.172:27017/test
    testReplicaSet:PRIMARY>
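
If you generate the key from a provisioning script instead of by hand, the same idea fits in a few lines. This is a sketch under my own assumptions: `write_key_file` is a made-up helper, it uses a longer key than the openssl command above (the key has to be between 6 and 1024 characters from the base64 set, which hex digits satisfy), and it relies on POSIX file modes. The file is created with mode 0600 from the start, so it is never readable by group or others.

```python
import os
import secrets
import tempfile


def write_key_file(path: str, num_bytes: int = 32) -> str:
    """Create path with mode 0600 and fill it with a random hex key."""
    key = secrets.token_hex(num_bytes)  # two hex characters per byte
    # O_EXCL refuses to overwrite an existing key; 0o600 keeps group/others out.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key + "\n")
    return key


# Demo in a throwaway directory; a real deployment would target /etc/mongo.key.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "mongo.key")
    write_key_file(demo_path)
    print(oct(os.stat(demo_path).st_mode & 0o777))
```

The file still has to be copied to the other nodes and owned by the user that runs mongod.
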

To set up a sharded cluster I'll use the three MongoDB data servers set up previously, three config servers, and one mongos (the LB node that will route reads and writes to the shards). To set up the three config servers:

    [mongo-config-n01,2,3] echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" > /etc/apt/sources.list.d/mongodb-org-3.2.list
    [mongo-config-n01,2,3] apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
    [mongo-config-n01,2,3] apt-get update && apt-get install -y mongodb-org
    [mongo-config-n01,2,3] pkill mongo && sudo -u mongodb /usr/bin/mongod --config /etc/mongod.conf --fork
    [mongo-config-n01,2,3] cat /etc/mongod.conf | egrep -A2 "replication|sharding"
    replication:
      replSetName: shardReplicaSet
    sharding:
      clusterRole: configsvr
    [mongo-config-n01]$ mongo 10.183.3.6
    MongoDB shell version: 3.2.6
    connecting to: 10.183.3.6/test
    > rs.initiate( {
    ... _id: "shardReplicaSet",
    ... configsvr: true,
    ... members: [
    ... { _id: 0, host: "10.183.3.6:27017" },
    ... { _id: 1, host: "10.183.3.10:27017" },
    ... { _id: 2, host: "10.183.3.46:27017" }
    ... ]
    ... } )
    { "ok" : 1 }
    shardReplicaSet:PRIMARY> rs.status()
    {
        "set" : "shardReplicaSet",
        "date" : ISODate("2016-05-31T19:27:31.272Z"),
        "myState" : 1,
        "term" : NumberLong(1),
        "configsvr" : true,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
            {
                "_id" : 0,
                "name" : "10.183.3.6:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 1122,
                "optime" : { "ts" : Timestamp(1464722365, 1), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T19:19:25Z"),
                "electionTime" : Timestamp(1464721787, 2),
                "electionDate" : ISODate("2016-05-31T19:09:47Z"),
                "configVersion" : 2,
                "self" : true
            },
            {
                "_id" : 1,
                "name" : "10.183.3.10:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 485,
                "optime" : { "ts" : Timestamp(1464722365, 1), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T19:19:25Z"),
                "lastHeartbeat" : ISODate("2016-05-31T19:27:29.854Z"),
                "lastHeartbeatRecv" : ISODate("2016-05-31T19:27:26.689Z"),
                "pingMs" : NumberLong(0),
                "configVersion" : 2
            },
            {
                "_id" : 2,
                "name" : "10.183.3.46:27017",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 485,
                "optime" : { "ts" : Timestamp(1464722365, 1), "t" : NumberLong(1) },
                "optimeDate" : ISODate("2016-05-31T19:19:25Z"),
                "lastHeartbeat" : ISODate("2016-05-31T19:27:30.029Z"),
                "lastHeartbeatRecv" : ISODate("2016-05-31T19:27:26.702Z"),
                "pingMs" : NumberLong(1),
                "configVersion" : 2
            }
        ],
        "ok" : 1
    }
    shardReplicaSet:PRIMARY>

To configure the mongos server:

    [mongos-n01] echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse" > /etc/apt/sources.list.d/mongodb-org-3.2.list
    [mongos-n01] apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
    [mongos-n01] apt-get update && apt-get install -y mongodb-org && pkill mongo
    [mongos-n01] cat /etc/mongos.conf | grep -A2 sharding
    sharding:
      configDB: shardReplicaSet/10.183.3.6:27017,10.183.3.10:27017,10.183.3.46:27017
    [mongos-n01] sudo -u mongodb /usr/bin/mongos --config /etc/mongos.conf --fork
    [mongos-n01] cat /var/log/mongodb/mongod.log
    2016-05-31T19:39:42.836+0000 I CONTROL  [main] ***** SERVER RESTARTED *****
    2016-05-31T19:39:42.843+0000 I SHARDING [mongosMain] MongoS version 3.2.6 starting: pid=3243 port=27017 64-bit host=mongos-n01 (--help for usage)
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] db version v3.2.6
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] git version: 05552b562c7a0b3143a729aaa0838e558dc49b25
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] OpenSSL version: OpenSSL 1.0.1f 6 Jan 2014
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] allocator: tcmalloc
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] modules: none
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] build environment:
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain]     distmod: ubuntu1404
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain]     distarch: x86_64
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain]     target_arch: x86_64
    2016-05-31T19:39:42.843+0000 I CONTROL  [mongosMain] options: { config: "/etc/mongos.conf", net: { bindIp: "10.183.2.10", port: 27017 }, processManagement: { fork: true }, sharding: { configDB: "shardReplicaSet/10.183.3.6:27017,10.183.3.10:27017,10.183.3.46:27017" }, systemLog: { destination: "file", logAppend: true, path: "/var/log/mongodb/mongod.log" } }
    2016-05-31T19:39:42.844+0000 I SHARDING [mongosMain] Updating config server connection string to: shardReplicaSet/10.183.3.6:27017,10.183.3.10:27017,10.183.3.46:27017
    2016-05-31T19:39:42.844+0000 I NETWORK  [mongosMain] Starting new replica set monitor for shardReplicaSet/10.183.3.10:27017,10.183.3.46:27017,10.183.3.6:27017
    2016-05-31T19:39:42.844+0000 I NETWORK  [ReplicaSetMonitorWatcher] starting
    2016-05-31T19:39:42.846+0000 I SHARDING [thread1] creating distributed lock ping thread for process mongos-n01:27017:1464723582:-798954996 (sleeping for 30000ms)
    2016-05-31T19:39:42.852+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to 10.183.3.6:27017
    2016-05-31T19:39:42.854+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to 10.183.3.6:27017
    2016-05-31T19:39:42.861+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to 10.183.3.46:27017
    2016-05-31T19:39:47.051+0000 W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: LockStateChangeFailed: findAndModify query predicate didn't match any lock document
    2016-05-31T19:39:47.200+0000 I NETWORK  [HostnameCanonicalizationWorker] Starting hostname canonicalization worker
    2016-05-31T19:39:47.203+0000 I SHARDING [Balancer] about to contact config servers and shards
    2016-05-31T19:39:47.204+0000 I SHARDING [Balancer] config servers and shards contacted successfully
    2016-05-31T19:39:47.204+0000 I SHARDING [Balancer] balancer id: mongos-n01:27017 started
    2016-05-31T19:39:47.225+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to 10.183.3.10:27017
    2016-05-31T19:39:47.236+0000 I NETWORK  [mongosMain] waiting for connections on port 27017
    2016-05-31T19:39:47.966+0000 I SHARDING [Balancer] distributed lock 'balancer' acquired for 'doing balance round', ts : 574de88334f6a5aefbb74e5d
    2016-05-31T19:39:47.971+0000 I SHARDING [Balancer] distributed lock with ts: 574de88334f6a5aefbb74e5d' unlocked.
    2016-05-31T19:39:57.984+0000 I SHARDING [Balancer] distributed lock 'balancer' acquired for 'doing balance round', ts : 574de88d34f6a5aefbb74e5e
    [mongos-n01] mongo 10.183.2.10
    mongos> sh.addShard("testReplicaSet/10.183.2.172:27017")
    { "shardAdded" : "testReplicaSet", "ok" : 1 }
    mongos> sh.addShard("testReplicaSet/10.183.2.198:27017")
    { "shardAdded" : "testReplicaSet", "ok" : 1 }
    mongos> sh.addShard("testReplicaSet/10.183.1.69:27017")
    { "shardAdded" : "testReplicaSet", "ok" : 1 }
    mongos>
    mongos> sh.status()
    --- Sharding Status ---
      sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("574de87e34f6a5aefbb74e5b")
      }
      shards:
        { "_id" : "testReplicaSet", "host" : "testReplicaSet/10.183.1.69:27017,10.183.2.172:27017,10.183.2.198:27017" }
      active mongoses:
        "3.2.6" : 1
      balancer:
        Currently enabled: yes
        Currently running: no
        Failed balancer rounds in last 5 attempts: 0
        Migration Results for the last 24 hours:
          No recent migrations
      databases:
        { "_id" : "test", "primary" : "testReplicaSet", "partitioned" : false }
    mongos>
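
Both the configDB setting and sh.addShard() take the same "replica set name, slash, comma-separated host:port list" string, so when scripting a deployment it can be handy to build it in one place. The `seed_string` helper below is hypothetical and only formats text; the addresses are the ones used in this post.

```python
def seed_string(replica_set: str, hosts: list, port: int = 27017) -> str:
    """Return the "<replica set>/<host:port>,<host:port>" form used by mongos."""
    if not hosts:
        raise ValueError("at least one host is required")
    return replica_set + "/" + ",".join(f"{host}:{port}" for host in hosts)


# Value for sharding.configDB in /etc/mongos.conf.
print(seed_string("shardReplicaSet", ["10.183.3.6", "10.183.3.10", "10.183.3.46"]))
# Argument for sh.addShard().
print(seed_string("testReplicaSet", ["10.183.2.172"]))
```

In the sh.status() output above, the three sh.addShard() calls resulted in a single shard entry that lists all three hosts, which suggests that adding the replica set once, through any one member, would have been enough.
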

Make sure you have the correct users created on all servers:

    mongo> admin = db.getSiblingDB("admin")
    mongo> admin.createUser(
      {
        user: "konstantin",
        pwd: "changeit",
        roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
      }
    )
    mongo> db.getSiblingDB("admin").auth("konstantin", "changeit" )
    mongo> db.getSiblingDB("admin").createUser(
      {
        "user" : "cluster_admin",
        "pwd" : "changeit",
        roles: [ { "role" : "clusterAdmin", "db" : "admin" } ]
      }
    )
    mongo>

And finally all configs at a minimum should look something like this:

    [mongodb-n01,2,3] cat /etc/mongod.conf
    storage:
      dbPath: /var/lib/mongodb
      journal:
        enabled: true
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongod.log
    net:
      port: 27017
      bindIp: 10.183.2.172
    security:
      keyFile: /etc/mongo.key
      clusterAuthMode: keyFile
      authorization: enabled
    replication:
      replSetName: testReplicaSet
    sharding:
      clusterRole: shardsvr
    [mongo-config-n01,2,3] cat /etc/mongod.conf
    storage:
      dbPath: /var/lib/mongodb
      journal:
        enabled: true
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongod.log
    net:
      port: 27017
      bindIp: 10.183.3.6
    security:
      keyFile: /etc/mongo.key
      clusterAuthMode: keyFile
      authorization: enabled
    replication:
      replSetName: shardReplicaSet
    sharding:
      clusterRole: configsvr
    [mongos-n01] cat /etc/mongos.conf
    systemLog:
      destination: file
      logAppend: true
      path: /var/log/mongodb/mongod.log
    net:
      port: 27017
      bindIp: 10.183.2.10
    security:
      keyFile: /etc/mongo.key
    sharding:
      configDB: shardReplicaSet/10.183.3.6:27017,10.183.3.10:27017,10.183.3.46:27017

Resources:

[1]. https://en.wikipedia.org/wiki/MongoDB
[2]. https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
