Version: 1.0.0

MongoDB Benchmark Report

Legacy Documentation

This report was produced using the original Stroppy framework. The methodology, test infrastructure, and tooling differ from the current CLI version.

Introduction

MongoDB is a schema-less document-oriented database. A pioneer of the document data model, MongoDB defined much of what we understand by "NoSQL" databases today.

MongoDB's building block for horizontal scaling is the replica set — a set of nodes each having a full copy of the data, with one taking the leading role. When data outgrows a single replica set, another is added and data is redistributed (sharding).

Query routing and execution are managed by query routers (mongos) and configuration servers. The configuration servers can form their own replica set for high availability. The different node roles make a MongoDB cluster trickier to configure, but also more flexible, than homogeneous clusters.
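The sharding described above routes each document by its shard key. With hash-based keys (the scheme used in these tests), even a monotonically increasing key spreads evenly across shards. The sketch below is a toy illustration of that idea; the md5-mod-N routing is an assumption for demonstration, not MongoDB's actual hashed-index implementation:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    # Toy stand-in for a hashed shard key: hash the key,
    # then take the hash modulo the shard count.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Route 100,000 monotonically increasing account IDs to 4 shards.
counts = [0] * 4
for account_id in range(100_000):
    counts[shard_for(str(account_id), 4)] += 1

# Sequential keys land almost evenly on every shard, whereas with
# range-based sharding they would all pile onto the last shard.
print(counts, max(counts) - min(counts))
```

This evenness is why hash-based sharding gives stable tail latency for non-range workloads, at the cost of making range scans scatter across all shards.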

Testing Goals

  • Check if MongoDB transactions are ACID
  • Check if hardware demands are moderate compared to PostgreSQL and FoundationDB
  • Check if the system scales up and out (doubling/quadrupling cores or shards)

General Results

note
  1. All tests used MongoDB defaults; configuration tuning never had a noticeable impact.
  2. All results refer to MongoDB 4.4. Version 5.0 produced results within error margin. Percona MongoDB Server showed no significant difference.
  3. The Community Operator doesn't support sharding. Tests #1-2 used Community Operator; subsequent tests used Percona Operator.
  4. Hash-based sharding was used (more even distribution, stable tail latency for non-range workloads).

Table 1: Key Test Results

| # | vCPU/Node | RAM (GB) | HDD (GB) | Shards | Replicas | Clients | Nodes | Accounts (M) | Transfers (M) | TPS |
|---|-----------|----------|----------|--------|----------|---------|-------|--------------|---------------|-----|
| 1 | 1 | 8 | 100 | 1 | 3 | 16 | 3 | 10 | 10 | 340 |
| 2 | 2 | 8 | 100 | 1 | 3 | 16 | 3 | 10 | 10 | 720 |
| 3 | 4 | 8 | 100 | 1 | 3 | 64 | 3 | 10 | 10 | 1,843 |
| 4 | 4 | 8 | 100 | 1 | 3 | 128 | 3 | 10 | 10 | 2,661 |
| 5 | 2 | 8 | 100 | 2 | 3 | 32 | 6 | 10 | 10 | 427 |
| 6 | 6 | 16 | 100 | 1 | 2+arb | 128 | 3 | 100 | 100 | 2,725 |
| 7 | 6 | 16 | 100 | 1 | 3 | 128 | 3 | 100 | 100 | 2,761 |
| 8 | 6 | 12 | 100 | 1 | 2+arb | 128 | 3 | 100 | 100 | 2,592 |
| 9 | 6 | 12 | 100 | 1 | 3 | 128 | 3 | 100 | 100 | 2,551 |
| 10 | 2 | 8 | 100 | 2 | 3 | 128 | 6 | 100 | 100 | 575 |
| 11 | 2 | 8 | 100 | 2 | 3 | 128 | 6 | 100 | 100 | 445 |
| 12 | 4 | 8 | 100 | 8 | 3 | 128 | 12 | 100 | 100 | 1,171 |
| 13 | 8 | 16 | 100 | 2 | 3 | 128 | 6 | 1,000 | 10 | 443 |
| 14 | 8 | 16 | 100 | 4 | 3 | 128 | 12 | 1,000 | 10 | 718 |
| 15 | 8 | 16 | 100 | 4 | 3 | 128 | 12 | 1,000 | 10 | 670 |
| 16 | 4 | 8 | 100 | 8 | 3 | 128 | 12 | 1,000 | 10 | 947 |
| 17 | 12 | 40 | 100 | 1 | 3 | 128 | 3 | 1,000 | 10 | 3,272 |
| 18 | 3 | 8 | 100 | 2 | 3 | 128 | 6 | 10 | 10 | 653 |
| 19 | 3 | 8 | 100 | 2 | 3 | 128 | 6 | 10 | 10 | 534 |

Table 2: Latencies

| # | Median (ms) | Max (ms) | p99 (ms) |
|---|-------------|----------|----------|
| 1 | 22 | 401 | 79 |
| 2 | 47 | 793 | 178 |
| 3 | 35 | 1,037 | 104 |
| 4 | 45 | 1,807 | 172 |
| 5 | 75 | 800 | 217 |
| 6 | 47 | 2,763 | 173 |
| 7 | 46 | 2,424 | 171 |
| 8 | 49 | 26,515 | 169 |
| 9 | 50 | 2,066 | 221 |
| 10 | 222 | 4,052 | 801 |
| 11 | 287 | 6,372 | 1,434 |
| 12 | 109 | 4,148 | 500 |
| 13 | 288 | 7,293 | 1,099 |
| 14 | 178 | 4,137 | 867 |
| 15 | 190 | 5,917 | 631 |
| 16 | 135 | 5,578 | 898 |
| 17 | 39 | 1,824 | 111 |
| 18 | 195 | 4,990 | 662 |
| 19 | 239 | 2,972 | 657 |

Table 3: Data Set Sizes

| Accounts (M) | Transfers (M) | MongoDB | FoundationDB |
|--------------|---------------|---------|--------------|
| 10 | 10 | 3 GB | 3 GB |
| 100 | 100 | 24 GB | 32 GB |
| 1,000 | 10 | 136 GB | 127 GB |
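A rough arithmetic check on Table 3 suggests the per-record footprint in MongoDB stays fairly flat across data set sizes. This sketch assumes decimal gigabytes and that the reported sizes cover the whole data set, including indexes, so the figures are estimates only:

```python
GB = 10**9  # assumption: sizes in the table are decimal gigabytes

# (accounts, transfers, MongoDB data set size in GB) from Table 3
datasets = [
    (10_000_000, 10_000_000, 3),
    (100_000_000, 100_000_000, 24),
    (1_000_000_000, 10_000_000, 136),
]

for accounts, transfers, size_gb in datasets:
    # Average on-disk bytes per stored record (accounts + transfers).
    bytes_per_record = size_gb * GB / (accounts + transfers)
    print(f"{accounts + transfers:>13,} records -> {bytes_per_record:.0f} B/record")
```

All three data sets come out in the neighborhood of 120–150 bytes per record, i.e. storage grows roughly linearly with record count.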

Key Findings

Scale up, not out. Transaction performance in MongoDB is limited by the throughput of a single node and improves little when more replica sets are added.

  • For memory-resident workloads (tests #1-#4): doubling cores gives proportional speedup
  • Splitting the same compute across replica sets gives lower performance (test #5)
  • Once data exceeds RAM, performance drops in a discrete step, then stabilizes
  • Best result (#17): single replica set, 12 cores, 40GB RAM = 3,272 TPS
  • CPU-bound on master replica (>80% utilization)
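The near-linear vertical scaling can be checked directly against Table 1. A quick calculation over tests #1–#3 (note that test #3 also raised the client count from 16 to 64, so its speedup reflects more than just the extra cores):

```python
# TPS from Table 1 for tests #1-#3: 1, 2, and 4 vCPUs per node,
# single replica set, memory-resident data set.
tps = {1: 340, 2: 720, 4: 1843}

for cores_a, cores_b in [(1, 2), (2, 4)]:
    # Throughput ratio when doubling cores per node.
    speedup = tps[cores_b] / tps[cores_a]
    print(f"{cores_a} -> {cores_b} cores: {speedup:.2f}x")
```

Both doublings yield a speedup of at least 2x, consistent with the CPU-bound behavior observed on the master replica.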

Sharding insights:

  • Going from 2 to 4 shards with 4x CPU gave only a 62% performance gain
  • XFS was ~22% faster than EXT4 for memory-bound workloads, 7% slower for disk-bound
  • Sharded clusters require separate balancer and config server replica sets (hidden cost)
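The 62% figure can be reproduced from Table 1, assuming it refers to tests #13 and #14, whose TPS ratio matches:

```python
# TPS from Table 1: test #13 (2 shards, 6 nodes) vs. test #14
# (4 shards, 12 nodes), same 8 vCPU / 16 GB per node, 1,000M accounts.
tps_2_shards = 443   # test #13
tps_4_shards = 718   # test #14

# Relative gain from doubling the shard count (and total hardware).
gain = tps_4_shards / tps_2_shards - 1
print(f"2 -> 4 shards: +{gain:.0%}")
```

Doubling the shard count (and the hardware behind it) recovered well under a 2x improvement, which is the sub-linear horizontal scaling the conclusions refer to.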

Optimal concurrency: 128 clients (lower than PostgreSQL at 256 or FoundationDB at 512).
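The median latencies in Table 2 line up with this concurrency level via Little's law (in-flight requests ≈ throughput × latency). A quick check for the best run, test #17, assuming all 128 clients stayed saturated:

```python
# Little's law: concurrency = throughput * latency.
clients = 128    # optimal client count reported above
tps = 3272       # test #17 throughput (Table 1)

# Latency implied by the concurrency and throughput, in milliseconds.
implied_latency_ms = clients / tps * 1000
print(f"implied ~{implied_latency_ms:.0f} ms; Table 2 reports 39 ms median")
```

The implied ~39 ms matches the reported median for test #17, and the same relation holds for most of the other 128-client runs.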

Failover Tests

Tests #18 and #19 used 2-replica-set clusters:

| Scenario | Result |
|----------|--------|
| Restart any replica on any shard every 2 min | Failed — couldn't complete data loading |
| Restart one predefined replica on one shard every 2 min | Failed — couldn't complete data loading |
| Disconnect one predefined replica on one shard every 2 min | Failed — TransactionExceededLifetimeLimitSeconds |
| Restart one predefined replica on different shards every 2 min | Passed (test #18) |
| 100% network degradation between two replicas on same shard every 2 min | Passed (test #19) |

Brief network partitions don't cause noticeable degradation. However, more aggressive failure scenarios prevented test completion.

Final Remarks

  1. Default durability must be elevated to w:majority with wtimeout on. Without this, inconsistent data was observed.
  2. Multi-document transactions must read from the master and use writeConcernMajorityJournalDefault.
  3. WiredTiger uses MVCC, but MongoDB's query layer adds multi-granularity locking (possibly legacy from MMAPv1).
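The elevated durability from item 1 corresponds, in driver terms, to a write concern document like the sketch below. The 5-second timeout is an illustrative value, not one taken from the report:

```python
# Write concern requiring acknowledgement from a majority of replica set
# members, with a timeout so stalled writes fail instead of blocking.
# (In pymongo this is constructed as WriteConcern(w="majority",
# wtimeout=5000, j=True); shown here as the raw document shape.)
write_concern = {
    "w": "majority",   # ack only after a majority of replicas persist the write
    "wtimeout": 5000,  # illustrative: give up after 5 s instead of hanging
    "j": True,         # require the write to reach the on-disk journal
}
print(write_concern)
```

Without settings along these lines, the tests observed inconsistent data under MongoDB's defaults.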

Conclusions

More than 100 test runs were performed over 3 months. Comparison with FoundationDB and PostgreSQL:

| Comparison | MongoDB | Competitor | Ratio |
|------------|---------|------------|-------|
| Test #1 (minimal config) | 340 TPS | FDB: 2,263 TPS | 6.7x slower |
| Test #5 (small scale-out) | 427 TPS | FDB: ~2,200 TPS | ~5x slower |
| Test #7 (medium, best single RS) | 2,761 TPS | FDB: 5,782 TPS | 2.1x slower |
| Test #7 (medium, best single RS) | 2,761 TPS | PG: ~4,663 TPS | 1.7x slower |
| Test #17 (big, best overall) | 3,272 TPS | FDB: 3,369 TPS | Comparable |
| Test #17 hardware | 36 cores, 120 GB RAM | FDB: 5 cores, 60 GB RAM | 7x more resources |

For multi-document transaction workloads, MongoDB scales best vertically. Even in large clusters, CPU and disk were near 100% utilization. Horizontal scaling showed sub-linear improvement insufficient to justify the added complexity.