MongoDB: From beginner to advanced optimization
Learn MongoDB setup, CRUD operations, indexing, aggregation, sharding, replication, and optimization with this guide—and boost your MongoDB performance fast.
Apr 28, 2025 • 7 Minute Read

MongoDB has become one of the go-to databases for developers who need flexibility, scalability, and performance. Whether you're just dipping your toes into NoSQL databases or looking to fine-tune your MongoDB setup for high-traffic applications, this guide will walk you through both the basics and advanced optimization techniques.
Getting started: What is MongoDB?
MongoDB is a NoSQL database that stores data in a flexible, JSON-like format. Unlike traditional relational databases that rely on rigid tables and schemas, MongoDB organizes data into documents, making it easier to scale and adapt to changing application needs.
Why choose MongoDB?
- Flexible schema: No need to define a rigid schema upfront; documents can evolve with your application.
- Highly scalable: Distributes data across multiple servers to handle massive workloads.
- Optimized performance: Features like indexing and the aggregation framework help keep query execution fast.
- Reliable and fault-tolerant: Built-in replication keeps your data safe and accessible.
- Rich query capabilities: Supports complex queries, including full-text search and geospatial queries.
- Developer friendly: Supports multiple programming languages and frameworks.
Setting up MongoDB
Step 1: Install MongoDB
Download MongoDB:
Go to MongoDB's official website.
Select the version and platform (Windows, macOS, or Linux).
Follow the installation instructions for your operating system.
Start MongoDB:
On Linux and macOS, you can use the terminal to start MongoDB with the following command:
mongod --dbpath /your/data/directory
On Windows, MongoDB should automatically start as a service after installation, or you can manually start it via the command prompt.
Note: If you don’t specify a --dbpath, MongoDB will use the default data directory (usually /data/db).
Step 2: Connect to MongoDB
- MongoDB shell: After starting MongoDB, open a new terminal or command prompt and connect using the MongoDB Shell: mongosh (on MongoDB versions before 6.0, the legacy shell command is mongo)
- This opens the MongoDB shell and connects to the MongoDB server running on localhost with the default port 27017.
If you are using MongoDB Compass (the GUI tool), you can connect by launching Compass and entering your connection information (default is localhost:27017).
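As a small illustration of the connection information mentioned above, the following JavaScript sketch assembles a standard mongodb:// connection string from a host, a port, and an optional database name. The helper name buildMongoUri is hypothetical and not part of any MongoDB tool; its defaults simply mirror the localhost:27017 values used in this guide.

```javascript
// Hypothetical helper: builds a standard MongoDB connection string.
// Defaults mirror the local setup described above (localhost, port 27017).
function buildMongoUri({ host = "localhost", port = 27017, database = "" } = {}) {
  const base = `mongodb://${host}:${port}`;
  // Append the database name only when one is given.
  return database ? `${base}/${encodeURIComponent(database)}` : base;
}

console.log(buildMongoUri());                           // mongodb://localhost:27017
console.log(buildMongoUri({ database: "myDatabase" })); // mongodb://localhost:27017/myDatabase
```

A string in this form can be pasted into the Compass connection field or passed to the shell as an argument.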


NOTE: For this tutorial, we will be using the MongoDB Compass shell. However, the commands are similar for those using Git Bash or CMD.
Step 3: Create a new database
1. Create a database:
- In MongoDB, you don’t need to explicitly create a database before using it. Instead, MongoDB will create the database when you first insert data into it.
To use a new database, simply issue the use command. For example, to create and use a database called myDatabase, run:
> use myDatabase;
MongoDB will switch to the myDatabase database. If it doesn't exist, it will be created when data is inserted.
< switched to db myDatabase
myDatabase >
2. Verify the database:
- To confirm that you are using the correct database, run: > db
- This will show the name of the database you're currently using. If myDatabase is selected, the output should be myDatabase.
- List all databases: After creating a database and inserting data, you can check the databases present: > show dbs
CRUD operations in MongoDB
1. Create: Creating collections and inserting documents
MongoDB does not require you to predefine collections. A collection will automatically be created when you first insert data into it.
To insert data, you can use the insertOne() or insertMany() method.
Example: Create a collection named users and insert a single document:
> db.users.insertOne({ name: "John Doe", age: 30, city: "Chicago"})
What it does: The insertOne() method is used to add a single document to a collection. This command inserts a new user document with name, age, and city fields into the users collection.
Output:
< {
acknowledged: true,
insertedId: ObjectId('67c7488b463ebaf610a7d57b')
}
To insert multiple documents at once, use insertMany():
> db.users.insertMany([
{ name: "Alice", age: 25, city: "New York" },
{ name: "Bob", age: 28, city: "Los Angeles" }
])
What it does: This command inserts two new user documents into the users collection.
Output:
< {
acknowledged: true,
insertedIds: {
'0': ObjectId('67c748ed463ebaf610a7d57c'),
'1': ObjectId('67c748ed463ebaf610a7d57d')
}
}
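Because MongoDB accepts documents of any shape by default, it can help to check documents in application code before calling insertMany(). The sketch below is one minimal way to do that in plain JavaScript; the function name findInvalidUsers, the required-field rules, and the "Carol" record are assumptions made for illustration, not anything MongoDB enforces.

```javascript
// Hypothetical pre-insert check. MongoDB itself does not require these
// fields; the rules below are an application-level assumption.
function findInvalidUsers(docs) {
  return docs.filter(
    (doc) =>
      typeof doc.name !== "string" ||
      typeof doc.age !== "number" ||
      typeof doc.city !== "string"
  );
}

const newUsers = [
  { name: "Alice", age: 25, city: "New York" },
  { name: "Bob", age: 28, city: "Los Angeles" },
  { name: "Carol", age: "unknown" }, // made-up record: wrong type for age, no city
];

const invalidUsers = findInvalidUsers(newUsers);
console.log(invalidUsers.length);  // 1
console.log(invalidUsers[0].name); // Carol
```

Only the documents that pass the check would then be handed to insertMany().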
2. Read: Retrieving documents from a collection
To fetch all documents in a collection, use find(): > db.users.find()
What it does: This command retrieves every document in the users collection and displays the results as formatted, JSON-like output.
Output:
< {
_id: ObjectId('67c7488b463ebaf610a7d57b'),
name: 'John Doe',
age: 30,
city: 'Chicago'
}
{
_id: ObjectId('67c748ed463ebaf610a7d57c'),
name: 'Alice',
age: 25,
city: 'New York'
}
{
_id: ObjectId('67c748ed463ebaf610a7d57d'),
name: 'Bob',
age: 28,
city: 'Los Angeles'
}
Find specific documents: You can query for documents with specific conditions. For example, to find a user with the name "Alice":
> db.users.find({ name: "Alice" })
< {
_id: ObjectId('67c748ed463ebaf610a7d57c'),
name: 'Alice',
age: 25,
city: 'New York'
}
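To make the idea of a query filter more concrete, here is a simplified plain-JavaScript model of how a filter document is checked against each document. It is a sketch for intuition only, not how the server is implemented: it handles top-level fields, plain equality, and a handful of comparison operators. The names matches and findUsers are hypothetical.

```javascript
// Simplified model of filter matching: plain equality plus the
// $gt, $gte, $lt, $lte, and $in operators on top-level fields only.
const comparisonOperators = {
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $in: (value, arg) => arg.includes(value),
};

function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    const isOperatorObject =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    if (!isOperatorObject) return value === condition; // plain equality
    // Every operator listed for the field must hold.
    return Object.entries(condition).every(([op, arg]) => {
      if (!(op in comparisonOperators)) throw new Error(`Unsupported operator: ${op}`);
      return comparisonOperators[op](value, arg);
    });
  });
}

const userDocs = [
  { name: "John Doe", age: 30, city: "Chicago" },
  { name: "Alice", age: 25, city: "New York" },
  { name: "Bob", age: 28, city: "Los Angeles" },
];

// Hypothetical stand-in for db.users.find(filter).
const findUsers = (filter) => userDocs.filter((doc) => matches(doc, filter));

console.log(findUsers({ name: "Alice" }).map((u) => u.name));     // [ 'Alice' ]
console.log(findUsers({ age: { $gte: 28 } }).map((u) => u.name)); // [ 'John Doe', 'Bob' ]
```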
3. Update: Modifying documents in a collection
To update a specific document, use updateOne():
> db.users.updateOne({ name: "John Doe" }, { $set: { age: 31 } })
What it does: This command finds the first document where name is "John Doe" and updates the age field to 31.
Output:
< {
acknowledged: true,
insertedId: null,
matchedCount: 1,
modifiedCount: 1,
upsertedCount: 0
}
Update multiple documents: To increase the age of all users by 1 year:
> db.users.updateMany(
{},
{ $inc: { age: 1 } }
)
Output:
< {
acknowledged: true,
insertedId: null,
matchedCount: 3,
modifiedCount: 3,
upsertedCount: 0
}
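The effect of the two update commands above can be traced with a small plain-JavaScript model of $set and $inc. This is a sketch under simplifying assumptions (top-level fields only, no other operators), and applyUpdate is a hypothetical helper rather than a MongoDB function.

```javascript
// Simplified model of two update operators on top-level fields.
function applyUpdate(doc, update) {
  const result = { ...doc };
  // $set overwrites (or creates) each listed field.
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value;
  }
  // $inc adds to each listed field; a missing field starts from 0.
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount;
  }
  return result;
}

let people = [
  { name: "John Doe", age: 30, city: "Chicago" },
  { name: "Alice", age: 25, city: "New York" },
  { name: "Bob", age: 28, city: "Los Angeles" },
];

// updateOne: only the first matching document changes.
const firstJohn = people.findIndex((p) => p.name === "John Doe");
people[firstJohn] = applyUpdate(people[firstJohn], { $set: { age: 31 } });

// updateMany with an empty filter: every document changes.
people = people.map((p) => applyUpdate(p, { $inc: { age: 1 } }));

console.log(people.map((p) => p.age)); // [ 32, 26, 29 ]
```

After both updates, John Doe is 32 and the other two users are under 30. Once John Doe is deleted in the next step, no remaining user is older than 30, which is why the later deleteMany() call reports deletedCount: 0.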
4. Delete: Removing documents from a collection
To delete a single document, use deleteOne():
> db.users.deleteOne({ name: "John Doe" })
What it does: This command deletes the first document in the users collection where name is "John Doe".
Output:
< {
acknowledged: true,
deletedCount: 1
}
Delete multiple documents: To delete all users older than 30:
> db.users.deleteMany({ age: { $gt: 30 } })
Output:
< {
acknowledged: true,
deletedCount: 0
}
Drop a collection: To delete a collection: > db.users.drop()
Output: < true
Drop a database: To delete a database: > db.dropDatabase()
Output: < { ok: 1, dropped: 'myDatabase' }
Note: If you're using CMD or Git Bash, simply type exit to leave the MongoDB shell.
Advanced MongoDB: Optimizing performance and scalability
1. Indexing for faster queries
Indexing is one of the most crucial techniques for optimizing MongoDB performance. It allows MongoDB to efficiently search for documents, significantly reducing query execution time.
- Compound indexes: Create indexes on multiple fields to optimize queries that filter on more than one field.
> db.users.createIndex({ name: 1, city: 1 })
Output: < name_1_city_1
- Text indexes: MongoDB provides text indexes for full-text search across string fields.
> db.users.createIndex({ name: "text", city: "text" })
Output: < name_text_city_text
- Hashed indexes: Hashed indexes are useful for sharding. They help distribute documents evenly across shards.
> db.users.createIndex({ _id: "hashed" })
Output: < _id_hashed
- TTL (Time to Live) indexes: Use TTL indexes to automatically delete documents after a certain time period. Useful for data like session information or logs.
> db.sessions.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
Output: < createdAt_1
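The expiry rule behind a TTL index can be sketched in a few lines of plain JavaScript: a document becomes eligible for deletion once the value of its indexed date field plus expireAfterSeconds lies in the past. The helper isExpired and the sample sessions are hypothetical. The real server removes expired documents from a periodic background task, so actual deletion can lag slightly behind the expiry time.

```javascript
// Simplified model of TTL expiry: a document is eligible for deletion once
// createdAt + expireAfterSeconds is in the past.
function isExpired(doc, expireAfterSeconds, now = new Date()) {
  // Documents whose indexed field is not a date are never expired.
  if (!(doc.createdAt instanceof Date)) return false;
  const expiresAt = doc.createdAt.getTime() + expireAfterSeconds * 1000;
  return expiresAt <= now.getTime();
}

// Made-up sessions and a fixed "current time" for a repeatable example.
const currentTime = new Date("2025-04-28T12:00:00Z");
const sessions = [
  { sessionId: "a", createdAt: new Date("2025-04-28T10:30:00Z") }, // 90 minutes old
  { sessionId: "b", createdAt: new Date("2025-04-28T11:30:00Z") }, // 30 minutes old
  { sessionId: "c", createdAt: "2025-04-28" },                     // a string, not a Date
];

const expiredIds = sessions
  .filter((s) => isExpired(s, 3600, currentTime))
  .map((s) => s.sessionId);

console.log(expiredIds); // [ 'a' ]
```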
2. Aggregation for data processing
MongoDB’s aggregation framework allows you to process data in stages and perform complex transformations and computations efficiently.
- Pipeline stages: The aggregation pipeline allows you to build a series of transformations on your data.
> db.users.aggregate([
{ $match: { age: { $gte: 25 } } },
{ $group: { _id: "$city", totalUsers: { $sum: 1 } } },
{ $sort: { totalUsers: -1 } }
])
Output (illustrative; the counts depend on the documents in your collection):
< {
_id: 'Chicago',
totalUsers: 2
}
{
_id: 'New York',
totalUsers: 1
}
{
_id: 'Los Angeles',
totalUsers: 1
}
- $lookup (join-like operation): MongoDB supports left outer joins through the $lookup stage, which attaches matching documents from another collection to each input document.
> db.orders.aggregate([
{ $lookup: { from: "products", localField: "productId",
foreignField: "_id", as: "productDetails" } }
])
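For readers who think more easily in ordinary code, the two pipelines above can be mimicked with plain JavaScript array operations. This is only a sketch of the semantics, not of how the server executes a pipeline. All sample data and the helper names usersPerCity and lookup are made up for illustration.

```javascript
// Hypothetical sample data, chosen only for illustration.
const sampleUsers = [
  { name: "John Doe", age: 31, city: "Chicago" },
  { name: "Eve", age: 40, city: "Chicago" },
  { name: "Alice", age: 25, city: "New York" },
  { name: "Bob", age: 22, city: "Los Angeles" },
];

// $match -> $group -> $sort, written as ordinary array operations.
function usersPerCity(users, minAge) {
  const counts = new Map();
  for (const user of users) {
    if (user.age < minAge) continue;                         // $match: age >= minAge
    counts.set(user.city, (counts.get(user.city) ?? 0) + 1); // $group with $sum: 1
  }
  return [...counts]
    .map(([city, totalUsers]) => ({ _id: city, totalUsers }))
    .sort((a, b) => b.totalUsers - a.totalUsers);            // $sort: totalUsers descending
}

console.log(JSON.stringify(usersPerCity(sampleUsers, 25)));
// [{"_id":"Chicago","totalUsers":2},{"_id":"New York","totalUsers":1}]

// $lookup modeled as a left outer join: every input document is kept, and
// the matches (possibly none) are stored in an array field.
function lookup(localDocs, foreignDocs, localField, foreignField, as) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const products = [{ _id: 1, name: "Keyboard" }, { _id: 2, name: "Mouse" }];
const orders = [
  { _id: 100, productId: 1 },
  { _id: 101, productId: 3 }, // no matching product
];

const joined = lookup(orders, products, "productId", "_id", "productDetails");
console.log(JSON.stringify(joined[0].productDetails)); // [{"_id":1,"name":"Keyboard"}]
console.log(JSON.stringify(joined[1].productDetails)); // []
```

The second order shows the left outer join behavior: it is still returned, with an empty productDetails array.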
3. Sharding for horizontal scaling
Sharding is MongoDB's method for distributing data across multiple servers, which is essential when dealing with very large data sets. It is the main tool for scaling horizontally; high availability comes from running each shard as a replica set.
Shard keys: When you shard a collection, you need to select a shard key. The choice of shard key can significantly impact performance. It should have high cardinality and be frequently used in queries.
Note: Sharding commands must be executed from the mongos router in a sharded cluster. The command below shards on city only to keep the example short; a field with just a few distinct values has low cardinality and would make a poor shard key in practice:
sh.shardCollection("myDatabase.users", { city: 1 })
Balancing data: MongoDB's balancer automatically moves data between shards so that each shard holds a roughly even share. You can monitor and adjust the balancer's behavior using sh.status() and sh.setBalancerState().
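The advice about cardinality can be checked on sample data before a shard key is chosen. The sketch below counts distinct values per candidate field in plain JavaScript; the generated data set and the helper name distinctCount are assumptions for illustration. A field with only a handful of distinct values limits how finely the data can be split, no matter how many shards are available.

```javascript
// Counts the distinct values of a field across a set of documents.
function distinctCount(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Hypothetical sample: 1,000 users spread over three cities.
const cityNames = ["Chicago", "New York", "Los Angeles"];
const manyUsers = Array.from({ length: 1000 }, (_, i) => ({
  _id: i,
  city: cityNames[i % cityNames.length],
}));

console.log(distinctCount(manyUsers, "city")); // 3
console.log(distinctCount(manyUsers, "_id"));  // 1000
```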
4. Replication for high availability
Replication in MongoDB provides fault tolerance by duplicating data across multiple nodes. MongoDB uses replica sets, where one node is the primary and others are secondaries.
Replica set configuration: A replica set can have one primary node and multiple secondary nodes. If the primary node fails, one of the secondaries can be automatically promoted to primary:
rs.initiate()
rs.add("secondaryNode1:27017")
Read preferences: In a replica set, you can configure read preferences to direct read queries to primary or secondary nodes:
db.getMongo().setReadPref("secondary")
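As background to automatic failover: a new primary can only be elected while a majority of the voting members is reachable. The arithmetic is simple enough to sketch in JavaScript. The function names below are hypothetical, and the sketch assumes every member has one vote.

```javascript
// Votes needed to elect a primary, assuming every member has one vote.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be unavailable while an election is still possible.
function faultTolerance(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), faultTolerance(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
```

Going from three members to four does not let the set survive more failures, which is one reason odd member counts are common.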
5. Query optimization
Efficient queries are essential for good performance. Here are a few tips for optimizing your queries:
Use projection: Only retrieve the fields you need to reduce data transfer:
db.users.find({ city: "Chicago" }, { name: 1, age: 1 })
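One detail of projection is easy to miss: the _id field is returned unless it is explicitly excluded with _id: 0. The plain-JavaScript sketch below models an inclusion projection on top-level fields to show the difference. The helper name project and the numeric _id are assumptions for illustration (real documents would typically carry an ObjectId).

```javascript
// Simplified model of an inclusion projection on top-level fields.
function project(doc, projection) {
  const result = {};
  // _id is returned unless the projection sets _id: 0.
  if (projection._id !== 0 && "_id" in doc) result._id = doc._id;
  for (const [field, include] of Object.entries(projection)) {
    if (field !== "_id" && include === 1 && field in doc) {
      result[field] = doc[field];
    }
  }
  return result;
}

const storedUser = { _id: 1, name: "John Doe", age: 31, city: "Chicago" };

console.log(JSON.stringify(project(storedUser, { name: 1, age: 1 })));
// {"_id":1,"name":"John Doe","age":31}
console.log(JSON.stringify(project(storedUser, { name: 1, age: 1, _id: 0 })));
// {"name":"John Doe","age":31}
```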
Explain queries: Use the .explain() method to analyze query performance and determine whether an index is being used efficiently:
db.users.find({ city: "Chicago" }).explain("executionStats")
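When reading explain("executionStats") output, two things are usually worth checking first: whether the winning plan contains a COLLSCAN (collection scan) or an IXSCAN (index scan) stage, and how totalDocsExamined compares with nReturned. The JavaScript sketch below pulls those out of an explain result. The helper names are hypothetical, and the sample object is hand-written and heavily abbreviated, not captured from a real server. The sketch also assumes the plan sits either directly under winningPlan or under winningPlan.queryPlan, since the nesting can vary between server versions.

```javascript
// Walks a plan tree and collects every stage name.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages ?? []) collectStages(child, stages);
  return stages;
}

function summarizeExplain(explain) {
  const winningPlan = explain.queryPlanner.winningPlan;
  // Assumption: some versions nest the readable plan under queryPlan.
  const stages = collectStages(winningPlan.queryPlan ?? winningPlan);
  const { nReturned, totalDocsExamined, totalKeysExamined } = explain.executionStats;
  return {
    usesCollectionScan: stages.includes("COLLSCAN"),
    usesIndexScan: stages.includes("IXSCAN"),
    // A ratio close to 1 means few documents were read unnecessarily.
    docsExaminedPerReturned:
      nReturned === 0 ? totalDocsExamined : totalDocsExamined / nReturned,
    totalKeysExamined,
  };
}

// Hand-written, abbreviated explain output for illustration only.
const sampleExplain = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN", direction: "forward" } },
  executionStats: { nReturned: 2, totalKeysExamined: 0, totalDocsExamined: 1000 },
};

const summary = summarizeExplain(sampleExplain);
console.log(summary.usesCollectionScan);      // true
console.log(summary.docsExaminedPerReturned); // 500
```

A result like this one (a collection scan that reads 500 documents per document returned) suggests the query would benefit from an index on the filtered field.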
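Avoid the $nin operator where possible: $nin is not very selective, so it often matches a large portion of an index and can perform no better than a collection scan on large datasets. The same caution applies to $ne. Whenever possible, express the condition with selective operators such as $eq or $in.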
Limit results: Use limit() to cap the number of documents a query returns. MongoDB can stop working once the limit is reached, which reduces query execution time and data transfer:
db.users.find().limit(100)
6. Disk and memory management
WiredTiger storage engine: The default storage engine, WiredTiger, provides high-performance compression and caching. Monitor memory usage using the mem.resident and mem.virtual metrics (mem.mapped applied to the older MMAPv1 storage engine):
db.serverStatus().mem
Database size: Regularly monitor the database size to identify any unusually large collections that may need optimization (e.g., archiving or deleting old data):
db.users.stats()
7. Monitoring and alerts
MongoDB provides tools like MongoDB Atlas and Ops Manager for continuous monitoring of your deployment. These tools offer insights into query performance, disk usage, memory utilization, and other important metrics. You can set up custom alerts for performance thresholds to proactively address issues before they become critical.
Final thoughts
MongoDB offers the power and flexibility modern applications demand—but to truly unlock its potential, you need more than just the basics. From smart indexing to sharding strategies and performance tuning, optimization is where good MongoDB setups become great.
Ready to go further?
Take your skills to the next level with the MongoDB Deep Dive course on Pluralsight, where you’ll explore advanced topics like data modeling, aggregation pipelines, and performance best practices.
Or, get hands-on experience with a lab like Configuring MongoDB Atlas with BigQuery Dataflow Templates, and learn how to integrate MongoDB with Google Cloud services in a real-world scenario.