Aug 06, 2022 By Team YoungWonks *
Introduction
A database is an organized collection of data. This data can be organized into tables and indexed so that relevant information is easy to find. SQL databases make use of such tables. Storing data in tables is convenient, but it might not be a good choice if the structure of the data needs to change later on.
MongoDB, developed by MongoDB Inc., is a document-based, cross-platform, high-performance distributed database designed for ease of application development. The source code of its Community Server is publicly available under the Server Side Public License (SSPL). Unlike relational databases, MongoDB does not make use of tables. Since it is a NoSQL database (also known as a non-relational database), data is queried through a document-based query language rather than through SQL queries. We can choose MongoDB over an SQL database when we need the flexibility to modify the schema of the database. The schema of a table in SQL is more rigid than the data structure used in NoSQL databases like MongoDB.
Data Structure in MongoDB
With billions of people using the Internet, organizing the huge volume of data collected from their devices becomes a challenge. To understand how big data can be handled, let's dive into the data structure used in MongoDB.
In MongoDB, we have collections of documents. These documents are written in a JSON-like (JavaScript Object Notation) format made up of key-value pairs, and are stored internally as BSON, a binary form of JSON. The values can be of any BSON type, such as integer, double, boolean, string or regular expression. The data model in MongoDB also allows arrays to be stored as values. We can even nest arrays inside arrays to store complex data. Consider the following MongoDB document that stores the data of a student studying in grade 8:
{"name": "Alex", "age": 13, "grade": 8}
To make sure that each document in the database can be uniquely identified, MongoDB automatically adds an identifier in the _id field if one is not provided. So, when the above document is inserted into the database, it will look similar to the following document:
{"_id": ObjectId("6285341b578d38d2f2c48da2"), "name": "Alex", "age": 13, "grade": 8}
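To make the automatic identifier more concrete, here is a minimal pure-Python sketch. It does not talk to a real MongoDB server. The helper names new_object_id and insert_one are hypothetical, and the id generator only mimics the 12-byte ObjectId layout (timestamp, random value, counter) rather than reproducing the driver's actual code.

```python
import itertools
import os
import time

# Hypothetical stand-ins: one random value per process and a counter
# that starts at a random position, mimicking the ObjectId layout.
_process_random = os.urandom(5)
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_object_id() -> str:
    """Return a 24-character hex string shaped like an ObjectId."""
    timestamp = int(time.time()).to_bytes(4, "big")          # 4 bytes
    counter = (next(_counter) % 0x1000000).to_bytes(3, "big")  # 3 bytes
    return (timestamp + _process_random + counter).hex()


def insert_one(collection: list, document: dict) -> dict:
    """Store a copy of the document, adding an _id only if it is missing."""
    if "_id" in document:
        stored = dict(document)
    else:
        stored = {"_id": new_object_id(), **document}
    collection.append(stored)
    return stored


students = []  # an in-memory list standing in for a collection
alex = insert_one(students, {"name": "Alex", "age": 13, "grade": 8})
print(alex)  # the same document, now with a generated _id
```

If the caller supplies its own _id, the sketch keeps it unchanged, which matches the "if not provided" behaviour described above.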
In the future, the data structure of such documents may change. For example, if we need to store the address of new students, then the new documents would look like:
{"_id": ObjectId("5185341b578d38d2f2c48aa2"), "name": "Bob", "age": 14, "grade": 8, "address": "XYZ Street, California"}
If we used a relational database for the above scenario, we would have to modify the structure of the table by adding a new column before adding this record. This illustrates why non-relational databases are often a better fit when the data structure changes frequently.
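The flexibility described above can be sketched with plain Python dictionaries. This is only an in-memory illustration, not MongoDB itself, and the helper with_field is a hypothetical name: the second document simply carries an extra key, and nothing about the first document has to change.

```python
# A list of dicts standing in for a collection of documents.
students = [
    {"name": "Alex", "age": 13, "grade": 8},
    {"name": "Bob", "age": 14, "grade": 8, "address": "XYZ Street, California"},
]


def with_field(collection, field):
    """Return the documents that contain the given key."""
    return [doc for doc in collection if field in doc]


names_with_address = [doc["name"] for doc in with_field(students, "address")]
print(names_with_address)  # ['Bob']
```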
Documents that share the same keys should be placed in the same collection. Documents that have a different set of keys, on the other hand, are better kept in a different collection so that the document database can be organized and managed efficiently.
So far, we have seen:
- NoSQL is not a query language; it refers to non-relational databases. MongoDB uses a document-based query language instead of SQL.
- MongoDB is a schema-less NoSQL database that stores data in the form of JSON-like documents.
- Documents can have different numbers of key-value pairs.
- MongoDB automatically adds a unique identifier to each document if one is not provided.
- A group of documents is called a collection.
Local MongoDB vs Cloud MongoDB
When setting up a mongo database, we have two options to choose from:
- We can store data on our own computer by installing the required MongoDB database management system and running the database server locally. This way, the data is typically available only to computers on the same network. The MongoDB website offers a Community Server and an Enterprise Server that can be downloaded and installed, depending on the operating system and the features required by the application.
- We can use MongoDB Atlas, which offers a cloud database as a service. Database access can be controlled by creating temporary or permanent users with different levels of access. We can also specify a list of IP addresses that are allowed to access the database. MongoDB Atlas also offers auto-scaling clusters on AWS, Google Cloud and Microsoft Azure servers. Depending on the purpose of their apps, developers can decide whether to opt for a local database or a cloud database for their documents.
Features of MongoDB
Alongside schema-less documents, MongoDB offers additional features. Some of those features are discussed in this section.
1. Replication
The replication process provides high availability of data by putting multiple copies of it on different servers. This data redundancy
- prevents loss of data in case of hardware failure
- improves read scaling since it has extra copies to read from.
MongoDB performs replication using a replica set, which is a group of mongod instances, typically at least three (mongod is the primary daemon process for a MongoDB system). In the replica set, there is one primary node and the others are secondary nodes. Data replicates from the primary node to the secondary nodes.
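The primary and secondary roles can be pictured with a small in-memory sketch. The Node and ReplicaSet classes are hypothetical and heavily simplified: the copy to the secondaries happens immediately here, and the "election" just promotes the next node, whereas a real replica set replicates asynchronously and elects a new primary by voting.

```python
class Node:
    """One member of the replica set, holding its own copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ReplicaSet:
    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.primary = self.nodes[0]  # simplification: first node is primary

    @property
    def secondaries(self):
        return [node for node in self.nodes if node is not self.primary]

    def write(self, key, value):
        """Writes go to the primary, then are copied to every secondary."""
        self.primary.data[key] = value
        for node in self.secondaries:
            node.data[key] = value

    def fail_primary(self):
        """Drop the primary and promote the next remaining node."""
        failed = self.primary
        self.nodes.remove(failed)
        self.primary = self.nodes[0]
        return failed


replica_set = ReplicaSet(["node-a", "node-b", "node-c"])
replica_set.write("alex", {"name": "Alex", "age": 13, "grade": 8})
replica_set.fail_primary()
print(replica_set.primary.name)          # node-b
print(replica_set.primary.data["alex"])  # the data survived the failure
```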
2. Durability
To provide durability in the event of a failure, MongoDB uses write-ahead logging to on-disk journal files. Journaling writes data
- first to the journal
- then to the core data files
Checkpoints are used to provide a consistent view of the database on disk. If the database shuts down between checkpoints, the journal is used to recover the changes made since the last checkpoint.
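The journal-then-checkpoint idea can be sketched in a few lines of Python. JournaledStore is a hypothetical class that keeps everything in memory, so the "journal" and "data files" below are only stand-ins for what a real storage engine keeps on disk.

```python
class JournaledStore:
    def __init__(self):
        self.journal = []     # stand-in for the on-disk journal
        self.data_files = {}  # stand-in for data files at the last checkpoint
        self.memory = {}      # current in-memory view of the data

    def write(self, key, value):
        self.journal.append((key, value))  # 1. record the change in the journal
        self.memory[key] = value           # 2. then apply it

    def checkpoint(self):
        """Persist a consistent snapshot; older journal entries are no longer needed."""
        self.data_files = dict(self.memory)
        self.journal.clear()

    def crash_and_recover(self):
        """Lose the in-memory view, then rebuild it from checkpoint plus journal."""
        self.memory = dict(self.data_files)
        for key, value in self.journal:
            self.memory[key] = value


store = JournaledStore()
store.write("alex", 13)
store.checkpoint()
store.write("bob", 14)     # written after the checkpoint
store.crash_and_recover()
print(store.memory)        # {'alex': 13, 'bob': 14}
```

The write to "bob" never reached the checkpointed data files, yet it is recovered because it was journaled first.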
3. Sharding for Scalability
The method of distributing or partitioning data across multiple machines is called sharding. It is very useful when a single machine cannot handle a large amount of data. Sharding allows us to scale horizontally (also called scaling out) by adding machines to
- share big data
- handle workloads by load balancing
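As a rough illustration of partitioning, the sketch below assigns each document to a shard by hashing one of its fields. The function names shard_for and distribute are hypothetical, and the modulo rule is an assumption made for brevity; real MongoDB sharding is more involved than this.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(documents, shard_key, shard_count):
    """Split documents into shard_count lists based on the shard key field."""
    shards = [[] for _ in range(shard_count)]
    for doc in documents:
        shards[shard_for(str(doc[shard_key]), shard_count)].append(doc)
    return shards


students = [
    {"name": "Alex", "age": 13, "grade": 8},
    {"name": "Bob", "age": 14, "grade": 8},
]
shards = distribute(students, "name", 3)
for number, docs in enumerate(shards):
    print(number, [doc["name"] for doc in docs])
```

Because the hash is stable, the same name always lands on the same shard, so a lookup by name only needs to ask one machine.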
4. Authentication
Authentication is a security feature in MongoDB that ensures only authorized users can access the documents. The default authentication mechanism in MongoDB is the Salted Challenge Response Authentication Mechanism (SCRAM), which asks the user to provide
- authentication database
- username
- password
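The "salted" part of SCRAM can be sketched with Python's standard library. This is not a full SCRAM implementation: it only shows how a salted key can be derived and stored instead of the password, and it skips the challenge-response exchange and password normalization entirely. The function names, the iteration count and the sample credentials are assumptions for illustration.

```python
import hashlib
import hmac
import os


def stored_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive the value a server would store instead of the password."""
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    return hashlib.sha256(client_key).digest()


def register(users, auth_db, username, password, iterations=4096):
    """Save salt, iteration count and derived key under (auth_db, username)."""
    salt = os.urandom(16)
    users[(auth_db, username)] = (salt, iterations, stored_key(password, salt, iterations))


def check_password(users, auth_db, username, password):
    """Re-derive the key from the supplied password and compare safely."""
    record = users.get((auth_db, username))
    if record is None:
        return False
    salt, iterations, expected = record
    return hmac.compare_digest(stored_key(password, salt, iterations), expected)


users = {}
register(users, "admin", "alex", "correct horse")
print(check_password(users, "admin", "alex", "correct horse"))  # True
print(check_password(users, "admin", "alex", "wrong guess"))    # False
```

Note that the lookup uses all three pieces listed above: the authentication database, the username and the password.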
5. Ad-Hoc Queries
At the time of schema design, it is not possible to know all the queries that users will perform. An ad-hoc query is a query composed on demand, whose exact form depends on variables supplied at run time. Each time such a query is executed, the output might differ because of those variables. MongoDB Atlas allows developers to update and run ad-hoc queries for optimized real-time results. MongoDB can be used to
- store geographical data (also known as geospatial data)
- execute geo queries and regular expression searches along with aggregations
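To show what "a query that depends on variables" looks like, here is a small pure-Python matcher that understands a tiny subset of MongoDB-style filters ($gt, $lt and $regex). The functions matches and find are hypothetical and run over an in-memory list; they only imitate the shape of a query document and are not how MongoDB evaluates queries internally.

```python
import re

# A tiny subset of query operators, implemented for illustration only.
OPERATORS = {
    "$gt": lambda value, arg: value is not None and value > arg,
    "$lt": lambda value, arg: value is not None and value < arg,
    "$regex": lambda value, arg: isinstance(value, str) and re.search(arg, value) is not None,
}


def matches(document, query):
    """Return True if the document satisfies every condition in the query."""
    for field, condition in query.items():
        value = document.get(field)
        if isinstance(condition, dict):
            if not all(OPERATORS[op](value, arg) for op, arg in condition.items()):
                return False
        elif value != condition:
            return False
    return True


def find(collection, query):
    return [doc for doc in collection if matches(doc, query)]


students = [
    {"name": "Alex", "age": 13, "grade": 8},
    {"name": "Bob", "age": 14, "grade": 8, "address": "XYZ Street, California"},
]

min_age = 13  # the variable part of the ad-hoc query
older = find(students, {"age": {"$gt": min_age}})
print([doc["name"] for doc in older])  # ['Bob']
```

Changing min_age, or swapping in a filter such as {"name": {"$regex": "^A"}}, produces a different result without any change to how the documents are stored.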
6. Indexing
Indexes are intended to improve search speed. If indexes are not designed correctly, there can be problems with query execution and load balancing. MongoDB offers a broad range of indexes that support complex access patterns.
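The speed-up an index gives can be sketched with a dictionary that maps a field value to the positions of matching documents. This is a simplified stand-in, since a real database index is a more sophisticated structure that also supports range scans. The helper names and the third student, Cara, are hypothetical.

```python
from collections import defaultdict


def build_index(collection, field):
    """Map each value of the field to the positions of documents that have it."""
    index = defaultdict(list)
    for position, doc in enumerate(collection):
        if field in doc:
            index[doc[field]].append(position)
    return index


def find_by_index(collection, index, value):
    """Look up matching documents without scanning the whole collection."""
    return [collection[position] for position in index.get(value, [])]


students = [
    {"name": "Alex", "age": 13, "grade": 8},
    {"name": "Bob", "age": 14, "grade": 8},
    {"name": "Cara", "age": 15, "grade": 9},
]

grade_index = build_index(students, "grade")
print([doc["name"] for doc in find_by_index(students, grade_index, 8)])  # ['Alex', 'Bob']
```

Without the index, every lookup has to inspect every document. With it, the lookup goes straight to the matching positions, at the cost of keeping the index up to date on every write.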
7. Works with Hadoop
Companies combine the power of Hadoop with MongoDB. This is done as described below.
- Hadoop takes data from MongoDB and other sources.
- It blends the data and generates complex analytics and machine learning models.
- These results are then fed back to MongoDB so that the applications it serves can make better predictions for the business.
MongoDB Use Cases
In this section, we will discuss some of the areas where we can use MongoDB as a database.
1. Single View Applications
Single view applications bring data from different sources into one repository to create a single view. For example, an application can show a single view of a user across products to understand their purchasing habits and demands.
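A single view can be sketched as one document per user that gathers the records from every source. The function build_single_view, the source names and the sample records below are all hypothetical and kept in memory; a real application would read from actual systems and write the merged documents to a collection.

```python
def build_single_view(sources):
    """Merge records from several sources into one document per user."""
    view = {}
    for source_name, records in sources.items():
        for record in records:
            doc = view.setdefault(record["user_id"], {"_id": record["user_id"]})
            details = {key: value for key, value in record.items() if key != "user_id"}
            doc.setdefault(source_name, []).append(details)
    return view


# Hypothetical data from two separate products.
sources = {
    "bookstore": [{"user_id": 1, "item": "Python book"}],
    "music_app": [
        {"user_id": 1, "item": "Jazz album"},
        {"user_id": 2, "item": "Rock album"},
    ],
}

single_view = build_single_view(sources)
print(single_view[1])
# {'_id': 1, 'bookstore': [{'item': 'Python book'}], 'music_app': [{'item': 'Jazz album'}]}
```

Because documents do not need a fixed schema, each user's document can hold only the sources that actually have data for that user.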
2. Internet of Things
IoT devices can be used to collect data. As data arrives from more and more devices, a database that scales well, like MongoDB, becomes necessary. Companies like Bosch have built their IoT suite on MongoDB.
3. Mobile Apps
Because of replication, sharding, durability and the other features of MongoDB, it is a popular data platform for building mobile apps.
Language Support
MongoDB supports most popular programming languages, such as C, C++, C#, Python, Java, JavaScript (through Node.js), PHP, Ruby, Go, Rust, Scala and Swift. It also has many community-written drivers that can be used to link MongoDB with other languages like R.
Summary
In this tutorial, we have discussed the importance of MongoDB databases. This high-performance NoSQL database offers replication and high availability of data without asking for a predefined schema. The applications connected to MongoDB store data in the form of documents that have key-value pairs. Such documents are grouped together to create collections. MySQL by Oracle, one of the most popular relational database management systems (RDBMS), is used to manage data stored in rows and columns of tables. However, because of the rigid structure of tables, many developers are moving towards non-relational databases like MongoDB. For a detailed comparison between MongoDB and MySQL, refer to the following cheatsheet:
https://www.youngwonks.com/resources/mongodb-vs-mysql-cheatsheet
MongoDB driver APIs are available for all the popular programming languages, letting developers connect apps of their choice to the database. To access the database from multiple locations, one can use MongoDB Atlas, offered as a database service by MongoDB Inc.
Further Learning Opportunities with YoungWonks
To fully grasp the power and potential of MongoDB, especially for young learners eager to step into the world of coding, Coding Classes for Kids at YoungWonks provide an excellent starting point. For those specifically interested in one of the most popular programming languages today, our Python Coding Classes for Kids offer a deep dive into Python programming, building a strong foundation that makes learning technologies like MongoDB more accessible. Additionally, our Full Stack Web Development Classes cover not only databases such as MongoDB but also how they integrate within full-stack projects, ensuring students gain comprehensive knowledge in web development.
*Contributors: Written by Rohit Budania; Lead image by Shivendra Singh