What is MongoDB?
MongoDB is a document-oriented database written in C++. It differs from relational databases in that objects are stored in documents rather than in tables. All data about an object lives in the same document, which can be as complex as necessary through the use of nested data. MongoDB is schemaless by design, making it a popular choice for projects that handle large amounts of unstructured data.
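To make the document model concrete, here is a minimal sketch of how one object might look as a single document, written as a plain Python dictionary. The field names (`name`, `address`, `orders`, `loyalty_tier`) and values are invented for illustration and do not come from any real schema.

```python
import json

# A hypothetical customer record: everything about the customer lives in one
# document, including a nested sub-document and an array of sub-documents.
customer = {
    "_id": 1,
    "name": "Ada Example",
    "address": {"city": "Lisbon", "country": "PT"},  # nested sub-document
    "orders": [                                       # array of sub-documents
        {"sku": "A-100", "qty": 2},
        {"sku": "B-200", "qty": 1},
    ],
}

# A second document in the same collection may carry different fields,
# because no schema is enforced up front.
other_customer = {"_id": 2, "name": "Bo Sample", "loyalty_tier": "gold"}

# Documents map naturally to JSON (MongoDB stores them as BSON).
print(json.dumps(customer, indent=2))

# Reading nested data needs no join: it is already part of the document.
total_items = sum(order["qty"] for order in customer["orders"])
print(total_items)
```

The two dictionaries deliberately have different shapes, which is what "schemaless" amounts to in practice: the structure is a property of each document, not of the collection.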
History of MongoDB
In 2007 a group of DoubleClick executives founded 10gen. They tried to get a variety of projects off the ground but kept running into scalability problems. After several failures they switched strategies and began work on an easily scalable application stack.
In the early days they simply called their new database “p” for platform, but by 2008 it needed a name. Insiders like to joke that MongoDB got its name from a Blazing Saddles character. DoubleClick co-founder Dwight Merriman admitted that they chose the name from a deck generated by a naming consultant because it meant “big”. Regardless, the name gained popularity as the database did, and in 2013 10gen changed its name to MongoDB, Inc.
- BSON/JSON format
- Load balancing
- Horizontal scaling through Sharding
- Native replication
- Automatic failover
- Supports dynamic queries
- Data duplication
- High availability
The schemaless structure of MongoDB allows users to store a lot of unstructured data (emails, videos, social media posts, etc.) and perform complicated operations on it. Since as much as 85% of corporate data is unstructured, that covers a lot of ground.
Because the data for an object is typically stored in one document, MongoDB can often return it with a single read instead of a multi-table join, which tends to make such queries faster than their relational equivalents. Users who create complex relational structures inside their documents may see slower returns, but this is mainly the result of applying the wrong tool to the job.
Another reason for MongoDB’s speed is that it uses sharding to distribute datasets and loads across numerous machines. Data is automatically balanced among the machines to avoid an asymmetric load. Sharding bypasses physical limitations of a single machine for faster, more complex operations. It’s also the secret behind MongoDB’s scalability: users can increase capacity by simply adding machines.
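The idea of spreading a dataset across machines can be sketched in a few lines. The example below is a simplified illustration of hash-based distribution, not MongoDB's actual sharding or balancing mechanism; the function names `shard_for` and `distribute` and the `user-N` keys are hypothetical.

```python
import hashlib
from collections import Counter

def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def distribute(keys, num_shards):
    """Count how many keys land on each shard."""
    return Counter(shard_for(k, num_shards) for k in keys)

# Hypothetical shard keys for 10,000 documents.
keys = [f"user-{i}" for i in range(10_000)]

three = distribute(keys, 3)  # cluster with three machines
four = distribute(keys, 4)   # same data after adding a fourth machine

print(sorted(three.items()))
print(sorted(four.items()))
```

With a well-mixed hash, each machine ends up holding a roughly equal share, and adding a machine lowers the share each one carries. A naive modulo scheme like this one would relocate most keys when the shard count changes, which is one reason a real system moves data in larger units and rebalances in the background.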
MongoDB versions before 4.0 don’t support ACID transactions above the document level. That makes it less than ideal as a structure for applications “requiring multi-object commits with rollback”. MongoDB is not the best choice for accounting software or any other write-heavy application that depends on such guarantees. (It does shine when an application is read-heavy, though.)
MongoDB enforces no schema by default. Validation rules are optional (document validation was added in version 3.2), so the user remains largely responsible for data integrity. Difficulty managing this has led to data loss scenarios for users without robust maintenance practices. In fact, maintenance in general is more demanding with MongoDB.
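Since integrity checks fall largely to the application, one common approach is to validate each document before writing it. The following is a minimal sketch in plain Python, with no database driver involved; the rule table `REQUIRED_FIELDS`, the `validate` helper, and the sample documents are all assumptions made for illustration.

```python
# Hypothetical rules: field name -> expected Python type.
REQUIRED_FIELDS = {"name": str, "email": str, "age": int}

def validate(doc: dict) -> list:
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(f"wrong type for {field}")
    return problems

good = {"name": "Ada", "email": "ada@example.com", "age": 36}
bad = {"name": "Bo", "age": "unknown"}  # no email, age is not an integer

print(validate(good))  # prints []
print(validate(bad))   # prints ['missing field: email', 'wrong type for age']
```

An application would call a check like this before each insert or update and reject or repair documents that fail, so that malformed records never reach the collection.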
Real Life Applications
- MetLife, the insurance provider, uses MongoDB to compile millions of customer policies.
- MongoDB powers Scratchpad, Expedia’s innovative multi-platform travel planner.
- Cisco uses MongoDB to power its collaborative workspace.
MongoDB is currently ranked fifth among 330 database management systems. Its capacity to handle unstructured data and its lower infrastructure overhead have won it many loyal adherents. CEO Dev Ittycheria announced expansion into Europe, Asia, and Latin America this year, aiming to take market share from competitor Oracle.
However, Microsoft’s DocumentDB (now CosmosDB) has made a play for MongoDB’s customer base by offering support for MongoDB’s wire protocol. CosmosDB is too new to assess how many customers will be tempted by that; many clients prefer MongoDB as a more thoroughly tested database.