What is Cosmos DB?
Cosmos DB (formally called Azure Cosmos DB) is a schema-agnostic, multi-model database designed to manage data on a global scale. Its main goal is to make data available consistently wherever users are located. The database was created more as a service than as a product, with features aimed at meeting the data needs of international businesses.
History of Cosmos DB
In 2010, Microsoft engineer Dharma Shukla started the project that became Cosmos DB to address shortcomings in global data availability. Shukla tapped top talent for the project, including Turing Award winner Leslie Lamport. As a stepping stone along the way, Microsoft released DocumentDB, a proto-Cosmos product with less scalability and fewer consistency models (four to Cosmos’ five). Cosmos DB was launched at the Microsoft Build conference in May 2017. All DocumentDB users have been transitioned to Cosmos.
- Horizontal scaling
- Graph database support
- Automatic indexing
- Multi-API support
- Four data models: graph, key-value, document, and column-family
- Web-based data exploration tool
- Reported 99.99% availability
- Single-digit millisecond latency
- Backed by comprehensive service-level agreements (SLAs)
New as it is, Cosmos DB has the scalability to become a leader in cloud-hosted databases. It’s geo-distributed across Azure’s data center regions, with custom distribution options available for each region. Software developers looking to build apps for international companies can make good use of Cosmos.
Cosmos supports graph databases, which recommender engines and social networks use to track their users’ behavior.
The database doesn’t lock users into one consistency model, either. Users can choose from strong, bounded staleness, session, consistent prefix, and eventual models depending on workload.
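As a rough illustration of that choice, the sketch below lists the five levels from weakest to strongest and picks the weakest one that satisfies a workload's needs. It is a plain-Python model, not the Cosmos DB SDK: the `ConsistencyLevel` enum, the `pick_level` helper, and its keyword flags are hypothetical names invented for this example, and the decision rule is only a rule of thumb.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """The five levels named above, ordered weakest (1) to strongest (5)."""
    EVENTUAL = 1           # replicas converge eventually; no ordering guarantee
    CONSISTENT_PREFIX = 2  # reads never see writes out of order
    SESSION = 3            # a client always sees its own writes
    BOUNDED_STALENESS = 4  # reads lag writes by at most a configured bound
    STRONG = 5             # reads always see the latest committed write


def pick_level(*, latest_writes=False, bounded_lag=False,
               read_your_writes=False, ordered_reads=False):
    """Hypothetical rule of thumb: check the strictest need first and
    return the weakest level that still satisfies it."""
    if latest_writes:
        return ConsistencyLevel.STRONG
    if bounded_lag:
        return ConsistencyLevel.BOUNDED_STALENESS
    if read_your_writes:
        return ConsistencyLevel.SESSION
    if ordered_reads:
        return ConsistencyLevel.CONSISTENT_PREFIX
    return ConsistencyLevel.EVENTUAL


if __name__ == "__main__":
    # A shopping cart mostly needs each user to see their own updates.
    print(pick_level(read_your_writes=True).name)
    # A ledger-style workload cannot tolerate stale reads.
    print(pick_level(latest_writes=True).name)
    # A view counter can live with replicas that catch up later.
    print(pick_level().name)
```

Weaker levels generally trade freshness for lower latency and higher availability, which is why a workload would not simply pick the strongest level every time.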
Total cost of ownership seems comparable to open-source software (OSS) options. In fact, Microsoft claims that Cosmos DB is at least five times more cost-effective. Time will tell whether that estimate holds true. For now, users can enjoy an SLA that promises 99.99% guarantees for latency, consistency, availability, and throughput.
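To put the 99.99% figure in perspective, the short calculation below converts an availability percentage into the downtime it leaves room for. It is plain arithmetic only: the function name `allowed_downtime_minutes` is made up for this example, and the actual SLA terms define how availability is measured and over what period.

```python
def allowed_downtime_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Minutes of downtime permitted over `days` at the given availability."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Compare 99.9% and 99.99% over a year and over a 30-day month.
    for pct in (99.9, 99.99):
        per_year = allowed_downtime_minutes(pct)
        per_month = allowed_downtime_minutes(pct, days=30)
        print(f"{pct}%: {per_year:.1f} min/year, {per_month:.1f} min/30 days")
```

At 99.99%, this works out to roughly 52.6 minutes per year, or about 4.3 minutes in a 30-day month. At 99.9%, the allowance is ten times larger.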
Cosmos shares most of the general weaknesses of NoSQL databases. Specifically, it’s new, relatively untested, and lacks the body of knowledge that backs relational database management systems (RDBMSs). Microsoft has been using it pre-launch, but there are sure to be unexpected issues.
Also, while operation is surprisingly intuitive, there aren’t many users familiar with Cosmos yet. Sourcing talent may present a challenge. (This isn’t a concern for those planning to outsource their software development and maintenance, of course.)
There’s no native local testing environment, meaning users have to be online. DocumentDB had the same flaw. There is an Azure emulator, though, and a host of Azure features that address other weaknesses of NoSQL databases.
Microsoft Azure only offers Cosmos DB as a PaaS. While this allows Microsoft to keep the product updated and integrate new features, it also locks users into subscribing to the service. A system built on Cosmos can’t easily be moved.
Because Cosmos DB is so new, there isn’t a long record of users. Jet.com, an online marketplace competing with Amazon for market share, is the early adopter cited by Microsoft at the Build conference. Jet records over 100 trillion queries per day. At the time of the conference, Jet was seeing the single-digit millisecond latency that Cosmos promises.
The digitization of the global marketplace is driving a need for databases that can keep up with demand. There are comparable products, Google Cloud Spanner for one, but if it lives up to its promise, Cosmos definitely deserves attention.