NoSQL is not only a buzzword; it has become an integral part of the software stack for new applications. It is not some new piece of technology that an architect adopts just for the kicks of it; it is versatile software that can be used in almost every layer of an architecture. With the advent of big data, there are various NoSQL options at our disposal, and Cassandra and MongoDB clearly top the list one would consider for a NoSQL implementation.
In essence, all of them solve the same basic problem: they let us model data close to what the application requires. But they solve it in different ways, which is why you need to know which one is the best fit for your application stack.
In this blog we will compare these two leading NoSQL databases so that you have a better understanding of which option to choose in a specific scenario.
I have considered the following key aspects of writing data for this comparison and have given an overview of how I feel the two options fare. I will try to cover the read side in the next post:
- Writing data
- Writing data in a transaction
- Write Consistency/Write concern
Writing data
If you are not concerned with anything else in the world and your sole aim is to ingest data as quickly as possible, then Cassandra is the way to go. Write operations tend to be slower in MongoDB for two main reasons. First, writes take locks: in older MongoDB releases with database-level locking, a write locks the database and no other read or write can happen on it during that time. Second, within a replica set, write operations are accepted only by the master (primary) node.
MongoDB also has a concept of sharding, where your writes can be spread across n different master machines, thus increasing your write speed. The caveat is that this approach is highly dependent on how you design your shard key: if the shard key is poorly chosen, you might end up sending all writes to a single node. With Cassandra, this distribution does not need to be configured explicitly. A write can be sent to any node, and that node then passes the data on to the other nodes according to how replication has been configured. Hence Cassandra is the better option here.
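To make the shard key caveat concrete, here is a minimal sketch in plain Python. It is a toy model, not MongoDB's actual routing: real sharding works with chunks and its own hashing, and the function names, the four-shard setup, and the sample workload are all assumptions made up for illustration. It only shows why a low-cardinality key can funnel every write to one node.

```python
import hashlib
from collections import Counter


def shard_for(key_value: str, num_shards: int) -> int:
    """Map a shard-key value to a shard number by hashing it.

    Simplified stand-in for hashed sharding, for illustration only.
    """
    digest = hashlib.md5(key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def write_distribution(key_values, num_shards: int) -> Counter:
    """Count how many writes each shard would receive."""
    return Counter(shard_for(value, num_shards) for value in key_values)


# Hypothetical workload: 10,000 events that all come from the same country.
events = [{"user_id": f"user-{i}", "country": "IN"} for i in range(10_000)]

# High-cardinality shard key: writes are spread over the shards.
by_user = write_distribution((e["user_id"] for e in events), num_shards=4)

# Low-cardinality shard key: every write hashes to the same shard.
by_country = write_distribution((e["country"] for e in events), num_shards=4)

print("user_id as shard key:", dict(by_user))
print("country as shard key:", dict(by_country))
```

With `country` as the key, the counter contains a single shard holding all 10,000 writes, which is exactly the "all writes on one node" situation described above.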
Writing data in a transaction
Neither MongoDB nor Cassandra is built around traditional multi-record transactions, but both of them offer atomic updates. The scale is tilted in favor of MongoDB in this scenario, because being a document-oriented database gives it more flexibility. Every update to a single document is atomic: it either succeeds completely or fails completely. So if you have modeled your document structure correctly, keeping the data that must change together inside one document, you can mimic transaction-like behaviour quite easily.
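As a rough sketch of what "modeling the document correctly" can look like, assume an account document that keeps two balances side by side, for example `{"_id": "acct-1", "balances": {"savings": 100, "checking": 20}}`. The document shape and the helper names `build_transfer` and `transfer` are hypothetical; `update_one`, `$inc`, and `$gte` are standard PyMongo/MongoDB features. Because both balances live in one document, a single update moves the money atomically.

```python
def build_transfer(account_id, amount):
    """Build the (filter, update) pair for moving `amount` from savings
    to checking inside ONE account document."""
    # The filter only matches if savings can cover the amount, so the
    # balance check and the update happen as a single atomic operation.
    filter_doc = {"_id": account_id, "balances.savings": {"$gte": amount}}
    # Both fields change in the same single-document update.
    update_doc = {
        "$inc": {"balances.savings": -amount, "balances.checking": amount}
    }
    return filter_doc, update_doc


def transfer(collection, account_id, amount) -> bool:
    """Apply the transfer; `collection` is expected to behave like a
    pymongo Collection. Returns True if a document was updated."""
    filter_doc, update_doc = build_transfer(account_id, amount)
    result = collection.update_one(filter_doc, update_doc)
    return result.modified_count == 1
```

If the two balances were stored in separate documents instead, the same transfer would need two independent updates, and a failure between them could leave the data half-changed.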
Write Consistency/Write concern
Using write consistency, we can specify how many replicas a write operation must succeed on before an acknowledgment is returned to the client application. In MongoDB this is referred to as write concern. Both databases are on par here and provide similar functionality.
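The idea can be sketched with a small plain-Python helper. The function names are made up for illustration and this is a simplified model rather than either database's implementation. The level names follow Cassandra's `ONE`, `QUORUM`, and `ALL`; `MAJORITY` is included as the rough counterpart of MongoDB's `w: "majority"`, which in practice is computed over the replica set's voting members.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Return how many replicas must confirm a write at a given level."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level in ("QUORUM", "MAJORITY"):
        # More than half of the replicas.
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def write_acknowledged(level: str, replication_factor: int,
                       replicas_responding: int) -> bool:
    """True if enough replicas confirmed the write for this level."""
    return replicas_responding >= required_acks(level, replication_factor)


# Three replicas, one of them currently down:
print(required_acks("QUORUM", 3))          # 2
print(write_acknowledged("QUORUM", 3, 2))  # True
print(write_acknowledged("ALL", 3, 2))     # False
```

The trade-off is the usual one: a stricter level gives more confidence that the data has reached several nodes, while a looser level returns to the client sooner and tolerates more node failures.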
So to summarise my analysis, here is how I feel the two options fare.
Cassandra
- Consistent write speed.
- Out-of-the-box sharding and replication.
- Due to its column-based design, nested structures are not possible.
MongoDB
- Document-based approach allows flexibility on the design side.
- Allows replication and sharding of data to increase write speed.
- Due to the master-slave model, write speed takes a hit.
In the next blog in this series we will try to see how both these databases fare in the data read category. If you have any queries, feel free to drop us a note in the comments section below.