SQL vs NoSQL

SQL vs NoSQL

Introduction

Database is a component required to persist data. In the world of databases, there are two main types of solutions: SQL(relational) and NoSQL(non-relational) databases.

SQL

SQL stands for structured query language. As the names suggests, SQL or relational databases are structured and have predefines schemas.

  • They save the data containing relationships like One to One, One to Many, Many to Many etc in normalized format.

Relationships.png

  • Relational databases store data in defined rows and columns where row denotes an entity or record and the column denotes separate data points or fields of one entity.

  • Ensure ACID transactions where the states of data are consistent and durable ensuring a transaction to either be executed with perfection without any other process or transaction being affected or to be rolled backed if anything fails.

NoSQL

NoSQL stands for non-structured query language. As the names suggests, NoSQL or non-relational databases are unstructured and have dynamic schemas which can change.

  • They are more like JSON based databases
  • Built for high frequency reads and writes.
  • Have ability to add new server nodes on the fly and continue to work without human intervention.
  • Sacrifices ACID and Strong Consistency and follow BASE(Basically available, Soft State, Eventual Consistency). They are more eventually consistent as opposed to strongly consistent

Different types of NoSQL DB

Key-Value Stores

  • Data is stored in an array of key-value pairs.
  • Quickly fetch data with low latency(O(1) constant time).
  • Primary use case of caching in applications.

Usecase

  • Data retrieval with minimum latency
  • Caching.
  • Persisting user state and sessions.
  • Real life implementations including API rate-limiters in Distributed Systems.

Eg: Redis, Hazelcast, Memcached, Amazon DynamoDB

Document Databases

  • Store data in a document oriented model in independent documents which are grouped together in collections.
  • Data is semi-structured and stored in JSON format.
  • Suitable for agile software development methodology.
  • Provide Horizontal scalability
  • Performant read-writes.

Usecase

  • Semi-structured data
  • Need a flexible schema that can change often.
  • Quick persistence and retrieval of data
  • Scalable search feature on websites by persisting all search related data in elastic store
  • Real life implementations including storing user review comments, product catalogues in Amazon.

Eg: MongoDB, Couchbase

Wide column Databases

  • Column families which are containers for rows
  • Each row doesn't have same number of columns
  • Handle massive amount of data(Big Data)
  • High performance and scalable architecture.

Usecase

  • Data Analytics
  • Manage Big Data ensuring scalability and performance.
  • Real life implementations including Netflix streaming service.

Eg: Cassandra, Hbase

Graph Databases

  • Store data whose relations are best represented in a Graph.
  • Store data(entities) in nodes/vertices and edges in form of relationships
  • Relationships are stored in forms of edges and hence are faster as compared to relational databases where the same are persisted in form of tables.
  • Helps in data visualization.

Usecase

  • Building social, knowledge and network graphs
  • Recommendation engine
  • Real life implementations including Google Maps and recommendations in Walmart.

Eg: Neo4J, InfiniteGraph

SQL vs NoSQL

SQLvsNoSQL.png

Choosing Database based on usecase

When to use SQL database:

  • ACID compliance : Ensure ACID(Atomicity, Consistency, Isolation, Durability) compliance in our DB servers. ACID reduces anomalies and protects data integrity. Preferred for financial transactions(like banking and stock markets) where data consistency is the need and the data has to be laser accurate.

  • Storing relationships : Need of storing relationships between data across tables like Facebook.

  • Structured data : Data is quite structured (datatypes and columns are predefined)and unchanging (i.e. all the rows have data for each column) where we know what data we would be dealing with before the actual data entry.

When to use NoSQL database:

  • High data volume : We have to store high volume of heterogeneous type of data consisting of text, images, files, videos, files, pdf, blob objects etc which is not very structured and dont have predefined datatypes. Eg: Data analytics data from multiple sources like social networks, web portals, IOT devices to be fed in our machine learning models after data preparation(cleaning) logic.

  • Scalability : Databases has to be scaled across multiple data centers with the requirement to add nodes on the fly with minimal manual intervention.

  • Rapid Development : Rapid development with quick iterations on our systems requiring frequent updates to the data structures/data model. When database design is not finalized and things are expected to change at a rapid pace they NoSQL provides us the flexibility.

  • Eventual Consistency : When its okay to be eventual consistent where accuracy of value does not matter and we can give up strong consistency(ensured by ACID). Eg : Like/Dislike count on a post on a social networking site can be eventually consistent if not strongly consistent at every point of time.

Choosing DB.png

Conclusion

As evident above, there is no one-size-fits-all solution and so many business today rely on both relational and non-relational databases for different need. All large-scale online services use a mix of both to implement their systems and achieve the desired behaviour. This term of leveraging the power of multiple databases is called polyglot persistence.

With the advent of multi-module databases, we have the ability to use different data models in a single database system supporting multiple data modules like Graph, Document-Oriented, Relational etc. as opposed to supporting only one data model.

Thankyou for your time. Hope this article helped to understand the difference and usecase of SQL vs NoSQL.

Stay tuned for more such articles.