Key Considerations for Comparing NoSQL and SQL
Data models conceptually describe the logical structure of data storage inside a database.
While there are many data models, we will mainly talk about two kinds in this guide:
1. Document data models
The document data model uses a simpler key-value pairs storage form. The key itself typically takes the form of a string that identifies a stored data object or entity. This data model type lets you store data in any format, but XML or JSON are generally used.
2. Relational data models
The relational data model stores data as a collection of entities, usually rows inside a table, with each entity containing delineated collections of attributes. Each attribute maps out columns in a table. Relations between entities in different tables are also delineated, linking different entities (tables) and their specific attributes (columns), letting the user flexibly access data without fundamentally disrupting the database’s structure—data storage via tables.
So in terms of the data model concept, we can see one clear difference between the two: Relational data models contain delineated relations or attributes for an entity, whereas the document data model does not. We’ll dive deeper into this in this article by section.
Related articles:
Databases and Data Models
Computer scientists term well-structured data collections as databases, generally stored on an individual computer, or in a cluster. Different types of databases have their own idiosyncratic capabilities. Roughly speaking, all databases allow you one central function: retrieving and storing data.
We can classify each database type according to a conceptual data model. For the relational model, databases are referred to as relational or SQL databases. For the document model, databases are referred to as Non-SQL, non-relational or document databases.
The issue with the name “NoSQL”
Middlware bridges the gap between the two worlds of SQL and NoSQL. It can therefore be misleading to refer to databases that are outside of the relational model as “NoSQL.” Today, many NoSQL databases include analytics and reporting tools that give you SQL-like querying capabilities (see the link for the most up-to-date NoSQL databases).
Here’s an interesting factoid: The term “NoSQL” actually popped up organically by about 2009. Previously to this, between the early-to-mid 2000s, SQL databases in the relational model were the most typically used. Then a Twitter hashtag became associated with an open-source group who was developing non-relational databases. The term gained traction in the developer community, then people referencing non-SQL databases as NoSQL.
The most refined term since then is “Not Only SQL.”
When to Use SQL and NoSQL
The document model was mentioned earlier. NoSQL databases can take other data model forms, such as graph databases. Because the document data model is most popularly used, we will focus on it in our comparison guide.
It’s important to note that your data model choice is only one aspect of your overall database workflow. You will find capabilities and features in different databases that are connected to the underlying data structure, and others that are not. In order to focus our comparison, we will consider five functionalities that are often explored when comparing SQL and NoSQL databases:
1. Querying
Retrieving data from SQL databases is mostly done through SQL queries. Note that SQL is short for Structured Query Language, the standard language for handling data inside a relational database.
With the document model, most NoSQL databases use the relevant Document ID or Primary Key to retrieve specific data. Many modern NoSQL databases allow for SQL-like querying however.
NoSQL databases primarily rely on Secondary Indexes when handling unstructured data. We won’t explore this in detail but when doing NoSQL queries, portions of the document that you want to query can be stored in a Secondary Index, wherein that query itself is carried out. This allows for querying in an SQL fashion but with unstructured data, as long as the correct Secondary Indexes are used.
Specific indexing and querying features will be different for different NoSQL databases, and not every database will have all of the capabilities available across this class of database. Nonetheless, in most instances you should reliably find yourself able to achieve the type of querying that you require by using NoSQL databases.
2. Schema
SQL databases have a delineated design or schema. In other words, their data is well structured via collections of relations and attributes relating to different entity types. On the other hand, NoSQL databases are commonly known as schema-less.
This however, like the idea that NoSQL databases cannot do relational queries, isn’t entirely true. NoSQL typically stores its data in the XML or JSON format which is typically passed by an application when it’s being read, into a set schema. It’s therefore more appropriate to say that NoSQL databases have schema-on-read, rather than being schema-less.
Likewise, modern SQL databases can perform with more flexibility regarding its schema, similar to NoSQL. Let’s consider the ALTER TABLE SQL command which lets you adjust the schema as needed. And the more recent SQL databases allow for the JSON attribute, for storing unstructured data in relational tables.
3. Replication
You will hear this feature in relation to Distributed Systems. I won’t get into this as an overall topic as it’s quite complicated, but it is worth going over both replication and partitioning in order to highlight further key differences between SQL and NoSQL.
Replication conceptualises the process of storing multiple copies of a single partition across different machines. This improves the availability and durability of a single machine in the event of a fatal error.
4. Partitioning
Partitioning conceptualises the fractionating of a database into separate components called partitions, so that we can distribute these partitions across several machines. Partitioning allows for a high throughput and the handling of databases that would otherwise be too large to store on a single hard drive.
Note that because the majority of NoSQL databases were created in accordance with distributed data systems and the sensibilities that come with this concept, they have fewer features as standard, while most SQL databases were made to work on a single machine.
Nonetheless, there’s no reason why an SQL database cannot have replication and partitioning as capabilities. Indeed, Teradata, a relational database created in the 1970s, included partitioning as one of its core features. And today, thanks to middleware, many relational databases include capabilities from distributed data systems such as replication and partitioning.
5. Transaction Isolation
Because transaction isolation is a complicated topic, I will not be able to cover it fully in this article. But I will try to give you enough of an overview so that we can do a meaningful comparison.
Think of a database transaction as a single unit of work or logic carried out on a database. This can be as straightforward as a write or a read, or something more complicated with multiple reads, queries and writes.
Transaction isolation becomes tricky when multiple transactions run simultaneously on a database. This issue arises due to multithreaded execution and is exacerbated by replication and partitioning. In short, the main question is how to run multiple transactions simultaneously while maintaining logical fidelity. It’s fixed via transaction isolation guarantees.
Theoretically, both SQL and NoSQL databases can handle this fix.
SQL and NoSQL Similarities and Differences
Overall, will NoSQL replace SQL? They have so much crossover. And does NoSQL support SQL?
While NoSQL and SQL databases show an underlying difference in their default schema (SQL is rigid whereas NoSQL isn’t), in practice they are very similar. So “mostly yes” as an answer to the second question and “perhaps, but not necessarily” for the first.
You could say that these two different databases are different but have a lot in common and can be used to achieve many of the same aims.
The key factors to consider when deciding to use NoSQL or SQL in your application are:
- Scalability
- Performance
- Durability
- Cost
- Functionality
- Transaction isolation guarantees
First, consider whether your choice of database can achieve your specific aim, and then examine SQL versus NoSQL as a secondary factor.