At the very beginning of most development endeavors lies an important question: What database do I choose? There is such an abundance of database technologies right now that it’s no wonder many developers don’t have the time or energy to research new ones. If you are one of those developers and you aren’t very familiar with graph databases in general, you’ve come to the right place!
In this article, you will learn about the main differences between a graph database and a relational database, which use cases are best suited to each database type, and what their strengths and weaknesses are.
In a typical social network graph, the nodes represent people in different social groups and their connections with one another. Every person is represented by a node labeled Person. These nodes contain the properties name, gender, location and email. The relationships between people in this network are of the type FRIENDS_WITH and contain a yearsOfFriendship property that specifies the duration of the friendship. Each person is assigned a location through LIVES_IN relationships with nodes labeled Location.
For example, each person is connected to other people through friendships, and to model this relationship in a relational database, we have to add another table. If there were different kinds of connections (related to, no longer friends…), we would have to change the schema accordingly. A relational database isn’t suited to this specific use case because the focus isn’t on the data itself but rather on the relationships within it.
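To make the extra table concrete, here is a minimal sketch of the social network modeled relationally, using Python's built-in sqlite3 module. The table and column names (person, location, friends_with, years_of_friendship) and the sample rows are assumptions chosen for illustration and mirror the Person, Location, FRIENDS_WITH and LIVES_IN names used above. Each friendship is stored once per direction to keep the query simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    gender TEXT,
    email TEXT,
    location_id INTEGER REFERENCES location(id)  -- stands in for LIVES_IN
);
-- The additional table needed to model FRIENDS_WITH
CREATE TABLE friends_with (
    person_a INTEGER NOT NULL REFERENCES person(id),
    person_b INTEGER NOT NULL REFERENCES person(id),
    years_of_friendship INTEGER,
    PRIMARY KEY (person_a, person_b)
);
""")

# Hypothetical sample data
conn.executemany(
    "INSERT INTO location (id, name) VALUES (?, ?)",
    [(1, "Berlin"), (2, "London")],
)
conn.executemany(
    "INSERT INTO person (id, name, gender, email, location_id) VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice", "F", "alice@example.com", 1),
        (2, "Bob", "M", "bob@example.com", 2),
        (3, "Carol", "F", "carol@example.com", 1),
    ],
)
conn.executemany(
    "INSERT INTO friends_with (person_a, person_b, years_of_friendship) VALUES (?, ?, ?)",
    [(1, 2, 5), (2, 1, 5), (1, 3, 2), (3, 1, 2)],
)


def friends_of(conn, name):
    """Return (friend name, years of friendship) pairs for one person."""
    return conn.execute(
        """
        SELECT friend.name, f.years_of_friendship
        FROM person AS p
        JOIN friends_with AS f ON f.person_a = p.id
        JOIN person AS friend ON friend.id = f.person_b
        WHERE p.name = ?
        ORDER BY friend.name
        """,
        (name,),
    ).fetchall()


print(friends_of(conn, "Alice"))
```

Even this simple question needs two JOINs, and every new kind of connection would mean another table or another column in the schema.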
Graph solutions are focused on highly connected data that comes with an intrinsic need for relationship analysis. If the connections within the data are not the primary focus and the data is of a transactional nature, then a graph database is probably not the best fit. Sometimes it’s just important to store the data, and complex analysis isn’t needed.
In our example, if we were to store only people without their relationships, then we would end up with a sparsely connected graph. Yes, a number of simpler graphs would remain because of the connections between Person and Location nodes, but this degree of connectedness and the consistency of the data structure are well suited to a relational database.
Graph databases are optimized for data retrieval, and if you choose one, you should expect to use this functionality often. If your focus is on writing to the database and you’re not concerned with analyzing the data, then a graph database wouldn’t be an appropriate solution. A good rule of thumb is that if you don’t intend to use JOIN operations in your queries, then a graph is not a must-have.
In our example, if you only store data for the sake of logging interactions and you don’t intend to analyze it later on, then a graph database isn’t particularly helpful. However, if there are numerous connections within the data being stored, then a graph might be worth considering.
If your data model is inconsistent and demands frequent changes, then using a graph database might be the way to go. Because graph databases are more about the data itself than the schema structure, they allow a degree of flexibility.
On the other hand, there are often benefits in having a predefined and consistent table that’s easy to understand. Developers are comfortable with and used to relational databases, and that fact cannot be downplayed.
For example, if you are storing personal information such as names, dates of birth, locations… and don’t expect many new fields or a change in data types, relational databases are the go-to solution. On the other hand, a graph database could be useful if:
If you need to run frequent table scans and searches for data that fits defined categories, a graph database wouldn’t be very helpful. Graph databases are well equipped to traverse relationships when you have a specific starting point or at least a set of points to start with (nodes with the same label). They are not suited to traversing the whole graph often. While it’s possible to run such queries, other storage solutions may be better optimized for such bulk scans.
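The difference between a traversal from a known starting point and a scan over everything can be sketched in plain Python. This is only an illustration, not how any particular graph database is implemented: the friends and locations dictionaries, the sample names, and the helper functions within_hops and people_in are all hypothetical.

```python
from collections import deque

# Hypothetical sample data: FRIENDS_WITH connections as an adjacency list
friends = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice"],
    "Dave": ["Bob", "Erin"],
    "Erin": ["Dave"],
}
# Hypothetical location property for each person
locations = {
    "Alice": "Berlin",
    "Bob": "London",
    "Carol": "Berlin",
    "Dave": "Paris",
    "Erin": "Berlin",
}


def within_hops(start, max_hops):
    """Breadth-first traversal from a known node; only nearby nodes are visited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in friends[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    seen.discard(start)
    return sorted(seen)


def people_in(city):
    """Property search with no starting point; every node has to be checked."""
    return sorted(name for name, loc in locations.items() if loc == city)


print(within_hops("Alice", 2))
print(people_in("Berlin"))
```

The traversal only follows connections outward from Alice, so its cost depends on her neighborhood. The property search has no starting node and has to look at every person, which is the kind of bulk scan other storage solutions tend to handle better.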
If the majority of the queries in our example include searches by property values over the entire network, then a graph database wouldn’t be the right fit.
Very often, databases are used to look up information stored in key/value pairs. When you have a known key and need to retrieve the data associated with it, a graph database is not particularly useful.
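A known-key lookup of this kind needs no relationship traversal at all. As a minimal sketch, a plain Python dictionary stands in here for a key/value store. The users mapping, the IDs, and the get_user helper are hypothetical.

```python
# Hypothetical user records, keyed by a simple identifier
users = {
    101: {"name": "Alice", "email": "alice@example.com"},
    102: {"name": "Bob", "email": "bob@example.com"},
}


def get_user(user_id):
    """Fetch one record directly by its key; returns None if the key is unknown."""
    return users.get(user_id)


print(get_user(101))
```

Nothing in this access pattern involves following connections between records, so the traversal strengths of a graph database would go unused.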
For example, if the sole purpose of your database is storing a user’s personal information and retrieving it by name or ID, then refrain from using a graph. But if there were other entities involved (visited locations, for example), and a large number of connections were required to map them to users, then a graph database could bring performance benefits. A good rule of thumb is that if most of your queries return a single node via a simple identifier (key), then just skip graph databases.
If the entities in your model have very large attributes like BLOBs, CLOBs, long texts… then graph databases aren’t the best solution. While you can store those objects as nodes and link them to other nodes to utilize the power of traversing relationships, sometimes it just makes more sense to store them directly with the entities they are connected to.
In our example, if each person had a long biography that needed to be included in the same database, a graph wouldn’t be the answer. However, if you needed to connect these biographies to other entities in the database (for example, people who are mentioned in them), then the strengths of a graph database could outweigh the limitations.