Can someone explain to me in layman's terms what a graph database is?

mcphage · on Nov 29, 2017

It's a database that's designed to store relationships between objects instead of just facts. It has efficient methods of following long chains of associations. So think of how you store tree structures in a relational database—there are a lot of different ways of doing it, and they're all frustrating. Storing trees is something graph databases do naturally.
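To make the tree point concrete, here is a minimal sketch in plain Python rather than any real graph database. The node names and the `descendants` helper are made up for illustration. The idea is that each node points directly at its children, so walking a subtree is just a matter of following links.

```python
from collections import defaultdict

# Hypothetical toy data: (parent, child) pairs forming a small tree.
edges = [("root", "a"), ("root", "b"), ("a", "c"), ("a", "d"), ("d", "e")]

# Adjacency index: each node maps straight to its children.
children = defaultdict(list)
for parent, child in edges:
    children[parent].append(child)

def descendants(node):
    """Return every node below `node`, depth-first, by following child links."""
    found = []
    # Reverse so that children are visited in their original order.
    stack = list(reversed(children.get(node, [])))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(reversed(children.get(current, [])))
    return found

print(descendants("a"))
```

A graph database keeps this kind of direct node-to-node link as its native storage model, which is why arbitrary-depth subtree walks do not need special table layouts.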

kchoudhu · on Nov 29, 2017

I've been playing around with graph databases for a while (I am writing one of my own for kicks, which turns Postgres into a graph database [1], [2]), and one of the things that became obvious after using the project in production was that it promotes functional reactive programming in a way that most other database paradigms don't.

Event propagation and node invalidation are awesome for rapid what-if style experimentation, and I am so psyched to see more and more attention being paid to graph computing in general.

[1] https://www.github.com/kchoudhu/openarc

[2] https://www.anserinae.net/whats-cooking-openarc-edition.html...

michaelbuckbee · on Nov 29, 2017

Trying to get this straight in my mind here.

Is it fair to say that traditional RDBMS/SQL is for storing different "sets" of related information (tables for products, users, orders)?

Graph databases are for storing data about the _same_ set of data as it interrelates to itself.

- a User and all their Friends (who are also users)
- a Keyword and all associated Terms (which are also keywords)

Is that right?

InverseFalcon · on Dec 2, 2017

I think you're concentrating on the wrong thing, here.

Just as an RDBMS can have tables about different things (Products, Users, Orders), graph databases can use labels on nodes for different things (so you can have :Product nodes, :User nodes, :Order nodes). Though with graph databases, there is often less rigidity in the associated data than in an RDBMS, as there is no requirement for an explicit schema for properties on nodes of different types in a graph db (plus you can multi-label nodes).
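A rough sketch of what labels and schema-free properties look like, using plain Python objects as a stand-in for a graph store. The `Node` class, the `with_label` helper, and the sample data are all hypothetical, not any product's API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    labels: set                                  # a node may carry several labels
    props: dict = field(default_factory=dict)    # no fixed schema per label

# Hypothetical sample data: two users (one also an admin) and a product.
nodes = [
    Node({"User"}, {"name": "alice"}),
    Node({"User", "Admin"}, {"name": "bob", "since": 2015}),
    Node({"Product"}, {"sku": "X-1", "price": 9.99}),
]

def with_label(label):
    """Return all nodes carrying the given label, whatever else they carry."""
    return [n for n in nodes if label in n.labels]

print([n.props["name"] for n in with_label("User")])
```

Note that the two User nodes carry different property sets, and one of them has a second label, with no table definition needing to change.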

The real differentiator is how relationships are modeled, and how they're traversed in queries.

With RDBMS/SQL you're going to be working with data in tables, using join tables as the relationships between them. You're likely going to need to be explicit about what is being joined together, so the relationship chain is likely to be very rigid.
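For illustration, here is a friend-of-friend lookup against a join table, sketched with Python's built-in `sqlite3` module. The `users` and `friendships` tables and their contents are invented for the example. Each extra hop has to be spelled out as another join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    -- Join table: one row per directed friendship.
    CREATE TABLE friendships (user_id INTEGER, friend_id INTEGER);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO friendships VALUES (1, 2), (2, 3);
""")

# Two hops means two explicit joins through the join table.
friends_of_friends = conn.execute("""
    SELECT u2.name
    FROM users u0
    JOIN friendships f1 ON f1.user_id = u0.id
    JOIN friendships f2 ON f2.user_id = f1.friend_id
    JOIN users u2 ON u2.id = f2.friend_id
    WHERE u0.name = 'alice'
""").fetchall()

print(friends_of_friends)
```

Going to three hops would mean editing the query to add a third `friendships` join, which is the rigidity being described here.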

With graph databases, relationships and relationship traversal are used in place of join tables and table joins, which gives much more flexibility over how to traverse. You can certainly do friend-of-friend-of-friend queries much more easily, but you can also perform variable-length traversals using custom logic for which nodes are in the path and which relationships are traversed (type, direction, and count), and that can be very well-defined, or very loosely defined, or a mix, as needed. I don't believe there are good ways to do that kind of ad-hoc table joining in RDBMS.
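As a sketch of the variable-length idea, the toy function below follows outgoing relationships of the allowed types for between `min_hops` and `max_hops` steps. It is plain Python over a list of tuples, not a real graph query language, and the `traverse` name, its parameters, and the data are assumptions made for illustration. It also simplifies by reporting each node only at the first hop count where it is reached.

```python
# Hypothetical directed relationships: (start, type, end).
rels = [
    ("alice", "FRIEND", "bob"),
    ("bob", "FRIEND", "carol"),
    ("carol", "FRIEND", "dave"),
    ("alice", "WORKS_WITH", "erin"),
    ("erin", "FRIEND", "frank"),
]

def traverse(start, allowed_types, min_hops, max_hops):
    """Nodes first reached from `start` in min_hops..max_hops outgoing steps,
    using only relationships whose type is in `allowed_types`."""
    results = set()
    frontier = {start}
    seen = {start}
    for hop in range(1, max_hops + 1):
        next_frontier = set()
        for src, rel_type, dst in rels:
            if src in frontier and rel_type in allowed_types and dst not in seen:
                next_frontier.add(dst)
        seen |= next_frontier
        if hop >= min_hops:
            results |= next_frontier
        frontier = next_frontier
    return results

# Friends up to three hops away, following only FRIEND relationships.
print(sorted(traverse("alice", {"FRIEND"}, 1, 3)))
```

Changing the hop range or the set of allowed types changes the shape of the traversal without rewriting the query itself, e.g. `traverse("alice", {"FRIEND", "WORKS_WITH"}, 2, 2)` looks exactly two hops out over either relationship type.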

As an example of very loosely defined traversals in queries, you can ask for a shortest path between two nodes, knowing nothing about the nodes or relationships that could be between them, and get a path back showing the connecting nodes, with the relationships between the nodes providing context.
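A minimal sketch of that kind of query, assuming a breadth-first search over relationships treated as undirected. The data and the `shortest_path` helper are hypothetical, and real graph databases may use different algorithms internally. The returned path alternates node, relationship type, node, so the relationships supply the context between the connecting nodes.

```python
from collections import deque

# Hypothetical relationships: (start, type, end).
rels = [
    ("alice", "FRIEND", "bob"),
    ("bob", "WORKS_AT", "acme"),
    ("carol", "WORKS_AT", "acme"),
    ("carol", "OWNS", "car42"),
]

def shortest_path(start, goal):
    """Breadth-first search ignoring direction; returns
    [node, rel_type, node, ...] or None if the nodes are not connected."""
    adjacency = {}
    for src, rel_type, dst in rels:
        adjacency.setdefault(src, []).append((rel_type, dst))
        adjacency.setdefault(dst, []).append((rel_type, src))

    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for rel_type, neighbor in adjacency.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [rel_type, neighbor])
    return None

print(shortest_path("alice", "car42"))
```

The caller supplies only the two endpoints; nothing about the intermediate node kinds or relationship types has to be known up front.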