Someone asked me, “What is the difference, or the relationship, between Databases, Data Mining, Big Data, and Graph Mining?”
A long time ago, there was what we call the Database. Everyone knows what it is, and everyone knows how to use it.
Then some computer scientists came and said, “We found that there is a new use for the Database besides its original purpose (storing data in an organized way). We can learn new information and find patterns by applying statistical analysis to the data inside the database.” They called this discovery “Data Mining”.
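To make "finding patterns" a bit more concrete, here is a minimal sketch in Python. It assumes a tiny, invented table of shop orders and simply counts which pairs of items show up together often. The data, the function name `frequent_pairs`, and the `min_support` threshold are all made up for illustration; real data mining tools are far more sophisticated than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase records, one row per order (invented for illustration).
orders = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def frequent_pairs(rows, min_support):
    """Return the item pairs that appear together in at least min_support rows."""
    counts = Counter()
    for row in rows:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(row), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

print(frequent_pairs(orders, min_support=3))
```

The database only stored the orders; the observation that bread and butter tend to be bought together is the new information that was mined from it.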
Then the data stored in databases became huge: massive in size, growing lightning fast, and arriving in many different forms. Traditional Data Mining techniques could not handle the enormous volume, velocity, and variety (aka the Three Vs of Big Data). So a new set of procedures, methods, and computational platforms was born (e.g., MapReduce, Hadoop, Spark). They called this new generation Big Data.
At the same time, the scientists identified another problem with the Database. This time it was related to how the data are stored. They found that a flat representation, where the data are stored as a collection of records/rows, limits the ability to mine the data and discover interesting patterns. They discovered that there is interesting knowledge to be mined from the relationships between records or entities in the data. To solve this issue, they proposed a new representation, which is the Graph Database. With the new representation, the need for new techniques to mine the graph became obvious, and this gave birth to a new sub-discipline, which they called Graph Mining.
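Here is a small sketch of that idea in Python, assuming a handful of invented "who knows whom" rows. The names, the data, and the helper functions `build_graph` and `friends_of_friends` are hypothetical and only meant to show the shift from rows to relationships; an actual graph database does much more than this.

```python
from collections import defaultdict

# Hypothetical flat records: each row says "a knows b" (invented data).
rows = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("alice", "erin"),
]

def build_graph(edges):
    """Turn flat rows into an undirected graph: person -> set of neighbors."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def friends_of_friends(graph, person):
    """People two hops away from person, excluding direct friends and person."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}

graph = build_graph(rows)
print(sorted(friends_of_friends(graph, "alice")))
```

No single row says that alice and carol are connected. That fact only appears once the rows are linked into a graph and the relationships are followed, which is the kind of knowledge Graph Mining goes after.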