The Story of GraphGen

This is the story behind the really useful and ingenious Neo4j example graph data generator developed by Christophe Willemsen.

I don’t just want to show you the tool, but also tell the story of how it came to be.

First of all: The Neo4j Community is awesome.
There are so many enthusiastic and creative people that it is often humbling for me to be part of it.

On October 1st, Christophe tweeted a short screencast he had recorded about a new tool (NeoGen) he was developing, which converted a YAML domain specification into Cypher statements to populate a Neo4j database.

It was already pretty cool, but being me, I had a bunch of ideas that I’d love to see in such a tool.

Some of these were:

  • online tool

  • visualization

  • Cypher-like specification language (inspired by

  • download Cypher script and GraphJSON for later use

  • populate Neo4j database from online tool

  • generate Neo4j console link

So I contacted him, and we had some really good discussions. He was very enthusiastic and eager to keep working on it. First he added the Cypher spec, then the online tool GraphGen including visualization, then the download of GraphJSON and Cypher, and finally the database population. He kept going, adding model types (pre-defined combinations of labels with properties) and the Neo4j console link, all while expanding the documentation.

Really impressive work.

Christophe also added some documentation and a neat teaser to quickly get started.


And so, a mere five days later, GraphGen was out there, usable, useful, and waiting for you to generate your graph domain model.

No wonder he won the 2014 GraphConnect Graphies Award for best community contribution.


Here is a quick screencast I made (which is already superseded by new features):

The web-based GraphGen tool is of course on GitHub too. Here is a quick explanation of how it works internally:

  1. Your Cypher-spec statement is rendered colorfully with CodeMirror Cypher highlighting.

  2. The PHP backend parses the Cypher statement and generates an intermediate model, which is the same as the one produced from the YAML input.

  3. It then generates the graph model internally according to the counts, cardinalities and property types you provided.

  4. The property values are created using the Faker library to generate realistic data for names, dates, credit cards etc.

  5. It then generates Cypher statements (CREATE, MERGE, MATCH) to create your data, as well as GraphJSON.

  6. The Cypher and GraphJSON are made available for download, and the GraphJSON is also used to render a visualization of your graph with Alchemy.js.

  7. You can open a new, shared Neo4j console link, which is generated using the console's API.

  8. If you choose to, a publicly available Neo4j database can be populated with the graph model by posting Cypher statements to the transactional HTTP endpoint using Ajax requests.
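To make steps 2 to 5 and step 8 a bit more tangible, here is a minimal sketch of the general idea. It is not GraphGen's actual code: GraphGen's backend is PHP and uses Faker, while this sketch is in Python and picks values from small hard-coded lists. The model layout, the `gen_id` helper property and all function names are assumptions made up for illustration, and the cardinality handling is simplified to "each start node gets exactly one end node". The payload shape (`statements` containing `statement` entries) is the JSON body that Neo4j's transactional HTTP endpoint accepts.

```python
import json
import random

FIRST_NAMES = ["Ada", "Grace", "Alan", "Edsger", "Barbara"]
COMPANIES = ["Acme", "Initech", "Globex"]

# Hypothetical intermediate model (not GraphGen's real format): labels with
# counts and property generators, plus one relationship definition.
MODEL = {
    "nodes": [
        {"label": "Person", "count": 3,
         "properties": {"name": lambda rng: rng.choice(FIRST_NAMES),
                        "age": lambda rng: rng.randint(18, 80)}},
        {"label": "Company", "count": 2,
         "properties": {"name": lambda rng: rng.choice(COMPANIES)}},
    ],
    "relationships": [
        {"start": "Person", "type": "WORKS_AT", "end": "Company"},
    ],
}


def build_graph(model, rng):
    """Expand counts and property generators into concrete nodes and edges."""
    nodes, edges = [], []
    for spec in model["nodes"]:
        for _ in range(spec["count"]):
            props = {key: make(rng) for key, make in spec["properties"].items()}
            props["gen_id"] = len(nodes)  # helper key used to match nodes later
            nodes.append({"id": len(nodes), "label": spec["label"],
                          "properties": props})
    for rel in model["relationships"]:
        targets = [n for n in nodes if n["label"] == rel["end"]]
        for source in (n for n in nodes if n["label"] == rel["start"]):
            # Simplified cardinality: one randomly chosen end node per start node.
            target = rng.choice(targets)
            edges.append({"source": source["id"], "target": target["id"],
                          "type": rel["type"]})
    return {"nodes": nodes, "edges": edges}


def to_cypher(graph):
    """Render one CREATE per node and one MATCH ... CREATE per relationship."""
    statements = []
    for node in graph["nodes"]:
        # json.dumps yields double-quoted strings and bare numbers, which is
        # enough for the simple literals used in this sketch.
        props = ", ".join(f"{k}: {json.dumps(v)}"
                          for k, v in node["properties"].items())
        statements.append(f"CREATE (n:{node['label']} {{{props}}})")
    for edge in graph["edges"]:
        statements.append(
            f"MATCH (a {{gen_id: {edge['source']}}}), "
            f"(b {{gen_id: {edge['target']}}}) "
            f"CREATE (a)-[:{edge['type']}]->(b)"
        )
    return statements


def to_transaction_payload(statements):
    """Wrap statements in the body shape of the transactional HTTP endpoint."""
    return {"statements": [{"statement": s} for s in statements]}


graph = build_graph(MODEL, random.Random(42))   # seeded for repeatable output
statements = to_cypher(graph)
graph_json = json.dumps(graph)                  # GraphJSON-like nodes/edges document
payload = json.dumps(to_transaction_payload(statements), indent=2)
print("\n".join(statements))
```

The sketch only builds the request body and does not send anything. In the Neo4j 2.x releases of that time, such a body would be posted to a path like `/db/data/transaction/commit` on the server; the path differs in later versions, so check the documentation for the version you run.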

In a recent blog post Christophe wrote about his view of this story.

And my amazing colleague Rik showed how to utilize GraphGen in his Simulating the IDMS EmpDemo blog post.

This is just one of the many examples in which a member of the Neo4j community got behind a good idea and created a very useful tool, driver, framework or documentation.

I can’t thank all the contributors enough and can only pledge to support the community to the best of my abilities.

I want to invite you all to join our community, to contribute by writing and coding around Neo4j, and to be supported by others when you need it.

And even if you don’t have an idea that you currently want to pursue, just helping others by answering questions and providing feedback to Neo4j and all the tools and content in its ecosystem is extremely valuable and helpful.

Cheers, Michael

P.S. Christophe is also the author of the more widely used Neo4j PHP library NeoClient, translator of the Neo4j manual into French and importer of GitHub repository data into Neo4j.

Meet him and me in London for the GraphDay on Nov 13 and the GraphHack - Meetup on Nov 12.