We want to import data into Neo4j. There are so many resources with so much information that it gets confusing, so here is the minimum you need to know.
Imagine the data comes from an export of a relational or legacy system: just plain CSV files without headers (this time), one for the "people" table and one for the "friendships" table.
people.csv:
1,"John"
10,"Jane"
234,"Fred"
4893,"Mark"
234943,"Anne"

friendships.csv:
1,234
10,4893
234,1
4893,234943
234943,234
234943,1
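Before importing, it can be worth checking that every id in the friendships file also appears in the people file. The following is a minimal Python sketch of such a check, not part of the original import steps. It embeds the sample rows as strings so it runs on its own, and the helper name `find_dangling_ids` is hypothetical.

```python
import csv
import io

# Sample rows from above, embedded as strings so the sketch is self-contained.
PEOPLE_CSV = '1,"John"\n10,"Jane"\n234,"Fred"\n4893,"Mark"\n234943,"Anne"\n'
FRIENDSHIPS_CSV = "1,234\n10,4893\n234,1\n4893,234943\n234943,234\n234943,1\n"


def find_dangling_ids(people_text, friendships_text):
    """Return ids used in friendships that have no row in the people data.

    Hypothetical helper for illustration; ids are compared as strings.
    """
    known = {row[0] for row in csv.reader(io.StringIO(people_text))}
    dangling = set()
    for start_id, end_id in csv.reader(io.StringIO(friendships_text)):
        dangling.update({start_id, end_id} - known)
    return dangling


# An empty set means every friendship points at an existing person.
print(find_dangling_ids(PEOPLE_CSV, FRIENDSHIPS_CSV))
```

For real exports you would read the two files from disk instead of the embedded strings.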
Graph Model
Our graph model is very simple:

(p1:Person {userId:234943, name:"Anne"})-[:KNOWS]->(p2:Person {userId:1, name:"John"})
Import with Neo4j Server & Cypher
- Download, install and start Neo4j Server.
- Run the following statements one by one:
I used http-urls here to run this as an interactive, live Graph Gist.
CREATE CONSTRAINT ON (p:Person) ASSERT p.userId IS UNIQUE;
LOAD CSV FROM "https://gist.githubusercontent.com/jexp/d8f251a948f5df83473a/raw/people.csv" AS row
CREATE (:Person {userId: toInt(row[0]), name:row[1]});
USING PERIODIC COMMIT
LOAD CSV FROM "https://gist.githubusercontent.com/jexp/d8f251a948f5df83473a/raw/friendships.csv" AS row
MATCH (p1:Person {userId: toInt(row[0])}), (p2:Person {userId: toInt(row[1])})
CREATE (p1)-[:KNOWS]->(p2);
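Without headers, LOAD CSV hands each line to the statement as a list of strings, which is why the ids above go through toInt(...) before they are stored or matched. As a rough analogy only (this is Python's csv module, not Neo4j), the sketch below shows the same situation: the fields arrive as text and the id has to be converted explicitly. The variable names are made up for illustration.

```python
import csv
import io

# Two sample rows from people.csv, embedded so the sketch is self-contained.
SAMPLE = '1,"John"\n10,"Jane"\n'

# Each row comes back as a list of strings, much like `row` in LOAD CSV.
raw_rows = list(csv.reader(io.StringIO(SAMPLE)))

# Convert the id explicitly, which is the role toInt(row[0]) plays above.
people = [(int(row[0]), row[1]) for row in raw_rows]

print(raw_rows)
print(people)
```

If the conversion is skipped on one side only, a string id and an integer id will not match each other, so the friendship rows would find no people to connect.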
Note: You can also use file-urls, best with absolute paths like file:/path/to/data.csv; on Windows use file:c:/path/to/data.csv
If you also want to find people quickly by name, not only by id, run:
CREATE INDEX ON :Person(name);
For instance, to find all second-degree friends of "Anne" and the number of ways each of them can be reached:
MATCH (:Person {name:"Anne"})-[:KNOWS*2..2]-(p2)
RETURN p2.name, count(*) as freq
ORDER BY freq DESC;
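To get a feel for what this query returns, here is a small Python sketch that mimics it on the sample data. It rests on my reading of the pattern: direction is ignored (no arrow in the pattern) and a single path never traverses the same relationship twice. The function name `second_degree_counts` is hypothetical, and the data is hard-coded from the two CSV files.

```python
from collections import Counter, defaultdict

# Sample data from people.csv and friendships.csv.
PEOPLE = {1: "John", 10: "Jane", 234: "Fred", 4893: "Mark", 234943: "Anne"}
FRIENDSHIPS = [(1, 234), (10, 4893), (234, 1), (4893, 234943), (234943, 234), (234943, 1)]


def second_degree_counts(start_name):
    """Count two-hop paths from start_name, ignoring direction.

    A relationship is never used twice within the same path.
    """
    # Index each relationship under both endpoints, using its position as an id.
    adjacent = defaultdict(list)
    for rel_id, (a, b) in enumerate(FRIENDSHIPS):
        adjacent[a].append((rel_id, b))
        adjacent[b].append((rel_id, a))

    counts = Counter()
    for start in (uid for uid, name in PEOPLE.items() if name == start_name):
        for first_rel, middle in adjacent[start]:
            for second_rel, end in adjacent[middle]:
                if second_rel != first_rel:
                    counts[PEOPLE[end]] += 1
    return counts


print(second_degree_counts("Anne").most_common())
```

On the sample data this gives John and Fred two paths each and Jane one. John and Fred are also direct friends of Anne; the query as written does not exclude them.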
Bulk Data Import
Use this for tens of millions up to billions of rows.
Shut down the server first!
Create two additional header files, people_header.csv and friendships_header.csv, with one line each (in that order):
userId:ID,name
:START_ID,:END_ID
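If you prefer to script this step, the two header files could be written as in the minimal Python sketch below. The file names are the ones passed to neo4j-import in this guide; the helper `write_header_files` and the temporary target directory are assumptions for illustration.

```python
from pathlib import Path
import tempfile

# One-line header per file; names match those used on the neo4j-import command line.
HEADERS = {
    "people_header.csv": "userId:ID,name",
    "friendships_header.csv": ":START_ID,:END_ID",
}


def write_header_files(target_dir):
    """Write each header file into target_dir and return the created paths."""
    paths = []
    for filename, header in HEADERS.items():
        path = Path(target_dir) / filename
        path.write_text(header + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Demo in a throwaway directory; point this at your data directory instead.
with tempfile.TemporaryDirectory() as tmp:
    for path in write_header_files(tmp):
        print(path.name, "->", path.read_text(encoding="utf-8").strip())
```

Keeping the headers in separate files means the exported data files can stay exactly as they came out of the source system.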
Execute from the terminal:
path/to/neo/bin/neo4j-import --into path/to/neo/data/graph.db \
  --nodes:Person people_header.csv,people.csv --relationships:KNOWS friendships_header.csv,friendships.csv
After starting your database again, run:
CREATE CONSTRAINT ON (p:Person) ASSERT p.userId IS UNIQUE;