edge list support? (batch-import issue, open)

jexp commented on July 18, 2024
edge list support?

from batch-import.

Comments (7)

redapple commented on July 18, 2024

Hi @sheymann,
you'll have to prepare your data a bit beforehand.

For example, you could:

  • generate a list of all distinct node names (Michael, Selina, Rana, Selma...)
  • assign each of them a sequential ID, starting from 1
  • write your nodes.csv file with the nodes in that same order, for example:
USERNAME
Michael
Selina
Rana
Selma
...

You can add additional columns/properties to your nodes if necessary.

  • write your relations.csv file with at least 3 columns: a source node, a target node, and relation type

The first 2 columns should reference the nodes by the sequential IDs you assigned before.
For the 3rd column, I'm assuming simple friendship:

SOURCE  TARGET  RELTYPE
1   2   friend
3   4   friend
1   4   friend
...
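
The preparation steps above can be sketched in a few lines of Python. This is only an illustration, not part of batch-import: the function name `edge_list_to_csv`, the tab delimiter, and the first-seen ordering of names are all assumptions made for the example.

```python
import csv
import io


def edge_list_to_csv(edges, nodes_out, rels_out, rel_type="friend"):
    """Write nodes and relations from a list of (source, target) name pairs.

    `edges` must be a list (it is iterated twice). Hypothetical helper,
    not a batch-import API.
    """
    # Assign sequential IDs starting from 1, in first-seen order.
    ids = {}
    for source, target in edges:
        for name in (source, target):
            if name not in ids:
                ids[name] = len(ids) + 1

    # Row order in the nodes file matches the assigned IDs
    # (dicts preserve insertion order in Python 3.7+).
    nodes = csv.writer(nodes_out, delimiter="\t", lineterminator="\n")
    nodes.writerow(["USERNAME"])
    for name in ids:
        nodes.writerow([name])

    # Relations reference nodes by their sequential ID.
    rels = csv.writer(rels_out, delimiter="\t", lineterminator="\n")
    rels.writerow(["SOURCE", "TARGET", "RELTYPE"])
    for source, target in edges:
        rels.writerow([ids[source], ids[target], rel_type])
    return ids


# Usage with in-memory buffers; real files opened with newline="" work the same way.
edges = [("Michael", "Selina"), ("Rana", "Selma"), ("Michael", "Selma")]
nodes_buf, rels_buf = io.StringIO(), io.StringIO()
ids = edge_list_to_csv(edges, nodes_buf, rels_buf)
```

With these three pairs, the relation rows come out as 1-2, 3-4 and 1-4, matching the sample above.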

Hope this helps

redapple commented on July 18, 2024

Hm... looking at your profile @sheymann and your work on http://linkurio.us/
I guess your point was more about batch-import accepting just an edge list as input than about how to convert the input data (something you surely have all figured out).
Anyway, it may help others.

sheymann commented on July 18, 2024

Hey, yes, my question was about pure edge lists, as many complex network datasets are encoded this way.

jexp commented on July 18, 2024

@redapple Thanks for chiming in. I think it would make sense to also support edge-only CSV data and to allow indexable keys in the start/end columns. I just thought about using https://github.com/jankotek/MapDB as an in-memory cache.
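
To make the idea concrete, here is a rough Python sketch of resolving keys in the start/end columns through a cache in a single pass. It is not how batch-import works: `KeyCache` and `resolve_edges` are hypothetical names, and a plain dict stands in for whatever map implementation would actually back the cache.

```python
class KeyCache:
    """Hypothetical key -> node ID lookup; a dict stands in for the real cache."""

    def __init__(self, first_id=1):
        self._ids = {}
        self._next = first_id

    def resolve(self, key):
        """Return (node_id, created), assigning the next ID on first sight."""
        node_id = self._ids.get(key)
        if node_id is not None:
            return node_id, False
        node_id = self._next
        self._ids[key] = node_id
        self._next += 1
        return node_id, True


def resolve_edges(rows, cache):
    """Turn (start_key, end_key, rel_type) rows into node and rel events.

    Yields ("node", id, key) the first time a key is seen, then
    ("rel", start_id, end_id, rel_type) for the row itself.
    """
    for start_key, end_key, rel_type in rows:
        endpoint_ids = []
        for key in (start_key, end_key):
            node_id, created = cache.resolve(key)
            if created:
                yield ("node", node_id, key)
            endpoint_ids.append(node_id)
        yield ("rel", endpoint_ids[0], endpoint_ids[1], rel_type)


events = list(resolve_edges([("a", "b", "KNOWS"), ("b", "a", "KNOWS")], KeyCache()))
```

Because nodes are created the first time their key appears, no separate node file is needed for this pass; the trade-off is that every distinct key stays in the cache for the whole import.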

jexp commented on July 18, 2024

@sheymann Would you then just leave off the node file and assume that it is meant this way? This would probably also mean supporting multiple relationship files, since with a single file only one property-value mapping for nodes could be realized.

sheymann commented on July 18, 2024

Well, this is an extreme case where we only know the graph structure, and we don't care about node properties (we may have edge properties though) :)
e.g. all of these datasets:
http://snap.stanford.edu/data/
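
For illustration, a structure-only edge list with optional edge properties could be read like this in Python. The layout is an assumption for the sketch (one edge per line, whitespace-separated, `#` for comment lines), so check it against the actual dataset; `read_edge_list` is a hypothetical name.

```python
def read_edge_list(lines, comment_prefix="#"):
    """Yield (source, target, extra_columns) from a whitespace-separated edge list.

    Blank lines and lines starting with `comment_prefix` are skipped.
    Any columns after the first two are returned as a list of edge properties.
    A line with fewer than two fields raises ValueError.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue
        source, target, *extra = line.split()
        yield source, target, extra


sample = ["# FromNodeId\tToNodeId", "0\t1", "", "0\t2\t0.5"]
parsed = list(read_edge_list(sample))
```

Node identifiers are kept as strings here, so they can be treated as keys whether or not they happen to be numeric.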

redapple commented on July 18, 2024

By the way, I started a Python helper module to export RDB data dumps into Neo4j:
https://github.com/redapple/sql2graph
For now, it uses quite a lot of memory (when experimenting with MusicBrainz data)

I could write something similar to convert pure edge lists into nodes.csv, rels.csv, index.csv... but it'd be in Python ;)
Having that support directly in batch-import would be easier/cleaner.
