vgm_utils.py contains several utility functions that are used in the other files.
aggregate_gen.py generates the aggregate graph files for importing into Neo4j. It requires two sqlite3 databases named objects.db and relations.db. They contain:
- synset_count in objects.db: Contains counts for each unique noun synset present in the Visual Genome dataset and, more importantly, the synsets themselves with their unique IDs. These unique noun synsets are imported into a Python dict for rapid ID assignment.
- synset_count in relations.db: Contains counts for each unique relation/predicate synset present in the Visual Genome dataset and, more importantly, the synsets themselves with their unique IDs. These unique predicate synsets are imported into a Python dict for rapid ID assignment.
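The dict-loading step described above could look roughly like the sketch below. It is an illustration only, not the code in aggregate_gen.py: it assumes synset_count is a table with columns named `synset` and `id`, which this README does not confirm, and `load_synset_ids` is a hypothetical helper name.

```python
import sqlite3


def load_synset_ids(db_path, table="synset_count"):
    """Return a {synset_name: unique_id} dict read from one database.

    Assumption: the table has columns named `synset` and `id`.
    Adjust the query to match the real schema.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Each row is a (synset, id) pair, so dict() can consume the rows directly.
        rows = conn.execute(f"SELECT synset, id FROM {table}").fetchall()
    finally:
        conn.close()
    return dict(rows)
```

Calling `load_synset_ids("objects.db")` and `load_synset_ids("relations.db")` once at startup would give two in-memory lookups, so each synset seen while processing scene graphs resolves to its ID without another database query.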
The syntax for running the program is:
python [aggregate_path] [scene_path] [out_path]
where
- aggregate_path is the path to the aggregate_gen.py file
- scene_path is the path to the scene_graphs.json file from the Visual Genome dataset
- out_path is the name of the output file. Should be aggregate_graph.vgm
json_explorer.py is used for streaming JSON files (in the same way it is implemented in vgm_utils) for prototyping and testing purposes.
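For readers unfamiliar with streaming JSON, here is one way to read a large file element by element using only the standard library. This is a sketch under assumptions and not the implementation in vgm_utils, which is not shown in this README: it assumes the file is a single top-level JSON array whose elements are JSON objects, and `stream_json_array` is a hypothetical name.

```python
import json


def stream_json_array(path, chunk_size=65536):
    """Yield the elements of a top-level JSON array one at a time.

    Assumption: every element is a JSON object (or array/string), so a
    partially read element can never be mistaken for a complete one.
    """
    decoder = json.JSONDecoder()
    with open(path, "r", encoding="utf-8") as f:
        buf = f.read(chunk_size).lstrip()
        if not buf.startswith("["):
            raise ValueError("expected a top-level JSON array")
        buf = buf[1:]
        while True:
            buf = buf.lstrip()
            if buf.startswith(","):
                # Skip the separator between two elements.
                buf = buf[1:].lstrip()
            if buf.startswith("]"):
                return
            try:
                item, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # The element is incomplete: read more text and retry.
                more = f.read(chunk_size)
                if not more:
                    raise
                buf += more
                continue
            yield item
            buf = buf[end:]
```

Only the current element and one chunk of text are held in memory at a time, which is the point of streaming a file too large to load with `json.load`.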
This generates the requisite small and large graphs for the database.
This generates files for all synsets. NOTE: this step needs editing and may not be necessary.