snap-stanford / snap

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library.

License: Other

Makefile 0.10% Python 0.03% C++ 67.78% Elixir 0.01% C 32.09% Euphoria 0.01%

snap's Introduction

========================================================================
  SNAP : Stanford Network Analysis Platform
	http://snap.stanford.edu
========================================================================

Stanford Network Analysis Platform (SNAP) is a general purpose, high
performance system for analysis and manipulation of large networks.
SNAP is written in C++ and it scales to massive graphs with hundreds
of millions of nodes and billions of edges.

/////////////////////////////////////////////////////////////////////////////

Directory structure:
  http://snap.stanford.edu/snap/description.html

  snap-core: 
        the core SNAP graph library
  snap-adv: 
        advanced SNAP components, not in the core, but used by examples
  snap-exp:
        experimental SNAP components, still in development
  examples:
        small sample applications that demonstrate SNAP functionality
  tutorials:
        simple programs, demonstrating use of various classes
  glib-core: 
        STL-like library that implements basic data structures, like vectors
        (TVec), hash-tables (THash) and strings (TStr), provides
        serialization and so on
  test:
        unit tests for various classes
  doxygen:
        SNAP reference manuals

The code compiles under Windows (Microsoft Visual Studio, Cygwin with gcc),
Linux and Mac (gcc). Use the SnapExamples*.sln or the provided makefiles.

Some of the applications expect GnuPlot and GraphViz to be installed and
accessible: either their paths are in the system PATH variable or they
reside in the working directory.

/////////////////////////////////////////////////////////////////////////////

Example applications for advanced SNAP functionality are available
in the examples directory and described at:
  http://snap.stanford.edu/snap/description.html.

To compile from the command line, execute:
  make all	# compiles SNAP and all sample applications

To compile on Mac OS X, using Xcode:
  1. From the Toolbar, select Scheme (e.g. 'bigclam').
  2. Product -> Build.  (or Cmd + B).
  3. Run executable via the command line; or
     Choose the scheme's executable (Product -> Edit Scheme -> Run -> Info)
     and run: Product -> Run (or Cmd + R). 
     Note: If using Gnuplot, add its location to PATH in the scheme's
     environment variables, or create a symlink in /usr/bin:
     sudo ln -s <gnuplot_dir>/gnuplot /usr/bin/
  For code completion, the "docs" target has been created which includes all
  Snap-related files and example programs.

Description of examples:
  agmfit :
        Detects network communities from a given network by fitting
	AGM to the given network by maximum likelihood estimation.
  agmgen :
	Implements the Affiliation Graph Model (AGM). AGM generates
        a realistic looking graph from the community affiliation of the nodes.
  bigclam :
	Formulates community detection problems into non-negative matrix
	factorization and discovers community membership factors of nodes.
  cascadegen :
        Identifies cascades in a list of events.
  cascades :
        Simulates an SI (susceptible-infected) model on a network and computes
	structural properties of cascades.
  centrality :
	Computes node centrality measures for a graph: closeness, eigen,
	degree, betweenness, page rank, hubs and authorities.
  cesna :
        Implements a large scale overlapping community detection method
        for networks with node attributes based on Communities from
        Edge Structure and Node Attributes (CESNA).
  circles :
        Implements a method for identifying users' social circles.
  cliques :
	Finds overlapping dense groups of nodes in networks,
	based on the Clique Percolation Method.
  coda :
        Implements a large scale overlapping community detection method 
        based on Communities through Directed Affiliations (CoDA), which
        handles directed as well as undirected networks. The method is able
        to find 2-mode communities where the member nodes form a bipartite
        connectivity structure.
  community :
	Implements network community detection algorithms: Girvan-Newman,
	Clauset-Newman-Moore and Infomap.
  concomp :
        Computes weakly connected, strongly connected and biconnected
        components, articulation points and bridge edges of a graph.
  flows :
        Computes the maximum network flow in a network.
  forestfire : 
	Generates graphs using the Forest Fire model.
  graphgen : 
	Generates undirected graphs using one of the many SNAP graph generators.
  graphhash : 
	Demonstrates the use of TGHash graph hash table, useful for
	counting frequencies of small subgraphs or information cascades.
  infopath :
        Implements a stochastic algorithm for dynamic network inference from
        cascade data, see http://snap.stanford.edu/infopath/.
  kcores :
  	Computes the k-core decomposition of the network and plots
	the number of nodes in a k-core of a graph as a function of k.
  kronem : 
        Estimates Kronecker graph parameter matrix using the EM algorithm.
  kronfit : 
  	Estimates Kronecker graph parameter matrix.
  krongen : 
  	Generates Kronecker graphs.
  localmotifcluster :
	Implements a local method for motif-based clustering using MAPPR.
  lshtest :
	Implements locality sensitive hashing.
  magfit :
        Estimates Multiplicative Attribute Graph (MAG) model parameters.
  maggen : 
	Generates Multiplicative Attribute Graphs (MAG).
  mkdatasets :
	Demonstrates how to load different kinds of networks in various
	network formats and how to compute various statistics of the network.
  motifcluster : 
  	Implements a spectral method for motif-based clustering.	
  motifs : 
        Counts the number of occurrences of every possible subgraph on K nodes
        in the network.
  ncpplot : 
	Plots the Network Community Profile (NCP).
  netevol :
  	Computes properties of an evolving network, like evolution of 
  	diameter, densification power law, degree distribution, etc.
  netinf :
        Implements the NetInf algorithm for network inference from
	cascade data, see http://snap.stanford.edu/netinf.
  netstat :
  	Computes statistical properties of a static network, like degree
	distribution, hop plot, clustering coefficient, distribution of sizes
	of connected components, spectral properties of graph adjacency
	matrix, etc.
  randwalk :
        Computes Personalized PageRank between pairs of nodes.
  rolx :
        Implements the rolx algorithm for analysing the structural
        roles in the graph.
  testgraph :
	Demonstrates some of the basic SNAP functionality.
  temporalmotifs :
	Counts temporal motifs in temporal networks.
  zygote :
        Demonstrates how to use SNAP with the Zygote library, which
        significantly speeds up computations that need to process the
        same large graph many times.

/////////////////////////////////////////////////////////////////////////////

SNAP documentation:
  http://snap.stanford.edu/snap/doc.html

The library defines Graphs (nodes and edges) and Networks (graphs with data
associated with nodes and edges).

Graph types:
  TNGraph : 
  	directed graph (single directed edge between a pair of nodes)
  TUNGraph : 
  	undirected graph (single undirected edge between a pair of nodes)
  TNEGraph : 
  	directed multi-graph (multiple directed edges can exist between
        a pair of nodes)

Network types:
  TNodeNet<TNodeData> : 
  	like TNGraph, but with TNodeData object for each node
  TNodeEDatNet<TNodeData,TEdgeData> :
        like TNGraph, but with TNodeData object for each node and TEdgeData
        object for each edge
  TNodeEdgeNet<TNodeData, TEdgeData> : 
  	like TNEGraph but with TNodeData object for each node and TEdgeData
	object for each edge
  TNEANet :
        like TNEGraph, but with attributes on nodes and edges. The attributes
        are dynamic in that they can be defined at runtime
  TBigNet<TNodeData> : 
  	memory efficient implementation of TNodeNet (avoids memory
	fragmentation)

To generate reference manuals, install doxygen (www.doxygen.org), and execute:
  cd doxygen; make all    # generates user and developer reference manuals

/////////////////////////////////////////////////////////////////////////////

SNAP tutorials

Sample programs demonstrating the use of foundational SNAP classes and
functionality are available in the tutorials directory.

To compile all the tutorials, execute the following command line:
  cd tutorials; make all    # generates all the tutorials

/////////////////////////////////////////////////////////////////////////////

SNAP unit tests

Unit tests are available in the test directory.

To run unit tests, install googletest (code.google.com/p/googletest) and
execute:
  cd test; make run    # compiles and runs all the tests

/////////////////////////////////////////////////////////////////////////////

Description of SNAP files:
  http://snap.stanford.edu/snap/description.html

snap-core:
  alg.h : Simple algorithms like counting node degrees, simple graph
        manipulation (adding/deleting self edges, deleting isolated nodes)
        and testing whether graph is a tree or a star.
  anf.h : Approximate Neighborhood Function: linear time algorithm to
        approximately calculate the diameter of massive graphs.
  bfsdfs.h : Algorithms based on Breadth First Search (BFS) and Depth First
        Search (DFS): shortest paths, spanning trees, graph diameter, and
        similar.
  bignet.h : Memory efficient implementation of a network with data on
        nodes. Use when working with very large networks.
  casc.h : Computes cascades from a list of events.
  centr.h : Node centrality measures: closeness, betweenness, PageRank, ...
  cmty.h : Algorithms for network community detection: Modularity,
        Girvan-Newman, Clauset-Newman-Moore.
  cncom.h : Connected components: weakly, strongly and biconnected
        components, articulation points and bridge edges.
  ff.h : Forest Fire model for generating networks that densify and have
        shrinking diameters.
  flow.h : Maximum flow algorithms.
  gbase.h : Defines flags that are used to identify functionality of graphs.
  ggen.h : Various graph generators: random graphs, copying model,
        preferential attachment, RMAT, configuration model, Small world model.
  ghash.h : Hash table with directed graphs (TNGraph) as keys. Uses
        efficient adaptive approximate graph isomorphism testing to scale to
        large graphs. Useful when one wants to count frequencies of various
        small subgraphs or cascades.
  gio.h : Graph input output. Methods for loading and saving various textual
        and XML based graph formats: Pajek, ORA, DynNet, GraphML (GML), 
        Matlab.
  graph.h : Implements graph types TUNGraph, TNGraph and TNEGraph.
  gstat.h : Computes many structural properties of static and evolving networks.
  gsvd.h : Eigen and singular value decomposition of graph adjacency matrix.
  gviz.h : Interface to GraphViz for plotting small graphs.
  kcore.h : K-core decomposition of networks.
  network.h : Implements network types TNodeNet, TNodeEDatNet and TNodeEdgeNet.
  randwalk.h : Computing random walk scores and personalized PageRank
	between pairs of nodes
  Snap.h : Main include file of the library.
  statplot.h : Plots of various structural network properties: clustering,
        degrees, diameter, spectrum, connected components.
  subgraph.h : Extracting subgraphs and converting between different
        graph/network types.
  timenet.h : Temporally evolving networks.
  triad.h : Functions for counting triads (triples of connected nodes in the
        network) and computing clustering coefficient.
  util.h : Utilities to manipulate PDFs, CDFs and CCDFs. Quick and dirty
        string manipulation, URL and domain manipulation routines.

snap-adv:
  agm*.h : Implements the Affiliation Graph Model (AGM).
  cliques.h : Maximal clique detection and Clique Percolation method.
  graphcounter.h : Performs fast graph isomorphism testing to count the
        frequency of topologically distinct sub-graphs.
  kronecker.h : Kronecker Graph generator and KronFit algorithm for
        estimating parameters of Kronecker graphs.
  mag.h : Implements the Multiplicative Attribute Graph (MAG).
  motifcluster.h : Implements motif-based clustering algorithms.
  ncp.h : Network community profile plot. Implements local spectral graph
        partitioning method to efficiently find communities in networks.
  rolx.h : Node role detection.
  subgraphenum.h : Enumerates all connected induced sub-graphs of particular
        size.

snap-exp:
  arxiv.h : Functions for parsing Arxiv data and standardizing author names.
  dblp.h : Parser for XML dump of DBLP data.
  imdbnet.h : Actors-to-movies bipartite network of IMDB.
  mxdag.h : Finds the maximum directed-acyclic subgraph of a given
        directed graph.
  signnet.h : Networks with signed (+1, -1) edges that can denote
        trust/distrust between the nodes of the network.
  sir.h : SIR epidemic model and SIR parameter estimation.
  spinn3r.h : Fast parser for loading blog post data from Spinn3r.
  trawling.h : Algorithm for extracting bipartite cliques from the network.
  wgtnet.h : Weighted networks.
  wikinet.h : Networks based on Wikipedia.



snap's People

Contributors

agrimgupta92, arbenson, arijit91, averywang21, bpedrood, chentai-kao, davidlizeng, dsardina, farzaank, jayang, jcccf, julianmcauley, martinraison, nikhilkhadke, nnathur, nshelly, pararthshah, profjure, rchengyue, richardhsu, roks, ruth-ann, scoutsaachi, shubhamg31, sramas15, tbq, vikeshkhanna, viswajithiii, visweshk, yonathanp

snap's Issues

macros in glib-core/bd.h causing problems

I have used the snap library without any problems for some time now, but lately I stumbled upon the following:

There are two macros in glib-core/bd.h which sometimes cause problems:

#define min(a,b) ((a)<(b)?(a):(b))

#define max(a,b) ((a)>(b)?(a):(b))

The problem is described here http://wordaligned.org/articles/macros-with-halos
I have the same exact issue. In some cases changing the include order fixes the problem but sometimes not. In order to work with snap, before I include my files I had to do:

#undef min
#undef max

The above solution works but it's still dirty.
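
The collision can be reproduced without SNAP. The following is a minimal sketch, assuming the two macros are defined exactly as quoted above; here they are defined locally as stand-ins for glib-core/bd.h, and ClampParen and ClampUndef are hypothetical helper names. It shows two common workarounds: wrapping the function name in parentheses so the function-like macro is not expanded, and the #undef approach already mentioned.

```cpp
#include <algorithm>
#include <cassert>

// Stand-ins for the two macros quoted above (assumption: same definitions
// as in glib-core/bd.h). They are defined after <algorithm> is included,
// mimicking a project that includes standard headers first.
#define min(a,b) ((a)<(b)?(a):(b))
#define max(a,b) ((a)>(b)?(a):(b))

// Workaround 1: parenthesize the function name. A function-like macro is
// only expanded when its name is directly followed by '(', so
// (std::min)(...) reaches the real std::min even while the macro exists.
inline int ClampParen(int V, int Lo, int Hi) {
  assert(Lo <= Hi);
  return (std::max)(Lo, (std::min)(V, Hi));
}

// Workaround 2: remove the macros before any code that needs std::min
// or std::max. This is the approach described in the report above.
#undef min
#undef max

inline int ClampUndef(int V, int Lo, int Hi) {
  assert(Lo <= Hi);
  return std::max(Lo, std::min(V, Hi));
}
```

Both helpers return the same result for the same input, for example ClampParen(5, 0, 3) and ClampUndef(5, 0, 3) both yield 3. The parenthesized form has the advantage that it does not depend on include order.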

Average Pathlength and all-to-all distance

Hi, this is an enhancement request.

Please include a function to compute the all-to-all distance matrix and also the average pathlength of a graph.

I am testing packages because I need to compute the pathlength of graphs of size 10^4 and larger. My own package will struggle at that size. I came across SNAP and I found that the only manner to compute the average pathlength is to use the function GetShortPath() to compute the distance for one node to all others and iterate that for all nodes. While this is already a waste of computational effort, in my case it is even worse because I tried the Python interface and having to do that for-loop in Python spoils any enhancement the library has to offer.

Best.
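
To make the request concrete, here is a minimal sketch of the computation being asked for: one breadth-first search per node, averaged over all reachable ordered pairs. It uses only the C++ standard library and is not SNAP code; TAdjList and AveragePathLength are hypothetical names, and the graph is assumed to be unweighted with nodes numbered 0..N-1.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Adjacency list of an unweighted graph: Adj[U] lists the neighbours of U.
// For an undirected graph, each edge appears in both endpoint lists.
typedef std::vector<std::vector<int> > TAdjList;

// Returns the mean shortest-path length over all ordered pairs (U, V),
// U != V, where V is reachable from U. Returns 0.0 if no such pair exists.
// Cost is one BFS per node, i.e. O(N * (N + E)) time and O(N) extra memory.
double AveragePathLength(const TAdjList& Adj) {
  const std::size_t N = Adj.size();
  long long DistSum = 0;
  long long Pairs = 0;
  std::vector<int> Dist(N);
  for (std::size_t Src = 0; Src < N; ++Src) {
    Dist.assign(N, -1);  // -1 marks "not reached yet"
    Dist[Src] = 0;
    std::queue<std::size_t> Queue;
    Queue.push(Src);
    while (!Queue.empty()) {
      const std::size_t U = Queue.front();
      Queue.pop();
      for (std::size_t I = 0; I < Adj[U].size(); ++I) {
        const int V = Adj[U][I];
        assert(V >= 0 && static_cast<std::size_t>(V) < N);
        if (Dist[V] >= 0) continue;  // already visited
        Dist[V] = Dist[U] + 1;
        DistSum += Dist[V];
        ++Pairs;
        Queue.push(static_cast<std::size_t>(V));
      }
    }
  }
  return Pairs == 0 ? 0.0
                    : static_cast<double>(DistSum) / static_cast<double>(Pairs);
}
```

Because only the running sum is kept, the full all-to-all distance matrix is never stored; storing it for 10^4 nodes would need on the order of 10^8 entries.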

build fails

build (make all) fails:
make all
/Applications/Xcode.app/Contents/Developer/usr/bin/make -C snap-core
g++ -c -std=c++98 -Wall -O3 -DNDEBUG -DNOMP Snap.cpp -I../glib-core
Snap.cpp:29:10: fatal error: 'triad.cpp' file not found

#include "triad.cpp" // triad calculations
         ^

1 error generated.
make[1]: *** [Snap.o] Error 1
make: *** [MakeAll] Error 2

Which function gives me Log Likelihood for MagFit?

Which function in the MAGFit classes gives me the Log Likelihood (measures the possibility that the probabilistic adjacency matrix P generates network A)? I can find this in KronFit to be LogLike() but cannot figure this out for MAGFit. Please help! Thanks!

clang compiler error

Due to a problem with gcc (Issue #26), I tried to compile snap with clang.
I got an error in the bigclam target because clang didn't support OpenMP.

bigclam.cpp:6:10: fatal error: 'omp.h' file not found

$ clang++ -v
Ubuntu clang version 3.2-1~exp9ubuntu1 (tags/RELEASE_32/final) (based on LLVM 3.2)
Target: x86_64-pc-linux-gnu
Thread model: posix

examples/rolx does not build on OS X 10.10.3

Russells-MacBook-Pro-OLD:rolx rjurney$ make
g++ -fopenmp -o testrolx testrolx.cpp ../../snap-adv/rolx.cpp ../../snap-core/Snap.o -I../../snap-core -I../../snap-adv -I../../glib-core -I../../snap-exp  
ld: library not found for -lgomp
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [testrolx] Error 1

Problem with genSmallWorld

I am trying to generate a graph with the GenSmallWorld function using different rewiring probabilities.
When the probability is greater than 0, the graph has some missing edges.
For instance, a graph with 12 nodes and 4 neighbours per node should have 24 edges, but I get 22.
What am I doing wrong? This is not how the Watts-Strogatz model should behave: edges should be rewired, not deleted.
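
For reference, the expected behaviour can be sketched as follows. This is not the implementation of SNAP's GenSmallWorld, and it makes no claim about why SNAP drops edges; it is a standard-library sketch of a Watts-Strogatz style generator in which a rewiring attempt that would create a self-loop or a duplicate edge is skipped, so the edge count stays at N*K/2. SmallWorldSketch, MakeEdge and TEdge are hypothetical names, and K is assumed to be the total (even) number of ring neighbours per node.

```cpp
#include <cassert>
#include <cstdlib>
#include <set>
#include <utility>

typedef std::pair<int, int> TEdge;

// Stores an undirected edge with the smaller endpoint first, so that
// (U, V) and (V, U) map to the same set element.
static TEdge MakeEdge(int U, int V) {
  return U < V ? TEdge(U, V) : TEdge(V, U);
}

// Builds a ring lattice on N nodes where each node is linked to its K/2
// nearest neighbours on each side, then rewires each lattice edge with
// probability P. A rewiring that would produce a self-loop or an existing
// edge is skipped, so the result always has exactly N*K/2 edges.
std::set<TEdge> SmallWorldSketch(int N, int K, double P, unsigned Seed) {
  assert(K >= 2 && K % 2 == 0 && N > K);
  std::srand(Seed);
  std::set<TEdge> Edges;
  for (int U = 0; U < N; ++U) {
    for (int J = 1; J <= K / 2; ++J) {
      Edges.insert(MakeEdge(U, (U + J) % N));
    }
  }
  for (int U = 0; U < N; ++U) {
    for (int J = 1; J <= K / 2; ++J) {
      const double Coin = std::rand() / (RAND_MAX + 1.0);  // in [0, 1)
      if (Coin >= P) continue;  // keep this lattice edge
      const int W = std::rand() % N;  // candidate new endpoint
      const TEdge New = MakeEdge(U, W);
      if (W == U || Edges.count(New) != 0) continue;  // would lose an edge
      // Swap the old edge for the new one; the count is unchanged.
      if (Edges.erase(MakeEdge(U, (U + J) % N)) == 1) {
        Edges.insert(New);
      }
    }
  }
  return Edges;
}
```

With N = 12 and K = 4 this returns 24 edges for any rewiring probability, which is the count the report above expects.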

snap quicktest.py

On snap.stanford.edu, the file quicktest.py is broken. Line 12 needs to be altered to:

print ("SUCCESS, your version of Snap.py is ",version)

and line 14 needs parentheses (although I think this only matters for Python 3+).

I have uploaded the fixed copy.

quick_test.py.txt

Consider upgrading the build system to Autoconf/ Automake

It might be worth considering upgrading SNAP to take advantage of the Autoconf/Automake system to allow SNAP to more easily utilize third party dependencies like libgefx and libhdfs.

It would also allow SNAP to be more portable, and use the standard make command:

./configure && make && make install

import snap Fatal Python error: PyThreadState_Get: no current thread

Python 2.7.10 (v2.7.10:15c95b7d81dc, May 23 2015, 09:33:12)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import snap
Fatal Python error: PyThreadState_Get: no current thread
Abort trap: 6

Please help. The error log is below:

Process: Python [11381]
Path: /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
Identifier: Python
Version: 2.7.10 (2.7.10)
Code Type: X86-64 (Native)
Parent Process: bash [11371]
Responsible: iTerm [7379]
User ID: 501

Date/Time: 2015-10-16 20:26:42.407 -0500
OS Version: Mac OS X 10.11 (15A284)
Report Version: 11
Anonymous UUID: D5054451-BC61-CA61-9BF6-89966A16F02C

Sleep/Wake UUID: 3BF6D8EC-E1CA-4FBD-9843-44EF0D90D3C8

Time Awake Since Boot: 11000 seconds
Time Since Wake: 1500 seconds

System Integrity Protection: enabled

Crashed Thread: 0 Dispatch queue: com.apple.main-thread

Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000

Application Specific Information:
abort() called

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x00007fff8894b0ae pthread_kill + 10
1 libsystem_pthread.dylib 0x00007fff969c7500 pthread_kill + 90
2 libsystem_c.dylib 0x00007fff8cef937b abort + 129
3 org.python.python 0x0000000105a69ce2 Py_FatalError + 49
4 org.python.python 0x0000000105a691f0 PyThreadState_Get + 28
5 org.python.python 0x0000000105a6603a Py_InitModule4_64 + 62
6 _snap.so 0x000000010443697f init_snap + 559
7 org.python.python 0x00000001000deba1 _PyImport_LoadDynamicModule + 177
8 org.python.python 0x00000001000dcc58 imp_load_module + 184
9 org.python.python 0x00000001000c357d PyEval_EvalFrameEx + 24829
10 org.python.python 0x00000001000c467e PyEval_EvalFrameEx + 29182
11 org.python.python 0x00000001000c58e3 PyEval_EvalCodeEx + 2115
12 org.python.python 0x00000001000c5a06 PyEval_EvalCode + 54
13 org.python.python 0x00000001000da0a0 PyImport_ExecCodeModuleEx + 208
14 org.python.python 0x00000001000db2a2 load_source_module + 626
15 org.python.python 0x00000001000dd28b import_submodule + 315
16 org.python.python 0x00000001000dd73a load_next + 234
17 org.python.python 0x00000001000dda30 PyImport_ImportModuleLevel + 336
18 org.python.python 0x00000001000bafe3 builtin___import__ + 131
19 org.python.python 0x000000010000c612 PyObject_Call + 98
20 org.python.python 0x00000001000bc1c7 PyEval_CallObjectWithKeywords + 87
21 org.python.python 0x00000001000c0432 PyEval_EvalFrameEx + 12210
22 org.python.python 0x00000001000c58e3 PyEval_EvalCodeEx + 2115
23 org.python.python 0x00000001000c5a06 PyEval_EvalCode + 54
24 org.python.python 0x00000001000e9f4c PyRun_InteractiveOneFlags + 380
25 org.python.python 0x00000001000ea1ae PyRun_InteractiveLoopFlags + 78
26 org.python.python 0x00000001000ea9c1 PyRun_AnyFileExFlags + 161
27 org.python.python 0x000000010010187d Py_Main + 3101
28 org.python.python 0x0000000100000f14 0x100000000 + 3860

Thread 0 crashed with X86 Thread State (64-bit):
rax: 0x0000000000000000 rbx: 0x0000000000000006 rcx: 0x00007fff5fbfe638 rdx: 0x0000000000000000
rdi: 0x0000000000000a0b rsi: 0x0000000000000006 rbp: 0x00007fff5fbfe660 rsp: 0x00007fff5fbfe638
r8: 0x0000000000000040 r9: 0x00007fff758f91e0 r10: 0x0000000008000000 r11: 0x0000000000000206
r12: 0x0000000000000000 r13: 0x0000000105429efb r14: 0x00007fff76e9a000 r15: 0x00000001036355d4
rip: 0x00007fff8894b0ae rfl: 0x0000000000000206 cr2: 0x00007fff758f7038

Logical CPU: 0
Error Code: 0x02000148
Trap Number: 133

Binary Images:
0x100000000 - 0x100000fff +org.python.python (2.7.10 - 2.7.10) /Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python
0x100003000 - 0x100170fff +org.python.python (2.7.10, [c] 2001-2015 Python Software Foundation. - 2.7.10) <5E0C1150-83D5-6364-A820-E7AD67962D79> /Library/Frameworks/Python.framework/Versions/2.7/Python
0x1002f2000 - 0x1002f4ff7 +_locale.so (???) <3C1429AD-B0EF-96BF-9E7E-2F7B48975B36> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/_locale.so
0x1020f0000 - 0x1020f2ff7 +readline.so (???) /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload/readline.so
0x102240000 - 0x102294fe7 +libncursesw.5.dylib (5) <3F0079C0-01C1-3CB8-19CA-F9B49AA4F4A4> /Library/Frameworks/Python.framework/Versions/2.7/lib/libncursesw.5.dylib
0x103680000 - 0x1055f6fe7 +_snap.so (0) <3D2FCCCC-4937-3BC0-AF45-3FAAE655B204> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/_snap.so
0x1059c8000 - 0x105ab9ff7 org.python.python (2.7.10 - 2.7.10) <5A7838D3-24D4-395B-BE96-ADD007C86E55> /System/Library/Frameworks/Python.framework/Versions/2.7/Python
0x7fff65a4e000 - 0x7fff65a84f5f dyld (360.14) /usr/lib/dyld
0x7fff8576f000 - 0x7fff8579cfff libdispatch.dylib (500.1.5) <6B38497E-9448-3433-9D6B-6223F2A99431> /usr/lib/system/libdispatch.dylib
0x7fff85a25000 - 0x7fff85a27ff7 libsystem_configuration.dylib (801.1.1) /usr/lib/system/libsystem_configuration.dylib
0x7fff85c2c000 - 0x7fff85c4affb libedit.3.dylib (43) <744915BA-9B98-3256-8DBB-5C760132623F> /usr/lib/libedit.3.dylib
0x7fff85c4b000 - 0x7fff85c52ff7 libcompiler_rt.dylib (62) <253B36E5-572D-377D-AE99-A02CE32590E5> /usr/lib/system/libcompiler_rt.dylib
0x7fff86670000 - 0x7fff866beff7 libstdc++.6.dylib (104.1) <77780A99-22DB-35AA-BD9E-ADB83417E4BD> /usr/lib/libstdc++.6.dylib
0x7fff866d2000 - 0x7fff866eeff7 libsystem_malloc.dylib (67) <1B57A614-3D60-3F87-876F-7DB4AF38120F> /usr/lib/system/libsystem_malloc.dylib
0x7fff86d02000 - 0x7fff86d03ffb libSystem.B.dylib (1225.1.1) /usr/lib/libSystem.B.dylib
0x7fff86f0d000 - 0x7fff8711afff libicucore.A.dylib (551.24) /usr/lib/libicucore.A.dylib
0x7fff873c0000 - 0x7fff87406ff7 libauto.dylib (186) <460B0167-C89B-37EC-823C-52F684B31C26> /usr/lib/libauto.dylib
0x7fff8745f000 - 0x7fff878d3ff7 com.apple.CoreFoundation (6.9 - 1253) /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
0x7fff878d4000 - 0x7fff87c376d7 libobjc.A.dylib (680) <7C5FAD04-2C01-3ED6-AA40-78925C12A456> /usr/lib/libobjc.A.dylib
0x7fff87ca2000 - 0x7fff87caafef libsystem_platform.dylib (73.1.1) <3F4D2390-E3DE-3C24-A515-95DFAC8671C4> /usr/lib/system/libsystem_platform.dylib
0x7fff87cab000 - 0x7fff87d22fc7 libcorecrypto.dylib (334) <4E1B969F-8449-3B21-9880-51AD58E25AA6> /usr/lib/system/libcorecrypto.dylib
0x7fff87d23000 - 0x7fff87d3afff libsystem_asl.dylib (322) <97D794DA-8CE5-3676-AC5E-364F6D172BDA> /usr/lib/system/libsystem_asl.dylib
0x7fff881c4000 - 0x7fff88217ff7 libc++.1.dylib (120.1) <54190E1B-EE49-3D6D-AC29-2813D7380BA5> /usr/lib/libc++.1.dylib
0x7fff887e4000 - 0x7fff88813fc3 libsystem_m.dylib (3105) <07D50372-30ED-3B03-9FA0-0662BF8F0098> /usr/lib/system/libsystem_m.dylib
0x7fff88934000 - 0x7fff88952fff libsystem_kernel.dylib (3247.1.106) <7DD242A1-E2BF-39D1-8787-B174046E4F15> /usr/lib/system/libsystem_kernel.dylib
0x7fff88960000 - 0x7fff88965ff3 libunwind.dylib (35.3) /usr/lib/system/libunwind.dylib
0x7fff89504000 - 0x7fff89504ff7 liblaunch.dylib (755.1.19) <7EC0F297-43CC-3D11-B46B-7E72E372648A> /usr/lib/system/liblaunch.dylib
0x7fff8a6a5000 - 0x7fff8a6b6fff libz.1.dylib (60) <43317BEA-ACA2-34C2-AF37-902AA926C83A> /usr/lib/libz.1.dylib
0x7fff8a6f1000 - 0x7fff8a6faffb libsystem_notify.dylib (149) <56ABC155-CB99-30A8-A8B1-C204B5615092> /usr/lib/system/libsystem_notify.dylib
0x7fff8a98f000 - 0x7fff8a992fff libsystem_sandbox.dylib (459.1.8) <2F36D536-482C-39EC-BAFD-72297728F0A4> /usr/lib/system/libsystem_sandbox.dylib
0x7fff8afb6000 - 0x7fff8afb6ff7 libunc.dylib (29) /usr/lib/system/libunc.dylib
0x7fff8b280000 - 0x7fff8b281fff libDiagnosticMessagesClient.dylib (100) /usr/lib/libDiagnosticMessagesClient.dylib
0x7fff8ce9b000 - 0x7fff8cf28fe7 libsystem_c.dylib (1081.1.3) /usr/lib/system/libsystem_c.dylib
0x7fff8d9bd000 - 0x7fff8d9c2ff7 libmacho.dylib (875.1) /usr/lib/system/libmacho.dylib
0x7fff8d9c3000 - 0x7fff8d9cbfff libcopyfile.dylib (127) /usr/lib/system/libcopyfile.dylib
0x7fff8e038000 - 0x7fff8e096fff libsystem_network.dylib (582.1.4) <14ECA259-D471-3E47-A843-FF0990577893> /usr/lib/system/libsystem_network.dylib
0x7fff8e097000 - 0x7fff8e097ff7 libkeymgr.dylib (28) <47080280-8B57-3D75-8A20-9E100864DE27> /usr/lib/system/libkeymgr.dylib
0x7fff923a8000 - 0x7fff923acfff libcache.dylib (75) <4948E2C8-867F-3E9D-AAE7-2F30F0B345C6> /usr/lib/system/libcache.dylib
0x7fff925a5000 - 0x7fff925a6fff libsystem_secinit.dylib (20) <932ED582-E80F-39DA-B0FA-F1BC5F1AD2F8> /usr/lib/system/libsystem_secinit.dylib
0x7fff925b8000 - 0x7fff925b9ffb libremovefile.dylib (41) /usr/lib/system/libremovefile.dylib
0x7fff92851000 - 0x7fff9287afff libc++abi.dylib (125) /usr/lib/libc++abi.dylib
0x7fff92901000 - 0x7fff9292afff libxpc.dylib (755.1.19) <3E09C275-A33B-357A-B0AB-A2DDF88EC9D5> /usr/lib/system/libxpc.dylib
0x7fff93025000 - 0x7fff93056ff7 libncurses.5.4.dylib (46) <766F2188-F523-3FAA-AC1F-49447F09E133> /usr/lib/libncurses.5.4.dylib
0x7fff93137000 - 0x7fff93139fff libsystem_coreservices.dylib (19) <692631A0-1923-32CA-9BD5-044B1382FFDE> /usr/lib/system/libsystem_coreservices.dylib
0x7fff932c3000 - 0x7fff932ecff7 libsystem_info.dylib (476) <65D0643A-C8AE-3E8D-9F6E-E4AD823F16B2> /usr/lib/system/libsystem_info.dylib
0x7fff95a07000 - 0x7fff95a09ff7 libquarantine.dylib (80) <1693C5FE-EA0A-3122-85EB-7950ECC7435A> /usr/lib/system/libquarantine.dylib
0x7fff95d22000 - 0x7fff95d2afff libsystem_networkextension.dylib (384.1.2) <4736FCC5-9DBA-31F4-AAC8-CD0A177CF502> /usr/lib/system/libsystem_networkextension.dylib
0x7fff9694e000 - 0x7fff96962fff libsystem_coretls.dylib (82) <21EDACF1-D9B3-3086-9821-60EB75E7F965> /usr/lib/system/libsystem_coretls.dylib
0x7fff96966000 - 0x7fff96967fff libsystem_blocks.dylib (65) <1B4F1F10-823E-3781-8162-6884D14DF0D6> /usr/lib/system/libsystem_blocks.dylib
0x7fff969c1000 - 0x7fff969caff7 libsystem_pthread.dylib (137.1.1) <1373D0F1-C6CA-364E-A6BA-8BDBD0D34670> /usr/lib/system/libsystem_pthread.dylib
0x7fff96a45000 - 0x7fff96a56ff7 libsystem_trace.dylib (200) /usr/lib/system/libsystem_trace.dylib
0x7fff97794000 - 0x7fff97797ffb libdyld.dylib (360.14) /usr/lib/system/libdyld.dylib
0x7fff97798000 - 0x7fff977a3ff7 libcommonCrypto.dylib (60074) /usr/lib/system/libcommonCrypto.dylib
0x7fff98089000 - 0x7fff98091ffb libsystem_dnssd.dylib (624.1.2) /usr/lib/system/libsystem_dnssd.dylib

External Modification Summary:
Calls made by other processes targeting this process:
task_for_pid: 0
thread_create: 0
thread_set_state: 0
Calls made by this process:
task_for_pid: 0
thread_create: 0
thread_set_state: 0
Calls made by all processes on this machine:
task_for_pid: 18946
thread_create: 0
thread_set_state: 0

VM Region Summary:
ReadOnly portion of Libraries: Total=140.2M resident=0K(0%) swapped_out_or_unallocated=140.2M(100%)
Writable regions: Total=60.0M written=0K(0%) resident=0K(0%) swapped_out=0K(0%) unallocated=60.0M(100%)

                       VIRTUAL   REGION
REGION TYPE               SIZE    COUNT (non-coalesced)
===========            =======  =======
Activity Tracing         2048K        2
Kernel Alloc Once           4K        2
MALLOC                   49.7M       18
MALLOC guard page          16K        4
STACK GUARD              56.0M        2
Stack                    8192K        2
VM_ALLOCATE                 4K        2
__DATA                   4184K       57
__LINKEDIT               91.4M       10
__TEXT                   48.8M       55
__UNICODE                 552K        2
shared memory               8K        3
===========            =======  =======
TOTAL                   260.5M      147

PNEANetMP: Floating point exception (immediate)

Hi,

I am trying to test the new multithreaded PNEANetMP. However, I cannot even add a node. Here is code that reproduces the bug:

#include "Snap.h"
using namespace std;

int main() {
    PNEANetMP gg = TNEANetMP::New();
    gg->AddNode(1);
}

The gg->AddNode(1) line results in the following error and crash: "Floating point exception (core dumped)".

Let me know if I can help with debugging/testing, or if my code does something wrong. Thanks!

Node Iterator Misbehavior

I am using the C++ version of SNAP. I have an undirected graph with 174093228 nodes and 309401867 edges. The maximum node id in the graph is 174093228. The node iterator behaves strangely when I pass it node IDs that do not exist in the graph. Please see the commands and the outputs below.

printf("%d\n", Graph->IsNode(500000000)); //output: 0
TUNGraph::TNodeI NI = Graph->GetNI(500000000);
printf("%d\n", NI.GetOutDeg()); //output: 1
printf("%d\n", NI.GetOutNId(0)); //output: 111788745
printf("%d\n", Graph->IsNode(NI.GetOutNId(0))); //output: 1
printf("%d\n", Graph->IsEdge(500000000, NI.GetOutNId(0))); //output: 0

R6010 - abort() has been called error

Hi, I ran the centrality code example on another dataset, which has about 6 million nodes and 9 million edges, and the program crashed with the following error message:

R6010
- abort() has been called

My OS is Windows Server 2008 with 192 GB RAM, and I use VS10. Thx.

LoadEdgeListStr does not work in Python

Have been trying to use the 'LoadEdgeListStr' function so that I could map the actual node value to the assigned node ID. Didn't manage to do so however.

One likely cause is that the documented syntax "LoadEdgeList(PGraph, InFNm, SrcColId, DstColId, Separator)" does not match the code.

For example, if I run the command "snap.LoadEdgeListStr(snap.PUNGraph, "", 0, 1, H)", I keep getting an error message as below:

" TypeError: in method 'LoadEdgeListStr_PUNGraph', argument 4 of type 'TStrHash< TInt > &' "

But if I simply run "snap.LoadEdgeListStr( "", 0, 1, H)" (as referenced from the source code in http://snap.stanford.edu/snap/doc/snapdev-ref/db/d68/gio_8h_source.html) I get no error at all.

Is this actually a bug or am I missing something? Currently I'm using SNAP distribution "snap-1.1-2.3-Win-x64-py2.7"

Integer overflow bug in TSnap::GenRndGnm

Location:
snap-core/ggen.h: 218

The assert checks that the desired number of edges in the generated graph does not exceed the maximum number of edges possible. However, this arithmetic is done on variables of type 'int', which are generally compiled to 32-bit integers even on x86_64 architectures. As a result, the assertion can fail due to integer overflow even though there are enough nodes to fit the desired edges. For instance, the error is encountered when trying to generate a graph with 1 million nodes and 1 million edges.

Replacing these with uint64_t will solve the problem, and some similar modifications within the function would be necessary to eliminate the resulting type casting warnings.
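The overflow can be checked without SNAP. The following is a minimal sketch, not the code at ggen.h:218: the helper names are hypothetical, and it uses int64_t rather than the uint64_t suggested above so that negative inputs stay detectable. It does the edge-count check in 64-bit arithmetic and reports when the same product would not fit in a 32-bit int.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of SNAP: the largest number of edges a simple
// graph on `nodes` nodes can hold, computed in 64-bit arithmetic.
// The product fits in int64_t for node counts up to roughly 3 billion.
inline std::int64_t MaxEdges(std::int64_t nodes, bool directed) {
  assert(nodes >= 0);
  const std::int64_t pairs = nodes * (nodes - 1);  // ordered pairs, no self-loops
  return directed ? pairs : pairs / 2;
}

// True when nodes * (nodes - 1) does not fit in a 32-bit signed int,
// i.e. when an int-based version of the check would overflow.
inline bool OverflowsInt32(std::int64_t nodes) {
  return nodes * (nodes - 1) > std::numeric_limits<std::int32_t>::max();
}

// The check the assertion is meant to perform, done in 64 bits.
inline bool EdgeCountIsValid(std::int64_t nodes, std::int64_t edges, bool directed) {
  return edges >= 0 && edges <= MaxEdges(nodes, directed);
}
```

For the case from the report, EdgeCountIsValid(1000000, 1000000, false) holds, while OverflowsInt32(1000000) shows why a 32-bit version of the same product cannot be trusted.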

Problem with netstat, hop plots and large graphs

Hi Jure,

I recently downloaded the latest version of SNAP and tried to feed netstat the latest Wikipedia (English version) link network dump. The hop plot operation seems to fail when it is applied (all other flags work fine). Any ideas why this happens?

Cheers,
Chris

Edit: I am running Mac OS X 10.8.2

Here is my output:

./netstat -i:/Users/xt/wikipedia_links.txt -p:cdwsh

GraphInfo. build: 15:39:09, Nov 31 2012. Time: 01:18:39 [Nov 31 2012]

Input graph (one edge per line, tab/space separated) (-i:)=/Users/xt/wikipedia_links.txt
Directed graph (-d:)=Yes
Output file prefix (-o:)=graph
Title (description) (-t:)=
What statistics to plot string:
c: cummulative degree distribution
d: degree distribution
h: hop plot (diameter)
w: distribution of weakly connected components
s: distribution of strongly connected components
C: clustering coefficient
v: singular values
V: left and right singular vector

(-p:)=cdwsh

Loading...directed graph (TXT format)
/Users/xt/wikipedia_links.txt: Directed
Nodes: 27534914
Edges: 596307448
Zero Deg Nodes: 0
Zero InDeg Nodes: 15266343
Zero OutDeg Nodes: 353159
NonZero In-Out Deg Nodes: 11915412
Creating plots...
*** Error: Execution stopped: 0<=_Vals, file ../../glib-core/ds.h, line 443
*** Error: Execution stopped: ((this!=GetNullRStr())&&(Refs==0))||((this==GetNullRStr())&&(Refs==1)), file ../../glib-core/dt.h, line 351

Node iterator on the graph returned by GetBfsTree behaves incorrectly

Iterating through the non-root nodes and displaying tree->GetNI(node)->GetInDeg() sometimes shows values larger than 1 (meaning that a node would have more than one parent).

Iterating through the values contained in GetInNId() I see the same value multiple times (so it seems that for some reason there are multiple edges from the parent to the child node).

An error in community detection using the CNM method

I get an error when I use the method provided by SNAP to detect communities in a social network dataset. The details are in the attachments: "cnm_test.py", the source code of my test program; "ca-CondMat.txt", a social network dataset; and "log.txt", which records the error output on my computer. I look forward to your reply. Thank you for your help!

Contents of the file "cnm_test.py":

#!/usr/bin/python
import sys
import snap

def create_graph_from_file(filename):
    sf = open(filename)
    lines = sf.readlines()
    TG = snap.TUNGraph.New()
    for item in lines:
        if '#' in item:
            continue
        item = item[:-1]
        item = item.split('\t')
        if TG.IsNode(int(item[0])) == False:
            TG.AddNode(int(item[0]))
        if TG.IsNode(int(item[1])) == False:
            TG.AddNode(int(item[1]))
        TG.AddEdge(int(item[1]),int(item[0]))
    print "load nodes : %d" % TG.GetNodes()
    sf.close()
    return TG

def main():
    print "create graph start!!"
    G = create_graph_from_file("./ca-CondMat.txt");
    print "create graph finish!!"

    CmtyV = snap.TCnComV()
    modularity = snap.CommunityCNM(G,CmtyV)
    ret_list = []
    for Cmty in CmtyV:
        temp = []
        for NI in Cmty:
            temp.append(NI)
        ret_list.append(temp)
    print "community size: %d " % len(ret_list)
    return 0;

if __name__ == "__main__":
    main()

Contents of the file "log.txt":

create graph start!!
load nodes : 23133
create graph finish!!
Traceback (most recent call last):
  File "./cnm_test.py", line 40, in <module>
    main()
  File "./cnm_test.py", line 29, in main
    modularity = snap.CommunityCNM(G,CmtyV)
  File "/usr/local/lib/python2.7/dist-packages/snap.py", line 35949, in CommunityCNM
    return _snap.CommunityCNM(*args)
RuntimeError: Execution stopped: (0<=ValN)&&(ValN<Vals) [Reason:'Index:-1 Vals:23133 MxVals:23133 Type:4TVecI11THashKeyDatI4TIntN5TSnap11TSnapDetail11TCNMQMatrix8TCmtyDatEEiE'], file ../../snap/glib-core/ds.h, line 469

Node (Edge) betweenness centrality for directed graphs ?

Hi, guys

Does SNAP provide algorithms for calculating betweenness centrality for directed graphs?

I have found 4 related functions, but all of them only accept undirected graphs as parameters, as follows:

void TSnap::GetBetweennessCentr (const PUNGraph &Graph, TIntFltH &NodeBtwH, const double &NodeFrac)

void TSnap::GetBetweennessCentr (const PUNGraph &Graph, TIntPrFltH &EdgeBtwH, const double &NodeFrac)

void TSnap::GetBetweennessCentr (const PUNGraph &Graph, TIntFltH &NodeBtwH, TIntPrFltH &EdgeBtwH, const double &NodeFrac)

void TSnap::GetBetweennessCentr (const PUNGraph &Graph, const TIntV &BtwNIdV, TIntFltH &NodeBtwH, const bool &DoNodeCent, TIntPrFltH &EdgeBtwH, const bool &DoEdgeCent)

TStr thread safety

I noticed that TStr is not thread-safe. It would be useful to have comments warning about this. Here is an example I ran into:

Let C be a TStr and consider these two lines running in parallel:
TStr A = C;
TStr B = C;

C.RStr->MkRef() would then run concurrently on both threads, which can leave the reference count incorrect.
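One common way to make such a count safe is an atomic counter. The sketch below illustrates that general technique only; TSharedBody is a hypothetical stand-in, not SNAP's string body class, and nothing here is taken from the SNAP sources.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference-counted body. Not SNAP code.
class TSharedBody {
public:
  TSharedBody() : Refs(0) {}

  // fetch_add is a single atomic read-modify-write, so two threads that
  // copy the same handle at once cannot lose an increment.
  void MkRef() { Refs.fetch_add(1, std::memory_order_relaxed); }

  // Returns true when the caller released the last reference and is
  // therefore the one responsible for deleting the body.
  bool UnRef() {
    assert(Refs.load() > 0);
    return Refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
  }

  int GetRefs() const { return Refs.load(std::memory_order_acquire); }

private:
  std::atomic<int> Refs;
};
```

With a plain int, `Refs++` is a separate load, add, and store, so two concurrent copies can both read the same old value and one increment is lost. The atomic version removes that race on the counter itself; it does not make other operations on the string thread-safe.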

Log Likelihood Modification in Kronfit

I was also wondering what changes I should make in the code to compute the log likelihood in Kronfit so that the probability of edges for nodes that do not exist in the initial graph is treated as zero. I tried the KronEM version, but it scales the initial parameters after 1 iteration and fails with an error when run for a single iteration of both E and M.

Thanks!

defining max and min in Snap conflicts with other uses

Currently, SNAP defines min and max as preprocessor macros. If another library or the developer's own code uses min / max functions with other signatures, this causes a conflict.

The solution I came up with is to undefine min/max after including SNAP:

#include "snap/snap-core/Snap.h"
#undef max
#undef min

Otherwise, trying to use, for example, numeric_limits::min() will generate a compile error. This also implies that some standard library headers or custom headers need to be included afterwards.
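A second workaround that does not depend on include order is to parenthesize the function name, which stops the preprocessor from treating it as a function-like macro call. The sketch below shows both approaches. The two macros are stand-ins with an assumed shape, not copied from SNAP, and the helper function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Stand-ins for the function-like macros described above (assumed shape).
// The standard headers are included before this point.
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// A function-like macro is only expanded when its name is followed by '('.
// Wrapping the name in parentheses therefore reaches the real functions
// even while the macros are defined.
inline int SmallestInt() { return (std::numeric_limits<int>::min)(); }
inline int LargestInt() { return (std::numeric_limits<int>::max)(); }
inline int Smaller(int a, int b) { return (std::min)(a, b); }

// The workaround from the issue: drop the macros once they are not needed.
#undef min
#undef max
```

After the two #undef lines, ordinary calls such as std::numeric_limits<int>::min() compile again.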

Win x64 snap.py multiple errors in setup.py

Hi,

I had a little trouble installing the snap.py Win x64 package from snap.stanford.edu. It turns out the setup.py file had a few errors. I've corrected them, though; please see the attached file and replace the copy in the zip file on the SNAP site :)

The changes are on lines 94, 158-161, and 163.
Line 94 contained tabs, which I replaced with 12 spaces.
The remaining lines required parentheses.

Thank you for the files though and data sets, I'm now going to have a play!

setup.py.txt

import error

Hi all,

I just upgraded my snap.py to the newest version, but I get the error below when importing the package:

/Users/danqing0703/anaconda/lib/python2.7/site-packages/snap.py in <module>()
   4726 TDbStr.__eq__ = new_instancemethod(_snap.TDbStr___eq__,None,TDbStr)
   4727 TDbStr.__lt__ = new_instancemethod(_snap.TDbStr___lt__,None,TDbStr)
-> 4728 TDbStr.GetStr = new_instancemethod(_snap.TDbStr_GetStr,None,TDbStr)
   4729 TDbStr.GetPrimHashCd = new_instancemethod(_snap.TDbStr_GetPrimHashCd,None,TDbStr)
   4730 TDbStr.GetSecHashCd = new_instancemethod(_snap.TDbStr_GetSecHashCd,None,TDbStr)

AttributeError: 'module' object has no attribute 'TDbStr_GetStr'

Would really appreciate it if someone could take some time to answer this , thanks!

MSVC2015: "Operator-Definitions" in bd.h lead to "error C2593: 'operator >=' is ambiguous"

When trying to use both SNAP and spdlog (https://github.com/gabime/spdlog), the compiler complains about ambiguous operators:

d:\develop\git\tspe\deploy\include\spdlog\details\../sinks/file_sinks.h(169): error C2593: 'operator >=' is ambiguous
1> C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\thread(246): note: could be 'bool std::operator >=(std::thread::id,std::thread::id) noexcept' [found using argument-dependent lookup]
1> d:\develop\git\tspe\deploy\include\snap\bd.h(423): note: or 'bool operator >=<std::chrono::system_clock::time_point>(const TRec &,const TRec &)'
1> with
1> [
1> TRec=std::chrono::system_clock::time_point
1> ]
1> C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\chrono(683): note: or 'bool std::chrono::operator >=<std::chrono::system_clock,std::chrono::system_clock::duration,std::chrono::system_clock::duration>(const std::chrono::time_point<std::chrono::system_clock,std::chrono::system_clock::duration> &,const std::chrono::time_point<std::chrono::system_clock,std::chrono::system_clock::duration> &)'
1> d:\develop\git\tspe\deploy\include\spdlog\details\../sinks/file_sinks.h(169): note: while trying to match the argument list '(std::chrono::system_clock::time_point, std::chrono::system_clock::time_point)'
1> d:\develop\git\tspe\deploy\include\spdlog\details\../sinks/file_sinks.h(168): note: while compiling class template member function 'void spdlog::sinks::daily_file_sink<spdlog::details::null_mutex>::_sink_it(const spdlog::details::log_msg &)'
1> C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\type_traits(391): note: see reference to class template instantiation 'spdlog::sinks::daily_file_sink<spdlog::details::null_mutex>' being compiled
1> C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\include\memory(526): note: see reference to class template instantiation 'std::is_convertible<_Ty2 *,_Ty *>' being compiled
1> with
1> [
1> _Ty2=spdlog::sinks::daily_file_sink_st,
1> _Ty=spdlog::sinks::sink
1> ]
1> d:\develop\git\tspe\deploy\include\spdlog\details/spdlog_impl.h(49): note: see reference to function template instantiation 'std::shared_ptr<spdlog::logger> spdlog::create<spdlog::sinks::daily_file_sink_st,std::string,const char *,int,int,bool>(const std::string &,std::string,const char *,int,int,bool)' being compiled
1>

netevol reading files in random order

The input for netevol is a pattern like graph*.txt, and the evolving graphs are supposed to be read in order: graph1.txt, graph2.txt, graph3.txt.
On Linux, however, readdir returns the filenames in no guaranteed order, which produces incorrect results.

Temporary workaround: collect the filenames in a list and sort them.
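The workaround can be sketched as follows. This is a standalone illustration, not a patch to netevol: the helper names are hypothetical, and it assumes each filename contains one short snapshot number. It sorts by that number rather than alphabetically, because a plain string sort would place graph10.txt before graph2.txt.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: value of the first run of digits in a file name,
// or -1 if the name contains no digits. Assumes the number fits in a long.
inline long FirstNumber(const std::string& name) {
  std::size_t i = 0;
  while (i < name.size() && !std::isdigit(static_cast<unsigned char>(name[i]))) ++i;
  if (i == name.size()) return -1;
  long value = 0;
  for (; i < name.size() && std::isdigit(static_cast<unsigned char>(name[i])); ++i) {
    value = value * 10 + (name[i] - '0');
  }
  return value;
}

// Orders names by embedded number first, then alphabetically to break ties.
inline bool SnapshotLess(const std::string& a, const std::string& b) {
  const long na = FirstNumber(a);
  const long nb = FirstNumber(b);
  return na != nb ? na < nb : a < b;
}

// Sorts the names collected from the directory listing into snapshot order.
inline void SortSnapshots(std::vector<std::string>& names) {
  std::sort(names.begin(), names.end(), SnapshotLess);
}
```

The caller would collect every name returned by the directory scan into the vector, call SortSnapshots once, and only then start loading the graphs.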

THashSet::Defrag()

I tried to defrag a THashSet (TIntSet) and got a compile error:
....
/snap/glib-core/shash.h:1364:7: error: no matching function for call to ‘THashSet::GetKey(int&, TInt&)’

I suppose that in the function void THashSet<TKey, THashFunc>::Defrag(),
the call GetKey(KeyId, Key); should be GetKey(KeyId);
File: shash.h in glib-core
Line: 1364


//Example:

#include "../snap/snap-core/Snap.h"

int main(int argc, char* argv[]) {

    //TIntSet is a typedef for THashSet<TInt>
    TIntSet* a_set = new TIntSet();

    //do something with set that leaves it fragmented (IsKeyIdEqKeyN() will be false)
    a_set->AddKey(4);
    a_set->AddKey(2);
    a_set->DelKey(2);

    //compile error
    //a_set->Defrag();

    //this will not work with a fragmented set
    a_set->DelKeyId(a_set->GetRndKeyId(TInt::Rnd));

    delete a_set;
    return 0;
}

TNEANet node attribute retrieval

For TNEANet nodes, attribute-values can be added to nodes as key-values to a hash:
TNEANet::TNodeI NI = GU->BegNI();
GU->AddIntAttrDatN(NI, 10, "test");
GU->AddIntAttrDatN(NI, 20, "test1");
GU->AddIntAttrDatN(NI, 30, "test2");

Then, when all attributes of a node need to be retrieved, they can only be retrieved as a vector rather than a hash. Access to the values of a node's attributes is therefore positional, not key-based. Given that adding attribute values is hash-like, how can one know at which position in the vector the value of a given attribute (key) is stored? For example:
NId = NI.GetId();
GU->AttrValueNI(NId, NIdAttrValue);
std::string curTestString = NIdAttrValue[2];
// value at index 2 corresponds to what attribute (test, test1, or test2)?
int curTest = atoi(curTestString.c_str());
printf("%d\n", curTest);

Is there a way to retrieve all attributes of a node as a hash (i.e. keyed by attribute name) instead of as a vector? Would one iterator per attribute (vertical retrieval in the dataset) be sufficient? That is, if multiple iterators, one per attribute, are advanced in parallel but independently, will the attribute values they return belong to the same object?

wrong k setting in clique community detection example

File: cliquesmain.cpp in the cliques folder of examples.
The default value of k in cliquesmain is k=2,
which is then incremented by one here:
TCliqueOverlap::GetCPMCommunities(G, OverlapSz+1, CmtyV);
This is fine for the default.
The problem arises when the user passes an argument, say k=5:
because of the increment, the value passed to GetCPMCommunities is 6 rather than 5.

The default value of k should therefore be changed to 3 (line 11)
and the increment (+1) removed (line 37). Change the lines to:

[line 11] const int OverlapSz = Env.GetIfArgPrefixInt("-k:", 3, "Min clique overlap");
[line 37] TCliqueOverlap::GetCPMCommunities(G, OverlapSz, CmtyV);

Large backgrounds get resized or blurry

When one of my students imports a background that is larger than the stage, the background renders as "blurry." I think it is automatically being resized to the stage width. She imported an image that is 960px wide. Here's the project.

Thanks for your help.

node2vec on Wikipedia

I am training node2vec embeddings on Wikipedia using snap/examples/node2vec. I want to understand what is causing the high memory usage and slow runtime, and whether there is something I can do to improve performance.

System: AWS EC2 x1.32xlarge instance, 2TB RAM, 128 cores
Dataset:

  • Raw node list: s3://entilzha-us-west-2/wiki-network/titles-sorted.txt
  • Raw edge list: s3://entilzha-us-west-2/wiki-network/links-simple-sorted.txt
  • These are processed using https://github.com/Pinafore/qb/blob/master/wiki_network/prepare_data.py via preprocess_titles and n2v_edge_list
  • This filters the dataset down and produces an edge list to feed in using node2vec -i:edge_list.txt -o:wiki.emb -v -dr

Final dataset info (I can post this input if it's helpful):

  • 4,427,968 nodes
  • 82,876,900 edges

When run, it uses 750GB of RAM and gets as far as learning the word2vec embeddings, but the process is very slow: about 1% per hour despite ~100% utilization on all 128 cores.

My general questions are:

  1. Is there any guidance for training node2vec on Wikipedia, or more generally on a graph this large? The paper references one wiki dataset, but it seems to be NLP-centric and consists of the first million bytes of text.
  2. On a related note, are there by any chance pretrained embeddings available from some other project?
  3. Looking at the node2vec algorithm in 3.2.3, the observed memory usage seems quite high. The main things that consume memory seem like: the data structure holding walks and the data structure holding embeddings. The embeddings should be emb_dim*number_of_nodes~500,000,000 and assuming that each entry is a double should be 500,000,000*8 bytes = 4GB. For walks it should be something like n_walks*walk_size*number_of_nodes=10*80*4,000,000= 3,200,000,000 entries and again assuming each entry is a double would lead to 25GB. Memory usage seems to go up here https://github.com/snap-stanford/snap/blob/master/snap-adv/n2v.cpp#L14. Any thoughts on why it's using 30x that amount of memory?
  4. Is there a way that I could get just the random walk output and train the SGD step separately (on a GPU)?
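The estimate in item 3 can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function names are hypothetical, the 8 bytes per entry are the post's own assumption, and an embedding dimension of 128 is assumed because it roughly matches the ~500,000,000 figure above. It says nothing about what n2v.cpp actually allocates.

```cpp
#include <cassert>
#include <cstdint>

// Bytes needed for one embedding vector per node.
inline std::uint64_t EmbeddingBytes(std::uint64_t nodes, std::uint64_t dim,
                                    std::uint64_t bytes_per_entry) {
  assert(bytes_per_entry > 0);
  return nodes * dim * bytes_per_entry;
}

// Bytes needed to keep every random walk in memory at once.
inline std::uint64_t WalkBytes(std::uint64_t nodes, std::uint64_t walks_per_node,
                               std::uint64_t walk_length,
                               std::uint64_t bytes_per_entry) {
  assert(bytes_per_entry > 0);
  return nodes * walks_per_node * walk_length * bytes_per_entry;
}

// Sum of both estimates for the parameters used in the post.
inline std::uint64_t EstimatedBytes(std::uint64_t nodes) {
  return EmbeddingBytes(nodes, 128, 8) + WalkBytes(nodes, 10, 80, 8);
}
```

With the exact node count of 4,427,968, EstimatedBytes gives 32,873,234,432 bytes, about 33 GB, so the observed 750GB is more than 20 times this estimate.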

From what I can tell things that should affect these would be:

  1. Context size would make SGD faster
  2. Number and length of walks per source would use less memory, but I don't know how small these should be for something like wikipedia (or if the defaults themselves are too low even)

Thanks!

cannot install snap 3.0.0 on Corn

I downloaded snap-3.0.0 but cannot install it on Corn using pip install -user setup.py. It produces the following error:
"Collecting setup.py
/usr/local/lib/python2.7/dist-packages/pip/vendor/requests/packages/urllib3/util/ssl.py:318: SNIMissingWarning: An HTTPS request has been made, but the SNI (Subject Name Indication) extension to TLS is not available on this platform. This may cause the server to present an incorrect TLS certificate, which can cause validation failures. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#snimissingwarning.
SNIMissingWarning
/usr/local/lib/python2.7/dist-packages/pip/vendor/requests/packages/urllib3/util/ssl.py:122: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. You can upgrade to a newer version of Python to solve this. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
InsecurePlatformWarning
Could not find a version that satisfies the requirement setup.py (from versions: )
No matching distribution found for setup.py
You are using pip version 8.1.2, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
"

Am I not using the right command?

Krogen

Hi,

I am not able to find the kronecker.h file in this repository. Could you please provide the full code for krogen? It would be helpful for my research.

Thanks,

Narendra.

Execution stopped: !IsNode(NId) [Reason:'NodeId 1 already exists']

I have very simple code in which the AddNode function produces the following error:

*** Error: Execution stopped: !IsNode(NId) [Reason:'NodeId 1 already exists'], file graph.cpp, line 12

The problematic line :
if (!Graph->IsNode(srcNId)) Graph->AddNode(srcNId);

This works fine when compiled under Windows. However, I must compile my program on a cluster that runs Linux (sharcnet.ca), and unfortunately SNAP does not work well on their system; I ran into several problems. For example, just calling the GenGrid function gave me a segmentation fault.

This is the whole code:

PUNGraph Graph = TUNGraph::New();
TSsParser parser("grid.txt", ssfWhiteSep, true, true, true);
while (parser.Next()) {
    const int srcNId = parser.GetInt(0);
    const int dstNId = parser.GetInt(1);
    if (!Graph->IsNode(srcNId)) Graph->AddNode(srcNId);
    if (!Graph->IsNode(dstNId)) Graph->AddNode(dstNId);
    if (!Graph->IsEdge(srcNId, dstNId))
        Graph->AddEdge(srcNId, dstNId);
}

Thanks in advance.

RuntimeError: Execution stopped while running CNM on DBLP dataset from the SNAP database

The dataset is DBLP dataset.

Code -

import snap,sys

UGraph = snap.LoadEdgeList(snap.PUNGraph, "../dataset/new_graph.txt",0,1)
CmtyV = snap.TCnComV()
modularity = snap.CommunityCNM(UGraph, CmtyV)

Error message -

Traceback (most recent call last):
  File "cnm.py", line 5, in <module>
    modularity = snap.CommunityCNM(UGraph, CmtyV)
  File "/amd/hamsa.cs.iiests.ac.in/users1/student/be2014/sarbajits/Graph/CNM/snap.py", line 35949, in CommunityCNM
    return _snap.CommunityCNM(*args)
RuntimeError: Execution stopped: (0<=ValN)&&(ValN<Vals) [Reason:'Index:-2147403659 Vals:1073789418 MxVals:1298759680 Type:4TVecI7TTripleI4TFlt4TIntS2_EiE'], file /home/rok/include/glib/ds.h, line 469

Specifications of machine used -

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                24
On-line CPU(s) list:   0-23
Thread(s) per core:    2
Core(s) per socket:    6
Socket(s):             2
NUMA node(s):          2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Stepping:              7
CPU MHz:               1200.000
BogoMIPS:              3791.94
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              15360K
NUMA node0 CPU(s):     0-5,12-17
NUMA node1 CPU(s):     6-11,18-23

If any more information is needed, please let me know. Thanks!

Finding Progression Stages in Time-evolving Event Sequences

A colleague recently showed me the WWW 2014 paper “Finding Progression Stages in Time-evolving Event Sequences”, and it could be just the thing I need for analyzing some health claims data.

In the paper, it says the code is available in the snap package, but I couldn’t find it in the repo (https://github.com/snap-stanford/snap-dev ). I did find some implementation (https://github.com/m-ochi/progression_stage_model ), and it seems to work, but I’m hoping yours is faster.

Is it around here somewhere?

Malloc error: pointer being freed was not allocated

When my program finishes executing I get the following error (with backtrace):

malloc: *** error for object 0x100000010: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

Program received signal SIGABRT, Aborted.
0x00007fff8a66ad46 in __kill ()
(gdb) backtrace
#0  0x00007fff8a66ad46 in __kill ()
#1  0x00007fff8fd5fdf0 in abort ()
#2  0x00007fff8fd339b9 in free ()
#3  0x0000000100003763 in TVec<TInt, int>::~TVec (this=0x101dbffe8) at ds.h:438
#4  0x0000000100003ba2 in TNEANet::TNode::~TNode (this=0x101dbffe0) at network.h:1466
#5  0x0000000100003bc0 in THashKeyDat<TInt,      TNEANet::TNode>::~THashKeyDat (this=0x101dbffd4) at hash.h:7
#6  0x0000000100003c3c in TVec<THashKeyDat<TInt, TNEANet::TNode>, int>::~TVec (this=0x100609830) at ds.h:438
#7  0x0000000100003190 in THash<TInt, TNEANet::TNode, TDefaultHashFunc<TInt> >::~THash (this=0x100609820) at hash.h:88
#8  0x000000010000518a in TNEANet::~TNEANet (this=0x100609810) at network.h:1461
#9  0x00000001000051ed in TPt<TNEANet>::UnRef (this=0x7fff5fbff8c0) at bd.h:472
#10 0x000000010000412a in TPt<TNEANet>::~TPt (this=0x7fff5fbff8c0) at bd.h:480

Duplicate Communities in BigClam Output

Using BigClam with the default parameters, I received duplicate communities in the output. Is this the intended behavior? Perhaps, duplicate communities should be filtered?

Graph
0 5
0 9
0 16
0 17
0 23
0 26
1 18
1 11
1 21
2 25
2 18
2 14
3 4
3 7
3 16
3 18
3 19
3 29
4 8
4 10
5 18
5 19
5 7
6 8
6 24
6 26
6 27
7 18
7 27
7 23
8 9
8 13
8 22
8 29
9 27
10 20
10 27
10 29
11 18
11 21
12 16
12 17
13 18
14 21
14 22
16 20
16 29
17 20
20 21
21 28

Output
3 29 16
11 1 18
21 11 1
3 7 18
18 3 7
18 7 5
18 11 1
18 5 7
21 11 1
5 7 18
5 7 18
5 7 18
5 7 18

Example run time seems to report CPU and not wall-clock time.

When running the examples the runtime reported increases as more threads are used. Example output:

Run with single thread:
$ ./bigclam 
ragm. build: 14:58:44, Apr  8 2016. Time: 17:38:36 [Mar 24 2016]
================================================================
[...]
Graph: 6474 Nodes 13895 Edges
conductance computation completed [0.02s]
MLE completed with 550 iterations(4 secs)6, Diff: 16.180445 [4 secs]]]]]
run time: 11.27s (17:38:40)

Run with multiple threads:
$ ./bigclam -nt:24
ragm. build: 14:58:44, Apr  8 2016. Time: 17:38:49 [Mar 24 2016]
================================================================
[...]
Graph: 6474 Nodes 13895 Edges
conductance computation completed [0.02s]
MLE completed with 319 iterations(1 secs)Community vector generated. 2 communities are ommitted

run time: 26.06s (17:38:51)
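The difference between the two kinds of time can be shown with standard C++ alone. The sketch below is an illustration, not the timing code the examples use; TElapsed, TimeBoth, and BusyWork are hypothetical names. std::clock reports processor time used by the process, which on POSIX systems typically adds up across threads, while std::chrono::steady_clock reports elapsed real time.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

struct TElapsed {
  double CpuSecs;   // processor time consumed by the process
  double WallSecs;  // real time that passed on the clock
};

// Runs `work` once and measures it with both clocks.
template <class TFunc>
TElapsed TimeBoth(TFunc work) {
  const std::clock_t cpu_start = std::clock();
  const std::chrono::steady_clock::time_point wall_start =
      std::chrono::steady_clock::now();
  work();
  const std::clock_t cpu_end = std::clock();
  const std::chrono::steady_clock::time_point wall_end =
      std::chrono::steady_clock::now();

  TElapsed e;
  e.CpuSecs = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
  e.WallSecs = std::chrono::duration<double>(wall_end - wall_start).count();
  return e;
}

// Small CPU-bound workload, used only for demonstration.
inline unsigned long BusyWork() {
  volatile unsigned long sum = 0;
  for (unsigned long i = 0; i < 1000000UL; ++i) sum += i;
  return sum;
}
```

For single-threaded work the two values are close. When the work is spread over many threads, the CPU figure can exceed the wall-clock figure, which would match a reported run time that grows with -nt even though the job finishes sooner.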

Unable to use polymorphism with Snap Graph Pointers?

I'm trying to call TSnap::IsConnected(PGraph &) with a TPt<TNodeEDatNet<Tpair<int,int>,Tpair<int,int>>> as the argument, and I expected it to be implicitly converted to a pointer to Graph, the base class. That is not the case, and whatever I try, I am unable to assign a derived class to a SNAP base pointer. Why is that? How can I call IsConnected on a TNodeEDatNet graph?

Segmentation Fault in Magfit

Hi there,

I'm running into a segmentation fault when using the magfit executable. I'm on Ubuntu 12.04LTS 64-bit. The error is as follows:


EM Iteration : 2

EStep iteration : 1

Program received signal SIGSEGV, Segmentation fault.
0x000000000040e8ac in TMAGFitBern::GetProbMu(int const&, int const&, int const&, int const&, int const&, bool, bool) const ()

A backtrace reveals:

(gdb) backtrace
#0 0x000000000040e8ac in TMAGFitBern::GetProbMu(int const&, int const&, int const&, int const&, int const&, bool, bool) const ()
#1 0x000000000040eb25 in TMAGFitBern::GetAvgThetaLL(int const&, int const&, int const&, bool, bool) const ()
#2 0x00000000004164da in TMAGFitBern::UpdateApxPhiMI(double const&, int const&, int const&, double&, TVVec&) ()
#3 0x00000000004189e9 in TMAGFitBern::DoEStepApxOneIter(TVec<TFlt, int> const&, TVVec&, double const&) ()
#4 0x00000000004209fd in TMAGFitBern::DoEMAlg(int const&, int const&, int const&, double const&, double const&, double const&, double const&, int const&) ()
#5 0x00000000004039b5 in main ()

Gnuplot dependency

According to the documentation, Gnuplot is not a hard dependency of the SNAP library and is expected to be in the system path only when you try to plot structural properties of networks. But this doesn't seem to be the case, because even the act of linking against Snap.o results in an error when starting the program if the system doesn't contain Gnuplot.

Example aa.cpp:

#include <stdio.h>
int main() {
        printf("foobar\n");
        return(0);
}

Compile without Snap.o:

$ g++ -std=c++98 -Wall -O3 -DNDEBUG -o aa aa.cpp -I../snap/snap-core -I../snap/snap-adv -I../snap/glib-core -I../snap/snap-exp  -lrt
$ export PATH=""
$ ./aa
foobar

Compile with Snap.o:

$ g++ -std=c++98 -Wall -O3 -DNDEBUG -o aa aa.cpp ../snap/snap-core/Snap.o -I../snap/snap-core -I../snap/snap-adv -I../snap/glib-core -I../snap/snap-exp  -lrt
$ export PATH=""
$ ./aa
sh: 1: gnuplot: not found
foobar

In my opinion some initialization code of some objects is being executed before main() and it tries to run something like system("gnuplot"). Tested with g++ (Debian 4.7.2-5) 4.7.2.
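The mechanism suspected above, code that runs before main() because an object with static storage duration is constructed during program startup, can be demonstrated in isolation. The sketch below is not SNAP code and does not show where SNAP invokes gnuplot; TProbe and StartupLog are hypothetical names, and the constructor only records a message instead of running a command.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A function-local static avoids initialization-order problems between
// globals: the log is created the first time it is requested.
std::vector<std::string>& StartupLog() {
  static std::vector<std::string> log;
  return log;
}

// Hypothetical stand-in for a library object whose constructor has a side
// effect. A real library might probe for an external tool at this point.
struct TProbe {
  explicit TProbe(const char* what) { StartupLog().push_back(what); }
};

// A namespace-scope object: in practice its constructor runs during static
// initialization, before the first statement of main() executes.
static TProbe GlobalProbe("probe constructed");
```

Any program that links an object file containing such a global pays for its constructor at startup, even if main() never touches the library. That would be consistent with the "gnuplot: not found" line appearing before "foobar".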
