pometry / raphtory

Scalable graph analytics database powered by a multithreaded, vectorized temporal engine, written in Rust

Home Page: https://raphtory.com

License: GNU General Public License v3.0

Python 17.51% Rust 82.33% HTML 0.01% JavaScript 0.08% Dockerfile 0.03% Shell 0.03%

analytics database embedded-database graph graph-database neo4j olap python rust temporal time-series

raphtory's People

marcolotz aleskandro lagordamotoneta chorograph imasgo imanehaf barrell12 narnolddd emanuelebonaventura wuliaososhunhun niallroche jingmouren chaoyue729 aniampio dorely103 abdheshkumar knuz michalmonselise hoangqngo miratepuffin whiz-tuhin felixcdr calzadaharo lastin pometry-team frgomes ricopinazo brainzhong peijie-zhong nrs1729 lejohnyjohn codewithutkarsh aw4309 jamesscottbrown aliyassin4 btcwfd dorsa-arezooji-reply w31rdm4ch1nz back7 benosullivan08 dullaz sgoggins russellwmy shivam-880 if-hyf xmtx2036 jamestiotio haoxins manlius arfon d4rkisek adi2k brandon-haugen hallofstairs

raphtory's Issues

Convert ClusterUp container to watchdog which receives keep alive messages

Fine grained clearing of temporary analytical state

Currently, after a flattening algorithm completes, the algorithm state of the vertices is wiped by dumping the whole map and creating a new one. This is because removing objects from the map is drastically slow and requires tracking every property name that was added. However, this means that flattenings running in parallel may remove the state of another algorithm. There therefore needs to be a higher-level controller that flushes this data only when no algorithms are running (i.e. return results has been completed on all flattenings).
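One way to picture the higher-level controller described above is a small reference counter over running flattenings. This is only a sketch under assumptions: the class name `AnalysisStateController`, its methods and the `flush_count` field are hypothetical and do not come from the Raphtory code base.

```python
import threading


class AnalysisStateController:
    """Hypothetical controller: flush temporary state only when nothing is running."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = 0      # flattenings that have not yet returned results
        self.state = {}        # shared temporary algorithm state (vertex id -> values)
        self.flush_count = 0   # how many times the map has been swapped

    def start(self):
        """Register a flattening before it begins writing state."""
        with self._lock:
            self._running += 1

    def finish(self):
        """Deregister a flattening; flush once the last one has completed."""
        with self._lock:
            self._running -= 1
            if self._running == 0:
                # Safe point: no algorithm is using the map, so replace it wholesale
                # instead of removing entries one by one.
                self.state = {}
                self.flush_count += 1
```

With this shape the cheap "dump the whole map" strategy is kept, but it only fires when the running count drops to zero, so a parallel flattening never loses its state mid-run.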

Improve garbage collection -- test different JVM options

Update build.sbt to create fat JARs for local execution

Currently Raphtory is only built into a Docker image; build.sbt needs to be updated to allow assembly into a fat JAR.

Update Traits to be more generic

The current API for traits is a bit confusing; it needs to be cleaned up and made to better fit all data sources.

Update main readme to reference to new docs

The main readme is very out of date and needs to reference the new docs at https://raphtory.github.io/ as well as Slack and other Raphtory-related links.

Implement caching profiles for Archivist -- temporal priority vs spatial priority

Tidy up compile at runtime analyser submission

Current analysis submission for non compiled classes is POC (proof of concept not piece of crap, though this is also true) and needs to be revamped.

Fix complex entity filtering

Currently, if a user were to filter vertices based on entity type/property, some vertices may be removed, but their neighbours will be unaware of this. Any messages sent to the vertex will then raise an error. This is also important in instances of edge counting etc.
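A minimal sketch of one possible fix, assuming the graph is held as a plain adjacency map: when vertices are filtered out, edges pointing at them are dropped from their neighbours in the same pass, so no message is addressed to a missing vertex and edge counts stay consistent. The function names `filter_graph` and `degree_counts` are hypothetical.

```python
def filter_graph(adjacency, keep):
    """Return a filtered copy of adjacency (vertex -> list of neighbours).

    Vertices failing `keep` are removed, and so is every edge that points at them.
    """
    kept = {v for v in adjacency if keep(v)}
    return {
        v: [n for n in neighbours if n in kept]
        for v, neighbours in adjacency.items()
        if v in kept
    }


def degree_counts(adjacency):
    """Count outgoing edges per vertex in the (possibly filtered) view."""
    return {v: len(neighbours) for v, neighbours in adjacency.items()}
```

For example, filtering vertex `"c"` out of `{"a": ["b", "c"], "b": ["a"], "c": ["a"]}` leaves `"a"` with a single neighbour, so its edge count reflects the filtered view.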

Dynamic scaling of cluster

Resource utilisation within a cluster can currently be monitored, and more containers can be deployed automatically via Docker Swarm. Dynamic scaling itself is not yet implemented, but Routers can already be brought into a cluster as they are stateless. New Spouts may join to push/pull new data. Live Analysis Managers may join, but the Analyser must already be present (see #6). Partition Managers are currently static as entities cannot migrate (#31).

Add iterative functionality to Live Analysis managers

Currently Live Analysis Managers can only probe the graph, requesting that an Analyser be run by the Partition Managers. These receive the Vertex and Edge map, but cannot save data to the Partition Manager.

Therefore, we need to add an area within each entity to store processing information (such as PageRank values).

Speed up vertex mailbox

The current implementation of the vertex mailbox/multi-queue is clearly using the wrong data structures, as it is by far the slowest part of analysis (even in a local deployment where we are passing by reference and the network stack can be ignored). This should initially be improved by a more appropriate choice of data structure, but may require a full redesign.

Merge all old branches

Finalise conversion of Routers to pull based

Initial testing of Routers in more of a pull-based model from the Spouts largely reduces their crashing. We need to remove the current buffering of data within the Spout, as this is now causing issues.

Move example LAM, Router and Updater to example files, implementing traits.

The original Actors do not implement their traits; this should change, as these are the most basic examples for new users. Additionally, these should be moved to a more suitable location inside the examples folder.

Add Precision Time Protocol Synchronisation (Or vector clocks) to Spout and Router

Currently Raphtory can only ingest data which has a time field embedded within it. By syncing the clocks of either the spouts or routers we can ingest non-temporal data by assigning events a timestamp upon ingestion.

Docker Swarm compatibility

Reduce memory usage of graph entities

Rewrite readme to be up to date with current version

Global Analytical State

For many algorithms, such as PageRank and Hub/Authority, there needs to be a small global state maintained. This should be done by allowing the analysis manager to aggregate partial results and report the aggregate back with the next analyse superstep request.
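The aggregate-and-report-back loop can be sketched as follows. This is an illustration under assumptions, not the project's API: `Partition`, `AnalysisManager`, `superstep` and `global_total` are hypothetical names, and the "global state" here is simply the sum of all vertex values, used to normalise them in the following superstep.

```python
class Partition:
    """Hypothetical partition holding a slice of the vertex values."""

    def __init__(self, values):
        self.values = dict(values)

    def superstep(self, global_total):
        # The global state from the previous superstep arrives with the request.
        if global_total is not None:
            for vertex in self.values:
                self.values[vertex] /= global_total
        # Report this partition's partial contribution back to the manager.
        return sum(self.values.values())


class AnalysisManager:
    """Hypothetical manager that aggregates partials between supersteps."""

    def __init__(self, partitions):
        self.partitions = partitions
        self.global_total = None

    def run(self, steps):
        for _ in range(steps):
            partials = [p.superstep(self.global_total) for p in self.partitions]
            self.global_total = sum(partials)  # aggregate the small global state
        return self.global_total
```

The partitions never talk to each other; the only shared value travels through the manager, which keeps the global state small and easy to reason about.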

Vertex messaging state not flushing

Similar to (and possibly related to) issue #82, the vertex messaging state seems to be taking up an absurd amount of memory and in many instances does not get flushed out of the partition manager for some time (if at all).

Refactor project to rename com.gwz package to com.raphtory

Improve logging to include functions taking up CPU time

Logging currently measures the top 20 objects taking up the most heap space. It would be nice to also record which functions are taking up most of the actors' time.

Singularity compatibility

Currently there are some basic scripts to convert the Docker image into Singularity, but these need to be expanded to run fully on the QM HPC cluster.

Full gab ingestion and user/post/topic ranking

Broken Pipe error on background thread when CPU maxing out

When any component is running at maximum CPU it seems that some background process is timing out for its TCP connection and printing a broken pipe stacktrace over and over. Seems to be related to Prometheus/kamon scraping, but needs to be investigated further.

GC overhead management

Long-running instances of Raphtory seem to be suffering from GC issues, with the collector not being able to clean up any heap. This would initially appear to be because of the graph state; however, profiling shows the graph itself is only a fraction of the memory. We are currently running the Shenandoah GC inside the Docker image to run GC in parallel, but this needs to be investigated fully.

Get a new way to identify entities (Int for VertexID is not enough)

Update documentation following dev branch merge

Current citation (dev) branch has made some large changes to API, therefore, before it is merged with the master the documentation on the Raphtory site needs to be updated to include these changes.

Remove LOCAL argument

Currently if running locally you must set a flag, this is because the analysis task will otherwise send the analyser object by reference and cause the readers to all refer to the same one. Analysis task should, therefore, be updated to serialise this first, removing the need for the argument.
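The aliasing problem and the proposed fix can be illustrated with a short sketch. Assumptions are stated plainly: `CountingAnalyser` and `dispatch` are hypothetical names, and `copy.deepcopy` stands in for a real serialise/deserialise round trip, which would give each reader an independent object in the same way.

```python
import copy


class CountingAnalyser:
    """Hypothetical analyser with mutable state."""

    def __init__(self):
        self.seen = 0

    def analyse(self, vertices):
        self.seen += len(vertices)
        return self.seen


def dispatch(analyser, reader_count):
    """Give every reader its own copy, even when all readers share one process.

    deepcopy is a stand-in for serialising the analyser before sending it.
    """
    return [copy.deepcopy(analyser) for _ in range(reader_count)]
```

Because each reader works on its own copy, a local deployment behaves like a distributed one and no special flag is needed to tell them apart.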

Clean up generation scripts

Current mutlimachinesetup is a bit of a mess -- need to clean this up, incorporating seednode.sh into machine0 and printing less crap to the terminal.

Edge visitors and how to send messages to non-neighbours

current live analysis conversion into temporal ranking

Fix zookeeper image to allow instant detach

PreviousStates periodic compression

We can store another (null) map (let's say toCleanMap) for each entity; periodically a thread will:

1. Assign to toCleanMap the reference of the previousHistory.
2. Make a new instance of TreeMap and assign it to previousHistory.
3. The worker thread will clean up the toCleanMap.
4. toCleanMap and previousHistory have to be concatenated and reassigned to previousHistory at the end of the process. toCleanMap will then be collected by the GC.
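The swap, clean and merge steps above can be sketched as below. This is an assumption-laden illustration: a plain dict stands in for the TreeMap, the `Entity` class and its method names are hypothetical, and the cleaning rule `drop_repeats` (keep only the points where the value changes) is invented for the example, since the issue does not say what the compression should remove.

```python
import threading


class Entity:
    """Hypothetical entity with a periodically compressed history."""

    def __init__(self):
        self.previous_history = {}  # timestamp -> value; stands in for the TreeMap
        self.to_clean = None
        self._lock = threading.Lock()

    def add(self, timestamp, value):
        with self._lock:
            self.previous_history[timestamp] = value

    def compress(self, clean):
        # Swap: hand the old map to the cleaner and start a fresh one for writers.
        with self._lock:
            self.to_clean = self.previous_history
            self.previous_history = {}
        # Clean outside the lock so new updates are not blocked.
        cleaned = clean(self.to_clean)
        # Merge: combine the cleaned map with anything written in the meantime.
        with self._lock:
            cleaned.update(self.previous_history)
            self.previous_history = dict(sorted(cleaned.items()))
            self.to_clean = None  # drop the reference so the old map can be collected


def drop_repeats(history):
    """Hypothetical cleaning rule: keep only points where the value changes."""
    result = {}
    last = object()  # sentinel that compares unequal to every stored value
    for timestamp in sorted(history):
        if history[timestamp] != last:
            result[timestamp] = history[timestamp]
            last = history[timestamp]
    return result
```

Holding the lock only for the two reference swaps keeps the pause for writers short, which is the point of cleaning a detached map rather than the live one.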

Add CI/CD pipeline

Once unit tests have been added via issue #88 this should be expanded to test new PRs as well as testing the dockerised version of Raphtory to ensure no issues occur in a distributed environment. To be assigned to Matt.

Develop clear API for temporal queries

Back-pressure on kafka spout

On a pull from the broker, the current Kafka spout seems to send everything on the topic, which for some topics can be hundreds of GB. This seems to be an issue with the Scala library, but it needs to be fixed to make the spout viable.

Watchdog and fault tolerance improvements

Currently, if a container crashes, it will be rebooted and the watchdog will assign a new ID to it, incrementing the PM/Router counter. This counter never decreases, which is bad.
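One possible direction, sketched under assumptions, is for the watchdog to keep a pool of IDs so that a rebooted container gets its old ID back and IDs released by departed containers are reused. The `IdPool` class, its methods and the idea of keying on a stable container name are all hypothetical.

```python
import heapq


class IdPool:
    """Hypothetical ID allocator for Partition Managers or Routers."""

    def __init__(self):
        self._next = 0        # next never-used id
        self._free = []       # min-heap of released ids
        self._assigned = {}   # stable container name -> id

    def assign(self, name):
        # A rebooted container with a known name keeps its previous id.
        if name in self._assigned:
            return self._assigned[name]
        if self._free:
            new_id = heapq.heappop(self._free)  # reuse the smallest released id
        else:
            new_id = self._next
            self._next += 1
        self._assigned[name] = new_id
        return new_id

    def release(self, name):
        """Return the id of a container that has left the cluster for good."""
        heapq.heappush(self._free, self._assigned.pop(name))
```

With this shape the counter only grows when the cluster genuinely has more live members than ever before.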

Add Snapshotting features

Add a rolling snapshotter which removes the oldest entity history when a partition manager's memory is close to full. This data should then be stored in an offline format which can be read back in if the Analysis Managers require it.

Fix Akka Mailbox Crashing

Currently, when an actor's mailbox is full, the container will crash. There are two proposed solutions for this. Firstly, we can drop messages when this occurs, incorporating a middleman (Kafka etc.) which will allow us to recover lost messages. The second option is to have communication between Graph Routers and Partition Managers to control the speed at which messages are sent. Possible third option ???
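The second option, rate control between Routers and Partition Managers, can be sketched as credit-based flow control: the router only sends while the manager reports free mailbox slots and holds the rest back. This is a single-threaded illustration with hypothetical class and method names; it does not use or model the Akka mailbox API.

```python
from collections import deque


class PartitionManager:
    """Hypothetical receiver with a bounded mailbox."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.mailbox = deque()

    def credits(self):
        """Number of updates the mailbox can still accept."""
        return self.capacity - len(self.mailbox)

    def receive(self, update):
        if len(self.mailbox) >= self.capacity:
            raise OverflowError("mailbox full")  # the crash we want to avoid
        self.mailbox.append(update)

    def process(self, count):
        """Consume up to `count` updates, freeing credits."""
        for _ in range(min(count, len(self.mailbox))):
            self.mailbox.popleft()


class Router:
    """Hypothetical sender that respects the receiver's credits."""

    def __init__(self, manager):
        self.manager = manager
        self.pending = deque()

    def submit(self, update):
        self.pending.append(update)
        self.flush()

    def flush(self):
        # Send only while the manager has room; keep the rest in order.
        while self.pending and self.manager.credits() > 0:
            self.manager.receive(self.pending.popleft())
```

Unlike dropping messages, this keeps every update and preserves ordering, at the cost of the router needing somewhere to hold the backlog.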

Slow down of analysis over a range of flattenings

Currently, if a range is set running with many flattenings of the graph, the first will run in milliseconds, but the run time will soon increase. This makes some sense, as the graph is larger in later flattenings. However, picking a later flattening and running a single analysis job on it often shows a 10x reduction in run time. This suggests some state is not being removed, causing a slowdown over time.

Create Spout for either graphtides or LDBC

Add range queries to Entity Retrieval Proxy

Currently, entities which are fully archived may be retrieved from Redis. However, we additionally need to be able to pull a given range of history and merge it with an entity which is already in memory. We also need to track the first appearance of an entity to know how far back its history goes.
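The range pull and merge can be sketched with plain dicts keyed by timestamp. The archive is modelled as an in-memory dict rather than Redis, and the function names `history_range` and `merge_history` are hypothetical; the rule that in-memory points win on a timestamp clash is also an assumption.

```python
def history_range(archived, start, end):
    """Return the archived points with start <= timestamp <= end."""
    return {ts: value for ts, value in archived.items() if start <= ts <= end}


def merge_history(in_memory, archived, start, end):
    """Merge a range of archived history into the in-memory history.

    The result is ordered by timestamp; in-memory points take precedence
    if the same timestamp exists in both places.
    """
    merged = history_range(archived, start, end)
    merged.update(in_memory)
    return dict(sorted(merged.items()))
```

For instance, merging the range 4 to 10 from an archive holding timestamps 1, 5 and 9 into an entity holding timestamp 12 yields history at 5, 9 and 12.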

Implement Temporal Windows and Entity Decay

Add unit testing to current deployment

Need to add some basic unit testing following the integration of the local deployment, discussed in issue #78.

Allow Analysis Managers to send new Analysers to running clusters

Currently to run an Analyser it must already be part of the image which the running cluster was established with. This means if you create a new function to run and request the partition managers to execute it a ClassNotFoundException will be thrown. Thus the whole cluster must be taken down and rerun with the new image.

To fix this, the LAM should be able to send new Analysers to the partition managers, where they can be compiled and placed into the correct directory. The function may then run as required.