Comments (9)
One way to split this up would be to write the data to S3 as soon as a single session closes. This would not change the model much; it would only make it harder to correlate GUM (getUserMedia) data, which could still be stored in the files for individual sessions.
from rtcstats-server.
I looked into this a little more. The big question is whether we can (and should) move from the current data structure, which has a separate event stream for each peerconnection, to one that has just a single event stream.
We could easily write that event stream to a file by appending. For serialization, this does not make much of a difference.
For feature generation this adds an extra step of extracting the data for one peerconnection. That is O(number-of-events) both in terms of time and storage.
wdyt @ggarber?
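A minimal sketch of that extraction step, to make the cost concrete. The event shape `(method, pc_id, timestamp, payload)` and the name `split_by_peerconnection` are assumptions for illustration, not the actual rtcstats-server format; events without a peerconnection (such as getUserMedia) are assumed to carry `pc_id = None`.

```python
from collections import defaultdict


def split_by_peerconnection(events):
    """Group one combined event stream into per-peerconnection lists.

    Makes a single pass over the stream, so time and extra storage
    are both proportional to the number of events.
    """
    streams = defaultdict(list)
    for method, pc_id, timestamp, payload in events:
        streams[pc_id].append((method, timestamp, payload))
    return dict(streams)


# Hypothetical combined stream: GUM events use pc_id None.
events = [
    ("getUserMedia", None, 1, {"audio": True}),
    ("create", "PC_0", 2, {}),
    ("create", "PC_1", 3, {}),
    ("createOffer", "PC_0", 4, {}),
]
by_pc = split_by_peerconnection(events)
```

Keeping the GUM events under their own key means they stay in the same file and can still be correlated with each peerconnection by timestamp.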
from rtcstats-server.
https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying -- worth reading re log style, in particular the section titled "Stateful Real-Time Processing"
from rtcstats-server.
Thx for the link, I had read it already. I like these posts too: http://www.confluent.io/blog/stream-data-platform-1/
from rtcstats-server.
In my opinion we should store a single file per client and then generate multiple rows in the database, one per stream, with the features extracted from that file.
That single file per client could be stored locally (appending logs as they are received from the websocket) and then pushed to S3.
Is that what you were proposing @fippo ?
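Roughly, the single-file-per-client idea could look like the sketch below. It assumes one JSON document per line, and the class name `ClientLog`, the `.jsonl` suffix, and the `upload` callback are all hypothetical; the real S3 push is left as a callback so the example stays self-contained.

```python
import json
import os
import tempfile


class ClientLog:
    """Append-only log for one client, stored as one JSON event per line."""

    def __init__(self, directory, client_id):
        self.path = os.path.join(directory, f"{client_id}.jsonl")

    def append(self, event):
        # Append as events arrive from the websocket; nothing is kept in memory.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")

    def close(self, upload):
        # `upload` stands in for whatever pushes the finished file to S3.
        upload(self.path)


def read_events(path):
    """Read the events back, in the order they were appended."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


uploaded = []
log = ClientLog(tempfile.mkdtemp(), "client-1")
log.append(["getUserMedia", None, 1])
log.append(["create", "PC_0", 2])
log.close(uploaded.append)
```

Feature extraction would then read the uploaded file once and emit one database row per stream.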
from rtcstats-server.
Yes, but that requires a rewrite, because we currently don't keep a single log; we keep one log for GUM and one per peerconnection.
from rtcstats-server.
so I deployed this to production and it crashed 81 times within the first twelve hours (OOM)... and I don't give a f*** because this doesn't affect the client or service :-)
Will work on this though, since I might be losing quite a bit of data.
from rtcstats-server.
Clustering support (#105) helps a bit, since on t2.medium instances you can easily run two processes. The core issue is still that we keep data in memory for too long. We could keep the stats in memory in their delta-compressed form as well, instead of expanding every sample, and then do the expansion only during feature extraction.
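A minimal sketch of delta storage with late expansion, assuming flat stats dictionaries; real getStats reports are nested, and the names `compress`, `compress_all`, and `expand` are hypothetical. Keys that disappear between samples are not handled here.

```python
_MISSING = object()


def compress(previous, current):
    """Keep only the keys of `current` whose value differs from `previous`."""
    return {k: v for k, v in current.items() if previous.get(k, _MISSING) != v}


def compress_all(samples):
    """Turn full samples into a list of deltas; the first delta is the full sample."""
    deltas, previous = [], {}
    for sample in samples:
        deltas.append(compress(previous, sample))
        previous = sample
    return deltas


def expand(deltas):
    """Rebuild the full samples lazily, one at a time, during feature extraction."""
    state = {}
    for delta in deltas:
        state = {**state, **delta}
        yield state


samples = [
    {"bytesSent": 100, "ssrc": 1234, "codec": "opus"},
    {"bytesSent": 250, "ssrc": 1234, "codec": "opus"},
    {"bytesSent": 250, "ssrc": 1234, "codec": "opus"},
]
deltas = compress_all(samples)
```

Only `deltas` needs to stay in memory while the session is open; `expand` is a generator, so feature extraction never holds more than one expanded sample at once.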
from rtcstats-server.
scales in production now
from rtcstats-server.