the current model is to keep stuff in memory until the client closes the connection. T


scale it about rtcstats-server HOT 9 CLOSED

fippo commented on July 19, 2024

scale it

from rtcstats-server.

Comments (9)

fippo commented on July 19, 2024

One way to split this up would be to write the data to S3 as soon as a single session closes. This would not change the model much; it would just make it harder to correlate getUserMedia (GUM) events, which can still be contained in the files for individual sessions.


fippo commented on July 19, 2024

I looked into this a little more. The big question is if we can/should move from the current data structure which has separate event streams for each peerconnection to one that has just one event stream.
We could easily write that event stream to a file by appending. For serialization, this does not make much of a difference.

For feature generation this adds an extra step of extracting the data for one peerconnection. That is O(number-of-events) both in terms of time and storage.

wdyt @ggarber?
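
To make the trade-off concrete, here is a minimal sketch of the single-stream idea in Python. It is not the rtcstats-server code: the JSON-lines layout, the `pc` field and the function names are all assumptions made for illustration. Every event is appended to one file, and a single pass regroups the events per peerconnection before feature generation.

```python
import json
import os
import tempfile
from collections import defaultdict


def append_event(path, pc_id, event_type, payload):
    """Append one event as a JSON line to the client's single log file."""
    record = {"pc": pc_id, "type": event_type, "payload": payload}
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


def split_by_peerconnection(path):
    """Read the log once and regroup the events per peerconnection.

    The file is scanned a single time and the regrouped copy holds every
    event once, so time and extra storage both grow with the event count.
    """
    streams = defaultdict(list)
    with open(path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            streams[record["pc"]].append((record["type"], record["payload"]))
    return dict(streams)


def demo():
    # Hypothetical events; pc_id None marks events outside any peerconnection.
    path = os.path.join(tempfile.mkdtemp(), "client.log")
    append_event(path, None, "getUserMedia", {"audio": True})
    append_event(path, "PC_0", "create", {})
    append_event(path, "PC_1", "create", {})
    append_event(path, "PC_0", "setLocalDescription", {"type": "offer"})
    return split_by_peerconnection(path)
```

Events that belong to no peerconnection, such as GUM, end up under the `None` key here, which keeps them in the same file as the rest of the session.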


fippo commented on July 19, 2024

https://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying -- worth reading re log style, in particular the section titled "Stateful Real-Time Processing"


ggarber commented on July 19, 2024

Thx for the link, I had read it already. I like these posts too: http://www.confluent.io/blog/stream-data-platform-1/


ggarber commented on July 19, 2024

In my opinion we should store a single file per client and then generate multiple rows in the database, one per stream, with the features extracted from that file.

That single file per client could be stored locally (appending logs as they are received from the websocket) and then pushed to S3.

Is that what you were proposing @fippo ?
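
A rough sketch of that flow, under stated assumptions: the `stream` field, the `num_events` feature, and the `upload` and `insert_row` callables are hypothetical stand-ins for the real S3 client and database layer, which this thread does not describe.

```python
import json
import os
import tempfile


def extract_rows(path, client_id):
    """Return one feature row per stream found in the client's log file."""
    counts = {}
    with open(path, encoding="utf-8") as log:
        for line in log:
            record = json.loads(line)
            counts[record["stream"]] = counts.get(record["stream"], 0) + 1
    # A real feature set would be richer; the event count is only a stand-in.
    return [
        {"client": client_id, "stream": stream, "num_events": n}
        for stream, n in sorted(counts.items())
    ]


def on_client_closed(path, client_id, upload, insert_row):
    """After the websocket closes: write rows, ship the file, drop the local copy."""
    for row in extract_rows(path, client_id):
        insert_row(row)
    upload(path, f"{client_id}.log")  # hypothetical hook, e.g. an S3 put
    os.remove(path)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "abc.log")
    with open(path, "a", encoding="utf-8") as log:
        for stream in ["PC_0", "PC_1", "PC_0"]:
            log.write(json.dumps({"stream": stream, "type": "getStats"}) + "\n")
    rows, uploads = [], []
    on_client_closed(path, "abc", lambda p, key: uploads.append(key), rows.append)
    return rows, uploads, os.path.exists(path)
```

Passing the upload and database calls in as parameters keeps the sketch independent of any particular storage library.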


fippo commented on July 19, 2024

Yes, but that requires a rewrite, because we currently don't keep a single log: there is one log for GUM and one per peerconnection.


fippo commented on July 19, 2024

so I deployed this to production and it crashed 81 times within the first twelve hours (OOM)... and I don't give a f*** because this doesn't affect the client or service :-)

Will work on this though, since I might be losing quite a bit of data.


fippo commented on July 19, 2024

clustering support (#105) helps a bit since on t2.medium instances you can easily run two processes. The core issue is still that we keep stuff around in memory for too long. We could do delta storage in-memory as well instead of blowing things up. Then we could also just do the expansion during feature extraction.
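
As an illustration of the delta idea, here is a minimal Python sketch. It assumes flat stats reports whose keys never disappear between reports; the function names and report fields are made up for the example and are not taken from the server. Only changed values are kept in memory, and the full reports are rebuilt lazily when features are extracted.

```python
_MISSING = object()  # sentinel so that a stored value of None still compares correctly


def compress(previous, current):
    """Keep only the keys whose value changed since the previous report."""
    return {
        key: value
        for key, value in current.items()
        if previous.get(key, _MISSING) != value
    }


def store_deltas(reports):
    """Turn a sequence of full reports into the list of deltas kept in memory."""
    deltas, previous = [], {}
    for report in reports:
        deltas.append(compress(previous, report))
        previous = report
    return deltas


def expand(deltas):
    """Rebuild the full reports one at a time, e.g. during feature extraction."""
    state = {}
    for delta in deltas:
        state = {**state, **delta}
        yield state


def demo():
    reports = [
        {"bytesSent": 0, "codec": "opus"},
        {"bytesSent": 100, "codec": "opus"},
        {"bytesSent": 100, "codec": "opus"},
    ]
    deltas = store_deltas(reports)
    return reports, deltas, list(expand(deltas))
```

Because `expand` is a generator, only one expanded report exists at a time, so the blow-up happens per report rather than for the whole session.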


fippo commented on July 19, 2024

scales in production now

