Hi, Sorry for positing a question here but I am not sure where else

Thanks <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-u

thank you <a class="user-mention notranslate" data-hovercard-type="user" data-hovercar

Batched remove() - Using remove() less frequently than peek() about tape HOT 6 OPEN

roytmana commented on July 28, 2024

Batched remove() - Using remove() less frequently than peek()

from tape.

Comments (6)

swankjesse commented on July 28, 2024 1

Tape wasn't designed to be super high throughout; it was designed to be simple and safe.

If you want to ramp up throughput, encode items in batches. It's a more complicated API but you'll save fsync calls which dominates performance.

from tape.

roytmana commented on July 28, 2024

I retract my question. I did not realize that pick()/iterate does not keep its own position in the queue - always start from head which is only advanced by remove. So it looks like in a multi-threaded environment there is no way to read from queue process data and remove of successful except keeping the lock for entire period from starting iteration to end of processing and removal. So if consumer for some reason stuck producers will not be able to put anything on the queue till the consumer released the lock.

I guess in case of a single consumer and multiple producers I still could lock, peek, unlock, process then lock and remove assuming the only one consumer performing any remove operation

from tape.

JakeWharton commented on July 28, 2024

I wouldn't hold the lock during processing. I would lock to retrieve items, process the items, and then lock to remove the items. So long as you have only a single consumer this won't cause concurrency issues as producers are only able to append.

…

On Sun, Apr 15, 2018, 6:01 PM Alex Roytman ***@***.***> wrote: I retract my question. I did not realize that pick()/iterate does not keep its own position in the queue - always start from head which is only advanced by remove. So it looks like in a multi-threaded environment there is no way to read from queue process data and remove of successful except keeping the lock for entire period from starting iteration to end of processing and removal. So if consumer for some reason stuck producers will not be able to put anything on the queue till the consumer released the lock — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#189 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAEEEVNrxh7DOZAKWMtJUjsL-bO8DO_iks5to8M-gaJpZM4TVpR7> .

from tape.

roytmana commented on July 28, 2024

Thanks @JakeWharton I have switched to a single consumer for the queue with fork/join multiple parallel processors and then removal from the queue when all processors are done.
I am curious what performance to expect on adding small items (20 bytes). For me it is pretty slow - about 1500 items/sec on fast machine with SSD. I can pump date to a remote database faster. I tried to turn antivirus off but does not seem to make much difference. What is your typical insertion rate with small items?

from tape.

roytmana commented on July 28, 2024

thank you @swankjesse I was thinking about it but adding does not lend itself to batching in my case - the whole point is to put it on a persistent queue immediately for reliability not to accumulate it in memory. As far as reading I do drain whole the queue and then remove all drained items at once but to avoid extra fsync but iteration I think reads records one by one. I do not know if there is anything else I can do beyond bulk removal.
I really liked Tape when I found it and would prefer not to roll my own solution although my case is simpler - I just need ability to recover queue on crash/restart and dumping overflow data if consumer is not available. I could do with an in memory queue backed up by append-only fixed record size file with head index that is moved on every N-th peek because I can tolerate replaying already processed transactions. Given all that I think I can make it good deal faster but it'll still have many corner cases to worry about and test...

from tape.

roytmana commented on July 28, 2024

sorry, the last question - when 2.0 release is to be expected? I am using snapshot maven but would prefer a release if possible

from tape.

Batched remove() - Using remove() less frequently than peek() about tape HOT 6 OPEN

Comments (6)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent