
TCPMemcachedNodeImpl Queues overflow about spymemcached HOT 7 CLOSED

rconn01 commented on May 20, 2024

TCPMemcachedNodeImpl Queues overflow

from spymemcached.

Comments (7)

GoogleCodeExporter commented on May 20, 2024

Well, what I've thought would be the simple solution in the past would be to
block when adding to the queue (optionally).  I figured an end user would want
one of two behaviors:

1)  Block when enqueuing.
2)  Get an exception and be sent on your way.

You seem to suggest there's a third thing going on, which is having the IO
thread crash.  That's definitely undesirable.
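
As a rough illustration of the two behaviors listed above, here is a minimal
sketch built on java.util.concurrent.BlockingQueue. The class and method names
(EnqueuePolicies, enqueueBlocking, enqueueOrThrow) are hypothetical and are not
part of spymemcached; the timeout on the blocking variant is also an assumption,
added so a caller cannot hang forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not spymemcached code.
class EnqueuePolicies {

    // Behavior 1: block the caller until there is room (bounded by a timeout).
    // Returns false if the queue stayed full or the thread was interrupted.
    static <T> boolean enqueueBlocking(BlockingQueue<T> queue, T op, long timeoutMs) {
        try {
            return queue.offer(op, timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    // Behavior 2: fail fast with an unchecked exception when the queue is full.
    static <T> void enqueueOrThrow(BlockingQueue<T> queue, T op) {
        if (!queue.offer(op)) {
            throw new IllegalStateException("Input queue full, operation rejected");
        }
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        enqueueOrThrow(queue, "set a");                 // fits
        System.out.println(enqueueBlocking(queue, "set b", 50));
        try {
            enqueueOrThrow(queue, "set c");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Either way the failure lands on the calling thread, not the IO thread, which is
the property the rest of this thread is after.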

The thread scheduling thing is understandable.  It currently ends up doing two
jobs, and one of them is not IO (decoding the results).  I'd really like to
push that to the caller more, but previous attempts have resulted in rather
ugly code.  Simply having a thread pool the transcoders pass through should
speed up throughput and can be implemented completely outside of the library,
but I haven't tried it.
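
One way to picture the thread-pool idea is sketched below. It assumes the raw
bytes are already available to the caller and uses a hypothetical Decoder
interface as a stand-in for a transcoder; it is not the spymemcached Transcoder
API and has not been measured against the library.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: decode results on a worker pool instead of the IO thread.
class DecodeOffload {

    // Stand-in for a transcoder's decode step (an assumption, not the real interface).
    interface Decoder<T> {
        T decode(byte[] raw);
    }

    // Hands the (possibly expensive) decode to the pool and returns immediately.
    static <T> Future<T> decodeAsync(ExecutorService pool, Decoder<T> decoder, byte[] raw) {
        Callable<T> task = () -> decoder.decode(raw);
        return pool.submit(task);
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            Decoder<String> utf8 = raw -> new String(raw, StandardCharsets.UTF_8);
            byte[] raw = "cached value".getBytes(StandardCharsets.UTF_8);
            Future<String> decoded = decodeAsync(pool, utf8, raw);
            System.out.println(decoded.get()); // the caller waits, the IO thread does not
        } finally {
            pool.shutdown();
        }
    }
}
```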

I've got a v3 branch where I'm breaking some compatibility to make room for
more of the things people have asked for and the kinds of things I've
recognized as being inflexibilities.  I'm hoping I can isolate the IO thread
more and avoid decoding data within it, but I haven't got to that part yet.

I'd rather not introduce checked exceptions on the interface, but an unchecked
exception for signaling such things wouldn't be too bad.

And of course, if people are having to have custom builds of my library, then
that's a failure in the library itself, so I do want to make sure that it is
flexible enough to match a variety of deployment criteria while not slowing
down or growing in complication (which is why I'm having to introduce
incompatibilities moving forward).

If you have fixes for issues you're seeing, I'd certainly like to at least see
them.

Original comment by [email protected] on 6 May 2008 at 11:28


GoogleCodeExporter commented on May 20, 2024

I'm attaching the latest versions of the files I've changed. I started with the
2.0.1 codebase and haven't integrated back your most recent updates.

When I was first making changes, I started with just blocking the thread that
was attempting to enqueue onto the inputQueue. The problem there was that each
thread had a dbConnection, so eventually there were so many blocked threads
with a DB connection that our dbConnectionPool ran dry.

Now, with more time I could have each thread give up its dbConnection before
making an attempt to enqueue onto the inputQueue...but in practice we would
rather have the operation fail immediately and take appropriate action (log an
error and continue for sets, and return an xml exception message for gets).
Which is why the code now throws an exception. I figured users of the client
API could always sleep and retry the operation if they wanted blocking-type
behavior.
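
The "sleep and re-attempt" approach could look roughly like the sketch below.
It assumes the rejection reaches the caller as an unchecked
IllegalStateException, which is an assumption for illustration (the thread does
not settle on an exception type); RetryOnFull and retry are hypothetical names.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical caller-side helper, not spymemcached code.
class RetryOnFull {

    // Runs the attempt up to maxAttempts times, sleeping between failures.
    // Returns true on success, false if every attempt hit a full queue
    // or the thread was interrupted while sleeping.
    static boolean retry(Runnable attempt, int maxAttempts, long sleepMs) {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                attempt.run();
                return true;
            } catch (IllegalStateException queueFull) {
                try {
                    Thread.sleep(sleepMs); // give the IO thread time to drain
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        BlockingQueue<String> inputQueue = new ArrayBlockingQueue<>(1);
        inputQueue.add("pending op");
        // add() throws IllegalStateException while the queue is full.
        boolean accepted = retry(() -> inputQueue.add("new op"), 3, 10);
        System.out.println("accepted: " + accepted);
    }
}
```

A caller that holds a scarce resource such as a DB connection would want to
release it before sleeping, for the reason described above.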

I haven't come up with a clean idea for the IO thread scheduling issue, mostly
because JVMs don't honor thread priority very well. In my last project this
issue was really a core problem, and our solution was to have an intelligent
thread scheduler that would give time to the appropriate threads to keep the
IO buffers moving along...

That project was entirely written around that concept, and it's not really
viable for the amount of time I have for this work.

Original comment by [email protected] on 12 May 2008 at 7:13

Attachments:


GoogleCodeExporter commented on May 20, 2024

Here is an example of the IO thread getting an unhandled queue full exception:

Exception in thread "Memcached IO over {MemcachedConnection to xxx.xxx/xx.xx.xx.xx:11211 xxx.xxx/xx.xx.xx.xx:11211 }" java.lang.IllegalStateException: Queue full
    at java.util.AbstractQueue.add(AbstractQueue.java:64)
    at java.util.AbstractQueue.addAll(AbstractQueue.java:143)
    at net.spy.memcached.MemcachedNodeImpl.copyInputQueue(MemcachedNodeImpl.java:62)
    at net.spy.memcached.MemcachedConnection.handleInputQueue(MemcachedConnection.java:210)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:141)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:715)

This is the writeQ running out of space. I have an example of the readQ doing
the same if you'd like to see :)
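
The failure mode in that trace can be reproduced with plain JDK queues. The
sketch below is not the spymemcached code; it only assumes that both queues are
bounded BlockingQueues (ArrayBlockingQueue here, with made-up capacities) and
that everything drained from the input queue is pushed into the write queue
with addAll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Stand-alone reproduction of "IllegalStateException: Queue full".
class QueueFullRepro {

    // Returns the exception message, or null if no overflow happened.
    static String overflowMessage() {
        BlockingQueue<String> inputQueue = new ArrayBlockingQueue<>(4);
        BlockingQueue<String> writeQ = new ArrayBlockingQueue<>(2);
        for (int i = 0; i < 4; i++) {
            inputQueue.add("op" + i);
        }

        List<String> tmp = new ArrayList<>();
        inputQueue.drainTo(tmp);      // takes all 4 operations
        try {
            writeQ.addAll(tmp);       // only 2 fit; add() throws on the third
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(overflowMessage());
    }
}
```

Note that the operations already drained into the temporary list are gone from
the input queue when the exception is thrown, so an unhandled overflow on the
IO thread also loses work.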

Original comment by [email protected] on 12 May 2008 at 7:19


GoogleCodeExporter commented on May 20, 2024

I didn't much like the IOException thing.  Without that, I got your change down
to this:

diff --git a/src/main/java/net/spy/memcached/protocol/TCPMemcachedNodeImpl.java b/src/main/java/net/spy/memcached/protocol/TCPMemcachedNodeImpl.java
index 23ea3a7..6cdf5f3 100644
--- a/src/main/java/net/spy/memcached/protocol/TCPMemcachedNodeImpl.java
+++ b/src/main/java/net/spy/memcached/protocol/TCPMemcachedNodeImpl.java
@@ -61,7 +61,10 @@ public abstract class TCPMemcachedNodeImpl extends SpyObject
         */
        public final void copyInputQueue() {
                Collection<Operation> tmp=new ArrayList<Operation>();
-               inputQueue.drainTo(tmp);
+
+               // don't drain more than we have space to place
+               inputQueue.drainTo(tmp, writeQ.remainingCapacity());
+
                writeQ.addAll(tmp);
        }

@@ -108,7 +111,7 @@ public abstract class TCPMemcachedNodeImpl extends SpyObject
         * @see net.spy.memcached.MemcachedNode#fillWriteBuffer(boolean)
         */
        public final void fillWriteBuffer(boolean optimizeGets) {
-               if(toWrite == 0) {
+               if(toWrite == 0 && readQ.remainingCapacity() > 0) {
                        getWbuf().clear();
                        Operation o=getCurrentWriteOp();
                        while(o != null && toWrite < getWbuf().capacity()) {

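The first hunk of the diff above can be tried in isolation with the sketch
below. It mirrors the drainTo(tmp, remainingCapacity) idea using plain JDK
queues; the class name BoundedDrain and the capacities in main are made up for
illustration. It relies on the assumption that only one thread (the IO thread)
ever adds to writeQ, since remainingCapacity() is only a snapshot.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Stand-alone version of the bounded drain; not the library class itself.
class BoundedDrain {

    // Moves at most as many operations as writeQ can currently hold.
    // Returns how many were moved; the rest stay in inputQueue.
    static <T> int copyInputQueue(BlockingQueue<T> inputQueue, BlockingQueue<T> writeQ) {
        List<T> tmp = new ArrayList<>();
        // don't drain more than we have space to place
        inputQueue.drainTo(tmp, writeQ.remainingCapacity());
        writeQ.addAll(tmp);
        return tmp.size();
    }

    public static void main(String[] args) {
        BlockingQueue<String> inputQueue = new ArrayBlockingQueue<>(8);
        BlockingQueue<String> writeQ = new ArrayBlockingQueue<>(2);
        for (int i = 0; i < 5; i++) {
            inputQueue.add("op" + i);
        }
        int moved = copyInputQueue(inputQueue, writeQ);
        System.out.println("moved=" + moved + " left=" + inputQueue.size());
    }
}
```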

I don't have a test that overflows, though.  The stack trace provides some
hints as to where it might be possible to cut it off, but the code you sent me
looks like it might do a decent job of that itself.

I'll have to play around with some manual tests.

Original comment by [email protected] on 13 May 2008 at 6:06


GoogleCodeExporter commented on May 20, 2024

OK, as of f1969bf1f88b62a71dcc9f392c4c9f0756fcea09 I think I've got more
control over this.

Specifically, I've written some tests to contrive queue overflows and worked on
getting them to pass.

Please enhance the test if you have another failing case.

Original comment by [email protected] on 14 May 2008 at 5:21

Changed state: Fixed


GoogleCodeExporter commented on May 20, 2024

Using 2.1rc2, I get the "Queue full" regularly.  It's a simple load generator,
iterating over a key collection, putting key:value pairs into memcached.  When
my collection gets bigger than ~75,000, the exception happens consistently.
(50,000 works consistently, so the threshold is between 50K and 75K.)

I'm running the spy client and the native memcached on separate Sun W2200.  The
OS is Solaris 10u5; Java 1.6.0_02.

I'm happy to apply any patches and make changes per a request to run in my
environment.

--clc

Original comment by [email protected] on 22 May 2008 at 7:25


GoogleCodeExporter commented on May 20, 2024

This bug is really more about having the client stop functioning when queues
burst internally.  It's OK to fill a queue and be notified of it.  What you
don't want is for the client to fall apart so that you have to get a new one.

Your case is somewhat unique and I don't optimize for it.  That is, lots of
unattended sets are not a normal mode of operation.  If you do want to do that,
you can just call waitForQueues or something every once in a while (maybe after
every 50k in your example) and things would be better.
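
A bulk loader following that advice might be shaped like the sketch below. To
stay self-contained it talks to a hypothetical Client interface, not the real
MemcachedClient; the waitForQueues(timeout, unit) signature is modeled on the
call named above and should be checked against the library version in use. The
batch size and the 10-second timeout are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a load generator that pauses to let queues drain.
class BatchedSets {

    // Minimal stand-in for the client (an assumption for illustration).
    interface Client {
        void set(String key, String value);
        boolean waitForQueues(long timeout, TimeUnit unit);
    }

    // Issues one set per key and waits for the queues after every batchSize sets.
    // Returns how many times it waited.
    static int loadAll(Client client, List<String> keys, int batchSize) {
        int waits = 0;
        for (int i = 0; i < keys.size(); i++) {
            client.set(keys.get(i), "value-" + i);
            if ((i + 1) % batchSize == 0) {
                if (!client.waitForQueues(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("queues did not drain in time");
                }
                waits++;
            }
        }
        return waits;
    }

    public static void main(String[] args) {
        // Fake client that accepts everything, just to exercise the loop.
        Client fake = new Client() {
            public void set(String key, String value) { }
            public boolean waitForQueues(long timeout, TimeUnit unit) { return true; }
        };
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            keys.add("key-" + i);
        }
        System.out.println("waits: " + loadAll(fake, keys, 25));
    }
}
```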

Original comment by [email protected] on 22 May 2008 at 9:43
