
Comments (4)

robey commented on June 29, 2024

Sorry for the delay -- I was out of town for a while.

There are two scripts built with kestrel that should help here:

$ ./dist/kestrel/scripts/qdump.sh --help

That will dump out a (sort of) human-readable list of the operations in a journal file, without the associated data. It might be useful for figuring out what's messed up, although if the journal is "hundreds of gigabytes", it might be too time-consuming to matter.

$ ./dist/kestrel/scripts/qpack.sh --help

This takes a set of journal files and packs them into their minimal operations (usually one ADD for each item that's currently in the queue). You should take the bad server out of rotation, run this script on its journal files, and then start it back up again with only the new journal file. (You can delete the old ones after it's finished packing the new one.) It might be helpful to "qdump" the new file before starting, in case there's something wonky.

from kestrel.

kperi commented on June 29, 2024

Hi,
Thanks for the reply.

I've tried qdump on some of the files and I am getting something like:

qdump.sh my_queue.1373865238490
00000000 ADD 2601
00000a3e ADD 2641
000014a4 REM
000014a5 REM
000014a6 ADD 2597
00001ee0 ADD 2616
0000292d ADD 2437
000032c7 REM
000032c8 ADD 2572
00003ce9 ADD 2624
...............
0199cf REM
000199d0 REM
000199d1 ADD 2649
0001a43f ADD 2649
0001aead ADD 2631
0001b909 REM
0001b90a REM
0001b90b REM
0001b90c REM
Exception in thread "main" java.util.NoSuchElementException: queue empty
        at scala.collection.mutable.Queue.dequeue(Queue.scala:65)
        at net.lag.kestrel.tools.QueueDumper.dumpItem(QDumper.scala:102)
        at net.lag.kestrel.tools.QueueDumper$$anonfun$apply$2.apply(QDumper.scala:47)
        at net.lag.kestrel.tools.QueueDumper$$anonfun$apply$2.apply(QDumper.scala:45)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$22.foreach(Iterator.scala:451)
        at net.lag.kestrel.tools.QueueDumper.apply(QDumper.scala:45)
        at net.lag.kestrel.tools.QDumper$$anonfun$main$1.apply(QDumper.scala:201)
        at net.lag.kestrel.tools.QDumper$$anonfun$main$1.apply(QDumper.scala:199)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
        at scala.collection.immutable.List.foreach(List.scala:76)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:30)
        at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:44)
        at net.lag.kestrel.tools.QDumper$.main(QDumper.scala:199)
        at net.lag.kestrel.tools.QDumper.main(QDumper.scala)

Other files in the same queue don't seem to have a problem with qdump, i.e. I get a full list of ops and qdump exits without exceptions.
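
For anyone wanting to check where a dump goes wrong, here is a rough Python sketch that replays qdump-style lines and reports the first REM that arrives while the tracked queue is empty, which is what the "queue empty" exception above appears to point at. The line shape (`<hex offset> ADD <size>` or `<hex offset> REM`) is assumed only from the sample output in this comment; `first_underflow` is a hypothetical helper, not part of kestrel, and any other operation types the tool may print are simply ignored.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed line shape, taken only from the sample output in this thread:
#   "<hex offset> ADD <size>"   or   "<hex offset> REM"
LINE_RE = re.compile(r"^([0-9a-fA-F]+)\s+(ADD|REM)\b")


def first_underflow(lines: Iterable[str]) -> Tuple[int, Optional[str]]:
    """Replay ADD/REM lines and track how many items are outstanding.

    Returns (depth, offset): offset is the hex offset of the first REM
    seen while the tracked depth is zero, or None if that never happens.
    """
    depth = 0
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an ADD/REM line (stack trace, other ops, blanks)
        offset, op = match.groups()
        if op == "ADD":
            depth += 1
        elif depth == 0:
            return depth, offset  # REM with nothing left to remove
        else:
            depth -= 1
    return depth, None


# Hypothetical input (not a real journal): two ADDs followed by three REMs.
sample = """\
00000000 ADD 2601
00000a3e ADD 2641
000014a4 REM
000014a5 REM
000014a6 REM
""".splitlines()

print(first_underflow(sample))  # prints (0, '000014a6')
```

Note that an underflow in a single file is not necessarily corruption: it may simply mean the removed item was added in an earlier journal file of the same queue.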


Additionally, the overall queue + journal size dropped from 154G to 500M just after I started pushing new data to the queue.

Thanks,
Kostas


andrewclegg commented on June 29, 2024

We've just had the same situation occur that @kperi described above.

I'm trying to rescue the situation without losing any data, but when I try to run qpack.sh on the out-of-control journal files, I get errors like these:

Packing journals...
Packing: 2.2M   2.2M  Exception in thread "main" net.lag.kestrel.BrokenItemException: java.io.IOException: Unexpected EOF
        at net.lag.kestrel.Journal.readJournalEntry(Journal.scala:414)
        at net.lag.kestrel.Journal.next$1(Journal.scala:422)
        at net.lag.kestrel.Journal$$anonfun$next$1$1.apply(Journal.scala:427)
        at net.lag.kestrel.Journal$$anonfun$next$1$1.apply(Journal.scala:427)
        at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1060)
        at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1052)
        at scala.collection.immutable.StreamIterator$$anonfun$next$1.apply(Stream.scala:952)
        at scala.collection.immutable.StreamIterator$$anonfun$next$1.apply(Stream.scala:952)
        at scala.collection.immutable.StreamIterator$LazyCell.v(Stream.scala:941)
        at scala.collection.immutable.StreamIterator.hasNext(Stream.scala:946)
        at scala.collection.TraversableOnce$FlattenOps$$anon$1.hasNext(TraversableOnce.scala:391)
        at scala.collection.Iterator$$anon$22.hasNext(Iterator.scala:457)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$22.foreach(Iterator.scala:451)
        at net.lag.kestrel.JournalPacker.apply(JournalPacker.scala:73)
        at net.lag.kestrel.tools.QPacker$.main(QPacker.scala:65)
        at net.lag.kestrel.tools.QPacker.main(QPacker.scala)
Caused by: java.io.IOException: Unexpected EOF
        at net.lag.kestrel.Journal.readBlock(Journal.scala:443)
        at net.lag.kestrel.Journal.readJournalEntry(Journal.scala:382)
        ... 16 more

This suggests one possible reason for the journals not getting packed automatically: the journal packer thread is failing on corrupt journal files because of this exception.

I couldn't find any direct evidence of this in the log files, but then, the disks were full due to this issue, so maybe the logger couldn't write the appropriate stacktrace. A bit speculative, but it's all I've got so far...


andrewclegg commented on June 29, 2024

Note: there were a lot of timestamped journal files with size 0, but I removed all of these first before trying to run qpack.sh.
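
A small sketch of how the zero-size timestamped files could be listed before deciding what to remove. It assumes rotated journal files are named `<queue name>.<digits>`, which is inferred only from the file name `my_queue.1373865238490` shown earlier in the thread; `empty_journal_files` is a hypothetical helper, and it only lists files, it does not delete anything.

```python
import tempfile
from pathlib import Path
from typing import List


def empty_journal_files(queue_dir: str, queue_name: str) -> List[Path]:
    """List zero-byte files named '<queue_name>.<digits>' in queue_dir.

    The naming pattern is an assumption based on this thread.
    """
    found = []
    for path in sorted(Path(queue_dir).glob(queue_name + ".*")):
        suffix = path.name[len(queue_name) + 1:]
        if path.is_file() and suffix.isdigit() and path.stat().st_size == 0:
            found.append(path)
    return found


# Demo in a throwaway directory with made-up files.
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "my_queue.1373865238490").write_bytes(b"")   # empty, timestamped
    (Path(d) / "my_queue.1373865238491").write_bytes(b"x")  # non-empty
    (Path(d) / "my_queue").write_bytes(b"")                  # no timestamp suffix
    print([p.name for p in empty_journal_files(d, "my_queue")])
    # prints ['my_queue.1373865238490']
```

Reviewing the list first, with the server out of rotation, keeps the cleanup step separate from the packing step.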

