Capture stderr of jobs (jobrnr, CLOSED)

rfdonnelly commented on July 23, 2024
Capture stderr of jobs

from jobrnr.

Comments (2)

rfdonnelly commented on July 23, 2024

Here is a proof of concept.

test.rb

#!/usr/bin/env ruby

def main
  if ARGV == %w(child)
    child
  else
    parent
  end
end

def child
  # Randomly print to stdout or stderr
  10.times { |i| [$stdout, $stderr].sample.puts i }
end

def parent
  errors = Array.new

  # Spawn child process and connect stdout and stderr to pipes
  err_r, err_w = IO.pipe
  out_r, out_w = IO.pipe
  pid = Process.spawn("#{__FILE__} child", :out => out_w, :err => err_w)

  # Handle output until child process is complete
  while Process.waitpid(pid, Process::WNOHANG).nil?
    # Check for availability of output on stdout and stderr.
    #
    # Use a timeout so that select does not block indefinitely; otherwise
    # we would miss process completion.
    rs, = IO.select([out_r, err_r], nil, nil, 0.1)

    # Skip iteration if no output available
    next if rs.nil?

    rs.first.tap do |r|
      case r
      when err_r
        # Save stderr and forward it to stdout with an "Err: " prefix
        line = r.gets
        errors << line
        $stdout.write "Err: #{line}"
      when out_r
        # Forward stdout to stdout
        $stdout.write "Out: #{r.gets}"
      end
    rescue EOFError
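      # IO#gets returns nil at EOF instead of raising EOFError, and the parent
      # still holds the write ends of both pipes open, so EOF is never seen
      # here.
      #
      # We can't rely on this happening and thus need to poll for completion
      # using waitpid.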
      $stderr.puts "EOF"
    end
  end

  # Print errors
  if errors.any?
    puts
    puts "Errors present:"
    puts errors.join
  end
end

main

Sample output of: ./test.rb

Err: 2
Err: 3
Err: 4
Err: 8
Out: 0
Out: 1
Out: 5
Out: 6
Out: 7
Out: 9

Errors present:
2
3
4
8

I have two concerns with this solution:

  1. Performance -- I suspect this degrades performance significantly because it gets Ruby involved in job output processing. It would be nice to measure the performance impact.
  2. Ordering -- This solution does not preserve the relative ordering of stderr and stdout output. However, neither does the current output redirection (i.e. command >log 2>&1). This is due to buffering on stdout: when stdout is not a terminal it is typically block-buffered, while stderr is written immediately. It may be possible to defeat buffering but at a cost to performance.
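
The buffering effect behind the ordering concern can be reproduced with a small sketch. It is an illustration only, not part of jobrnr: `CHILD_SCRIPT` and `merged_output` are hypothetical names, and the child is a throwaway Ruby one-liner that alternates between stdout and stderr. Both streams share a single pipe, as with 2>&1, so the only variable is whether the child buffers stdout.

```ruby
require "rbconfig"

# Hypothetical child script: alternates between stdout and stderr.
# Passing "sync" as the first argument disables Ruby's stdout buffering.
CHILD_SCRIPT = <<~'RUBY'
  $stdout.sync = (ARGV.first == "sync")
  4.times { |i| (i.even? ? $stdout : $stderr).puts i }
RUBY

# Run the child with stdout and stderr sharing one pipe (like 2>&1) and
# return everything it wrote, in arrival order.
def merged_output(*child_args)
  r, w = IO.pipe
  pid = Process.spawn(RbConfig.ruby, "-e", CHILD_SCRIPT, *child_args, [:out, :err] => w)
  w.close # the parent must drop its write end, or r.read never sees EOF
  output = r.read
  Process.waitpid(pid)
  output
end

p merged_output          # buffered stdout: stdout lines typically arrive last
p merged_output("sync")  # => "0\n1\n2\n3\n"
```

With a shared pipe and unbuffered stdout the order is preserved, but the parent can no longer tell which stream a line came from. With separate pipes, as in the proof of concept, the parent sees two independent streams and cannot recover their relative order even when the child does not buffer.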

Because of the ordering limitation, I don't think stderr is a good way for commands to report certain types of errors, so a parsing-based solution may be better.
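
A note on the EOF handling in the proof of concept: the parent never closes its own copies of the pipe write ends, so the read ends cannot report EOF while the parent is alive. The polling loop also stops as soon as waitpid reports that the child has exited, so any output still sitting in the pipes at that moment is never read. The following is a minimal sketch of an alternative, assuming it is acceptable to wait for EOF on both pipes instead of polling. `capture_stderr` and its `out:` parameter are hypothetical names, not part of jobrnr.

```ruby
# Hypothetical helper (not part of jobrnr): run a command, forward its output
# with a prefix per stream, and return the lines it wrote to stderr.
def capture_stderr(command, out: $stdout)
  errors = []

  err_r, err_w = IO.pipe
  out_r, out_w = IO.pipe
  pid = Process.spawn(command, out: out_w, err: err_w)

  # Close the parent's copies of the write ends. Otherwise the read ends
  # never report EOF, because a writer (the parent) still exists.
  out_w.close
  err_w.close

  open_pipes = [out_r, err_r]
  until open_pipes.empty?
    # No timeout needed: a pipe at EOF is reported as readable too.
    ready, = IO.select(open_pipes)

    ready.each do |r|
      line = r.gets
      if line.nil?
        # IO#gets returns nil at EOF; stop watching this pipe.
        open_pipes.delete(r)
        r.close
      elsif r.equal?(err_r)
        errors << line
        out.write "Err: #{line}"
      else
        out.write "Out: #{line}"
      end
    end
  end

  # All output has been read; now reap the child.
  Process.waitpid(pid)
  errors
end

errors = capture_stderr("echo to-stdout; echo to-stderr >&2")
puts "Errors present:", errors.join if errors.any?
```

Two caveats apply. `gets` blocks until a full line or EOF arrives, so a child that writes a partial line to one stream and then stalls also stalls the other. And if the child starts background processes that inherit the pipes, EOF is delayed until they exit as well.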

rfdonnelly commented on July 23, 2024

I benchmarked the two modes:

  • Direct mode -- subprocess outputs directly to stdout
  • Passthrough mode -- subprocess output passes through parent process

Test script

ioperf

#!/usr/bin/env ruby

def main
  mode = ARGV.shift
  command = ARGV.join(" ")
  method(mode).call(command)
end

def direct(argv)
  pid = Process.spawn(argv)
  Process.waitpid(pid)
end

def passthrough(argv)
  errors = Array.new

  # Spawn child process and connect stdout and stderr to pipes
  err_r, err_w = IO.pipe
  out_r, out_w = IO.pipe
  pid = Process.spawn(argv, :out => out_w, :err => err_w)

  # Handle output until child process is complete
  while Process.waitpid(pid, Process::WNOHANG).nil?
    # Check for availability of output on stdout and stderr.
    #
    # Use a timeout so that select does not block indefinitely; otherwise
    # we would miss process completion.
    rs, = IO.select([out_r, err_r], nil, nil, 0.1)

    # Skip iteration if no output available
    next if rs.nil?

    rs.first.tap do |r|
      case r
      when err_r
        # Save stderr and forward it to stdout
        line = r.gets
        errors << line
        $stdout.write line
      when out_r
        # Forward stdout to stdout
        $stdout.write r.gets
      end
    end
  end
end

main

Benchmark Setup

Create a large file

yes | tr '\n' ' ' | fold -w 80 | head -c1G > large.file

Benchmark

./ioperf direct cat large.file | pv > /dev/null
1.00GiB 0:00:00 [1.57GiB/s]
./ioperf passthrough cat large.file | pv > /dev/null
1023MiB 0:00:30 [33.8MiB/s]

Conclusion

Capturing job output this way is roughly 50x slower than the current method (33.8MiB/s vs. 1.57GiB/s), almost 2 orders of magnitude. Closing due to insanely poor performance.
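
One possible contributor to the slowdown is that the passthrough loop does a waitpid call, a select call, a gets, and a write for every 80-byte line. The sketch below is an unmeasured variant that forwards output in larger chunks and waits for EOF instead of polling. Whether it closes the gap has not been benchmarked here. `passthrough_chunked`, `CHUNK_SIZE`, and the `out:` parameter are hypothetical names, and the 64 KiB chunk size is an arbitrary assumption.

```ruby
# Hypothetical variant of `passthrough` (not benchmarked): forwards output in
# chunks of up to CHUNK_SIZE bytes instead of line by line.
CHUNK_SIZE = 64 * 1024 # arbitrary assumption

def passthrough_chunked(command, out: $stdout)
  captured_stderr = String.new # binary-encoded, mutable buffer

  err_r, err_w = IO.pipe
  out_r, out_w = IO.pipe
  pid = Process.spawn(command, out: out_w, err: err_w)

  # Drop the parent's write ends so the read ends can reach EOF.
  out_w.close
  err_w.close

  open_pipes = [out_r, err_r]
  until open_pipes.empty?
    ready, = IO.select(open_pipes)

    ready.each do |r|
      begin
        # Returns whatever is available, up to CHUNK_SIZE bytes.
        chunk = r.readpartial(CHUNK_SIZE)
      rescue EOFError
        # Unlike IO#gets, IO#readpartial raises EOFError at EOF.
        open_pipes.delete(r)
        r.close
        next
      end

      captured_stderr << chunk if r.equal?(err_r)
      out.write chunk
    end
  end

  Process.waitpid(pid)
  captured_stderr
end
```

The trade-off is that stderr arrives as raw chunks, so any per-line handling would have to split `captured_stderr` on newlines afterwards.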
