Comments (10)

mcollina commented on June 25, 2024

Generating JSON requires slightly more overhead than just strings. Specifically, your long strings must be escaped to be valid JSON.
On top of that, the algorithm used for serializing objects differs from the one used by console.log, which limits the depth.
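
A minimal sketch of the escaping cost, assuming a message containing characters JSON must escape:

const msg = 'line one\nline two with "quotes"';

// console.log writes the string out essentially as-is:
console.log(msg);

// pino must emit valid JSON, so every quote and newline in the
// message gets escaped during serialization:
console.log(JSON.stringify({ level: 30, msg }));
// -> {"level":30,"msg":"line one\nline two with \"quotes\""}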


I see more CPU utilization in your graph, but I hardly see a reason for the extra pods and the usage threshold. (If you keep Node.js CPU usage at anything lower than 70-80%, you are wasting a lot of money).

apazzolini commented on June 25, 2024

Generating JSON requires slightly more overhead than just strings. Specifically, your long strings must be escaped to be valid JSON. On top of that, the algorithm used for serializing objects differs from the one used by console.log, which limits the depth.

I see more CPU utilization in your graph, but I hardly see a reason for the extra pods and the usage threshold. (If you keep Node.js CPU usage at anything lower than 70-80%, you are wasting a lot of money).

Thanks for the quick reply! We've got utilization at 60% (I should bump that up), but this service is only requesting .5 CPU, which is why the pods are now hitting the horizontal autoscaler:

[CPU utilization chart]

If the answer is that this is not unexpected, I'm OK with needing more resources (or tweaking what we currently have to better utilize them) - I was just a bit surprised at how much overhead there was and thought it was worth asking. Thanks again.

jsumners commented on June 25, 2024

Can you please provide some numbers that show the discrepancy? The graphs are not very helpful.

apazzolini commented on June 25, 2024

Can you please provide some numbers that show the discrepancy? The graphs are not very helpful.

Sure! What numbers specifically would be useful? I can't take an actual CPU profile on the production instances, but I can share the numbers that underlie the charts. For this service, the k8s configuration is set to:

# container resources (Deployment spec)
limits:
  cpu: 1
requests:
  cpu: .5

# HorizontalPodAutoscaler
minReplicas: 4
maxReplicas: 8

# HPA metric
name: cpu
target:
  averageUtilization: 60

From the third chart I posted (which spans the moment we deployed Pino): before Pino, we were generally hovering between 4-6 replicas at ~60% of .5 CPU each, for a total of 1.2 - 1.4 CPUs used. After the Pino change, we immediately scaled up to our max of 8 replicas, and usage is at 2.5 CPUs. That's more than 60% of our max of 4 CPUs (8 max replicas * .5 CPU), but we aren't scaling further because of the cap.
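
Spelling out the cap arithmetic from the settings above:

8 max replicas * 0.5 CPU requested each = 4.0 CPUs requested in total
2.5 CPUs used / 4.0 CPUs requested = 62.5% average utilization, above the 60% target

so the HPA still wants to scale out, but maxReplicas stops it at 8.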

I totally get that structured logging is more expensive than writing a plain string to stdout; I was just a bit surprised that it's nearly doubling our CPU usage rather than landing somewhere in the 10% overhead range.

This isn't causing a noticeable delay in processing, however. I've measured wall time with this:

let timeSpent = 0;
// Report and reset the accumulated wall time every 10 seconds.
setInterval(() => {
  console.log(`PINO_PERFORMANCE_PROFILING USE_PINO=[${USE_PINO}] timeSpent=[${timeSpent}]`);
  timeSpent = 0;
}, 10_000);

...

// Accumulate the wall time of each individual log call.
const start = performance.now();
if (USE_PINO) {
  LOG.debug(...);
} else {
  console.log(...);
}
timeSpent += performance.now() - start;

The actual wall time spent in both paths is the same; it's just that one path requires a lot more CPU.

jsumners commented on June 25, 2024

I'm still not understanding what "a lot more CPU" actually means. Something like:

'use strict'

const pino = require('pino')
const log = pino({ destination: '/dev/null' })
const { PerformanceObserver, performance } = require('node:perf_hooks');
const var1 = 'one'
const var2 = 'two'
const var3 = 'three'
const var4 = 'four'

const obs = new PerformanceObserver((items) => {});
obs.observe({ type: 'measure' });

const cpuMetrics = []
let cpu = process.cpuUsage()
cpuMetrics.push(cpu)

performance.mark('a', { detail: cpu })
for (let i = 0; i < 10_000; i += 1) {
  console.log(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
}
cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('b', { detail: cpu })

cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('c', { detail: cpu })
for (let i = 0; i < 10_000; i += 1) {
  log.info(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`)
}
cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('d', { detail: cpu })

performance.measure('console.log', 'a', 'b')
performance.measure('pino.info', 'c', 'd')

setTimeout(() => {
  console.log('\n\n\n\n\n')
  for (const entry of performance.getEntriesByType('mark')) {
    console.log(entry)
  }

  const time1 = cpuMetrics[1].user - cpuMetrics[0].user
  const time2 = cpuMetrics[3].user - cpuMetrics[2].user
  console.log(`console.log cpu time: ${time1}µs`)
  console.log(`pino.info cpu time: ${time2}µs`)
}, 2_000)

Which on my system produces:

PerformanceMark {
  name: 'a',
  entryType: 'mark',
  startTime: 124.7484188079834,
  duration: 0,
  detail: { user: 38019, system: 34800 }
}
PerformanceMark {
  name: 'b',
  entryType: 'mark',
  startTime: 179.44191932678223,
  duration: 0,
  detail: { user: 50136, system: 7629 }
}
PerformanceMark {
  name: 'c',
  entryType: 'mark',
  startTime: 179.4865436553955,
  duration: 0,
  detail: { user: 38085, system: 34815 }
}
PerformanceMark {
  name: 'd',
  entryType: 'mark',
  startTime: 194.47896003723145,
  duration: 0,
  detail: { user: 70935, system: 9116 }
}
console.log cpu time: 12117µs
pino.info cpu time: 32850µs

apazzolini commented on June 25, 2024

I'll wrap the logs with process.cpuUsage and report back.

A question about your example: since you're using process.cpuUsage(cpu) for the second call, wouldn't cpuMetrics[1] already hold the delta in usage from element 0, and so we shouldn't subtract again for time1? I haven't used this API before, but the docs say:

The result of a previous call to process.cpuUsage() can be passed as the argument to the function, to get a diff reading.
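
For reference, a quick standalone sketch of the diff behavior the docs describe:

const before = process.cpuUsage();             // absolute usage since process start
for (let i = 0; i < 1e7; i += 1) Math.sqrt(i); // burn some CPU
const delta = process.cpuUsage(before);        // already a diff relative to `before`
console.log(delta.user, delta.system);         // µs spent in the loop; no further subtraction needed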

apazzolini commented on June 25, 2024

OK, I've got some updated numbers following the above suggestion, gathered with:

// Each instance randomly picks one logging path at startup.
const USE_PINO = Math.random() > 0.5;
let timeSpentUser = 0;
let timeSpentSystem = 0;
let numCalls = 0;
// Report and reset the accumulated CPU time every 30 seconds.
setInterval(() => {
  console.log(
    `PINO_PERFORMANCE_PROFILING USE_PINO=[${USE_PINO}] timeSpentSystem=[${timeSpentSystem}] timeSpentUser=[${timeSpentUser}] numCalls=[${numCalls}]`
  );
  timeSpentUser = 0;
  timeSpentSystem = 0;
  numCalls = 0;
}, 30_000);

...

function doThing() {
  
  ...

  const start = process.cpuUsage();
  if (USE_PINO) {
    LOG.debug(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
  } else {
    console.log(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
  }
  const end = process.cpuUsage(start); // tracks delta since we provided start
  timeSpentUser += end.user;
  timeSpentSystem += end.system;
  numCalls++;

  ...

}

Here's a sample of some of the results, sorted for easier scanning (times are in microseconds). Sometimes user CPU is higher with Pino, sometimes it's lower. The system time, however, is always quite a bit higher with Pino.

USE_PINO=[false] timeSpentSystem=[3483]  timeSpentUser=[52337] numCalls=[762]
USE_PINO=[false] timeSpentSystem=[6560]  timeSpentUser=[77116] numCalls=[1144]
USE_PINO=[false] timeSpentSystem=[6651]  timeSpentUser=[60171] numCalls=[744]
USE_PINO=[false] timeSpentSystem=[6664]  timeSpentUser=[64707] numCalls=[815]
USE_PINO=[false] timeSpentSystem=[6777]  timeSpentUser=[53418] numCalls=[725]
USE_PINO=[false] timeSpentSystem=[7417]  timeSpentUser=[55363] numCalls=[741]
USE_PINO=[false] timeSpentSystem=[7924]  timeSpentUser=[82102] numCalls=[1185]
USE_PINO=[false] timeSpentSystem=[8187]  timeSpentUser=[75061] numCalls=[1086]
USE_PINO=[false] timeSpentSystem=[8373]  timeSpentUser=[82283] numCalls=[1063]
USE_PINO=[false] timeSpentSystem=[9058]  timeSpentUser=[75110] numCalls=[1095]
USE_PINO=[false] timeSpentSystem=[10542] timeSpentUser=[90660] numCalls=[1153]
USE_PINO=[true]  timeSpentSystem=[33149] timeSpentUser=[32700] numCalls=[786]
USE_PINO=[true]  timeSpentSystem=[42430] timeSpentUser=[40303] numCalls=[897]
USE_PINO=[true]  timeSpentSystem=[43592] timeSpentUser=[38381] numCalls=[863]
USE_PINO=[true]  timeSpentSystem=[50531] timeSpentUser=[45990] numCalls=[1125]
USE_PINO=[true]  timeSpentSystem=[53014] timeSpentUser=[57296] numCalls=[1032]
USE_PINO=[true]  timeSpentSystem=[54243] timeSpentUser=[59892] numCalls=[1091]
USE_PINO=[true]  timeSpentSystem=[55062] timeSpentUser=[52429] numCalls=[1201]
USE_PINO=[true]  timeSpentSystem=[64511] timeSpentUser=[54446] numCalls=[853]
USE_PINO=[true]  timeSpentSystem=[71357] timeSpentUser=[82430] numCalls=[1029]
USE_PINO=[true]  timeSpentSystem=[76478] timeSpentUser=[66858] numCalls=[1140]
USE_PINO=[true]  timeSpentSystem=[95019] timeSpentUser=[62797] numCalls=[1122]

Here are the associated CPU charts for the two nodes

[two CPU utilization charts, one per node]

Thanks for taking a look at this! I think a likely answer is that this is indeed just unavoidable overhead, but I really appreciate the sanity check.

jsumners commented on June 25, 2024

A question about your example: since you're using process.cpuUsage(cpu) for the second call, wouldn't cpuMetrics[1] already hold the delta in usage from element 0, and so we shouldn't subtract again for time1? I haven't used this API before, but the docs say

My script may not be accurate. I cobbled it together during a short downtime window.
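
For what it's worth, a corrected sketch of the same measurement, taking a fresh absolute reading before each loop so each result is already a clean per-loop diff:

'use strict'

const pino = require('pino')
// Send pino's output to /dev/null so terminal I/O doesn't skew the numbers.
const log = pino({}, pino.destination('/dev/null'))

// Fresh absolute reading before each loop; passing it to
// process.cpuUsage() afterwards yields that loop's delta alone.
let prev = process.cpuUsage()
for (let i = 0; i < 10_000; i += 1) {
  console.log(`SomeString[${new Date().toISOString()}]`)
}
const consoleDelta = process.cpuUsage(prev)

prev = process.cpuUsage()
for (let i = 0; i < 10_000; i += 1) {
  log.info(`SomeString[${new Date().toISOString()}]`)
}
const pinoDelta = process.cpuUsage(prev)

console.log(`console.log cpu: ${consoleDelta.user}µs user / ${consoleDelta.system}µs system`)
console.log(`pino.info cpu: ${pinoDelta.user}µs user / ${pinoDelta.system}µs system`)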

mcollina commented on June 25, 2024

The overhead is likely the result of the escaping cost of the massive string. If you don’t care about it and you just want a faster console.log, you could use sonic-boom directly.
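
For illustration, a minimal sonic-boom sketch (the option names follow sonic-boom's documented API, but treat this as a starting point rather than a drop-in replacement):

'use strict'

const SonicBoom = require('sonic-boom')

// Buffered, asynchronous writes to stdout: minLength batches small
// writes together instead of issuing a syscall per log line.
const out = new SonicBoom({ fd: process.stdout.fd, minLength: 4096, sync: false })

out.write(`SomeString[${new Date().toISOString()}]\n`)

// Flush anything still buffered before the process exits.
process.on('beforeExit', () => out.flushSync())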

github-actions commented on June 25, 2024

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
