Comments (10)
Generating JSON requires slightly more overhead than plain strings: specifically, your long strings must be escaped to be valid JSON. On top of that, the algorithm used for serializing objects differs from the one used by console.log, which limits the depth.
I see more CPU utilization in your graph, but I hardly see a reason for the extra pods and the usage threshold. (If you keep Node.js CPU usage at anything lower than 70-80%, you are wasting a lot of money.)
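The escaping cost can be sketched with the stdlib alone (the `msg` field name and the payload below are illustrative, not what pino actually emits):

```javascript
// Sketch of why JSON output costs more than a raw string: JSON.stringify must
// scan the string and escape every quote/backslash, while a plain write does not.
const big = 'some "quoted" text with \\slashes\\ '.repeat(10_000);

let mark = process.cpuUsage();
for (let i = 0; i < 50; i++) {
  JSON.stringify({ msg: big });   // escaping pass on every call
}
const jsonUs = process.cpuUsage(mark).user;

mark = process.cpuUsage();
let total = 0;
for (let i = 0; i < 50; i++) {
  total += (big + '\n').length;   // plain-string path: no escaping
}
const plainUs = process.cpuUsage(mark).user;

console.log({ jsonUs, plainUs }); // user-CPU microseconds for each path
```

The escaped output is also longer than the input, since every quote and backslash expands to two characters.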
from pino.
> Generating JSON requires slightly more overhead than just strings. Specifically, your long strings must be escaped to be valid JSON. On top of that, the algorithm used for serializing objects differs from the one used by console.log, which limits the depth.
>
> I see more CPU utilization in your graph but I hardly see a reason for extra pods and usage threshold. (If you keep Node.js CPU usage at anything lower than 70-80%, you are wasting a lot of money).
Thanks for the quick reply! We've got utilization at 60% (I should bump that up), but this service only requests 0.5 CPU, which is why the pods are now hitting the horizontal autoscaler:
If the answer is that this is not unexpected, I'm OK with needing more resources (or tweaking what we currently have to utilize them better) - I was just a bit surprised at how much overhead there was and thought it was worth asking. Thanks again.
from pino.
Can you please provide some numbers that show the discrepancy? The graphs are not very helpful.
from pino.
> Can you please provide some numbers that show the discrepancy? The graphs are not very helpful.
Sure! What numbers specifically would be useful? I can't take an actual CPU profile on the production instances, but I can share the numbers that underlie the charts. For this service, the k8s configuration is set to:
limits:
  cpu: 1
requests:
  cpu: .5

minReplicas: 4
maxReplicas: 8
name: cpu
target:
  averageUtilization: 60
From the third chart I posted (which overlaps the instant where we deployed Pino), before Pino we were generally hovering between 4-6 replicas with ~60% of .5 CPU each for a total of 1.2 - 1.4 CPUs used. After the Pino change, we immediately scaled up to our max of 8, and the usage is at 2.5 CPUs. This is more than 60% of our max of 4 CPUs (8 max replicas * .5 CPU), but isn't scaling further because of the cap.
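For reference, the HPA's core scaling rule (from the Kubernetes docs) is desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization); plugging in the numbers above:

```javascript
// HPA scaling rule (per the Kubernetes docs): desired = ceil(current * usage / target).
// The inputs below are taken from this thread; this is only the arithmetic, not the HPA itself.
const desiredReplicas = (currentReplicas, currentUtilization, targetUtilization) =>
  Math.ceil(currentReplicas * (currentUtilization / targetUtilization));

// Before pino: ~6 replicas at ~60% of the 0.5-CPU request -> steady state.
console.log(desiredReplicas(6, 60, 60));   // 6

// After pino: 2.5 CPUs across 8 replicas requesting 0.5 each = 62.5% utilization,
// which still rounds up past the maxReplicas cap of 8.
console.log(desiredReplicas(8, 62.5, 60)); // 9 -> clamped to maxReplicas (8)
```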
I totally get that structured logging is more expensive than writing a plain string to stdout; I was just surprised that it's nearly doubling our CPU usage rather than adding overhead somewhere in the 10% range.
This isn't causing a noticeable delay in processing, however. I've measured wall time with this:
let timeSpent = 0;

setInterval(() => {
  console.log(`PINO_PERFORMANCE_PROFILING USE_PINO=[${USE_PINO}] timeSpent=[${timeSpent}]`);
  timeSpent = 0;
}, 10_000);

...

const start = performance.now();
if (USE_PINO) {
  LOG.debug(...);
} else {
  console.log(...);
}
timeSpent += performance.now() - start;
The actual wall time spent in both paths is the same; it's just that one path requires a lot more CPU.
from pino.
I'm still not understanding what "a lot more CPU" actually means. Something like:
'use strict'

const pino = require('pino')
const log = pino({ destination: '/dev/null' })
const { PerformanceObserver, performance } = require('node:perf_hooks');

const var1 = 'one'
const var2 = 'two'
const var3 = 'three'
const var4 = 'four'

const obs = new PerformanceObserver((items) => {});
obs.observe({ type: 'measure' });

const cpuMetrics = []
let cpu = process.cpuUsage()
cpuMetrics.push(cpu)
performance.mark('a', { detail: cpu })
for (let i = 0; i < 10_000; i += 1) {
  console.log(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
}
cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('b', { detail: cpu })

cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('c', { detail: cpu })
for (let i = 0; i < 10_000; i += 1) {
  log.info(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`)
}
cpu = process.cpuUsage(cpu)
cpuMetrics.push(cpu)
performance.mark('d', { detail: cpu })

performance.measure('console.log', 'a', 'b')
performance.measure('pino.info', 'c', 'd')

setTimeout(() => {
  console.log('\n\n\n\n\n')
  for (const entry of performance.getEntriesByType('mark')) {
    console.log(entry)
  }
  const time1 = cpuMetrics[1].user - cpuMetrics[0].user
  const time2 = cpuMetrics[3].user - cpuMetrics[2].user
  console.log(`console.log cpu time: ${time1}µs`)
  console.log(`pino.info cpu time: ${time2}µs`)
}, 2_000)
Which on my system produces:
PerformanceMark {
  name: 'a',
  entryType: 'mark',
  startTime: 124.7484188079834,
  duration: 0,
  detail: { user: 38019, system: 34800 }
}
PerformanceMark {
  name: 'b',
  entryType: 'mark',
  startTime: 179.44191932678223,
  duration: 0,
  detail: { user: 50136, system: 7629 }
}
PerformanceMark {
  name: 'c',
  entryType: 'mark',
  startTime: 179.4865436553955,
  duration: 0,
  detail: { user: 38085, system: 34815 }
}
PerformanceMark {
  name: 'd',
  entryType: 'mark',
  startTime: 194.47896003723145,
  duration: 0,
  detail: { user: 70935, system: 9116 }
}
console.log cpu time: 12117µs
pino.info cpu time: 32850µs
from pino.
I'll wrap the logs with process.cpuUsage and report back.
A question from your example: since you're using process.cpuUsage(cpu) for the second call, wouldn't cpuMetrics[1] already hold the delta in usage from element 0, and so we shouldn't subtract again for time1? I haven't used this API before, but the docs say:
> The result of a previous call to process.cpuUsage() can be passed as the argument to the function, to get a diff reading.
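A quick stdlib check of those diff semantics (per the Node.js docs, an absolute reading should be the value passed back in):

```javascript
// process.cpuUsage() with no argument returns *cumulative* usage since process
// start. Passing a previous absolute reading back in returns the delta since
// that reading, so subtracting again would double-count.
const before = process.cpuUsage();          // absolute reading

for (let i = 0; i < 1e6; i++) Math.sqrt(i); // burn a little CPU

const delta = process.cpuUsage(before);     // usage since `before`, not absolute
const after = process.cpuUsage();           // fresh absolute reading

// delta.user is already (roughly) after.user - before.user.
console.log({ deltaUser: delta.user, afterUser: after.user });
```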
from pino.
OK, I've got some updated numbers following the above suggestion achieved with:
const USE_PINO = Math.random() > 0.5;
let timeSpentUser = 0;
let timeSpentSystem = 0;
let numCalls = 0;

setInterval(() => {
  console.log(
    `PINO_PERFORMANCE_PROFILING USE_PINO=[${USE_PINO}] timeSpentSystem=[${timeSpentSystem}] timeSpentUser=[${timeSpentUser}] numCalls=[${numCalls}]`
  );
  timeSpentUser = 0;
  timeSpentSystem = 0;
  numCalls = 0;
}, 30_000);

...

function doThing() {
  ...
  const start = process.cpuUsage();
  if (USE_PINO) {
    LOG.debug(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
  } else {
    console.log(`SomeString[${var1}][${new Date().toISOString()}][${var2}][${var3}][${var4}]`);
  }
  const end = process.cpuUsage(start); // tracks delta since we provided start
  timeSpentUser += end.user;
  timeSpentSystem += end.system;
  numCalls++;
  ...
}
Here's a sample of some of the results (sorted for easier scanning). Sometimes user CPU is higher with Pino, sometimes it's lower. The system time, however, is always quite a bit higher with Pino.
USE_PINO=[false] timeSpentSystem=[3483] timeSpentUser=[52337] numCalls=[762]
USE_PINO=[false] timeSpentSystem=[6560] timeSpentUser=[77116] numCalls=[1144]
USE_PINO=[false] timeSpentSystem=[6651] timeSpentUser=[60171] numCalls=[744]
USE_PINO=[false] timeSpentSystem=[6664] timeSpentUser=[64707] numCalls=[815]
USE_PINO=[false] timeSpentSystem=[6777] timeSpentUser=[53418] numCalls=[725]
USE_PINO=[false] timeSpentSystem=[7417] timeSpentUser=[55363] numCalls=[741]
USE_PINO=[false] timeSpentSystem=[7924] timeSpentUser=[82102] numCalls=[1185]
USE_PINO=[false] timeSpentSystem=[8187] timeSpentUser=[75061] numCalls=[1086]
USE_PINO=[false] timeSpentSystem=[8373] timeSpentUser=[82283] numCalls=[1063]
USE_PINO=[false] timeSpentSystem=[9058] timeSpentUser=[75110] numCalls=[1095]
USE_PINO=[false] timeSpentSystem=[10542] timeSpentUser=[90660] numCalls=[1153]
USE_PINO=[true] timeSpentSystem=[33149] timeSpentUser=[32700] numCalls=[786]
USE_PINO=[true] timeSpentSystem=[42430] timeSpentUser=[40303] numCalls=[897]
USE_PINO=[true] timeSpentSystem=[43592] timeSpentUser=[38381] numCalls=[863]
USE_PINO=[true] timeSpentSystem=[50531] timeSpentUser=[45990] numCalls=[1125]
USE_PINO=[true] timeSpentSystem=[53014] timeSpentUser=[57296] numCalls=[1032]
USE_PINO=[true] timeSpentSystem=[54243] timeSpentUser=[59892] numCalls=[1091]
USE_PINO=[true] timeSpentSystem=[55062] timeSpentUser=[52429] numCalls=[1201]
USE_PINO=[true] timeSpentSystem=[64511] timeSpentUser=[54446] numCalls=[853]
USE_PINO=[true] timeSpentSystem=[71357] timeSpentUser=[82430] numCalls=[1029]
USE_PINO=[true] timeSpentSystem=[76478] timeSpentUser=[66858] numCalls=[1140]
USE_PINO=[true] timeSpentSystem=[95019] timeSpentUser=[62797] numCalls=[1122]
Here are the associated CPU charts for the two nodes.
Thanks for taking a look at this! I think a likely answer is that this is indeed just unavoidable overhead, but I really appreciate the sanity check.
from pino.
> A question from your example: since you're using process.cpuUsage(cpu) for the second call, wouldn't cpuMetrics[1] already hold the delta in usage from element 0 and so we shouldn't subtract again for time1? I haven't used this API before, but the docs say
My script may not be accurate. I cobbled it together during a short downtime window.
from pino.
The overhead is likely the result of the escaping cost of the massive string. If you don’t care about it and you just want a faster console.log, you could use sonic-boom directly.
from pino.
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
from pino.