richicoder1 / opentelemetry-sdk-workers
An Otel SDK for Cloudflare Workers
The exporters, especially the ProtoExporters, are quite big. How about using the new Cloudflare ServiceBinding to run an ExporterWorker? The SDK could then be split: the minimal instrumentation code stays small and optimized for execution, and sends its data in JSON form to the ExporterWorker, which transforms it to Protobuf and sends it to the receiver.
Basically it's the idea of the OTLP collector, but fully serverless on Cloudflare.
What do you think, @RichiCoder1?
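The split proposed above could look roughly like the following on the instrumentation side. This is only a sketch: `SpanRecord`, `ExporterBinding`, `buildExportRequest`, `forwardSpans` and the `/spans` path are all hypothetical names, not part of the SDK, and the Protobuf conversion in the exporter Worker is left out.

```typescript
// Hypothetical shape of a span as the instrumented Worker would record it.
interface SpanRecord {
  traceId: string;
  spanId: string;
  name: string;
  startTimeUnixNano: string;
  endTimeUnixNano: string;
}

// Minimal view of a service binding: only its fetch method is used here.
interface ExporterBinding {
  fetch(
    url: string,
    init: { method: string; headers: Record<string, string>; body: string }
  ): Promise<{ ok: boolean; status: number }>;
}

// Build the small JSON request that is handed to the exporter Worker.
function buildExportRequest(service: string, spans: SpanRecord[]) {
  return {
    // Placeholder host: a service binding routes by binding, not by DNS.
    url: "https://exporter.internal/spans",
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ service, spans }),
    },
  };
}

// Send the batch through the binding; the heavy encoding happens on the other side.
async function forwardSpans(
  binding: ExporterBinding,
  service: string,
  spans: SpanRecord[]
): Promise<boolean> {
  const { url, init } = buildExportRequest(service, spans);
  const response = await binding.fetch(url, init);
  return response.ok;
}
```

In a Worker this would typically be called as `ctx.waitUntil(forwardSpans(env.EXPORTER, "my-service", spans))`, where `env.EXPORTER` is an assumed binding name.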
There's a placeholder fetch for subrequests, but it's not implemented yet.
OpenTelemetry recently stabilized support for logging, and I think it'd be a massive value add for this SDK since Cloudflare doesn't have a native log egress for Workers.
However, the OpenTelemetry libraries don't currently have any support for the log data model, so we'll either have to work with upstream or build it ourselves.
We'll also need to figure out how best to expose this facade.
One of the biggest warts with the Otel SDK today is that it cannot behave like the Node and Browser versions and automatically patch fetch and flow tracestate through fetch calls. This is partially because there are no guarantees about global state (and relying on it is actively discouraged) and there's no equivalent to async_hooks in Cloudflare Workers.
...
Until now! https://twitter.com/jasnell/status/1633949738516230144?s=20
Cloudflare has added a node_compat option that enables async_hooks just like Node, and is planning to implement a brand new proposal to standardize async state storage in an efficient manner.
As such, I'm planning to try and release a new version of the SDK based on hooks and monkey-patching fetch like other SDKs do, hopefully removing the need to fetch via the SDK.
Warning
This will be a breaking change compared to how it works today
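To make the idea concrete, here is a minimal sketch of how async state storage could carry trace context to a patched fetch. It assumes the Node compatibility option mentioned above exposes `node:async_hooks`. `TraceContext`, `runWithTrace` and `outgoingTraceHeaders` are hypothetical names and not the planned SDK API.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical trace context carried through the request's async call chain.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string; // 16 lowercase hex characters
  sampled: boolean;
}

const traceStorage = new AsyncLocalStorage<TraceContext>();

// Format the context as a W3C traceparent header value (version 00).
function toTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Run a handler so that everything it calls or awaits can see the context.
function runWithTrace<T>(ctx: TraceContext, handler: () => T): T {
  return traceStorage.run(ctx, handler);
}

// Headers a monkey-patched fetch would add to every outgoing request.
// Outside of runWithTrace there is no store, so nothing is added.
function outgoingTraceHeaders(): Record<string, string> {
  const ctx = traceStorage.getStore();
  return ctx ? { traceparent: toTraceparent(ctx) } : {};
}
```

With this in place, user code could call the global fetch directly, and the patched version would read the headers from storage instead of requiring calls to go through the SDK instance.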
Hey there
It would be nice if we could instrument the fetch between Cloudflare Worker service bindings.
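One possible shape for this, sketched under the assumption that a binding is just an object with a `fetch` method: wrap it so each call carries a `traceparent` header and reports its status and duration. `FetchLike`, `SubrequestRecord` and `instrumentBinding` are hypothetical names, and a real implementation would create child spans instead of plain records.

```typescript
// Minimal view of anything that exposes fetch (service binding, Durable Object stub).
interface FetchLike {
  fetch(
    url: string,
    init?: { method?: string; headers?: Record<string, string> }
  ): Promise<{ status: number }>;
}

// What the wrapper reports for each subrequest.
interface SubrequestRecord {
  name: string;
  url: string;
  status: number;
  durationMs: number;
}

// Wrap a binding so every call is timed and carries the trace header.
function instrumentBinding(
  name: string,
  binding: FetchLike,
  traceparent: string,
  record: (entry: SubrequestRecord) => void
): FetchLike {
  return {
    async fetch(url, init = {}) {
      const started = Date.now();
      // Copy the caller's headers and add the trace header on top.
      const headers = { ...(init.headers ?? {}), traceparent };
      const response = await binding.fetch(url, { ...init, headers });
      record({ name, url, status: response.status, durationMs: Date.now() - started });
      return response;
    },
  };
}
```

Because the wrapper has the same shape as the binding, it could be swapped in where `env.SOME_BINDING` is used today without changing call sites.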
On the off chance someone doesn't want to use OTLP, we should introduce a base class like the exporter base that provides support for exporting to another backend based off fetch.
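A rough sketch of what such a base class might look like follows. `FetchExporterBase`, `ExportRequest`, `SendFn` and the example `NdjsonExporter` are hypothetical and do not mirror the SDK's existing exporter base. The transport is injected so the class stays independent of any one runtime.

```typescript
interface ExportRequest {
  url: string;
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Transport function; in a Worker this would be a thin wrapper around fetch.
type SendFn = (request: ExportRequest) => Promise<{ ok: boolean; status: number }>;

abstract class FetchExporterBase<T> {
  protected readonly url: string;
  private readonly send: SendFn;

  constructor(url: string, send: SendFn) {
    this.url = url;
    this.send = send;
  }

  // Backend-specific: turn a batch of items into a request body.
  protected abstract serialize(items: T[]): string;

  // Backend-specific headers; subclasses can override (content type, API key).
  protected headers(): Record<string, string> {
    return { "content-type": "application/json" };
  }

  buildRequest(items: T[]): ExportRequest {
    return { url: this.url, method: "POST", headers: this.headers(), body: this.serialize(items) };
  }

  // Returns true when the backend accepted the batch (or there was nothing to send).
  async export(items: T[]): Promise<boolean> {
    if (items.length === 0) return true;
    const response = await this.send(this.buildRequest(items));
    return response.ok;
  }
}

// Example subclass for a backend that expects newline-delimited JSON.
class NdjsonExporter extends FetchExporterBase<{ name: string }> {
  protected serialize(items: { name: string }[]): string {
    return items.map((item) => JSON.stringify(item)).join("\n");
  }

  protected headers(): Record<string, string> {
    return { "content-type": "application/x-ndjson" };
  }
}
```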
Hey there
I tried to use the library with the native OTLP endpoint of New Relic and traces seem to work well so far. Still, I don't get any logs in (not even as errored requests). Do you have any recommendations on how to test whether it's a problem with New Relic or with the log exporter itself?
What OTLP Endpoint / service are you sending the data to?
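One way to separate the two cases is to bypass the exporter and post a minimal OTLP/HTTP JSON log record straight to the endpoint. The sketch below assumes the backend accepts OTLP JSON at `/v1/logs`. `buildMinimalLogsPayload` and `probeLogsEndpoint` are hypothetical helpers, and any authentication headers the backend needs have to be supplied by the caller.

```typescript
// Build the smallest useful OTLP/HTTP JSON logs payload: one resource, one record.
function buildMinimalLogsPayload(service: string, message: string, nowMs: number) {
  return {
    resourceLogs: [
      {
        resource: {
          attributes: [{ key: "service.name", value: { stringValue: service } }],
        },
        scopeLogs: [
          {
            scope: { name: "manual-otlp-probe" },
            logRecords: [
              {
                // Milliseconds to nanoseconds, kept as a string as OTLP JSON expects.
                timeUnixNano: `${Math.trunc(nowMs)}000000`,
                severityNumber: 9, // INFO in the OTLP severity scale
                severityText: "INFO",
                body: { stringValue: message },
              },
            ],
          },
        ],
      },
    ],
  };
}

// Post the payload directly and report what the backend answered.
async function probeLogsEndpoint(baseUrl: string, headers: Record<string, string>) {
  const response = await fetch(`${baseUrl.replace(/\/+$/, "")}/v1/logs`, {
    method: "POST",
    headers: { "content-type": "application/json", ...headers },
    body: JSON.stringify(buildMinimalLogsPayload("otlp-probe", "hello from a manual probe", Date.now())),
  });
  return { status: response.status, body: await response.text() };
}
```

If the probe record shows up in the backend but SDK logs do not, the log exporter is the more likely culprit. If the probe is rejected, the response body usually says why.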
Hi @RichiCoder1!
I am looking at the function cloneRequest:
Because of a strange implementation of the URL class inside the Workers runtime, non-encoded parameters get encoded after a toString invocation. This does not reproduce in Chrome, Firefox, or Node.js.
I should probably report this to Cloudflare, but can we also review this function overall from a usage perspective?
Is there some reason to do URL construction on request clone? Could we have something like:
downstreamRequest = new Request(request);
or
function cloneRequest(request) {
  return new Request(request.url, request);
}
Thanks!
I found some unclear behavior, and I wonder if it is intentional.
opentelemetry-sdk-workers/packages/opentelemetry-sdk-workers/src/sdk.ts
Lines 290 to 294 in 271b1b2
I would expect throw there instead of return at line 292.
Am I missing something obvious?
See https://stackblitz.com/edit/cloudflare-templates-gdw8ku?file=src%2Findex.ts
Given
const otelSdk = new WorkersSDK(request, ctx, {
  service: 'some-service-name',
  endpoint: 'https://httpbin.org/status/200',
});
try {
  console.log('fetching...');
  const upstreamResponse = await otelSdk.fetch('https://httpbin.org/uuid');
  console.log('fetched. now sending');
  return otelSdk.sendResponse(upstreamResponse);
} catch (ex) {
  console.log('ERROR fetching', ex);
  otelSdk.captureException(ex);
}
logs
fetching...
fetched. now sending
Failed to flush spans: [
Error [OTLPExporterError]: Unknown error TypeError: fetch failed
at new OTLPExporterError2 (/tmp/tmp-7-WtdYhEgYO60S/index.js:3735:24)
at eval (/tmp/tmp-7-WtdYhEgYO60S/index.js:4050:13) {
data: undefined,
code: undefined
}
]
but with this change there are no errors logged.
- endpoint: 'https://httpbin.org/status/200',
+ endpoint: 'https://httpbin.org/anything',
I'm not sure what the difference is between those two URLs. In both cases, response.ok is true.
I initially ran into this with an endpoint that was returning a 400, but then found this case which seems to trigger it even on a success.
It seems like there may be an error in the error handling/reporting itself. Perhaps an import isn't working as expected and a function's not available?
Just testing out the library ❤️
I'm wondering if I've missed a way to create custom spans.
I've got a few example use cases.
One is that we're planning on using some of the bindings not yet supported, like KV, R2 and the caches object.
And better yet, we're not always calling them directly.
For instance, with Workers Sites (https://developers.cloudflare.com/workers/platform/sites/start-from-worker/) there's the @cloudflare/kv-asset-handler utility, which handles a ton of static site boilerplate for you.
So while a handler like this works fantastically for the root span:
export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const sdk = WorkersSDK.fromEnv(request, env, ctx);
    const assetResponse = await getAssetFromKV(
      {
        request,
        waitUntil: ctx.waitUntil.bind(ctx),
      },
      {
        ASSET_NAMESPACE: env.__STATIC_CONTENT,
        ASSET_MANIFEST: env.__STATIC_CONTENT_MANIFEST,
      }
    );
    return sdk.sendResponse(assetResponse);
  },
};
There's no straightforward way to add child spans around getAssetFromKV.
Not to mention, what would one contribute to instrument the different code paths within this utility?
I tried hacking a this.traceProvider.register(); call in to set the SDK's provider as the global trace provider. That did allow me to create new spans with the regular @opentelemetry/api calls and collect them, but they're independent spans that don't pick up the context as you'd want.
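For illustration, the kind of helper being asked for might look like the sketch below. `SpanLike`, `TracerLike` and `withChildSpan` are hypothetical and deliberately not the `@opentelemetry/api` types. The point is only that the parent span is passed explicitly, so the child cannot end up detached from the request's trace.

```typescript
// Hypothetical minimal span and tracer shapes, just enough for the helper.
interface SpanLike {
  setError(error: unknown): void;
  end(): void;
}

interface TracerLike {
  startSpan(name: string, parent?: SpanLike): SpanLike;
}

// Run async work inside a child span of an explicitly given parent.
async function withChildSpan<T>(
  tracer: TracerLike,
  parent: SpanLike,
  name: string,
  work: () => Promise<T>
): Promise<T> {
  const span = tracer.startSpan(name, parent);
  try {
    return await work();
  } catch (error) {
    // Mark the span as failed, then let the caller see the original error.
    span.setError(error);
    throw error;
  } finally {
    // Always close the span, on success and on failure.
    span.end();
  }
}

// Assumed usage inside the handler above, given a tracer and the root span:
//   const assetResponse = await withChildSpan(tracer, rootSpan, "getAssetFromKV", () =>
//     getAssetFromKV({ request, waitUntil: ctx.waitUntil.bind(ctx) }, options)
//   );
```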
I tried installing in a current project that uses Workers, and in a new one via https://workers.new/typescript. They both failed, but in different ways.
I get these warnings when I npm i opentelemetry-sdk-workers, but they don't seem too bad:
npm WARN deprecated @opentelemetry/[email protected]: Please use @opentelemetry/api >= 1.3.0
npm WARN deprecated @opentelemetry/[email protected]: Please use @opentelemetry/sdk-metrics
I see the node package is getting pulled in for some reason, which then fails because the node-native package isn't available.
✘ [ERROR] Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28:
16 │ import { performance } from 'perf_hooks';
╵ ~~~~~~~~~~~~
The package "perf_hooks" wasn't found on the file system but is built into node. Are you trying to bundle for node? You can use "platform: 'node'" to do that, which will remove this error.
✘ [ERROR] Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/otlp-exporter-base/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28:
16 │ import { performance } from 'perf_hooks';
╵ ~~~~~~~~~~~~
The package "perf_hooks" wasn't found on the file system but is built into node. Are you trying to bundle for node? You can use "platform: 'node'" to do that, which will remove this error.
✘ [ERROR] Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/otlp-transformer/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28:
16 │ import { performance } from 'perf_hooks';
╵ ~~~~~~~~~~~~
The package "perf_hooks" wasn't found on the file system but is built into node. Are you trying to bundle for node? You can use "platform: 'node'" to do that, which will remove this error.
✘ [ERROR] Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/sdk-metrics-base/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28:
16 │ import { performance } from 'perf_hooks';
╵ ~~~~~~~~~~~~
The package "perf_hooks" wasn't found on the file system but is built into node. Are you trying to bundle for node? You can use "platform: 'node'" to do that, which will remove this error.
Build failed with 4 errors:
../../node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28: ERROR: Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/otlp-exporter-base/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28: ERROR: Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/otlp-transformer/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28: ERROR: Could not resolve "perf_hooks"
../../node_modules/@opentelemetry/sdk-metrics-base/node_modules/@opentelemetry/core/build/esm/platform/node/performance.js:16:28: ERROR: Could not resolve "perf_hooks"
Error
at Object.onBuildFailure (/Users/jfsiii/work/mono-lift/node_modules/@remix-run/dev/dist/cli/commands.js:183:13)
at buildEverything (/Users/jfsiii/work/mono-lift/node_modules/@remix-run/dev/dist/compiler.js:282:13)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async Object.build (/Users/jfsiii/work/mono-lift/node_modules/@remix-run/dev/dist/compiler.js:102:3)
at async Object.build (/Users/jfsiii/work/mono-lift/node_modules/@remix-run/dev/dist/cli/commands.js:178:3)
at async Object.run (/Users/jfsiii/work/mono-lift/node_modules/@remix-run/dev/dist/cli/run.js:496:7)
Reading this more closely, I saw that it was failing because of the missing performance object, so I moved import 'opentelemetry-sdk-workers/performance'; to the first line and everything was fine.
```
[mf:err] GET /: TypeError: Cannot read properties of undefined (reading 'timeOrigin')
at getTimeOrigin (/home/projects/cloudflare-templates-a8ep3p/node_modules/@opentelemetry/core/src/common/time.ts:71:14)
at timeInputToHrTime (/home/projects/cloudflare-templates-a8ep3p/node_modules/@opentelemetry/core/src/common/time.ts:112:15)
at new Span2 (/home/projects/cloudflare-templates-a8ep3p/node_modules/@opentelemetry/sdk-trace-base/src/Span.ts:102:54)
at Tracer2.startSpan (/home/projects/cloudflare-templates-a8ep3p/node_modules/@opentelemetry/sdk-trace-base/src/Tracer.ts:215:16)
at _WorkersSDK.initSpan (/home/projects/cloudflare-templates-a8ep3p/node_modules/opentelemetry-sdk-workers/dist/index.mjs:371:28)
at new _WorkersSDK (/home/projects/cloudflare-templates-a8ep3p/node_modules/opentelemetry-sdk-workers/dist/index.mjs:218:3)
at Object.fetch (/home/projects/cloudflare-templates-a8ep3p/src/index.ts:26:14)
at eval (/home/projects/cloudflare-templates-a8ep3p/node_modules/@miniflare/core/src/standards/event.ts:350:19)
at eval (/home/projects/cloudflare-templates-a8ep3p/node_modules/@miniflare/shared/src/event.ts:29:9)
```
Any idea what might be going on? Is there anything I can do differently, or any more context I can give for debugging?
On a somewhat related note, I've had some success with https://github.com/artifact-project/perf-tools/tree/master/performance and thought you might like it as inspiration / dependency for https://github.com/RichiCoder1/opentelemetry-sdk-workers/blob/main/packages/opentelemetry-sdk-workers/src/performance.ts
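The stack trace above fails while reading `timeOrigin` from an undefined `performance` object, so a shim only needs to guarantee that `performance.timeOrigin` and `performance.now()` exist before any OpenTelemetry code runs. The following is a sketch of that idea, not the SDK's actual performance.ts. `PerformanceLike` and `ensurePerformance` are hypothetical names.

```typescript
// The two members the stack trace above shows OpenTelemetry reading.
interface PerformanceLike {
  timeOrigin: number;
  now(): number;
}

// Install a Date-based shim on the target unless a usable performance object exists.
function ensurePerformance(
  target: { performance?: Partial<PerformanceLike> },
  clock: () => number = Date.now
): PerformanceLike {
  const existing = target.performance;
  if (existing && typeof existing.now === "function" && typeof existing.timeOrigin === "number") {
    // A complete implementation is already there; leave it alone.
    return existing as PerformanceLike;
  }
  const timeOrigin = clock();
  const shim: PerformanceLike = {
    timeOrigin,
    // Milliseconds elapsed since the shim was installed.
    now: () => clock() - timeOrigin,
  };
  target.performance = shim;
  return shim;
}
```

Since ES module imports are evaluated in order, a shim like this has to live in its own side-effect module that is imported first, which matches the fix of moving the performance import to the first line.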
Hey there
It would be nice if we could instrument the fetch call to Durable Objects.
Hey there
The endpoint paths are fixed to `<my_otlp_url>/v1/traces` & `<my_otlp_url>/v1/logs`, like here: https://github.com/RichiCoder1/opentelemetry-sdk-workers/blob/main/packages/opentelemetry-sdk-workers/src/exporters/OTLPJsonTraceExporter.ts
There are services which don't follow this spec, like Lightstep (https://ingest.lightstep.com:443/api/v2/otel/trace), which makes it impossible to use the lib with them.
That's why it should be possible to configure the full URL or the paths in some way.
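One way this could work is to follow the convention of the OpenTelemetry exporter environment variables, where a per-signal endpoint is used as-is and only the base endpoint gets `/v1/<signal>` appended. A sketch, with `EndpointOptions`, `tracesUrl`, `logsUrl` and `resolveSignalUrl` as hypothetical names rather than existing SDK options:

```typescript
// Hypothetical configuration: a base endpoint plus optional full per-signal URLs.
interface EndpointOptions {
  endpoint?: string;
  tracesUrl?: string;
  logsUrl?: string;
}

type Signal = "traces" | "logs";

// Prefer an explicit per-signal URL; otherwise append the default OTLP path.
function resolveSignalUrl(options: EndpointOptions, signal: Signal): string {
  const explicit = signal === "traces" ? options.tracesUrl : options.logsUrl;
  if (explicit) {
    // Used verbatim so non-standard paths keep working.
    return explicit;
  }
  if (!options.endpoint) {
    throw new Error(`No endpoint configured for ${signal}`);
  }
  // Strip trailing slashes so the result never contains "//v1".
  return `${options.endpoint.replace(/\/+$/, "")}/v1/${signal}`;
}
```

With this, `resolveSignalUrl({ tracesUrl: "https://ingest.lightstep.com:443/api/v2/otel/trace" }, "traces")` returns the Lightstep URL unchanged, while existing configurations that only set `endpoint` keep the current behavior.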
I didn't implement propagator support originally as I was worried about potential global assumptions they were making, but after more research I think it may be reasonable to expose this as a config option with a default.
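As a sketch of what "a config option with a default" could mean here, the snippet below defines a small propagator interface and a W3C Trace Context default that keeps no global state. `Propagator`, `SpanContextLike`, `w3cPropagator` and `resolvePropagator` are hypothetical names and not the `@opentelemetry/api` propagator interface. Only `traceparent` version 00 is handled.

```typescript
interface SpanContextLike {
  traceId: string;
  spanId: string;
  sampled: boolean;
}

// Header access is passed in as functions, so no globals are involved.
interface Propagator {
  extract(getHeader: (name: string) => string | null): SpanContextLike | undefined;
  inject(ctx: SpanContextLike, setHeader: (name: string, value: string) => void): void;
}

// traceparent, version 00: version-traceid-parentid-flags, all lowercase hex.
const TRACEPARENT = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

const w3cPropagator: Propagator = {
  extract(getHeader) {
    const match = TRACEPARENT.exec((getHeader("traceparent") ?? "").trim());
    if (!match) return undefined;
    const [, traceId, spanId, flags] = match;
    // All-zero ids are invalid per the Trace Context spec.
    if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
    // Bit 0 of the flags byte is the sampled flag.
    return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
  },
  inject(ctx, setHeader) {
    setHeader("traceparent", `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`);
  },
};

// Config option with a default: use the caller's propagator if one is given.
function resolvePropagator(options: { propagator?: Propagator }): Propagator {
  return options.propagator ?? w3cPropagator;
}
```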
Hey there,
When using any LogExporter and calling sdk.log.log("Test Log!"); (side note: why is it sdk.log instead of sdk.logger?), the log is not only sent to the OpenTelemetry endpoint but also logged to the console. This seems a bit off, as this should not be a concern of the built-in logger. I know diary is used underneath, so maybe this functionality could be disabled.
When somebody wants to log to both the console and Otel, they should build a logger interface which accepts log sinks, e.g. the sdk.log.
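The sink-based logger described above could be as small as the following sketch. All names (`LogSink`, `createLogger`, `consoleSink`, `otelSink`) are hypothetical. The only assumption about the SDK is that something like `sdk.log.log(message)` exists, as in the example call above.

```typescript
type LogLevel = "debug" | "info" | "warn" | "error";

interface LogEntry {
  level: LogLevel;
  message: string;
  timestamp: number;
}

// A sink is any function that consumes a log entry.
type LogSink = (entry: LogEntry) => void;

// Fan each log call out to every configured sink.
function createLogger(sinks: LogSink[], clock: () => number = Date.now) {
  const emit = (level: LogLevel, message: string): void => {
    const entry: LogEntry = { level, message, timestamp: clock() };
    for (const sink of sinks) sink(entry);
  };
  return {
    debug: (message: string) => emit("debug", message),
    info: (message: string) => emit("info", message),
    warn: (message: string) => emit("warn", message),
    error: (message: string) => emit("error", message),
  };
}

// Console output becomes an explicit opt-in sink instead of a built-in side effect.
const consoleSink: LogSink = (entry) => console.log(`[${entry.level}] ${entry.message}`);

// Adapter for an SDK-style log function, e.g. (message) => sdk.log.log(message).
function otelSink(log: (message: string) => void): LogSink {
  return (entry) => log(entry.message);
}
```

A caller who wants both destinations would write `createLogger([consoleSink, otelSink((m) => sdk.log.log(m))])`, and one who wants only Otel would leave `consoleSink` out.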
Hi,
I have a question regarding SpanKind, which is currently set to INTERNAL.
As per the docs, the INTERNAL span kind should be used to instrument internal functions/operations within an application process.
Workers are technically a remote service, so I believe they should produce SERVER spans.
Overall, this inconsistency results in Elastic APM incorrectly "drawing" such spans on a timeline, because the timestamps of worker traces differ from client/backend traces, since they are sent at different times.
The current workaround for me looks like this:
const oTelSdk = new WorkersSDK(request, ctx, env, {
  service: "myWorker",
  ...
});
oTelSdk.span.kind = SpanKind.SERVER; // ugly hack :)
Is there a possibility to change this default span kind, or to make it overridable at the API level?
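At the API level, the override could be a single optional field with a SERVER default, roughly as sketched below. `RootSpanOptions`, `spanKind` and `resolveRootSpanKind` are hypothetical. The numeric values mirror the `SpanKind` enum in `@opentelemetry/api`, redeclared here only to keep the example self-contained.

```typescript
// Values mirror the SpanKind enum in @opentelemetry/api.
const SpanKind = {
  INTERNAL: 0,
  SERVER: 1,
  CLIENT: 2,
  PRODUCER: 3,
  CONSUMER: 4,
} as const;

type SpanKindValue = (typeof SpanKind)[keyof typeof SpanKind];

// Hypothetical SDK options with an optional override for the root span's kind.
interface RootSpanOptions {
  service: string;
  spanKind?: SpanKindValue;
}

// Default to SERVER, since the Worker handles an incoming request.
function resolveRootSpanKind(options: RootSpanOptions): SpanKindValue {
  return options.spanKind ?? SpanKind.SERVER;
}
```

Callers who rely on the current behavior could still pass `spanKind: SpanKind.INTERNAL` explicitly.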