uptrace / uptrace

Open source APM: OpenTelemetry traces, metrics, and logs

Home Page: https://uptrace.dev/get/open-source-apm.html

License: GNU Affero General Public License v3.0

Languages: Go 51.1%, Vue 36.3%, TypeScript 9.1%, HTML 1.9%, Python 0.4%, Smarty 0.3%, Shell 0.25%, JavaScript 0.2%, SCSS 0.2%, Makefile 0.12%, Ruby 0.1%, Dockerfile 0.04%
Topics: apm, application-monitoring, clickhouse, distributed-tracing, golang, logs, metrics, monitoring, observability, opentelemetry, performance-monitoring, self-hosted, tracing, vue

uptrace's Introduction

Languages: English | 简体中文

Open source APM: OpenTelemetry traces, metrics, and logs


Uptrace is an open source APM that supports distributed tracing, metrics, and logs. You can use it to monitor applications and troubleshoot issues.

Uptrace comes with an intuitive query builder, rich dashboards, alerting rules, notifications, and integrations for most languages and frameworks.

Uptrace can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.

Uptrace uses the OpenTelemetry framework to collect data and the ClickHouse database to store it. It also requires a PostgreSQL database to store metadata such as metric names and alerts.

Features:

  • Single UI for traces, metrics, and logs.
  • 50+ pre-built dashboards that are automatically created once metrics start coming in.
  • Service graph and chart annotations.
  • Spans/logs/metrics monitoring with notifications via Email, Slack, WebHook, and AlertManager.
  • SQL-like query language to aggregate spans (see the example after this list).
  • PromQL-like language to aggregate metrics.
  • Data ingestion using OpenTelemetry, Prometheus, Vector, FluentBit, CloudWatch, and more.
  • Grafana compatibility. You can configure Grafana to use Uptrace as a Tempo/Prometheus datasource.
  • Managing users/projects via YAML config.
  • Single sign-on (SSO) using OpenID Connect: Keycloak, Google Cloud, and Cloudflare.
  • Efficient processing: more than 10K spans / second on a single core.
  • Excellent on-disk compression: 1KB span can be compressed down to ~40 bytes.
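
For a flavor of the span query language, here is a representative pipeline (taken verbatim from the queries that appear in the issues below): it groups spans and computes a rate, an error percentage, and latency percentiles in one pass:

group by span.group_id | span.count_per_min | span.error_pct | p50(span.duration) | p90(span.duration) | p99(span.duration)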

Screenshots: System overview · Faceted filters · Metrics · Alerts

Quickstart

You can try Uptrace in just a few minutes by visiting the cloud demo (no login required) or by running it locally with Docker.

Then follow the getting started guide.
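
For the Docker route, the typical steps are cloning the repository and starting the bundled docker-compose example (directory path per the repo layout at the time of writing — check the getting started guide if it has moved):

git clone https://github.com/uptrace/uptrace.git
cd uptrace/example/docker
docker-compose up -d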

Help

Have questions? Get help via Telegram, Slack, or start a discussion on GitHub.

Contributing

See Contributing to Uptrace.

uptrace's People

Contributors

altanozlu, anloubie2, anmic, aramperes, bmike78, cuishuang, fifteen-clement, firefishy, gabriel-v, gaby, grrttedwards, gzuidhof, hecorr, jacovinus, leonyu879, lmangani, monkey92t, mrfoxpro, nobles5e, patsevanton, rattuscz, rjasonadams, singler, sti26, toby1991, vmihailenco


uptrace's Issues

Clickhouse request errors

Good day!

I started setting up Uptrace via Docker and got errors in the UI:


*ch.Error: DB::Exception: Unknown function toFloat64OrDefault. Maybe you meant: ['toFloat64OrNull','dictGetFloat64OrDefault']: While processing toFloat64OrDefault(span.duration)

Uptrace logs:

[bunrouter]  09:45:30.094   500     16.918ms   GET      /api/tracing/groups?time_gte=2021-12-29T08:46:00.000Z&time_lt=2021-12-29T09:46:00.000Z&query=group+by+span.group_id+%7C+span.count_per_min+%7C+span.error_pct+%7C+p50(span.duration)+%7C+p90(span.duration)+%7C+p99(span.duration)&system=http:unknown_service          *ch.Error: DB::Exception: Unknown function toFloat64OrDefault. Maybe you meant: ['toFloat64OrNull','dictGetFloat64OrDefault']: While processing toFloat64OrDefault(`span.duration`)

[ch]  09:45:30.696   SELECT               68.642ms  SELECT count() / 60 AS "span.count_per_min", countIf(`span.status_code` = 'error') / count() AS "span.error_pct", quantileTDigest(0.5)(toFloat64OrDefault(s."span.duration")) AS "p50(span.duration)", quantileTDigest(0.9)(toFloat64OrDefault(s."span.duration")) AS "p90(span.duration)", quantileTDigest(0.99)(toFloat64OrDefault(s."span.duration")) AS "p99(span.duration)", s."span.group_id" AS "span.group_id", any(s."span.system") AS "span.system", any(s."span.name") AS "span.name" FROM "spans_index_buffer" AS "s" WHERE (s.`span.time` >= '2021-12-29 08:46:00') AND (s.`span.time` < '2021-12-29 09:46:00') AND (s.`span.system` = 'http:unknown_service') GROUP BY "span.group_id" LIMIT 1000       *ch.Error: DB::Exception: Unknown function toFloat64OrDefault. Maybe you meant: ['toFloat64OrNull','dictGetFloat64OrDefault']: While processing toFloat64OrDefault(`span.duration`) 

[bunrouter]  09:45:30.607   500    108.234ms   GET      /api/tracing/groups?time_gte=2021-12-29T08:46:00.000Z&time_lt=2021-12-29T09:46:00.000Z&query=group+by+span.group_id+%7C+span.count_per_min+%7C+span.error_pct+%7C+p50(span.duration)+%7C+p90(span.duration)+%7C+p99(span.duration)&system=http:unknown_service          *ch.Error: DB::Exception: Unknown function toFloat64OrDefault. Maybe you meant: ['toFloat64OrNull','dictGetFloat64OrDefault']: While processing toFloat64OrDefault(`span.duration`)

ClickHouse version: altinity/clickhouse-server:21.8.12.1.testingarm (because I have a MacBook with an M1 chip)
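
For context, toFloat64OrDefault only exists in newer ClickHouse releases, which is why this 21.8 Altinity build rejects it. On older servers an equivalent expression can be assembled from toFloat64OrNull — an illustrative rewrite of the failing aggregate, not the project's actual fix:

SELECT quantileTDigest(0.5)(
    ifNull(toFloat64OrNull(toString(`span.duration`)), 0)
) AS "p50(span.duration)"
FROM spans_index_buffer
WHERE `span.time` >= '2021-12-29 08:46:00'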

Error sorting by wrong column span.count_per_min

http://oteldev-01.moncc.net:14318/explore/13385337/spans?time_dur=3600&query=group%20by%20span.group_id%20%7C%20span.count_per_min%20%7C%20span.error_pct%20%7C%20%7Bp50,p90,p99%7D%28span.duration%29%20%7C%20where%20span.group_id%20%3D%20%228962033991366417035%22&system=http%3Aotel-ui-dev&sort_by=span.count_per_min&sort_dir=desc

*ch.Error: DB::Exception: Missing columns: 'span.count_per_min' while processing query: 'SELECT span.id, span.trace_id FROM spans_index_buffer AS s WHERE (project_id = 13385337) AND (span.time >= toDateTime('2022-04-15 20:29:00', 'UTC')) AND (span.time < toDateTime('2022-04-15 21:29:00', 'UTC')) AND (span.system = 'http:otel-ui-dev') AND (span.group_id = '8962033991366417035') ORDER BY span.count_per_min DESC LIMIT 10', required columns: 'span.id' 'span.trace_id' 'project_id' 'span.time' 'span.system' 'span.count_per_min' 'span.group_id', maybe you meant: ['span.id','span.trace_id','project_id','span.time','span.system','span.group_id']

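The error makes sense once you notice that span.count_per_min is not a physical column: it is an alias computed by the aggregating query (count() / 60, as in the logs above), so a sort on it has to run in the GROUP BY query, not in the raw-span lookup that fetches span.id/span.trace_id. Roughly — a sketch following the query shapes above, not the actual fix:

SELECT `span.group_id`, count() / 60 AS `span.count_per_min`
FROM spans_index_buffer AS s
WHERE (project_id = 13385337) AND (`span.system` = 'http:otel-ui-dev')
GROUP BY `span.group_id`
ORDER BY `span.count_per_min` DESC
LIMIT 10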

Duration in Explore view seems to be 1000 too small

Hello,

I am using

  • uptrace 1.0.3
  • Data comes from
    • Opentelemetry collector
    • Jaeger + Opentracing source data

I have some issues with the duration unit in some dashboards.

On the dashboard /explore/<project_id>/groups, I see:

We see that for group "Celery:run:check_clients_nonprd_resources" we have a max span duration of 302 ms:

  • In the API call, we can see the max value is 301977114000, which is ~301e9.
  • When I deep dive, I see that the longest spans are 5 minutes, which is in fact 301 seconds (I verified with Jaeger).
  • When I look in the ClickHouse database (via the query log), I replayed the ClickHouse query:
SELECT
    group_id AS `span.group_id`,
    sum(count) / 60 AS `span.count_per_min`,
    sumIf(count, status_code = 'error') / sum(count) AS `span.error_pct`,
    quantileTDigest(0.5)(toFloat64OrDefault(duration)) AS `p50(span.duration)`,
    quantileTDigest(0.9)(toFloat64OrDefault(duration)) AS `p90(span.duration)`,
    quantileTDigest(0.99)(toFloat64OrDefault(duration)) AS `p99(span.duration)`,
    max(duration) AS `max(span.duration)`,
    any(system) AS `span.system`,
    any(name) AS `span.name`,
    any(event_name) AS `span.event_name`
FROM spans_index AS s
WHERE (project_id = 2) AND (time >= toDateTime('2022-09-21 12:02:00', 'UTC')) AND (time < toDateTime('2022-09-21 13:02:00', 'UTC')) AND (system = 'service:ccp-paris_prd') AND (kind = 'server')
GROUP BY `span.group_id`
ORDER BY `p99(span.duration)` DESC
LIMIT 1000

Query id: 32d3e1dd-3116-428a-ba9b-142c58e1324e

┌────────span.group_id─┬───span.count_per_min─┬─────span.error_pct─┬─p50(span.duration)─┬─p90(span.duration)─┬─p99(span.duration)─┬─max(span.duration)─┬─span.system───────────┬─span.name─────────────────────────────────────────────────────┬─span.event_name─┐
│ 14900324603554143372 │                 0.85 │                  0 │       284028930000 │       296918750000 │       301977100000 │       301977114000 │ service:ccp-paris_prd │ Celery:run:check_clients_nonprd_resources                     │                 │

I suspect an issue in the frontend visualization. As I am not a Vue expert, I was not able to dig in.
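
Worked through, assuming span durations are stored as nanoseconds (which the Jaeger cross-check above supports):

301977114000 ns / 1e9 = 301.98 s ≈ 5 min   -- matches Jaeger
value shown in the UI: 302 ms = 0.302 s    -- exactly 1000x too small

So the number itself survives the pipeline intact; only the unit is off by 1000, as if the frontend divides by 1e9 but then labels the result in milliseconds.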

Add last value to Alertmanager templates

Quote from Telegram:

I managed to connect Telegram notifications :)
With the example FS alert rule from the documentation, Alertmanager gets these labels from Uptrace:

Labels:

  • alertname = Filesystem usage >= 5%
  • device = /dev/sdb1
  • host_name = cc-prod
  • project_id = 3
  • rule_query = group by host.name | group by device | where device !~ "loop" | $fs_usage{state="used"} / $fs_usage >= 0.05

And I'm building some nice notifications:

[ Filesystem usage >= 5% ][ cc-test ][ 3 ]
Status: FIRING 🔥

Alerts:
Device: /dev/mapper/fivegen--cleancity--vg-root
Device: /dev/sda1

But I wonder, how can I modify the alert rule for Uptrace to provide a measurable value to Alertmanager?
Like this:
Device: /dev/mapper/fivegen--cleancity--vg-root (35% used)
Device: /dev/sda1 (47% used)

How to set up tail-based sampling

Our Golang programs export data to Uptrace on ports 14317/14318 with the UPTRACE_DSN header, but we do not know how to do tail-based sampling.
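
Tail-based sampling is typically done in an OpenTelemetry Collector placed between the apps and Uptrace, using the contrib tail_sampling processor, since the sampling decision needs all spans of a trace. A minimal sketch (endpoint, DSN, and policy thresholds are placeholders, not recommendations):

receivers:
  otlp:
    protocols:
      grpc:

processors:
  batch:
  tail_sampling:
    decision_wait: 10s              # hold spans until the whole trace has arrived
    policies:
      - name: keep-errors
        type: status_code
        status_code: {status_codes: [ERROR]}
      - name: keep-slow
        type: latency
        latency: {threshold_ms: 500}

exporters:
  otlp:
    endpoint: uptrace-host:14317
    tls:
      insecure: true
    headers:
      uptrace-dsn: <your dsn>

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [tail_sampling, batch]
      exporters: [otlp]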

WordPress integration

Hello

I'm new to APM in general; I have only tried New Relic for a short time, and I wonder how this compares to New Relic, and specifically how it works for e.g. WordPress applications (and others)?
Does this work "out of the box" to collect metrics, slow queries, slow PHP functions, etc., or does it also require custom/special development on the application side to have it working in Uptrace?

I know New Relic requires us to install their plugin. Is there some equivalent for Uptrace?
Or can we do something in the root application to make it work?

Thanks in advance and very nice solution!
Looking forward to trying it in a few of our projects.

Feature: Save Filters / Configurable Default Landing Page

It could be beneficial to allow the main summary view in Uptrace to be configured by a particular attribute. Right now there are Systems, Services, and Hosts, but say you had an attribute of app.name: it would be useful to add additional views based on attributes users have set, to group by Application, etc.


We may look at how this could be implemented and could look to contribute, but thought I would submit an issue as well.

Thanks!

Vendor third-party stylesheets

The UI currently makes 2 third-party requests for fonts and icons:

<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Roboto:100,300,400,500,700,900">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/@mdi/font@latest/css/materialdesignicons.min.css">

While this works well for public access, some Uptrace hosted installations are behind firewalls that do not have access to these CDNs or the internet. It would be beneficial to vendor those files during the build.

API: Enable Jaeger Compatible Format

Jaeger is still widely in use. Enabling a way to write traces to Uptrace and retrieve them in Jaeger format could help users migrate towards the Uptrace UI/interface. This would avoid the duplicate storage that running Jaeger alongside Uptrace during a transition would require, and would let users view traces within Grafana if needed.

Operations used in Grafana

/api/services
	=> /api/services
/api/operations
	=> /api/services/MyService/operations
/api/traces
	=> /api/traces?operation=/&service=test_service&start=1664979972328000&end=1664983572328000&lookback=custom
/traces/{trace_id}
	=> /api/traces/{trace_id}

Reference: https://github.com/jaegertracing/jaeger-idl/blob/main/proto/api_v2/query.proto

Update screenshots to use correct redirect url

So I'm trying to set up OpenID Connect using Google, following this tutorial: https://uptrace.dev/get/auth-google.html. It looks like the screenshot and the Uptrace config don't match: in the screenshot, the Authorized redirect URI is set to http://localhost:14318/api/v1/sso/oidc/callback, but the Uptrace config provides id: google, which means the Authorized redirect URI is supposed to be http://localhost:14318/api/v1/sso/google/callback (see the config sketch below).

The same issue exists with the Keycloak docs - https://uptrace.dev/get/auth-keycloak.html#create-a-client-for-uptrace

If anyone has a correct screenshot, please upload it here.

  • update google screenshot
  • update keycloak screenshot
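
For reference, the callback path is derived from the provider's id field, so the two must agree. A sketch of the relevant config section (key names assumed from the tutorials above; values are placeholders):

auth:
  oidc:
    - id: google                     # -> /api/v1/sso/google/callback
      display_name: Google
      issuer_url: https://accounts.google.com
      client_id: <client id>
      client_secret: <client secret>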

Can I follow the same trace I started in the client application on the server?

What I want to do is to put the tracking I do in the client application and the tracking I do on the server into the same trace. Can I collect the trace id or something similar so that the server does not create a new trace with a different id?

I want to add, under biuwer_signin_success, another trace that comes from the server application instead of the client application.
My programming language is JS.
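
For reference, this is exactly what W3C trace context propagation solves: the client sends a traceparent header with each outgoing request, and an instrumented server continues that trace instead of starting a new one. The header (ids below are the W3C spec's example values) looks like:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
             version - trace id (shared by client and server) - parent span id - flags

OpenTelemetry JS adds and reads this header automatically when both sides are instrumented; for browser-to-server calls, the server's CORS configuration must also allow the traceparent header.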

Error parsing metrics from otel-collector prometheus receiver

Trying to collect prometheus metrics with otel-collector and pass them to uptrace.
Using prometheus_simple receiver and otlp exporter.

receivers:
  prometheus_simple:
    collection_interval: 10s
    endpoint: "host:port"
    metrics_path: "/metrics"
    tls_enabled: true
    tls:
      insecure_skip_verify: true

processors:
  resourcedetection:
    detectors: [system,env,docker]
  batch:
    send_batch_size: 10000
    timeout: 10s

exporters:
  otlp:
    endpoint: *endpoint*
    tls:
      insecure: true
    headers:
      uptrace-dsn: *dsn*

service:
  pipelines:
    metrics/prometheus_simple:
      receivers: [prometheus_simple]
      processors: [batch, resourcedetection]
      exporters: [otlp,logging]

But Uptrace logs an error for this pipeline.

uptrace_1       | 2022-09-29T16:02:41.300Z      error   metrics/otlp.go:203     unknown metric  {"type": "*v1.Metric_Summary"}

Roadmap

  • Metrics support using OpenTelemetry and Prometheus-like API
    • Accept OpenTelemetry metrics via OTLP and store in ClickHouse
    • Support AlertManager
    • Port Uptrace metrics
    • Do some thorough testing of how Uptrace + Prometheus + AlertManager work together
    • Documentation
  • Allow pinning attributes on the overview page - #25
  • Search via AppBar

Stage 2

  • Add a history of recent queries
  • Allow editing YAML configs (uptrace, prometheus, and alertmanager) from the Uptrace UI

Optional / late stage

  • Display & manage Prometheus Alerts from Uptrace UI
  • Add a Service Map

[question] is the ingestion to clickhouse buffered?

I mean, ClickHouse can only handle about 100 batch inserts per second before it returns code: 252, message: Too many parts (300). Merges are processing significantly slower than inserts. The question is: are the inserts buffered/batched?
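
The table names that appear in the logs above (spans_index_buffer) suggest the answer: ClickHouse Buffer tables sit in front of the MergeTree tables and coalesce many small inserts into large parts. A generic sketch of that pattern — the flush thresholds are illustrative, not Uptrace's actual values:

CREATE TABLE spans_index_buffer AS spans_index
ENGINE = Buffer(currentDatabase(), spans_index,
    2,                   -- num_layers
    5, 10,               -- min_time, max_time (seconds)
    10000, 1000000,      -- min_rows, max_rows
    10000000, 100000000) -- min_bytes, max_bytes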

Initial migrations no longer work

Description

If a new deployment is created with the latest uptrace-dev docker image, the initial Clickhouse migrations will not pass.

This is because the Clickhouse formatter does not fully replace all arguments within the migrations.

The following lines declare which arguments exist:

uptrace/pkg/bunapp/app.go

Lines 363 to 370 in 93bb17c

fmter := db.Formatter().
	WithNamedArg("DB", ch.Safe(db.Config().Database)).
	WithNamedArg("SPANS_TTL", ch.Safe(app.conf.Spans.TTL)).
	WithNamedArg("METRICS_TTL", ch.Safe(app.conf.Metrics.TTL)).
	WithNamedArg("REPLICATED", ch.Safe(replicated)).
	WithNamedArg("CODEC", ch.Safe(compression)).
	WithNamedArg("CLUSTER", ch.Safe(app.conf.CHSchema.Cluster)).
	WithNamedArg("ON_CLUSTER", ch.Safe(onCluster))

But the resulting query looks like this:

CREATE TABLE uptrace.spans_index ?ON_CLUSTER (
  project_id UInt32 Codec(DoubleDelta, ?CODEC),
  system LowCardinality(String) Codec(?CODEC),
  group_id UInt64 Codec(Delta, ?CODEC),

  trace_id UUID Codec(?CODEC),
  id UInt64 Codec(?CODEC),
  parent_id UInt64 Codec(?CODEC),
  name LowCardinality(String) Codec(?CODEC),
  event_name String Codec(?CODEC),
  is_event UInt8 ALIAS event_name != '',
  kind LowCardinality(String) Codec(?CODEC),
  time DateTime Codec(Delta, ?CODEC),
  duration Int64 Codec(Delta, ?CODEC),
  count Float32 Codec(?CODEC),

  status_code LowCardinality(String) Codec(?CODEC),
  status_message String Codec(?CODEC),

  link_count UInt8 Codec(?CODEC),
  event_count UInt8 Codec(?CODEC),
  event_error_count UInt8 Codec(?CODEC),
  event_log_count UInt8 Codec(?CODEC),

  all_keys Array(LowCardinality(String)) Codec(?CODEC),
  attr_keys Array(LowCardinality(String)) Codec(?CODEC),
  attr_values Array(String) Codec(?CODEC),

  "service.name" LowCardinality(String) Codec(?CODEC),
  "host.name" LowCardinality(String) Codec(?CODEC),

  "db.system" LowCardinality(String) Codec(?CODEC),
  "db.statement" String Codec(?CODEC),
  "db.operation" LowCardinality(String) Codec(?CODEC),
  "db.sql.table" LowCardinality(String) Codec(?CODEC),

  "log.severity" LowCardinality(String) Codec(?CODEC),
  "log.message" String Codec(?CODEC),

  "exception.type" LowCardinality(String) Codec(?CODEC),
  "exception.message" String Codec(?CODEC),

  INDEX idx_attr_keys attr_keys TYPE bloom_filter(0.01) GRANULARITY 64,
  INDEX idx_duration duration TYPE minmax GRANULARITY 1
)
ENGINE = ?REPLICATEDMergeTree()
ORDER BY (project_id, system, group_id, time)
PARTITION BY toDate(time)
TTL toDate(time) + INTERVAL ?SPANS_TTL DELETE
SETTINGS ttl_only_drop_parts = 1,
         storage_policy = ?SPANS_STORAGE

Only the first argument ?DB is replaced correctly.

Detailed guide

Can you provide a detailed guide in the README, from configuration to usage with different languages?

read project list from a dynamic source (feature request)

Feature request: Read the project list periodically from a dynamic source such as a URL, or from the output of a script.

Currently, a reconfiguration and a service restart are required, which is not ideal for complex environments.

Fix migrations with distributed tables

CREATE TABLE measure_minutes_dist ON CLUSTER company_cluster AS measure_minutes 
ENGINE = Distributed(company_cluster, currentDatabase(), measure_minutes) 
         *ch.Error: DB::Exception: There was an error on [clickhouse01:9000]: Code: 60. DB::Exception: Table uptrace.measure_minutes doesn't exist. (UNKNOWN_TABLE) (version 22.1.3.7 (official build))
2022-10-21T20:18:08.712+0300    error   uptrace/main.go:292     ClickHouse migrations failed    {"error": "DB::Exception: There was an error on [clickhouse01:9000]: Code: 60. DB::Exception: Table uptrace.measure_minutes doesn't exist. (UNKNOWN_TABLE) (version 22.1.3.7 (official build))"} 
main.initClickhouse 
        github.com/uptrace/uptrace/cmd/uptrace/main.go:292 
main.glob..func2 
        github.com/uptrace/uptrace/cmd/uptrace/main.go:133 
github.com/urfave/cli/v2.(*Command).Run 
        github.com/urfave/cli/[email protected]/command.go:169 
github.com/urfave/cli/v2.(*App).RunContext 
        github.com/urfave/cli/[email protected]/app.go:378 
github.com/urfave/cli/v2.(*App).Run 
        github.com/urfave/cli/[email protected]/app.go:251 
main.main 
        github.com/uptrace/uptrace/cmd/uptrace/main.go:68 
runtime.main 
        runtime/proc.go:250 
DB::Exception: There was an error on [clickhouse01:9000]: Code: 60. DB::Exception: Table uptrace.measure_minutes doesn't exist. (UNKNOWN_TABLE) (version 22.1.3.7 (official build))

Add Distributed Table to Uptrace Schema

With the addition of ch_schema.replicated, it would also be useful to have an option to use a Distributed engine that points to either the MergeTree or ReplicatedMergeTree tables.

In some cases, you may want to load balance across servers and not necessarily use replication.
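
A sketch of the kind of DDL such an option might emit, following the Distributed-table pattern from the migration error above (cluster name is a placeholder):

CREATE TABLE spans_index_dist ON CLUSTER my_cluster AS spans_index
ENGINE = Distributed(my_cluster, currentDatabase(), spans_index, rand())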

Thanks!

Add support for Sentry ingestion API / accepting spans from Sentry SDK

Uptrace should accept data from Sentry SDK/clients in Sentry format and store it in the Uptrace database. The main benefit is the ability to use https://github.com/getsentry/sentry-javascript which is years ahead of opentelemetry-js.

A good start would be to configure sentry-go to send data to Uptrace:

package main

import (
	"log"
	"time"

	"github.com/getsentry/sentry-go"
)

func main() {
	err := sentry.Init(sentry.ClientOptions{
		Dsn: "http://examplePublicKey@localhost:14318/0",
		// Enable printing of SDK debug messages.
		// Useful when getting started or trying to figure something out.
		Debug: true,
	})
	if err != nil {
		log.Fatalf("sentry.Init: %s", err)
	}
	// Flush buffered events before the program terminates.
	// Set the timeout to the maximum duration the program can afford to wait.
	defer sentry.Flush(2 * time.Second)
}

Then check what data the Sentry SDK sends to the backend and try to map it to internal Uptrace data structures.

The Sentry API looks rather simple and clean.
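
For orientation, the SDK derives the ingestion endpoints from the DSN, so with the DSN above (project id 0) events would arrive at paths like these, per Sentry's documented ingestion protocol:

POST http://localhost:14318/api/0/envelope/   (modern envelope format)
POST http://localhost:14318/api/0/store/      (legacy JSON events)

An Uptrace handler would need to mount routes at those paths and map the envelope items onto spans and logs.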

Support an external database (postgres, mysql)

Supporting an external SQL database as an alternative to sqlite would be useful for running Uptrace in a stateless environment like Kubernetes. This could allow Uptrace to run with multiple replicas, and enable rolling restarts to reduce downtime.

Keeping sqlite as the default reduces dependencies for a quick setup, and still works well in systemd and docker-compose setups.

Personally I have no preference between Postgres and MySQL. If Bun is agnostic to all 3 options, that's even better.
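
Bun is indeed dialect-agnostic, so the switch is mostly a matter of which driver and dialect get wired up at startup. A sketch of the PostgreSQL case, not Uptrace's actual wiring (note the README above shows this eventually landed: PostgreSQL is now required for metadata):

package main

import (
	"database/sql"

	"github.com/uptrace/bun"
	"github.com/uptrace/bun/dialect/pgdialect"
	"github.com/uptrace/bun/driver/pgdriver"
)

// openPostgres returns a bun.DB backed by PostgreSQL instead of SQLite.
func openPostgres(dsn string) *bun.DB {
	sqldb := sql.OpenDB(pgdriver.NewConnector(pgdriver.WithDSN(dsn)))
	return bun.NewDB(sqldb, pgdialect.New())
}

func main() {
	db := openPostgres("postgres://user:pass@localhost:5432/uptrace?sslmode=disable")
	defer db.Close()
}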

In the example, docker-compose up failed

docker-uptrace-1 | *ch.Error: DB::Exception: Missing columns: 'time' while processing query: 'project_id, span.system, span.group_id, time', required columns: 'project_id' 'span.system' 'time' 'span.group_id' 'project_id' 'span.system' 'time' 'span.group_id'
docker-uptrace-1 | [ch] 09:03:18.366 ALTER 5.539ms ALTER TABLE ch_migration_locks DROP COLUMN col1
docker-uptrace-1 | 2022/03/23 09:03:18 DB::Exception: Missing columns: 'time' while processing query: 'project_id, span.system, span.group_id, time', required columns: 'project_id' 'span.system' 'time' 'span.group_id' 'project_id' 'span.system' 'time' 'span.group_id'

Add some templating capabilities to AlertManager templates

For #104

After v1.2.0 is released, we will have attributes and metrics values as the data in alerts, for example:

{
    "rule": "$goroutines >= 0 group by host.name, service.name",
    "attrs": {"host.name": "localhost", "service.name": "myservice"},
    "metrics": {"$goroutines": 28}
}

The goal is to allow users to customize alerts, for example, to get a message like $goroutines (28) is >= 100 on host.name=localhost service.name=myservice we could have a template like $goroutines ({{ $.Metrics["$goroutines"] }}) is >= 100 on {{ $.Attrs }}.

In AlertManager, they have examples like this:

    annotations:
      description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
      summary: 'Instance {{ $labels.instance }} down'

So it looks like Uptrace should support templating in rules annotations. I guess it makes sense to use the same variable names as Prometheus/AlertManager.

The entry point to the relevant code is here.
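
A minimal sketch of how such rendering could look with Go's text/template, using the field names from the JSON example above (the alert struct and its fields are assumptions, not the final API):

package main

import (
	"os"
	"text/template"
)

// alert mirrors the JSON payload shown above; field names are assumed.
type alert struct {
	Attrs   map[string]string
	Metrics map[string]float64
}

func main() {
	// Render "$goroutines (28) is >= 100 on ..." from the alert data.
	tpl := template.Must(template.New("annotation").Parse(
		`$goroutines ({{ index .Metrics "$goroutines" }}) is >= 100 on {{ .Attrs }}`))
	a := alert{
		Attrs:   map[string]string{"host.name": "localhost", "service.name": "myservice"},
		Metrics: map[string]float64{"$goroutines": 28},
	}
	if err := tpl.Execute(os.Stdout, a); err != nil {
		panic(err)
	}
	// Output: $goroutines (28) is >= 100 on map[host.name:localhost service.name:myservice]
}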

Issue in traefik routing to OTLP/HTTP port

I am running Uptrace in Docker and use Traefik to route the OTLP/HTTP requests. Routing works when accessing the UI, and since the UI and OTLP/HTTP share the same port, OTLP/HTTP should work too. But when sending trace data to OTLP/HTTP via Traefik, the POST requests never reach Uptrace.

But if we use the port directly, we can see the POST requests in the Docker logs, like below.


[bunrouter]  07:04:03.145   200        188µs   POST     /v1/traces
[bunrouter]  07:04:03.149   200        177µs   POST     /v1/traces

Thanks.
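
For anyone comparing setups, a minimal Traefik v2 label set that forwards everything on the shared UI/OTLP port to the container usually looks like this (illustrative docker-compose labels with a placeholder host; not a verified fix for this report):

labels:
  - traefik.enable=true
  - traefik.http.routers.uptrace.rule=Host(`uptrace.example.com`)
  - traefik.http.services.uptrace.loadbalancer.server.port=14318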
