e-metrics's Introduction

Hi there 👋, I'm EluvK

  • 🔭 A Back-end developer
  • 💕 Love and enjoy coding
  • 😆 C++ is great, but Rust is better.
  • 📃 This is my blog 👉here.

e-metrics's Issues

Areas of improvement

2022.10.26 update

Some notes on lessons learned.

It has been almost two years since this module was written, and heavy iteration in the team project has left it looking nothing like its original form.
My understanding of many aspects of programming has also changed a lot since then, so here are some possible improvements, recorded from the perspective of an open source project. If time permits, I can work through them one by one.

Performance

As a foundational module, performance is a problem it has to face. And because it is a module called by others, you may not be able to anticipate what performance requirements its users will have.

First, for the container that caches metrics_unit objects for later asynchronous processing, a length-limited queue can be used, with the consumer thread popping everything out for processing each time. The performance bottleneck is that when system resources are busy, the consumer thread is slightly delayed, and that is also when metrics generation peaks. The result is a vicious cycle in which any metrics_unit that does not fit can only be discarded. This is a last resort, but at the very least:

  1. Measure this bottleneck and state it clearly to users. In addition, provide feedback, logging, or similar signals when units are discarded.
  2. Memory limits differ from one deployment to another, so the length limit of the queue should be configurable.
  3. A lock-free queue would leave memory usage uncontrolled and is not a great choice. As an auxiliary module, it should not have an excessive impact on the system in any scenario. Even when it is overused, there should be a fallback mechanism that keeps it from affecting system stability.
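The three points above could be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the metrics_unit fields, the class name bounded_metrics_queue, and its member functions are all assumptions made for the example. It uses a mutex-protected queue with a configurable capacity, never blocks the producer, and counts dropped units so the consumer can log them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the real metrics_unit type.
struct metrics_unit {
    std::string name;
    std::int64_t value;
};

// Bounded queue: producers never block; overflow is dropped and counted.
class bounded_metrics_queue {
public:
    // The capacity is the configurable length limit (point 2).
    explicit bounded_metrics_queue(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Returns false when the queue is full and the unit was dropped.
    bool try_push(metrics_unit unit) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.size() >= capacity_) {
            ++dropped_;
            return false;
        }
        queue_.push_back(std::move(unit));
        return true;
    }

    // Called by the consumer thread: takes everything queued so far in one swap.
    std::deque<metrics_unit> pop_all() {
        std::deque<metrics_unit> batch;
        std::lock_guard<std::mutex> lock(mutex_);
        batch.swap(queue_);
        return batch;
    }

    // Units dropped since the last call, so drops can be logged (point 1).
    std::uint64_t take_dropped_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::exchange(dropped_, 0);
    }

private:
    const std::size_t capacity_;
    std::mutex mutex_;
    std::deque<metrics_unit> queue_;
    std::uint64_t dropped_ = 0;
};
```

In this sketch the consumer would call pop_all() on each wake-up and then take_dropped_count(), writing a log line whenever the count is non-zero. Memory held by the queue is bounded by the capacity, which is the fallback behaviour point 3 asks for.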

Atomic counters && choosing between synchronous and asynchronous

The counter has a simple effect (compared with timer and flower, successive counts are largely unrelated and there is basically no computation involved). Its most common use in the project is to track the number of live instances of a type: +1 in the constructor, -1 in the destructor. This helps locate memory usage problems, memory leaks, and so on. Setting aside whether this requirement is reasonable (it is already the reality), the most likely problem is the generation of a huge number of counter metrics_unit objects, possibly as many as 100 million per minute. In that case, handling them asynchronously through a queue is not cost-effective. It is better to make the counter an atomic variable and update it in place on the calling thread.

Atomic counters that can be created dynamically on first use should be added. They should not go through the queue shared by the other metrics_unit types.
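One possible shape for such dynamically created atomic counters is sketched below. Every name here (atomic_counter_registry, counter_registry, tracked_widget, the counter name string) is hypothetical and only illustrates the idea: the name lookup takes a lock once, and after that each update is a single relaxed atomic operation on the calling thread, with no queue involved.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

class atomic_counter_registry {
public:
    // Finds or creates the counter for `name`. The returned reference stays
    // valid for the lifetime of the registry, so callers can cache it.
    std::atomic<std::int64_t>& get(const std::string& name) {
        assert(!name.empty());
        std::lock_guard<std::mutex> lock(mutex_);
        auto& slot = counters_[name];
        if (!slot) {
            slot = std::make_unique<std::atomic<std::int64_t>>(0);
        }
        return *slot;
    }

    // Copies the current values, e.g. for a periodic reporting thread.
    std::unordered_map<std::string, std::int64_t> snapshot() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::unordered_map<std::string, std::int64_t> out;
        for (const auto& entry : counters_) {
            out.emplace(entry.first, entry.second->load(std::memory_order_relaxed));
        }
        return out;
    }

private:
    std::mutex mutex_;
    std::unordered_map<std::string, std::unique_ptr<std::atomic<std::int64_t>>> counters_;
};

// Process-wide registry (assumption: one registry is enough for the sketch).
inline atomic_counter_registry& counter_registry() {
    static atomic_counter_registry registry;
    return registry;
}

// Example type whose live instances are counted: +1 on construction, -1 on destruction.
class tracked_widget {
public:
    tracked_widget() { live().fetch_add(1, std::memory_order_relaxed); }
    tracked_widget(const tracked_widget&) { live().fetch_add(1, std::memory_order_relaxed); }
    ~tracked_widget() { live().fetch_sub(1, std::memory_order_relaxed); }

private:
    // The lookup happens once; later calls reuse the cached reference.
    static std::atomic<std::int64_t>& live() {
        static std::atomic<std::int64_t>& counter =
            counter_registry().get("tracked_widget_live");
        return counter;
    }
};
```

Relaxed ordering is used on the assumption that the count is only read for reporting and does not synchronize any other data.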

Usage experience

C/C++ macros are not a good way to expose the interface

The current usage is as follows:

Macros are defined in the metrics.h header file, for example:

#define METRICS_COUNTER_INCREMENT(metrics_name, value) metrics::e_metrics::get_instance().counter_increase(metrics_name, value)

The effect of this macro is then controlled by a compile option. When the option is turned off, the macro expands to nothing:

#ifdef ENABLE_METRICS
#define METRICS_COUNTER_INCREMENT(metrics_name, value) metrics::e_metrics::get_instance().counter_increase(metrics_name, value)
#else
#define METRICS_COUNTER_INCREMENT(metrics_name, value)
#endif

The original intention was that callers would not need to change their code and could switch metrics on and off with a compile option. In practice, problems appear:

  1. If the caller has to compute arguments such as value, then once the compile option is turned off the compiler reports an "unused variable" error, so the #ifdef still has to be written into the surrounding calling code.
  2. If you want to untangle the dependencies between modules as far as possible and fully clean up header pollution, every module that uses metrics has to check the compile option to decide whether to link against the library and whether to include the header.
  3. Why bother with 2? Because toggling the metrics compile option changes the contents of metrics.h, which in turn forces every translation unit that includes this header to be recompiled.

Need a module-level on/off switch

A good answer to this requirement would greatly improve all three problems above.

There are at least a few design guidelines:

  1. Do not use a global compile option to switch metrics on and off. (If the goal is the smallest possible release binary, the build should not depend on this auxiliary module at all.)
  2. Each module uses its own compile option to switch all metrics inside that module.
  3. Interface headers (those used by other modules, e.g. metrics.h) should not contain code that changes under compile switches. Put the variation in the implementation.
    • This guideline is actually quite important; it is a lesson I learned while optimizing the project's compile options.
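As a rough sketch of these guidelines, assuming a module called "net" with its own option NET_ENABLE_METRICS (the module, the option name, the net::metrics_counter_increase wrapper, the free function metrics::counter_increase, and the in-memory stand-in at the bottom are all invented for illustration), the layout could look like this. The three "files" are shown in one listing so it compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Normally passed by the build system for this one module only
// (e.g. -DNET_ENABLE_METRICS); defined here so the sketch is self-contained.
#define NET_ENABLE_METRICS

// ---- metrics.h: interface header, no #ifdef, identical in every build ----
namespace metrics {
void counter_increase(const std::string& name, std::int64_t value);
}

// ---- net_metrics.h: hypothetical wrapper owned by the "net" module ----
// Also free of #ifdef, so toggling the option does not touch any header.
namespace net {
void metrics_counter_increase(const char* name, std::int64_t value);
}

// ---- net_metrics.cpp: the only place that looks at the module's switch ----
namespace net {
void metrics_counter_increase(const char* name, std::int64_t value) {
    assert(name != nullptr);
#ifdef NET_ENABLE_METRICS
    // The wrapper adds the module prefix, so callers pass only the short name.
    metrics::counter_increase(std::string("net_") + name, value);
#else
    // Arguments are ordinary function parameters, so call sites never see
    // "unused variable" diagnostics when metrics are switched off.
    (void)name;
    (void)value;
#endif
}
}  // namespace net

// ---- stand-in for the metrics implementation (assumption for the sketch) ----
namespace metrics {
std::map<std::string, std::int64_t>& recorded() {
    static std::map<std::string, std::int64_t> values;
    return values;
}
void counter_increase(const std::string& name, std::int64_t value) {
    recorded()[name] += value;
}
}  // namespace metrics
```

With this layout, switching NET_ENABLE_METRICS recompiles only net_metrics.cpp, and call sites such as net::metrics_counter_increase("packets", n) stay unchanged in both configurations.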

Specification && Integration

metrics_name specification

Once there is a module-level wrapper for using metrics, there is no longer any need to handle the category embedded in the name string separately.

Output format specification

The basic requirement has not changed: output is integrated into the project's logging system through a callback function and written in JSON format.

In the future, this can work together with a collection-and-reporting tool and server-side aggregation and visualization. These two modules, dw-agent and dw-server, are planned to be rewritten in Rust.
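A minimal sketch of the callback idea, assuming a hypothetical json_reporter class and made-up JSON field names (category, tag, count) rather than the project's real output schema:

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <sstream>
#include <string>
#include <utility>

// The host project supplies this callback and forwards each line to its own logger.
using log_callback = std::function<void(const std::string&)>;

class json_reporter {
public:
    explicit json_reporter(log_callback sink) : sink_(std::move(sink)) {
        assert(static_cast<bool>(sink_));
    }

    // Formats one counter as a single-line JSON object and hands it to the callback.
    void report_counter(const std::string& category, const std::string& tag,
                        std::int64_t count) {
        std::ostringstream out;
        out << "{\"category\":\"" << escape(category) << "\","
            << "\"tag\":\"" << escape(tag) << "\","
            << "\"count\":" << count << "}";
        sink_(out.str());
    }

private:
    // Escapes only quotes and backslashes; control characters are not handled
    // in this sketch.
    static std::string escape(const std::string& in) {
        std::string out;
        for (char c : in) {
            if (c == '"' || c == '\\') {
                out += '\\';
            }
            out += c;
        }
        return out;
    }

    log_callback sink_;
};
```

Because the module only ever calls the callback, it does not need to know which logging library the project uses, and a collection tool can later parse the same lines.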
