yse / easy_profiler
Lightweight profiler library for c++
License: MIT License
Hi,
thanks for your awesome tool.
I love it so much that I am planning to (ab)use it in a slightly different way.
I would like to measure the execution of tasks and not just functions/methods.
Ideally I need to be able to manually do the following operations:
In other words, I don't want the ProfilerManager to take care of this for me, but, on the other hand, ProfilerManager is the one that provides the socket interface and dump to file.
I am positive that what I described can be done, but it would be nice if you could give me a hint.
Regards
Davide
Generate archives using CMake:
After loading a *.prof file in the GUI, I can see the Diagram, but the Hierarchy is empty.
I've just started to add some blocks to my code, and I currently have the following function, which runs fine without leaking memory.
void Renderer::RenderLines()
{
#ifdef _PROFILE
    EASY_FUNCTION(profiler::colors::Amber);
#endif
    m_API->SetBlendState(eBlendStates::NO_BLEND);
    m_API->SetRasterizer(eRasterizer::CULL_NONE);
    ID3D11RenderTargetView* backbuffer = m_API->GetBackbuffer();
    ID3D11DepthStencilView* depth = m_DeferredRenderer->GetDepthStencil()->GetDepthView();
    m_API->GetContext()->OMSetRenderTargets(1, &backbuffer, depth);
    const auto commands = mySynchronizer->GetRenderCommands(eBufferType::LINE_BUFFER);
#ifdef _PROFILE
    EASY_BLOCK("LineCommand", profiler::colors::Red);
#endif
    for (s32 i = 0; i < commands.Size(); i++)
    {
        auto command = reinterpret_cast<LineCommand*>(commands[i]);
        m_API->SetDepthStencilState(command->m_ZEnabled ? eDepthStencilState::Z_ENABLED : eDepthStencilState::Z_DISABLED, 1);
        m_Line->Update(command->m_Points[0], command->m_Points[1]);
        m_Line->Render(m_Camera->GetOrientation(), m_Camera->GetPerspective());
    }
#ifdef _PROFILE
    EASY_END_BLOCK;
#endif
    m_API->SetBlendState(eBlendStates::NO_BLEND);
    m_API->SetDepthStencilState(eDepthStencilState::Z_ENABLED, 1);
    m_API->SetRasterizer(eRasterizer::CULL_BACK);
}
But when I add another block inside the already existing block, whether it is EASY_BLOCK( foo ); or EASY_FUNCTION( bar );, it creates a memory leak in my application.
#ifdef _PROFILE
    EASY_BLOCK("LineCommand", profiler::colors::Red);
#endif
    for (s32 i = 0; i < commands.Size(); i++)
    {
#ifdef _PROFILE
        EASY_BLOCK("InsideLoop", profiler::colors::Green);
#endif
        auto command = reinterpret_cast<LineCommand*>(commands[i]);
        m_API->SetDepthStencilState(command->m_ZEnabled ? eDepthStencilState::Z_ENABLED : eDepthStencilState::Z_DISABLED, 1);
        m_Line->Update(command->m_Points[0], command->m_Points[1]);
        m_Line->Render(m_Camera->GetOrientation(), m_Camera->GetPerspective());
#ifdef _PROFILE
        EASY_END_BLOCK;
#endif
    }
#ifdef _PROFILE
    EASY_END_BLOCK;
#endif
I tried putting an EASY_FUNCTION inside the Render function that m_Line calls to see whether the memory leak persisted. It did.
I've got separate configurations for my profiler and release builds. They are set up the same way, except that the profiler build has the _PROFILE flag defined. I switched to release to check whether my own code was causing the leak, and I can confirm that it was not.
Is this something you can reproduce, or have I managed to compile the lib incorrectly?
EDIT:
Forgot to mention that I can place a block AFTER the existing block and NOT leak memory.
EDIT 2:
It seems the memory only leaks when I place an EASY_BLOCK around a loop and then another EASY_BLOCK / EASY_FUNCTION somewhere inside the loop.
Initializing fonts before QApplication exists is not valid and causes a segfault when linking statically against Qt. It is caused by the initialization of non-local variables before main, e.g.:
const auto BG_FONT = ::profiler_gui::EFont("Helvetica", 10, QFont::Bold);
This can be fixed by replacing the above with
auto const & BG_FONT() { static const auto BG_FONT = ::profiler_gui::EFont("Helvetica", 10, QFont::Bold); return BG_FONT; }
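The replacement above is the construct-on-first-use idiom: a function-local static is initialized the first time control reaches it, not before main. A minimal sketch of the same idea without Qt follows; the Font type and the bgFont name are hypothetical stand-ins for ::profiler_gui::EFont and BG_FONT.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for ::profiler_gui::EFont. The real class wraps a
// QFont and must not be constructed before QApplication exists.
struct Font {
    std::string family;
    int size;
    static int constructed;  // counts how many Font objects were built
    Font(std::string f, int s) : family(std::move(f)), size(s) { ++constructed; }
};
int Font::constructed = 0;

// Construct on first use: the static below is initialized the first time
// bgFont() is called, so nothing runs during static initialization.
inline const Font& bgFont() {
    static const Font font("Helvetica", 10);
    return font;
}
```

Every call site changes from BG_FONT to BG_FONT(), which is the cost of this approach; in exchange, construction is deferred until a caller actually needs the font.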
Hey, this library/program looks quite promising, but sadly I was not able to use it with mingw w64, gcc-7 and release v1.2.0.
To use mingw I built the library from source, but when trying to record blocks using the program (which I downloaded from the same release, since I don't have Qt installed and am not sure how to get it to work with mingw) I always received the error
Can not read profiled blocks
Reason:
Profiled blocks number == 0
When trying to dump the blocks to a file at the end of the program, the profiler::dumpBlocksToFile("file.prof"); call seems to get stuck (every time I checked with gdb, the thread was in a std::this_thread::sleep_for call triggered by this call).
I am not sure how you serialize the blocks, but might it require binary compatibility between the viewer program and the library? If so, would it be possible to package mingw builds with the next release? That would be really awesome!
Otherwise, do you have any ideas on how to fix this?
Thanks for any help.
I am working on very small C++ programs that I write and compile from the command line using g++. I would like to see how my programs work and how much memory they waste.
Is there a way to use the profiler without using CMake? If yes, please explain how.
Thanks.
Would you like to replace any double quotes by angle brackets around file names for include statements?
Add an option to cmake for giving a possibility to choose timer type.
Also modify easy_profiler_core/current_time.h for supporting such functionality.
Reconnecting seems to be broken, as the profiler application freezes on reconnect. Closing the game window does not terminate the process: background music can still be heard even after the window disappears. Terminating the frozen profiler application makes the game process exit completely.
Steps to reproduce:
Connect button in profiler - data is gathered properly
Stop button in profiler - data is displayed properly
Disconnect button in profiler - shouldn't the profiler disconnect in the previous step?
Connect button in profiler - profiler is frozen
I should also point out that we added this fix to get the profiler into working order. For some reason, without this loop the profiler freezes on the very first attempt to connect to a running application. Killing the profiled application shows a message that the profiler could not connect. This was tested on Linux and macOS. I test with our slightly modified version, but our mods should not have any impact, as they are pretty much all for macOS plus the loop I just linked.
As the title says, the first attempt to connect to a profiled application always fails. The second attempt works.
Edit:
I might add that this code loop keeps spinning endlessly when there is no connection. Judging from EASY_LOGMSG("GUI-client connected\n"); I do not think that is what you intended, as it would be spamming that message when nothing connects. Maybe this is related?
Hi!
There's an unused global constant
const char* DAFAULT_ADDRESS = "tcp://127.0.0.1:28077";
Please remove it, because it is:
Since the Urho3D maintainers are not interested in making use of easy_profiler, I no longer maintain the integration. There is, however, some interest from the AtomicGameEngine community, but they want to keep an on-screen display of a recent profiling data snapshot.
We already discussed the ability to get the data tree at runtime in the Urho3D forum. But could we get an API to receive cached profiling data anyway? It does not have to be anything fancy, just a dump of the blocks the profiler collected on the last frame. Then I could use that to build a tree in real time for display on the screen.
Run tests using std::chrono timers for gcc 5, gcc 6 and msvc2015 builds to calculate block cost using c++11 standard timers on modern compilers.
For gcc 4 and msvc2013 the cost was very high: ~0.6-0.8 us per block.
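A rough way to estimate this cost is to time many back-to-back timestamp pairs and divide by the iteration count. The sketch below assumes that a block costs roughly two std::chrono::steady_clock::now() calls; the function name is hypothetical and this is not the project's own benchmark.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Estimates the average cost, in nanoseconds, of one begin/end timestamp
// pair taken with std::chrono::steady_clock. This only approximates the
// timer part of a block's cost; storing the block is not measured.
inline double average_timer_pair_cost_ns(std::uint32_t iterations) {
    using Clock = std::chrono::steady_clock;
    assert(iterations > 0);
    std::int64_t sink = 0;  // accumulates results so the values are used
    const auto start = Clock::now();
    for (std::uint32_t i = 0; i < iterations; ++i) {
        const auto begin = Clock::now();
        const auto end = Clock::now();
        sink += (end - begin).count();
    }
    const auto total = Clock::now() - start;
    assert(sink >= 0);  // steady_clock is monotonic, so no negative spans
    return std::chrono::duration<double, std::nano>(total).count() / iterations;
}
```

Calling average_timer_pair_cost_ns(1000000) on each compiler gives numbers that can be compared against the ~0.6-0.8 us figure quoted above.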
In profile_manager.h:chunk_allocator::allocate() / emplace_back() and elsewhere, there are lines like *(uint16_t*)(data + n) = 0; and *(uint16_t*)last->data = 0; that violate strict-aliasing rules:
/easy_profiler/easy_profiler_core/profile_manager.h:175:36: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
*(uint16_t*)last->data = 0;
I have confirmed that easy_profiler does not work correctly when cross-compiled (with gcc-4.9.4) for a single-core ARMv5 (32-bit) Linux with pthreads enabled. It does work, however, on Linux x86_64 with the same compiler version on the same machine used for cross-compiling. This could be a direct result of the UB, as I have made similar mistakes that caused code to work on x86 but not on the ARM platform I am using.
Even if my problems aren't caused by the UB, it still needs to be fixed wherever it is found. I would suggest using std::memcpy / std::memset for such things.
EDIT: the line *(uint16_t*)(data + n) = 0 doesn't violate strict-aliasing, only the second one.
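The std::memcpy suggestion could look like the following sketch. The helper names store_u16 and load_u16 are hypothetical and not part of easy_profiler; the point is only that the bytes are copied instead of being accessed through a casted pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Writes a 16-bit value into a raw byte buffer without casting the buffer
// pointer to uint16_t*. Copying bytes is well defined regardless of the
// buffer's declared type or alignment.
inline void store_u16(char* dst, std::uint16_t value) {
    std::memcpy(dst, &value, sizeof(value));
}

// Reads a 16-bit value back out of a raw byte buffer the same way.
inline std::uint16_t load_u16(const char* src) {
    std::uint16_t value;
    std::memcpy(&value, src, sizeof(value));
    return value;
}
```

With these, *(uint16_t*)last->data = 0; would become store_u16(last->data, 0);, assuming last->data is a char buffer. Because memcpy has no alignment requirement, the same helpers also work at odd offsets.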
I'm having an issue with a serialized block size having a zero in it, despite there actually being a block there. I'm having a really hard time debugging it without fully understanding the contents of a stream.
I keep getting this pop-up error on launch whenever I try to run profiler_gui.exe in anything other than debug. (In debug, the profiler opens and seems to work just fine.)
My current setup is Visual Studio 2015, x64 and Qt 5.9.1. Do you know whether this is a local issue on my computer or an issue with the project?
This code in profile_manager.h:126 is broken for plain GCC:
#if EASY_ENABLE_ALIGNMENT == 0
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR
# define EASY_MALLOC(MEMSIZE, A) malloc(MEMSIZE)
# define EASY_FREE(MEMPTR) free(MEMPTR)
#else
# if defined(_MSC_VER)
# define EASY_ALIGNED(TYPE, VAR, A) __declspec(align(A)) TYPE VAR
# define EASY_MALLOC(MEMSIZE, A) _aligned_malloc(MEMSIZE, A)
# define EASY_FREE(MEMPTR) _aligned_free(MEMPTR)
# elif defined(__GNUC__)
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR __attribute__(aligned(A))
# else
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR
# endif
#endif
The code in the #elif defined(__GNUC__) branch:
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR __attribute__(aligned(A))
should be:
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR __attribute__((aligned(A)))
(missing a set of parentheses)
Once that is fixed, both EASY_MALLOC and EASY_FREE need to be defined to some reasonable value which is up to you guys, since there are a few candidates.
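One possible shape for the GCC branch is sketched below. It assumes a POSIX system and picks posix_memalign as one of the candidates mentioned above; the helper easy_aligned_malloc is hypothetical, and the project may prefer a different allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#if defined(__GNUC__) && !defined(_WIN32)
// Corrected attribute syntax: note the double parentheses.
# define EASY_ALIGNED(TYPE, VAR, A) TYPE VAR __attribute__((aligned(A)))

// ASSUMPTION: posix_memalign is just one candidate, not the project's choice.
// The alignment must be a power of two and a multiple of sizeof(void*).
inline void* easy_aligned_malloc(std::size_t size, std::size_t alignment) {
    void* ptr = nullptr;
    return posix_memalign(&ptr, alignment, size) == 0 ? ptr : nullptr;
}

# define EASY_MALLOC(MEMSIZE, A) easy_aligned_malloc(MEMSIZE, A)
// Memory obtained from posix_memalign is released with plain free().
# define EASY_FREE(MEMPTR) std::free(MEMPTR)
#endif
```

posix_memalign is not available on MinGW, which is why the sketch excludes _WIN32; a MinGW build would need a separate branch.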
I would like to point out that an identifier like "EASY_PROFILER__OUTPUT_STREAM__H_" does not fit the naming rules of the C++ language standard, which reserves identifiers containing a double underscore.
Would you like to adjust your selection of unique names?
Calling stopListen() deadlocks here if the listener thread has called EasySocket::accept() and is waiting on it.
There are several problems caused by changing thread_id_t from uint32_t to uint64_t, mainly related to context-switch events on *nix systems with 64-bit pid_t.
Output:
./profiler_gui: error while loading shared libraries: libeasy_profiler.so: cannot open shared object file: No such file or directory
Here is what ldd shows:
libeasy_profiler.so => not found
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f94f917e000)
libQt5Gui.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Gui.so.5 (0x00007f94f8c36000)
libQt5Core.so.5 => /usr/lib/x86_64-linux-gnu/libQt5Core.so.5 (0x00007f94f8760000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f94f83dd000)
I can see these libraries in the folder with the binary.
Perhaps using the CMake command link_directories(directory1 directory2 ...) would solve the problem.
The standard cmake argument for building a library as shared or static is BUILD_SHARED_LIBS. This would replace the EASY_OPTION_LIB_STATIC and EASY_OPTION_LIB_TYPE allowing more idiomatic usage of cmake. This is important for integration with package managers such as hunter which expect this argument to be honored. If you are happy with this change then I will create a pull request.
When the lib is built with -DEASY_OPTION_LOG=ON, the profiled application spams this at high speed:
EasyProfiler INFO: GUI-client connected
Nothing is connecting to it.
Add documentation and examples for UI
Keep up great work, guys!
Hi,
I want to start and end a block manually in two functions. I want to use GCC's function instrumentation hooks for the start and end functions, so I don't have to add the easy_profiler macros to every function.
Something like:
void __cyg_profile_func_enter(void* this_fn, void* call_site){
    const char* symbol = "";
    Dl_info info;
    if(dladdr(this_fn, &info) && info.dli_sname){
        symbol = info.dli_sname;
    }
    int status = 0;
    char* demangled = __cxxabiv1::__cxa_demangle(symbol, 0, 0, &status);
    EASY_PROFILE_BLOCK_START(demangled != NULL && status == 0 ? demangled : symbol);
    infunc = false;
}
void __cyg_profile_func_exit(void* this_fn, void* call_site){
    EASY_PROFILE_BLOCK_END();
}
Is this possible?
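One detail in the snippet above: the buffer returned by __cxa_demangle is allocated with malloc and is never freed, so each instrumented call would leak. A sketch of a small wrapper that owns the buffer follows; the names demangle and FreeDeleter are hypothetical, and it assumes a GCC or Clang toolchain that provides <cxxabi.h>.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Deleter for buffers allocated with malloc, as __cxa_demangle does.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};

// Returns the demangled form of `symbol`, or `symbol` unchanged when it is
// not a valid mangled name. The malloc'ed buffer is released automatically.
inline std::string demangle(const char* symbol) {
    int status = 0;
    std::unique_ptr<char, FreeDeleter> buffer(
        abi::__cxa_demangle(symbol, nullptr, nullptr, &status));
    return (buffer && status == 0) ? std::string(buffer.get())
                                   : std::string(symbol);
}
```

Returning a std::string copies the name, so whatever starts the block would need to keep that string alive for as long as the block refers to it.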
For the release build, the following is needed:
Initialize the application as a windowed subsystem. To do this, place the macros correctly around the line https://github.com/yse/easy_profiler/blob/develop/profiler_gui/main.cpp#L51
Create an rc file that specifies the publisher and the current version
The thread timeline is correct, but the hierarchy is empty.
I use EASY_FUNCTION and EASY_BLOCK.
Do I need to set something?
Thanks.
List the required pre-installed libraries and programs for Linux and for Windows
Version:
Public v1.2.0, built locally with source and integrated into my project that way (rather than with the provided static libraries).
Callstack:
easy_profiler.dll!operator delete(void * block) Line 21 C++
easy_profiler.dll!std::_Deallocate(void * _Ptr, unsigned __int64 _Count, unsigned __int64 _Sz) Line 133 C++
easy_profiler.dll!std::allocator<std::reference_wrapper<profiler::Block> >::deallocate(std::reference_wrapper<profiler::Block> * _Ptr, unsigned __int64 _Count) Line 721 C++
easy_profiler.dll!std::_Wrap_alloc<std::allocator<std::reference_wrapper<profiler::Block> > >::deallocate(std::reference_wrapper<profiler::Block> * _Ptr, unsigned __int64 _Count) Line 988 C++
easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::_Reallocate(unsigned __int64 _Count) Line 1619 C++
easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::_Reserve(unsigned __int64 _Count) Line 1633 C++
easy_profiler.dll!std::vector<std::reference_wrapper<profiler::Block>,std::allocator<std::reference_wrapper<profiler::Block> > >::emplace_back<profiler::Block & __ptr64>(profiler::Block & <_Val_0>) Line 928 C++
easy_profiler.dll!ProfileManager::beginBlock(profiler::Block & _block) Line 968 C++
easy_profiler.dll!beginBlock(profiler::Block & _block) Line 304 C++
TomatoGame.exe!Katgine::App::Win32AppWindow::Run(Katgine::App::IApp * app) Line 81 C++
...
Steps to reproduce:
Workaround:
Possible solutions:
For example, just looking at the cpp files, most are UNIX line terminators, but some have CRLF (dos) line terminators.
$ find . -name '*.cpp' -exec file {} \;
./sample/main.cpp: C source, ASCII text, with CRLF line terminators
./easy_profiler_core/event_trace_win.cpp: C source, ASCII text, with CRLF line terminators
./easy_profiler_core/easy_socket.cpp: C source, ASCII text
./easy_profiler_core/block.cpp: C source, ASCII text
./easy_profiler_core/profile_manager.cpp: C source, ASCII text
./easy_profiler_core/reader.cpp: C++ source, ASCII text
./profiler_gui/descriptors_tree_widget.cpp: C source, ASCII text, with CRLF, LF line terminators
./profiler_gui/easy_qtimer.cpp: C source, ASCII text
./profiler_gui/globals.cpp: C source, ASCII text
./profiler_gui/treeitem.cpp: C source, ASCII text
./profiler_gui/easy_graphics_item.cpp: C source, ASCII text, with very long lines
./profiler_gui/main.cpp: C source, ASCII text
./profiler_gui/treemodel.cpp: C source, ASCII text
./profiler_gui/blocks_graphics_view.cpp: C++ source, ASCII text, with CRLF, LF line terminators
./profiler_gui/tree_widget_item.cpp: C source, ASCII text
./profiler_gui/globals_qobjects.cpp: C source, ASCII text
./profiler_gui/blocks_tree_widget.cpp: C source, ASCII text, with CRLF, LF line terminators
./profiler_gui/easy_graphics_scrollbar.cpp: C source, ASCII text, with CRLF, LF line terminators
./profiler_gui/main_window.cpp: C source, ASCII text, with CRLF, LF line terminators
./profiler_gui/easy_chronometer_item.cpp: C source, ASCII text
./profiler_gui/tree_widget_loader.cpp: C source, ASCII text, with very long lines, with CRLF, LF line terminators
./reader/main.cpp: C++ source, ASCII text
Some documentation is available on the Wiki page. Add more information.
Consider the case where some very slow event happens:
Now if we scroll the view to the right and no longer see the slow region, we get this:
The visible data is still stuck at the very bottom of the chart, which makes it not really useful. I think the histogram range in zoom mode should adapt to the visible data instead of the entire dataset.
Hello,
I built easy_profiler with MSVC 2015 and Qt 5.8. As a test I launched profiler_sample.exe:
Objects count: 500
Render steps: 1500
Modelling steps: 1500
Resource loading count: 50
Frame time: max 4442 us // avg 4299 us
Frame time: max 35237 us // avg 4914 us
Frame time: max 6181 us // avg 4403 us
Frame time: max 6422 us // avg 4441 us
Frame time: max 7993 us // avg 4417 us
Frame time: max 11446 us // avg 4428 us
Frame time: max 6254 us // avg 4354 us
Frame time: max 6200 us // avg 4359 us
Frame time: max 5609 us // avg 4315 us
Frame time: max 7810 us // avg 4336 us
Frame time: max 15036 us // avg 4416 us
Frame time: max 42781 us // avg 4869 us
Frame time: max 6176 us // avg 4285 us
Frame time: max 6927 us // avg 4414 us
Elapsed time: 7073221 usec
And it froze here...
I am in the process of integrating easy_profiler into AtomicGameEngine. Along the way we bumped into several nasty issues which I think should be addressed. The profiler does not build on the macOS platform or on Windows with the MSVC compiler. We worked out some patches and will submit a PR once the code is in shape.
Now on to the real problem:
macOS is having a hard time (meaning a crash) due to a manual string destructor invocation. This right there is a sign of very bad design and needs to be addressed.
I started digging deeper into the problem and discovered that this destructor call is a result of the custom StackBuffer class. As I understand it, this class was created for performance reasons, but I think it is completely unnecessary. Did you try using std::vector? We can minimize memory reallocations by reserving space in the vector. I usually reserve double the vector size when capacity is reached. It wastes some memory, but it progressively reduces memory reallocations, and we can use RAII and avoid ugly manual calls to destructors. To avoid copying on insertion we can just .push_back(std::move(NonscopedBlock(...))). This should perform much the same as StackBuffer, except that it is much safer and cleaner.
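The reserve-then-insert idea can be sketched as follows. Block here is a hypothetical stand-in, not the library's NonscopedBlock, and emplace_back is used instead of push_back(std::move(...)) so the element is constructed directly in the vector's storage.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the profiler's block type.
struct Block {
    std::string name;
    long long begin;
    Block(std::string n, long long b) : name(std::move(n)), begin(b) {}
};

// Appends `count` blocks and returns how many times the vector reallocated.
inline std::size_t fill_blocks(std::vector<Block>& blocks, std::size_t count) {
    std::size_t reallocations = 0;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t before = blocks.capacity();
        // Constructed in place: no temporary Block is copied or moved in.
        blocks.emplace_back("block", static_cast<long long>(i));
        if (blocks.capacity() != before) ++reallocations;
    }
    return reallocations;
}
```

With blocks.reserve(n) called up front, inserting up to n elements causes no reallocation. One caveat: any reallocation invalidates pointers and references to existing elements, which matters if other code keeps references to stored blocks.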
I started changing StackBuffer to std::vector and noticed that the move constructor of Block modifies the state of the object it is stealing state from. Why? A move constructor is supposed to steal state from the other object, which leaves that object in an unspecified state, basically destined for the trash can. Here something else is happening, which seems very wrong. I do not think I can fix this problem without understanding why it was written this way.
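For comparison, a conventional move constructor also writes to its source: it resets the source's members so the moved-from object's destructor does not release what was just handed over. The sketch below uses a hypothetical Buffer type; whether Block's move constructor modifies its source for this reason is an assumption that would need checking against the library code.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning type, not the library's Block.
class Buffer {
public:
    explicit Buffer(std::size_t n) : data_(new char[n]), size_(n) {}

    // Move: take ownership, then reset the source so that its destructor
    // does not free the memory a second time.
    Buffer(Buffer&& other) noexcept : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;
    Buffer& operator=(Buffer&&) = delete;

    ~Buffer() { delete[] data_; }  // deleting nullptr is a no-op

    std::size_t size() const { return size_; }

private:
    char* data_;
    std::size_t size_;
};
```

After Buffer b(std::move(a)), a is still destroyed normally, so the reset is what keeps its destructor harmless.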
Now, I could be doing something stupid, but I'm pretty sure easy_profiler_core doesn't respect the build type option because of this cmake code in easy_profiler_core/CMakeLists.txt. Note the -O3:
if (UNIX)
    target_compile_options(easy_profiler PRIVATE -Wall -Wno-long-long -Wno-reorder -Wno-braced-scalar-init -pedantic -O3)
    target_link_libraries(easy_profiler pthread)
elseif (WIN32)
    target_compile_definitions(easy_profiler PRIVATE -D_WIN32_WINNT=0x0600 -D_CRT_SECURE_NO_WARNINGS -D_WINSOCK_DEPRECATED_NO_WARNINGS)
    target_link_libraries(easy_profiler ws2_32 psapi)
endif ()
I have been trying to debug an issue on my ARM platform for quite some time now, and this really threw me off. I had ruled out UB because it was still broken in the debug build...
Would it be possible to keep track of arbitrary float/integer values? That would be useful for an engine to track object counts. It could be represented simply, like the FPS graph. The API could be something like profiler::sample(valueType, value).
Unrelated: we would love to track object lifetimes as well, but I have no idea how that could be presented in the UI. Just something to think about.