intel / intel-cmt-cat

User space software for Intel(R) Resource Director Technology

Home Page: http://www.intel.com/content/www/us/en/architecture-and-technology/resource-director-technology.html

License: Other

Makefile 2.06% C 61.45% Perl 1.04% Roff 0.50% Shell 0.17% Python 17.47% SWIG 0.17% Dockerfile 0.12% JavaScript 0.03% TypeScript 12.96% HTML 2.55% SCSS 1.47%
rdt cmt cat mbm mba cache llc c snmp perl

intel-cmt-cat's Introduction

README for Intel(R) RDT Software Package


Contents

  • Overview
  • Package Content
  • Hardware Support
  • OS Support
  • Software Compatibility
  • Legal Disclaimer

Overview

This software package provides basic support for Intel(R) Resource Director Technology (Intel(R) RDT) and Intel(R) I/O Resource Director Technology (Intel(R) I/O RDT) including: Cache Monitoring Technology (CMT), Memory Bandwidth Monitoring (MBM), Cache Allocation Technology (CAT), Code and Data Prioritization (CDP) and Memory Bandwidth Allocation (MBA).

In principle, the software programs the technologies via Model Specific Registers (MSR) on a hardware thread basis. MSR access is arranged via a standard operating system driver: msr on Linux and cpuctl on FreeBSD. In the most common architectural implementations, presence of the technologies is detected via the CPUID instruction.

In a limited number of special cases, CAT is not architecturally supported on a particular SKU but a non-architectural (model-specific) implementation exists. In these cases it can be detected via the brand string, which is read from CPUID and compared to a table of known-supported SKUs. If needed, a final check is to probe the specific MSRs to discover hardware capabilities; however, CPUID enumeration should be used wherever possible.

From software version v1.0.0, the library adds an option to use Intel(R) RDT via the available OS interfaces (perf and resctrl on Linux). The library detects the presence of these interfaces and allows the preferred one to be selected through a configuration option. As a result, existing tools like 'pqos' or 'rdtset' can also be used to manage Intel(R) RDT in an OS-compatible way. As of release v4.3.0, the OS interface is the default option. 'pqos' tool wrappers have been added to help with interface selection: 'pqos-os' and 'pqos-msr' for OS and MSR interface operations respectively.

The PID API compile-time option has been removed and the APIs are always available. Note that proper operation of these APIs depends on the availability and selection of the OS interface.

This software package is maintained, updated and developed on https://github.com/intel/intel-cmt-cat

https://github.com/intel/intel-cmt-cat/wiki provides FAQ, usage examples and useful links.

Please refer to INSTALL file for package installation instructions.

Package Content

"lib" directory:
Includes software library files providing APIs for technology detection, monitoring and allocation. Please refer to the library README for more details (lib/README).

"lib/perl" directory:
Includes the PQoS library Perl wrapper. Please refer to the interface README for more details (lib/perl/README).

"lib/python" directory:
Includes the PQoS library Python 3.x wrapper. Please refer to the interface README for more details (lib/python/README.md).

"pqos" directory:
Includes source files for a utility that provides command line access to Intel(R) RDT. The utility links against the library and programs the technologies via its APIs. Please refer to the utility README for more details (pqos/README). The manual page of the "pqos" utility also provides information about tool usage: $ man pqos

"rdtset" directory:
Includes source files for a utility that provides "taskset"-like functionality for RDT configuration. The utility links against the library and programs the technologies via its APIs. Please refer to the utility README for more details (rdtset/README). The manual page of the "rdtset" utility also provides information about tool usage: $ man rdtset

"appqos" directory:
Includes source files for an application that groups apps into priority-based pools. Each pool is assigned an Intel(R) RDT and Intel(R) SST configuration that can be set on startup or at runtime through a REST API. Please refer to the application README for more details (appqos/README).

"appqos_client" directory:
Includes source files for an App QoS client web application. The app provides a simple user interface to remotely configure Intel(R) RDT and Intel(R) SST on systems where App QoS is running. Please refer to the application README for more details "appqos_client/README".

"examples" directory:
Includes C and Perl examples of Intel(R) RDT usage via the library APIs. Please refer to the README file for more details (examples/README).

"snmp" directory:
Includes Net-SNMP AgentX subagent written in Perl to demonstrate the use of the PQoS library Perl wrapper API. Please refer to README file for more details "snmp/README".

"tools" directory:
Includes membw tool for stressing memory bandwidth with different operations.

"srpm" directory:
Includes *.src *.rpm and *.spec files for the software package.

"ChangeLog" file:
Brief description of changes between releases.

"INSTALL" file:
Installation instructions.

"LICENSE" file:
License of the package.

"unit-test" directory:
Unit tests

Hardware Support

Supported products can be found in Addendum A of the Intel® Resource Director Technology (Intel® RDT) Architecture Specification: https://www.intel.com/content/www/us/en/content-details/789566/intel-resource-director-technology-intel-rdt-architecture-specification.html

Addendum B contains a list of processors with model-specific Intel® RDT Features.
Note: Detection of model-specific features requires the RDT_PROBE_MSR environment variable to be set when using the library and utilities. These features are only available when using the MSR interface. See the "Interfaces" section below for more information. See the wiki for usage examples.

For additional Intel(R) RDT details please refer to the Intel(R) Architecture Software Development Manuals available at: https://www.intel.com/content/www/us/en/develop/download/intel-64-and-ia-32-architectures-sdm-combined-volumes-1-2a-2b-2c-2d-3a-3b-3c-3d-and-4.html Specific information can be found in volume 3a, Chapters 17.18 and 17.19.

OS Support

Overview

Linux is the primary supported operating system at the moment. There is a FreeBSD port of the software but due to limited validation scope it is rather experimental at this stage. Although most modern Linux kernels include support for Intel(R) RDT, the Intel(R) RDT software package predates these extensions and can operate with and without kernel support. The Intel(R) RDT software can detect and leverage these kernel extensions when available to add functionality, but is also compatible with legacy kernels.

OS Frameworks

Linux kernel support for Intel(R) RDT was originally introduced with Linux perf system call extensions for CMT and MBM. More recently, the Resctrl interface added support for CAT, CDP and MBA. On modern Linux kernels, it is advised to use the kernel/OS interface when available. Details about these interfaces can be found in resctrl_ui.txt. This software package, Intel(R) RDT, continues to work with all Linux kernel versions.

Interfaces

The Intel(R) RDT software library and utilities offer two interfaces to program Intel(R) RDT technologies: the MSR interface and the OS interface.

The MSR interface is used to configure the platform by programming the hardware (MSRs) directly. This is the legacy interface. It requires no kernel support for Intel(R) RDT but is limited to monitoring and managing resources on a per core basis.

The OS interface was later added to the package and when selected, the library will leverage Linux kernel extensions to program these technologies. This allows monitoring and managing resources on a per core/process basis and should be used when available.

Please see the tables below for more information on when Intel(R) RDT feature (MSR & OS) support was added to the package.

Table 2. MSR interface feature support

Intel(R) RDT version | RDT feature enabled | Kernel version required
0.1.3                | L3 CAT, CMT, MBM    | Any
0.1.4                | L3 CDP              | Any
0.1.5                | L2 CAT              | Any
1.2.0                | MBA                 | Any
2.0.0                | L2 CDP              | Any
5.0.0                | I/O RDT             | Any

Table 3. OS interface feature support

Intel(R) RDT version | RDT feature enabled               | Kernel version required | Recommended interface
0.1.4                | CMT (Perf)                        | 4.1                     | MSR (1)
1.0.0                | MBM (Perf)                        | 4.7                     | MSR (1)
1.1.0                | L3 CAT, L3 CDP, L2 CAT (Resctrl)  | 4.10                    | OS for allocation only (with the exception of MBA); MSR for allocation + monitoring (2)
1.2.0                | MBA (Resctrl)                     | 4.12                    | OS for allocation only; MSR for allocation + monitoring (2)
2.0.0                | CMT, MBM (Resctrl)                | 4.14                    | OS
2.0.0                | L2 CDP                            | 4.16                    | OS
3.0.0                | MBA CTRL (Resctrl)                | 4.18                    | OS

References:

  1. Monitoring with Perf on a per core basis is not supported and returns invalid results.
  2. The MSR and OS interfaces are not compatible. MSR interface is recommended if monitoring and allocation is to be used.

Software dependencies

The only dependencies of Intel(R) RDT are access to the C and pthreads libraries and:

  • without kernel extensions - 'msr' kernel module
  • with kernel extensions - Intel(R) RDT extended Perf system call and Resctrl filesystem

Enable Intel(R) RDT support in:

  • kernel v4.10 - v4.13 with kernel configuration option CONFIG_INTEL_RDT_A
  • kernel v4.14+ with kernel configuration option CONFIG_INTEL_RDT
  • kernel v5.0+ with kernel configuration option CONFIG_X86_RESCTRL

Note: No kernel configuration options required before v4.10.

Software Compatibility

In short, using Intel(R) RDT or PCM software together with Linux perf and cgroup frameworks is not allowed at the moment.

As disappointing as it is, use of Linux perf for CMT & MBM and Intel(R) RDT for CAT & CDP is not allowed. This is because Linux perf overrides existing CAT configuration during its operations.

There are a number of options to choose from in order to make use of CAT:

  • Intel(R) RDT software for CMT/MBM/CAT and CDP (core granularity only)
  • use Linux resctrl for CAT and Linux perf for monitoring (kernel 4.10+)
  • patch kernel with an out of tree cgroup patch (CAT) and only use perf for monitoring (CMT kernels 4.1+, MBM kernels 4.6+)

Table 4. Software interoperability matrix

                  | Intel(R) RDT | PCM    | Linux perf | Linux cgroup | Linux resctrl
Intel(R) RDT      | Yes(1)       | Yes(2) | Yes(5)     | No           | Yes(5)
PCM               | Yes(2)       | Yes    | No         | No           | No
Linux perf        | Yes(5)       | No     | Yes        | Yes(3)       | Yes
Linux cgroup      | No           | No     | Yes        | Yes(3)       | No
Linux resctrl (4) | Yes(5)       | No     | Yes        | No           | Yes

References:

  1. pqos monitoring from Intel(R) RDT can detect other pqos monitoring processes in the system. rdtset from Intel(R) RDT detects other processes started with rdtset and it will not use their CAT/CDP resources.

  2. pqos from Intel(R) RDT can detect that PCM monitors cores and it will not attempt to hijack the cores unless forced. However, if pqos monitoring is started first and then PCM is started then the latter one will hijack monitoring infrastructure from pqos for its use.

  3. Linux cgroup kernel patch https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt

  4. Linux kernel version 4.10 and newer. A wiki for Intel resctrl is available at: https://github.com/intel/intel-cmt-cat/wiki/resctrl

  5. Only with Linux kernel version 4.10 (and newer) and Intel(R) RDT version 1.0.0 (and newer) with the OS interface selected. See the '-I' option in 'man pqos' or use 'pqos-os'.

PCM is available at: https://github.com/opcm/pcm

Table 5. Intel(R) RDT software enabling status.

                  | Core   | Task   | CMT    | MBM    | L3 CAT | L3 CDP | L2 CAT | MBA
Intel(R) RDT      | Yes    | Yes(7) | Yes    | Yes    | Yes    | Yes    | Yes    | Yes
Linux perf        | Yes(6) | Yes    | Yes(1) | Yes(2) | No(3)  | No(3)  | No(3)  | No
Linux cgroup      | No     | Yes    | No     | No     | Yes(4) | No     | No     | No
Linux resctrl (5) | Yes    | Yes    | Yes(8) | Yes(8) | Yes    | Yes    | Yes    | Yes(9)

Legend:

  • Core - use of technology with core granularity
  • Task - use of technology per task or group of tasks

References:

  1. Linux kernel version 4.1 and newer
  2. Linux kernel version 4.6 and newer
  3. Linux perf corrupts CAT and CDP configuration even though it doesn't enable it
  4. This is a patch and relies on Linux perf enabling
  5. Linux kernel version 4.10 and newer
  6. perf API allows for CMT/MBM core monitoring but returned values are incorrect
  7. Intel(R) RDT version 1.0.0 monitoring only and depends on kernel support
  8. Linux kernel version 4.14 and newer
  9. Linux kernel version 4.12 and newer

Legal Disclaimer

THIS SOFTWARE IS PROVIDED BY INTEL "AS IS". NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS ARE GRANTED THROUGH USE. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

intel-cmt-cat's People

Contributors

aboczkox, adziarnx, ahetheri, aleksinx, axecalever, babumoger, bdoole1, bmwiedemann, cjshanahan, colinianking, fengyuleidian0615, gsauthof, jbizimun, jodh-intel, klosowskimarcinx, kmabbasi, marquiz, mdcornu, mstarzyx, olsajiri, philippwendler, rjablonx, rkanagar, rstorozh, sitheek, tkanteck, uzairbex, vkarpenk, wandralx, xiaochenshen


intel-cmt-cat's Issues

BUG?

    enum pqos_mon_event {
            PQOS_MON_EVENT_L3_OCCUP = 1, /**< LLC occupancy event */
            PQOS_MON_EVENT_LMEM_BW = 2,  /**< Local memory bandwidth */
            PQOS_MON_EVENT_TMEM_BW = 4,  /**< Total memory bandwidth */
            PQOS_MON_EVENT_RMEM_BW = 8,  /**< Remote memory bandwidth
                                              (virtual event) */

maybe PQOS_MON_EVENT_LMEM_BW should equal 0x4 based on Intel manual

pqos -T results in "Monitoring start error on core 0, status 3"

Hi,
I checked out v0.1.3, ran make, loaded the msr module with modprobe msr, and then ran pqos -T. Unfortunately, I get "Monitoring start error on core 0, status 3" as a result. Is this a bug or an issue with my system?

The output of pqos -T -v is as follows:

INFO: Detected core 47 on socket 1, cluster 1
INFO: Adding monitoring event: resource ID 1, type 1 to table index 0
INFO: Monitoring capability detected
INFO: CPUID.0x7.0: CAT not supported. Check brand string.
INFO: CPU brand string 'Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz'
WARN: Cache allocation not supported on model name 'Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz'!
INFO: L3CA capability not detected
INFO: No Linux perf process monitoring support. Ensure kernel version 4.0 or higher is installed.
INFO: Max RMID per monitoring cluster is 48
INFO: RMID internal tables allocated
INFO: Detected RMID47 is associated with core 0. Marking RMID & core unavailable.
INFO: Detected RMID46 is associated with core 1. Marking RMID & core unavailable.
INFO: monitoring init OK
INFO: allocation init OK
Monitoring start error on core 0, status 3

Build failure on tip

Hi there, I'm getting a build failure from the tip of the tree:

Bisected this down to:
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[6e0aedd] File layout change to accomodate new tools, features and extensions

cc -L../../../lib -lpqos -I../../../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -g -O2 -Wcast-align -Wnested-externs allocation_app.c -o allocation_app
/tmp/cc0jYqxq.o: In function `main':
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:202: undefined reference to `pqos_init'
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:244: undefined reference to `pqos_fini'
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:209: undefined reference to `pqos_cap_get'
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:216: undefined reference to `pqos_cpu_get_sockets'
/tmp/cc0jYqxq.o: In function `set_allocation_class':
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:139: undefined reference to `pqos_l3ca_set'
/tmp/cc0jYqxq.o: In function `print_allocation_config':
/home/king/repos/intel-cmt-cat/examples/c/CAT/allocation_app.c:171: undefined reference to `pqos_l3ca_get'
collect2: error: ld returned 1 exit status
<builtin>: recipe for target 'allocation_app' failed
make[1]: *** [allocation_app] Error 1
make[1]: Leaving directory '/home/king/repos/intel-cmt-cat/examples/c/CAT'
Makefile:56: recipe for target 'all' failed
make: *** [all] Error 2

Do core base L3 allocations extend to kernel/ring0 code?

Do core based limitations extend to Kernel code? So if I have a kernel module run on a particular core, will it be limited to the L3 allocation for that particular core or are the limitations just for userspace applications?

Can't locate object method "swig_ways_mask_get" via package "pqos::pqos_l3ca" at /usr/local/lib64/perl5/pqos.pm line 33.

System details:

# swig -version

SWIG Version 2.0.10

Compiled with g++ [x86_64-redhat-linux-gnu]

Configured options: +pcre

# perl --version

This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi
(with 29 registered patches, see perl -V for more detail)

# gcc -dumpversion
4.8.5

# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

I am running on Jun 10 commit (43bc01e911b696f9d91299f9183ed5f859605ee1).

It can be triggered by calling ./examples/perl/hello_world.pl or ./lib/perl/test.pl.

Unexpected action in mask definition

Hello to everyone!

I am experiencing the following issue with the cache allocation feature:
I am trying to define the mask of a Class of Service to be 0x80000, and another one to be 0x7FFFF, (so I am trying to achieve an isolation scheme).
pqos -e 'llc:1=0x80000;llc:2=0x7ffff'

However, after receiving the appropriate message of success, I see that the mask of the COS which was defined as 0x80000, is instead 0xC0000, which means that I am getting an extra way of associativity, and the isolation is not achieved.

I inserted debug messages in the process of defining the mask, but in no point did I find an error or a change in the mask, until the call to msr_write in host_allocation.c

CPU: Intel Xeon CPU E5-2699 v4 @ 2.20GHz

rdtset make all failed

user@ubuntu:~/ankur/intel-cmt-cat/rdtset$ sudo make all
cc -MM -MP -MF rdtset.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 rdtset.c
cat rdtset.d | sed 's/rdtset.o/rdtset.d/' >> rdtset.d
cc -MM -MP -MF rdt.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 rdt.c
cat rdt.d | sed 's/rdt.o/rdt.d/' >> rdt.d
cc -MM -MP -MF cpu.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 cpu.c
cat cpu.d | sed 's/cpu.o/cpu.d/' >> cpu.d
cc -MM -MP -MF common.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 common.c
cat common.d | sed 's/common.o/common.d/' >> common.d
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 -c -o common.o common.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 -c -o cpu.o cpu.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 -c -o rdt.o rdt.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -D_GNU_SOURCE -Wcast-align -Wnested-externs -g -O3 -c -o rdtset.o rdtset.c
cc -L../lib -fPIE -z noexecstack -z relro -z now common.o cpu.o rdt.o rdtset.o -lpqos -lpthread -o rdtset
rdt.o: In function `alloc_release':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:862: undefined reference to `pqos_alloc_release'
rdt.o: In function `cat_set':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:1030: undefined reference to `pqos_alloc_assign'
rdt.o: In function `cat_configure_cos':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:972: undefined reference to `pqos_l2ca_set'
rdt.o: In function `alloc_release':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:862: undefined reference to `pqos_alloc_release'
rdt.o: In function `check_cpus':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:511: undefined reference to `pqos_alloc_assoc_get'
rdt.o: In function `cat_reset':
/home/xiaoning/ankur/intel-cmt-cat/rdtset/rdt.c:1101: undefined reference to `pqos_alloc_assoc_set'
collect2: error: ld returned 1 exit status
Makefile:86: recipe for target 'rdtset' failed
make: *** [rdtset] Error 1

Error: "Monitoring start error on core(s) �D, status 2"

Hello again,

I have done some development in order to manage CAT dynamically, but by mistake I deleted all the files. Fortunately I had a previous copy, which I am now trying to use. The command line instructions work (for example, ./pqos -m:1-9 shows everything), but when I press Ctrl+C I get this error: "Monitoring start error on core(s) �D��, status 2". monitor.c says that this error may occur because two instances of PQoS attempt to use the same core id. I deleted libpqos in /var/lock, but that did not fix it. I should also mention that when I try to store results in a .csv file, the .csv is empty (nothing is stored).
How could I fix this ?

Thank you very much in advance,
Dimitra

Questions about info returned from pqos

Hi,

pqos -sv on my platform returns:

INFO: L3 CAT details: CDP support=1, CDP on=0, #COS=16, #ways=20, ways contention bit-mask 0xc0000
  1. What is the "ways contention bit-mask"?

  2. Does #ways=20 refer to the L3 associativity ways, or the subsets to which L3 can be partitioned? The Intel System Programming Guide states (17.18.2):

    Though the representation of CBMs looks similar to a way-based mapping they are independent
    of any enforcement implementation (e.g. way partitioning)

    I understand that, on my platform, cache partitioning is performed on a way-basis, but this might be
    different in future implementations.

Assignment of COS to cores per socket

Hi,

I need to assign CPU cores from different sockets to their respective COS on each socket.

For example, I have these COS:

pqos -e llc@1:1=0x0003E;llc@1:2=0x007C0;llc@0:1=0x0001E;llc@0:2=0x007E0

Now I want to assign the cores per socket (made up example):

pqos -a "llc@1:1=0,2,6-10;llc@0:1=1;"

However, the command as stated above does not work.
Any hints on how to specify which socket to use when assigning COS to cores?

Does this happen automatically depending on the location of the core?

Intel Resource Director Technology and Intel Performance Counter Monitor

Hello,
I would like to know whether Intel Resource Director Technology and Intel Performance Counter Monitor can be used together. I want to monitor some resources that are only available in one or the other. However, both access the model-specific registers (MSRs), which may cause interference. Can they run in parallel or not?

Null pointer dereference in parse_allocation_cos(), alloc.c

A parse_error is reported on detection of a NULL, but immediately afterwards a zero char is written through the NULL pointer, which will cause a segfault.

    p = strchr(str, '=');
    if (p == NULL)
            parse_error(str, "invalid class of service definition");
    *p = '\0';

Question about L3 and L2 CAT and MBA

Hi,

I saw in README that both L2 and L3 CAT are supported in the software.
(1) Is L2 shared among all cores or only between the hardware threads?
(2) Is there any Intel CPU that supports both L2 and L3 CAT?

As to the Memory Bandwidth Allocation,
(3) Is there any Intel CPU that supports Memory Bandwidth Allocation now? When should we expect to be able to purchase one on the market?

Thank you very much for your help!

Meng

use of ncurses rather than ANSI escape sequences to produce tty output

Using ANSI escape sequences works on a limited subset of terminals. It may be a good idea to use ncurses instead for a couple of reasons:

1. it is more efficient over slow tty serial connections
2. it supports more ttys
3. it is less flickery
4. it will restore the screen when complete

Just a wish really, and would make the tool look more polished.

lib/Makefile calls ldconfig. it must not

ldconfig is a root-only operation, and distros and users alike do not and should not build things as root.... there is a reason no other package uses ldconfig as part of its makefiles

(and please consider using autoconf or the like. yes autoconfig is evil, but speaking as a distro creator, the Makefiles in this project are much worse than autoconf)

Hardware abstraction layer for testing

Hi Team,

Do you have a hardware abstraction layer available to release for testing?
I wish to try a few patches on hardware I don't have access to.

Best,
Aaron

Getting error on Ubuntu16.04

I am trying to monitor a process. Getting below mentioned error.

#sudo pqos -I -V -s

INFO: Monitoring capability detected
INFO: CPUID.0x7.0: L3 CAT supported
INFO: CDP is enabled
INFO: L3 CAT details: CDP support=1, CDP on=1, #COS=8, #ways=20, ways contention bit-mask 0xc0000
INFO: L3 CAT details: cache size 36700160 bytes, way size 1835008 bytes
INFO: L3CA capability detected
INFO: CPUID 0x10.0: L2 CAT not supported!
INFO: L2CA capability not detected
INFO: CPUID 0x10.0: MBA not supported!
INFO: MBA capability not detected
INFO: resctrl not detected. Kernel version 4.10 or higher required
ERROR: OS interface selected but not supported
ERROR: discover_os_capabilities() error 1
Error initializing PQoS library!

root@oss7:~# sudo pqos -I -p all:19017
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
ERROR: OS interface selected but not supported
ERROR: discover_os_capabilities() error 1
Error initializing PQoS library!

# Server Model:
model name : Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz

Unexpected results in LLC Footprint and Question about a member struct

Hello again :)

[Question about pqos reported measurements]
I am running some scenarios that try to exploit the CAT capability. However, pqos reports some unexpected results and I was wondering how valid the reported measurements are. I attach the following output from a single execution scenario. To be more specific, an application is running and I execute "pqos -m: [0-16],[17-21]", and after that (the application continues to run, I am not restarting it) I execute "pqos -m: [0-5],[6-11],[12-16],[17-21]". I expected the LLC reported for group [17-21] by the first command to be equal to the LLC reported for the same group ([17-21]) by the second command. I also expected the reported LLC footprint of group [0-16] to be equal to the sum of the LLC footprints of groups [0-5], [6-11] and [12-16]. But the pqos output confirms neither of these.

TIME 2017-11-06 16:15:30
CORE IPC MISSES LLC[KB] MBL[MB/s] MBL[MB/s]
0-16 , 1.69 , 3751k , 3168.0 , 51.8 , 120.6
17-21 , 1.71 , 404k , 27720.0 , 6.1 , 6.9

TIME 2017-11-06 16:16:08
CORE IPC MISSES LLC[KB] MBL[MB/s] MBL[MB/s]
0-5 , 0.37 , 3345k , 3200.0 , 26.7 , 94.3
6-11 , 0.88 , 435k , 2024.0 , 19.9 , 11.8
12-16 , 2.48 , 452k , 352.0 , 17.2 , 6.8
17-21 , 1.78 , 340k , 43560.0 , 2.3 , 9.5

[Question about a field in a struct]
Also, I was wondering what "unsigned num_poll_ctx; /**< number of poll contexts */" in "struct pqos_mon_data", is used for ? It is initialized and used in function "hw_mon_start", but I cannot completely understand its use..

Thank you very much in advance,
Dimitra

Question about the CAT support on Celeron N3350

Hi,

Does the Intel Celeron N3350 (based on the Apollo Lake platform) have the Intel CAT technology?
How many CLOS registers does the processor support?
How many bits are in the CBM field of the CLOS registers on the processor?

I'm sorry for bothering you about the details.
I couldn't find the information anywhere else.
I want to make sure the Intel SoC I will buy does have the necessary functionalities of Intel CAT.

Thanks,

Meng

How fine-grained can the cache assignment be for each COS?

How fine-grained can the cache assignment be for each COS? Is there a limitation to how fine-grained the assignment can be, for instance, can I assign 1MB to a given COS?

I have successfully used the sample CFG0, which divides the cache equally between all the classes of service. I am using an Intel(R) Xeon(R) CPU E5-2618L v3 @ 2.30GHz machine.
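As a rough illustration of the granularity question, and assuming the cache is partitioned in units of whole ways, the smallest assignable slice would be the cache size divided by the number of ways. The sizes below are hypothetical example values, not figures read from the E5-2618L v3:

```python
def way_size_kb(cache_size_kb: int, num_ways: int) -> float:
    """Capacity covered by a single way (one bit of the mask), in KB."""
    if num_ways <= 0:
        raise ValueError("num_ways must be positive")
    return cache_size_kb / num_ways


# Hypothetical cache: 20 MB split into 20 ways.
granularity_kb = way_size_kb(cache_size_kb=20 * 1024, num_ways=20)
print(f"smallest slice under these assumptions: {granularity_kb} KB")
```

Under these assumed numbers a 1MB assignment would map to exactly one way; with a different way count the result changes accordingly.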

Bandwidth Measurement and time interval

Hi everyone,

I am trying to figure out how MBL is reported by pqos as the time interval changes. For example, does pqos report the average value calculated over the interval or the maximum value detected? To be more specific, what kind of measurements should I expect when the time interval is 1 (which means that pqos reports the monitoring values every 100ms) and when it is 5 (500ms)? In the latter case (500ms), are the reported values the same as if I had taken 5 measurements with time interval 1 and then calculated their average?

The function pqos_core_poll in /lib/monitoring.c seems to have the answer. Does num_poll_ctx, which is the number of poll contexts, indicate that the reported MBL value is derived by polling and reading the MSRs?
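To make the averaging question concrete, here is a small sketch. It assumes the bandwidth is derived from the difference between two readings of a cumulative byte counter divided by the elapsed time; that is an assumption for illustration, not a description of what pqos_core_poll does. The counter readings are made up.

```python
MB = 1024 * 1024


def bandwidth_mb_s(counter_start: int, counter_end: int, seconds: float) -> float:
    """Average bandwidth over one interval, from two byte-counter readings."""
    return (counter_end - counter_start) / seconds / MB


# Hypothetical cumulative byte counter sampled every 100 ms (six readings).
readings = [0, 10 * MB, 30 * MB, 35 * MB, 50 * MB, 100 * MB]

# Five consecutive 100 ms measurements, then their arithmetic mean.
short_rates = [bandwidth_mb_s(a, b, 0.1) for a, b in zip(readings, readings[1:])]
mean_of_short = sum(short_rates) / len(short_rates)

# One 500 ms measurement spanning the same period.
long_rate = bandwidth_mb_s(readings[0], readings[-1], 0.5)

print(short_rates, mean_of_short, long_rate)
```

Under this assumption the single 500ms value matches the mean of the five 100ms values, while short bursts are only visible at the shorter interval.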

Thank you in advance,
Dimitra Giantsidi

COS out of range check needed for L3 CAT in pqos utility.

Right now, when the user tries to set an L3 CAT COS with an invalid class ID using the pqos utility, the library returns an error and the application simply fails. A check should be added to the pqos utility to verify that the specified class ID is valid and, if it is not, to print a message telling the user so.

Query regarding limit of four classes of Service

Is the limit of 4 classes of service (COS) on the Intel(R) Xeon(R) processor E5 v3 generation permanent, or is it imposed by the current software implementation and likely to expand in the future? I ask because section 17.16.3.3, Cache Mask Configuration, of the Intel Software Developer's Manual mentions that the range of MSRs reserved for CAT can accommodate up to 256 COS or multiple resources.

Core id's are enumerated twice

Configuration
SW: Linux 3.19.0-15-generic x86_64 GNU/Linux
HW: CPU model is one of pre-production engineering samples

Steps to reproduce
The problem seems to affect monitoring only. The following commands report an error when setting up monitoring:
pqos
pqos -T

Workaround
Specify list of monitored cores:
pqos -m ':0-31'
pqos -T -m ':0-31'

Root cause
The /proc/cpuinfo file is used to detect CPUs on Linux, and with this particular kernel version the format of the file is slightly different, which produces the above problem.

Fix
Fix has been tested and patch will be submitted shortly.

Some questions about CAT

Hi everyone!

I have some questions about CAT:

  1. I know the number of classes and the number of cache ways can vary, but how can I change them? Is there a pqos function for changing them?
  2. I see there are several sockets and each socket has its own COS. When I change a COS mask, it changes on all sockets. How can I change it on only one of them? For example, I only want to change the COS1 mask on socket 0.

Thanks for your answer !

Questions about the Intel-cmt-cat tool and LLC granularity

Hello Aaron,

I hope this email finds you well, and my apologies for bothering you here. I am a researcher and have been running some experiments based on Intel's Cache Monitoring Technology using the tool released on GitHub. I have some questions concerning the precision of the LLC occupancy monitoring. I was not sure whether creating an issue on the GitHub repository would be appropriate, so I am asking the questions via email.

Here is what I would like to figure out: looking at the LLC occupancy traces, the data always appears to be a multiple of 81920 bytes on my machine. I used cpuid to read the upscaling factor (cpuid.0xf.1:ebx) and it reports 81920 as well. Here are the questions:

  1. Is 80K the finest granularity that CMT LLC occupancy can report?
  2. Can you give me a hint as to the general precision of LLC monitoring?
  3. Is it possible to get finer-granularity measurements?
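The observation about multiples of 81920 bytes can be sketched as follows, assuming the reported occupancy is the raw counter value multiplied by the upscaling factor read from CPUID. The raw counter values here are made up for illustration.

```python
UPSCALING_FACTOR = 81920  # bytes per counter unit, the value quoted in the post


def occupancy_bytes(raw_count: int, factor: int = UPSCALING_FACTOR) -> int:
    """Convert a raw occupancy counter value into bytes."""
    return raw_count * factor


# Hypothetical raw counter readings.
for raw in (0, 1, 2, 3):
    size = occupancy_bytes(raw)
    print(raw, size, size // 1024, "KB")
```

Under that assumption every reported value is a whole multiple of the factor, so the step between two distinct readings cannot be smaller than 80KB on this machine.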

My machine configuration is as follows:
Intel Xeon CPU E5-2698 v4
Linux kernel: 4.10.12-1

I appreciate your help. :)

--
Best regards

Fan

Fix casting for C++ compiling

Hi,

Please could we add explicit casts in pqos.h for these assignments on line 1187:

        uint64_t * const p_64 = value;
        double * const p_dbl = value;

to

        uint64_t * const p_64 = (uint64_t *)value;
        double * const p_dbl = (double *)value;

Without them, this fails to compile with a C++ compiler unless -fpermissive is used.

Sorry this isn't a pull request; I'm behind a convoluted firewall, which makes it difficult.

Thanks,
Paul

Some records missing in 24 hour monitoring test

In a 24-hour monitoring test of multiple application instances that sample data at a 1[s] interval, a number of records were found to be missing. The record loss rate is about 0.01% (1 per 10,000); a "loss" manifests itself as a 2[s] difference between consecutive records in the monitoring output.
The root cause is not yet known and the issue needs to be investigated.
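One way to spot such losses in a captured log is to compare consecutive timestamps. This is a minimal sketch, assuming the timestamps have already been extracted as strings in the "YYYY-MM-DD HH:MM:SS" form; the sample values are invented.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected=timedelta(seconds=1)):
    """Return (previous, current) pairs spaced further apart than expected."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps]
    return [(a, b) for a, b in zip(parsed, parsed[1:]) if b - a > expected]


# Hypothetical record timestamps with one missing sample at 16:15:32.
sample = [
    "2017-11-06 16:15:30",
    "2017-11-06 16:15:31",
    "2017-11-06 16:15:33",
    "2017-11-06 16:15:34",
]
gaps = find_gaps(sample)
for before, after in gaps:
    print("gap of", after - before, "after", before)
```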

Proposed new feature changes - feedback request (L2CDP)

Issue outline:

Support for Intel(R) Resource Director Technology (RDT) Code and Data Prioritization (CDP) for L2 cache is planned to be added to the library and applications.
The feature is outlined in the Intel(R) Software Developer's Manual, Chapter 17.19.6.

Issue:
The proposed changes require updates to pqos_l2ca data structure that are NOT backwards compatible.

Impact:
Any applications currently using L2 CAT API will need to be updated to use the latest version of the library.

Request:

If these changes severely impact your applications and you would like to see a solution that is backwards compatible with current applications, please provide feedback before Jan 22nd 2018.

If no feedback is received, the changes will be implemented.

Proposed changes:
Current L2 Cache Allocation Class of Service data structure:

struct pqos_l2ca {
        unsigned class_id;      /**< class of service */
        uint32_t ways_mask;     /**< bit mask for L2 cache ways */
};

Updated structure:

struct pqos_l2ca {
        unsigned class_id;              /**< class of service */
        int cdp;                        /**< data & code masks used if true */
        union {
                uint64_t ways_mask;     /**< bit mask for L2 cache ways */
                struct {
                        uint64_t data_mask;
                        uint64_t code_mask;
                } s;
        } u;
};

Question about CBM length and minimum number of cache partitions supported

Hi,

The README only states the number of CLOS registers.
I'm wondering whether there is any document that gives the length of the CBM field in the CLOS registers on the following processors:

  1. Intel Xeon Processor E5-4620 v4
  2. Intel Xeon D-1518
  3. Intel Xeon Processor E5-2683 v4

I remember that the minimum number of cache partitions for a core is at least 2 on Haswell processors.
Does this constraint exist for any of the above three processors?

I want to confirm the CAT properties of these processors before buying any of them.
BTW, it would be really helpful to have a document detailing each processor's CAT and CMT capabilities, such as the CBM length.

Thank you very much!

Meng

Error when trying to monitor a process

I am trying to monitor a process with the following command:
./pqos -p llc:69866
but I keep receiving the same error message:
PID 69866 monitoring start error,status 1
Am I doing something wrong?

Using CAT with Cluster on Die (COD) configuration

We investigated CAT on a server with a Cluster on Die configuration. The results were surprising: assigning half the cache of a Xeon E5 v4 with 32MB LLC to one class resulted in this class using only 8 MB of cache, instead of the 15 MB it uses without CAT.

From the documentation it seemed that there are no special provisions required when using CAT with CoD, or am I missing something?

Core information for socket 0:
    Core 0 => COS3, RMID0
    Core 1 => COS3, RMID0
    Core 2 => COS1, RMID0
    Core 3 => COS2, RMID0
    Core 4 => COS2, RMID0
    Core 5 => COS2, RMID0
    Core 12 => COS1, RMID0
    Core 13 => COS1, RMID0
    Core 14 => COS1, RMID0
    Core 15 => COS2, RMID0
    Core 16 => COS2, RMID0
    Core 17 => COS2, RMID0
Core information for socket 1:
    Core 6 => COS1, RMID0
    Core 7 => COS1, RMID0
    Core 8 => COS1, RMID0
    Core 9 => COS2, RMID0
    Core 10 => COS2, RMID0
    Core 11 => COS2, RMID0
    Core 18 => COS1, RMID0
    Core 19 => COS1, RMID0
    Core 20 => COS1, RMID0
    Core 21 => COS2, RMID0
    Core 22 => COS2, RMID0
    Core 23 => COS2, RMID0
numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 12 13 14
node 1 cpus: 3 4 5 15 16 17
node 2 cpus: 6 7 8 18 19 20
node 3 cpus: 9 10 11 21 22 23
[...]

pqos make all failed

user@ubuntu:~/ankur/intel-cmt-cat/pqos$ sudo make all
cc -MM -MP -MF monitor.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 monitor.c
cat monitor.d | sed 's/monitor.o/monitor.d/' >> monitor.d
cc -MM -MP -MF main.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 main.c
cat main.d | sed 's/main.o/main.d/' >> main.d
cc -MM -MP -MF alloc.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 alloc.c
cat alloc.d | sed 's/alloc.o/alloc.d/' >> alloc.d
cc -MM -MP -MF profiles.d -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 profiles.c
cat profiles.d | sed 's/profiles.o/profiles.d/' >> profiles.d
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 -c -o profiles.o profiles.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 -c -o alloc.o alloc.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 -c -o main.o main.c
cc -I../lib -W -Wall -Wextra -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-qual -Wundef -Wwrite-strings -Wformat -Wformat-security -fstack-protector -fPIE -D_FORTIFY_SOURCE=2 -Wunreachable-code -Wmissing-noreturn -Wsign-compare -Wno-endif-labels -Wcast-align -Wnested-externs -g -O3 -c -o monitor.o monitor.c
cc -L../lib -fPIE -z noexecstack -z relro -z now profiles.o alloc.o main.o monitor.o -lpqos -lpthread -o pqos
alloc.o: In function `alloc_print_config':
/home/xiaoning/ankur/intel-cmt-cat/pqos/alloc.c:718: undefined reference to `pqos_l2ca_get'
/home/xiaoning/ankur/intel-cmt-cat/pqos/alloc.c:751: undefined reference to `pqos_alloc_assoc_get'
alloc.o: In function `set_allocation_assoc':
/home/xiaoning/ankur/intel-cmt-cat/pqos/alloc.c:560: undefined reference to `pqos_alloc_assoc_set'
alloc.o: In function `set_alloc':
/home/xiaoning/ankur/intel-cmt-cat/pqos/alloc.c:127: undefined reference to `pqos_l2ca_set'
/home/xiaoning/ankur/intel-cmt-cat/pqos/alloc.c:127: undefined reference to `pqos_l2ca_set'
main.o: In function `main':
/home/xiaoning/ankur/intel-cmt-cat/pqos/main.c:716: undefined reference to `pqos_mon_reset'
/home/xiaoning/ankur/intel-cmt-cat/pqos/main.c:728: undefined reference to `pqos_alloc_reset'
collect2: error: ld returned 1 exit status
Makefile:80: recipe for target 'pqos' failed
make: *** [pqos] Error 1

Requested but not Supported by System (L2/MBA)

Having trouble using CAT with the L2.

System Information:
[uname -r]
4.10.0-041000-generic

[cpuinfo]
processor : 47
vendor_id : GenuineIntel
cpu family : 6
model : 79
model name : Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
stepping : 1

[/etc/os-release]
NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian

When using pqos to do L2 or MBA allocation
sudo pqos -e l2:2=0x1
I get this error:

Socket 0 MBA COS1 - FAILED!
Allocation configuration error!

L2ID 0 L2CA COS2 - FAILED!
Allocation configuration error!

Tried same thing with RDTSET
sudo rdtset -t 'l2=0xf;cpu=2' -c 2 sleep 10

got this error:
Allocation: L2CA requested but not supported by system!
Requested configuration is not valid!
Allocation: Failed to configure allocation!

Tried with resctrl
echo "L2:0=1;1=1" > COS1/schemata
error: bash: echo: write error: Invalid argument

I am able to do L3 CAT using all three methods, but L2 and MBA allocation fail with errors. Am I allocating these incorrectly? Thanks.

Bit masks for different cache sizes

Can you please explain how the bitmasks for the different cache sizes are determined?

I want to vary the size of a partition from 2MB all the way to 15MB in increments of 1MB but can't seem to figure out what the bitmasks would be for each cache size. Thanks!
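A possible way to derive such masks is sketched below. It assumes each mask bit stands for one cache way, that the set bits of a mask have to be contiguous, and a hypothetical cache of 20 ways of 1MB each; the real way count and way size depend on the processor and are not taken from this thread.

```python
import math


def contiguous_mask(size_mb: float, way_size_mb: float, num_ways: int,
                    first_way: int = 0) -> int:
    """Bitmask of contiguous ways covering at least size_mb, from first_way up."""
    ways = math.ceil(size_mb / way_size_mb)
    if ways < 1 or first_way + ways > num_ways:
        raise ValueError("request does not fit in the cache")
    return ((1 << ways) - 1) << first_way


# Hypothetical cache: 20 ways, 1 MB per way. Partition sizes from 2 MB to 15 MB.
masks = {mb: hex(contiguous_mask(mb, 1.0, 20)) for mb in range(2, 16)}
for mb, mask in masks.items():
    print(f"{mb:2d} MB -> {mask}")
```

When the way size is not 1MB, requested sizes are rounded up to whole ways, so not every 1MB step produces a different mask.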

make clean not cleaning lib directory

$ make clean
$ ls lib/*.[ao]
lib/cpuinfo.o lib/host_cap.o lib/host_pidapi.o lib/log.o lib/utils.o
lib/host_allocation.o lib/host_monitoring.o lib/libpqos.a lib/machine.o

Switching task execution while monitoring

Hi!
I'm conducting an experiment where I use monitoring on a per thread basis to monitor task events. The way I'm trying to achieve it is by:

  • When a task starts execution, the thread executing it starts monitoring for PQoS events through pqos_mon_start_pid.
  • When a task stops for whatever reason (i.e. blocked but not finished) the thread accumulates its events (pqos_mon_poll) into the task.
  • When a task that was blocked restarts its execution, the thread resumes monitoring through pqos_mon_start_pid.

I am getting various errors due to starting monitoring more than once in a thread. I assumed I would be able to do it this way in order to restart data structures, as pqos_mon_stop frees them. Are there any functions that allow resetting the monitoring of threads (events) without destroying data structures? I'm basically looking for a "start-pause-resume-stop" pattern.

Thanks in advance,

Toni

cpus file looks strange?

According to https://github.com/01org/intel-cmt-cat/wiki/resctrl reading the cpus file should look something like:

# cat cpu
	c
# cat COS1/cpu
	3

On my system however the output is something like:

$ cat /sys/fs/resctrl/cpus 
ffff,e98a2f85
$ cat /sys/fs/resctrl/cobench0/cpus 
0000,00248008
$ cat /sys/fs/resctrl/cobench1/cpus 
0000,16515072

It is a two-socket NUMA system. Do you have any idea how to interpret the cpus files? Are the two NUMA domains separated by the comma? Why is the second mask so much bigger?
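For reference, Linux prints CPU masks in sysfs-style files as hexadecimal, split by commas into 32-bit groups with the most significant group first. Assuming the cpus file follows that convention, a mask can be decoded into CPU ids like this (the helper name is made up for the example):

```python
def cpus_from_mask(mask: str) -> list:
    """Decode a CPU bitmask string such as '0000,00248008' into CPU ids."""
    # Commas only separate 32-bit groups, so dropping them yields one hex number.
    value = int(mask.strip().replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print(cpus_from_mask("0000,00248008"))
```

With this reading the comma does not separate NUMA domains, and each digit is interpreted as hexadecimal rather than decimal.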

Some system info:

$ cat /proc/cpuinfo | grep "model name" | uniq
model name	: Intel(R) Xeon(R) CPU E5-2658 v3 @ 2.20GHz

$ uname -a
Linux 4.10.0+ #1 SMP Tue Apr 11 16:03:07 CEST 2017 x86_64 x86_64 x86_64 GNU/Linux

CC: @stlankes @spickartz

Identifying cores for CAT

We use lstopo to get a complete view on the resource topology of the server we run pqos on. The comparison of the output of both tools does not lead to an obvious way to map the core and PU ids from lstopo to pqos (see bottom of this message). Intel DPDK uses the same core ids (os_index) as lstopo.

It would be great to know how the enumeration and the corresponding identification of CPU cores works with pqos.

lstopo:

Machine (126GB total)
  Package L#0
    NUMANode L#0 (P#0 31GB) + L3 L#0 (15MB)
        L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0)
        L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#2)
        L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#4)
        L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#6)
        L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4 + PU L#4 (P#8)
        L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5 + PU L#5 (P#10)
    NUMANode L#1 (P#2 31GB) + L3 L#1 (15MB)
      L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6 + PU L#6 (P#12)
      L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7 + PU L#7 (P#14)
      L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8 + PU L#8 (P#16)
      L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9 + PU L#9 (P#18)
      L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10 + PU L#10 (P#20)
      L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11 + PU L#11 (P#22)
  Package L#1
    NUMANode L#2 (P#1 31GB) + L3 L#2 (15MB)
      L2 L#12 (256KB) + L1d L#12 (32KB) + L1i L#12 (32KB) + Core L#12 + PU L#12 (P#1)
      L2 L#13 (256KB) + L1d L#13 (32KB) + L1i L#13 (32KB) + Core L#13 + PU L#13 (P#3)
      L2 L#14 (256KB) + L1d L#14 (32KB) + L1i L#14 (32KB) + Core L#14 + PU L#14 (P#5)
      L2 L#15 (256KB) + L1d L#15 (32KB) + L1i L#15 (32KB) + Core L#15 + PU L#15 (P#7)
      L2 L#16 (256KB) + L1d L#16 (32KB) + L1i L#16 (32KB) + Core L#16 + PU L#16 (P#9)
      L2 L#17 (256KB) + L1d L#17 (32KB) + L1i L#17 (32KB) + Core L#17 + PU L#17 (P#11)
    NUMANode L#3 (P#3 31GB) + L3 L#3 (15MB)
      L2 L#18 (256KB) + L1d L#18 (32KB) + L1i L#18 (32KB) + Core L#18 + PU L#18 (P#13)
      L2 L#19 (256KB) + L1d L#19 (32KB) + L1i L#19 (32KB) + Core L#19 + PU L#19 (P#15)
      L2 L#20 (256KB) + L1d L#20 (32KB) + L1i L#20 (32KB) + Core L#20 + PU L#20 (P#17)
      L2 L#21 (256KB) + L1d L#21 (32KB) + L1i L#21 (32KB) + Core L#21 + PU L#21 (P#19)
      L2 L#22 (256KB) + L1d L#22 (32KB) + L1i L#22 (32KB) + Core L#22 + PU L#22 (P#21)
      L2 L#23 (256KB) + L1d L#23 (32KB) + L1i L#23 (32KB) + Core L#23 + PU L#23 (P#23)

pqos:

Core information for socket 0:
    Core 0 => COS0, RMID0
    Core 2 => COS0, RMID0
    Core 4 => COS0, RMID0
    Core 6 => COS0, RMID0
    Core 8 => COS0, RMID0
    Core 10 => COS0, RMID0
    Core 12 => COS0, RMID0
    Core 14 => COS0, RMID0
    Core 16 => COS0, RMID0
    Core 18 => COS0, RMID0
    Core 20 => COS0, RMID0
    Core 22 => COS0, RMID0
Core information for socket 1:
    Core 1 => COS0, RMID0
    Core 3 => COS0, RMID0
    Core 5 => COS0, RMID0
    Core 7 => COS0, RMID0
    Core 9 => COS0, RMID0
    Core 11 => COS0, RMID0
    Core 13 => COS0, RMID0
    Core 15 => COS0, RMID0
    Core 17 => COS0, RMID0
    Core 19 => COS0, RMID0
    Core 21 => COS0, RMID0
    Core 23 => COS0, RMID0

Wrong documentation about the CAT capability in Processor D

The README in the repo says that all Processor D processors have 16 CLOS registers.

Based on this information, we purchased an Intel SoC with Intel Xeon CPU [email protected] processor. We expect it to have 16 COS registers.

However, after checking the cpuid.eax=10H, ecx=1H, we found it only has 4 COS registers and its CBM length is 12.

Could you please update the document to list the capabilities of each processor?
The incorrect information caused us to waste money on a CPU that does not best suit our requirements. :(

Thanks!

Meng

Update:
Removed the complaint about the CBM_Length, which is product-specific.
The CPU has 16 CLOS registers, but the cpuid instruction provides an incorrect result, probably due to a BIOS defect.
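For readers who want to repeat the check, the sketch below decodes the two fields discussed above from the register values returned by cpuid with eax=10H, ecx=1H. The bit positions follow my reading of the Software Developer's Manual (EAX[4:0] is the CBM length minus one, EDX[15:0] is the highest COS number), and the register values are hypothetical ones chosen to match the numbers in the post.

```python
def decode_l3_cat_leaf(eax: int, edx: int) -> dict:
    """Decode CBM length and COS count from CPUID.(EAX=10H, ECX=1H) registers."""
    return {
        "cbm_length": (eax & 0x1F) + 1,   # EAX[4:0] holds the length minus one
        "num_cos": (edx & 0xFFFF) + 1,    # EDX[15:0] holds the highest COS number
    }


# Hypothetical register values, not captured from real hardware.
info = decode_l3_cat_leaf(eax=0x0B, edx=0x03)
print(info)
```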

Thread-level Monitoring

Hi!

Does the API currently support thread-level monitoring? I cannot find any documentation on it.

I've searched through the documentation in the header files and an official document from Intel's website, and found that part of the API allows either CORE-level or SMT-level monitoring.
I found this in lib/cpuinfo.c, in the "detect_apic_core_masks" function.

I was also wondering if I can use the OS mode monitoring with thread ids (TID) instead of process ids (PID). A working example from the usage examples would be: ./monitoring_app -I -p threadId1...

Thanks,

Toni

[pqos] missing quotes for Core field in CSV output

Time,Core,IPC,LLC Misses,LLC[KB],MBL[MB/s],MBR[MB/s]
2016-06-14 01:40:53,0-15,32-47,0.10,2558,37248.0,976274.4,5309.5
                    ^^^^^^^^^^

Instead of a single field, this becomes two separate fields in CSV.

Should be:

2016-06-14 01:40:53,"0-15,32-47",0.10,2558,37248.0,976274.4,5309.5

This happens when, e.g. -m all:[0-15,32-47], is used.
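The expected quoting is what a standard CSV writer produces. As an illustration only (pqos itself is written in C and this is not its code), Python's csv module quotes a field when it contains the delimiter:

```python
import csv
import io


def csv_line(fields) -> str:
    """Format one record as a CSV line, quoting fields that contain commas."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(fields)
    return buffer.getvalue().rstrip("\n")


row = csv_line(["2016-06-14 01:40:53", "0-15,32-47", "0.10", "2558",
                "37248.0", "976274.4", "5309.5"])
print(row)
```

Reading the line back with a CSV parser then yields seven fields, with the core list kept as a single value.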

Error initializing PQoS library on purley machine

Is a particular kernel version required?

Linux PLY01SDP 4.8.0-22-generic #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
(reverse-i-search)`lsc': ^Ccpu
test@PLY01SDP:~/intel-cmt-cat$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 112
On-line CPU(s) list: 0-111
Thread(s) per core: 2
Core(s) per socket: 28
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: 06/55
Stepping: 2
CPU MHz: 1399.987
CPU max MHz: 3200.0000
CPU min MHz: 1000.0000
BogoMIPS: 3600.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 39424K
NUMA node0 CPU(s): 0-27,56-83
NUMA node1 CPU(s): 28-55,84-111

test@PLY01SDP:~/intel-cmt-cat$ pqos -m llc:0
NOTE: Mixed use of MSR and kernel interfaces to manage
CAT or CMT & MBM may lead to unexpected behavior.
WARN: Error opening file '/dev/cpu/0/msr'!
ERROR: CDP detection error!
ERROR: Fatal error encounter in L3 CAT discovery!
ERROR: discover_capabilities() error 1
Error initializing PQoS library!

What is meant by `number of classes = 4, number of cache ways = 12`

sudo ./pqos -H returns the following.
1) Config ID: CFG0
   Description: non-overlapping, ways equally divided
   Configurations:
     number of classes = 4, number of cache ways = 12
     number of classes = 4, number of cache ways = 16
     number of classes = 4, number of cache ways = 20
2) Config ID: CFG1
   Description: non-overlapping, ways unequally divided
   Configurations:
     number of classes = 4, number of cache ways = 12
     number of classes = 4, number of cache ways = 16
     number of classes = 4, number of cache ways = 20
3) Config ID: CFG2
   Description: overlapping, ways unequally divided, class 0 can access all ways
   Configurations:
     number of classes = 4, number of cache ways = 12
     number of classes = 4, number of cache ways = 16
     number of classes = 4, number of cache ways = 20
4) Config ID: CFG3
   Description: ways unequally divided, overlapping access for higher classes
   Configurations:
     number of classes = 4, number of cache ways = 12
     number of classes = 4, number of cache ways = 16
     number of classes = 4, number of cache ways = 20

What is meant by the lines "number of classes = 4, number of cache ways = 12", "number of classes = 4, number of cache ways = 16" and "number of classes = 4, number of cache ways = 20", which are repeated across all the options?
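One plausible reading of "non-overlapping, ways equally divided" is that, for a cache with a given number of ways, each class receives an equal, disjoint block of contiguous ways. The sketch below computes such masks for 4 classes and 12 ways; it only illustrates that reading and is not the mask table that pqos ships.

```python
def equal_masks(num_classes: int, num_ways: int) -> list:
    """Split num_ways into equal, non-overlapping contiguous masks per class."""
    per_class, remainder = divmod(num_ways, num_classes)
    if per_class == 0 or remainder:
        raise ValueError("ways do not divide evenly between classes")
    base = (1 << per_class) - 1
    return [base << (i * per_class) for i in range(num_classes)]


for ways in (12, 16, 20):
    print(ways, [hex(m) for m in equal_masks(4, ways)])
```

Under this reading, each "number of classes, number of cache ways" line would name one cache geometry for which the profile is defined.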

Thread-level Monitoring (2)

Hi once again!

With reference to #57, I've updated to a kernel that allows me to use the per-thread monitoring capability (more specifically, 4.13).

While using the API (./monitor_app -I also reports this), I get the following warning:

WARN: As of Kernel 4.10, Intel(R) RDT perf results per core are found to be incorrect.

Could I get any information as to what results are incorrect? Should I not take any measurements obtained with the API for granted? Also, does this mean per-thread results are incorrect as well?

Regards,

Toni
