microsoft / oms-agent-for-linux
Home Page: http://www.microsoft.com/oms
License: Other
We're using DCOS as a container OS, but every node in the Docker cluster shows the segmentation fault errors below in the log. Any idea?
[37649.866296] omiagent[13741]: segfault at 10 ip 00007f645fbbc869 sp 00007fff3cbfaad0 error 4 in libcontainer.so[7f645fb5d000+77000]
[37681.862753] omiagent[13962]: segfault at 10 ip 00007f083abf6869 sp 00007ffccbd94220 error 4 in libcontainer.so[7f083ab97000+77000]
I'm trying to integrate the OMS agent in my Kubernetes cluster but there are certain things that could be improved. I'm using CoreOS, so I had to run the OMS agent in a privileged container as described in https://hub.docker.com/r/microsoft/oms/. The fact that I have to configure the Docker daemon to use the fluentd log driver instead of the json driver prevents certain commands like "docker logs" or "kubectl logs" from functioning properly.
Can't the agent tail the JSON log files present in /var/log/containers/*.log?
Also, in the case of fluentd, there is a plugin that queries the Kubernetes API to inject additional metadata in the log lines. Does the OMS agent somehow make use of fluentd underneath? Can we use this plugin?
Currently, hardcoded paths are used to find syslog-ng.conf in linux.data.
E.g. OMS-Agent expects syslog-ng.conf in /etc/syslog-ng/syslog-ng.conf. On my distribution it is in /etc/syslog-ng.conf. Other valid locations include /usr/local/etc/syslog-ng.conf.
Possible solutions:
This also applies to other paths in linux.data, e.g. the rsyslog and sudoers configuration files.
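One way to avoid a single hardcoded location is to probe a short list of candidates and use the first one that exists. The Python sketch below only illustrates that idea; it is not how the agent's installer works, the function name find_config is made up, and the candidate order is an assumption based on the paths mentioned in this report.

```python
import os
from typing import Iterable, Optional

# Candidate locations taken from the report above; the order is an assumption.
SYSLOG_NG_CANDIDATES = (
    "/etc/syslog-ng/syslog-ng.conf",
    "/etc/syslog-ng.conf",
    "/usr/local/etc/syslog-ng.conf",
)


def find_config(candidates: Iterable[str] = SYSLOG_NG_CANDIDATES) -> Optional[str]:
    """Return the first candidate path that is an existing file, else None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None
```

The same lookup could be reused for the rsyslog and sudoers files by passing a different candidate list.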
No real issue, but rather a CR:
It would be good if the omsagent / fluentd call could have debug parameters passed (-vv) when service_control is called with a debug level.
START_QUALS="-d $PIDFILE --no-supervisor -o $LOGFILE -vv" <<
And my question:
Even with -vv I do not get plugin tracing information.
Looking for lines like
log.debug "Success sending #{tag} x #{count} in #{time.round(2)}s"
from the out_oms plugin.
Any hints?
My Ubuntu servers are not collecting update data (3 servers affected), and the log contains no information.
mat
Hey guys,
Is there a support for cert based authentication?
I mean, while installing the agent and onboarding OMS, instead of passing the keys as parameters, can I pass a certificate?
For some reason omsagent keeps running the following file:
/etc/opt/microsoft/scx/conf/sudodir/dh_builddeb
which includes the following line:
complex_doit("find $tmp -name $_ | xargs rm -rf")
Because of that, /tmp is constantly wiped and no other application can run normally while omsagent is running.
Symptom:
Every 5 minutes two new python processes are added to the process list and are never terminated. They all have the same parent PID.
omsagent 22076 21695 0 14:32 ? 00:00:00 [python]
....
omsagent 31358 21695 0 16:02 ? 00:00:00 [python]
omsagent 31360 21695 0 16:02 ? 00:00:00 [python]
omsagent 31862 21695 0 16:08 ? 00:00:00 [python]
omsagent 31864 21695 0 16:08 ? 00:00:00 [python]
omsagent 32401 21695 0 16:13 ? 00:00:00 [python]
omsagent 32403 21695 0 16:13 ? 00:00:00 [python]
...
[root@srv21 yum.repos.d]# ps -ef |grep omsagent | wc -l
69
[root@srv21 yum.repos.d]#
Same parent PID:
omsagent 21695 21663 0 14:28 ? 00:00:05 /opt/omi/bin/omiagent 11 14 --destdir / --providerdir /opt/omi/lib --idletimeout 90 --loglevel WARNING
I think the issue occurred after omsagent multi-homing with SCOM 2012 R2 was enabled.
Restarting the omsagent does not remediate this. The omiserver has not yet been restarted.
Error in /var/opt/omi/log/omiserver.log:
2016/02/22 17:33:35 [21663,21663] WARNING: null(0): EventId=30131 Priority=WARNING wsman: authentication failed for user [opsuser]
in the same intervals.
Even if this is related I would assume it is not desirable to fill up the process list.
Hi, guys
Can the SSL configuration for OMS's network traffic be hardened without negatively affecting Azure's infrastructure communications? At the moment, it sets off vulnerability scanners with the following:
Negotiated with the following insecure cipher suites:
SSL 3.0 ciphers: TLS_RSA_WITH_IDEA_CBC_SHA
TLS 1.0 ciphers: TLS_RSA_WITH_IDEA_CBC_SHA
TLS 1.1 ciphers: TLS_RSA_WITH_IDEA_CBC_SHA
TLS 1.2 ciphers: TLS_RSA_WITH_IDEA_CBC_SHA
Negotiated with the following insecure cipher suites:
SSL 3.0 ciphers: TLS_RSA_WITH_RC4_128_MD5, TLS_RSA_WITH_RC4_128_SHA
TLS 1.0 ciphers: TLS_RSA_WITH_RC4_128_MD5, TLS_RSA_WITH_RC4_128_SHA
TLS 1.1 ciphers: TLS_RSA_WITH_RC4_128_MD5, TLS_RSA_WITH_RC4_128_SHA
TLS 1.2 ciphers: TLS_RSA_WITH_RC4_128_MD5, TLS_RSA_WITH_RC4_128_SHA
Would it be possible to follow better security practices and disable weak ciphers?
The Container Memory Usage tile uses this search
Type=Perf ObjectName=Container CounterName="Memory Usage MB" | Measure Avg(CounterValue) as AvgUsedMemory by InstanceName interval 30minute
However, this provides a list of container IDs. Can this be changed (or can you help me define a custom search) to provide the same data, but labelled by the name of the container (i.e. ContainerInventory.Name)? In Swarm 1.12 this is particularly useful.
Version omsagent-1.1.0-239.universal.x64.sh
root@us:~# service omsagent restart
* Shutting down Operations Management Suite agent: [ OK ]
* Starting Operations Management Suite agent: /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require': /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/x86_64-linux/openssl.so: undefined symbol: SSLv3_method - /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/x86_64-linux/openssl.so (LoadError)
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/openssl.rb:17:in `<top (required)>'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/net/https.rb:22:in `<top (required)>'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/plugin/oms_common.rb:11:in `<class:Common>'
from /opt/microsoft/omsagent/plugin/oms_common.rb:8:in `<module:OMS>'
from /opt/microsoft/omsagent/plugin/oms_common.rb:1:in `<top (required)>'
from /opt/microsoft/omsagent/plugin/changetracking_lib.rb:6:in `require_relative'
from /opt/microsoft/omsagent/plugin/changetracking_lib.rb:6:in `<top (required)>'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/plugin.rb:89:in `block in load_plugin_dir'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/plugin.rb:87:in `each'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/plugin.rb:87:in `load_plugin_dir'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/engine.rb:138:in `load_plugin_dir'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:513:in `block in init_engine'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:510:in `each'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:510:in `init_engine'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:166:in `block in start'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:360:in `call'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:360:in `main_process'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/supervisor.rb:164:in `start'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/lib/fluent/command/fluentd.rb:173:in `<top (required)>'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:69:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:69:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.24/bin/fluentd:5:in `<top (required)>'
from /opt/microsoft/omsagent/bin/omsagent:23:in `load'
from /opt/microsoft/omsagent/bin/omsagent:23:in `<main>'
[fail]
root@us:~#
root@us:~# openssl version -a
OpenSSL 1.0.2h 3 May 2016
built on: reproducible build, date unspecified
platform: debian-amd64
options: bn(64,64) rc4(16x,int) des(idx,cisc,16,int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -m64 -DL_ENDIAN -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DMD32_REG_T=int -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
OPENSSLDIR: "/usr/lib/ssl"
root@us:~#
root@us:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
root@us:~#
Hi,
We are using omsagent-1.1.0-217 and have activated the proxy feature, as nearly all of our Linux VMs have no public IP and are not routable to the internet. The agent registers correctly, but we note that the 'ComputerIP' field is the public IP of our proxy rather than the private IP of the VM.
Of course this means EVERY VM is shown in OMS with the same IP.
Is there a configuration setting to deal with this and show the private IP of the VM?
The "Onboarding using the command line" instructions say that the omsadmin.sh command "must be run as root (w/ sudo elevation) or run as the created omsagent user".
If the command is run as the omsagent user, it fails with the following while trying to create the omsagent service:
Configuring OMS agent service ...
cp: cannot create regular file '/etc/init.d/omsagent': Permission denied
update-rc.d: error: initscript does not exist: /etc/init.d/omsagent
Must have root privileges for this operation
The instruction should be updated to say it must only be run as root, and not say that it can be run as the created omsagent user.
This has been replaced with our internal development repository. If you re-fork, you'll now have our live development repository with the latest changes.
We can't accept pull requests at this time, but we're working on it. Please see CONTRIBUTING.md.
We kept the releases intact by moving their tags to the correct location in this repository.
After updating the OMS agent for Linux to 1.2.0-75, an IPv6 proxy no longer works. An IPv4 proxy still works correctly.
Version 1.2.0.25 also worked fine.
/etc/opt/microsoft/omsagent/conf/proxy.conf
http://[fc00::1]:8123
/var/opt/microsoft/omsagent/log/omsagent.log
2016-09-28 15:52:28 +0200 [error]: Failed to parse the proxy configuration in '/etc/opt/microsoft/omsagent/conf/proxy.conf'
2016-09-28 15:52:28 +0200 [info]: listening fluent socket on 127.0.0.1:25225
2016-09-28 15:52:28 +0200 [info]: listening syslog socket on 127.0.0.1:25224 with udp
2016-09-28 15:52:59 +0200 [info]: Encountered retryable exception. Will retry sending data later.
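For comparison, a bracketed IPv6 literal like the one in proxy.conf above is valid URL syntax and can be split into host and port by a standard URL parser. The Python sketch below only illustrates that; it is not the agent's Ruby parsing code, the function name parse_proxy is made up, and it assumes the file holds a full URL including the scheme.

```python
from urllib.parse import urlsplit


def parse_proxy(url: str):
    """Return (host, port) from a proxy URL, accepting bracketed IPv6 literals."""
    parts = urlsplit(url.strip())
    if parts.hostname is None:
        raise ValueError(f"no host in proxy URL: {url!r}")
    # hostname has the surrounding brackets removed; port is an int or None.
    return parts.hostname, parts.port


print(parse_proxy("http://[fc00::1]:8123"))
```

A parser that splits on the first colon instead would break on this input, which may be worth checking in the 1.2.0-75 proxy handling.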
All Docker logs are coming in with this error under the log search. The installation worked fine, I can see the service running, and Docker is able to connect to it. But there seems to be something wrong with getting the logs over to Azure. All other logs are working fine; this seems to be the only issue.
I redirect the logs to the OMS agent from a Netgear GS724Tv3.
I am getting this error:
2016-12-02 23:02:11 +0100 [error]: "<14> Dec 02 23:02:11 192.168.2.254-1 UNKN[2182428004]: cli_txtcfg.c(608) 62 %% Text Configuration of length 2623 (126 CMDS) compressed to save 60% Flash" error="invalid time format: value = Dec 02, error_class = ArgumentError, error = invalid strptime format - `%b %d %H:%M:%S'"
2016-12-02 23:02:11 +0100 [error]: suppressed same stacktrace
using this source.
I redirect the log to my rsyslog:
Dec 2 22:25:54 192.168.2.254-1 UNKN[2182428004]: cli_txtcfg.c(608) 76 %% Text Configuration of length 2620 (126 CMDS) compressed to save 60% Flash
Dec 2 22:26:03 192.168.2.254-1 SNTP[2189131708]: sntp_client.c(1806) 77 %% SNTP: system clock synchronized on Fri Dec 02 22:26:03 2016 UTC+01:00
I also tried time_format %b %-d %H:%M:%S,
but I get the same error.
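The error reports value = Dec 02, which suggests that only part of the timestamp reached the time parser, so changing time_format alone would not help. The Python sketch below shows the same format string accepting a complete timestamp and rejecting a truncated one. It is an illustration only: the agent uses Ruby's strptime, the function name parse_syslog_time is made up, and the year is passed in because syslog timestamps do not carry one.

```python
from datetime import datetime

TIME_FORMAT = "%b %d %H:%M:%S"  # the format named in the error message


def parse_syslog_time(value: str, year: int) -> datetime:
    """Parse a syslog timestamp such as 'Dec 02 23:02:11' for the given year."""
    # Prepending the year keeps the parse unambiguous.
    return datetime.strptime(f"{year} {value}", "%Y " + TIME_FORMAT)


print(parse_syslog_time("Dec 02 23:02:11", 2016))

try:
    parse_syslog_time("Dec 02", 2016)  # only month and day were captured
except ValueError as exc:
    print("rejected:", exc)
```

If this is what happens in the agent, the fix belongs in the expression that extracts the time field rather than in time_format.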
Following documentation from here:
https://github.com/Microsoft/OMS-Agent-for-Linux/blob/master/docs/Docker-Instructions.md
I'd like to sign up for the container solution pack preview. When I try and email [email protected] it bounces.
The "Onboarding using a file" instructions in the documentation do not work. The user is instructed to run sudo service omsagent restart, but if the OMS agent is not onboarded during installation the omsagent service does not exist.
To reproduce on 64-bit Ubuntu 16.04:
1. Run sudo ./omsagent-1.2.0-148.universal.x64.sh --install on a machine without the OMS agent installed
2. Create /etc/omsagent-onboard.conf as per the documentation, by running sudo su omsagent vi /etc/omsagent-onboard.conf
3. Run sudo service omsagent restart as per the documentation
At step 2, running sudo su omsagent vi /etc/omsagent-onboard.conf results in an error:
$ sudo su omsagent vi /etc/omsagent-onboard.conf
/usr/bin/vi: /usr/bin/vi: cannot execute binary file
Trying variants of the command allows vi to run - e.g. sudo -su omsagent vi /etc/omsagent-onboard.conf (passing the -s and -u flags to sudo) or sudo su omsagent -c 'vi /etc/omsagent-onboard.conf' (passing vi /etc/omsagent-onboard.conf to su as the command to execute as the user) - however neither variant allows the file to be created successfully, as the omsagent user does not have write permission to the /etc directory.
The instruction "Create the file /etc/omsagent-onboard.conf. The file must be readable and writable for the user omsagent. sudo su omsagent vi /etc/omsagent-onboard.conf" probably needs to be split into two steps - first, create the file and make it readable (sudo touch /etc/omsagent-onboard.conf && sudo chown omsagent:omsagent /etc/omsagent-onboard.conf), second, edit the file (sudo -su omsagent vi /etc/omsagent-onboard.conf). Alternatively, perhaps the file should first be created as root (sudo vi /etc/omsagent-onboard.conf) and then made readable/writable by omsagent (sudo chown omsagent:omsagent /etc/omsagent-onboard.conf).
Once /etc/omsagent-onboard.conf is created, at step 3 running sudo service omsagent restart results in an error:
$ sudo service omsagent restart
Failed to restart omsagent.service: Unit omsagent.service not found.
During installation, the following warning is output: "Warning: Agent is not onboarded. omsagent cannot be registered as a service." This seems to indicate that the omsagent service will not exist until the agent is onboarded, so we can't use service omsagent restart to onboard the agent.
It appears from #185 that the correct instruction is cd <somewhere> && sudo ./service_control <something> - perhaps cd /opt/microsoft/omsagent/bin/ && sudo ./service_control restart, or just sudo /opt/microsoft/omsagent/bin/service_control restart
The Change Tracking module from the workspace is filling the log file with a description of each installed package.
Here is an example from the logs:
2016/12/01 15:13:33: INFO: Scripts/nxPackage.pyc(247):
PackageGroup value is False
2016/12/01 15:13:33: INFO: Scripts/nxPackage.pyc(250):
PackageGroup type is <type 'bool'>
2016/12/01 15:13:33: ERROR: Scripts/nxPackage.pyc(486):
ERROR in ParseAllInfo. Output was abiword|efficient, featureful word processor with collaboration
AbiWord is a full-featured, efficient word processing application.
It is suitable for a wide variety of word processing tasks, and
is extensible with a variety of plugins.
I have tried removing and reinstalling the agent and adding/removing the VM from the OMS workspace.
I am a little stuck on where to go from here and any guidance will be appreciated.
Thanks
Don
Version info:
ii omsagent 1.2.0.148 amd64 Microsoft Operations Management Suite for UNIX/Linux agent
ii omsconfig 1.1.1.351 amd64 Operations Management Suite Agent Configuration
ii dsc 1.1.1.294 amd64 Windows Powershell Desired State Configuration for Linux
ii omi 1.1.0.0 amd64 Open Management Infrastructure
When I use the Azure Portal to connect a VM to OMS (click OMS Workspace, click Virtual machines, click on the VM, then click Connect), it says "connecting" and installs version 1.0.0.3. After this, the Log Analytics management shows 1 connected VM.
When I manually run the shell script and install version 1.1.0-28, it works, but the VM is shown as not connected under OMS Connections.
Any ideas? Thanks
I have seen a few cases where if the hosts file has lots of static entries, the omsagent service or even installation might crash. The problem with the service crashing is that it will create lots of core files, sometimes filling up the disk.
In this case you can see the exception during the install:
[root@centos72 tmp]# ./omsagent-1.2.0-75.universal.x64.sh --install
Checking host architecture ...
Checking for ctypes python module ...
Extracting...
Installing OMS agent ...
----- Installing package: omi (omi-1.1.0-0.universal.x64) -----
----- Installing package: scx (scx-cimprov-1.6.3-13.universal.x64) -----
----- Installing package: omsagent (omsagent-1.2.0-75.universal.x64) -----
Checking for ctypes python module ...
----- Installing package: omsconfig (omsconfig-1.1.1-316.x64) -----
Preparing... ################################# [100%]
Creating omiusers group ...
Updating / installing...
1:omi-1.1.0-0 ################################# [ 25%]
Generating a 2048 bit RSA private key
.......+++
....................+++
Configuring OMI service ...
Created symlink from /etc/systemd/system/multi-user.target.wants/omid.service to /usr/lib/systemd/system/omid.service.
2:scx-1.6.3-13 ################################# [ 50%]
terminate called after throwing an instance of 'SCXCoreLib::SCXInternalErrorException'
/var/tmp/rpm-tmp.hC4nOj: line 77: 2969 Aborted /opt/microsoft/scx/bin/tools/scxsslconfig
warning: %post(scx-1.6.3-13.x86_64) scriptlet failed, exit status 1
Creating omsagent group ...
Creating omsagent service account ...
3:omsagent-1.2.0-75 ################################# [ 75%]
Warning: Agent is not onboarded. omsagent cannot be registered as a service.
Configuring rsyslog for OMS logging
Restarting service: rsyslog
Checking for ctypes python module...ok!
4:omsconfig-1.1.1-316 ################################# [100%]
Installing resource MSFT_nxGroupResource
Installing resource MSFT_nxPackageResource
Installing resource MSFT_nxServiceResource
Installing resource MSFT_nxAvailableUpdatesResource
Installing resource MSFT_nxUserResource
Installing resource MSFT_nxOMSAgentResource
Installing resource MSFT_nxOMSSyslogResource
Installing resource MSFT_nxOMSKeyMgmtResource
Installing resource MSFT_nxOMSCustomLogResource
gpg: directory `/etc/opt/omi/conf/omsconfig/.gnupg' created
gpg: new configuration file `/etc/opt/omi/conf/omsconfig/.gnupg/gpg.conf' created
gpg: WARNING: options in `/etc/opt/omi/conf/omsconfig/.gnupg/gpg.conf' are not yet active during this run
gpg: keyring `/etc/opt/omi/conf/omsconfig/.gnupg/secring.gpg' created
gpg: keyring `/etc/opt/omi/conf/omsconfig/keymgmtring.gpg' created
gpg: /etc/opt/omi/conf/omsconfig/.gnupg/trustdb.gpg: trustdb created
gpg: key 44BC4178: public key "Microsoft (Release Signing) <[email protected]>" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
gpg: keyring `/etc/opt/omi/conf/omsconfig/keyring.gpg' created
gpg: key DE321294: public key "Microsoft (Release Signing) [email protected]" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
Warning: Agent is not onboarded. omsagent cannot be registered as a service.
----- Installing bundled packages -----
Checking if Apache is installed ...
Apache not found, will not install
Checking if Docker is installed...
Docker not found. Docker agent will not be installed.
Checking if MySQL is installed ...
A quick example hosts file that should generate the issue would be:
127.0.0.1 localhost correcthost
::1 localhost correcthosthost
127.0.0.1 localhost anyhost
::1 localhost anyhost
127.0.0.1 localhost anyhost
::1 localhost anyhost
127.0.0.1 localhost anyhost
::1 localhost anyhost
127.0.0.1 localhost anyhost
::1 localhost anyhost
We all know that this should not be normal, however there are sometimes applications that end up making bad changes to /etc/hosts.
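To check whether a machine is in this state before installing, the repeated entries can be counted with a few lines of Python. This is only a diagnostic sketch; the function name duplicate_host_entries is made up, and it says nothing about how scxsslconfig itself reads the file.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def duplicate_host_entries(lines: Iterable[str]) -> Dict[Tuple[str, str], int]:
    """Count (address, name) pairs that occur more than once in hosts-file lines."""
    counts: Counter = Counter()
    for line in lines:
        # Drop comments, then split into the address and its host names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        for name in names:
            counts[(address, name)] += 1
    return {pair: n for pair, n in counts.items() if n > 1}


if __name__ == "__main__":
    with open("/etc/hosts") as hosts:
        for (address, name), n in duplicate_host_entries(hosts).items():
            print(f"{address} {name}: listed {n} times")
```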
Tried with the latest OMSAgent available on GitHub on CentOS 7.2
I created a VM using ARM with OmsAgentForLinux and the DockerExtension. The Docker extension was initialized using the Docker options as specified in the docs: "options": ["--log-driver=fluentd", "--log-opt", "fluentd-address=localhost:25225"]. After a successful provision, ps shows the correct processes:
omsagent 36326 1 0 22:29 ? 00:00:01 /opt/microsoft/omsagent/ruby/bin/ruby /opt/microsoft/omsagent/bin/omsagent -d /var/opt/microsoft/omsagent/run/omsagent.pid --no-supervisor -o /var/opt/microsoft/omsagent/log/omsagent.log
root 36399 1 0 22:30 ? 00:00:00 /usr/bin/docker daemon -H=unix:// --log-driver=fluentd --log-opt fluentd-address=localhost:25225
But when I try to run a Docker container I get the error dial tcp 127.0.0.1:25225: connection refused. The problem is that OmsAgentForLinux does not start the TCP service on 25225; it only has the UDP one on 25224. This was shown by netstat -tulpn.
After updating /etc/opt/microsoft/omsagent/conf/omsagent.conf to start the service on TCP port 25225 (as instructed here: https://github.com/Microsoft/OMS-Agent-for-Linux/blob/master/docs/OMS-Agent-for-Linux.md#enabling-high-volume-syslog-event-collection) and restarting, I was able to start my Docker container.
Two questions:
Operations Management Suite is connecting to your server
on the Containers Panel
Thanks for your help!
Hey guys,
I am trying to add a custom plugin and I am facing this error:
2016-06-30 20:58:47 +0000 [error]: fluent/supervisor.rb:313:rescue in main_process: config error file="/etc/opt/microsoft/omsagent/conf/omsagent.conf" error="Unknown input plugin 'in_cloudfetch'. Run 'gem search -rd fluent-plugin' to find plugins"
The same plugin works for td-agent.
Here are the steps I followed:
Updated /etc/opt/microsoft/omsagent/conf/omsagent.conf with:
<source>
type in_cloudfetch
tag cloud
</source>
Created a new 'plugin' folder /etc/opt/microsoft/omsagent/conf/plugin
and added my input plugin: in_cloudfetch.rb
require 'fluent/input'
module Fluent
  class InCloudfetch < Input
    Fluent::Plugin.register_input('in_cloudfetch', self)
    config_param :tag
    def configure(conf)
      super
      # My plugin code here
    end
    def start
      super
      # My plugin code here
    end
    def shutdown
      super
      # My plugin code here
    end
  end
end
Restarted OMS agent:
sudo /opt/microsoft/omsagent/bin/service_control restart
Let me know if I am missing anything here.
Installing from a .sh doesn't really scale that well..
During onboarding via omsadmin.sh I got this error:
-e info Generating certificate ...
/opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require': /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/x86_64-linux/openssl.so: undefined symbol: SSLv3_method - /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/x86_64-linux/openssl.so (LoadError)
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/openssl.rb:17:in `<top (required)>'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in `require'
from /opt/microsoft/omsagent/bin/auth_key.rb:2:in `<main>'
-e error Error onboarding. HTTP code 400
And during onboarding via omsagent-onboard.conf I got this error:
● omsagent.service - Operations Management Suite agent
Loaded: loaded (/lib/systemd/system/omsagent.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sun 2016-06-12 02:38:17 UTC; 7s ago
Process: 4686 ExecStop=/bin/rm -f /var/opt/microsoft/omsagent/run/omsagent.pid (code=exited, status=0/SUCCESS)
Process: 4678 ExecStart=/opt/microsoft/omsagent/bin/omsagent -d /var/opt/microsoft/omsagent/run/omsagent.pid -o /var/opt/microsoft/omsagent/log/omsagent.log --no-supervisor (code=exited, status=1/FAILURE)
Main PID: 4678 (code=exited, status=1/FAILURE)
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:485:in `init_engine'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:153:in `block in start'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:306:in `call'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:306:in `main_process'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:151:in `start'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/command/fluentd.rb:167:in `<top (required)>'
Jun 12 02:38:17 nales omsagent[4678]: from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:69:in `require'
Jun 12 02:38:17 nales systemd[1]: omsagent.service: main process exited, code=exited, status=1/FAILURE
Jun 12 02:38:17 nales systemd[1]: Unit omsagent.service entered failed state.
Jun 12 02:38:17 nales systemd[1]: omsagent.service failed.
You should use drop-in units to reconfigure Docker rather than editing the system package's docker.service. The Docker docs document this process pretty clearly.
I am experiencing issues with OMS agent and firewall enabled on my Ubuntu 15.10 box. Can you elaborate on which ports have to be accessible by OMS agent to work properly. Currently getting the following messages with firewall enabled:
systemd[1]: Stopping Operations Management Suite agent...
systemd[1]: omsagent.service: Main process exited, code=killed, status=9/KILL
systemd[1]: Stopped Operations Management Suite agent.
systemd[1]: omsagent.service: Unit entered failed state.
systemd[1]: omsagent.service: Failed with result 'signal'.
I don't know which host it is trying to connect to, so it is hard to troubleshoot the issue.
2016-02-18 06:43:34 -0500 [warn]: temporarily failed to flush the buffer. next_retry=2016-02-18 06:44:01 -0500 error_class="RuntimeError" error="Net::HTTP.Post raises exception: OpenSSL::SSL::SSLError, 'SSL_connect SYSCALL returned=5 errno=0 state=SSLv2/v3 read server hello A'" plugin_id="object:3fc3968593c0"
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/oms_common.rb:106:in `rescue in start_request'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/oms_common.rb:101:in `start_request'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/out_oms.rb:47:in `handle_record'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/out_oms.rb:88:in `block in write'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/out_oms.rb:87:in `each'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/plugin/out_oms.rb:87:in `write'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/buffer.rb:325:in `write_chunk'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/buffer.rb:304:in `pop'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/output.rb:321:in `try_flush'
2016-02-18 06:43:34 -0500 [warn]: /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/output.rb:140:in `run'
2016-02-18 06:44:04 -0500 [warn]: retry succeeded. plugin_id="object:3fc3968593c0"
Using the latest Ignite version of the OMS Linux Agent with Docker 1.12.1, image URLs that contain port numbers are parsed into fields with incorrect values.
The URL is: registry.somehost.com:18443/zzz/xxx-yyyyyy-services:12.6.0
But it is parsed into the following fields:
Repository:registry.somehost.com:18443
Image:zzz/xxx-yyyyyy-services:12.6.0
ImageTag:18443/zzz/xxx-yyyyyy-services:12.6.0
Expected:
Repository:registry.somehost.com:18443
Image:zzz/xxx-yyyyyy-services
ImageTag:12.6.0
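The expected split can be obtained by treating only a colon after the last slash as the tag separator, so a registry port is never mistaken for a tag. The Python sketch below illustrates that rule; it is not the agent's code, and the names ImageRef and parse_image_ref are made up. Treating the first path component as a registry when it contains a dot or colon is a heuristic similar to the one Docker uses, stated here as an assumption.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    repository: str
    image: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry:port/path/name:tag' into repository, image and tag."""
    # A colon counts as the tag separator only if it comes after the last
    # slash; an earlier colon belongs to the registry port.
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:
        name, tag = ref[:last_colon], ref[last_colon + 1:]
    else:
        name, tag = ref, ""
    # Assumption: the first component is a registry host if it looks like one.
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ImageRef(first, rest, tag)
    return ImageRef("", name, tag)


print(parse_image_ref("registry.somehost.com:18443/zzz/xxx-yyyyyy-services:12.6.0"))
```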
Setting up omsconfig (1.1.1.49) ...
require': /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/x86_64-linux/openssl.so: undefined symbol: SSLv3_method - /opt/microsoft/omsagent/ruby/lib/ruby/2. 2.0/x86_64-linux/openssl.so (LoadError) from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in
require'<top (required)>' from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in
require'require' from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/net/https.rb:22:in
<top (required)>'require' from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in
require'<class:Common>' from /opt/microsoft/omsagent/plugin/oms_common.rb:3:in
module:OMS'<top (required)>' from /opt/microsoft/omsagent/plugin/omi_lib.rb:15:in
require_relative'<class:Omi>' from /opt/microsoft/omsagent/plugin/omi_lib.rb:13:in
module:OmiModule'<top (required)>' from /opt/microsoft/omsagent/plugin/filter_omi.rb:3:in
require_relative'<module:Fluent>' from /opt/microsoft/omsagent/plugin/filter_omi.rb:1:in
<top (required)>'require' from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:54:in
require'block in load_plugin_dir' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/plugin.rb:83:in
each'load_plugin_dir' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/engine.rb:112:in
load_plugin_dir'block in init_engine' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:485:in
each'init_engine' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:153:in
block in start'call' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/supervisor.rb:306:in
main_process'start' from /opt/microsoft/omsagent/ruby/lib/ruby/gems/2.2.0/gems/fluentd-0.12.14/lib/fluent/command/fluentd.rb:167:in
<top (required)>'require' from /opt/microsoft/omsagent/ruby/lib/ruby/2.2.0/rubygems/core_ext/kernel_require.rb:69:in
require'<top (required)>' from /opt/microsoft/omsagent/bin/omsagent:23:in
load'
Any chance of an armhf build to be able to run this on embedded / small-scale Linux environments, like the Raspberry Pi?
Symptom:
According to documentation "Uninstalling the OMS Agent for Linux" the following must be run:
sudo rpm -e omi
This fails if provider rpms are installed:
error: Failed dependencies:
omi >= 1.0.8-3 is needed by (installed) apache-cimprov-1.0.1-1.x86_64
omi >= 1.0.8-3 is needed by (installed) mysql-cimprov-1.0.1-1.x86_64
Request to update documentation.
The newest version of the OMS agent, omsagent-1.2.0-148, seems to crash continuously and fills up the disk with core dumps under /var/opt/omi/run/.
The issue seems to be with the omi provided in that package; we verified that it crashes on Ubuntu 16.04.1 and CentOS 7.2. Previous versions of the agent worked without issues.
Running on the VMs:
Docker 1.12.3
Omsagent 1.2.0-148 (omi 1.1.0)
We get the following errors on ubuntu/centos machines
[root@myhost run]# journalctl -f | grep omiag
Nov 23 14:40:22 myhost kernel: omiagent[57281]: segfault at 10 ip 00007f12d7b92869 sp 00007fffdab230e0 error 4 in libcontainer.so[7f12d7b33000+77000]
Nov 23 14:40:59 myhost kernel: omiagent[57322]: segfault at 10 ip 00007f03bc34f869 sp 00007ffd14f432a0 error 4 in libcontainer.so[7f03bc2f0000+77000]
Nov 23 14:41:36 myhost kernel: omiagent[57399]: segfault at 10 ip 00007fe3fedf4869 sp 00007ffc5f19ede0 error 4 in libcontainer.so[7fe3fed95000+77000]
Nov 23 14:42:13 myhost kernel: omiagent[57427]: segfault at 10 ip 00007f3d70af4869 sp 00007ffcf7b88350 error 4 in libcontainer.so[7f3d70a95000+77000]
A colleague ran a gdb trace, and I've attached it to the issue:
gdb.txt
I posted this on the omi repo, but because this is a critical issue I have posted it here as well.
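The segfault lines above can be compared by subtracting the library's load address (the number in brackets before the `+`) from the instruction pointer (`ip`). The following is a minimal sketch of that arithmetic, not part of the original report; the regular expression and the function name `fault_offset` are assumptions made for illustration. It suggests that all four crashes occur at the same offset inside libcontainer.so, which is consistent with a single faulting instruction rather than random memory corruption.

```python
import re

# Pattern for the kernel segfault lines quoted above (illustrative, not exhaustive).
SEGFAULT = re.compile(
    r"(?P<proc>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

# The four log lines from the report, copied verbatim.
LINES = [
    "Nov 23 14:40:22 myhost kernel: omiagent[57281]: segfault at 10 ip 00007f12d7b92869 sp 00007fffdab230e0 error 4 in libcontainer.so[7f12d7b33000+77000]",
    "Nov 23 14:40:59 myhost kernel: omiagent[57322]: segfault at 10 ip 00007f03bc34f869 sp 00007ffd14f432a0 error 4 in libcontainer.so[7f03bc2f0000+77000]",
    "Nov 23 14:41:36 myhost kernel: omiagent[57399]: segfault at 10 ip 00007fe3fedf4869 sp 00007ffc5f19ede0 error 4 in libcontainer.so[7fe3fed95000+77000]",
    "Nov 23 14:42:13 myhost kernel: omiagent[57427]: segfault at 10 ip 00007f3d70af4869 sp 00007ffcf7b88350 error 4 in libcontainer.so[7f3d70a95000+77000]",
]


def fault_offset(line):
    """Return (library, offset of the faulting instruction within it), or None."""
    m = SEGFAULT.search(line)
    if m is None:
        return None
    return m.group("lib"), int(m.group("ip"), 16) - int(m.group("base"), 16)


# Collect the distinct (library, offset) pairs across all lines.
offsets = {fault_offset(line) for line in LINES}
for lib, off in sorted(offsets):
    print(f"{lib} +{off:#x}")
```

If a build of libcontainer.so with debug symbols is available, an offset like this can usually be resolved to a source location with a tool such as addr2line or gdb.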
The host name is captured from the time part of the message; in other words, the format regexp is incorrect.
I cannot create a pull request, so if you're interested, this is the format that works in my environment:
/^(?[^ ]\s[^ ]* [^ ]* [^ ]) (?[^ ]) : (?[a-zA-Z0-9_%/.-])(?:[(?[0-9]+)])?(?:[^:]:)? (?.)$/
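The pattern above appears to have lost its named groups and some quantifiers in rendering, so it is left as posted. As a hypothetical illustration of how a host name can end up captured from the time part, the sketch below compares two patterns on an invented sample line with a space-padded day ("Jul  6"). The group names (`time`, `host`, `ident`, `pid`, `message`), both patterns, and the sample line are assumptions, not the reporter's actual format; Python's `(?P<name>...)` syntax stands in for Ruby's `(?<name>...)`.

```python
import re

# Shared tail: program name, optional [pid], optional colon, then the message.
TAIL = (
    r"(?P<ident>[a-zA-Z0-9_%/.-]*)(?:\[(?P<pid>[0-9]+)\])?"
    r"(?:[^:]*:)? *(?P<message>.*)$"
)

# Assumes exactly one space between month and day.
STRICT = re.compile(r"^(?P<time>[^ ]* [^ ]* [^ ]*) (?P<host>[^ ]*) " + TAIL)

# Tolerates one or more whitespace characters between month and day.
TOLERANT = re.compile(r"^(?P<time>[^ ]*\s+[^ ]* [^ ]*) (?P<host>[^ ]*) " + TAIL)

# Invented sample line; note the two spaces before the single-digit day.
SAMPLE = "Jul  6 08:21:23 myhost sshd[1234]: Accepted password"

strict = STRICT.match(SAMPLE).groupdict()
tolerant = TOLERANT.match(SAMPLE).groupdict()

print("strict:  ", strict["time"], "|", strict["host"])
print("tolerant:", tolerant["time"], "|", tolerant["host"])
```

With the strict pattern the empty token between the two spaces counts as the day, so the clock time slides into the `host` group; the tolerant pattern keeps the fields aligned.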
Hi all.
Failing to collect Nagios alerts with OMS agent.
I've configured omsagent.conf
<source>
type tail
path /opt/omd/sites/DomainLocal/var/nagios/nagios.log
format none
tag oms.nagios
</source>
<filter oms.nagios>
type filter_nagios_log
</filter>
After restarting oms agent service I can see the following warning in /var/opt/microsoft/omsagent/log/omsagent.log
2016-07-06 08:21:23 -0400 [warn]: temporarily failed to flush the buffer. next_retry=2016-07-06 08:28:36 -0400 error_class="TypeError" error="no implicit conversion of nil into Array" plugin_id="object:3fb0864997e4"
2016-07-06 08:21:23 -0400 [warn]: suppressed same stacktrace
I can see my server connected to OMS and performance data collected.
In the Check_MK web portal I can see active alerts, but they do not show up in the OMS portal. I tried creating some new alerts after restarting the OMS agent, but this didn't help: they still appeared only in the Check_MK portal, not in the OMS portal.
I ran su - omsagent and then cat /opt/omd/sites/DomainLocal/var/nagios/nagios.log, and successfully received output. As the ls -la output below shows, everyone has read permission:
[root@Site1CheckMK nagios]# ls -la
total 152
drwxr-xr-x 3 DomainLocal DomainLocal 121 Jul 6 08:30 .
drwxr-xr-x 11 DomainLocal DomainLocal 143 Jul 6 07:13 ..
drwxr-xr-x 2 DomainLocal DomainLocal 6 Jul 6 07:13 archive
-rw-r--r-- 1 DomainLocal DomainLocal 1477 Jul 6 08:30 livestatus.log
-rw-r--r-- 1 DomainLocal DomainLocal 36198 Jul 6 08:30 nagios.log
-rw-r--r-- 1 DomainLocal DomainLocal 34747 Jul 6 08:30 objects.cache
-rw-r--r-- 1 DomainLocal DomainLocal 34747 Jul 6 08:30 objects.precache
-rw-r--r-- 1 DomainLocal DomainLocal 40472 Jul 6 08:30 retention.dat
The only help I get from the troubleshooting guide is this
Probable Causes
omsagent user does not have permissions to read from Nagios log file
Nagios source and filter have not been uncommented from omsagent.conf file
Any help will be highly appreciated.
Moved the logic that transforms XML to JSON from the fluentd filter to the source to resolve memory issues. Two filters related to patch management (patch_management.conf) in the existing code have memory issues:
oms.patch_management
oms.patch_management_immediate_run
More details can be found in #257.
In OMS-Agent-for-Linux/source/code/plugins/in_omi.rb and OMS-Agent-for-Linux/source/code/plugins/in_oms_omi.rb the ruby path points to /usr/local/bin/ruby. The default on most systems would be /usr/bin/ruby.
After installing omsagent on RHEL 7.2, log rotation is prevented by Red Hat's SELinux targeted policy. The error encountered is:
error: stat of /var/opt/microsoft/omsagent/log/omsagent.log failed: Permission denied
SELinux logs the following in /var/log/audit/audit.log:
type=AVC msg=audit(1471331366.853:3801338): avc: denied { getattr } for pid=2181 comm="logrotate" path="/var/opt/microsoft/omsagent/log/omsagent.log" dev="sda2" ino=35118430 scontext=system_u:system_r:logrotate_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_t:s0 tclass=file
Was caused by:
Missing type enforcement (TE) allow rule.
You can use audit2allow to generate a loadable module to allow this access.
This issue is fixed manually with the following commands (run as root):
semanage fcontext -a -t logrotate_exec_t '/var/opt/microsoft/omsagent/log/omsagent.log'
restorecon -v '/var/opt/microsoft/omsagent/log/omsagent.log'
However, this is a manual modification that would need to be performed on each virtual machine. I think this should be done automatically via the install script.
Allow me to define custom metadata that will help with easy log filtering by passing Container Labels through as ContainerInventory fields. Then I can add categorization labels to help group my Container Log entries.
The currently recommended configuration for directly forwarding syslog messages appears to support only the outdated RFC3164 format that is built into fluentd. As a result, any custom log fields are lost. There are existing fluentd input plugins that would support RFC5424 log messages, but the OMS agent does not seem to allow them to be installed and used.
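To illustrate what is lost, here is a minimal sketch of parsing an RFC5424 message, including its structured-data section, which has no counterpart in RFC3164. It is not how fluentd or the OMS agent parse syslog; the function name `parse_rfc5424` and the simplified regular expressions are assumptions, and escaped quotes or brackets inside parameter values are not handled. The sample line is modelled on the example in RFC 5424.

```python
import re

# Header fields in RFC 5424 order: PRI, VERSION, TIMESTAMP, HOSTNAME,
# APP-NAME, PROCID, MSGID, STRUCTURED-DATA, then an optional message.
HEADER = re.compile(
    r"^<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[.*?\])+)"
    r"(?: (?P<msg>.*))?$"
)
# Simplified: does not handle escaped ']' or '"' inside values.
SD_ELEMENT = re.compile(r'\[(?P<sd_id>[^ \]="]+)(?P<params>[^\]]*)\]')
SD_PARAM = re.compile(r'(?P<name>[^ =\]"]+)="(?P<value>[^"]*)"')


def parse_rfc5424(line):
    """Parse one RFC 5424 line into a dict, or return None if it does not match."""
    m = HEADER.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    pri = int(rec.pop("pri"))
    # PRI encodes facility * 8 + severity.
    rec["facility"], rec["severity"] = pri >> 3, pri & 7
    sd_text = rec.pop("sd")
    rec["structured_data"] = {
        el.group("sd_id"): {
            p.group("name"): p.group("value")
            for p in SD_PARAM.finditer(el.group("params"))
        }
        for el in SD_ELEMENT.finditer(sd_text)
    }
    return rec


SAMPLE = (
    '<165>1 2003-10-11T22:14:15.003Z mymachine.example.com evntslog - ID47 '
    '[exampleSDID@32473 iut="3" eventSource="Application" eventID="1011"] '
    'An application event log entry'
)

record = parse_rfc5424(SAMPLE)
print(record["hostname"], record["structured_data"])
```

The key-value pairs under `structured_data` are the kind of custom fields that an RFC3164-only parser has no place to put.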
2016-06-28 13:16:01 -0700 [warn]: /tmp/a.txt not found. Continuing without tailing it.
2016-06-28 13:16:01 -0700 [warn]: /tmp/b.txt not found. Continuing without tailing it.
@Microsoft/omsagent-devs
Hi,
I've created a new instance from the bitnami MySQL image, based on Ubuntu 14.04 LTS release.
I manually installed the OMS agent for Linux, as detailed in the instructions. The Bitnami MySQL installation is not located in the default directories (e.g. /var/lib/mysql), so I had to create two symlinks to allow the installation of the mysql-cimprov extension: one for /usr/lib/libmysqlclient.so and the other for /var/lib/mysql.
Once both were created, the package installed and the agent ran without errors.
I've also upgraded the Azure VM extension LinuxDiagnostic to 2.2 version. I've also set up valid credentials with mycimprovauth to allow the agent to connect into the mysql and fetch the performance data. Also, I granted enough permissions on mysql to provide them.
Once done, I'm able to see the performance indicators on the OMS settings, but I'm not receiving any data from the MySQL server.
/var/opt/omi/log/omiserver.log is empty
/var/opt/omi/log/omiagent.root.root.log is empty
/var/opt/microsoft/mysql-cimprov/log/omsagent/ has no logs
/var/opt/microsoft/omsagent/log/omsagent.log has contents, but the configuration file shown in the log doesn't seem to have any source related to mysql
/opt/omi/bin/omicli ei root/mysql MySQL_Server returns nothing
How can I debug the problem? What can be the issue here?
Thanks
Ever since OMS-Agent-for-Linux was released, the community has been unable to build it from source because one of its components, OMI, was not public. This has come up multiple times (in relation to Raspberry Pi and other issues as well).
The OMI repository is now public (this coincides with PowerShell for Windows and Linux being open source and public as well). Thus, if you wish to build OMS-Agent-for-Linux from source, you can now do that by following the instructions on the super-project (we have a super-project to manage dependencies).
If you have any questions or concerns, please post issues on the super-project.
Sorry for the delay in making this happen. But finally, we're there!
ContainerLogs seems to work fine, but the Container tile isn't leaving the 'Performing Assessment' stage when using Docker 1.12. Confession: I have not tested earlier Docker versions, so I'm assuming this is the issue.