criteo / cassandra_exporter
Apache Cassandra® metrics exporter for Prometheus
License: Apache License 2.0
For example, scrape logs are sent to stderr. It would be nice to have only errors appear on stderr, because it makes monitoring a lot simpler ;D
I am trying to set some config values, such as user and password, on my cassandra_exporter. However, every time I mount the config into the container, I get the error shown below.
I tried to set those values using -e "VARKey value", but that didn't work either.
As I see there are many people using it, I assume I'm doing it wrong, but I haven't figured out how to use it properly.
The config is currently at /tmp/config.yml; it was in other directories before, as I moved it around to rule out permission problems.
docker run --privileged --rm -ti -v /tmp/config.yml:/etc/cassandra_exporter/config.yml --name cassandra-exporter criteord/cassandra_exporter
Starting Cassandra exporter
JVM_OPTS:
CASSANDRA_EXPORTER_CONFIG_user
sed: cannot rename /etc/cassandra_exporter/sedjzhwca: Device or resource busy
Hi,
When I am trying to run the exporter (v2.0.3) I receive:
ERROR com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot retrieve the datacenter name information for the node
javax.management.AttributeNotFoundException: No such attribute: Datacenter
In the config.yml file we tried to blacklist the node (- org:apache:cassandra:db:.*), but it didn't help.
It seems that the error is related to the code at JmxScraper.java:373. We tried an older version (v1.0.1), which does not parse the Datacenter name, and it worked fine.
We are using Cassandra 3.0.6,
Cheers
to start the application
java -jar cassandra_exporter.jar config.yml
git clone https://github.com/criteo/cassandra_exporter.git
cd cassandra && java -jar cassandra_exporter.jar config.ym
java -jar cassandra_exporter.jar config.ym
Error: Unable to access jarfile cassandra_exporter.jar
find / -name "cassandra_exporter.jar" 2> /devnull
The result is empty.
How can I start the exporter?
Hi,
I am using cassandra_exporter to track the state and performance of Cassandra. My Cassandra cluster runs on Kubernetes. Now I want to track writes to a single table, such as how many rows it currently has or how many have been written (using Prometheus and Grafana). But after checking the documentation at http://cassandra.apache.org/doc/4.0/operating/metrics.html, I am not able to work out the actual query. Can anyone help me with this?
After upgrading to 2.0, the DC is no longer distinguished with Cassandra 2.1.13.
The datacenter name and cluster name are shown as blank.
Hello,
When starting the docker container I get the following error: Error: Could not find or load main class Xmx
I haven't had time to dig into the root of this exception yet, but wanted to bring it up. Running the exporter with 4.0 throws an exception:
java -jar ./build/libs/cassandra_exporter-2.2.1-all.jar config.yml
[main] INFO com.criteo.nosql.cassandra.exporter.Config - Loading yaml config from config.yml
[main] ERROR com.criteo.nosql.cassandra.exporter.Main - Scrapper stopped due to uncaught exception
java.lang.ClassCastException: java.util.ArrayList cannot be cast to [J
at com.criteo.nosql.cassandra.exporter.JmxScraper.updateMetric(JmxScraper.java:300)
at com.criteo.nosql.cassandra.exporter.JmxScraper.lambda$run$7(JmxScraper.java:164)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:270)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1556)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at com.criteo.nosql.cassandra.exporter.JmxScraper.run(JmxScraper.java:164)
at com.criteo.nosql.cassandra.exporter.Main.main(Main.java:36)
Hi there!
Thanks for the exporter; the original Prometheus JMX exporter is somewhat unstable in our environment.
Before I start digging deep, I'd like to ask why I can only see cassandra_stats.
There is a lot of stuff to collect, and it seems things like clientrequest and columnfamily are not shown.
Am I missing something obvious here?
Config:
---
host: localhost:7199
ssl: False
listenPort: 4067
blacklist:
# Unaccessible metrics (not enough privilege)
- java:lang:memorypool:.*usagethreshold.*
# Leaf attributes not interesting for us but that are presents in many path (reduce cardinality of metrics)
- .*:999thpercentile
- .*:95thpercentile
- .*:fifteenminuterate
- .*:fiveminuterate
- .*:durationunit
- .*:rateunit
- .*:stddev
- .*:meanrate
- .*:mean
- .*:min
# Path present in many metrics but uninterresting
- .*:viewlockacquiretime:.*
- .*:viewreadtime:.*
- .*:cas[a-z]+latency:.*
- .*:colupdatetimedeltahistogram:.*
# Mostly for RPC, do not scrap them
- org:apache:cassandra:db:.*
# columnfamily is an alias for Table metrics in cassandra 3.x
# https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/metrics/TableMetrics.java#L162
- org:apache:cassandra:metrics:columnfamily:.*
# Should we export metrics for system keyspaces/tables ?
- org:apache:cassandra:metrics:[^:]+:system[^:]*:.*
# Don't scrape us
- com:criteo:nosql:cassandra:exporter:.*
maxScrapFrequencyInSec:
  50:
    - .*
  # Refresh those metrics only every hour as it is costly for cassandra to retrieve them
  3600:
    - .*:snapshotssize:.*
    - .*:estimated.*
    - .*:totaldiskspaceused:.*
JMX is fine
Thanks a lot in advance!
P.S. Sample output | head -n 10:
# TYPE cassandra_stats gauge
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="",table="",name="org:apache:cassandra:metrics:clientrequest:rangeslice:unavailables:count",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="",table="",name="org:apache:cassandra:metrics:indextable:someenv:newmessages:newmessages_deleted_idx:rangelatency:99thpercentile",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="",table="",name="org:apache:cassandra:metrics:indextable:someenv:usersessions:usersessions_deleted_idx:writelatency:count",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="",table="",name="org:apache:cassandra:metrics:indextable:someenv:newchatusers:newchatusers_chat_type_unencr_idx:coordinatorreadlatency:99thpercentile",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="someenv",table="stickers",name="org:apache:cassandra:metrics:table:someenv:stickers:readtotallatency:count",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="someenv",table="drafts",name="org:apache:cassandra:metrics:table:someenv:drafts:readlatency:98thpercentile",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="",table="",name="org:apache:cassandra:metrics:indexcolumnfamily:someenv:newuserchats:newuserchats_chat_type_unencr_idx:tombstonescannedhistogram:50thpercentile",} 0.0
cassandra_stats{cluster="clustername",datacenter="datacenter1",keyspace="channels",table="channels",name="org:apache:cassandra:metrics:table:channels:channels:readlatency:50thpercentile",} 0.0
I'm afraid the latest release of the docker image does not work:
Starting Cassandra exporter
JVM_OPTS:
[dumb-init] /usr/bin/java: No such file or directory
Looks like Java is now installed in a different way and location, but run.sh
hardcodes the old path:
$ docker run -t -i criteord/cassandra_exporter:2.3.2 grep java /run.sh
/sbin/dumb-init /usr/bin/java ${JVM_OPTS} -jar /opt/cassandra_exporter/cassandra_exporter.jar /etc/cassandra_exporter/config.yml
$ docker run -t -i criteord/cassandra_exporter:2.3.2 which java
/usr/local/openjdk-11/bin/java
$ docker run -t -i criteord/cassandra_exporter:2.3.2 ls -l /usr/bin/java
ls: cannot access '/usr/bin/java': No such file or directory
$
Hi. Thanks a lot for the project!
While trying to optimize the number of metrics I store and process in Prometheus, I was wondering: is there a way in cassandra-exporter to scrape only a list of metrics or metric families that we know we want, instead of blacklisting the rest?
The whitelist is much easier to make and maintain in my opinion.
Thanks a lot.
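One possible workaround, sketched here under assumptions: if blacklist entries are ordinary regular expressions matched against the whole colon-separated metric path, a single entry with a negative lookahead can block everything except a chosen set of prefixes. The prefixes and the helper name `build_inverted_blacklist` are hypothetical, and this thread does not confirm that the exporter's matcher accepts lookaheads. The Python below only checks the regex logic.

```python
import re

# Hypothetical allow-list: metric path prefixes we want to keep.
ALLOWED_PREFIXES = [
    "org:apache:cassandra:metrics:clientrequest:",
    "org:apache:cassandra:metrics:table:",
]


def build_inverted_blacklist(prefixes):
    """Return one regex that matches every path NOT starting with an allowed prefix."""
    alternatives = "|".join(re.escape(prefix) for prefix in prefixes)
    # (?!...) is a negative lookahead: the match fails if any prefix follows.
    return "^(?!" + alternatives + ").*$"


pattern = re.compile(build_inverted_blacklist(ALLOWED_PREFIXES))

# A path outside the allow-list is matched, so it would be blacklisted.
blocked = pattern.fullmatch("java:lang:memory:heapmemoryusage:used") is not None
# A path under an allowed prefix is not matched, so it would be kept.
kept = pattern.fullmatch("org:apache:cassandra:metrics:table:ks:t:readlatency:count") is None
```

The resulting string would go in as a single blacklist entry. It should be tried against a test node first, since the matching semantics are assumed here.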
Hi,
I'm using the pre-built cassandra_exporter-2.2.1 library for exporting Cassandra metrics and found that streaming-related metrics are missing.
Are you aware of this problem, or am I missing something here?
Thanks.
Does the Cassandra exporter need to run on every machine where Cassandra is running, or can it run on one machine and connect to all nodes?
How does it work? Please guide me.
Is there any way to add a prefix or hostname to the metrics?
There is a lot of good stuff here, but I hate that it lumps everything under cassandra_stats, with the name label looking like a full Graphite path that is ':'-separated. This should be broken into multiple metrics with multiple labels to really leverage Prometheus. Does the exporter support the JMX exporter's ability to change the exported metric format?
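To make the request concrete, here is a minimal sketch of how a ':'-separated name label could be split into a dedicated metric name plus labels. It is not how the exporter works; the function `split_name`, the output naming scheme, and the assumed five-part layout of table paths (type, keyspace, table, metric, attribute) are all illustrative assumptions based on the sample output quoted in another report on this page.

```python
# Assumed common prefix of Cassandra metric paths in the name label.
METRICS_PREFIX = "org:apache:cassandra:metrics:"


def split_name(name):
    """Split a cassandra_stats name label into (metric_name, labels)."""
    if not name.startswith(METRICS_PREFIX):
        # Non-Cassandra beans (e.g. java:lang:...) are only flattened.
        return "jmx_" + name.replace(":", "_"), {}
    parts = name[len(METRICS_PREFIX):].split(":")
    if parts[0] == "table" and len(parts) == 5:
        # Assumed layout: table:<keyspace>:<table>:<metric>:<attribute>
        _, keyspace, table, metric, attribute = parts
        labels = {"keyspace": keyspace, "table": table}
        return f"cassandra_table_{metric}_{attribute}", labels
    # Anything else keeps its full path in the metric name.
    return "cassandra_" + "_".join(parts), {}


example = split_name(
    "org:apache:cassandra:metrics:table:someenv:stickers:readtotallatency:count"
)
```

With this layout, per-table queries select on the keyspace and table labels of a small metric family instead of matching a regex against one large name label.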
Hi, thanks for the project!
Do you have a Grafana dashboard for Cassandra metrics?
Greetings,
I'm having a small implementation issue with this exporter.
I have created my own Dockerfile based on the image "criteord/cassandra_exporter". The Dockerfile overrides the entrypoint of the exporter image with a script that holds the exporter back until Cassandra is up and its port is listening. Once the DB is up, the script launches the exporter's CMD. This procedure worked for other DBs whose images had a simple /bin/exporter entrypoint, optionally with a connection string (that connected to a DB).
I couldn't make it work with this exporter and would like to know if you could suggest how to implement it with the method above, or with other methods that work.
Thank you in advance.
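The waiting step described above can be sketched as a small polling loop. This is only an illustration in Python, which the exporter image may not ship; the function name `wait_for_port` and the timeout values are assumptions, and port 7199 is simply the JMX port used in the configs quoted on this page.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=300.0, interval_s=2.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)
```

A wrapper entrypoint would call `wait_for_port("localhost", 7199)` and, on success, replace itself with the image's original start command (for example with `os.execvp`), so the exporter still runs as the container's main process.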
Hi all,
I have a short question.
Is it possible by passing config to expose the metrics at root path / instead of /metrics?
Thanks a lot!
Cheers
It would be nice to have a configurable parameter to bind to a specific IP and not only to "0.0.0.0"
Hi, I'm fairly new to the exporter/Prometheus world and am looking for an alternative to OpsCenter.
In OpsCenter we can get metrics per IP, or per datacenter, consolidating all the IPs of that DC.
Is that possible with this exporter?
Hi,
I'm particularly interested in the metric:
org.apache.cassandra.metrics.compaction.pendingtasksbytablename
The value is a map like {columnfamily1:number, columnfamily2:number,...}.
Is there a way to tell cassandra-exporter to map this metric into something compatible with Prometheus, for example by using Prometheus labels or by changing the metric name?
E.g.:
cassandra_stats{name="org.apache.cassandra.metrics.compaction.pendingtasksbytablename:<name_of_cf>"}
That way we can store the pending compactions in Prometheus with column family granularity and detect whether a particular CF is suffering especially from compaction.
Thanks, regards,
Miguel
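The mapping asked for here could look roughly like the sketch below, which turns a map-valued attribute into one numeric sample per key by appending the key to the name. It is a hypothetical illustration, not exporter code: `flatten_map_metric` is an invented helper, and the lower-cased, ':'-separated naming only mirrors what the exporter's sample output appears to use.

```python
def flatten_map_metric(base_name, value):
    """Yield (name, number) pairs, one per numeric leaf of a possibly nested map."""
    if isinstance(value, dict):
        for key, child in value.items():
            # Recurse so that nested maps (e.g. keyspace -> table) also work.
            yield from flatten_map_metric(f"{base_name}:{key}", child)
    elif isinstance(value, (int, float)) and not isinstance(value, bool):
        yield base_name.lower(), float(value)
    # Non-numeric leaves are skipped silently.


samples = list(
    flatten_map_metric(
        "org:apache:cassandra:metrics:compaction:pendingtasksbytablename",
        {"columnfamily1": 3, "columnfamily2": 0},
    )
)
```

Each resulting pair could then be exposed as its own cassandra_stats sample, giving per-column-family granularity without changing the single-metric layout.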
Hi, Thanks for this awesome project.
It looks like scraping non-numeric metrics makes the whole run take longer.
Debug log:
[main] DEBUG com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot parse java.lang:type=OperatingSystem as it as an unknown type java.lang.String with value Linux
In our case, a whole scraping run took ~10000ms for ~850 metrics, even though none of them are expensive scrapes.
If possible, please skip scraping/parsing non-numeric metrics, which could improve performance.
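One way to picture the suggestion is filtering on the declared attribute type before fetching any value. The sketch below is an assumption-laden illustration, not the exporter's logic: JMX reports attribute types as Java class-name strings, and `attributes_worth_fetching` is a hypothetical helper. Attributes declared as `java.lang.Object` are kept, because a loosely typed attribute may still hold a number and cannot be judged from its declaration alone.

```python
# Java type names that always hold a number.
NUMERIC_TYPES = {
    "byte", "short", "int", "long", "float", "double",
    "java.lang.Byte", "java.lang.Short", "java.lang.Integer",
    "java.lang.Long", "java.lang.Float", "java.lang.Double",
}
# Declared too loosely to decide without reading the value.
UNKNOWN_TYPES = {"java.lang.Object"}


def attributes_worth_fetching(declared_types):
    """Return attribute names whose declared type may hold a number."""
    keep = NUMERIC_TYPES | UNKNOWN_TYPES
    return [name for name, type_name in declared_types.items() if type_name in keep]


# Hypothetical attribute metadata for illustration.
declared = {
    "Name": "java.lang.String",
    "Value": "java.lang.Object",
    "CollectionCount": "long",
    "MemoryPoolNames": "[Ljava.lang.String;",
}
to_fetch = attributes_worth_fetching(declared)
```

Whether this saves noticeable time depends on where the ~10000ms is actually spent, which the debug log alone does not show.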
Hi,
I couldn't find any metric which gives the node status.
The closest metric is using "up" but this only indicates if the exporter is up/down.
java -jar jmx_prometheus_javaagent-0.11.0.jar config.yml
no main manifest attribute, in /home/userhome/jmx_prometheus_javaagent-0.11.0.jar
I'm very new to Cassandra and need to health-check a cluster of 5 Cassandra nodes with ~1000 keyspaces created.
With the default setup, I'm getting a huge number of metrics in my Prometheus, which makes it impossible to query a time range of more than 1 hour before it gets killed by OOM. And all those metrics aren't informative to me.
So for the sake of future users who face the same problem, I wonder: is there a minimal config for the metrics to scrape, so that an average administrator could take a look and see whether something is (or is going to be) wrong with Cassandra?
Thanks in advance for any kind of help and excuse me for my barbarian English.
The current implementation exposes all metrics as labels on a single gauge named "cassandra_stats".
I think a better design would be similar to what the JMX Prometheus exporter does: gauges are separated out, and labels such as keyspace name or table name are configurable.
Any thoughts?
While running generate.py, there is an error showing Error during export dashboard: cassandra_default,
Error during export dashboard: cassandra_kubernetes.
rules:
Hi!
Has my Grafana gone crazy, or is this dashboard not working in version 6.0.1?
https://github.com/criteo/cassandra_exporter/blob/grafana/grafana/cassandra_default.json
Hi Team,
When I run the command to start the exporter (2.3.4), I get the error below.
[main] INFO com.criteo.nosql.cassandra.exporter.Config - Loading yaml config from config.yml
[main] TRACE com.criteo.nosql.cassandra.exporter.Config - com.criteo.nosql.cassandra.exporter.Config@887af79
[main] ERROR com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot retrieve the datacenter name information for the node
javax.management.AttributeNotFoundException: No such attribute: HostIdToEndpoint
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:81)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:346)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:276)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:253)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:162)
at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:903)
at com.criteo.nosql.cassandra.exporter.JmxScraper$NodeInfo.getNodeInfo(JmxScraper.java:413)
at com.criteo.nosql.cassandra.exporter.JmxScraper.run(JmxScraper.java:175)
at com.criteo.nosql.cassandra.exporter.Main.start(Main.java:38)
at com.criteo.nosql.cassandra.exporter.Main.main(Main.java:30)
I just set up cassandra exporter using docker and I was wondering if there's some way to secure the endpoint with basic auth or similar.
There is no way to configure cassandra_exporter to listen on a specific address in the current version.
I use Ansible to control my cluster running on AWS, and I want the exporter to listen on the EC2 instance's secondary private IP address to simplify my Ansible settings.
Thank you for this great project.
We would like to embed the Cassandra exporter into our existing Cassandra agent process, which does a lot of things besides capturing metrics. Please let me know if this feature is available; otherwise I will start working on it.
Hello there,
I am in the process of adding the Cassandra exporter to my Cassandra cluster to measure the number of tombstones, but unfortunately I am not getting any table and keyspace attributes along with my statistics inside the cassandra_stats object. I do see these attributes show up in multiple examples in the documentation. Am I missing something, or is this a bug with the later versions of Apache Cassandra?
Environment details
Cassandra version: 3.11.1
Cassandra exporter version: 2.2.0
Configuration
host: localhost:7199
ssl: False
user:
password:
listenAddress: 0.0.0.0
listenPort: 8080
blacklist:
# To profile the duration of jmx call you can start the program with the following options
# > java -Dorg.slf4j.simpleLogger.defaultLogLevel=trace -jar cassandra_exporter.jar config.yml --oneshot
#
# To get intuition of what is done by cassandra when something is called you can look in cassandra
# https://github.com/apache/cassandra/tree/trunk/src/java/org/apache/cassandra/metrics
# Please avoid to scrape frequently those calls that are iterating over all sstables
# Unaccessible metrics (not enough privilege)
- java:lang:memorypool:.*usagethreshold.*
# Leaf attributes not interesting for us but that are presents in many path
- .*:999thpercentile
- .*:95thpercentile
- .*:fifteenminuterate
- .*:fiveminuterate
- .*:durationunit
- .*:rateunit
- .*:stddev
- .*:meanrate
- .*:mean
- .*:min
# Path present in many metrics but uninterresting
- .*:viewlockacquiretime:.*
- .*:viewreadtime:.*
- .*:cas[a-z]+latency:.*
- .*:colupdatetimedeltahistogram:.*
# Mostly for RPC, do not scrap them
- org:apache:cassandra:db:.*
# columnfamily is an alias for Table metrics
# https://github.com/apache/cassandra/blob/8b3a60b9a7dbefeecc06bace617279612ec7092d/src/java/org/apache/cassandra/metrics/TableMetrics.java#L162
- org:apache:cassandra:metrics:columnfamily:.*
# Should we export metrics for system keyspaces/tables ?
- org:apache:cassandra:metrics:[^:]+:system[^:]*:.*
# Don't scrap us
- com:criteo:nosql:cassandra:exporter:.*
maxScrapFrequencyInSec:
  50:
    - .*
  # Refresh those metrics only every hour as it is costly for cassandra to retrieve them
  3600:
    - .*:snapshotssize:.*
    - .*:estimated.*
    - .*:totaldiskspaceused:.*
Hi. I'm seeing the following error on 2.2.1 (Cassandra 2.0.11.93):
ERROR com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot retrieve the datacenter name information for the node
Full output:
$ java -jar cassandra_exporter-2.2.1-all.jar config.yml
[main] INFO com.criteo.nosql.cassandra.exporter.Config - Loading yaml config from config.yml
[main] ERROR com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot retrieve the datacenter name information for the node
javax.management.AttributeNotFoundException: No such attribute: HostIdToEndpoint
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:81)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1445)
at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:639)
at sun.reflect.GeneratedMethodAccessor19.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:276)
at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:253)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:162)
at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:903)
at com.criteo.nosql.cassandra.exporter.JmxScraper$NodeInfo.getNodeInfo(JmxScraper.java:375)
at com.criteo.nosql.cassandra.exporter.JmxScraper.run(JmxScraper.java:155)
at com.criteo.nosql.cassandra.exporter.Main.main(Main.java:36)
config.yml
host: localhost:10144
ssl: False
listenAddress: 0.0.0.0
listenPort: 9198
blacklist:
- .*:999thpercentile
- .*:95thpercentile
- .*:fifteenminuterate
- .*:fiveminuterate
- .*:durationunit
- .*:rateunit
- .*:stddev
- .*:meanrate
- .*:mean
- .*:min
maxScrapFrequencyInSec:
  # Refresh those metrics only every hour as it is costly for cassandra to retrieve them
  3600:
    - .*:snapshotssize:.*
I'm able to query JMX successfully with jmxterm on localhost:10144.
Thanks!
Health checking / or /metrics doesn't seem like a very good idea, especially when dealing with large clusters that contain many keyspaces and a lot of data.
Hello,
We have a ~540-node Cassandra cluster whose nodes export ~1500 metrics each. We're sending over 800k time series in the cassandra_stats metric namespace. This causes a lot of issues when querying Prometheus, since the index gets hit so hard. Recording rules are definitely an option, but we don't always know in advance when something should have a recording rule to perform an aggregation.
Is there a workaround for this in the current code base? If not, would you be open to exploring a change with us?
This seems to be due to the single-metric design (cassandra_stats). Is there any way to make this work?
Example with calculating percent:
this one works (used/used):
(cassandra_stats{cluster="$cluster",datacenter="$datacenter",instance="$instance",job="cassandra",name="java:lang:memory:heapmemoryusage:used"} / cassandra_stats{cluster="$cluster",datacenter="$datacenter",instance="$instance",job="cassandra",name="java:lang:memory:heapmemoryusage:used"}) * 100
this one does not (used/max):
(cassandra_stats{cluster="$cluster",datacenter="$datacenter",instance="$instance",job="cassandra",name="java:lang:memory:heapmemoryusage:used"} / cassandra_stats{cluster="$cluster",datacenter="$datacenter",instance="$instance",job="cassandra",name="java:lang:memory:heapmemoryusage:max"}) * 100
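A likely explanation: PromQL binary operators pair samples whose label sets are identical, and in the used/max query the two sides differ in their name label, so nothing pairs up and the result is empty. The usual PromQL remedy is the `ignoring(name)` modifier placed after the `/` operator. The Python below is only a toy model of that matching rule, with made-up sample values and a hypothetical `divide` helper; it is not Prometheus code.

```python
def divide(left, right, ignoring=()):
    """Divide samples whose label sets match after dropping the ignored labels."""

    def key(labels):
        # The match key is the label set without the ignored label names.
        return frozenset((k, v) for k, v in labels.items() if k not in ignoring)

    right_by_key = {key(labels): value for labels, value in right}
    return [
        (dict(key(labels)), value / right_by_key[key(labels)])
        for labels, value in left
        if key(labels) in right_by_key
    ]


# Made-up samples: same instance, different name label on each side.
used = [({"instance": "n1", "name": "java:lang:memory:heapmemoryusage:used"}, 512.0)]
maximum = [({"instance": "n1", "name": "java:lang:memory:heapmemoryusage:max"}, 1024.0)]

without_modifier = divide(used, maximum)
with_modifier = divide(used, maximum, ignoring=("name",))
```

The used/used query works for the same reason: both sides carry identical labels, so every sample finds its partner.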
My cassandra version is 2.0.8?
Is there a Prometheus yml for this? I am able to see the metrics, but I believe Grafana is only configured on the Prometheus server. How can I access the Prometheus UI in this case?
Hello,
there are numerous problems with MBeanInfo parser for Cassandra 2.2.8.
Both exporter and cassandra runtime logs are attached.
Could you confirm this behaviour with 2.x branch?
Thanks!
[main] DEBUG com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot parse java.lang:type=MemoryPool,name=Compressed Class Space as it as an unknown type java.lang.String with value NON_HEAP
[main] DEBUG com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot parse java.lang:type=MemoryPool,name=Compressed Class Space as it as an unknown type javax.management.ObjectName with value java.lang:type=MemoryPool,name=Compressed Class Space
[main] DEBUG com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot parse org.apache.cassandra.metrics:type=Cache,scope=RowCache,name=HitRate as it as an unknown type java.lang.Object with value NaN
[main] DEBUG com.criteo.nosql.cassandra.exporter.JmxScraper - Cannot parse java.lang:type=MemoryManager,name=Metaspace Manager as it as an unknown type [Ljava.lang.String; with value [Metaspace, Compressed Class Space]
Hi:
I downloaded this project and built it with Gradle, using "gradle build".
But there are some problems; the error shown is:
Could not resolve all artifacts for configuration ':classpath'.
Could not download shadow.jar (com.github.jengelman.gradle.plugins:shadow:2.0.1)
> Could not get resource 'https://jcenter.bintray.com/com/github/jengelman/gradle/plugins/shadow/2.0.1/shadow-2.0.1.jar'.
> Could not GET 'https://jcenter.bintray.com/com/github/jengelman/gradle/plugins/shadow/2.0.1/shadow-2.0.1.jar'.
> Connect to d29vzk4ow07wi7.cloudfront.net:443 [d29vzk4ow07wi7.cloudfront.net/52.222.217.47, d29vzk4ow07wi7.cloudfront.net/52.222.217.83, d29vzk4ow07wi7.cloudfront.net/52.222.217.198, d29vzk4ow07wi7.cloudfront.net/52.222.217.210] failed: Read timed out
It seems shadow-2.0.1.jar needs some dependencies. But I can download this jar directly from the URL "https://jcenter.bintray.com/com/github/jengelman/gradle/plugins/shadow/2.0.1/shadow-2.0.1.jar".
So I don't know how to fix this. Can you tell me how to build successfully?
Thanks a lot.
Hi Team,
I am able to run cassandra_exporter and see the metrics in the UI. Usually we write rules in exporter.yml to filter the metrics. Here, how can we pass a rules yml file to filter them?
Or can we import the metrics directly into Grafana and filter there? We are asking because we have a large number of Cassandra servers.
This is because config files get patched in-place:
sed: couldn't open temporary file /etc/cassandra_exporter/sedHUtiwj: Permission denied
I think it's possibly a bug, or a mistake in my configuration.
I tried setting host to a remote IP address:
$ head -1 /app/config.yml
host: 10.42.10.34:7199
On startup it still tries to connect to localhost; here are the logs:
$ java -Dorg.slf4j.simpleLogger.defaultLogLevel=trace -jar cassandra_exporter-1.0.1-all.jar /app/config.yml
[main] INFO com.criteo.nosql.cassandra.exporter.Config - Loading yaml config from /app/config.yml
[main] TRACE com.criteo.nosql.cassandra.exporter.Config - com.criteo.nosql.cassandra.exporter.Config@42dafa95
[main] ERROR com.criteo.nosql.cassandra.exporter.Main - Scrapper stopped due to uncaught exception
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused (Connection refused)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:129)
at javax.management.remote.rmi.RMIServerImpl_Stub.newClient(Unknown Source)
at javax.management.remote.rmi.RMIConnector.getConnection(RMIConnector.java:2430)
at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:308)
at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
at com.criteo.nosql.cassandra.exporter.JmxScraper.run(JmxScraper.java:104)
at com.criteo.nosql.cassandra.exporter.Main.main(Main.java:36)
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at java.net.Socket.connect(Socket.java:538)
at java.net.Socket.<init>(Socket.java:434)
at java.net.Socket.<init>(Socket.java:211)
at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
Hi,
thanks for your efforts of putting this together! Is there a way to get statistics per node in the cluster as well? Sometimes single nodes misbehave where a dashboard helps to quickly identify a faulty node.
Kind regards,
Christian
Hello,
nodetool status reports the status of every node in the cluster (from the current node's point of view), which can help to detect network partitions and other weird issues like this one (the fourth node thinks all is OK, but the other nodes disagree):
root@cassandra-0:/# nodetool status
Datacenter: staging
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.140.80.4 947.48 GiB 96 ? 1be67d6c-5ec3-4352-874a-2cfa7b56966d rack1
UN 10.140.81.4 1.04 TiB 96 ? 8dcbee81-6a73-4bcb-b95e-0833790394ac rack1
UN 10.140.82.4 1.01 TiB 96 ? 4846412a-ab09-472c-a6da-10fb6834865e rack1
DN 10.140.83.2 1.05 TiB 96 ? 72a13bd8-ae4b-4c20-833a-774b9688b264 rack1
root@cassandra-1:/# nodetool status
Datacenter: staging
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.140.80.4 947.44 GiB 96 ? 1be67d6c-5ec3-4352-874a-2cfa7b56966d rack1
UN 10.140.81.4 1.04 TiB 96 ? 8dcbee81-6a73-4bcb-b95e-0833790394ac rack1
UN 10.140.82.4 1.01 TiB 96 ? 4846412a-ab09-472c-a6da-10fb6834865e rack1
DN 10.140.83.2 1.05 TiB 96 ? 72a13bd8-ae4b-4c20-833a-774b9688b264 rack1
root@cassandra-2:/# nodetool status
Datacenter: staging
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.140.80.4 947.44 GiB 96 ? 1be67d6c-5ec3-4352-874a-2cfa7b56966d rack1
UN 10.140.81.4 1.04 TiB 96 ? 8dcbee81-6a73-4bcb-b95e-0833790394ac rack1
UN 10.140.82.4 1.01 TiB 96 ? 4846412a-ab09-472c-a6da-10fb6834865e rack1
DN 10.140.83.2 1.05 TiB 96 ? 72a13bd8-ae4b-4c20-833a-774b9688b264 rack1
root@cassandra-3:/# nodetool status
Datacenter: staging
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.140.80.4 947.44 GiB 96 ? 1be67d6c-5ec3-4352-874a-2cfa7b56966d rack1
UN 10.140.81.4 1.04 TiB 96 ? 8dcbee81-6a73-4bcb-b95e-0833790394ac rack1
UN 10.140.82.4 1.01 TiB 96 ? 4846412a-ab09-472c-a6da-10fb6834865e rack1
UN 10.140.83.2 1.05 TiB 96 ? 72a13bd8-ae4b-4c20-833a-774b9688b264 rack1
This information is available via JMX, but the corresponding MBean attribute has java.util.List type (which is unsupported by the exporter):
# java -jar jmxterm-1.0.0-uber.jar --url localhost:7199
Welcome to JMX terminal. Type "help" for available commands.
$>bean org.apache.cassandra.db:type=StorageService
#bean is set to org.apache.cassandra.db:type=StorageService
$>get LiveNodes
#mbean = org.apache.cassandra.db:type=StorageService:
LiveNodes = ( 10.140.80.4, 10.140.81.4, 10.140.82.4, 10.140.83.2 );
$>get UnreachableNodes
#mbean = org.apache.cassandra.db:type=StorageService:
UnreachableNodes = ( );
Is there a good way to detect issues like this?
Also, I looked at criteo/casspoke but, if I understand it correctly, it can detect nodes that are unavailable to clients, not inner communication issues. It looks like it can't cover this case. :(
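As a stopgap outside the exporter, the list-valued attributes could be reduced to counts, which are numeric and easy to alert on. The sketch below parses the jmxterm output format shown above. `parse_jmxterm_list` is a hypothetical helper, and feeding the counts to Prometheus (for example through a textfile collector) is left open, since nothing in this report says how that would be wired up.

```python
import re

# Matches lines such as "LiveNodes = ( 10.140.80.4, 10.140.81.4 );"
LIST_LINE = re.compile(r"\s*(\w+)\s*=\s*\((.*)\);?\s*$")


def parse_jmxterm_list(line):
    """Parse one jmxterm list attribute line into (attribute_name, items)."""
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError(f"not a jmxterm list line: {line!r}")
    name, body = match.groups()
    # Split on commas and drop the empty entry left by an empty list "( )".
    items = [item.strip() for item in body.split(",") if item.strip()]
    return name, items


live_name, live_nodes = parse_jmxterm_list(
    "LiveNodes = ( 10.140.80.4, 10.140.81.4, 10.140.82.4, 10.140.83.2 );"
)
down_name, down_nodes = parse_jmxterm_list("UnreachableNodes = ( );")
counts = {live_name.lower(): len(live_nodes), down_name.lower(): len(down_nodes)}
```

Comparing these counts across nodes would surface the disagreement shown in the nodetool output, where three nodes see one peer as down and the fourth sees none.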
How is this meant to be run if you already have Cassandra in a docker container, if it can't be a javaagent like jmx_exporter?