Comments (17)
Log of the mentioned exception I was getting after a server reboot:
radar-mongodb-connector_1 | [2017-06-20 08:44:38,769] INFO [FLUSH-WRITER] Time-elapsed: 4.1237E-5 s (org.radarcns.mongodb.MongoDbWriter)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,807] INFO [FLUSH] Time elapsed: 0.037974438 s (org.radarcns.mongodb.MongoDbSinkTask)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,807] INFO WorkerSinkTask{id=radar-connector-mongodb-sink-0} Committing offsets (org.apache.kafka.connect.runtime.WorkerSinkTask)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,818] ERROR Task radar-connector-mongodb-sink-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask)
radar-mongodb-connector_1 | org.apache.kafka.connect.errors.DataException: Failed to deserialize data to Avro:
radar-mongodb-connector_1 | at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:109)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:357)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:239)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:172)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:143)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
radar-mongodb-connector_1 | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
radar-mongodb-connector_1 | at java.util.concurrent.FutureTask.run(FutureTask.java:266)
radar-mongodb-connector_1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
radar-mongodb-connector_1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
radar-mongodb-connector_1 | at java.lang.Thread.run(Thread.java:745)
radar-mongodb-connector_1 | Caused by: org.apache.kafka.common.errors.SerializationException: Error retrieving Avro schema for id 8
radar-mongodb-connector_1 | Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:170)
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:187)
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:323)
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:316)
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:63)
radar-mongodb-connector_1 | at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndID(CachedSchemaRegistryClient.java:118)
radar-mongodb-connector_1 | at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:121)
radar-mongodb-connector_1 | at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserializeWithSchemaAndVersion(AbstractKafkaAvroDeserializer.java:190)
radar-mongodb-connector_1 | at io.confluent.connect.avro.AvroConverter$Deserializer.deserialize(AvroConverter.java:130)
radar-mongodb-connector_1 | at io.confluent.connect.avro.AvroConverter.toConnectData(AvroConverter.java:99)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.convertMessages(WorkerSinkTask.java:357)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:239)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:172)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:143)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
radar-mongodb-connector_1 | at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
radar-mongodb-connector_1 | at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
radar-mongodb-connector_1 | at java.util.concurrent.FutureTask.run(FutureTask.java:266)
radar-mongodb-connector_1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
radar-mongodb-connector_1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
radar-mongodb-connector_1 | at java.lang.Thread.run(Thread.java:745)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,821] ERROR Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,822] INFO Closed connection [connectionId{localValue:2, serverValue:7}] to hotstorage:27017 because the pool has been closed. (org.mongodb.driver.connection)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,824] INFO MongoDB connection is has been closed (org.radarcns.mongodb.MongoWrapper)
radar-mongodb-connector_1 | [2017-06-20 08:44:38,824] INFO Stopped MongoDbWriter (org.radarcns.mongodb.MongoDbWriter)
radar-mongodb-connector_1 | [2017-06-20 08:44:47,779] INFO Reflections took 11177 ms to scan 265 urls, producing 13109 keys and 86311 values (org.reflections.Reflections)
from radar-docker.
Thanks for the issue and error message. It looks like the Schema Registry does not recover correctly, and this causes errors in the MongoDB connector.
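A hypothetical first diagnostic, assuming the Schema Registry is exposed on its default port 8081 on the host (adjust the address to the actual deployment), is to ask the registry for the schema id from the stack trace:

```shell
# Assumed registry address; adjust to the deployment's actual host/port.
REGISTRY="http://localhost:8081"
SCHEMA_ID=8  # the id the connector failed to resolve

# List all registered subjects:
curl -s "$REGISTRY/subjects"

# Fetch the schema by id; a 40403 "Schema not found" response here
# confirms the registry lost (or never recovered) this schema:
curl -s "$REGISTRY/schemas/ids/$SCHEMA_ID"
```

If the id is missing while the Kafka data still references it, the registry's backing `_schemas` topic did not survive the reboot intact.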
One of these volumes may have become corrupted:
volumes:
  kafka-1-data: {}
  kafka-2-data: {}
  kafka-3-data: {}
Yeah, this is also what Francesco concluded. It was fixed after recreating the stack, i.e.:
docker-compose down
docker system prune
[delete mongodb and hdfs storage]
install....sh
However, this should of course be avoided in a production deployment.
@sboettcher were you streaming data?
Yes, at the time of the reboot that caused the issue I believe it was streaming data from one E4.
Do you still have logs of the android app?
And the logs of the kafka brokers?
@blootsvoets maybe we have lost the logs from the Kafka brokers and the Schema Registry.
Nope. This actually first happened a week ago; I only just now got around to worrying about it again... Sorry.
Which OS are you using? What did you update?
OS version: Linux nz1200 4.4.0-81-generic #104-Ubuntu SMP Wed Jun 14 08:17:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
docker version: Docker version 17.03.1-ce, build c6d412e
docker-compose version: docker-compose version 1.13.0, build 1719ceb
Reviewing my apt logs, I didn't actually update anything on that day, just rebooted after docker-compose stop.
Just want to report on an event we had today that I think fits this issue:
There was a major power outage today at our clinic, which affected the server machine the docker stack was running on. Three sources were streaming at the time; the sources themselves were not affected by the outage.
The server was killed and immediately rebooted, and the still-registered containers were seemingly restored after a few minutes. However, the still-running pRMT sources could no longer stream to the server properly: the app reported failed uploads of records, and the webserver responded with 502 to the upload requests. Unfortunately I did not manage to get a full log of the stack at this point.
I shut down the stack via docker-compose down
and started it again via the install script, which solved the problem: the apps could connect and upload again. However, it took quite some time for the locally saved records to be fully streamed to the server (the connection disruption lasted about 45 min; the last source caught up with real-time data after ~3 hours). It looked like only 1-2 sources were allowed to stream at a time before the system switched to other sources, with each switch sometimes taking 15-30 min during which nothing was streamed.
Everything works again now; I just wanted to report on this incident and how the system behaved.
It looks like the stack is robust enough to survive a reboot if it is properly shut down and started again (i.e. Docker's automatic restart of containers on reboot does not produce a working stack).
@sboettcher I have noticed the same once, when I unexpectedly crashed the server. When I looked at the container logs, there was an error in both the MongoDB and HDFS connectors: they couldn't read values from topic partitions. Maybe crashing the server requires a partition reassignment. Stopping and starting the stack worked.
Yes, this fits with what we've seen too (although, unlike your power outage, it was unclear what triggered it): the auto-restarted containers fail to provide a working stack. This was also resolved by stopping and starting the stack with the scripts, so I guess the scripts must provide some pre-initialisation. Is there any way we can block the auto-restart and trigger a script-based restart instead?
We also sometimes need to delete the HDFS and Mongo data directories for it to work, and sometimes run docker prune too. A restart script could be run based on container health: if the MongoDB or HDFS connector reports unhealthy, we could trigger a hard restart of the stack. We already have a check_health script; we can modify it to include this function. Auto-restart can be stopped by simply removing restart: always from docker-compose.yml.
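A minimal sketch of that change, assuming the connector service is named radar-mongodb-connector in docker-compose.yml (the healthcheck probe below is illustrative, not necessarily what check_health actually uses):

```yaml
services:
  radar-mongodb-connector:
    # restart: always   # removed, so a wedged task is not blindly restarted
    healthcheck:
      # Kafka Connect's REST API answers on port 8083 when the worker is up;
      # adjust the probe to whatever check exists in check_health.
      test: ["CMD-SHELL", "curl -sf http://localhost:8083/connectors || exit 1"]
      interval: 1m
      timeout: 10s
      retries: 3
```

With restart: always removed, a cron-driven check_health could then stop the stack and re-run the install script whenever the connector reports unhealthy.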
Closing as deprecated