Comments (9)
Thanks for testing this. This issue is already fixed in 1.0.8. Did you see the same issue in 1.0.8 as well?
from kafka-spark-consumer.
Hi, I'm a coworker of @ChorPangChan.

> This issue is already fixed in 1.0.8. Did you see the same issue in 1.0.8 as well?
No, this is not fixed. We tested 1.0.8 and found that the issue still exists. Here are the logs.

A successful pattern:
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Total 1 messages from Kafka: cdh005.testdev.local : 0 there in internal buffers
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Store for topic topic_tid_9999_dev for partition 0 is 76
17/01/20 14:18:34 DEBUG kafka.PartitionManager: LastComitted Offset 75
17/01/20 14:18:34 DEBUG kafka.PartitionManager: New Emitted Offset 77
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Enqueued Offset 76
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Committing offset for Partition{host=cdh005.testdev.local:9092, partition=0}
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Wrote committed offset to ZK: 77
17/01/20 14:18:34 DEBUG kafka.PartitionManager: Committed offset 77 for Partition{host=cdh005.testdev.local:9092, partition=0} for consumer: test_test_kpi_reg_user
...followed by a failing pattern:
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Total 1 messages from Kafka: cdh005.testdev.local : 0 there in internal buffers
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Store for topic topic_tid_9999_dev for partition 0 is 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: LastComitted Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: New Emitted Offset 78
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Enqueued Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Last Enqueued offset 77 not incremented since previous Comitted Offset 77 for partition Partition{host=cdh005.testdev.local:9092, partition=0} for Consumer test_test_kpi_reg_user.
17/01/20 14:19:21 DEBUG kafka.PartitionManager: LastComitted Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: New Emitted Offset 78
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Enqueued Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Last Enqueued offset 77 not incremented since previous Comitted Offset 77 for partition Partition{host=cdh005.testdev.local:9092, partition=0} for Consumer test_test_kpi_reg_user.
17/01/20 14:19:21 DEBUG kafka.PartitionManager: LastComitted Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: New Emitted Offset 78
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Enqueued Offset 77
17/01/20 14:19:21 DEBUG kafka.PartitionManager: Last Enqueued offset 77 not incremented since previous Comitted Offset 77 for partition Partition{host=cdh005.testdev.local:9092, partition=0} for Consumer test_test_kpi_reg_user.
...
# Infinite loop
The result is even worse than in 1.0.6: committing to ZooKeeper kept failing, and the consumer fell into an infinite loop.
@dibbhatt
It seems that you want to put _emittedToOffset into ZooKeeper here:
https://github.com/dibbhatt/kafka-spark-consumer/blob/master/src/main/java/consumer/kafka/PartitionManager.java#L290
If that succeeds, _lastEnquedOffset will equal _lastComittedOffset on the next loop, so the condition here will be false:
https://github.com/dibbhatt/kafka-spark-consumer/blob/master/src/main/java/consumer/kafka/PartitionManager.java#L286
...and the next offset will never be pushed to ZooKeeper.
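The stuck state can be reproduced with a minimal sketch of the commit gate (the names and method are illustrative, not the actual PartitionManager code): with a strict `>` comparison, once the enqueued offset equals the committed offset, the commit branch is never taken again, even though the emitted offset has already advanced.

```java
// Minimal sketch of the commit gate described above (hypothetical names,
// not the actual PartitionManager code).
public class CommitGateSketch {

    // Models the strict '>' comparison at PartitionManager.java#L286.
    static boolean shouldCommit(long lastEnquedOffset, long lastComittedOffset) {
        return lastEnquedOffset > lastComittedOffset;
    }

    public static void main(String[] args) {
        // Successful pattern from the logs: enqueued 76, committed 75 -> commit runs.
        System.out.println(shouldCommit(76, 75)); // true

        // Failing pattern: enqueued 77, committed 77 -> commit is skipped forever,
        // even though the emitted offset has already advanced to 78.
        System.out.println(shouldCommit(77, 77)); // false
    }
}
```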
What if I do

```java
if (_lastEnquedOffset >= _lastComittedOffset) {
...
}
```

That should work, right? Can you please raise a PR and I will merge it.
Dib
It works well for a running program that keeps consuming messages from Kafka. But when the program starts or restarts, there is a new infinite loop that keeps committing the same offset to ZooKeeper:
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Read partition information from : /consumer-path/test_test_kpi_reg_user/topic_tid_9999_dev/partition_0 --> {"partition":0,"offset":85,"topic":"topic_tid_9999_dev","broker":{"port":9092,"host":"cdh005.testdev.local"},"consumer":{"id":"test_test_kpi_reg_user"}}
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Read last commit offset from zookeeper:85 ; old topology_id:test_test_kpi_reg_user - new consumer_id: test_test_kpi_reg_user
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Starting Consumer cdh005.testdev.local :0 from offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: LastComitted Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: New Emitted Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Enqueued Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Committing offset for Partition{host=cdh005.testdev.local:9092, partition=0}
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Wrote committed offset to ZK: 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Committed offset 85 for Partition{host=cdh005.testdev.local:9092, partition=0} for consumer: test_test_kpi_reg_user
17/01/20 17:21:18 DEBUG kafka.PartitionManager: LastComitted Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: New Emitted Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Enqueued Offset 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Committing offset for Partition{host=cdh005.testdev.local:9092, partition=0}
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Wrote committed offset to ZK: 85
17/01/20 17:21:18 DEBUG kafka.PartitionManager: Committed offset 85 for Partition{host=cdh005.testdev.local:9092, partition=0} for consumer: test_test_kpi_reg_user
...
# Infinite loop
The infinite loop is caused by this line:
https://github.com/dibbhatt/kafka-spark-consumer/blob/master/src/main/java/consumer/kafka/PartitionManager.java#L211
commit() should be executed only when the buffer is not empty. That is, it should be inside the if block at L207.
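The proposed restructuring can be sketched as follows (hypothetical class, field, and method names; the real logic lives in PartitionManager): the commit call is guarded by the non-empty-buffer check, so an idle consumer that starts at the last committed offset does not rewrite the same offset on every pass.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the control-flow fix described above (hypothetical names,
// not the actual PartitionManager code).
public class GuardedCommitSketch {
    private final Deque<Long> buffer = new ArrayDeque<>();
    long committedOffset;
    int commitCount = 0;

    GuardedCommitSketch(long startOffset) {
        this.committedOffset = startOffset;
    }

    void receive(long offset) {
        buffer.add(offset);
    }

    // One pass of the consume loop.
    void processOnce() {
        if (!buffer.isEmpty()) {
            long offset = buffer.poll();
            // ... process the message ...
            // commit() lives INSIDE this block, so an empty pass commits nothing.
            commit(offset + 1);
        }
        // Before the fix, commit() sat here and ran on every pass,
        // rewriting the same offset to ZooKeeper on an idle consumer.
    }

    void commit(long offset) {
        committedOffset = offset;
        commitCount++;
    }
}
```

With this guard, a consumer restarted at offset 85 with nothing buffered performs no commits at all, instead of logging "Wrote committed offset to ZK: 85" in a loop as in the trace above.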
Actually, we can simply fix this issue by changing > to >=. But in fact, the code saves _emittedToOffset into ZooKeeper instead of the last-processed offset, which is what README.md describes. I think you should unify them.
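The mismatch between the two offset semantics can be illustrated with a tiny sketch (variable names are illustrative, not the actual fields): _emittedToOffset points at the next offset to fetch, one ahead of the message that was last processed, which matches the trace above where processing offset 76 writes 77 to ZooKeeper.

```java
// Illustration of the two offset semantics discussed above (hypothetical names).
public class OffsetSemanticsSketch {
    public static void main(String[] args) {
        long lastProcessedOffset = 76;                  // last message actually handled
        long emittedToOffset = lastProcessedOffset + 1; // next offset to fetch

        // What the code stores in ZooKeeper today (matches "Wrote committed offset to ZK: 77"):
        System.out.println("stored: " + emittedToOffset);

        // What README.md describes as stored (the last processed offset):
        System.out.println("documented: " + lastProcessedOffset);
    }
}
```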
OK, sure. If you can make the PR with the fix and the README update, that would be great. Or I will take a look at it over the weekend and fix it.
OK. I'll make a PR next Monday.