I'm consuming messages from kafka topic in confluent cloud using structured streaming

Hello <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-ur

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

Hello <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-ur

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

to_confluent_avro low performance + warnings: Schema Registry client is already configured. about abris HOT 10 CLOSED

absaoss commented on June 28, 2024

to_confluent_avro low performance + warnings: Schema Registry client is already configured.

from abris.

Comments (10)

cerveada commented on June 28, 2024 2

Hello @agolovenko, thanks for the PR.

We investigated the issue further and the core of the problem seems to be in CatalystDataToAvro. To answer your question: yes the warning shouldn't be created if everything works as it should. I'm already working on a fix to both of these issues. It should be finished tomorrow.

from abris.

felipemmelo commented on June 28, 2024

Hi @agolovenko ,

First of all, you're doing everything correctly.

Regarding the logs, they seem to be generated by Spark, but should not harm performance. You can disable them in the logging configs. Are you using log4j?

Regarding performance, it might depend on several factors. What is the latency of your network? What is the size of your message? What is the record rate on the production side? What is the latency introduced by the security layer?

To have a better understanding of the performance factors, can you run the same workload in a different cluster or turn-off security?

Cheers.

from abris.

agolovenko commented on June 28, 2024

Hi @felipemmelo, thanks for the prompt reply.

Regarding the logs messages: they seem to originate here in Abris: https://github.com/AbsaOSS/ABRiS/blob/master/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala#L183. Sure I can set the logLevel or disable that. But before doing so I'd like to understand that the code works as supposed to. And if it does then why post a warning?

Regrading performance:

Network & security latency: We've tried the exactly same setup but with a dummy stream just reading and writing binary data without any avro parsing and throughtput increased by x1000:

val outputDF = inputDF.select(col("value"))

We generate records ourselves. But to ensure that it's not a bottleneck I was running stream on a topic that already had 1M of messages, set startingOffsets to earliest and cleaned the checkpoints folder.
Regarding the message, in both cases it was the same:
- payload: {"firstname": "Ashot", "lastname": "Golovenko", "country":"RU"}
- schema: {"type":"record","name":"ProofOfConcept","fields":[{"name":"firstname","type":"string"},{"name":"lastname","type":"string"},{"name":"country","type":"string"}]}

Thanks!

from abris.

agolovenko commented on June 28, 2024

Update: we've isolated the performance problem to writing in confluent avro format using to_confluent_avro:
This

val outputDF = inputDF
    .select(from_confluent_avro(col("value"), schemaRegistryConfigIn) as "parsed_message")
    .select("parsed_message.*")
    .select(to_avro(struct("firstname", "lastname", "country")) as "value")

vs this:

val outputDF = inputDF
    .select(from_confluent_avro(col("value"), schemaRegistryConfigIn) as "parsed_message")
    .select("parsed_message.*")
    .select(
      to_confluent_avro(struct("firstname", "lastname", "country"), schemaRegistryConfigOut) as "value"
    )

Gives 1000x performance boost. Still no idea why though.

from abris.

agolovenko commented on June 28, 2024

Update 2:

I'm quite confident I've found the source of low throughput.
I was debugging the execution and this chain is causing large delays being executed on every record:

The last line calls two checks SchemaManager.exists(subject) and SchemaManager.isCompatible(schema, subject) that albeit use CachedSchemaRegistryClient aren't cached calls and poll the rest API of schema registry. This happens twice per record.
https://github.com/AbsaOSS/ABRiS/blob/master/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala#L216-L241

Removing these checks is getting the performance to the desired level (I tweaked the snapshot version). To me, it looks like these checks should be done once only per schema.

from abris.

felipemmelo commented on June 28, 2024

Hi @agolovenko , yes, that definitely makes sense. We'll get into this and try to release a new version asap.

In the meantime, since you've already done, wouldn't you like to submit the change as a PR?

from abris.

agolovenko commented on June 28, 2024

There's quite a difference between a quick patch and "doing it the right way". I simply commented out the checks which caused the slowdown.

I added a quick PR of how I see the fix could look like but it breaks the tests, and I'm not sure why since I didn't have time to dig deep in code. Suggestions appreciated. #86

from abris.

agolovenko commented on June 28, 2024

My other question still stands though: if the code is working as designed then why is it logging warnings? https://github.com/AbsaOSS/ABRiS/blob/master/src/main/scala/za/co/absa/abris/avro/read/confluent/SchemaManager.scala#L183

from abris.

cerveada commented on June 28, 2024

Hello @agolovenko, the issues were fixed. Would you mind trying it by yourself before we release new version of Abris?

The fix is already in master branch so you can just build it with Maven from there. By default, Abris is build for Scala 2.12, but here is a guide how to build it for 2.11 in case you need it.

from abris.

agolovenko commented on June 28, 2024

Hi @cerveada ! Just checked the fix and it works great. The performance is back to the exected level now. Thanks a lot!

from abris.

to_confluent_avro low performance + warnings: Schema Registry client is already configured. about abris HOT 10 CLOSED

Comments (10)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent