
Comments (8)

math-g commented on May 19, 2024

If messages with a null key are indeed dropped, as for a KTable, could you give a way to select the key from the values when creating the table? Maybe that's the purpose of WITH KEY, but I could not make it work.
Otherwise, when using Kafka Connect JDBC in a setup where ValueToKey transformations are not possible (because the tables do not all use the same name for the id column, and the ValueToKey transformation doesn't allow such a per-table id column name mapping), you need to write a Kafka Streams application that creates a KStream for each Kafka Connect output topic (one per JDBC table) and selects a key for each, before you can use CREATE TABLE in KSQL.

from ksql.

hjafarpour commented on May 19, 2024

@math-g Yes, you are right. SELECT * FROM table shows no results over a topic with null keys because KSQL, like KTable in the Kafka Streams API, needs keys in the underlying topic for a TABLE and drops messages with a null key.
To define one of the columns as the key, you can use the PARTITION BY clause on a stream.
So in your example, for the topic that was generated by Connect, first define a stream over it with a CREATE STREAM DDL statement, then create a new stream using a CREATE STREAM AS SELECT statement with a PARTITION BY clause at the end. This creates a new stream with the selected key. For a PARTITION BY example, see https://github.com/confluentinc/ksql/blob/0.1.x/docs/examples.md#examples
Now you can define your table over the topic that your new stream writes to, since the key in that topic won't be null.
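
The three steps above could look roughly like the following. This is only a sketch: the topic name 'connect-jdbc-customers', the output topic 'customers_rekeyed', the stream and table names, and the columns customer_id and name are all hypothetical placeholders, and the exact syntax may differ between KSQL versions.

```sql
-- Step 1: declare a stream over the topic written by Connect
-- (topic name and columns are assumptions for illustration).
CREATE STREAM customers_src (customer_id BIGINT, name VARCHAR)
  WITH (kafka_topic='connect-jdbc-customers', value_format='JSON');

-- Step 2: derive a new stream whose message key is taken from a value column.
-- Here WITH kafka_topic names the output topic of the new stream.
CREATE STREAM customers_rekeyed
  WITH (kafka_topic='customers_rekeyed') AS
  SELECT * FROM customers_src
  PARTITION BY customer_id;

-- Step 3: declare the table over the re-keyed topic, whose keys are no longer null.
CREATE TABLE customers (customer_id BIGINT, name VARCHAR)
  WITH (kafka_topic='customers_rekeyed', value_format='JSON', key='customer_id');
```

The intermediate stream exists only to produce a topic with non-null keys; the table is then declared over that topic rather than over the original Connect topic.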


math-g commented on May 19, 2024

OK, thanks, I was able to make the SELECT FROM TABLE work. Interesting to see that the 'WITH kafka_topic' property works as the input for CREATE and as the output for CREATE AS SELECT.
But the partitioned stream was empty, so when I create the table I can only see new messages and I don't have a table with all the data from the beginning. Is there a way to avoid that?

Do you also know whether, with Kafka Connect, you can apply a per-table ValueToKey transformation? That would be simpler overall.


hjafarpour commented on May 19, 2024

@math-g By default, KSQL reads a topic from the current offset. One exception is the Stream-Table join, where the table is read from the beginning.
You can use the following statement to change this so that KSQL reads topics from the beginning:

SET 'auto.offset.reset'='earliest';

For more details, please refer to https://github.com/confluentinc/ksql/blob/0.1.x/docs/examples.md#examples
I'll get back to you on the connect question.


math-g commented on May 19, 2024

Thanks, I had already set this variable. It doesn't seem to work for the intermediate stream, though: its output topic only seems to receive new messages.

OK, I will await your reply about Kafka Connect.


hjafarpour commented on May 19, 2024

@math-g About Connect: it is possible to apply an SMT to populate or override the key in a message, but ValueToKey will duplicate the value into the key. Another SMT might work for a single table and specific fields in the value, but handling more than one table probably requires a custom SMT.


makgroup commented on May 19, 2024

@hjafarpour I am facing the same issue, where SELECT * FROM table hangs.
I loaded simple data, account(id, name), using /etc/kafka/connect-standalone.properties.
The topic test-oracle-jdbc-ACCOUNTS was created, and I am able to pull the data from the topic.

But after running:

CREATE STREAM ACCOUNT_INFO (id INTEGER, name varchar) WITH (kafka_topic='test-oracle-jdbc-ACCOUNTS', value_format='JSON');
CREATE TABLE ACCOUNT_DATA (id INTEGER, name varchar) WITH (kafka_topic='test-oracle-jdbc-ACCOUNTS', KEY='ID', value_format='JSON');

selecting from both the stream and the table hangs. Can you please help me with this?


rmoff commented on May 19, 2024

See #1405 - closing.

