
advent-of-code-flink-paimon's Introduction

Hi 👋 I'm Giannis

and I'm an Architect & Trainer focusing on Streaming Data.

Passionate about Event Streaming Systems, Stateful Stream Processing, and Streaming Lakehouses.

  • 🔭 Working @ Ververica
  • 🌱 I'm focusing on Apache Flink and Apache Paimon.
  • 📝 I write articles on Medium and, from time to time, @ Rock The JVM Blog
  • 💬 Ask me about Apache Flink, Apache Paimon, and Streaming Data Systems.
  • 📫 How to reach me: https://www.linkedin.com/in/polyzos/

advent-of-code-flink-paimon's People

Contributors

polyzos

advent-of-code-flink-paimon's Issues

failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0

Hello dear,
I used your YAML from the Flink book, added the Paimon jar to its jars folder and to the Dockerfile, and ended up with the YAML below (also attached). Then I followed this repository's README.md and guide.md: I created the "measurements" table, ran the other commands, and submitted this job:
SET 'pipeline.name' = 'Sensor Info Ingestion';

INSERT INTO sensor_info
SELECT * FROM `default_catalog`.`default_database`.sensor_info;

In the Flink UI, under Running Jobs, I received this "Root Exception":

2024-04-23 18:14:17
java.io.IOException: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
	at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:125)
	at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:237)
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:146)
	at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
	at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
	at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
	at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788)
	at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
	at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:931)
	at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)
	at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.io.UncheckedIOException: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
	at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:77)
	at org.apache.paimon.io.StatsCollectingSingleFileWriter.<init>(StatsCollectingSingleFileWriter.java:58)
	at org.apache.paimon.io.RowDataFileWriter.<init>(RowDataFileWriter.java:58)
	at org.apache.paimon.io.RowDataRollingFileWriter.lambda$new$0(RowDataRollingFileWriter.java:51)
	at org.apache.paimon.io.RollingFileWriter.openCurrentWriter(RollingFileWriter.java:100)
	at org.apache.paimon.io.RollingFileWriter.write(RollingFileWriter.java:79)
	at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:113)
	at org.apache.paimon.append.AppendOnlyWriter.write(AppendOnlyWriter.java:51)
	at org.apache.paimon.operation.AbstractFileStoreWrite.write(AbstractFileStoreWrite.java:114)
	at org.apache.paimon.table.sink.TableWriteImpl.writeAndReturn(TableWriteImpl.java:114)
	at org.apache.paimon.flink.sink.StoreSinkWriteImpl.write(StoreSinkWriteImpl.java:159)
	at org.apache.paimon.flink.sink.RowDataStoreWriteOperator.processElement(RowDataStoreWriteOperator.java:123)
	... 13 more
Caused by: java.io.IOException: Mkdirs failed to create file:/opt/flink/temp/paimon/default.db/measurements/bucket-0
	at org.apache.paimon.fs.local.LocalFileIO.newOutputStream(LocalFileIO.java:80)
	at org.apache.paimon.io.SingleFileWriter.<init>(SingleFileWriter.java:68)
	... 24 more 

I think this is because Paimon does not, or cannot, create the needed files or folders at that location. I also checked the path: /opt/flink/temp/paimon/default.db/measurements/ exists, and it contains a file named schema-0, but no bucket-0 directory and none of the Parquet files. The contents of schema-0 look like the table metadata for the job (the JSON below, followed by the attached docker-compose YAML).
What did I do wrong?!

{
  "id" : 0,
  "fields" : [ {
    "id" : 0,
    "name" : "sensor_id",
    "type" : "BIGINT"
  }, {
    "id" : 1,
    "name" : "reading",
    "type" : "DECIMAL(5, 1)"
  } ],
  "highestFieldId" : 1,
  "partitionKeys" : [ ],
  "primaryKeys" : [ ],
  "options" : {
    "bucket" : "1",
    "schema.2.expr" : "PROCTIME()",
    "schema.2.data-type" : "TIMESTAMP(3) WITH LOCAL TIME ZONE NOT NULL",
    "schema.2.name" : "event_time",
    "bucket-key" : "sensor_id",
    "file.format" : "parquet"
  },
  "timeMillis" : 1713882941002
}
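For context, the schema above records the table options Paimon uses for writes: with "bucket" : "1" and "bucket-key" : "sensor_id", every row hashes into a single bucket-0 directory, which is exactly the path the job fails to create. A minimal Python sketch (just parsing the relevant parts of the JSON shown above) pulls those options out:

```python
import json

# schema-0 contents as posted above, abbreviated to the relevant parts
schema = json.loads("""
{
  "id": 0,
  "fields": [
    {"id": 0, "name": "sensor_id", "type": "BIGINT"},
    {"id": 1, "name": "reading", "type": "DECIMAL(5, 1)"}
  ],
  "partitionKeys": [],
  "primaryKeys": [],
  "options": {
    "bucket": "1",
    "bucket-key": "sensor_id",
    "file.format": "parquet"
  }
}
""")

# Fixed-bucket mode: Paimon will create exactly num_buckets directories
# (bucket-0 .. bucket-N) under the table path and route rows by bucket_key.
num_buckets = int(schema["options"]["bucket"])
bucket_key = schema["options"]["bucket-key"]
print(num_buckets, bucket_key)  # → 1 sensor_id
```

So the table definition itself looks consistent; the failure is in creating that one bucket-0 directory on the filesystem, not in the schema.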
version: "3.7"
networks:
  redpanda_network:
    driver: bridge
volumes:
  redpanda: null
services:
  redpanda:
    command:
      - redpanda
      - start
      - --kafka-addr internal://0.0.0.0:9092,external://0.0.0.0:19092
      - --advertise-kafka-addr internal://redpanda:9092,external://localhost:19092
      - --pandaproxy-addr internal://0.0.0.0:8082,external://0.0.0.0:18082
      - --advertise-pandaproxy-addr internal://redpanda:8082,external://localhost:18082
      - --schema-registry-addr internal://0.0.0.0:8081,external://0.0.0.0:18081
      - --rpc-addr redpanda:33145
      - --advertise-rpc-addr redpanda:33145
      - --smp 1
      - --memory 1G
      - --mode dev-container
      - --default-log-level=debug
    image: docker.redpanda.com/redpandadata/redpanda:v23.2.3
    container_name: redpanda
    volumes:
      - redpanda:/var/lib/redpanda/data
    networks:
      - redpanda_network
    ports:
      - "18081:18081"
      - "18082:18082"
      - "19092:19092"
      - "19644:9644"
  console:
    container_name: redpanda-console
    image: docker.redpanda.com/vectorized/console:v2.3.0
    networks:
      - redpanda_network
    entrypoint: /bin/sh
    command: -c 'echo "$$CONSOLE_CONFIG_FILE" > /tmp/config.yml; /app/console'
    environment:
      CONFIG_FILEPATH: /tmp/config.yml
      CONSOLE_CONFIG_FILE: |
        kafka:
          brokers: ["redpanda:9092"]
          schemaRegistry:
            enabled: true
            urls: ["http://redpanda:8081"]
        redpanda:
          adminApi:
            enabled: true
            urls: ["http://redpanda:9644"]
    ports:
      - "8080:8080"
    depends_on:
      - redpanda
  jobmanager:
    build: .
    container_name: jobmanager
    ports:
      - "8081:8081"
      - "9249:9249"
    command: jobmanager
    networks:
      - redpanda_network
    volumes:
      - ./jars/:/opt/flink/jars
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        jobmanager.scheduler: Adaptive
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  taskmanager1:
    build: .
    container_name: taskmanager1
    depends_on:
      - jobmanager
    command: taskmanager
    networks:
      - redpanda_network
    ports:
      - "9250:9249"
    volumes:
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 5
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  taskmanager2:
    build: .
    container_name: taskmanager2
    depends_on:
      - jobmanager
    command: taskmanager
    networks:
      - redpanda_network
    ports:
      - "9251:9249"
    volumes:
      - ./logs/flink/:/opt/flink/temp
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        jobmanager.scheduler: Adaptive
        taskmanager.numberOfTaskSlots: 5
        metrics.reporters: prom
        metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
        metrics.reporter.prom.port: 9249
        state.backend.rocksdb.metrics.actual-delayed-write-rate: true
        state.backend.rocksdb.metrics.background-errors: true
        state.backend.rocksdb.metrics.block-cache-capacity: true
        state.backend.rocksdb.metrics.estimate-num-keys: true
        state.backend.rocksdb.metrics.estimate-live-data-size: true
        state.backend.rocksdb.metrics.estimate-pending-compaction-bytes: true
        state.backend.rocksdb.metrics.num-running-compactions: true
        state.backend.rocksdb.metrics.compaction-pending: true
        state.backend.rocksdb.metrics.is-write-stopped: true
        state.backend.rocksdb.metrics.num-running-flushes: true
        state.backend.rocksdb.metrics.mem-table-flush-pending: true
        state.backend.rocksdb.metrics.block-cache-usage: true
        state.backend.rocksdb.metrics.size-all-mem-tables: true
        state.backend.rocksdb.metrics.num-live-versions: true
        state.backend.rocksdb.metrics.block-cache-pinned-usage: true
        state.backend.rocksdb.metrics.estimate-table-readers-mem: true
        state.backend.rocksdb.metrics.num-snapshots: true
        state.backend.rocksdb.metrics.num-entries-active-mem-table: true
        state.backend.rocksdb.metrics.num-deletes-imm-mem-tables: true
  postgres:
    image: postgres:latest
    networks:
      - redpanda_network
    container_name: postgres
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    ports:
      - '5432:5432'
  prometheus:
    image: prom/prometheus:latest
    networks:
      - redpanda_network
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/config.yaml'
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus:/etc/prometheus
  #      - ./prom_data:/prometheus
  grafana:
    image: grafana/grafana
    networks:
      - redpanda_network
    container_name: grafana
    ports:
      - "3000:3000"
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=grafana
      - GF_SECURITY_ADMIN_PASSWORD=grafana
    volumes:
      - ./grafana/provisioning:/etc/grafana/provisioning
      - ./grafana/dashboards:/var/lib/grafana/dashboards
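One thing worth checking (an assumption, not a confirmed diagnosis): "Mkdirs failed to create file:/..." from Paimon's LocalFileIO means the JVM's File.mkdirs() returned false for the bucket directory. In this compose file, /opt/flink/temp is a bind mount of ./logs/flink on the host, shared by the jobmanager and both taskmanagers. If the measurements/ directory and schema-0 were created by a different user (for example, a root docker exec session in the jobmanager) than the flink user (uid 9999) that the official Flink image runs as, the taskmanagers cannot create bucket-0 inside it; checking ownership with ls -ln ./logs/flink on the host, and fixing it with chown -R 9999 ./logs/flink (or a world-writable chmod), would be the first thing to try. A small Python sketch of the general failure mode, using hypothetical temp paths rather than the container's real filesystem:

```python
import os
import tempfile

# Simulate "Mkdirs failed": a path component that cannot serve as a
# directory makes os.makedirs raise, analogous to File.mkdirs()
# returning false when the parent is unusable (wrong type, no permission).
base = tempfile.mkdtemp()
blocker = os.path.join(base, "measurements")
open(blocker, "w").close()  # a regular file where a directory is expected

try:
    os.makedirs(os.path.join(blocker, "bucket-0"))
    outcome = "created"
except OSError as exc:
    outcome = f"mkdirs failed: {type(exc).__name__}"

print(outcome)  # → mkdirs failed: NotADirectoryError
```

In the real setup the parent is a directory, so the analogous trigger would be a PermissionError; either way the write path, not the table definition, is what needs fixing.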
