afedulov / fraud-detection-demo

Repository for Advanced Flink Application Patterns series

Home Page: https://flink.apache.org/news/2020/01/15/demo-fraud-detection.html

Dockerfile 1.23% HTML 0.53% Shell 0.81% TypeScript 13.84% Java 82.96% JavaScript 0.09% SCSS 0.53%

fraud-detection-demo's Introduction

NOTE: The older Docker images are not available for Apple silicon. If you run into this issue, try the WIP branch: https://github.com/afedulov/fraud-detection-demo/tree/with-1.15

Fraud Detection Demo with Apache Flink

This demo accompanies a three-part blog post series on Advanced Flink Application Patterns (see the home page link above).

Requirements:

The demo is bundled as a self-contained package. To build it from sources you will need:

  • git
  • docker
  • docker-compose

Recommended resources allocated to Docker:

  • 4 CPUs
  • 8GB RAM

You can check out the repository and run the demo locally.

How to run:

To run the demo locally, execute the following commands, which build the project from sources and start all required services, including the Apache Flink and Apache Kafka clusters.

git clone https://github.com/afedulov/fraud-detection-demo
cd fraud-detection-demo
docker build -t demo-fraud-webapp:latest -f webapp/webapp.Dockerfile webapp/
docker build -t flink-job-fraud-demo:latest -f flink-job/Dockerfile flink-job/
docker-compose -f docker-compose-local-job.yaml up

Note: Dependencies are stored in a cached Docker layer. If you later modify only the source code, not the dependencies, subsequent builds will be significantly faster.

When all components are up and running, go to localhost:5656 in your browser.

Note: you might need to change exposed ports in docker-compose-local-job.yaml in case of collisions.

fraud-detection-demo's People

Contributors

afedulov, dependabot[bot]

fraud-detection-demo's Issues

Cannot compile with JDK 11

I had to update Lombok to 1.18.2 to compile successfully; otherwise I got the following error:

Fatal error compiling: java.lang.ExceptionInInitializerError: com.sun.tools.javac.code.TypeTags -> [Help 1]

Add Rule validation to check whether all fields in a rule are present.

First, I want to say that this is a great demo showcasing the power of Flink.
While running it in my environment, I found that if I add a rule whose windowMinutes is not defined, Flink throws an exception:

java.lang.NullPointerException
    at com.ververica.field.dynamicrules.Rule.getWindowMillis(Rule.java:46)
    at com.ververica.field.dynamicrules.functions.DynamicAlertFunction.updateWidestWindowRule(DynamicAlertFunction.java:220)
    at com.ververica.field.dynamicrules.functions.DynamicAlertFunction.processBroadcastElement(DynamicAlertFunction.java:144)
    at com.ververica.field.dynamicrules.functions.DynamicAlertFunction.processBroadcastElement(DynamicAlertFunction.java:51)
    at org.apache.flink.streaming.api.operators.co.CoBroadcastWithKeyedOperator.processElement2(CoBroadcastWithKeyedOperator.java:133)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.processRecord2(StreamTwoInputProcessorFactory.java:225)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.lambda$create$1(StreamTwoInputProcessorFactory.java:194)
    at org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$StreamTaskNetworkOutput.emitRecord(StreamTwoInputProcessorFactory.java:266)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
    at org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
    at org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor.processInput(StreamMultipleInputProcessor.java:85)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:542)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:831)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:780)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:935)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:914)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:728)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:550)
    at java.base/java.lang.Thread.run(Unknown Source)

After some analysis: whenever a new rule is added without windowMinutes defined, this error is thrown.

I can take this up and add a validation check for each new rule. :)

Thanks in advance!!
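For illustration, here is a minimal sketch of such a validation step, assuming the demo's Rule POJO exposes Lombok-generated getters such as getRuleId() and getWindowMinutes() (the names are illustrative and may differ from the actual class):

public final class RuleValidator {

  /** Rejects incomplete rules up front instead of letting a null field surface later as an NPE. */
  public static void validate(Rule rule) {
    if (rule == null) {
      throw new IllegalArgumentException("Rule must not be null");
    }
    if (rule.getRuleId() == null) {
      throw new IllegalArgumentException("Rule is missing ruleId: " + rule);
    }
    if (rule.getWindowMinutes() == null) {
      throw new IllegalArgumentException("Rule " + rule.getRuleId() + " is missing windowMinutes");
    }
  }

  private RuleValidator() {}
}

Calling something like this during rule deserialization or at the top of processBroadcastElement would turn the NullPointerException into a clear validation error, or the malformed rule could simply be logged and skipped.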

Why does the stringsStreamToRules method in the RulesSource class assign watermarks?

public static DataStream<Rule> stringsStreamToRules(DataStream<String> ruleStrings) {
  return ruleStrings
      .flatMap(new RuleDeserializer())
      .name("Rule Deserialization")
      .setParallelism(RULES_STREAM_PARALLELISM)
      .assignTimestampsAndWatermarks(
          new BoundedOutOfOrdernessTimestampExtractor<Rule>(Time.of(0, TimeUnit.MILLISECONDS)) {
            @Override
            public long extractTimestamp(Rule element) {
              // Prevents connected data+update stream watermark stalling.
              return Long.MAX_VALUE;
            }
          });
}
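The reason is hinted at by the inline comment: for a connected/broadcast two-input operator, the current watermark is the minimum of the watermarks of both inputs. The rules stream only emits records when a rule is added or changed, so with ordinary watermarks it would hold the combined watermark back and stall event-time progress on the transactions side. Stamping every rule with Long.MAX_VALUE makes the rules stream's watermark effectively infinite, so it never constrains the combined watermark. For comparison only, here is a sketch of the same intent using the WatermarkStrategy API introduced in Flink 1.11 (this is not the repository's code; Rule and RuleDeserializer are the demo's types):

import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;

public class RulesSourceSketch {

  public static DataStream<Rule> stringsStreamToRules(DataStream<String> ruleStrings) {
    return ruleStrings
        .flatMap(new RuleDeserializer())
        .name("Rule Deserialization")
        .assignTimestampsAndWatermarks(
            WatermarkStrategy.<Rule>forBoundedOutOfOrderness(Duration.ZERO)
                // Long.MAX_VALUE timestamps mean this stream never lowers the
                // min-of-inputs watermark of the connected operator downstream.
                .withTimestampAssigner((rule, previousTimestamp) -> Long.MAX_VALUE));
  }
}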

failed to create LLB definition

Hi, when I execute the command docker build -t demo-fraud-webapp:latest -f webapp/webapp.Dockerfile webapp/, it fails with the following error:
failed to solve with frontend dockerfile.v0: failed to create LLB definition: no match for platform in manifest sha256:31378686fda613ca2a502a1731b12ff2d7a8dacabec2160715ebecb461fd3642: not found

My computer is a Mac (M1, macOS 11.4); I don't know whether that is related to this error.

Execution failed for task ':verifyGoogleJavaFormat'.

Hi, I was following the README; at the third step:

docker build -t flink-job-fraud-demo:latest -f flink-job/Dockerfile flink-job/

I get the following failure.

My environment: Docker version 18.09.1, build 4c52b90, on macOS 10.12.6.

Details follow:

Step 10/16 : RUN ./gradlew build
 ---> Running in ad0d25cf8f75
Starting a Gradle Daemon, 1 incompatible and 1 stopped Daemons could not be reused, use --status for details
:compileJava
:processResources
:classes
:jar
:startScripts UP-TO-DATE
:distTar
:distZip
:shadowJar
:startShadowScripts
:shadowDistTar
:shadowDistZip
:assemble
:compileTestJava
:processTestResources NO-SOURCE
:testClasses
:test
:verifyGoogleJavaFormat


The following files are not formatted properly:

/home/gradle/app/src/main/java/com/ververica/field/dynamicrules/RulesEvaluator.java
/home/gradle/app/src/test/java/com/ververica/field/dynamicrules/RulesEvaluatorTest.java
:verifyGoogleJavaFormat FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':verifyGoogleJavaFormat'.
> Problems: formatting style violations

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

* Get more help at https://help.gradle.org

BUILD FAILED in 45s

Execution failed for task ':verifyGoogleJavaFormat'

Hi, this error happened to me again, but with a different file: #1

These are the details:
Step 10/17 : RUN ./gradlew build
---> Running in 90a4f4b0407f
Starting a Gradle Daemon, 1 incompatible and 1 stopped Daemons could not be reused, use --status for details
:compileJava
:processResources
:classes
:jar
:startScripts UP-TO-DATE
:distTar
:distZip
:shadowJar
:startShadowScripts
:shadowDistTar
:shadowDistZip
:assemble
:compileTestJava
:processTestResources NO-SOURCE
:testClasses
:test
:verifyGoogleJavaFormat

The following files are not formatted properly:

/home/gradle/app/src/main/java/com/ververica/field/dynamicrules/functions/DynamicAlertFunction.java
:verifyGoogleJavaFormat FAILED

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':verifyGoogleJavaFormat'.

Problems: formatting style violations

  • Try:
    Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.

  • Get more help at https://help.gradle.org

BUILD FAILED in 44s
13 actionable tasks: 12 executed, 1 up-to-date
The command '/bin/sh -c ./gradlew build' returned a non-zero code: 1

M2 chip hangs while giving same error

I tried the M1 branch; however, it produces the logs below in a loop.

fraud-detection-demo-kafka-cp-kafka-headless-1  | [main-SendThread(zoo1:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: zoo1/172.20.0.2:2181: Connection refused
fraud-detection-demo-schema-registry-1          | [kafka-admin-client-thread | adminclient-1] WARN org.apache.kafka.clients.NetworkClient - [AdminClient clientId=adminclient-1] Connection to node -1 (kafka-cp-kafka-headless/172.20.0.3:9092) could not be established. Broker may not be available.
fraud-detection-demo-demo-1                     | 2023-08-10 14:09:57.546  WARN 1 --- [           main] org.apache.kafka.clients.NetworkClient   : [Consumer clientId=consumer-1, groupId=latency] Connection to node -1 could not be established. Broker may not be available.
fraud-detection-demo-kafka-cp-kafka-headless-1  | [main-SendThread(zoo1:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server zoo1/172.20.0.2:2181. Will not attempt to authenticate using SASL (unknown error)



Changing the ZooKeeper image and config as below solved the problem:


zookeeper:
    image: confluentinc/cp-zookeeper:7.3.0
    hostname: zookeeper
    container_name: zookeeper
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

nested rules (AND / OR) support

Firstly, I would like to thank you for sharing this project with the community -- this is an awesome implementation of a rules engine with Flink for simple aggregations.

@afedulov, I was wondering what your suggestion would be for adding support for nested/complex rules. Do you have an idea of the best way to implement it?

Example:

  • transactionAmount is greater than 100 AND lower than 200
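One possible direction (a sketch only, not part of the demo): represent a rule's condition as a small expression tree, where leaf predicates test a single field and AND/OR nodes combine child conditions recursively. The type names below are hypothetical, and in the demo such a tree would also need a JSON representation so rules can still be pushed through the control topic:

import java.io.Serializable;
import java.util.List;

interface Condition<T> extends Serializable {

  boolean matches(T event);

  // AND node: all children must match.
  static <T> Condition<T> and(List<Condition<T>> children) {
    return event -> children.stream().allMatch(c -> c.matches(event));
  }

  // OR node: at least one child must match.
  static <T> Condition<T> or(List<Condition<T>> children) {
    return event -> children.stream().anyMatch(c -> c.matches(event));
  }
}

// Example from the issue, assuming a Transaction type with a numeric getPaymentAmount():
// Condition<Transaction> cond = Condition.and(Arrays.asList(
//     t -> t.getPaymentAmount() > 100,
//     t -> t.getPaymentAmount() < 200));

Evaluating condition.matches(transaction) inside the alerting function would then replace the single comparison that a flat rule expresses today; the remaining work is (de)serializing the tree as part of the rule JSON.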

Unable to build docker images

I am getting an error while building the Docker image:

docker build -t demo-fraud-webapp:latest -f webapp/webapp.Dockerfile webapp/
[+] Building 1.7s (5/5) FINISHED
 => [internal] load build definition from webapp.Dockerfile                          0.0s
 => => transferring dockerfile: 974B                                                 0.0s
 => [internal] load .dockerignore                                                    0.0s
 => => transferring context: 34B                                                     0.0s
 => CANCELED [internal] load metadata for docker.io/library/openjdk:8-jdk-alpine     1.6s
 => [internal] load metadata for docker.io/library/node:10                           1.4s
 => ERROR [internal] load metadata for docker.io/library/maven:3.6.2-jdk-8-openj9    1.6s
------
 > [internal] load metadata for docker.io/library/maven:3.6.2-jdk-8-openj9:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: no match for platform in manifest sha256:31378686fda613ca2a502a1731b12ff2d7a8dacabec2160715ebecb461fd3642: not found

Made some modifications to the versions, but the source code has errors with ClassNotFoundException

Changes to the Docker files:

flink-job/Dockerfile

-FROM flink:1.8.2
+FROM flink:1.15.1-java8
webapp/webapp.Dockerfile

# --- Maven Build
-FROM maven:3.6.2-jdk-8-openj9 as maven-build
+FROM maven:3.8.6-openjdk-11 as maven-build

# --- Main container
-FROM openjdk:8-jdk-alpine as main
+FROM openjdk:11 as main

Making the above changes builds the Docker images successfully, but I get errors while running the containers:

java.lang.NoSuchMethodError: org.apache.flink.api.common.state.OperatorStateStore.getSerializableListState(Ljava/lang/String;)Lorg/apache/flink/api/common/state/ListState;


java.lang.NoClassDefFoundError: org/apache/flink/shaded/guava18/com/google/common/collect/Lists


java.lang.NoSuchMethodError: org.apache.flink.api.common.functions.RuntimeContext.getMetricGroup()Lorg/apache/flink/metrics/MetricGroup;
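These errors point to a binary mismatch rather than a source bug: the job and the connector versions it bundles were compiled against Flink 1.8 APIs, while the upgraded image runs Flink 1.15, where getSerializableListState, the shaded guava18 package, and the old getMetricGroup signature no longer exist. Besides bumping the base images, the Flink and connector versions in flink-job's build need to be raised to match, and the removed APIs replaced in the code. As one hedged example (a sketch, not the demo's code), the descriptor-based replacement for getSerializableListState(String) looks roughly like this:

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.runtime.state.FunctionInitializationContext;

public class StateMigrationSketch {

  // In a CheckpointedFunction, declare operator state with an explicit descriptor
  // instead of the removed getSerializableListState(String) shortcut.
  public void initializeState(FunctionInitializationContext context) throws Exception {
    ListState<Long> state =
        context
            .getOperatorStateStore()
            .getListState(new ListStateDescriptor<>("demo-state", Types.LONG));
    // ... restore from `state` when context.isRestored() is true
  }
}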

If statement in Throttler doesn't do anything

You have Preconditions.checkArgument(maxRecordsPerSecond == -1 ...).

Then on the next line you do if (maxRecordsPerSecond == -1), which will never be the case.

I was looking at this because I had a question about backpressure with a source, and it looks like it is not possible to apply backpressure to a data source; you just have to throttle it. I thought doing a thread sleep might have other consequences, such as blocking the entire task.

https://stackoverflow.com/questions/61465789/how-do-i-create-a-flink-richparallelsourcefunction-with-backpressure
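For context, the throttling pattern under discussion looks roughly like the following (a sketch, not the demo's Throttler or source code): a parallel source that sleeps inside its own emission loop to cap the output rate. Sleeping there pauses only this source subtask's run() loop; downstream operators keep processing what has already been emitted.

import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction.SourceContext;

public class ThrottledStringSource extends RichParallelSourceFunction<String> {

  private final long maxRecordsPerSecond; // -1 means "do not throttle"
  private volatile boolean running = true;

  public ThrottledStringSource(long maxRecordsPerSecond) {
    this.maxRecordsPerSecond = maxRecordsPerSecond;
  }

  @Override
  public void run(SourceContext<String> ctx) throws Exception {
    long i = 0;
    while (running) {
      synchronized (ctx.getCheckpointLock()) {
        ctx.collect("record-" + i++);
      }
      if (maxRecordsPerSecond > 0) {
        // Crude pacing: sleep between records. This blocks only the emission loop
        // of this subtask, not the whole pipeline.
        Thread.sleep(1000L / maxRecordsPerSecond);
      }
    }
  }

  @Override
  public void cancel() {
    running = false;
  }
}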
