yaowenqiang / alldata

This project forked from alldatacenter/alldata

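AllData big data platform: an open-source community big data platform built through secondary development of big data ecosystem components, together with big data collection, storage, computing, and development. Contact the author: https://docs.qq.com/doc/DVFVMYUp6cFhSRVJs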

Home Page: https://alldatacenter.github.io

License: Apache License 2.0

Shell 0.48% JavaScript 3.43% Ruby 0.01% C++ 17.63% Python 4.13% Perl 0.13% C 0.86% PHP 0.01% Objective-C 0.01% Java 63.95% Lua 1.53% Scala 2.61% Groovy 0.69% R 0.01% Go 2.57% C# 0.01% Assembly 1.18% Rust 0.75% PowerShell 0.01% Kotlin 0.04%

alldata's Introduction

AllData One-Stop Big Data Platform

Stargazers over time

http://43.138.157.47:8013/dashboard Account: poc Password: 123456

ElAdmin login page

image

Home page

image

Data integration

image

Metadata management

image

Metadata pickup

image

Application analysis

image

System menu management

image

Metadata management

image

Data quality

image

Data marketplace

image

Data standards

image

BI reports

image

Data assets

image

Process orchestration

image


Getting started with Flink data lineage

1 Result preview

image

2 Create the Flink DDL

See Resource/FlinkDDLSQL.sql

-- Source: datagen emits 3 rows at 1 row per second, with amount drawn at random from [10, 11]
CREATE TABLE data_gen (
    amount BIGINT
) WITH (
    'connector' = 'datagen',
    'rows-per-second' = '1',
    'number-of-rows' = '3',
    'fields.amount.kind' = 'random',
    'fields.amount.min' = '10',
    'fields.amount.max' = '11'
);

-- Sink: JDBC table test_table in the test_db database of a local MySQL instance
CREATE TABLE mysql_sink (
    amount BIGINT,
    PRIMARY KEY (amount) NOT ENFORCED
) WITH (
    'connector' = 'jdbc',
    'url' = 'jdbc:mysql://localhost:3306/test_db',
    'table-name' = 'test_table',
    'username' = 'root',
    'password' = '123456',
    'lookup.cache.max-rows' = '5000',
    'lookup.cache.ttl' = '10min'
);

-- This statement produces the lineage edge data_gen.amount -> mysql_sink.amount
INSERT INTO mysql_sink SELECT amount AS amount FROM data_gen;

3 Run com.platform.FlinkLineageBuild

Output:

1. Flink lineage build result, tables:

[LineageTable{id='4', name='data_gen', columns=[LineageColumn{name='amount', title='amount'}]},
LineageTable{id='6', name='mysql_sink', columns=[LineageColumn{name='amount', title='amount'}]}]

Table ID: 4
Table Name: data_gen
Table ID: 4
Table Name: data_gen
Table column: LineageColumn{name='amount', title='amount'}
Table ID: 6
Table Name: mysql_sink
Table ID: 6
Table Name: mysql_sink
Table column: LineageColumn{name='amount', title='amount'}

2. Flink lineage build result, edges:

[LineageRelation{id='1', srcTableId='4', tgtTableId='6', srcTableColName='amount', tgtTableColName='amount'}]

Table edge: LineageRelation{id='1', srcTableId='4', tgtTableId='6', srcTableColName='amount', tgtTableColName='amount'}
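
To make the shape of the result above easier to read, here is a minimal Python sketch of the same data. It is not the project's Java code: the dataclasses only mirror the field names visible in the printed output, and `describe_edges` is a hypothetical helper added for illustration. It resolves the table IDs in each edge back to table names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LineageColumn:
    name: str
    title: str


@dataclass
class LineageTable:
    id: str
    name: str
    columns: list = field(default_factory=list)


@dataclass(frozen=True)
class LineageRelation:
    id: str
    src_table_id: str
    tgt_table_id: str
    src_table_col_name: str
    tgt_table_col_name: str


def describe_edges(tables, relations):
    """Hypothetical helper: turn each edge into 'src_table.col -> tgt_table.col'."""
    names = {t.id: t.name for t in tables}
    return [
        f"{names[r.src_table_id]}.{r.src_table_col_name} -> "
        f"{names[r.tgt_table_id]}.{r.tgt_table_col_name}"
        for r in relations
    ]


# Values copied from the printed lineage result.
tables = [
    LineageTable("4", "data_gen", [LineageColumn("amount", "amount")]),
    LineageTable("6", "mysql_sink", [LineageColumn("amount", "amount")]),
]
relations = [LineageRelation("1", "4", "6", "amount", "amount")]

edges = describe_edges(tables, relations)
print(edges)  # ['data_gen.amount -> mysql_sink.amount']
```

The single edge corresponds to the `INSERT INTO mysql_sink SELECT amount ... FROM data_gen` statement: the source column `data_gen.amount` feeds the target column `mysql_sink.amount`.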

AllData Doris


image


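AllData: a newly customized one-stop, scenario-based big data middle platform

image

AllData Ambari: a big data middle platform with a newly customized Apache component stack

image

Big data component management: DOCKER FOR DATA PLATFORM

1. Configure the host entries (HOST)

image

2. Start the big data cluster

image

3. YARN is accessible

image

4. HIVE works normally

image

5. HDFS is accessible

image

6. ES health check

image

7. KIBANA UI access

image

8. PRESTO UI access

image

9. HBASE is accessible

image

10. FLINK RUNTIME WEB is accessible

image

Use the Docker/K8S cloud-native approach to start and stop the components

1. BUSINESS FOR ALL DATA PLATFORM: commercial projects

2. BUSINESS FOR ALL DATA PLATFORM: compute engine

3. DEVOPS FOR ALL DATA PLATFORM: operations engine

4. DATA GOVERN FOR ALL DATA PLATFORM: data governance engine

5. DATA Integrate FOR ALL DATA PLATFORM: data integration engine

6. AI FOR ALL DATA PLATFORM: artificial intelligence engine

7. DATA ODS FOR ALL DATA PLATFORM: data collection engine

8. OLAP FOR ALL DATA PLATFORM: OLAP query engine

9. OPTIMIZE FOR ALL DATA PLATFORM: performance optimization engine

10. DATABASES FOR ALL DATA PLATFORM: distributed storage engine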

Flink Table Store && Lake Storage POC

2.1 SQL: Flink Table Store POC

-- Checkpoint every 15 seconds so that written data is committed
SET 'execution.checkpointing.interval' = '15sec';

CREATE CATALOG alldata_catalog WITH (
    'type' = 'table-store',
    'warehouse' = 'file:/tmp/table_store'
);

USE CATALOG alldata_catalog;

-- Table Store table keyed by word
CREATE TABLE word_count (
    word STRING PRIMARY KEY NOT ENFORCED,
    cnt BIGINT
);

-- Temporary source that generates random one-character words
CREATE TEMPORARY TABLE word_table (
    word STRING
) WITH (
    'connector' = 'datagen',
    'fields.word.length' = '1'
);

INSERT INTO word_count SELECT word, COUNT(*) FROM word_table GROUP BY word;

-- POC Test OLAP QUERY
SET 'sql-client.execution.result-mode' = 'tableau';
RESET 'execution.checkpointing.interval';
SET 'execution.runtime-mode' = 'batch';
SELECT * FROM word_count;

-- POC Test Stream QUERY
-- interval is a reserved word in Flink SQL, so it is quoted with backticks
-- SET 'execution.runtime-mode' = 'streaming';
-- SELECT `interval`, COUNT(*) AS interval_cnt FROM
-- (SELECT cnt / 10000 AS `interval` FROM word_count) GROUP BY `interval`;
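
As a rough illustration of what the two queries in this POC compute, the following Python sketch reproduces their logic on an in-memory list. It is a simulation only and does not talk to Flink. The sample words and the function names `word_count` and `interval_counts` are assumptions made for this example, and it assumes that `cnt / 10000` on a `BIGINT` column behaves as integer division.

```python
from collections import Counter


def word_count(words):
    """Mirror of: SELECT word, COUNT(*) FROM word_table GROUP BY word.
    One row per word, which matches the primary key on word_count."""
    return dict(Counter(words))


def interval_counts(counts, width=10000):
    """Mirror of the commented streaming query: bucket each cnt by
    cnt // width, then count how many words fall into each bucket."""
    return dict(Counter(cnt // width for cnt in counts.values()))


# Hypothetical sample input standing in for the datagen source.
words = ["a"] * 25000 + ["b"] * 12000 + ["c"] * 300

counts = word_count(words)
buckets = interval_counts(counts)
print(counts)   # {'a': 25000, 'b': 12000, 'c': 300}
print(buckets)  # {2: 1, 1: 1, 0: 1}
```

In the real job the source is unbounded, so the counts keep growing and a streaming read of `word_count` sees each bucket assignment change over time, while the batch (OLAP) read returns a snapshot.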

2.2 Flink Runtime Web


image


2.3 Flink Batch


image


2.4 Flink Olap Read


image


2.5 Flink Stream Read


image


Dlink secondary development: added Flink 1.16.0 support

1. Configure the Flink Table Store dependencies in Dlink

image

2. Dlink starts and runs successfully

image

3. OLAP query

image

4. Streaming read with Flink 1.16.0 in Dlink

4.1 Stream Read 1

image

4.2 Stream Read 2

image


Architecture


image


image


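| Component | Description | Important Composition |
| --- | --- | --- |
| aiStudio | AI STUDIO FOR ALL DATA PLATFORM artificial intelligence engine | Artificial intelligence engine |
| aiStudioTasks | AI STUDIO TASKS FOR ALL DATA PLATFORM MLAPPS Engine | AI model tasks |
| assembly | WHOLE PACKAGE BUILD FOR ALL DATA PLATFORM assembly engine | Whole-package build engine |
| buried | BURIED FOR ALL DATA PLATFORM data acquisition engine | Event tracking solution |
| buriedShop | BURIED SHOP FOR ALL DATA PLATFORM commerce engine | Multi-channel mall |
| buriedTrade | BURIED TRADE FOR ALL DATA PLATFORM commerce engine | Commerce system |
| crawlerData | CRAWLER DATA TRADE FOR ALL DATA PLATFORM commerce engine | Crawler tasks |
| crawlerPlatform | CRAWLER PLATFORM FOR ALL DATA PLATFORM commerce engine | Crawler engine system |
| dataOlap | OLAP FOR ALL DATA PLATFORM OLAP query engine | Hybrid OLAP query engine |
| dataSync | DATA Integrate FOR ALL DATA PLATFORM Data Integration Engine | Data integration engine |
| dataSRE | DATA SRE FOR ALL DATA PLATFORM OLAP query engine | Intelligent big data operations engine |
| deploy | DEPLOY FOR ALL DATA PLATFORM OLAP query engine | Installation and deployment |
| documents | DOCUMENT FOR ALL DATA PLATFORM OLAP query engine | Official documentation |
| govern | DATA GOVERN FOR ALL DATA PLATFORM Data Governance Engine | Data governance engine |
| oneHub | ONE HUB FOR ALL DATA PLATFORM ONE HUB Engine | AllData headquarters front-end and back-end solution |
| oneLake | ONE LAKE FOR ALL DATA PLATFORM ONE LAKE engine | Data lake engine |
| studioSystem | STUDIO SYSTEM FOR ALL DATA PLATFORM DEVELOP IDE ENGINE | Big data stream and batch computing platform |
| studioTasks | STUDIO TASKS FOR ALL DATA PLATFORM Data Task Engine | Big data stream and batch computing tasks |
| docs | Document | Documentation |
| AllData | The AllData community project builds a one-stop big data platform through secondary development of big data ecosystem components, together with big data collection, storage, computing, and development | AllData community project: a one-stop open-source big data platform on GitHub |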

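AllData community business plan diagram

image

AllData community project business flow diagram

image

AllData community project tree diagram

image

Full-stack AllData product roadmap

image

AllData community project timeline

image

Real-time recommendation system business flow diagram

image

AllData headquarters front-end and back-end solution

Includes the AllData front-end and back-end solution and the front end and back end of the multi-tenant operations platform.

The AllData front-end and back-end solution is built on eladmin + tenant.

1. AllData front-end solution: oneHub/eladmin-web

2. AllData back-end solution: oneHub/eladmin

3. Multi-tenant operations platform front end: oneHub/tenant

4. Multi-tenant operations platform back end: oneHub/tenantBack

image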

Integration

Data Quality

image

View jobs through Livy

image

Flink 1.16 OGG-JSON parsing

image

Successful Hudi write to S3

image

1. Data platform

AllData is one of the few open-source big data platform projects on GitHub. It aims to become a solution to a range of problems in big data e-commerce scenarios, and a general-purpose big data foundation that other developers can use and contribute to. My original intention is to create a product that is useful to society.

2. Mall showcase

image

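3. Data sources

	Mall storefront:
		mall-shopping-app: mall app
		mall-shopping-app-service: mall app service
		mall-shopping-wc: mall mini program
		mall-shopping-mobile: mall mobile storefront
		mall-shopping-pc: mall PC site
		pcAdminService: mall PC site service
		mobileService: mall storefront service (used by both the mini program and the mobile storefront)
	Mall back office:
		mall-admin-web: mall back office
		pcAdminService: mall back office service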

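4. Data collection

log-collect-server:
Server-side log collection system.
log-collect-client:
Client SDK that each app can integrate; it collects app client data.
data-import-export:
Data integration (import and export) built on DataX.
data-spider:
Crawler platform that supports developing configurable tasks for crawling public web data.

image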


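5. Data storage

Distributed file system: hdfs
Distributed databases: hbase, mongodb, elasticsearch
Distributed in-memory store: redis

6. Data computing

compute-mr (offline computing): Hive, MR
compute-realtime (stream computing): storm, flink
multi-dimension-analysis (multidimensional analysis): kylin, spark

7. Data development

task-schedular: task scheduling
task-ops: task operations

image

8. Data products

data-face: data visualization
data-insight: user profile analysis

9. Data applications

system-recommender: recommendations
system-ad: advertising
system-search: search
system-anti-cheating: anti-cheating
system-report-analysis: report analysis
system-elk: ELK log system that provides a log search platform
system-apm: skywalking monitoring platform
system-deploy: packaging platform built on k8s, scala, playframework, and docker
job-schedule: job submission platform

image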


Installation

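10. Startup and configuration guide

10.1 Before starting, package the dubbo-servie project: enter the dubbo directory,

run mvn clean package -DskipTests=TRUE to build it, then run mvn install.

10.2 Start the dubbo project with the Tomcat port set to 8091

image

10.3 Start the subsystems of the mall project

Back office: http://localhost:8090

10.3.1 Front end: to start the mall-admin-web project, enter the project directory, run npm install, then run npm run dev.

10.3.2 Back end: start the pcAdminService/mall-admin-search project with the Tomcat port set to 8092, then start the pcManage project with the Tomcat port set to 8093.

image

Storefront: preview the mini program on a phone; the mobile site is at http://localhost:6255

10.3.3 Mini program and mobile site

10.3.3.1 Front end, mall mini program: to start the mall-shopping-wc project, install the WeChat developer tools, configure the developer key and secret, import the project with the WeChat developer tools, then click compile to preview it on a phone.

image

10.3.3.2 Front end, mall mobile site: to start mall-shopping-mobile, enter the project directory and run npm install and npm run dev.

10.3.3.3 Back end: the mini program and the mobile site share one back-end service. To start the mobileService project, enter the project directory and set the Tomcat port to 8094.

image

10.3.4 Mall PC site: http://localhost:8099

10.3.4.1 Front end: to start the mall-shopping-pc project, enter the project directory and run npm install and npm run dev.

10.3.4.2 Back end: start the pcAdminService project with the Tomcat port set to 8095.

image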
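
The back-end steps above assign a different Tomcat port to each service, so a small script can help confirm that nothing collides and that each service is actually listening. The following is a sketch under stated assumptions, not part of the project: the port numbers come from steps 10.2 through 10.3.4, while the dictionary labels and the helpers `find_port_conflicts` and `is_listening` are hypothetical names chosen for this example.

```python
import socket

# Tomcat ports from steps 10.2 to 10.3.4; the keys are descriptive labels only.
SERVICE_PORTS = {
    "dubbo": 8091,
    "pcAdminService/mall-admin-search": 8092,
    "pcManage": 8093,
    "mobileService": 8094,
    "pcAdminService": 8095,
}


def find_port_conflicts(ports):
    """Return {port: [services]} for every port claimed by more than one service."""
    by_port = {}
    for service, port in ports.items():
        by_port.setdefault(port, []).append(service)
    return {port: names for port, names in by_port.items() if len(names) > 1}


def is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


conflicts = find_port_conflicts(SERVICE_PORTS)
print("conflicts:", conflicts)

if __name__ == "__main__":
    # Report which back-end services are reachable on this machine.
    for name, port in SERVICE_PORTS.items():
        state = "up" if is_listening("localhost", port) else "down"
        print(f"{name:35s} {port}  {state}")
```

Running it after all services have been started should report every port as up; a port reported as down usually means the corresponding Tomcat instance was not started or was configured with a different port.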

11. DevOps

11.1 Containerized deployment: system-deploy

image

11.2 Automated operations platform: system-devops

image

11.3 Kong as the API gateway entry point: system-api-gateway

image

11.4 Log center: system-elk

image

11.5 Alerting platform: system-alarm-platform

11.6 Monitoring system

image

11.7 Data collection

image

11.8 Data display

image

11.9 Monitoring center: system-apm

image

11.10 Apollo as the configuration center: system-config

image

Community

12. The community is currently being reorganized; during this period only the WeChat group remains open.

Contact the author: https://docs.qq.com/doc/DVFVMYUp6cFhSRVJs

alldata's People

Contributors

1820586026, alldatafounder, ccckdi, vue-penghong, yg9538
