
uber-archive / athenax

1.2K stars, 79 watchers, 289 forks, 286 KB

SQL-based streaming analytics platform at scale

License: Apache License 2.0

Java 96.40% FreeMarker 0.94% Python 2.66%
calcite flink sql uber streaming stream analytics data

athenax's People

Contributors

haohui, innomentats, kination, suez1224, xueyumusic

athenax's Issues

Dockerize AthenaX

Hi,

Are there any plans to dockerize AthenaX and provide it as a container image, or is someone already working on this?

Current status of AthenaX

Hi, I'd like to try out AthenaX and I'm interested in its current status, since the last commit was almost a year ago. Is the project still maintained by Uber? And are there any future plans, such as supporting newer versions of Flink?

how to start?

I followed the doc, but I still don't know how to start AthenaX.

  1. Do I need YARN and HDFS?
  2. Is there a standalone mode, like Spark has?

I think more detailed documentation is necessary for an open source project.

Thank you!

Cannot compile AthenaX

I cannot compile AthenaX; the build fails while downloading calcite-core:

Downloading: https://repo.maven.apache.org/maven2/org/apache/calcite/calcite-core/1.16.0/calcite-core-1.16.0.jar
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] AthenaX ........................................... SUCCESS [0.192s]
[INFO] AthenaX Public APIs ............................... SUCCESS [34:56.549s]
[INFO] Common libraries for AthenaX connectors ........... SUCCESS [15:04.446s]
[INFO] AthenaX Compiler .................................. FAILURE [1:01:06.466s]
[INFO] AthenaX REST Backend .............................. SKIPPED
[INFO] AthenaX Kafka Connector ........................... SKIPPED
[INFO] Integration tests of AthenaX ...................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 1:51:07.791s
[INFO] Finished at: Thu Aug 30 16:50:59 CST 2018
[INFO] Final Memory: 31M/531M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-dependency-plugin:3.0.2:unpack (unpack-parser-template) on project athenax-vm-compiler: Unable to find/resolve artifact. Could not transfer artifact org.apache.calcite:calcite-core:jar:1.16.0 from/to central (https://repo.maven.apache.org/maven2): GET request of: org/apache/calcite/calcite-core/1.16.0/calcite-core-1.16.0.jar from central failed: Read timed out -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :athenax-vm-compiler

some issue in this project?

  1. Are some classes missing from the project on GitHub? The classes ExtendedJobDefinition, JobDefinition, JobDefinitionDesiredstate, and JobDefinitionResource are not in the package com.uber.athenax.backend.api.


  2. How do I submit SQL?


@mranney @libber @haohui @djKooks @myhrvold

Problem when using a window operation

The SQL query "select * from input.transaction" works correctly.

But the query below, "SELECT COUNT(1) AS total_record FROM input.transaction GROUP BY TUMBLE(proctime, INTERVAL '1' MINUTE)", fails. It seems that the TUMBLE function is not registered.

2017-12-20 14:29:51,862 [pool-1-thread-1] WARN jobs.WatchdogPolicyDefault (WatchdogPolicyDefault.java:onHealthCheckReport(62)) - Failed to instantiate the query 'SELECT COUNT(1) AS total_record FROM input.transaction GROUP BY TUMBLE(proctime,INTERVAL 1 MINUTE)' on foo
com.uber.athenax.vm.compiler.parser.impl.ParseException: Encountered "" at line 1, column 81.

at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.generateParseException(SqlParserImpl.java:23700)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.jj_consume_token(SqlParserImpl.java:23511)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.Arg(SqlParserImpl.java:760)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.FunctionParameterList(SqlParserImpl.java:696)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.NamedFunctionCall(SqlParserImpl.java:5200)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.AtomicRowExpression(SqlParserImpl.java:3219)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.Expression3(SqlParserImpl.java:3070)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.Expression2b(SqlParserImpl.java:2881)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.Expression2(SqlParserImpl.java:2905)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.Expression(SqlParserImpl.java:2860)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.GroupingElement(SqlParserImpl.java:2220)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.GroupingElementList(SqlParserImpl.java:2172)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.GroupByOpt(SqlParserImpl.java:2162)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.SqlSelect(SqlParserImpl.java:962)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.LeafQuery(SqlParserImpl.java:571)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.LeafQueryOrExpr(SqlParserImpl.java:2846)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.QueryOrExpr(SqlParserImpl.java:2768)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.OrderedQueryOrExpr(SqlParserImpl.java:490)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.SqlStmt(SqlParserImpl.java:793)
at com.uber.athenax.vm.compiler.parser.impl.SqlParserImpl.SqlStmtsEof(SqlParserImpl.java:902)
at com.uber.athenax.vm.compiler.planner.Planner.parse(Planner.java:102)
at com.uber.athenax.vm.compiler.planner.Planner.sql(Planner.java:57)
at com.uber.athenax.backend.server.jobs.JobManager.compile(JobManager.java:81)
at com.uber.athenax.backend.server.jobs.WatchdogPolicyDefault.onHealthCheckReport(WatchdogPolicyDefault.java:59)
at com.uber.athenax.backend.server.jobs.JobManager.onUpdatedInstances(JobManager.java:89)
at com.uber.athenax.backend.server.yarn.InstanceManager.scanAll(InstanceManager.java:219)
at com.uber.athenax.backend.server.yarn.InstanceManager$1.run(InstanceManager.java:71)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)

@haohui
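
For reference, here is a minimal, self-contained sketch (the class name is ours, not part of AthenaX) that checks whether the windowed query above at least parses with the stock Calcite parser that the build pulls in. The identifiers are quoted to avoid clashes with reserved words under the default parser configuration, and the interval literal is written in the quoted form, INTERVAL '1' MINUTE; the query echoed in the log above appears to have lost those quotes, and a Calcite-style parser generally rejects the unquoted form.

import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.sql.parser.SqlParseException;
import org.apache.calcite.sql.parser.SqlParser;

// Hypothetical helper, not part of AthenaX: it only checks that the windowed
// query is syntactically valid for a stock Calcite parser.
public class TumbleParseCheck {
  public static void main(String[] args) {
    String sql = "SELECT COUNT(1) AS total_record "
        + "FROM \"input\".\"transaction\" "
        + "GROUP BY TUMBLE(\"proctime\", INTERVAL '1' MINUTE)";
    try {
      // SqlParser.create(...) uses Calcite's default parser configuration.
      SqlNode node = SqlParser.create(sql).parseQuery();
      System.out.println("Parsed OK: " + node);
    } catch (SqlParseException e) {
      System.out.println("Parse failed: " + e.getMessage());
    }
  }
}

Running it only requires calcite-core (and its dependencies) on the classpath.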

Help in getting started

Hi,

AthenaX looks like a perfect fit for our use case: we are building a platform where users can submit jobs, aggregate data, and write to sinks. Unfortunately the docs are not helping. Could you share sample documentation that explains how to read from Kafka and how to sink to Elasticsearch? Some of the docs are out of date and refer to class files that cannot be found.

Some questions about testing

Hello.
I have some questions after following the quick start document.

  • It seems YARN needs to be installed for this to work. If that is right, how about adding it to the requirements in the docs?
  • I'm not sure how localize.resources and additional.jars work. Could you explain them in more detail, or add a sample?
  • Is it okay to use a local file path for the jar if it is stored on the local machine instead of an HDFS path?
  • I couldn't figure out the role of catalog.impl.

Sorry for all the questions. It would help if you could offer a clear sample for this.

Thanks.

Where is flink.jar?

Hello, thanks for your work; AthenaX is very useful for us. I am trying to use AthenaX, but I have encountered some problems:

  1. How do I get flink.jar? When I run 'mvn clean install' in the Flink directory, I only find the build-target directory and cannot find flink.jar.
  2. How does AthenaX read flink.jar if we set flink.uber.jar.location: hdfs:///app/athenax/flink.jar?

Thanks for your reply.

Some classes not found

The following classes are not found in the project:
com.uber.athenax.backend.api.JobDefinition
com.uber.athenax.backend.api.JobsApiService
com.uber.athenax.backend.api.NotFoundException

StartJobITest ERROR

On Windows 10, running StartJobITest from IDEA fails with the following message:
Caused by: ExitCodeException exitCode=1: CreateSymbolicLink error (1314)

at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.yarn.server.MiniYARNCluster.<init>(MiniYARNCluster.java:177)
... 25 more
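
For what it's worth, CreateSymbolicLink error (1314) is the Windows ERROR_PRIVILEGE_NOT_HELD code: the account running the test is not allowed to create symbolic links, which MiniYARNCluster needs. A minimal JDK-only check (the class name is ours, not part of AthenaX) that reproduces the condition outside the test:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: verifies whether the current account may create
// symbolic links (the privilege behind Windows error 1314).
public class SymlinkCheck {
  public static void main(String[] args) throws IOException {
    Path target = Files.createTempFile("athenax-symlink-target", ".tmp");
    Path link = target.resolveSibling("athenax-symlink-test");
    try {
      Files.createSymbolicLink(link, target);
      System.out.println("Symbolic links work; the 1314 error likely has another cause.");
    } catch (IOException e) {
      // On Windows this typically reads "A required privilege is not held by the client".
      System.out.println("Cannot create symbolic links with this account: " + e.getMessage());
    } finally {
      Files.deleteIfExists(link);
      Files.deleteIfExists(target);
    }
  }
}

Running the IDE (or the Maven build) from an elevated prompt, or enabling Windows Developer Mode, is the usual way around this.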

Invalid location of the db--LevelDB

Exception in thread "main" java.lang.NullPointerException: Invalid location of the db
at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:75)
at com.uber.athenax.backend.server.jobs.LevelDBJobStore.open(LevelDBJobStore.java:54)
at com.uber.athenax.backend.server.ServerContext.start(ServerContext.java:63)
at com.uber.athenax.backend.AthenaXServer.start(AthenaXServer.java:43)
at com.uber.athenax.backend.AthenaXServer.main(AthenaXServer.java:67)

How to start?

The documentation is extremely scarce.
I tried to build the environment via http://athenax.readthedocs.io/en/latest/getting_started,
but I can't find com.foo.MyCatalogProvider; this class does not exist, and there is no detailed example I can refer to.
I think detailed documentation is required for open source projects.
Thank you.

Supporting k8s

Is there any plan to migrate the project to Kubernetes, since Flink on Kubernetes is natively supported?

Submitting to YARN fails

Hadoop version: 3.0.0-cdh6.2.0
Error: Could not find or load main class org.apache.flink.yarn.YarnApplicationMasterRunner

Windows 10: running StartJobITest from IDEA fails

When I run StartJobITest.java, it enters an infinite loop.
In debug mode, it happens in yarnCluster.init(yarnConf): the YARN cluster cannot change state from "NOTINITED" to "INITED"!
Can anyone help me?

compilation error.

Can't find the classes ExtendedJobDefinition, JobDefinition, JobDefinitionDesiredstate, etc.

Need help for configuring AthenaX

Hi Folks,

I raised my first question on the athenax-users Google Group; this question is an extended version of that one.

I have set up my project as follows. This is my .yaml configuration file:

athenax.master.uri: http://master152:8083
catalog.impl: com.pearson.athenax.catalog.impl.MydashboardCatalogProvider
clusters:
  1234567:
    yarn.site.location: hdfs://master152:8020/athenax/yarn-site.xml
    athenax.home.dir: hdfs://master152:8020/athenax
    flink.uber.jar.location: hdfs://master152:8020/athenax/flink-dist_2.11-1.7-SNAPSHOT.jar
    localize.resources:
      - hdfs://master152:8020/athenax/log4j.properties
    additional.jars:
      - hdfs://master152:8020/athenax/connectors.jar
      - hdfs://master152:8020/athenax/foo.jar
extras:
  jobstore.leveldb.file: /home/cyclone/randika/athenax/db

After that I started the AthenaX server with no errors. Then I submitted a new job through an API call using Postman; here it is:

http://master152:8083/mydashboard/jobs/new-jobs

Here I should mention that I changed the BASE_PATH from

/ws/v1 to /mydashboard,

and, most importantly, I changed the implementation of the /jobs/new-jobs API call as follows:

@Override
public Response allocateNewJob(SecurityContext securityContext) throws NotFoundException {
  // Original implementation:
  // return Response.ok().entity(
  //     Collections.singletonMap("job-uuid", ctx.jobManager().newJobUUID())
  // ).build();
  LOG.info("Allocate new job with default values... START...");
  JobsApi api = new JobsApi();
  UUID uuid = ctx.jobManager().newJobUUID();
  try {
    JobDefinitionDesiredstate state = new JobDefinitionDesiredstate()
        .clusterId("1234567")
        .resource(new JobDefinitionResource().vCores(1L).memory(2048L));
    JobDefinition job = new JobDefinition()
        .query("SELECT * FROM input.foo")
        .addDesiredStateItem(state);
    api.updateJob(UUID.fromString(uuid.toString()), job);
    LOG.info("Allocate new job with default values... END...");
    return Response.ok().entity(
        Collections.singletonMap("job-uuid", uuid)
    ).build();
  } catch (ApiException e) {
    LOG.error("Exception occurred while allocating the new job", e);
    e.printStackTrace();
  }
  return null;
}

After calling that API via Postman, I think the job is successfully stored under the jobstore.leveldb.file path mentioned in the configuration file, and it returns a job id:

{ "job-uuid": "33df9d53-420e-45cf-bd4c-669f8fec66d1" }

My first question: earlier (with the configuration I set up the very first time), after submitting the job, AthenaX would try to run that job on top of YARN (as per my understanding, which could be totally wrong) and fail (I described how it failed in my first question, linked above). But now that no longer happens. What I figured out is that earlier it called the scanAll() method and failed while executing it, but now I think the scanAll() method is not executed at all.

My second question is that I need to understand the real purpose of AthenaX and how to use it properly. Could you give a brief explanation of the AthenaX platform? It would be highly appreciated.
For example, I originally thought that with the AthenaX platform we could submit our own custom Flink streaming jobs, create dynamic queries, get results from them, and then use those results for predictive analytics.

That was the basic idea I had about AthenaX.

One more thing about my configuration file: it lists foo.jar and connectors.jar under the additional.jars: section. These two jars are just empty jars with nothing inside; I added them because the AthenaX getting-started doc mentions they are required.

Thanks

Failed to build the project

Following the doc:
$ git clone https://github.com/uber/AthenaX.git
$ mvn clean install

Environment: Mac Pro, Java 8, Maven 3.5

Should I skip the Maven tests? The build seems to fail when running the YARN container test.

2019-06-23 23:49:03,242 [ContainersLauncher #0] WARN nodemanager.DefaultContainerExecutor (DefaultContainerExecutor.java:launchContainer(230)) - Exception from container-launch with container ID: container_1561304918987_0001_01_000001 and exit code: 127
ExitCodeException exitCode=127:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)
at org.apache.hadoop.util.Shell.run(Shell.java:482)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:776)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-06-23 23:49:03,244 [ContainersLauncher #0] INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(286)) - Exception from container-launch.
2019-06-23 23:49:03,244 [ContainersLauncher #0] INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(286)) - Container id: container_1561304918987_0001_01_000001
2019-06-23 23:49:03,244 [ContainersLauncher #0] INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(286)) - Exit code: 127
2019-06-23 23:49:03,244 [ContainersLauncher #0] INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(286)) - Stack trace: ExitCodeException exitCode=127:
2019-06-23 23:49:03,244 [ContainersLauncher #0] INFO nodemanager.ContainerExecutor (ContainerExecutor.java:logOutput(286)) - at org.apache.hadoop.util.Shell.runCommand(Shell.java:585)

How to fill in the config file AthenaX.yaml

  1. What should catalog.impl be filled in with? Do we need to develop that class ourselves?
  2. For additional.jars: (e.g. hdfs://namespace/athenax/connectors.jar), if we need to use the Kafka connector, should we fill in athenax-vm-connector-kafka-0.2-SNAPSHOT.jar?

Cannot find com.uber.athenax.backend.api?

I cloned the project, but cannot find the package com.uber.athenax.backend.api, which provides classes such as:
import com.uber.athenax.backend.api.JobDefinition;
import com.uber.athenax.backend.api.JobsApiService;
import com.uber.athenax.backend.api.NotFoundException;
