ebay / griffin
Model driven data quality service
Home Page: https://ebay.github.io/griffin/
License: Other
We are in Apache Incubator now, please visit Apache Griffin for the latest updates.
When I use yarn-cluster mode to submit Spark jobs, the error log is as below:
Application application_1512628890181_92576 failed 2 times due to AM Container for appattempt_1512628890181_92576_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://xy180-wecloud-198:8088/proxy/application_1512628890181_92576/Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://wecloud-cluster/user/pgxl/.sparkStaging/application_1512628890181_92576/com.databricks_spark-avro_2.10-2.0.1.jar
java.io.FileNotFoundException: File does not exist: hdfs://wecloud-cluster/user/pgxl/.sparkStaging/application_1512628890181_92576/com.databricks_spark-avro_2.10-2.0.1.jar
I don't know why the parameter in the config file doesn't work. When I change to yarn-client mode, the error log is as below:
Warning: Skip remote jar hdfs://wecloud-cluster/project/pgxl/griffin/griffin-measure.jar.
Warning: Skip remote jar hdfs://wecloud-cluster/project/pgxl/griffin/datanucleus-api-jdo-3.2.6.jar.
Warning: Skip remote jar hdfs://wecloud-cluster/project/pgxl/griffin/datanucleus-core-3.2.10.jar.
Warning: Skip remote jar hdfs://wecloud-cluster/project/pgxl/griffin/datanucleus-rdbms-3.2.9.jar.
java.lang.ClassNotFoundException: org.apache.griffin.measure.Application
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
It seems the parameter sparkJob.spark.jars.packages works, but in yarn-client mode I can't use jars on HDFS. I can't find the source code that processes the Spark jars config.
Can you give me some suggestions? Thank you very much.
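For what it's worth, this matches how spark-submit resolves the application jar: in yarn-cluster mode the driver runs inside the AM, so an hdfs:// jar can be staged, while in yarn-client mode the submitting JVM has to load org.apache.griffin.measure.Application from a local jar. A hedged sketch of a yarn-client invocation, with the jar pulled out of HDFS first (the local paths and the env/config arguments are illustrative):

```
# yarn-client: the main class is loaded on the submitting host,
# so fetch the jar to the local filesystem first
hdfs dfs -get hdfs://wecloud-cluster/project/pgxl/griffin/griffin-measure.jar /opt/griffin/
spark-submit \
  --master yarn --deploy-mode client \
  --class org.apache.griffin.measure.Application \
  --packages com.databricks:spark-avro_2.10:2.0.1 \
  /opt/griffin/griffin-measure.jar env.json config.json
```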
Hello, I have two questions; I'd like your help studying this excellent framework.
First, the user guide describes asset types including hdfsfile and hivetable, but there is only hivetable when I deploy the ROOT.war to Tomcat.
Second, in my opinion, selecting the hivetable asset type should provide a way to connect to our Hive table, but the web page only lets us enter an HDFS path. How can I create a usable data asset?
Thank you very much!
Avoid checking in the external dependency griffin-ui/bower_components in griffin's git repository; it may cause license issues and also makes the first code clone too large and slow. The suggested approach is to ignore the directory in .gitignore and create/download it automatically during bower install, as sketched below.
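A minimal sketch of that approach:

```
# griffin-ui/.gitignore — keep the fetched dependencies out of git
bower_components/
```

and recreate the directory at build time instead:

```
cd griffin-ui
bower install   # reads bower.json and downloads bower_components/
```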
In the dq_job table of unitdb0, the endtime column should be of Long type, but it is set as Integer (epoch-millisecond timestamps overflow a 32-bit integer).
When I run it in Docker it works, but is there a way to connect it with my local Hive database? When I try running Griffin locally, I can use my Hive database, but the job can't be submitted: Caused by: org.springframework.web.client.HttpClientErrorException: 400 Missing Required Header for CSRF protection. So I turned the CSRF protection value from true to FALSE; the job could then be submitted and its state was running, but after that the state became unknown, and I got a WARN in my terminal: BA, emp_testm-BA-test-1507603777000 job instance has some null field (state or appId). java.lang.NullPointerException. Could anybody help me with this?
THANKS!
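Rather than flipping CSRF protection to false, the 400 can usually be avoided by sending the header the service's CSRF filter expects. A hedged sketch (the endpoint path and header name here are hypothetical; check the filter configuration in the service module):

```
curl -X POST http://localhost:8080/jobs \
     -H "Content-Type: application/json" \
     -H "X-CSRF-TOKEN: ${TOKEN}" \
     -d @job.json
```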
Error messages when starting the Docker container:
2016-12-21T05:45:48.180 ERROR [org.springframework.scheduling.support.TaskUtils$LoggingErrorHandler] - Unexpected error occurred in scheduled task.
com.mongodb.MongoException$Network: Read operation to server localhost/127.0.0.1:27017 failed on database unitdb0
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:216)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:288)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:273)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)
at com.ebay.oss.griffin.repo.BaseRepo.getAll(BaseRepo.java:66)
at com.ebay.oss.griffin.service.DqScheduleServiceImpl.createJobToRunBySchedule(DqScheduleServiceImpl.java:122)
at com.ebay.oss.griffin.service.DqScheduleServiceImpl.schedulingJobs(DqScheduleServiceImpl.java:110)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:64)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:53)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at com.mongodb.DBPort._open(DBPort.java:223)
at com.mongodb.DBPort.go(DBPort.java:125)
at com.mongodb.DBPort.call(DBPort.java:92)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:244)
... 22 more
I can't delete /var/lib/mongo/mongo.lock
bash-4.1# mongod -f /etc/mongod.conf
about to fork child process, waiting until server is ready for connections.
forked process: 3608
ERROR: child process failed, exited with error number 100
Environment:
docker 1.11
centos 7.2
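Exit code 100 is mongod's unclean-shutdown error, so the lock file is stale. The usual recovery, sketched with the paths from this report (run as a user with write access to /var/lib/mongo):

```
rm -f /var/lib/mongo/mongo.lock
mongod --repair -f /etc/mongod.conf   # repair the data files first
mongod -f /etc/mongod.conf            # then start normally
```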
After a new model is created, the statistics and metrics on the right side of the page are not changed. These data should be updated.
Hello, I followed the user guide. First I created an asset, and then I created a model. But the model was always in the initial status and has never been executed. Can you help me solve this problem? Thank you very much.
Exception in thread "main" javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true, username = hive. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:436)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:788)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:333)
at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:202)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at javax.jdo.JDOHelper$16.run(JDOHelper.java:1965)
at java.security.AccessController.doPrivileged(Native Method)
at javax.jdo.JDOHelper.invoke(JDOHelper.java:1960)
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1166)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:365)
at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:394)
at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:291)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:258)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:57)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:66)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:593)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:571)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:624)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5756)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5751)
at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:5984)
at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:5909)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
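The JDBC URL and username on the first line of this trace come straight from the metastore configuration, so the first things to verify are that MySQL is actually listening on localhost:3306 and that hive-site.xml carries working credentials. The relevant properties (values copied from the error message):

```xml
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value><!-- metastore password --></value>
</property>
```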
When posting to Livy (post2Livy), the griffin service gets the exception below:
[ERROR 5251 --- [ryBean_Worker-5] o.a.g.c.j.SparkSubmitJob : Post to livy error. 500 Internal Server Erro]
and Livy gets the exception below:
18/06/02 11:18:24 ERROR server.SessionServlet$: internal error
java.lang.NullPointerException
at org.apache.livy.server.batch.CreateBatchRequest.toString(CreateBatchRequest.scala:54)
at java.lang.String.valueOf(String.java:2994)
at java.lang.StringBuilder.append(StringBuilder.java:131)
at scala.StringContext.standardInterpolator(StringContext.scala:125)
at scala.StringContext.s(StringContext.scala:95)
at org.apache.livy.server.batch.BatchSession$$anonfun$create$1.apply(BatchSession.scala:111)
at org.apache.livy.server.batch.BatchSession$$anonfun$create$1.apply(BatchSession.scala:111)
at org.apache.livy.Logging$class.info(Logging.scala:39)
at org.apache.livy.server.batch.BatchSession$.info(BatchSession.scala:45)
at org.apache.livy.server.batch.BatchSession$.create(BatchSession.scala:111)
at org.apache.livy.server.batch.BatchSessionServlet.createSession(BatchSessionServlet.scala:46)
at org.apache.livy.server.batch.BatchSessionServlet.createSession(BatchSessionServlet.scala:35)
at org.apache.livy.server.SessionServlet$$anonfun$16.apply(SessionServlet.scala:132)
at org.scalatra.ScalatraBase$class.org$scalatra$ScalatraBase$$liftAction(ScalatraBase.scala:270)
at org.scalatra.ScalatraBase$$anonfun$invoke$1.apply(ScalatraBase.scala:265)
at org.scalatra.ScalatraBase$$anonfun$invoke$1.apply(ScalatraBase.scala:265)
at org.scalatra.ApiFormats$class.withRouteMultiParams(ApiFormats.scala:178)
at org.apache.livy.server.JsonServlet.withRouteMultiParams(JsonServlet.scala:39)
at org.scalatra.ScalatraBase$class.invoke(ScalatraBase.scala:264)
at org.scalatra.ScalatraServlet.invoke(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$$anonfun$runRoutes$1$$anonfun$apply$8.apply(ScalatraBase.scala:240)
at org.scalatra.ScalatraBase$$anonfun$runRoutes$1$$anonfun$apply$8.apply(ScalatraBase.scala:238)
at scala.Option.flatMap(Option.scala:171)
at org.scalatra.ScalatraBase$$anonfun$runRoutes$1.apply(ScalatraBase.scala:238)
at org.scalatra.ScalatraBase$$anonfun$runRoutes$1.apply(ScalatraBase.scala:237)
at scala.collection.immutable.Stream.flatMap(Stream.scala:489)
at org.scalatra.ScalatraBase$class.runRoutes(ScalatraBase.scala:237)
at org.scalatra.ScalatraServlet.runRoutes(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$class.runActions$1(ScalatraBase.scala:163)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply$mcV$sp(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$$anonfun$executeRoutes$1.apply(ScalatraBase.scala:175)
at org.scalatra.ScalatraBase$class.org$scalatra$ScalatraBase$$cradleHalt(ScalatraBase.scala:193)
at org.scalatra.ScalatraBase$class.executeRoutes(ScalatraBase.scala:175)
at org.scalatra.ScalatraServlet.executeRoutes(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply$mcV$sp(ScalatraBase.scala:113)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113)
at org.scalatra.ScalatraBase$$anonfun$handle$1.apply(ScalatraBase.scala:113)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.scalatra.DynamicScope$class.withResponse(DynamicScope.scala:80)
at org.scalatra.ScalatraServlet.withResponse(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$$anonfun$withRequestResponse$1.apply(DynamicScope.scala:60)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.scalatra.DynamicScope$class.withRequest(DynamicScope.scala:71)
at org.scalatra.ScalatraServlet.withRequest(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$class.withRequestResponse(DynamicScope.scala:59)
at org.scalatra.ScalatraServlet.withRequestResponse(ScalatraServlet.scala:49)
at org.scalatra.ScalatraBase$class.handle(ScalatraBase.scala:111)
at org.scalatra.ScalatraServlet.org$scalatra$servlet$ServletBase$$super$handle(ScalatraServlet.scala:49)
at org.scalatra.servlet.ServletBase$class.handle(ServletBase.scala:43)
at org.apache.livy.server.SessionServlet.org$scalatra$MethodOverride$$super$handle(SessionServlet.scala:41)
at org.scalatra.MethodOverride$class.handle(MethodOverride.scala:28)
at org.apache.livy.server.SessionServlet.org$scalatra$GZipSupport$$super$handle(SessionServlet.scala:41)
at org.scalatra.GZipSupport$$anonfun$handle$1.apply$mcV$sp(GZipSupport.scala:36)
at org.scalatra.GZipSupport$$anonfun$handle$1.apply(GZipSupport.scala:19)
at org.scalatra.GZipSupport$$anonfun$handle$1.apply(GZipSupport.scala:19)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.scalatra.DynamicScope$class.withResponse(DynamicScope.scala:80)
at org.scalatra.ScalatraServlet.withResponse(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$$anonfun$withRequestResponse$1.apply(DynamicScope.scala:60)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
at org.scalatra.DynamicScope$class.withRequest(DynamicScope.scala:71)
at org.scalatra.ScalatraServlet.withRequest(ScalatraServlet.scala:49)
at org.scalatra.DynamicScope$class.withRequestResponse(DynamicScope.scala:59)
at org.scalatra.ScalatraServlet.withRequestResponse(ScalatraServlet.scala:49)
at org.scalatra.GZipSupport$class.handle(GZipSupport.scala:18)
at org.apache.livy.server.SessionServlet.handle(SessionServlet.scala:41)
at org.scalatra.ScalatraServlet.service(ScalatraServlet.scala:54)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:812)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:499)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:748)
For some models with Anomaly Detection, the big chart of the metrics cannot be opened. The error in the inspect console shows "TypeError: Cannot read property 'lower' of null".
The reason is that some 'bolling' values in the metrics details are null.
I have deployed griffin and run it locally, but when I run java -jar service/target/service.jar, it says port 8080 is in use. How can I change the port to another one?
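The service is a Spring Boot application, so the port can be overridden on the command line or in application.properties; a sketch (8090 is an arbitrary choice):

```
# one-off override on the command line
java -jar service/target/service.jar --server.port=8090

# or persistently, in the service module's application.properties:
#   server.port=8090
```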
Hello,
I was interested in your project, but I found a strange error.
I followed the readme on https://github.com/eBay/griffin but the spark-submit command failed with this message:
17/04/07 20:10:27 INFO ParseDriver: Parsing command: SELECT * FROM griffin.users_info_target where start = 20170406
17/04/07 20:10:28 INFO ParseDriver: Parse Completed
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'start' given input columns: [first_name, phone, email, post_code, user_id, last_name, address]; line 1 pos 46
at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:60)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:321)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:332)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionUp$1(QueryPlan.scala:108)
at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$2(QueryPlan.scala:118)
at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$2.apply(QueryPlan.scala:127)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
at scala.collection.AbstractIterator.to(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsUp(QueryPlan.scala:127)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:57)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:50)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:121)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$foreachUp$1.apply(TreeNode.scala:120)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:120)
at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.checkAnalysis(CheckAnalysis.scala:50)
at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:44)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:34)
at org.apache.spark.sql.DataFrame.(DataFrame.scala:133)
at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:52)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:817)
at org.apache.griffin.dataLoaderUtils.HiveDataLoader.getValiDataFrame(HiveDataLoader.scala:17)
at org.apache.griffin.validility.Vali$.main(Vali.scala:48)
at org.apache.griffin.validility.Vali.main(Vali.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Compared with the tutorial, I created a griffin database in Hive with two tables (users_info_src and users_info_target), and the name of the data asset is griffin.users_info_src (if the name of the data asset is just users_info_src, I get the exception: Exception in thread "main" org.apache.spark.sql.AnalysisException: Table not found: users_info_src; line 1 pos 14).
Thanks a lot
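The AnalysisException lists the table's actual columns, and start is not among them; the tutorial's generated query filters on start as a partition column, so the tables presumably need to be created with that partition. A hedged DDL sketch (column types are guesses from the column names):

```sql
CREATE TABLE griffin.users_info_src (
  user_id    BIGINT,
  first_name STRING,
  last_name  STRING,
  address    STRING,
  email      STRING,
  phone      STRING,
  post_code  STRING
)
PARTITIONED BY (`start` STRING);
```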
The total count model result should be an integer, but it is a percentage now.
Using incubator-griffin-master to submit a job, Livy gets the exception [
stdout: 18/06/03 03:17:02 ERROR measure.Application$: java.net.URISyntaxException: Relative path in absolute URI: {
stdout: "measure.type" :%20%22griffin%22,%0A%20%20%22id%22%20:%201,%0A%20%20%22name%22%20:%20%22hive-test-01%22,%0A%20%20%22owner%22%20:%20%22test%22,%0A%20%20%22description%22%20:%20%22hive_studentno_test01%22,%0A%20%20%22deleted%22%20:%20false,%0A%20%20%22timestamp%22%20:%201527880618561,%0A%20%20%22dq.type%22%20:%20%22profiling%22,%0A%20%20%22process.type%22%20:%20%22batch%22,%0A%20%20%22rule.description%22%20:%20%7B%0A%20%20%20%20%22details%22%20:%20%5B%20%7B%0A%20%20%20%20%20%20%22name%22%20:%20%22stuno%22,%0A%20%20%20%20%20%20%22infos%22%20:%20%22Total%20Count%22%0A%20%20%20%20%7D%20%5D%0A%20%20%7D,%0A%20%20%22data.sources%22%20:%20%5B%20%7B%0A%20%20%20%20%22id%22%20:%204,%0A%20%20%20%20%22name%22%20:%20%22source%22,%0A%20%20%20%20%22connectors%22%20:%20%5B%20%7B%0A%20%20%20%20%20%20%22id%22%20:%205,%0A%20%20%20%20%20%20%22name%22%20:%20%22source1527922226576%22,%0A%20%20%20%20%20%20%22type%22%20:%20%22hive%22,%0A%20%20%20%20%20%20%22version%22%20:%20%221.2%22,%0A%20%20%20%20%20%20%22predicates%22%20:%20%5B%20%5D,%0A%20%20%20%20%20%20%22data.unit%22%20:%20%221day%22,%0A%20%20%20%20%20%20%22data.time.zone%22%20:%20%22%22,%0A%20%20%20%20%20%20%22config%22%20:%20%7B%0A%20%20%20%20%20%20%20%20%22database%22%20:%20%22default%22,%0A%20%20%20%20%20%20%20%20%22table.name%22%20:%20%22studentno%22%0A%20%20%20%20%20%20%7D%0A%20%20%20%20%7D%20%5D%0A%20%20%7D%20%5D,%0A%20%20%22evaluate.rule%22%20:%20%7B%0A%20%20%20%20%22id%22%20:%202,%0A%20%20%20%20%22rules%22%20:%20%5B%20%7B%0A%20%20%20%20%20%20%22id%22%20:%203,%0A%20%20%20%20%20%20%22rule%22%20:%20%22count(source.%5C%60stuno%5C%60)%20AS%20%5C%60stuno-count%5C%60%22,%0A%20%20%20%20%20%20%22name%22%20:%20%22profiling%22,%0A%20%20%20%20%20%20%22dsl.type%22%20:%20%22griffin-dsl%22,%0A%20%20%20%20%20%20%22dq.type%22%20:%20%22profiling%22%0A%20%20%20%20%7D%20%5D%0A%20%20%7D,%0A%20%20%22measure.type%22%20:%20%22griffin%22%0A%7D
spark-submit exited with code 254
spark-submit exited with code 254
]
Following the Docker installation instructions, I hit 'Cannot create container with more than 127 parents' when I built griffin-env. I retried it on Mac and on CentOS 6.7. Is there any solution?
In the metrics heatmap, when the mouse is over a new validity model such as a total count model, the "dq" value in the floating pad should be a count, but it has a "%" sign at the end.
On the web UI, the "Create Date" column on the Data Assets and Models pages does not match the backend date time; it should be the same as the server end.
For the metrics dq(k) charts on both the metrics page and the right side, the value in the popup text should be divided by 1K.
I'm following the document, and there's nothing wrong with it (https://github.com/apache/incubator-griffin/blob/master/griffin-doc/userguide.md).
Some numbers in metrics are too long, for example, 12,809,156K.
We can shorten it to 12,809M.
The previous steps can be executed, but the final step shows the graph failure.
Suggest adding a checkstyle.xml [1], according to CONTRIBUTING.md#java-guidelines [2], from day one of open sourcing.
[1] https://github.com/checkstyle/checkstyle/blob/master/src/main/resources/google_checks.xml
[2] https://github.com/eBay/griffin/blob/master/CONTRIBUTING.md#java-guidelines
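A sketch of wiring the Google config [1] into the Maven build (the plugin version is illustrative):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <version>2.17</version>
  <configuration>
    <configLocation>google_checks.xml</configLocation>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```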
Hi,
Is there a way to compare data from different sources using this tool? In my case, the comparison is between a MySQL database and a Hive table.
Thanks,
Navin
Hi,
I followed the guide to deploy the Docker images, and I have both containers (griffin and es) running. Then I entered the griffin container in interactive mode and executed
java -jar service/service.jar
But I get the following:
APPLICATION FAILED TO START
Description:
The Tomcat connector configured to listen on port 8080 failed to start. The port may already be in use or the connector may be misconfigured.
So I changed the port as follows:
java -jar service/service.jar --server.port=8181
Then I get the following:
2017-11-10 16:12:18.535 INFO 4845 --- [ main] oConfiguration$WelcomePageHandlerMapping : Adding welcome page: class path resource [public/index.html]
2017-11-10 16:12:19.013 INFO 4845 --- [ main] o.s.j.e.a.AnnotationMBeanExporter : Registering beans for JMX exposure on startup
2017-11-10 16:12:19.026 INFO 4845 --- [ main] o.s.c.support.DefaultLifecycleProcessor : Starting beans in phase 2147483647
2017-11-10 16:12:19.026 INFO 4845 --- [ main] o.s.s.quartz.SchedulerFactoryBean : Starting Quartz Scheduler now
2017-11-10 16:12:19.076 INFO 4845 --- [ main] org.quartz.core.QuartzScheduler : Scheduler schedulerFactoryBean_$_griffin1510330337340 started.
2017-11-10 16:12:19.087 INFO 4845 --- [ main] s.a.ScheduledAnnotationBeanPostProcessor : No TaskScheduler/ScheduledExecutorService bean found for scheduled processing
2017-11-10 16:12:19.147 INFO 4845 --- [pool-3-thread-1] o.a.griffin.core.common.CacheEvictor : Evict hive cache
2017-11-10 16:12:19.165 INFO 4845 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 8181 (http)
2017-11-10 16:12:19.171 INFO 4845 --- [ main] o.a.griffin.core.GriffinWebApplication : Started GriffinWebApplication in 8.062 seconds (JVM running for 8.683)
2017-11-10 16:12:19.261 INFO 4845 --- [pool-3-thread-1] o.a.griffin.core.common.CacheEvictor : After evict hive cache,automatically refresh hive tables cache.
Hibernate: select distinct jobinstanc0_.group_name as col_0_0_, jobinstanc0_.job_name as col_1_0_ from job_instance jobinstanc0_
Hibernate: select distinct jobinstanc0_.group_name as col_0_0_, jobinstanc0_.job_name as col_1_0_ from job_instance jobinstanc0_
and I cannot access the :8181 address.
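If the service starts cleanly on 8181 inside the container but the address is unreachable from the host, the container port is probably not published. A sketch, assuming a plain docker run rather than the provided compose setup:

```
# publish the container's 8181 to the host when starting it
docker run -p 8181:8181 <griffin-image>

# or confirm what an already-running container publishes:
docker ps --format '{{.Names}}\t{{.Ports}}'
```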
change it to griffin
17/01/22 04:29:23 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hive/warehouse/users_info_src/users_info_src.dat.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
put: File /user/hive/warehouse/users_info_src/users_info_src.dat.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
17/01/22 04:29:27 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hive/warehouse/users_info_target/users_info_target.dat.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
at org.apache.hadoop.ipc.Client.call(Client.java:1468)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy14.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy15.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)
put: File /user/hive/warehouse/users_info_target/users_info_target.dat.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
The command '/bin/sh -c ./hadoop-start.sh && ./pre-start.sh && ./hd-before-hive.sh && ./hd-after-hive.sh && ./hd-test-json.sh && ./hadoop-end.sh' returned a non-zero code: 1
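The put fails because the NameNode sees zero live DataNodes, so the script's hdfs dfs -put has nowhere to write a replica. A quick check inside the container:

```
jps                      # should list both NameNode and DataNode
hdfs dfsadmin -report    # "Live datanodes" must be at least 1
```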
Hi,
I find no place to indicate a partition column for a Hive table when registering a data asset with the web UI, and it also shows 'empty hive partition columns' on the submit page.
Could this be a UI bug?
starting org.apache.spark.deploy.master.Master, logging to /usr/local/spark/logs/spark--org.apache.spark.deploy.master.Master-1-sandbox.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-sandbox.out
localhost: failed to launch org.apache.spark.deploy.worker.Worker:
localhost: at java.lang.ClassLoader.loadClass(libgcj.so.10)
localhost: at gnu.java.lang.MainThread.run(libgcj.so.10)
localhost: full log in /usr/local/spark/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-sandbox.out
about to fork child process, waiting until server is ready for connections.
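The libgcj.so.10 frames in the worker launch failure above show the worker JVM being started with GNU gcj rather than a real JDK. Pointing Spark at an actual JDK should fix the launch; a sketch (the JDK path is hypothetical):

```
# in /usr/local/spark/conf/spark-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk
```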
On "Models" page, there's a "Model Stage" column. It's better to have some explanation of all stages.
Step 18 : RUN chmod -R 755 $GRIFFIN_HOME
---> Running in 6c152e9159cb
---> c9ea121d4b74
Removing intermediate container 6c152e9159cb
Step 19 : RUN cd /apache && wget https://www.apache.org/dist/tomcat/tomcat-7/v7.0.72/bin/apache-tomcat-7.0.72.tar.gz && tar -xvf apache-tomcat-7.0.72.tar.gz && ln -s apache-tomcat-7.0.72 tomcat
---> Running in 0f27c2c8f3bd
--2017-01-22 02:21:09-- https://www.apache.org/dist/tomcat/tomcat-7/v7.0.72/bin/apache-tomcat-7.0.72.tar.gz
Resolving www.apache.org... 88.198.26.2, 140.211.11.105, 2a01:4f8:130:2192::2
Connecting to www.apache.org|88.198.26.2|:443... failed: Connection timed out.
Connecting to www.apache.org|140.211.11.105|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2017-01-22 02:23:18 ERROR 404: Not Found.
The command '/bin/sh -c cd /apache && wget https://www.apache.org/dist/tomcat/tomcat-7/v7.0.72/bin/apache-tomcat-7.0.72.tar.gz && tar -xvf apache-tomcat-7.0.72.tar.gz && ln -s apache-tomcat-7.0.72 tomcat' returned a non-zero code: 8
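www.apache.org/dist only carries current releases, and 7.0.72 has since rotated off it, hence the 404. Older builds live on the archive host, so the Dockerfile's URL can fall back to:

```
wget https://archive.apache.org/dist/tomcat/tomcat-7/v7.0.72/bin/apache-tomcat-7.0.72.tar.gz
```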