apache / accumulo-testing
Apache Accumulo Testing
Home Page: https://accumulo.apache.org
License: Apache License 2.0
If I've already run a continuous ingest verify and I want to run it again, I get the following error:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/tmp/ci-verify already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:277)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at org.apache.accumulo.testing.continuous.ContinuousVerify.run(ContinuousVerify.java:193)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.accumulo.testing.continuous.ContinuousVerify.main(ContinuousVerify.java:204)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
It'd be better if the script noticed that a verify job had already run and offered to clean up this directory for me, so I didn't have to manually execute: hdfs dfs -rm -r /tmp/ci-verify
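One way the script could handle this is to check for the leftover output directory and prompt before removing it. Below is a minimal sketch of that logic; the class and method names are invented for illustration, and the HDFS calls are abstracted behind java.util.function interfaces (a real version would back them with Hadoop's FileSystem.exists and FileSystem.delete).

```java
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class VerifyOutputCheck {

    /**
     * Returns true when it is safe to launch the verify job: either the
     * output directory is absent, or the user agreed to remove the old one.
     */
    static boolean ensureOutputAbsent(String dir, Predicate<String> exists,
            Consumer<String> remove, Scanner in) {
        if (!exists.test(dir)) {
            return true;
        }
        System.out.printf("Output directory %s exists from a previous verify. Remove it? [y/N] ", dir);
        if (in.hasNextLine() && in.nextLine().trim().equalsIgnoreCase("y")) {
            remove.accept(dir);
            return true;
        }
        return false;
    }
}
```

Wired into cingest verify, this would replace the manual `hdfs dfs -rm -r` step with a single confirmation prompt.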
To exercise more code paths during testing it would be nice to make the agitation scripts call stop-here.sh sometimes instead of killing processes.
Describe the bug
RandomCachedLookupsPT alters some of the tserver configs. If the cluster running the test lacks the memory, the tserver throws the exception below, but the test hangs until manually canceled. The main issue is that the exception is hidden in the logs and isn't handled by the PT.
2021-09-27T12:21:11,595 [start.Main] ERROR: Thread 'tserver' died.
java.lang.IllegalArgumentException: Maximum tablet server map memory 265,751,101 block cache sizes 3,301,756,108 and mutation queue size 40,265,318 is too large for this JVM configuration 805,306,368
To Reproduce
Expected behavior
Ideally, the PT should check beforehand to make sure that the system it is testing on can handle the system config changes it makes or at least exit nicely when an exception is thrown.
Additional context
This PT does pass as expected when using the larger performance profile for fluo-uno. It is possible that the simplest solution, documenting the need for users to configure these settings before running the PT, might be good enough.
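A pre-flight check could compare the memory the PT is about to configure against the JVM ceiling before applying any changes. A minimal sketch of that arithmetic, mirroring the check that throws in the tserver; the class and method names are invented for illustration:

```java
public class TserverMemoryPreflight {

    /**
     * Returns true when the combined tserver memory settings fit within the
     * available JVM memory.
     */
    static boolean fitsInJvm(long mapMemory, long blockCacheSizes,
            long mutationQueueSize, long maxJvmBytes) {
        return mapMemory + blockCacheSizes + mutationQueueSize <= maxJvmBytes;
    }

    public static void main(String[] args) {
        // Values from the reported exception: the block cache alone exceeds
        // the 805,306,368-byte JVM configuration, so the PT should bail out
        // (or skip the config change) instead of hanging.
        if (!fitsInJvm(265_751_101L, 3_301_756_108L, 40_265_318L, 805_306_368L)) {
            System.err.println("tserver memory settings too large for this JVM; aborting PT");
        }
    }
}
```

In the PT itself, maxJvmBytes would come from the tserver's configured heap rather than a literal.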
It was suggested that the scalability test be run and updated. After a bit of investigation, it seems a lot has changed since the test was last in working order, so it may take some work to get it into a usable state again.
Before I put too much work into it, I wanted to open a ticket to see if anyone has an opinion on whether this test is still useful and worth reviving.
This performance test should measure writing and reading with locality groups.
Building /home/charbel/accumulo-testing/core/target/accumulo-testing-core-2.0.0-SNAPSHOT-shaded.jar
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.apache.accumulo:accumulo-testing-core:jar:2.0.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-shade-plugin is missing. @ org.apache.accumulo:accumulo-testing-core:[unknown-version], /home/charbel/accumulo-testing/core/pom.xml, line 93, column 19
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Apache Accumulo Testing Parent
[INFO] Apache Accumulo Testing Core
[INFO] Apache Accumulo Testing YARN
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Accumulo Testing Parent 2.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:3.0.0:clean (default-clean) @ accumulo-testing ---
[INFO] Deleting /home/charbel/accumulo-testing/target
[INFO]
[INFO] --- formatter-maven-plugin:0.5.2:format (default) @ accumulo-testing ---
[INFO] Using 'UTF-8' encoding to format source files.
[INFO] Number of files to be formatted: 0
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ accumulo-testing ---
[INFO]
[INFO] --- maven-site-plugin:3.5.1:attach-descriptor (attach-descriptor) @ accumulo-testing ---
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Accumulo Testing Core 2.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[WARNING] The POM for org.apache.accumulo:accumulo-client-mapreduce:jar:1.8.0 is missing, no dependency information available
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Accumulo Testing Parent ..................... SUCCESS [ 1.505 s]
[INFO] Apache Accumulo Testing Core ....................... FAILURE [ 1.160 s]
[INFO] Apache Accumulo Testing YARN ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.688 s
[INFO] Finished at: 2018-05-24T17:23:08-07:00
[INFO] Final Memory: 26M/278M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project accumulo-testing-core: Could not resolve dependencies for project org.apache.accumulo:accumulo-testing-core:jar:2.0.0-SNAPSHOT: Failure to find org.apache.accumulo:accumulo-client-mapreduce:jar:1.8.0 in https://repo.maven.apache.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :accumulo-testing-core
Error: Could not find or load main class org.apache.accumulo.testing.core.continuous.CreateTable
accumulo version : 1.8.0
YieldingScanExecutorPT fails to complete with the following output:
Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.testing.performance.util.TestExecutor.lambda$stream$0(TestExecutor.java:52)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.LongPipeline.collect(LongPipeline.java:491)
at java.base/java.util.stream.LongPipeline.summaryStatistics(LongPipeline.java:468)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.runShortScans(YieldingScanExecutorPT.java:215)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.runTest(YieldingScanExecutorPT.java:114)
at org.apache.accumulo.testing.performance.impl.PerfTestRunner.main(PerfTestRunner.java:51)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.accumulo.testing.performance.util.TestExecutor.lambda$stream$0(TestExecutor.java:50)
... 11 more
Caused by: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:185)
at org.apache.accumulo.core.clientImpl.ScannerIterator.hasNext(ScannerIterator.java:110)
at com.google.common.collect.Iterators.size(Iterators.java:163)
at com.google.common.collect.Iterables.size(Iterables.java:126)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.scan(YieldingScanExecutorPT.java:170)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.lambda$runShortScans$1(YieldingScanExecutorPT.java:212)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:324)
at org.apache.accumulo.core.clientImpl.ScannerIterator.readBatch(ScannerIterator.java:156)
at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:174)
... 9 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing startScan
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:249)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:221)
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:453)
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:317)
... 11 more
java.lang.NullPointerException
at org.apache.accumulo.server.conf.TableConfiguration.createScanDispatcher(TableConfiguration.java:215)
at org.apache.accumulo.server.conf.TableConfiguration.lambda$new$1(TableConfiguration.java:82)
at org.apache.accumulo.core.conf.AccumuloConfiguration$DeriverImpl.derive(AccumuloConfiguration.java:482)
at org.apache.accumulo.server.conf.TableConfiguration.getScanDispatcher(TableConfiguration.java:270)
at org.apache.accumulo.tserver.ThriftClientHandler.getScanDispatcher(ThriftClientHandler.java:272)
at org.apache.accumulo.tserver.ThriftClientHandler.continueScan(ThriftClientHandler.java:378)
at org.apache.accumulo.tserver.ThriftClientHandler.startScan(ThriftClientHandler.java:342)
at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.startScan(Unknown Source)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2944)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2923)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.ClassNotFoundException: org.apache.accumulo.testing.performance.tests.TimedScanDispatcher
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at org.apache.accumulo.start.classloader.AccumuloClassLoader$1.loadClass(AccumuloClassLoader.java:213)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.apache.accumulo.core.classloader.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:85)
at org.apache.accumulo.core.conf.ConfigurationTypeHelper.getClassInstance(ConfigurationTypeHelper.java:203)
at org.apache.accumulo.core.conf.ConfigurationTypeHelper.getClassInstance(ConfigurationTypeHelper.java:176)
at org.apache.accumulo.core.conf.Property.createTableInstanceFromPropertyName(Property.java:1747)
at org.apache.accumulo.server.conf.TableConfiguration.createScanDispatcher(TableConfiguration.java:209)
at org.apache.accumulo.server.conf.TableConfiguration.lambda$new$1(TableConfiguration.java:82)
at org.apache.accumulo.core.conf.AccumuloConfiguration$DeriverImpl.derive(AccumuloConfiguration.java:482)
at org.apache.accumulo.server.conf.TableConfiguration.getScanDispatcher(TableConfiguration.java:270)
at org.apache.accumulo.tserver.ThriftClientHandler.getScanDispatcher(ThriftClientHandler.java:272)
at org.apache.accumulo.tserver.ThriftClientHandler.continueScan(ThriftClientHandler.java:378)
at org.apache.accumulo.tserver.ThriftClientHandler.startScan(ThriftClientHandler.java:342)
at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.startScan(Unknown Source)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2944)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2923)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
I ran CI locally with agitation and everything seemed to start back up except the Garbage Collector. It looks like the issue is in <testing-home>/libexec/master-agitator.pl
When running the ./bin/cingest script, Maven fails to build the shaded jar. This is due to the accumulo version command now logging debug information. The conf/env.sh script gets the version from that command, and after printing it to a file I saw it was setting accumulo.version to this:
2020-12-03T10:53:35,524 [classloader.AccumuloClassLoader] DEBUG: Using Accumulo configuration at /home/mike/workspace/uno/install/accumulo-2.1.0-SNAPSHOT/conf/accumulo.properties
2020-12-03T10:53:35,593 [classloader.AccumuloClassLoader] DEBUG: Create 2nd tier ClassLoader using URLs: []
2.1.0-SNAPSHOT
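Since the bare version string is the last line of the command output, one workaround is to keep only that line when env.sh captures the version. The filtering logic, sketched in Java (the helper name is hypothetical):

```java
public class VersionOutput {

    /**
     * Extracts the version from `accumulo version` output that may be
     * preceded by DEBUG log lines: the bare version is the last line.
     */
    static String lastLine(String cmdOutput) {
        String[] lines = cmdOutput.trim().split("\\R");
        return lines[lines.length - 1].trim();
    }
}
```

In env.sh itself the equivalent would be piping the command through something like `tail -n 1`, assuming the version always comes last.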
With the recent property and class name changes to Manager (here I think) builds of accumulo-testing now fail. There are also a few warnings that relate to other recent changes.
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/Module.java:[46,37] cannot find symbol symbol: class SimpleThreadPool location: package org.apache.accumulo.core.util
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Replication.java:[20,1] cannot find symbol symbol: static MASTER_REPLICATION_SCAN_INTERVAL location: enum org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/Setup.java:[29,37] cannot find symbol symbol: class SimpleThreadPool location: package org.apache.accumulo.core.util
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/Module.java:[227,35] cannot find symbol symbol: class SimpleThreadPool location: class org.apache.accumulo.testing.randomwalk.Module
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Replication.java:[77,22] cannot find symbol symbol: variable MASTER_REPLICATION_SCAN_INTERVAL location: class org.apache.accumulo.testing.randomwalk.concurrent.Replication
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[87,35] cannot find symbol symbol: variable MASTER_BULK_THREADPOOL_SIZE location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[88,35] cannot find symbol symbol: variable MASTER_BULK_RETRIES location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[89,35] cannot find symbol symbol: variable MASTER_BULK_TIMEOUT location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[90,35] cannot find symbol symbol: variable MASTER_FATE_THREADPOOL_SIZE location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[91,35] cannot find symbol symbol: variable MASTER_RECOVERY_DELAY location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[92,35] cannot find symbol symbol: variable MASTER_LEASE_RECOVERY_WAITING_PERIOD location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[93,35] cannot find symbol symbol: variable MASTER_THREADCHECK location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[94,35] cannot find symbol symbol: variable MASTER_MINTHREADS location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/Setup.java:[64,32] cannot find symbol symbol: class SimpleThreadPool location: class org.apache.accumulo.testing.randomwalk.bulk.Setup
[INFO] 14 errors
[WARNING] COMPILATION WARNING :
[INFO] -------------------------------------------------------------
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Invalid project model for artifact [commons-vfs2:org.apache.commons:2.6.0]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-hadoop-mapreduce:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-core:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-start:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
Versions:
To Reproduce
accumulo-testing/bin
When testing Accumulo I often go through the following tasks manually.
It would be nice to have a script that does this. The arguments for the script would be the following:
This script would set up an EC2 cluster with the version of Accumulo from the git repo. It would also set up accumulo-testing on the cluster from the git repo.
Could possibly use the following for inspiration.
Support a non-default HDFS path (in the case of multiple volumes) as a parameter for continuous bulk ingest, e.g.:
bin/cingest bulk abfs://[email protected]/azbulk-multi
In this case the abfs://[email protected]/ filesystem is not the default HDFS and has been added as an additional volume to Accumulo; the default HDFS filesystem configured is hdfs://accucluster. Running against the non-default volume currently fails with:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: abfs://[email protected]/azbulk-multi, expected: hdfs://accucluster:8020
I believe the Dockerfile for creating a docker image was broken by changes to the scripts in edbc7cd. Trying to run the cingest script in the docker image gives this error:
/opt/at/bin/cingest: line 19: /opt/at/bin/build: No such file or directory
It would be great if accumulo-testing could be simplified and run in Docker. Below is possible usage.
$ ./bin/accumulo-testing
Usage: accumulo-testing <test> (<argument>)
Available tests:
ci <application> Runs continuous ingest <application>.
Possible applications: createtable, ingest, walk, batchwalk, scan, verify, moru
rw <module> Runs random walk <module>
Modules located in core/src/main/resources/randomwalk/modules
Something has changed such that the cluster control script for Uno no longer seems to work. This script can be used by performance tests.
The following commands are run by the scripts. They used to work but no longer do with the latest Uno; I'm not sure what changed. The goal of all of these commands is to avoid setting up Hadoop and ZooKeeper from scratch between performance tests.
uno install accumulo
uno run zookeeper
uno run hadoop
uno setup accumulo --no-deps
uno stop accumulo --no-deps
uno start accumulo --no-deps
See https://issues.apache.org/jira/browse/ACCUMULO-2145 for a description of work and prior work done.
This could use Uno. For testing apache/accumulo#1111 I wrote the following script that uses Uno.
#! /usr/bin/env bash

ACCUMULO_DIR=~/git/accumulo
UNO_DIR=~/git/uno
BULK=/tmp/upt

# Phase 1: build and run the baseline 1.9 branch.
cd $ACCUMULO_DIR
git checkout 1.9
git clean -xfd
cd $UNO_DIR
./bin/uno fetch accumulo
./bin/uno setup accumulo
(
eval "$(./bin/uno env)"
hadoop fs -ls /accumulo/version
hadoop fs -rmr "$BULK"
hadoop fs -mkdir -p "$BULK/fail"
# Generate rfiles to bulk import, plus a live-ingested table with splits.
accumulo org.apache.accumulo.test.TestIngest -i uno -u root -p secret --rfile $BULK/bulk/test --timestamp 1 --size 50 --random 56 --rows 200000 --start 200000 --cols 1
accumulo org.apache.accumulo.test.TestIngest -i uno -u root -p secret --timestamp 1 --size 50 --random 56 --rows 200000 --start 0 --cols 1 --createTable --splits 10
accumulo shell -u root -p secret <<EOF
table test_ingest
importdirectory $BULK/bulk $BULK/fail false
createtable foo
config -t foo -s table.compaction.major.ratio=2
insert r1 f1 q1 v1
flush -t foo -w
scan -t accumulo.metadata -c file
insert r1 f1 q2 v2
insert r2 f1 q1 v3
EOF
)
pkill -9 -f accumulo\\.start

# Phase 2: switch to the branch under test, reusing the existing
# instance and data (--no-deps skips reinitializing Hadoop/ZooKeeper).
cd $ACCUMULO_DIR
git checkout accumulo-1111
git clean -xfd
cd $UNO_DIR
./bin/uno fetch accumulo
./bin/uno install accumulo --no-deps
./install/accumulo*/bin/accumulo-cluster start
(
eval "$(./bin/uno env)"
hadoop fs -ls /accumulo/version
accumulo shell -u root -p secret <<EOF
config -t foo -f table.compaction.major.ratio
scan -t foo -np
scan -t accumulo.metadata -c file
compact -t foo -w
scan -t foo -np
scan -t accumulo.metadata -c file
EOF
# Confirm the data written under 1.9 is still intact after the upgrade.
accumulo org.apache.accumulo.test.VerifyIngest --size 50 --timestamp 1 --random 56 --rows 400000 --start 0 --cols 1
)
It was raised on apache/accumulo#1312 that the testing repo's 1.9 branch was not in a good state, and it was argued that we shouldn't be updating it, instead keeping the 1.9 testing code with the main 1.9 code base, like it has been.
We could merge the 1.9 branch into the master branch with -s ours to preserve the 1.9 history before deleting it.
I'll leave this issue open for at least a few days, for comment, before taking any action.
There are several java files missing license headers in the accumulo-testing repo. We should add headers as needed and add the apache-rat-plugin to the pom.xml so that CI will catch this problem when running mvn verify.
Running ./bin/performance list gives the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/accumulo/testing/performance/tests/SplitBalancingPT (wrong name: target/classes/org/apache/accumulo/testing/performance/tests/SplitBalancingPT)
at java.base/java.lang.ClassLoader.defineClass1(Native Method)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1016)
at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:802)
at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:700)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:623)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
at com.google.common.reflect.ClassPath$ClassInfo.load(ClassPath.java:328)
at org.apache.accumulo.testing.performance.impl.ListTests.main(ListTests.java:34)
When running the bulk import continuous ingest test, it can take a while to generate a good bit of data to start testing. It may be faster to generate a data set once and store it in S3; future tests could then use that data set.
I think it would be interesting to experiment with this, and if it works well, add documentation to the bulk import test docs explaining how to do it. One gotcha with this approach is that anyone running a test needs to be consistent with split points. A simple way to address this would be to store a file of split points in S3 alongside the data.
Authorizations configured for continuous scanners and walkers in accumulo-testing.properties under test.ci.common.auths are incorrectly interpreted as one authorization per character instead of as |-delimited groups. For example, if test.ci.common.auths=SYS,HR,ADM|SYS,ADM is defined, the authorizations chosen are S, Y, S, H, R, and so on, instead of "SYS,HR,ADM" and "SYS,ADM".
The issue looks to be the authValue.split("|") call in ContinuousEnv.java. The code does not account for | being a regex metacharacter, so it must be escaped in the split call, e.g. authValue.split("\\|"), to get the desired behavior.
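The difference is easy to demonstrate. A minimal sketch of the buggy and corrected parsing (the class and helper names are invented, not the actual ContinuousEnv code):

```java
public class AuthSplit {

    /**
     * Buggy: String.split treats its argument as a regex, and | alone is an
     * alternation of two empty branches, so it matches between every
     * character and each character becomes its own token.
     */
    static String[] splitBuggy(String authValue) {
        return authValue.split("|");
    }

    /** Fixed: escape the pipe (Pattern.quote("|") would also work). */
    static String[] splitFixed(String authValue) {
        return authValue.split("\\|");
    }
}
```

With input SYS,HR,ADM|SYS,ADM, the buggy version yields single characters while the fixed version yields the two intended authorization groups.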
With the import-control checkstyle, it is now easier to detect use of non-API code used. The following exceptions should be refactored and removed from the config file:
https://github.com/apache/accumulo-testing/blob/master/contrib/import-control.xml#L35
This error occurs when running any rwalk module within docker:
java.lang.RuntimeException: Failed to connect to zookeeper (localhost:2181) within 2x zookeeper timeout period 30000
at org.apache.accumulo.fate.zookeeper.ZooSession.connect(ZooSession.java:157)
at org.apache.accumulo.fate.zookeeper.ZooSession.getSession(ZooSession.java:201)
at org.apache.accumulo.fate.zookeeper.ZooReader.getSession(ZooReader.java:42)
at org.apache.accumulo.fate.zookeeper.ZooReader.getZooKeeper(ZooReader.java:46)
at org.apache.accumulo.fate.zookeeper.ZooCache.getZooKeeper(ZooCache.java:148)
at org.apache.accumulo.fate.zookeeper.ZooCache.access$900(ZooCache.java:48)
at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:406)
at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:379)
at org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:271)
at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:434)
at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:364)
at org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:398)
at org.apache.accumulo.core.clientImpl.Tables.getTableMap(Tables.java:179)
at org.apache.accumulo.core.clientImpl.Tables.getTableMap(Tables.java:167)
at org.apache.accumulo.core.clientImpl.Tables.getNameToIdMap(Tables.java:151)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.exists(TableOperationsImpl.java:192)
at org.apache.accumulo.testing.randomwalk.bulk.Setup.visit(Setup.java:49)
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:237)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
It would be nice if there was a test suite that developers and users could run on an Accumulo instance to sanity check an upgrade/install and verify the basic functionality of Accumulo.
It would be helpful if the JSON contained units like 'milliseconds' or 'ms'
It would be nice to be able to generate bulk import data for the CI test that covers a subset of the table instead of the entire table. This may be possible with -o min=y -o max=z command line options; I'm not sure. If it is possible, we could update the example test scripts to suggest using it.
I am testing the 2.0 branch at commit 9e32263e7 using Uno and keep seeing this same error while trying to run RW Concurrent module:
2019-06-20 14:29:52,354 [testing.randomwalk.Framework] INFO : Running random walk test with module: Concurrent.xml
2019-06-20 14:30:19,161 [testing.randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node ct.BulkImport
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: org.apache.accumulo.core.client.AccumuloException: Bulk import directory /tmp/concurrent_bulk/b_4640b9365aa878f3 does not exist!
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.checkPath(TableOperationsImpl.java:1173)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.importDirectory(TableOperationsImpl.java:1197)
at org.apache.accumulo.testing.randomwalk.concurrent.BulkImport.visit(BulkImport.java:134)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
2019-06-20 14:30:19,163 [testing.randomwalk.Framework] INFO : Test finished
The BulkImport module is currently using the old deprecated importDirectory but it should still work.
The bin/cingest bulk command starts a MapReduce job that generates data to bulk import. When the job runs, its name is accumulo-testing-shaded.jar. The code should set a better name. Could possibly use the following for inspiration.
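A hypothetical fragment of what the generation driver could do when setting up the job; the driver class and job name here are made up for illustration, not the actual accumulo-testing code:

```java
// Give the MapReduce job a descriptive name instead of letting it
// default to the name of the shaded jar.
Job job = Job.getInstance(hadoopConfig);
job.setJarByClass(BulkGenerator.class); // hypothetical driver class
job.setJobName("ci-bulk-generate");
```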
Accumulo-testing has MapReduce jobs that needed the following configuration added to work with Hadoop 3. This ticket is to revisit the configuration added in #49 so clients don't have to specify the location of the Hadoop home directory on servers.
hadoopConfig.set("yarn.app.mapreduce.am.env", "HADOOP_MAPRED_HOME=" + hadoopHome);
In #24 a PT was added for lots of short random scans. It would be nice to have another PT for long-running scans. For example, measure the performance of reading X million entries from an Accumulo table with multiple tablets. Could also measure running 1, 2, 4, 8, and 16 concurrent long-running scans.
I wrote the following test while working on apache/accumulo#990. This could be turned into a performance test.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map.Entry;
import java.util.Random;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.admin.NewTableConfiguration;
import org.apache.accumulo.core.conf.Property;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.util.FastFormat;

public class HerdTest {

  private static final byte[] E = new byte[] {};
  private static final byte[] FAM = "pinky".getBytes();
  private static final int NUM_ROWS = 1_000_000;
  private static final int NUM_COLS = 10;

  public static void main(String[] args) throws Exception {
    Connector conn = CmdUtil.getConnector();
    if (!conn.tableOperations().exists("herd")) {
      // Enable the block cache so all threads contend for the same cached blocks.
      conn.tableOperations().create("herd", new NewTableConfiguration().setProperties(
          Collections.singletonMap(Property.TABLE_BLOCKCACHE_ENABLED.getKey(), "true")));
      write(conn);
      conn.tableOperations().flush("herd", null, null, true);
    }
    testHerd(conn, 32);
  }

  private static void testHerd(Connector conn, int numThreads)
      throws InterruptedException, ExecutionException {
    ExecutorService tp = Executors.newFixedThreadPool(numThreads);
    // The barrier releases all threads at once so they herd on the same row.
    final CyclicBarrier cb = new CyclicBarrier(numThreads);
    long t1 = System.currentTimeMillis();
    List<Future<?>> futures = new ArrayList<>();
    for (int t = 0; t < numThreads; t++) {
      futures.add(tp.submit(() -> {
        try (Scanner scanner = conn.createScanner("herd", Authorizations.EMPTY)) {
          for (int i = 0; i < 1000; i++) {
            cb.await();
            byte[] row = FastFormat.toZeroPaddedString(i * 1000, 8, 16, E);
            scanner.setRange(Range.exact(new String(row)));
            for (Entry<Key,Value> entry : scanner) {
              // iterate to force the read
            }
          }
        } catch (Exception e) {
          e.printStackTrace();
        }
      }));
    }
    for (Future<?> future : futures) {
      future.get();
    }
    long t2 = System.currentTimeMillis();
    System.out.println(t2 - t1);
    tp.shutdown();
  }

  private static void write(Connector conn) throws Exception {
    try (BatchWriter bw = conn.createBatchWriter("herd", new BatchWriterConfig())) {
      Random rand = new Random();
      for (int r = 0; r < NUM_ROWS; r++) {
        byte[] row = FastFormat.toZeroPaddedString(r, 8, 16, E);
        Mutation m = new Mutation(row);
        for (int c = 0; c < NUM_COLS; c++) {
          byte[] qual = FastFormat.toZeroPaddedString(c, 4, 16, E);
          byte[] val = new byte[32];
          rand.nextBytes(val);
          m.put(FAM, qual, val);
        }
        bw.addMutation(m);
      }
    }
  }
}
When building the testing shaded jar using Accumulo 2.0.0-SNAP and Hadoop 2.8.4, the Hadoop dependencies are not properly converged in the shaded jar. I am seeing warnings like the following.
[WARNING] hadoop-client-api-3.0.2.jar, hadoop-hdfs-client-2.8.4.jar define 1642 overlapping classes:
[WARNING] - org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo$Expiration
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerResponseProto$Builder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetFileLinkInfoRequestProto$Builder
[WARNING] - org.apache.hadoop.hdfs.web.URLConnectionFactory$1
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.XAttrProtos$SetXAttrRequestProto$1
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ModifyCacheDirectiveResponseProto$1
[WARNING] - org.apache.hadoop.fs.XAttr$1
[WARNING] - org.apache.hadoop.hdfs.protocol.CachePoolStats$Builder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ReportBadBlocksResponseProtoOrBuilder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetBlockLocationsResponseProto
[WARNING] - 1632 more...
When many clients write data to the same tablet server at around the same time, their data should be synced to the write-ahead log as a group. If group commit is not working properly it can cause performance problems for many clients that will not be seen with a single client. It would be nice to have a performance test that specifically checks for this.
I created a project to do this in the past. Not sure what state it is in.
https://github.com/keith-turner/mutslam
Would be nice to test group commit performance for mutations and conditional mutations.
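As a toy model of what such a test would measure, here is a stdlib-only sketch (all names hypothetical, nothing here touches Accumulo): writer threads enqueue simulated mutations, and the caller drains and "syncs" whatever has accumulated in one batch, so the number of syncs can be compared against the number of writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class GroupCommitDemo {

    // Simulates group commit: returns {syncs, writesPersisted}.
    static int[] run(int writers, int perWriter) throws InterruptedException {
        int total = writers * perWriter;
        LinkedBlockingQueue<Integer> pending = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            pool.submit(() -> {
                for (int i = 0; i < perWriter; i++) {
                    pending.add(i); // each add stands in for one client mutation
                }
            });
        }
        int syncs = 0, written = 0;
        List<Integer> batch = new ArrayList<>();
        while (written < total) {
            batch.clear();
            batch.add(pending.take()); // block for at least one mutation
            pending.drainTo(batch);    // then grab everything else queued
            syncs++;                   // one "WAL sync" covers the whole batch
            written += batch.size();
        }
        pool.shutdown();
        return new int[] {syncs, written};
    }

    public static void main(String[] args) throws Exception {
        int[] r = run(8, 200);
        // With group commit working, syncs is far smaller than writes.
        System.out.println(r[0] + " syncs for " + r[1] + " writes");
    }
}
```

A real PT would replace the queue with concurrent BatchWriters against one tablet server and compare throughput at 1 client versus many.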
Continuous Ingest is still using BasicCompactionStrategy, which has been deprecated in favor of the new compaction code. This needs to be replaced with a CompactionPlanner; most likely DefaultCompactionPlanner will work fine. This may just be a configuration change. This would be great to do for the 2.1 release testing.
With the recent changes in accumulo #1828, some of the modules for RWalk now throw permission exceptions. The specific one I ran into while running ./bin/rwalk All.xml, and specifically ./bin/rwalk Security.xml, is below:
ThriftSecurityException(user:system_flash_superheroes_local, code:PERMISSION_DENIED)
at org.apache.accumulo.server.security.SecurityOperation.authenticateUser(SecurityOperation.java:238)
at org.apache.accumulo.server.client.ClientServiceHandler.authenticateUser(ClientServiceHandler.java:150)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.authenticateUser(Unknown Source)
at org.apache.accumulo.core.clientImpl.thrift.ClientService$Processor$authenticateUser.getResult(ClientService.java:2608)
at org.apache.accumulo.core.clientImpl.thrift.ClientService$Processor$authenticateUser.getResult(ClientService.java:2587)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
There are also several warnings thrown now for each permission in our auditing. Some examples below:
operation: failed; user: system_flash_superheroes_local;
action: changeAuthorizations; targetUser: system_flash_superheroes_local;
authorizations: Towels,Paper,Brush,Asparagus,Fishsticks,PotatoSkins,Ribs,Celery;
exception: ThriftSecurityException(user:system_flash_superheroes_local,
code:PERMISSION
operation: failed; user: root; checking permission DROP_USER on table_flash_superheroes_local denied;
exception: ThriftSecurityException(user:table_flash_superheroes_local,
code:USER_DOESNT_EXIST)
operation: failed; user: table_flash_superheroes_local;
action: revokeTablePermission;
permission: BULK_IMPORT; targetTable: security_flash_superheroes_local; targetUser: system_flash_superheroes_local;;
exception: ThriftSecurityException(user:table_flash_superheroes_local,
code:PERMISSION_DENIED)
This seems to happen for each permission type, with either PERMISSION_DENIED or USER_DOESNT_EXIST.
New one below:
ERROR Framework Error during random walk
java.lang.Exception: Error running node Security.xml
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_CREDENTIALS for user system_flash_superheroes_local - Username or Password is Invalid
Follow on from issue #130 / PR #140
This testing repository should migrate to log4j2, and any configured console logging should be configured to use STDERR instead of STDOUT in the log4j2 configuration files, so that console output won't interfere with output from executed commands used for scripts (see #130).
When running the performance tests (via ./bin/performance run), ConditionalMutationsPT errors out with the following:
Exception in thread "main" java.lang.NumberFormatException: For input string: "∞"
at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.base/java.lang.Double.parseDouble(Double.java:543)
at org.apache.accumulo.testing.performance.tests.ConditionalMutationsPT.runConditionalMutationsTest(ConditionalMutationsPT.java:104)
at org.apache.accumulo.testing.performance.tests.ConditionalMutationsPT.runTest(ConditionalMutationsPT.java:76)
at org.apache.accumulo.testing.performance.impl.PerfTestRunner.main(PerfTestRunner.java:51)
I think the issue is occurring here: t1 and t2 have a difference of less than 1,000,000,000 nanoseconds (1 second). TimeUnit.NANOSECONDS.toSeconds converts nanos to seconds, rounding down, so any difference of less than one second results in 0 seconds. In this case the 0 ends up in the denominator, which yields "∞" and throws an error when it is parsed. I think this can be corrected by manually converting from nanos to seconds.
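To illustrate, a minimal stdlib sketch of the truncation and the manual fix (the numbers are made up, not from the actual test):

```java
import java.util.concurrent.TimeUnit;

public class NanosToSeconds {
    public static void main(String[] args) {
        long t1 = 0;
        long t2 = 400_000_000L; // 0.4 seconds, in nanoseconds
        long count = 1000;      // hypothetical number of mutations written

        // TimeUnit truncates: 0.4s becomes 0 seconds, so the rate is Infinity.
        long truncated = TimeUnit.NANOSECONDS.toSeconds(t2 - t1);
        System.out.println(truncated);                  // prints 0
        System.out.println((double) count / truncated); // prints Infinity

        // Manual conversion keeps the fractional seconds.
        double seconds = (t2 - t1) / 1_000_000_000.0;
        System.out.println(count / seconds);            // prints 2500.0
    }
}
```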
To debug a recent bulk ingest test I wrote the following summarizer, which counts the number of times each UUID was seen. I used it to count the number of entries each MapReduce job had created.
package test.ci;

import org.apache.accumulo.core.client.summary.CountingSummarizer;

public class CiUuidSummarizer extends CountingSummarizer<String> {
  @Override
  protected Converter<String> converter() {
    // The UUID is the first ':'-separated field of the value; count each one.
    return (k, v, c) -> c.accept(v.toString().split(":")[0]);
  }
}
While working on #156 I realized it would be helpful to be able to run individual performance tests. As it stands, the only option is to run all the tests in order, which takes quite a while. If any of the tests hang or error out, it stops the run of the remaining tests. It would be nice to be able to pass the name of an individual test to run as a parameter, similar to how cingest works with its multiple components.
It doesn't seem there is any place that explains how to run a single Performance test. It would be nice if that was explained somewhere.
I was unable to run continuous ingest for 1.9.2RC1 because of the following problem. I was using hadoop 2.8.4.
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.HAUtil.isLogicalUri(Lorg/apache/hadoop/conf/Configuration;Ljava/net/URI;)Z
at org.apache.twill.filesystem.FileContextLocation.toURI(FileContextLocation.java:149)
at org.apache.twill.yarn.YarnTwillPreparer.createLocalFile(YarnTwillPreparer.java:446)
at org.apache.twill.yarn.YarnTwillPreparer.createLocalFile(YarnTwillPreparer.java:442)
at org.apache.twill.yarn.YarnTwillPreparer.createAppMasterJar(YarnTwillPreparer.java:468)
at org.apache.twill.yarn.YarnTwillPreparer.access$100(YarnTwillPreparer.java:111)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:338)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:329)
at org.apache.twill.yarn.YarnTwillController.doStartUp(YarnTwillController.java:97)
at org.apache.twill.internal.AbstractZKServiceController.startUp(AbstractZKServiceController.java:75)
at org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.startUp(AbstractExecutionServiceController.java:175)
at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
at java.lang.Thread.run(Thread.java:748)
I saw this while running a rwalk on a 5 node cluster.
2019-09-12 21:26:51,842 [testing.randomwalk.Module] ERROR: Caught error executing BulkImport
java.util.concurrent.ExecutionException: java.lang.AssertionError: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:318)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: java.lang.AssertionError: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doBulkFateOperation(TableOperationsImpl.java:334)
at org.apache.accumulo.core.clientImpl.bulk.BulkImport.load(BulkImport.java:142)
at org.apache.accumulo.testing.randomwalk.concurrent.BulkImport.visit(BulkImport.java:130)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:376)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:342)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doBulkFateOperation(TableOperationsImpl.java:329)
... 11 more
Caused by: ThriftTableOperationException(tableId:a0, tableName:null, op:BULK_IMPORT, type:NOTFOUND, description:Table (Id=a0) does not exist)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result$executeFateOperation_resultStandardScheme.read(FateService.java:3474)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result$executeFateOperation_resultStandardScheme.read(FateService.java:3451)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result.read(FateService.java:3385)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
at org.apache.accumulo.core.master.thrift.FateService$Client.recv_executeFateOperation(FateService.java:124)
at org.apache.accumulo.core.master.thrift.FateService$Client.executeFateOperation(FateService.java:105)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.executeFateOperation(TableOperationsImpl.java:270)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:353)
... 13 more
Then looking at where the error is thrown, it seems like this could be bad:
public String doBulkFateOperation(List<ByteBuffer> args, String tableName)
    throws AccumuloSecurityException, AccumuloException {
  try {
    return doFateOperation(FateOperation.TABLE_BULK_IMPORT2, args, Collections.emptyMap(),
        tableName);
  } catch (TableExistsException | TableNotFoundException | NamespaceNotFoundException
      | NamespaceExistsException e) {
    // should not happen
    throw new AssertionError(e);
  }
}
PR apache/accumulo#2171 renamed WALOG properties to be WAL instead. There are a few instances inside accumulo-testing that use the deprecated WALOG variant that should be replaced by the WAL counterpart (see #147 as an example of such issue).
Could measure writing large batches of data and writing lots of small batches of data. The large and small batches could be separate tests or the same test.
2019-06-26 16:43:06,014 [testing.randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node seq.MapRedVerify
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
at org.apache.accumulo.testing.randomwalk.sequential.MapRedVerify.visit(MapRedVerify.java:48)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
apache/accumulo#537 was a WAL recovery bug where deleted data could possibly come back. This flaw was not found through testing. It would be nice to delete data in the continuous ingest test to cover this case. This could be done by periodically deleting a previously written set of linked lists. The linked lists would need to be deleted in reverse order to avoid false positives in the test. Could do something like the following.
while (true) {
  // write 1,000,000 linked lists of 25 nodes
  if (random.nextInt(10) == 0) {
    // delete a previously written 1,000,000 linked lists of 25 nodes in reverse order
  }
}
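To make the reverse-order idea concrete, here is a stdlib-only toy (all names hypothetical): nodes are written 0..n-1 with node i referencing node i-1, so deleting in reverse write order guarantees a verifier never observes a live node whose referent is gone, which would otherwise look exactly like lost data.

```java
import java.util.HashSet;
import java.util.Set;

public class ReverseDeleteDemo {

    // Deletes nodes 0..n-1 (node i references node i-1) in reverse write
    // order, checking after every delete that no surviving node references
    // a deleted one. Returns true if the invariant always held.
    static boolean deleteInReverse(int n) {
        Set<Integer> live = new HashSet<>();
        for (int i = 0; i < n; i++) {
            live.add(i);
        }
        // Last-written node first: node i+1 is removed before node i, so
        // there is never a dangling reference mid-delete.
        for (int i = n - 1; i >= 0; i--) {
            live.remove(i);
            for (int node : live) {
                if (node > 0 && !live.contains(node - 1)) {
                    return false; // dangling reference: would be a false positive
                }
            }
        }
        return live.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(deleteInReverse(25)); // prints true
    }
}
```

Deleting in write order instead (head of the chain first) would leave every remaining node pointing at a missing one.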
Original Jira ticket: https://issues.apache.org/jira/browse/ACCUMULO-1982
Main snippet:
It would be nice if we tested that sudoing to the desired user worked
for the current user and gave better error messages in the case of failure.
The code here has changed significantly since the original ticket was created, so it is possible the need for this has lessened, but from my quick investigation I did not find any validation checks.