apache / accumulo-testing
Apache Accumulo Testing
Home Page: https://accumulo.apache.org
License: Apache License 2.0
If I've already run a continuous ingest verify and I want to run it again, I get the following error:
Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://localhost:8020/tmp/ci-verify already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:277)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:143)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1570)
at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1567)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1567)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1588)
at org.apache.accumulo.testing.continuous.ContinuousVerify.run(ContinuousVerify.java:193)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.accumulo.testing.continuous.ContinuousVerify.main(ContinuousVerify.java:204)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
It'd be better if the script noticed that a verify job had already run and offered to clean up this directory for me, so I didn't have to manually execute: hdfs dfs -rm -r /tmp/ci-verify
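One way the script could handle this is to check for the leftover output directory and prompt before removing it. Below is a minimal sketch of that logic; the class and method names are invented for illustration, and the HDFS calls are abstracted behind java.util.function interfaces (a real version would back them with Hadoop's FileSystem.exists and FileSystem.delete).

```java
import java.util.Scanner;
import java.util.function.Consumer;
import java.util.function.Predicate;

public class VerifyOutputCheck {

    /**
     * Returns true when it is safe to launch the verify job: either the
     * output directory is absent, or the user agreed to remove the old one.
     */
    static boolean ensureOutputAbsent(String dir, Predicate<String> exists,
            Consumer<String> remove, Scanner in) {
        if (!exists.test(dir)) {
            return true;
        }
        System.out.printf("Output directory %s exists from a previous verify. Remove it? [y/N] ", dir);
        if (in.hasNextLine() && in.nextLine().trim().equalsIgnoreCase("y")) {
            remove.accept(dir);
            return true;
        }
        return false;
    }
}
```

Wired into cingest verify, this would replace the manual `hdfs dfs -rm -r` step with a single confirmation prompt.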
To exercise more code paths during testing it would be nice to make the agitation scripts call stop-here.sh sometimes instead of killing processes.
Describe the bug
RandomCachedLookupsPT alters some of the tserver configs. If the cluster running the test lacks the memory, the tserver throws the exception below, but the test hangs until manually canceled. The main issue is that the exception is hidden in the logs and isn't handled by the PT.
2021-09-27T12:21:11,595 [start.Main] ERROR: Thread 'tserver' died.
java.lang.IllegalArgumentException: Maximum tablet server map memory 265,751,101 block cache sizes 3,301,756,108 and mutation queue size 40,265,318 is too large for this JVM configuration 805,306,368
To Reproduce
Expected behavior
Ideally, the PT should check beforehand to make sure that the system it is testing on can handle the system config changes it makes or at least exit nicely when an exception is thrown.
Additional context
This PT does pass as expected when using the larger performance profile for fluo-uno. It is possible that the simplest solution, documenting the need for users to configure these settings before running the PT, might be good enough.
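A pre-flight check could compare the memory the PT is about to configure against the JVM ceiling before applying any changes. A minimal sketch of that arithmetic, mirroring the check that throws in the tserver; the class and method names are invented for illustration:

```java
public class TserverMemoryPreflight {

    /**
     * Returns true when the combined tserver memory settings fit within the
     * available JVM memory.
     */
    static boolean fitsInJvm(long mapMemory, long blockCacheSizes,
            long mutationQueueSize, long maxJvmBytes) {
        return mapMemory + blockCacheSizes + mutationQueueSize <= maxJvmBytes;
    }

    public static void main(String[] args) {
        // Values from the reported exception: the block cache alone exceeds
        // the 805,306,368-byte JVM configuration, so the PT should bail out
        // (or skip the config change) instead of hanging.
        if (!fitsInJvm(265_751_101L, 3_301_756_108L, 40_265_318L, 805_306_368L)) {
            System.err.println("tserver memory settings too large for this JVM; aborting PT");
        }
    }
}
```

In the PT itself, maxJvmBytes would come from the tserver's configured heap rather than a literal.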
It was suggested that the scalability test be run and updated. After a bit of investigation, it seems a lot has changed since the test was last in working order, so it may take some work to get it into a usable state again.
Before I put too much work into it, I wanted to open a ticket to see if anyone has an opinion on whether this test is still useful and worth reviving.
This performance test should measure writing and reading with locality groups.
Building /home/charbel/accumulo-testing/core/target/accumulo-testing-core-2.0.0-SNAPSHOT-shaded.jar
[INFO] Scanning for projects...
[WARNING]
[WARNING] Some problems were encountered while building the effective model for org.apache.accumulo:accumulo-testing-core:jar:2.0.0-SNAPSHOT
[WARNING] 'build.plugins.plugin.version' for org.apache.maven.plugins:maven-shade-plugin is missing. @ org.apache.accumulo:accumulo-testing-core:[unknown-version], /home/charbel/accumulo-testing/core/pom.xml, line 93, column 19
[WARNING]
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING]
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING]
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO]
[INFO] Apache Accumulo Testing Parent
[INFO] Apache Accumulo Testing Core
[INFO] Apache Accumulo Testing YARN
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Accumulo Testing Parent 2.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO] --- maven-clean-plugin:3.0.0:clean (default-clean) @ accumulo-testing ---
[INFO] Deleting /home/charbel/accumulo-testing/target
[INFO]
[INFO] --- formatter-maven-plugin:0.5.2:format (default) @ accumulo-testing ---
[INFO] Using 'UTF-8' encoding to format source files.
[INFO] Number of files to be formatted: 0
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ accumulo-testing ---
[INFO]
[INFO] --- maven-site-plugin:3.5.1:attach-descriptor (attach-descriptor) @ accumulo-testing ---
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Accumulo Testing Core 2.0.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[WARNING] The POM for org.apache.accumulo:accumulo-client-mapreduce:jar:1.8.0 is missing, no dependency information available
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache Accumulo Testing Parent ..................... SUCCESS [ 1.505 s]
[INFO] Apache Accumulo Testing Core ....................... FAILURE [ 1.160 s]
[INFO] Apache Accumulo Testing YARN ....................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3.688 s
[INFO] Finished at: 2018-05-24T17:23:08-07:00
[INFO] Final Memory: 26M/278M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project accumulo-testing-core: Could not resolve dependencies for project org.apache.accumulo:accumulo-testing-core:jar:2.0.0-SNAPSHOT: Failure to find org.apache.accumulo:accumulo-client-mapreduce:jar:1.8.0 in https://repo.maven.apache.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :accumulo-testing-core
Error: Could not find or load main class org.apache.accumulo.testing.core.continuous.CreateTable
accumulo version : 1.8.0
YieldingScanExecutorPT fails to complete with the following output:
Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.testing.performance.util.TestExecutor.lambda$stream$0(TestExecutor.java:52)
at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)
at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.base/java.util.stream.LongPipeline.collect(LongPipeline.java:491)
at java.base/java.util.stream.LongPipeline.summaryStatistics(LongPipeline.java:468)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.runShortScans(YieldingScanExecutorPT.java:215)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.runTest(YieldingScanExecutorPT.java:114)
at org.apache.accumulo.testing.performance.impl.PerfTestRunner.main(PerfTestRunner.java:51)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
at org.apache.accumulo.testing.performance.util.TestExecutor.lambda$stream$0(TestExecutor.java:50)
... 11 more
Caused by: java.lang.RuntimeException: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:185)
at org.apache.accumulo.core.clientImpl.ScannerIterator.hasNext(ScannerIterator.java:110)
at com.google.common.collect.Iterators.size(Iterators.java:163)
at com.google.common.collect.Iterables.size(Iterables.java:126)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.scan(YieldingScanExecutorPT.java:170)
at org.apache.accumulo.testing.performance.tests.YieldingScanExecutorPT.lambda$runShortScans$1(YieldingScanExecutorPT.java:212)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.accumulo.core.clientImpl.AccumuloServerException: Error on server thor:9997
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:324)
at org.apache.accumulo.core.clientImpl.ScannerIterator.readBatch(ScannerIterator.java:156)
at org.apache.accumulo.core.clientImpl.ScannerIterator.getNextBatch(ScannerIterator.java:174)
... 9 more
Caused by: org.apache.thrift.TApplicationException: Internal error processing startScan
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.recv_startScan(TabletClientService.java:249)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Client.startScan(TabletClientService.java:221)
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:453)
at org.apache.accumulo.core.clientImpl.ThriftScanner.scan(ThriftScanner.java:317)
... 11 more
java.lang.NullPointerException
at org.apache.accumulo.server.conf.TableConfiguration.createScanDispatcher(TableConfiguration.java:215)
at org.apache.accumulo.server.conf.TableConfiguration.lambda$new$1(TableConfiguration.java:82)
at org.apache.accumulo.core.conf.AccumuloConfiguration$DeriverImpl.derive(AccumuloConfiguration.java:482)
at org.apache.accumulo.server.conf.TableConfiguration.getScanDispatcher(TableConfiguration.java:270)
at org.apache.accumulo.tserver.ThriftClientHandler.getScanDispatcher(ThriftClientHandler.java:272)
at org.apache.accumulo.tserver.ThriftClientHandler.continueScan(ThriftClientHandler.java:378)
at org.apache.accumulo.tserver.ThriftClientHandler.startScan(ThriftClientHandler.java:342)
at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.startScan(Unknown Source)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2944)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2923)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
java.lang.ClassNotFoundException: org.apache.accumulo.testing.performance.tests.TimedScanDispatcher
at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:471)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:589)
at org.apache.accumulo.start.classloader.AccumuloClassLoader$1.loadClass(AccumuloClassLoader.java:213)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:522)
at org.apache.accumulo.core.classloader.ClassLoaderUtil.loadClass(ClassLoaderUtil.java:85)
at org.apache.accumulo.core.conf.ConfigurationTypeHelper.getClassInstance(ConfigurationTypeHelper.java:203)
at org.apache.accumulo.core.conf.ConfigurationTypeHelper.getClassInstance(ConfigurationTypeHelper.java:176)
at org.apache.accumulo.core.conf.Property.createTableInstanceFromPropertyName(Property.java:1747)
at org.apache.accumulo.server.conf.TableConfiguration.createScanDispatcher(TableConfiguration.java:209)
at org.apache.accumulo.server.conf.TableConfiguration.lambda$new$1(TableConfiguration.java:82)
at org.apache.accumulo.core.conf.AccumuloConfiguration$DeriverImpl.derive(AccumuloConfiguration.java:482)
at org.apache.accumulo.server.conf.TableConfiguration.getScanDispatcher(TableConfiguration.java:270)
at org.apache.accumulo.tserver.ThriftClientHandler.getScanDispatcher(ThriftClientHandler.java:272)
at org.apache.accumulo.tserver.ThriftClientHandler.continueScan(ThriftClientHandler.java:378)
at org.apache.accumulo.tserver.ThriftClientHandler.startScan(ThriftClientHandler.java:342)
at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.startScan(Unknown Source)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2944)
at org.apache.accumulo.core.tabletserver.thrift.TabletClientService$Processor$startScan.getResult(TabletClientService.java:2923)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:524)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
I ran CI locally with agitation and everything seemed to start back up except the Garbage Collector. It looks like the issue is in <testing-home>/libexec/master-agitator.pl
When running the ./bin/cingest script, Maven fails to build the shaded jar. This is due to the accumulo version command now logging debug information. The conf/env.sh script gets the version from that command, and after printing it to a file I saw it was setting accumulo.version to this:
2020-12-03T10:53:35,524 [classloader.AccumuloClassLoader] DEBUG: Using Accumulo configuration at /home/mike/workspace/uno/install/accumulo-2.1.0-SNAPSHOT/conf/accumulo.properties
2020-12-03T10:53:35,593 [classloader.AccumuloClassLoader] DEBUG: Create 2nd tier ClassLoader using URLs: []
2.1.0-SNAPSHOT
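Since the bare version string is the last line of the command output, one workaround is to keep only that line when env.sh captures the version. The filtering logic, sketched in Java (the helper name is hypothetical):

```java
public class VersionOutput {

    /**
     * Extracts the version from `accumulo version` output that may be
     * preceded by DEBUG log lines: the bare version is the last line.
     */
    static String lastLine(String cmdOutput) {
        String[] lines = cmdOutput.trim().split("\\R");
        return lines[lines.length - 1].trim();
    }
}
```

In env.sh itself the equivalent would be piping the command through something like `tail -n 1`, assuming the version always comes last.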
With the recent property and class name changes to Manager (here I think) builds of accumulo-testing now fail. There are also a few warnings that relate to other recent changes.
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/Module.java:[46,37] cannot find symbol symbol: class SimpleThreadPool location: package org.apache.accumulo.core.util
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Replication.java:[20,1] cannot find symbol symbol: static MASTER_REPLICATION_SCAN_INTERVAL location: enum org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/Setup.java:[29,37] cannot find symbol symbol: class SimpleThreadPool location: package org.apache.accumulo.core.util
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/Module.java:[227,35] cannot find symbol symbol: class SimpleThreadPool location: class org.apache.accumulo.testing.randomwalk.Module
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Replication.java:[77,22] cannot find symbol symbol: variable MASTER_REPLICATION_SCAN_INTERVAL location: class org.apache.accumulo.testing.randomwalk.concurrent.Replication
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[87,35] cannot find symbol symbol: variable MASTER_BULK_THREADPOOL_SIZE location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[88,35] cannot find symbol symbol: variable MASTER_BULK_RETRIES location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[89,35] cannot find symbol symbol: variable MASTER_BULK_TIMEOUT location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[90,35] cannot find symbol symbol: variable MASTER_FATE_THREADPOOL_SIZE location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[91,35] cannot find symbol symbol: variable MASTER_RECOVERY_DELAY location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[92,35] cannot find symbol symbol: variable MASTER_LEASE_RECOVERY_WAITING_PERIOD location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[93,35] cannot find symbol symbol: variable MASTER_THREADCHECK location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/concurrent/Config.java:[94,35] cannot find symbol symbol: variable MASTER_MINTHREADS location: class org.apache.accumulo.core.conf.Property
[ERROR] /home/jeffreymanno/git/accumulo-testing/src/main/java/org/apache/accumulo/testing/randomwalk/bulk/Setup.java:[64,32] cannot find symbol symbol: class SimpleThreadPool location: class org.apache.accumulo.testing.randomwalk.bulk.Setup
[INFO] 14 errors
[WARNING] COMPILATION WARNING :
[INFO] -------------------------------------------------------------
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Cannot find annotation method 'since()' in type 'java.lang.Deprecated'
[WARNING] Invalid project model for artifact [commons-vfs2:org.apache.commons:2.6.0]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-hadoop-mapreduce:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-core:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
[WARNING] Invalid project model for artifact [accumulo-start:org.apache.accumulo:2.1.0-SNAPSHOT]. It will be ignored by the remote resources Mojo.
Versions:
To Reproduce
accumulo-testing/bin
When testing Accumulo I often go through the following tasks manually.
It would be nice to have a script that does this. The arguments for the script would be the following:
This script would set up an EC2 cluster with the version of Accumulo from the git repo. It would also set up accumulo-testing on the cluster from the git repo.
Could possibly use the following for inspiration.
Support a non-default HDFS path (in the case of multiple volumes) as a parameter for continuous bulk ingest, e.g.:
bin/cingest bulk abfs://[email protected]/azbulk-multi
In this case the abfs://[email protected]/ filesystem is not the default HDFS and has been added as an additional volume to Accumulo; the default HDFS filesystem configured is hdfs://accucluster. Running against the non-default volume currently fails with:
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: abfs://[email protected]/azbulk-multi, expected: hdfs://accucluster:8020
I believe the Dockerfile for creating a docker image was broken by changes to the scripts in edbc7cd. Trying to run the cingest script in the docker image gives this error:
/opt/at/bin/cingest: line 19: /opt/at/bin/build: No such file or directory
It would be great if accumulo-testing could be simplified and run in Docker. Below is possible usage.
$ ./bin/accumulo-testing
Usage: accumulo-testing <test> (<argument>)
Available tests:
ci <application> Runs continuous ingest <application>.
Possible applications: createtable, ingest, walk, batchwalk, scan, verify, moru
rw <module> Runs random walk <module>
Modules located in core/src/main/resources/randomwalk/modules
Something has changed such that the cluster control script for Uno no longer seems to work. This script can be used by performance tests.
The following commands are run by the scripts. They used to work but no longer do with the latest Uno; I'm not sure what changed. The goal of all of these commands is to avoid setting up Hadoop and ZooKeeper from scratch between performance tests.
uno install accumulo
uno run zookeeper
uno run hadoop
uno setup accumulo --no-deps
uno stop accumulo --no-deps
uno start accumulo --no-deps
See https://issues.apache.org/jira/browse/ACCUMULO-2145 for a description of work and prior work done.
This could use Uno. For testing apache/accumulo#1111 I wrote the following script that uses Uno.
#! /usr/bin/env bash

ACCUMULO_DIR=~/git/accumulo
UNO_DIR=~/git/uno
BULK=/tmp/upt

# Phase 1: build and run the baseline 1.9 branch.
cd $ACCUMULO_DIR
git checkout 1.9
git clean -xfd
cd $UNO_DIR
./bin/uno fetch accumulo
./bin/uno setup accumulo
(
eval "$(./bin/uno env)"
hadoop fs -ls /accumulo/version
hadoop fs -rmr "$BULK"
hadoop fs -mkdir -p "$BULK/fail"
# Generate rfiles to bulk import, plus a live-ingested table with splits.
accumulo org.apache.accumulo.test.TestIngest -i uno -u root -p secret --rfile $BULK/bulk/test --timestamp 1 --size 50 --random 56 --rows 200000 --start 200000 --cols 1
accumulo org.apache.accumulo.test.TestIngest -i uno -u root -p secret --timestamp 1 --size 50 --random 56 --rows 200000 --start 0 --cols 1 --createTable --splits 10
accumulo shell -u root -p secret <<EOF
table test_ingest
importdirectory $BULK/bulk $BULK/fail false
createtable foo
config -t foo -s table.compaction.major.ratio=2
insert r1 f1 q1 v1
flush -t foo -w
scan -t accumulo.metadata -c file
insert r1 f1 q2 v2
insert r2 f1 q1 v3
EOF
)
pkill -9 -f accumulo\\.start

# Phase 2: switch to the branch under test, reusing the existing
# instance and data (--no-deps skips reinitializing Hadoop/ZooKeeper).
cd $ACCUMULO_DIR
git checkout accumulo-1111
git clean -xfd
cd $UNO_DIR
./bin/uno fetch accumulo
./bin/uno install accumulo --no-deps
./install/accumulo*/bin/accumulo-cluster start
(
eval "$(./bin/uno env)"
hadoop fs -ls /accumulo/version
accumulo shell -u root -p secret <<EOF
config -t foo -f table.compaction.major.ratio
scan -t foo -np
scan -t accumulo.metadata -c file
compact -t foo -w
scan -t foo -np
scan -t accumulo.metadata -c file
EOF
# Confirm the data written under 1.9 is still intact after the upgrade.
accumulo org.apache.accumulo.test.VerifyIngest --size 50 --timestamp 1 --random 56 --rows 400000 --start 0 --cols 1
)
It was raised on apache/accumulo#1312 that the testing repo's 1.9 branch was not in a good state, and it was argued that we shouldn't be updating it, instead keeping the 1.9 testing code with the main 1.9 code base, like it has been.
We could merge the 1.9 branch into the master branch with -s ours to preserve the 1.9 history before deleting it.
I'll leave this issue open for at least a few days, for comment, before taking any action.
There are several java files missing license headers in the accumulo-testing repo. We should add headers as needed and add the apache-rat-plugin to the pom.xml so that CI will catch this problem when running mvn verify.
Running ./bin/performance list gives the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/accumulo/testing/performance/tests/SplitBalancingPT (wrong name: target/classes/org/apache/accumulo/testing/performance/tests/SplitBalancingPT)
at java.base/java.lang.ClassLoader.defineClass1(Native Method)
at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1016)
at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:174)
at java.base/jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:802)
at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:700)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:623)
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:581)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
at com.google.common.reflect.ClassPath$ClassInfo.load(ClassPath.java:328)
at org.apache.accumulo.testing.performance.impl.ListTests.main(ListTests.java:34)
When running the bulk import continuous ingest test, it can take a while to generate a good bit of data to start testing. It may be faster to generate a data set once and store it in S3; future tests could then use that data set.
I think it would be interesting to experiment with this, and if it works well, add documentation to the bulk import test docs explaining how to do it. One gotcha with this approach is that anyone running a test needs to be consistent with split points. A simple way to address this would be to store a file of split points in S3 alongside the data.
Authorizations configured for continuous scanners and walkers in accumulo-testing.properties under test.ci.common.auths are incorrectly interpreted as one authorization per character instead of as |-delimited groups. For example, if test.ci.common.auths=SYS,HR,ADM|SYS,ADM is defined, the authorizations chosen are S, Y, S, H, R, and so on, instead of "SYS,HR,ADM" and "SYS,ADM".
The issue looks to be the authValue.split("|") call in ContinuousEnv.java. The code does not account for | being a regex metacharacter, so it must be escaped in the split call, e.g. authValue.split("\\|"), to get the desired behavior.
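The difference is easy to demonstrate. A minimal sketch of the buggy and corrected parsing (the class and helper names are invented, not the actual ContinuousEnv code):

```java
public class AuthSplit {

    /**
     * Buggy: String.split treats its argument as a regex, and | alone is an
     * alternation of two empty branches, so it matches between every
     * character and each character becomes its own token.
     */
    static String[] splitBuggy(String authValue) {
        return authValue.split("|");
    }

    /** Fixed: escape the pipe (Pattern.quote("|") would also work). */
    static String[] splitFixed(String authValue) {
        return authValue.split("\\|");
    }
}
```

With input SYS,HR,ADM|SYS,ADM, the buggy version yields single characters while the fixed version yields the two intended authorization groups.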
With the import-control checkstyle, it is now easier to detect use of non-API code used. The following exceptions should be refactored and removed from the config file:
https://github.com/apache/accumulo-testing/blob/master/contrib/import-control.xml#L35
This error occurs when running any rwalk module within docker:
java.lang.RuntimeException: Failed to connect to zookeeper (localhost:2181) within 2x zookeeper timeout period 30000
at org.apache.accumulo.fate.zookeeper.ZooSession.connect(ZooSession.java:157)
at org.apache.accumulo.fate.zookeeper.ZooSession.getSession(ZooSession.java:201)
at org.apache.accumulo.fate.zookeeper.ZooReader.getSession(ZooReader.java:42)
at org.apache.accumulo.fate.zookeeper.ZooReader.getZooKeeper(ZooReader.java:46)
at org.apache.accumulo.fate.zookeeper.ZooCache.getZooKeeper(ZooCache.java:148)
at org.apache.accumulo.fate.zookeeper.ZooCache.access$900(ZooCache.java:48)
at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:406)
at org.apache.accumulo.fate.zookeeper.ZooCache$2.run(ZooCache.java:379)
at org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:271)
at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:434)
at org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:364)
at org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:398)
at org.apache.accumulo.core.clientImpl.Tables.getTableMap(Tables.java:179)
at org.apache.accumulo.core.clientImpl.Tables.getTableMap(Tables.java:167)
at org.apache.accumulo.core.clientImpl.Tables.getNameToIdMap(Tables.java:151)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.exists(TableOperationsImpl.java:192)
at org.apache.accumulo.testing.randomwalk.bulk.Setup.visit(Setup.java:49)
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:237)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
It would be nice if there was a test suite that developers and users could run on an Accumulo instance to sanity check an upgrade/install and verify the basic functionality of Accumulo.
It would be helpful if the JSON contained units like 'milliseconds' or 'ms'
It would be nice to be able to generate bulk import data for the CI test that covers a subset of the table instead of the entire table. This may be possible with -o min=y -o max=z command line options; I'm not sure. If it is possible, we could update the example test scripts to suggest using it.
I am testing the 2.0 branch at commit 9e32263e7 using Uno and keep seeing this same error while trying to run RW Concurrent module:
2019-06-20 14:29:52,354 [testing.randomwalk.Framework] INFO : Running random walk test with module: Concurrent.xml
2019-06-20 14:30:19,161 [testing.randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node ct.BulkImport
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: org.apache.accumulo.core.client.AccumuloException: Bulk import directory /tmp/concurrent_bulk/b_4640b9365aa878f3 does not exist!
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.checkPath(TableOperationsImpl.java:1173)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.importDirectory(TableOperationsImpl.java:1197)
at org.apache.accumulo.testing.randomwalk.concurrent.BulkImport.visit(BulkImport.java:134)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
2019-06-20 14:30:19,163 [testing.randomwalk.Framework] INFO : Test finished
The BulkImport module is currently using the old deprecated importDirectory but it should still work.
The bin/cingest bulk command starts a MapReduce job that generates data to bulk import. When the job runs, its name is accumulo-testing-shaded.jar. The code should set a better name. Could possibly use the following for inspiration.
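A hypothetical fragment of what the generation driver could do when setting up the job; the driver class and job name here are made up for illustration, not the actual accumulo-testing code:

```java
// Give the MapReduce job a descriptive name instead of letting it
// default to the name of the shaded jar.
Job job = Job.getInstance(hadoopConfig);
job.setJarByClass(BulkGenerator.class); // hypothetical driver class
job.setJobName("ci-bulk-generate");
```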
Accumulo-testing has MapReduce jobs that needed the following configuration added to work with Hadoop 3. This ticket is to revisit the configuration added in #49 so clients don't have to specify the location of the Hadoop home directory on servers.
hadoopConfig.set("yarn.app.mapreduce.am.env", "HADOOP_MAPRED_HOME=" + hadoopHome);
In #24 a PT was added for lots of short random scans. It would be nice to have another PT for long-running scans. For example, measure the performance of reading X million entries from an Accumulo table with multiple tablets. Could also measure running 1, 2, 4, 8, and 16 concurrent long-running scans.
I wrote the following test while working on apache/accumulo#990. This could be turned into a performance test.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map.Entry;
import java.util.Random;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.admin.NewTableConfiguration;
import org.apache.accumulo.core.conf.Property;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.core.util.FastFormat;

public class HerdTest {

  private static final byte[] E = new byte[] {};
  private static final byte[] FAM = "pinky".getBytes();
  private static final int NUM_ROWS = 1_000_000;
  private static final int NUM_COLS = 10;

  public static void main(String[] args) throws Exception {
    Connector conn = CmdUtil.getConnector();
    if (!conn.tableOperations().exists("herd")) {
      // Enable the block cache so all threads contend for the same cached blocks.
      conn.tableOperations().create("herd", new NewTableConfiguration().setProperties(
          Collections.singletonMap(Property.TABLE_BLOCKCACHE_ENABLED.getKey(), "true")));
      write(conn);
      conn.tableOperations().flush("herd", null, null, true);
    }
    testHerd(conn, 32);
  }

  private static void testHerd(Connector conn, int numThreads)
      throws InterruptedException, ExecutionException {
    ExecutorService tp = Executors.newFixedThreadPool(numThreads);
    // The barrier releases all threads at once so they herd on the same row.
    final CyclicBarrier cb = new CyclicBarrier(numThreads);
    long t1 = System.currentTimeMillis();
    List<Future<?>> futures = new ArrayList<>();
    for (int t = 0; t < numThreads; t++) {
      futures.add(tp.submit(() -> {
        try (Scanner scanner = conn.createScanner("herd", Authorizations.EMPTY)) {
          for (int i = 0; i < 1000; i++) {
            cb.await();
            byte[] row = FastFormat.toZeroPaddedString(i * 1000, 8, 16, E);
            scanner.setRange(Range.exact(new String(row)));
            for (Entry<Key,Value> entry : scanner) {
              // iterate to force the read
            }
          }
        } catch (Exception e) {
          e.printStackTrace();
        }
      }));
    }
    for (Future<?> future : futures) {
      future.get();
    }
    long t2 = System.currentTimeMillis();
    System.out.println(t2 - t1);
    tp.shutdown();
  }

  private static void write(Connector conn) throws Exception {
    try (BatchWriter bw = conn.createBatchWriter("herd", new BatchWriterConfig())) {
      Random rand = new Random();
      for (int r = 0; r < NUM_ROWS; r++) {
        byte[] row = FastFormat.toZeroPaddedString(r, 8, 16, E);
        Mutation m = new Mutation(row);
        for (int c = 0; c < NUM_COLS; c++) {
          byte[] qual = FastFormat.toZeroPaddedString(c, 4, 16, E);
          byte[] val = new byte[32];
          rand.nextBytes(val);
          m.put(FAM, qual, val);
        }
        bw.addMutation(m);
      }
    }
  }
}
When building the testing shaded jar using Accumulo 2.0.0-SNAP and Hadoop 2.8.4, the Hadoop dependencies are not properly converged in the shaded jar. I am seeing warnings like the following.
[WARNING] hadoop-client-api-3.0.2.jar, hadoop-hdfs-client-2.8.4.jar define 1642 overlapping classes:
[WARNING] - org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo$Expiration
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$SetOwnerResponseProto$Builder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetFileLinkInfoRequestProto$Builder
[WARNING] - org.apache.hadoop.hdfs.web.URLConnectionFactory$1
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.XAttrProtos$SetXAttrRequestProto$1
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ModifyCacheDirectiveResponseProto$1
[WARNING] - org.apache.hadoop.fs.XAttr$1
[WARNING] - org.apache.hadoop.hdfs.protocol.CachePoolStats$Builder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ReportBadBlocksResponseProtoOrBuilder
[WARNING] - org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetBlockLocationsResponseProto
[WARNING] - 1632 more...
When many clients write data to the same tablet server at around the same time, their data should be synced to the write-ahead log as a group. If group commit is not working properly it can cause performance problems for many clients that will not be seen with a single client. It would be nice to have a performance test that specifically checks for this.
I created a project to do this in the past. Not sure what state it is in.
https://github.com/keith-turner/mutslam
Would be nice to test group commit performance for mutations and conditional mutations.
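As a toy model of what such a test would measure, here is a stdlib-only sketch (all names hypothetical, nothing here touches Accumulo): writer threads enqueue simulated mutations, and the caller drains and "syncs" whatever has accumulated in one batch, so the number of syncs can be compared against the number of writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class GroupCommitDemo {

    // Simulates group commit: returns {syncs, writesPersisted}.
    static int[] run(int writers, int perWriter) throws InterruptedException {
        int total = writers * perWriter;
        LinkedBlockingQueue<Integer> pending = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        for (int w = 0; w < writers; w++) {
            pool.submit(() -> {
                for (int i = 0; i < perWriter; i++) {
                    pending.add(i); // each add stands in for one client mutation
                }
            });
        }
        int syncs = 0, written = 0;
        List<Integer> batch = new ArrayList<>();
        while (written < total) {
            batch.clear();
            batch.add(pending.take()); // block for at least one mutation
            pending.drainTo(batch);    // then grab everything else queued
            syncs++;                   // one "WAL sync" covers the whole batch
            written += batch.size();
        }
        pool.shutdown();
        return new int[] {syncs, written};
    }

    public static void main(String[] args) throws Exception {
        int[] r = run(8, 200);
        // With group commit working, syncs is far smaller than writes.
        System.out.println(r[0] + " syncs for " + r[1] + " writes");
    }
}
```

A real PT would replace the queue with concurrent BatchWriters against one tablet server and compare throughput at 1 client versus many.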
Continuous Ingest is still using BasicCompactionStrategy, which has been deprecated in favor of the new compaction code. This needs to be replaced with a CompactionPlanner; most likely DefaultCompactionPlanner will work fine. This may just be a configuration change. This would be great to do for the 2.1 release testing.
With the recent changes in accumulo #1828, some of the modules for RWalk now throw permission exceptions. The specific one I ran into while running ./bin/rwalk All.xml, and specifically ./bin/rwalk Security.xml, is below:
ThriftSecurityException(user:system_flash_superheroes_local, code:PERMISSION_DENIED)
at org.apache.accumulo.server.security.SecurityOperation.authenticateUser(SecurityOperation.java:238)
at org.apache.accumulo.server.client.ClientServiceHandler.authenticateUser(ClientServiceHandler.java:150)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.apache.accumulo.core.trace.TraceUtil.lambda$wrapService$6(TraceUtil.java:235)
at com.sun.proxy.$Proxy38.authenticateUser(Unknown Source)
at org.apache.accumulo.core.clientImpl.thrift.ClientService$Processor$authenticateUser.getResult(ClientService.java:2608)
at org.apache.accumulo.core.clientImpl.thrift.ClientService$Processor$authenticateUser.getResult(ClientService.java:2587)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.accumulo.server.rpc.TimedProcessor.process(TimedProcessor.java:63)
at org.apache.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
at org.apache.accumulo.server.rpc.CustomNonBlockingServer$CustomFrameBuffer.invoke(CustomNonBlockingServer.java:114)
at org.apache.thrift.server.Invocation.run(Invocation.java:18)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
There are also several warnings thrown now for each permission in our auditing. Some examples below:
operation: failed; user: system_flash_superheroes_local;
action: changeAuthorizations; targetUser: system_flash_superheroes_local;
authorizations: Towels,Paper,Brush,Asparagus,Fishsticks,PotatoSkins,Ribs,Celery;
exception: ThriftSecurityException(user:system_flash_superheroes_local,
code:PERMISSION
operation: failed; user: root; checking permission DROP_USER on table_flash_superheroes_local denied;
exception: ThriftSecurityException(user:table_flash_superheroes_local,
code:USER_DOESNT_EXIST)
operation: failed; user: table_flash_superheroes_local;
action: revokeTablePermission;
permission: BULK_IMPORT; targetTable: security_flash_superheroes_local; targetUser: system_flash_superheroes_local;;
exception: ThriftSecurityException(user:table_flash_superheroes_local,
code:PERMISSION_DENIED)
This seems to happen for each permission type, with either PERMISSION_DENIED or USER_DOESNT_EXIST.
New one below:
ERROR Framework Error during random walk
java.lang.Exception: Error running node Security.xml
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: org.apache.accumulo.core.client.AccumuloSecurityException: Error BAD_CREDENTIALS for user system_flash_superheroes_local - Username or Password is Invalid
Follow on from issue #130 / PR #140
This testing repository should migrate to log4j2, and any configured console logging should be configured to use STDERR instead of STDOUT in the log4j2 configuration files, so that console output won't interfere with output from executed commands used for scripts (see #130).
When running the performance tests (via ./bin/performance run), ConditionalMutationsPT errors out with the following:
Exception in thread "main" java.lang.NumberFormatException: For input string: "∞"
at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2054)
at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.base/java.lang.Double.parseDouble(Double.java:543)
at org.apache.accumulo.testing.performance.tests.ConditionalMutationsPT.runConditionalMutationsTest(ConditionalMutationsPT.java:104)
at org.apache.accumulo.testing.performance.tests.ConditionalMutationsPT.runTest(ConditionalMutationsPT.java:76)
at org.apache.accumulo.testing.performance.impl.PerfTestRunner.main(PerfTestRunner.java:51)
I think the issue is occurring here: t1 and t2 have a difference of less than 1,000,000,000 nanoseconds (1 second). TimeUnit.NANOSECONDS.toSeconds converts nanos to seconds, rounding down, so any difference of less than one second results in 0 seconds. In this case the 0 ends up in the denominator, which yields "∞" and throws an error when it is parsed. I think this can be corrected by manually converting from nanos to seconds.
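To illustrate, a minimal stdlib sketch of the truncation and the manual fix (the numbers are made up, not from the actual test):

```java
import java.util.concurrent.TimeUnit;

public class NanosToSeconds {
    public static void main(String[] args) {
        long t1 = 0;
        long t2 = 400_000_000L; // 0.4 seconds, in nanoseconds
        long count = 1000;      // hypothetical number of mutations written

        // TimeUnit truncates: 0.4s becomes 0 seconds, so the rate is Infinity.
        long truncated = TimeUnit.NANOSECONDS.toSeconds(t2 - t1);
        System.out.println(truncated);                  // prints 0
        System.out.println((double) count / truncated); // prints Infinity

        // Manual conversion keeps the fractional seconds.
        double seconds = (t2 - t1) / 1_000_000_000.0;
        System.out.println(count / seconds);            // prints 2500.0
    }
}
```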
To debug a recent bulk ingest test I wrote the following summarizer, which counts the number of times each UUID was seen. I used it to count the number of entries each MapReduce job had created.
package test.ci;

import org.apache.accumulo.core.client.summary.CountingSummarizer;

public class CiUuidSummarizer extends CountingSummarizer<String> {
  @Override
  protected Converter<String> converter() {
    // The UUID is the first ':'-separated field of the value; count each one.
    return (k, v, c) -> c.accept(v.toString().split(":")[0]);
  }
}
While working on #156 I realized it would be helpful to be able to run individual performance tests. As it stands, the only option is to run all the tests in order, which takes quite a while. If any of the tests hang or error out, it stops the run of the remaining tests. It would be nice to be able to pass the name of an individual test to run as a parameter, similar to how cingest works with its multiple components.
It doesn't seem there is any place that explains how to run a single Performance test. It would be nice if that was explained somewhere.
I was unable to run continuous ingest for 1.9.2RC1 because of the following problem. I was using hadoop 2.8.4.
java.lang.NoSuchMethodError: org.apache.hadoop.hdfs.HAUtil.isLogicalUri(Lorg/apache/hadoop/conf/Configuration;Ljava/net/URI;)Z
at org.apache.twill.filesystem.FileContextLocation.toURI(FileContextLocation.java:149)
at org.apache.twill.yarn.YarnTwillPreparer.createLocalFile(YarnTwillPreparer.java:446)
at org.apache.twill.yarn.YarnTwillPreparer.createLocalFile(YarnTwillPreparer.java:442)
at org.apache.twill.yarn.YarnTwillPreparer.createAppMasterJar(YarnTwillPreparer.java:468)
at org.apache.twill.yarn.YarnTwillPreparer.access$100(YarnTwillPreparer.java:111)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:338)
at org.apache.twill.yarn.YarnTwillPreparer$1.call(YarnTwillPreparer.java:329)
at org.apache.twill.yarn.YarnTwillController.doStartUp(YarnTwillController.java:97)
at org.apache.twill.internal.AbstractZKServiceController.startUp(AbstractZKServiceController.java:75)
at org.apache.twill.internal.AbstractExecutionServiceController$ServiceDelegate.startUp(AbstractExecutionServiceController.java:175)
at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
at java.lang.Thread.run(Thread.java:748)
I saw this while running a rwalk on a 5 node cluster.
2019-09-12 21:26:51,842 [testing.randomwalk.Module] ERROR: Caught error executing BulkImport
java.util.concurrent.ExecutionException: java.lang.AssertionError: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:206)
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:318)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: java.lang.AssertionError: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doBulkFateOperation(TableOperationsImpl.java:334)
at org.apache.accumulo.core.clientImpl.bulk.BulkImport.load(BulkImport.java:142)
at org.apache.accumulo.testing.randomwalk.concurrent.BulkImport.visit(BulkImport.java:130)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.accumulo.core.client.TableNotFoundException: Table (Id=a0) does not exist (Table (Id=a0) does not exist)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:376)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:342)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doBulkFateOperation(TableOperationsImpl.java:329)
... 11 more
Caused by: ThriftTableOperationException(tableId:a0, tableName:null, op:BULK_IMPORT, type:NOTFOUND, description:Table (Id=a0) does not exist)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result$executeFateOperation_resultStandardScheme.read(FateService.java:3474)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result$executeFateOperation_resultStandardScheme.read(FateService.java:3451)
at org.apache.accumulo.core.master.thrift.FateService$executeFateOperation_result.read(FateService.java:3385)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:88)
at org.apache.accumulo.core.master.thrift.FateService$Client.recv_executeFateOperation(FateService.java:124)
at org.apache.accumulo.core.master.thrift.FateService$Client.executeFateOperation(FateService.java:105)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.executeFateOperation(TableOperationsImpl.java:270)
at org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:353)
... 13 more
Then looking at where the error is thrown, it seems like this could be bad:
public String doBulkFateOperation(List<ByteBuffer> args, String tableName)
    throws AccumuloSecurityException, AccumuloException {
  try {
    return doFateOperation(FateOperation.TABLE_BULK_IMPORT2, args, Collections.emptyMap(),
        tableName);
  } catch (TableExistsException | TableNotFoundException | NamespaceNotFoundException
      | NamespaceExistsException e) {
    // should not happen
    throw new AssertionError(e);
  }
}
PR apache/accumulo#2171 renamed WALOG properties to be WAL instead. There are a few instances inside accumulo-testing that use the deprecated WALOG variant that should be replaced by the WAL counterpart (see #147 as an example of such issue).
Could measure writing large batches of data and writing lots of small batches of data. The large and small batches could be separate tests or the same test.
2019-06-26 16:43:06,014 [testing.randomwalk.Framework] ERROR: Error during random walk
java.lang.Exception: Error running node seq.MapRedVerify
at org.apache.accumulo.testing.randomwalk.Module.visit(Module.java:370)
at org.apache.accumulo.testing.randomwalk.Framework.run(Framework.java:48)
at org.apache.accumulo.testing.randomwalk.Framework.main(Framework.java:92)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
at org.apache.accumulo.testing.randomwalk.sequential.MapRedVerify.visit(MapRedVerify.java:48)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:303)
at org.apache.accumulo.testing.randomwalk.Module$1.call(Module.java:298)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
at java.lang.Thread.run(Thread.java:748)
apache/accumulo#537 was a WAL recovery bug where deleted data could possibly come back. This flaw was not found through testing. It would be nice to delete data in the continuous ingest test to cover this case. This could be done by periodically deleting a previously written set of linked lists. The linked lists would need to be deleted in reverse order to avoid false positives in the test. Could do something like the following.
while (true) {
  // write 1,000,000 linked lists of 25 nodes
  if (random.nextInt(10) == 0) {
    // delete a previously written 1,000,000 linked lists of 25 nodes in reverse order
  }
}
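To make the reverse-order idea concrete, here is a stdlib-only toy (all names hypothetical): nodes are written 0..n-1 with node i referencing node i-1, so deleting in reverse write order guarantees a verifier never observes a live node whose referent is gone, which would otherwise look exactly like lost data.

```java
import java.util.HashSet;
import java.util.Set;

public class ReverseDeleteDemo {

    // Deletes nodes 0..n-1 (node i references node i-1) in reverse write
    // order, checking after every delete that no surviving node references
    // a deleted one. Returns true if the invariant always held.
    static boolean deleteInReverse(int n) {
        Set<Integer> live = new HashSet<>();
        for (int i = 0; i < n; i++) {
            live.add(i);
        }
        // Last-written node first: node i+1 is removed before node i, so
        // there is never a dangling reference mid-delete.
        for (int i = n - 1; i >= 0; i--) {
            live.remove(i);
            for (int node : live) {
                if (node > 0 && !live.contains(node - 1)) {
                    return false; // dangling reference: would be a false positive
                }
            }
        }
        return live.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(deleteInReverse(25)); // prints true
    }
}
```

Deleting in write order instead (head of the chain first) would leave every remaining node pointing at a missing one.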
Original Jira ticket: https://issues.apache.org/jira/browse/ACCUMULO-1982
Main snippet:
It would be nice if we tested that sudoing to the desired user worked
for the current user and gave better error messages in the case of failure.
The code here has changed significantly since the original ticket was created, so it is possible the need for this has lessened, but from my quick investigation I did not find any validation checks.