

common


A collection of useful utility classes and functions. Slowly on the path to deprecation.

testkit - Unit test classes and utilities.

guice - Guice-specific libraries.

core - Catchall collection of utilities.

Using this project as a library

common is published to CodeArtifact. You will need to add a resolver via the sbt-codeartifact plugin to use these libraries.
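A minimal setup might look like the following sketch. The plugin coordinates and version are assumptions (check the sbt-codeartifact releases for current values), and the repository URL is a placeholder for your own CodeArtifact domain:

```scala
// project/plugins.sbt — version is illustrative
addSbtPlugin("io.github.bbstilson" % "sbt-codeartifact" % "0.2.4")

// build.sbt — substitute your CodeArtifact repository URL
codeArtifactUrl := "https://my-domain-123456789012.d.codeartifact.us-west-2.amazonaws.com/maven/private"
```

You will also need AWS credentials that are allowed to call `codeartifact:GetAuthorizationToken` for the domain.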

Releasing new versions

To make a release:

> release

Guideline for Contributing to common

There is no strict process for contributing to common, but here are some general guidelines.

Discuss in Pull Request Code Reviews

If you have implemented something in another repository that you think could be migrated into common, ask reviewers for feedback when issuing your pull request.

Create a GitHub Issue

Feel free to create a GitHub issue in the common project to provide traceability and a forum for discussion.

Use TODO Comments

While working on a task, go ahead and implement the functionality that you think would be a good fit for common, and comment the implementation with a TODO suggesting it belongs in common. An example:

// TODO(mygithubusername): migrate to common
object ResourceHandling {
  type Resource = { def close(): Unit }
  def using[A](resource: => Resource)(f: Resource => A): A = {
    val r = resource
    try {
      f(r)
    } finally {
      r.close()
    }
  }
}

If you have created a GitHub issue for the common candidate, it is a good idea for traceability to reference the issue number in your TODO comment:

// TODO(mygithubusername): migrate to common. See https://github.com/allenai/common/issues/123
...

Have Two Code Reviewers on common Pull Requests

Try to always have at least two reviewers for a pull request to common.

common's People

Contributors

afader, andrewlmurray, ashish33, bbstilson, codeviking, colinarenz, cristipp, danxmoran, dirkgr, gmjabs, jakemannix, jkinkead, joelgrus, markschaake, matt-gardner, mjlangan, oyvindtafjord, rodneykinney, rreas, ryanai3, sbhaktha, schmmd, smishmash, vha14


common's Issues

(core) Enum is not deserializable due to being an abstract class

Noticed this when trying to use Enum in a distributed application (S2 Crawler). When attempting to deserialize an Enum via default Java deserialization, you'll get a runtime error because Enum is an abstract class whose constructor must be invoked during deserialization. To fix this, we should make Enum a trait instead of an abstract class.
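The proposed shape can be sketched self-contained with a Java-serialization round trip. `Enum` here is a simplified stand-in for the class in common, and `Red` and `roundTrip` are purely illustrative:

```scala
import java.io._

// Simplified stand-in for common's Enum, as a trait: default Java
// deserialization no longer needs to invoke a superclass constructor.
trait Enum extends Serializable { def name: String }

case object Red extends Enum { val name = "red" }

// Round-trip a value through Java serialization.
def roundTrip(a: AnyRef): AnyRef = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(a)
  out.close()
  new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray)).readObject()
}
```

Case objects get a compiler-generated `readResolve`, so the round trip returns the original singleton.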

Use of assembly in common

I'd prefer not to use this plugin in common. Why is it necessary for the datastore?

assemblySettings

jarName in assembly := "DatastoreCli.jar"

mainClass in assembly := Some("org.allenai.datastore.cli.DatastoreCli")

I think the datastore should become its own repository.

(Datastore) /tmp is being cleaned up

@markschaake found the Arizona solver had crashed due to the datastore change. Thanks to @OyvindTafjord for debugging as well.

[2:19 PM] Mark Schaake: Wow, it is already finished. Looks like something wrong with the Arizona solver?
[2:19 PM] Michael Schmitz: I'm really in no rush... take your time...
[2:22 PM] Mark Schaake: Anybody know why Arizona is failing?
[2:23 PM] Michael Schmitz: I think Dirk changed it to use the datastore--another auth issue?
I'm checking to see if it's in the JSON response from the controller.

2:23 PM
Sunil Mishra joined the room
[2:24 PM] Michael Schmitz: 
              "analysis": {
                "featureVectors": {
                  "questionHash": "c25fcea7",
                  "answerKey": "(A)",
                  "numericFeatures": {
                    "bestInference": 1,
                    "scoreInference": 0.23759467569042972,
                    "questionLength": 44,
                    "softmaxScoreInference": 0.2707396438410162,
                    "questionParensCount": 5,
                    "maxEdgeAlignmentScore": 0.6477272727272727,
                    "diffFromMedianArizona": 0,
                    "diffFromMedianInference": 0.0797324583374445,
                    "setupIsaCount": 2,
                    "bestArizona": 0,
                    "normalizedScoreArizona": 0,
                    "normalizedScoreInference": 0.38188240196415446,
                    "queryIsaCount": 1,
                    "consequentsIsaCount": 1,
                    "antecedentsIsaCount": 2,
                    "inferenceSupportsCount": 0,
                    "softmaxScoreArizona": 0,
                    "scoreArizona": 0,
                    "minWordAlignmentScore": 0.29545454545454547
                  }
                }

All arizona features are 0
@jkinkead should there be a report that Arizona failed in the response?

2:24 PM
Sam Skjonsberg left the room (user disconnected)
[2:25 PM] Mark Schaake: In the arizona-solver.out log file, I get this:
[2:25 PM] Mark Schaake: java.io.FileNotFoundException: /tmp/ai2-datastore-cache/public/org.allenai.ari.solvers.arizona/SistaNLP-d1/index/campbellsbio/index/index/_8.prx (No such file or directory)
[2:25 PM] Michael Schmitz: ...
it's a bit surprising to have all of these issues all of a sudden
there was an early iteration of his code that used a path macos cleans up automatically
but that doesn't make sense here...
try restarting it?
[2:26 PM] Jesse Kinkead: the failure would be in the trail or in the answer from arizona
not in the feature vectors
[2:26 PM] Mark Schaake: I'll restart Arizona now and see if it fixes itself
[2:26 PM] Michael Schmitz: if I search for "Arizona" I don't see much
Is Arizona enabled on the frontend?
I used to have an option to select the solvers I wanted.
http://ari.dev.allenai.org/?mode=debug#/
[2:27 PM] Oyvind Tafjord: That file is indeed not there, but there are a bunch of others in the same directory
I wonder if it could be due to partial /tmp cleanup in the OS? That turned out to be a killer on OSX for the datastore caching
[2:27 PM] Jesse Kinkead: @MichaelSchmitz - if there's an error, it should show up at the bottom, in the trail
[2:27 PM] Mark Schaake: Your hash is in the wrong spot. Try http://ari.dev.allenai.org/#/?mode=debug
[2:27 PM] Jesse Kinkead: look at the bottom of: http://ari.dev.allenai.org:8080/ask?text=Which+part+of+a+plant+produces+the+seeds%3F+%28A%29+flower+...
[2:27 PM] Oyvind Tafjord: pwd
[2:28 PM] Jesse Kinkead: there *is* an exception on that URL i sent
[2:28 PM] Michael Schmitz: Thanks @MarkSchaake , and yes @jkinkead I get a nice exception
[2:30 PM] Mark Schaake: Also, Arizona logging is not outputting the errors to the arizona-solver.log file - only to the arizona-solver.out file
[2:30 PM] Michael Schmitz: I'm guessing that's not our fault... but yes it is unfortunate...
[2:30 PM] Mark Schaake: restart did not fix anything

Update library versions in common

Update to the latest scala, 2.10.4
Update to the latest sbt, 0.13.2
Update to the latest scalatest, 2.1.3
etc.

The goal is to enable strict dependency checking in the solvers project (set the conflict manager to ConflictManager.strict), which currently reports the Scala version in use as conflicting with the version in common.
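For reference, enabling strict checking is a one-line sbt setting (a sketch for sbt 0.13):

```scala
// build.sbt: fail resolution on any dependency version conflict
conflictManager := ConflictManager.strict
```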

Docs for org.allenai.json

I'd love to see some documentation (even just a README file) for the org.allenai.json package and its associated goodness!

Redundant calls to Producer.create when producing Iterator output

If a Producer produces an Iterator instance, the create method will be called twice.

https://github.com/allenai/common/blob/master/pipeline/src/main/scala/org/allenai/pipeline/Producer.scala#L16

The value computed within cachedValue is thrown away. In most cases this results in no unnecessary computation, because the Iterator is discarded without being consumed. However, it's possible for the create method to consume resources (e.g. https://github.com/allenai/common/blob/master/pipeline/src/main/scala/org/allenai/pipeline/Producer.scala#L16, which consumes the input Iterator).

Workaround is to call disableCaching on Producers that consume their input before their output is consumed.

(Common) add timing methods that handle Futures

We need to be able to time blocks that execute inside scala.concurrent.Futures.

See: https://github.com/allenai/ari-core/blob/ef4bd14305e9e63a955f9503ded0bd9e3ca18ca7/models/src/main/scala/org/allenai/ari/models/traits/Trailable.scala#L57 for an example.

Examples could be:

/** Maps over the future and captures the time from dispatch of the future
  * until the map callback runs.
  */
def timedFuture[A](future: => Future[A]): Future[(A, Duration)]

/** Creates a Future[(A, Duration)] by executing the block 
  * in a Future with the provided ExecutionContext
  * The timing is true to the time it took to execute the block. Contrast this to
  * `timedFuture` which calculates the elapsed time between future dispatch
  * and future callback (i.e. not a true elapsed time of the block execution).
  */
def timedInFuture[A](block: => A)(implicit ec: ExecutionContext): Future[(A, Duration)]
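A minimal sketch of how both could be implemented, using System.nanoTime. Note one assumption: `timedFuture` here takes an implicit ExecutionContext (needed for the `map`), which the signature above omits:

```scala
import scala.concurrent.{ ExecutionContext, Future }
import scala.concurrent.duration._

object Timing {
  /** Time from dispatch of the future until the map callback runs.
    * (Implicit ExecutionContext added for the map; not in the original sketch.)
    */
  def timedFuture[A](future: => Future[A])(implicit ec: ExecutionContext): Future[(A, Duration)] = {
    val start = System.nanoTime()
    future.map(a => (a, (System.nanoTime() - start).nanos))
  }

  /** Time only the execution of the block, measured inside the Future. */
  def timedInFuture[A](block: => A)(implicit ec: ExecutionContext): Future[(A, Duration)] =
    Future {
      val start = System.nanoTime()
      val a = block
      (a, (System.nanoTime() - start).nanos)
    }
}
```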

Pipeline test cases fail

[info] TestArtifactIo:
[info] JSONFormat
[info] - should persist primitive types in Tuples *** FAILED ***
[info]   java.lang.IllegalArgumentException: requirement failed: Unable to create ioTest.json
[info]   at scala.Predef$.require(Predef.scala:233)
[info]   at org.allenai.pipeline.FileArtifact.write(FileArtifact.scala:27)
[info]   at org.allenai.pipeline.LineIteratorIo.write(ArtifactIo.scala:78)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:56)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:51)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$1.apply$mcV$sp(TestArtifactIO.scala:25)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$1.apply(TestArtifactIO.scala:17)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$1.apply(TestArtifactIO.scala:17)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   ...
[info] TSVFormat
[info] - should persist case classes *** FAILED ***
[info]   java.lang.IllegalArgumentException: requirement failed: Unable to create tsvTest.txt
[info]   at scala.Predef$.require(Predef.scala:233)
[info]   at org.allenai.pipeline.FileArtifact.write(FileArtifact.scala:27)
[info]   at org.allenai.pipeline.LineIteratorIo.write(ArtifactIo.scala:78)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:56)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:51)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$2.apply$mcV$sp(TestArtifactIO.scala:43)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$2.apply(TestArtifactIO.scala:35)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$2.apply(TestArtifactIO.scala:35)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   ...
[info] TSVFormat
[info] - should persist Tuples *** FAILED ***
[info]   java.lang.IllegalArgumentException: requirement failed: Unable to create tsvTupleTest.txt
[info]   at scala.Predef$.require(Predef.scala:233)
[info]   at org.allenai.pipeline.FileArtifact.write(FileArtifact.scala:27)
[info]   at org.allenai.pipeline.LineIteratorIo.write(ArtifactIo.scala:78)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:56)
[info]   at org.allenai.pipeline.LineCollectionIo.write(ArtifactIo.scala:51)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$3.apply$mcV$sp(TestArtifactIO.scala:59)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$3.apply(TestArtifactIO.scala:53)
[info]   at org.allenai.pipeline.TestArtifactIo$$anonfun$3.apply(TestArtifactIO.scala:53)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   ...

(Common) add testkit subproject

Would be nice to have a testkit library to which we could add test helpers and dependencies on other testing libraries. Then, in our other projects we can just depend on the testkit project.

This approach also has the advantage of encouraging standardized testing approaches. For example, we could define the following abstract classes:

import org.scalatest._
import akka.actor.ActorSystem
import akka.testkit.TestKit

/** Regular unit test */
abstract class Ai2Test extends FlatSpec with Matchers

/** Actor unit test */
abstract class Ai2ActorTest(_system: ActorSystem) extends TestKit(_system) with Ai2Test

Actual tests would then be declared like:

class MyTest extends Ai2Test {
...
}

class MyActorTest extends Ai2ActorTest(testActorSystem) {
...
}

Other helpers that would be nice to have:

  • Dealing with Futures (i.e. Await.result boilerplate)
  • Test actor system management

Race condition in FileArtifact

There is a race condition at https://github.com/allenai/common/blob/master/pipeline/src/main/scala/org/allenai/pipeline/FileArtifact.scala#L15

If multiple artifacts are created simultaneously in the same directory, this requirement can fail: both callers will see that the directory doesn't exist, the first will create it, and the second's mkdirs call will fail. The logic should be:

require(!parentDir.exists || parentDir.isDirectory, "Invalid path")
if (!parentDir.exists) parentDir.mkdirs()
require(parentDir.exists, "Unable to create directory")
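An alternative that sidesteps the race entirely is java.nio's `Files.createDirectories`, which is a no-op when the directory already exists, so concurrent callers cannot race on an exists/mkdirs check. A sketch (`ensureDir` is an illustrative name, not part of common):

```scala
import java.nio.file.{ Files, Path }

// Files.createDirectories succeeds even if another thread or process
// creates the directory first, eliminating the check-then-act race.
def ensureDir(parentDir: Path): Path = {
  Files.createDirectories(parentDir)
  require(Files.isDirectory(parentDir), s"Unable to create directory: $parentDir")
  parentDir
}
```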

Add JSON format for Exceptions

Now that we have a dependency on spray-json, we can start defining JsonFormats in common for use in multiple projects. One that is commonly needed is Exception serialization.
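A library-agnostic sketch of the fields such a format could capture; `exceptionFields` is an illustrative helper, and a spray-json implementation would emit the same fields as a JsObject:

```scala
// Serialize the class name, message, and stack trace. Deserialization can
// only reconstruct a generic exception, since the original exception class
// may not exist on the reading side.
def exceptionFields(t: Throwable): Map[String, String] = Map(
  "class" -> t.getClass.getName,
  "message" -> Option(t.getMessage).getOrElse(""),
  "stackTrace" -> t.getStackTrace.map(_.toString).mkString("\n")
)
```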

bintray publishing is broken since 1.0.5

File from 1.0.4: common-core_2.11-1.0.4-javadoc.jar
File from 1.0.5 (broken): org.allenai.common:common-core_2.11_2.11-1.0.5-javadoc.jar
File from 1.0.9 (broken): org.allenai.common:common-core_2.11_2.11-1.0.9-javadoc.jar

Migrating from bintray resolver for a public project

I have an old public project (https://github.com/allenai/pdffigures2) that depends on common.

It used to have a jfrog bintray resolver that allowed it to include common as a dependency; that service is no longer available, breaking the build.

I understand I should be able to fix this by using the newer sbt-codeartifact plugin, but I can't figure out exactly how to go about doing that. I have tried adding the plugin and setting:

codeArtifactUrl := "https://org-allenai-s2-896129387501.d.codeartifact.us-west-2.amazonaws.com/maven/private"

But I am getting software.amazon.awssdk.services.codeartifact.model.AccessDeniedException: User is not authorized to perform: codeartifact:GetAuthorizationToken on resource: arn:aws:codeartifact:us-west-2:896129387501:domain/org-allenai-s2 because no resource-based policy allows the codeartifact:GetAuthorizationToken action.

Am I missing something? Is there a way to fix pdffigures? Several external people have asked us to fix the repo, so it seems it is still in use.

Easy API for accessing Nested Object Properties in a JsObject

It'd be great if we extended the get API to support queries for deeply nested properties. I could see both nested object property support like so:

 json.get[JsObject]("prop.anotherProp")

And also the ability to select items at a specific index of a JsArray, like so:

 json.get[String]("prop.anotherProp.someOtherProp[3]")
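The dotted-path lookup can be sketched self-contained. `JsVal`/`JsObj`/`JsStr` below are minimal stand-ins for spray-json's JsValue types, and the `[3]` array-index syntax is left out for brevity:

```scala
// Minimal stand-in for spray-json's JsValue hierarchy.
sealed trait JsVal
final case class JsObj(fields: Map[String, JsVal]) extends JsVal
final case class JsStr(value: String) extends JsVal

// Walk a dotted path ("prop.anotherProp") through nested objects,
// returning None if any segment is missing or not an object.
def getPath(json: JsVal, path: String): Option[JsVal] =
  path.split('.').foldLeft(Option(json)) {
    case (Some(JsObj(fields)), key) => fields.get(key)
    case _ => None
  }
```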

Update readme

The readme contains some instructions about "installing the right version" of elasticsearch, which need a small update after the elasticsearch upgrade by @ckarenz:
e352caa
