usdot-jpo-ode / jpo-s3-deposit
SUBMODULE: A generic Kafka-stream-to-S3 depositing module. Packages JSON Kafka streams into files for deposit into research data environments.

Java 39.75% Dockerfile 4.78% HTML 27.93% Shell 16.65% JavaScript 10.89%

jpo-s3-deposit's People

Contributors

codygarver, dan-du-car, dependabot[bot], dmccoystephenson, drewjj, hmusavi, jtbaird, michael7371, mvs5465, mwodahl, paynebrandon, saikrishnabairamoni, schwartz-matthew-bah, snallamothu, tonychen091, tonyenglish


jpo-s3-deposit's Issues

Dockerfile Group Configuration

The Dockerfile should allow the group environment value to be passed through. Currently it is hard-coded to the constant "group1" as seen here.
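One way to fix this is to declare the group as an environment variable with the current constant as its default, so existing deployments keep working. A minimal sketch, assuming the variable name GROUP and the jar path used here are representative (they are illustrative, not copied from the actual Dockerfile):

```dockerfile
FROM openjdk:8-jre-alpine

# Default preserves the current hard-coded behavior; override at run time with
#   docker run -e GROUP=my-consumer-group ...
ENV GROUP=group1

COPY target/jpo-aws-depositor.jar /home/jpo-aws-depositor.jar

# The shell form of CMD expands $GROUP at container start.
CMD java -jar /home/jpo-aws-depositor.jar --group $GROUP
```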

Dynamic Credentials for AWS Firehose/Kinesis support

The USDOT SDC environment is moving from the public AWS cloud to the private USDOT ECS cloud. A new ECS-based restriction requires moving from static credentials (stored in the .env file) for Firehose/Kinesis connections to dynamic (time-limited) credentials. To support this, the SDC is providing a Python script that generates the dynamic credentials; it needs to be run as-is, or could serve as sample code if porting it to Java works better. This script is available from Tony English. The request is to update the jpo-s3-deposit code (https://github.com/usdot-jpo-ode/jpo-s3-deposit/blob/010e221cf96cbf0ebfe9bea9c1bcf5d9d24549f1/src/main/java/us/dot/its/jpo/ode/aws/depositor/AwsDepositor.java) to support dynamic credentials.
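This is not the SDC-provided script, but a sketch of one common Java approach: the AWS SDK's STS assume-role credentials provider issues time-limited session credentials and refreshes them automatically before expiry. The role ARN, session name, and class of client built are assumptions for illustration:

```java
// Sketch only: assumes the AWS SDK v1 (com.amazonaws) used by this project.
import com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider;
import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehoseAsync;
import com.amazonaws.services.kinesisfirehose.AmazonKinesisFirehoseAsyncClientBuilder;

public class DynamicCredentialsExample {
    public static AmazonKinesisFirehoseAsync buildClient(String roleArn, String region) {
        // roleArn and the session name are placeholders; real values would come
        // from SDC-provided configuration rather than static keys in .env.
        STSAssumeRoleSessionCredentialsProvider provider =
            new STSAssumeRoleSessionCredentialsProvider.Builder(roleArn, "jpo-s3-deposit-session")
                .build();

        // The provider transparently re-assumes the role as sessions expire,
        // so the Firehose client never holds long-lived static credentials.
        return AmazonKinesisFirehoseAsyncClientBuilder.standard()
            .withCredentials(provider)
            .withRegion(region)
            .build();
    }
}
```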

Logging Settings

The current logging settings are proving problematic - I'm seeing log files grow in excess of 37 GB on a single container. To help combat this issue, the AWS Depositor should be configured with a default logging level of warn rather than debug. Additionally, the logback.xml file should include a maxFileSize so the log file cannot grow indefinitely.
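A sketch of what that logback.xml could look like; file paths, size caps, and pattern are illustrative values, not the project's actual configuration:

```xml
<!-- Illustrative logback.xml: size-capped rolling file, default level WARN. -->
<configuration>
  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/aws-depositor.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
      <fileNamePattern>logs/aws-depositor.%d{yyyy-MM-dd}.%i.log</fileNamePattern>
      <maxFileSize>100MB</maxFileSize>   <!-- cap per rolled file -->
      <totalSizeCap>1GB</totalSizeCap>   <!-- cap across all rolled files -->
      <maxHistory>7</maxHistory>         <!-- keep at most a week of logs -->
    </rollingPolicy>
    <encoder>
      <pattern>%d %-5level %logger{36} - %msg%n</pattern>
    </encoder>
  </appender>

  <!-- Default to WARN instead of DEBUG. -->
  <root level="WARN">
    <appender-ref ref="FILE"/>
  </root>
</configuration>
```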

It can also be a bit complex to work out the differences between the logback settings and the docker logs. I recommend the documentation be updated to explain these differences and how to align the logback settings with the docker logging settings.

Command Arguments Failing

The latest changes to make API_ENDPOINT an optional field had unintended consequences, and the system will no longer read in the value. As the Apache Commons CLI documentation shows, the Option constructor includes a parameter that determines whether the Option object should take an argument. Several Options in the AwsDepositor.java file should take an argument but currently have this parameter set to false. As a result, no data can be passed in for those parameters.
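The constructor in question is Apache Commons CLI's `Option(String opt, boolean hasArg, String description)`, whose second parameter is the flag being set incorrectly. A sketch of the broken versus fixed form (the option name "s" and its description are illustrative, not the actual values in AwsDepositor.java):

```java
import org.apache.commons.cli.Option;
import org.apache.commons.cli.Options;

public class OptionExample {
    public static Options buildOptions() {
        Options options = new Options();
        // Broken: hasArg=false, so a value passed as "-s my-bucket" is dropped
        // by the parser and the option reads back as null.
        // options.addOption(new Option("s", false, "S3 bucket name"));

        // Fixed: hasArg=true tells the parser this option takes a value.
        options.addOption(new Option("s", true, "S3 bucket name"));
        return options;
    }
}
```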

Topic Subscription Failure

On occasion the S3 depositor fails to subscribe to the Kafka topic. This seems to occur after a reboot of the machine or a restart of the ODE, and the logs show a "LEADER_NOT_AVAILABLE" message for the FilteredOdeBsmJson topic. I've attached an associated log to help debug.
bsm_error.log

System Freeze

While sending data to Firehose, an error occurred but does not appear to have been thrown. Nothing in the logs indicated that the system was stuck, frozen, or in a failed state. The container was online, but the logs show that no messages were processed for the past few days even though the corresponding topic received thousands of messages in that timeframe. Kafka no longer tracked the corresponding consumer group because it had been more than 24 hours since the consumer checked in.

This type of error should be caught and handled rather than allowing the system to become stuck in a failed state with no indication of the failure in the logs.
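One possible shape for that handling, sketched below; the names (`firehose`, `putRecordRequest`, `logger`) are placeholders, not the actual fields in AwsDepositor.java, and failing fast is just one design choice:

```java
// Illustrative sketch: surface Firehose failures instead of going silent.
void depositWithFailureHandling() {
    try {
        firehose.putRecordAsync(putRecordRequest);
    } catch (Exception e) {
        // Log loudly so the failure is visible in the container logs.
        logger.error("Firehose deposit failed", e);
        // Fail fast: let the container restart policy bring the depositor
        // back up, rather than leaving a stuck consumer running until Kafka
        // expires the consumer group.
        throw new IllegalStateException("Firehose deposit failed", e);
    }
}
```

An alternative to failing fast would be a bounded retry with backoff, but either way the error must reach the logs.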

Dev Branch Dockerfile Error

While attempting to run the latest from dev we ran into an issue with the Dockerfile. The CMD line fails to execute due to missing line-continuation characters (`\`) at the end of lines 26 and 27. Once they were added, the module ran.
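For illustration, a multi-line CMD needs a trailing backslash on every line that continues to the next; the flags shown here are hypothetical, not the repo's actual CMD:

```dockerfile
# Each continued line must end with "\"; without it Docker treats the next
# line as a new instruction and CMD fails to execute.
CMD java -jar /home/jpo-aws-depositor.jar \
    --bootstrap-server $DOCKER_HOST_IP:9092 \
    --group $GROUP
```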

No Offset Commit

In AwsDepositor.java, line 144, auto commit is set to false, and there is no manual commit of the Kafka offset. If connectivity is lost, the module therefore doesn't know where to resume from. It also makes it impossible to trace consumer lag for this module, which adds to the difficulty of debugging issues.
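With enable.auto.commit=false, the consumer must commit offsets itself, typically after a batch has been safely deposited. A sketch of that pattern using the standard Kafka consumer API (the `deposit` call and property names besides the Kafka ones are placeholders):

```java
import java.time.Duration;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

// Assumes props already contains bootstrap.servers, group.id, deserializers,
// and enable.auto.commit=false as set on line 144.
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Collections.singletonList(topic));

while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    for (ConsumerRecord<String, String> record : records) {
        deposit(record.value()); // hypothetical deposit-to-S3/Firehose call
    }
    // Commit only after the batch is deposited: a restart resumes from the
    // last committed offset, and consumer-group lag becomes observable.
    consumer.commitSync();
}
```

Committing after the deposit gives at-least-once delivery (a crash between deposit and commit replays the batch), which is usually the right trade-off for a data-archival sink.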
