donut's People

Contributors

adamjarling, bmquinn, carrickr, csyversen, davidschober, dependabot-preview[bot], kdid, mbklein, toputnal

donut's Issues

Host fits zip in an S3 bucket

Relates to #89

We probably shouldn't rely on Harvard for hosting this zip file since it stopped working for us last week.

Done looks like:

Fits zip file uploaded to an S3 bucket, and .ebextensions/01_packages.config updated to point to our hosted version.

Thumbnails not showing up

Here's a good example: http://donut.repo.rdc-staging.library.northwestern.edu/concern/images/cee2e75c-1d2e-4551-a46b-661878aa9b5d?locale=en#?c=0&m=0&s=0&cv=0&xywh=-783%2C-58%2C2588%2C1137

The images show up in the universal viewer, but there aren't any representative images showing up on the #show page.

I'm seeing this error in the logs:

I, [2018-01-23T18:23:09.387540 #26275]  INFO -- : [239a4161-c3be-4b85-a51b-0e01a357da65] Started POST "/" for 127.0.0.1 at 2018-01-23 18:23:09 +0000
D, [2018-01-23T18:23:09.427695 #26275] DEBUG -- : [239a4161-c3be-4b85-a51b-0e01a357da65]   Load LDP (21.5ms) http://fcrepo.repo.vpc.rdc-staging.library.northwestern.edu/rest/bb/52/0f/f8/bb520ff8-9d94-47db-9107-cd0b275b9ad0 Service: 47398585891020
D, [2018-01-23T18:23:09.493206 #26275] DEBUG -- : [239a4161-c3be-4b85-a51b-0e01a357da65]   Hyrax::Operation Load (1.8ms)  SELECT  "curation_concerns_operations".* FROM "curation_concerns_operations" WHERE "curation_concerns_operations"."id" = $1 LIMIT $2  [["id", 167], ["LIMIT", 1]]
F, [2018-01-23T18:23:09.495636 #26275] FATAL -- : [239a4161-c3be-4b85-a51b-0e01a357da65]
F, [2018-01-23T18:23:09.496104 #26275] FATAL -- : [239a4161-c3be-4b85-a51b-0e01a357da65] ActiveRecord::RecordNotFound (Couldn't find Hyrax::Operation with 'id'=167):
F, [2018-01-23T18:23:09.496198 #26275] FATAL -- : [239a4161-c3be-4b85-a51b-0e01a357da65]
F, [2018-01-23T18:23:09.496325 #26275] FATAL -- : [239a4161-c3be-4b85-a51b-0e01a357da65] activerecord (5.1.4) lib/active_record/relation/finder_methods.rb:343:in `raise_record_not_found_exception!'

which is weird, because I can pull up that record in the Rails console. Maybe it's a race condition or something?

Anyway, I'm looking into this now.

Deal with Admin Sets in batches

Description

Per our workflow, a work needs to be in exactly one admin set, so our spreadsheet batch ingestion needs an admin set column that takes an admin set ID.

Done looks like

  • Column added to batch spreadsheet that requires admin set ID
  • Validation takes place that ensures admin sets are there.
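
One way the column check could look, as a minimal sketch rather than donut's actual importer code. The column name `admin_set_id` and the `known_admin_set_ids` list are assumptions made for illustration; the real lookup would go against the repository.

```ruby
require "csv"

# Hypothetical column name; the issue only says the column takes an admin set ID.
ADMIN_SET_COLUMN = "admin_set_id"

# Returns an array of error strings; an empty array means the sheet passed.
# known_admin_set_ids stands in for a lookup against the repository.
def admin_set_errors(csv_text, known_admin_set_ids)
  table = CSV.parse(csv_text, headers: true)
  unless table.headers.include?(ADMIN_SET_COLUMN)
    return ["missing required column: #{ADMIN_SET_COLUMN}"]
  end

  errors = []
  table.each_with_index do |row, index|
    id = row[ADMIN_SET_COLUMN].to_s.strip
    line = index + 2 # line 1 of the sheet is the header row
    if id.empty?
      errors << "line #{line}: admin set ID is blank"
    elsif !known_admin_set_ids.include?(id)
      errors << "line #{line}: unknown admin set ID #{id}"
    end
  end
  errors
end
```

Returning a list of messages instead of raising on the first problem lets a whole batch be checked in one pass before any work is created.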

Create Job That Deletes Masterfiles in the pending bucket after ingest success

Once CreateWorkJob has successfully ingested a resource, CreateWorkJob should enqueue a cleanup job for that masterfile. This could be done via hooks or by calling out to super for CreateWorkJob and then adding in desired code.

This job should delete the file from the pending bucket (#35)

Done looks like

  • job is written that cleans up after a successful ingest.
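
A rough sketch of the cleanup step, written as plain Ruby so it runs without Rails. `PendingFileCleanup` and `ingest_then_cleanup` are hypothetical names, and the block stands in for the work CreateWorkJob does. The only real API assumed is the `delete_object(bucket:, key:)` call shape of `Aws::S3::Client`.

```ruby
# Deletes one masterfile from the pending bucket.
# `client` is anything that responds to delete_object(bucket:, key:),
# which is the call shape of Aws::S3::Client#delete_object.
class PendingFileCleanup
  def initialize(client:, bucket:)
    @client = client
    @bucket = bucket
  end

  def perform(key)
    @client.delete_object(bucket: @bucket, key: key)
    true
  end
end

# Stand-in for the ingest job: runs the ingest block and cleans up only
# when it returns a truthy value. If the block raises, no cleanup happens
# and the file stays in the pending bucket for a retry.
def ingest_then_cleanup(key, cleanup)
  ingested = yield
  cleanup.perform(key) if ingested
  ingested
end
```

In the real application the cleanup would more likely be its own queued job, enqueued from a hook on CreateWorkJob, so that a failed delete can be retried independently of the ingest.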

Fix deprecation warnings

This is just good practice and we'll thank ourselves later.

After Carrick's-update-to-the-latest-hyrax branch passes and is merged I'll start fixing the warnings

Compare current validation with BFF spreadsheet and Image model required fields

Primarily for Berkeley at this point: given JSON, validate it and determine whether a resource can be created. If it cannot, write an error to the log.

For MVP required validations are:

  • File is present in the S3 bucket
  • It has a title
  • It has a collection to put it in

Done Looks Like

  • Update validator to conform to Northwestern model.
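
The three MVP checks could be sketched like this. The JSON field names (`file`, `title`, `collection`) and the `available_keys` list are assumptions for illustration; the real validator would query the S3 bucket and follow the Northwestern model.

```ruby
require "json"
require "logger"

# Returns true when a resource can be created; otherwise logs each problem
# and returns false. available_keys stands in for a listing of the S3 bucket.
def validate_resource(json_text, available_keys, logger)
  record = JSON.parse(json_text)
  unless record.is_a?(Hash)
    logger.error("expected a JSON object")
    return false
  end

  errors = []
  errors << "file not found in bucket: #{record['file'].inspect}" unless available_keys.include?(record["file"])
  errors << "title is missing" if record["title"].to_s.strip.empty?
  errors << "collection is missing" if record["collection"].to_s.strip.empty?
  errors.each { |message| logger.error(message) }
  errors.empty?
rescue JSON::ParserError => e
  logger.error("invalid JSON: #{e.message}")
  false
end
```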

Trigger ingest from S3 add/update

When an ingest manifest spreadsheet is added to the correct S3 bucket, trigger ingest via the queue.

  • Jobify the existing command line app
  • Rewrite the command line app to run through the job class
  • Set up the S3 notification and queueing
  • Carrick will test out batch import
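
As a sketch of the notification side: an S3 event notification is a JSON document with a `Records` array, and each record carries the bucket name and a URL-encoded object key. The helper below pulls out newly created manifests. The `.csv` suffix and the method name are assumptions, and it expects the raw S3 event body (an SNS-wrapped message would need unwrapping first).

```ruby
require "json"
require "uri"

# Extracts [bucket, key] pairs for newly created manifest files from an
# S3 event notification body. Object keys arrive URL-encoded (spaces as
# "+"), so they are decoded before being returned.
def manifests_from_s3_event(body, suffix: ".csv")
  records = JSON.parse(body).fetch("Records", [])
  pairs = records.map do |record|
    next unless record["eventName"].to_s.start_with?("ObjectCreated:")

    bucket = record.dig("s3", "bucket", "name")
    key = URI.decode_www_form_component(record.dig("s3", "object", "key").to_s)
    [bucket, key] if key.end_with?(suffix)
  end
  pairs.compact
end
```

Each returned pair could then be handed to the ingest job class, so the command line app and the queue share one code path.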

Fix derivative creation for import_from_s3 import script

When running our new import_from_s3 script, records are being imported and show up in donut but no file derivatives are showing up. We should see the coffee and library thumbnails but we're just getting the placeholder thumbnails instead.

My guess is that this has something to do with pulling the binaries from s3 to create derivatives and making sure we're hitting the remote_files part of the actor stack

Demo Rake Task Powered Ingest of a CSV

  • Ingests using CSV populated with Berkeley Metadata
  • Records display in DONUT
  • Records in DONUT have metadata
  • Records in DONUT have derivatives
  • Records in DONUT are owned by the nul-ingest user
  • Errors are logged into the environment (development, test, production) log file

Investigate user key issue

Description

The user model is storing escaped email strings as ids, which seems to break things like deleting a user from a role. For example, first.last@northwestern is getting stored as the user key but trying to delete the user from role fails with a user key not found error looking for [email protected].

This might be solved by storing the netid as the user key in User.rb by changing to

  def to_s
    username
  end

Done looks like:

  • the proper user key is used
  • users can be deleted from roles, etc. without not found errors

Allow Authority Driven Dropdown for CREATOR ROLE

Description

(Breakout of Issue https://github.com/nulib/next-generation-repository/issues/90)
-- for CREATOR ROLE
As a Collection Manager, I want to have authorities attached to certain fields and be able to grab them from a drop down menu (editing) so that I don't have to worry about editors putting in inconsistent information.

Here is an example of using the relator endpoint through our local questioning authority:
http://devbox.library.northwestern.edu/authorities/search/loc/relators?q=art

Done Looks Like

  • A dropdown for single authority added

Exclude CreateWithRemoteFilesActor from rubocop

Right now we're overriding CreateWithRemoteFilesActor from Hyrax so we can exclude the area where it's encoding the URL one too many times, but it's also making rubocop upset.

Since it's not our file, we shouldn't really care if it's violating rubocop rules, and it should be excluded from its checks.

Simplify minio setup

Right now donut requires minio running to mimic s3, have a bucket created, and then that bucket needs to be populated to test out our import feature. It requires a few files to be created in the user's home directory, an environment variable or two to be set up, and the aws cli scripts have to be run manually every time minio goes up or down.

It's great that we can mimic s3 locally and help speed up dev without relying on outside sources, but it's become unwieldy, and missing any one of those steps means that your tests will fail or give you a false positive. I'm going to try to organize and automate this as much as possible.

Verify Various Failure States Log Errors

Description

Once derivatives are being created successfully, force a failure and follow it through the logs to verify that we're logging it. Besides #38, where an error is written if the metadata is invalid, ensure all the other edge cases write out errors, namely:

  • Fedora timeouts
  • Derivative Failures
  • File not found on S3
  • Error opening/reading CSV
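
A small wrapper like the following is one way to make every step log its failure in the same shape. This is a sketch only: `with_error_logging` is a hypothetical helper, and the step names are whatever the caller passes in.

```ruby
require "logger"

# Runs one ingest step. On failure, logs the step name, error class and
# message, then returns nil so the caller can decide whether to continue.
def with_error_logging(step, logger)
  yield
rescue StandardError => e
  logger.error("#{step} failed: #{e.class}: #{e.message}")
  nil
end
```

Because `Errno::ENOENT`, `Timeout::Error` and CSV parse errors all descend from `StandardError`, the same wrapper covers a missing file, a timeout, and an unreadable CSV. Forcing each of those failures in turn and searching the log for the matching "failed:" line is then a quick way to verify the cases listed above.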

Files not present for derivative generation

We're getting this error message when trying to create derivatives on AWS

Errno::ENOENT (No such file or directory @ rb_sysopen - /var/donut-temp/hyrax/uploaded_file/file/34/<filename>.jpg)

On our EB worker instance, the /var/donut-temp folder exists but there are no subfolders under it.

So the file from s3 isn't being copied over to a temp folder and no derivatives are being created. We need to figure out why and where it's happening.

FITS issues on donut workers

I was just testing our import from s3 job on AWS and the jobs are failing on fits:

E, [2018-01-18T17:50:40.060580 #26161] ERROR -- : [fee2029b-36e4-4401-abea-770862632455] [ActiveJob] [CharacterizeJob] [576bc9c2-67f1-4be4-aa0f-dc9a8b88ee92] Error performing CharacterizeJob (Job ID: 576bc9c2-67f1-4be4-aa0f-dc9a8b88ee92) from BetterActiveElasticJob(default) in 140.21ms: RuntimeError (Unable to execute command "/usr/local/fits-1.0.5/fits.sh -i "/tmp/d20180118-26161-5qzrty/coffee.jpg""
Picked up JAVA_TOOL_OPTIONS: -Xmx128m
Error: Could not find or load main class edu.harvard.hul.ois.fits.Fits
):
/opt/rubies/ruby-2.4.2/lib/ruby/gems/2.4.0/gems/hydra-file_characterization-0.3.3/lib/hydra/file_characterization/characterizer.rb:51:in `internal_call'

Missed on first pass: AuthoritySelect field for CREATOR

Description

We missed this one in the initial round...

(Breakout of Issue https://github.com/nulib/next-generation-repository/issues/90)
-- for CREATOR

As a Collection Manager, I want to have authorities attached to certain fields and be able to grab them from a drop down menu (editing) so that I don't have to worry about editors putting in inconsistent information.

Spreadsheet: https://docs.google.com/spreadsheets/d/1F35hLSD11a1mf9UTXvgAc7xKXkAixOaBwVvzYQulnkc/edit#gid=396400352

Done Looks Like

  • An AuthoritySelect dropdown plus autocomplete is added.

Clean start with hyrax 3

Description

Since donut was started on Hyrax before 1.0 was released (I think), there might be generated views, configs, controllers, etc. that were applicable at the time they were generated but have since been refactored away or are no longer needed.

Carrick and I were talking about starting fresh with Hyrax 2 and bringing over our customizations and configs from donut, but we think a more appropriate time to do that will be when Hyrax 3 is released, since that'll be valkyrie based and will be significantly different than hyrax 2 anyway.

So once Hyrax 3 is released and we're ready to transition Donut to it, we should start a new rails project, run all the updated generators, and then carefully bring over our customizations and configs and refactor where needed.

Create upstream pr from controlled vocabulary

Description

Determine what controlled vocab updates implemented locally would benefit or be appropriate for Hyrax core. At a minimum, see if Authority Select works with multiple items in Hyrax core, and fix that. Up for debate whether controlled vocab mixed with Authority Select is a generic enough use case for users outside of Northwestern.

Done Looks Like

  • Make a decision whether local controlled vocab / authority select updates are necessary for Hyrax core.
  • If yes, do the work.

Tasks

  • Decide whether local controlled vocab / authority select updates are necessary for Hyrax core.
  • If so, update Hyrax and submit a PR.
  • Merge PR back into Donut and check everything still works (create a new ticket for this).

Refactor route for omniauth callbacks

We're getting a deprecation warning: DEPRECATION WARNING: Using a dynamic :action segment in a route is deprecated and will be removed in Rails 5.2. (called from block (2 levels) in <top (required)> at /home/travis/build/nulib/donut/config/routes.rb:15)

here: https://github.com/nulib/donut/blob/deploy/staging/config/routes.rb#L15

We should refactor this sooner rather than later, but Carrick and I weren't sure what the new syntax was and didn't want to spend all day on it. I'm putting in this issue as a reminder that we'll need to change this before rails 5.2 is released (which is kind of soon)

Figure out a way to avoid env checking for S3 urls

Description

URL encoding is handled differently between Minio and S3. This is being handled by checking the Rails environment now, but that is not ideal.

Done looks like:

  • Conditional logic removed from Importer::Factory::ObjectFactory for Rails environment.
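
One possible direction, offered as a sketch and not as what donut or Hyrax does: instead of branching on the Rails environment, normalize every object key so it is percent-encoded exactly once, whether it arrives raw, encoded, or double encoded. All helper names here are hypothetical, and the approach assumes keys never contain a literal escape sequence (such as the characters "%20") that must be preserved.

```ruby
# Characters outside this set get percent-encoded (RFC 3986 unreserved set).
NOT_UNRESERVED = /[^A-Za-z0-9\-._~]/

# Decodes one layer of %XX escapes, working on raw bytes.
def percent_decode(text)
  text.b.gsub(/%\h\h/) { |escape| escape[1, 2].hex.chr }.force_encoding(Encoding::UTF_8)
end

# Encodes every byte outside the unreserved set as %XX.
def percent_encode(text)
  text.b.gsub(NOT_UNRESERVED) { |byte| format("%%%02X", byte.ord) }.force_encoding(Encoding::UTF_8)
end

# Strips any number of encoding layers, then encodes each path segment once.
def encode_key_once(key)
  decoded = key
  loop do
    pass = percent_decode(decoded)
    break if pass == decoded

    decoded = pass
  end
  decoded.split("/", -1).map { |segment| percent_encode(segment) }.join("/")
end
```

With this, `encode_key_once("my file.jpg")`, `encode_key_once("my%20file.jpg")` and `encode_key_once("my%2520file.jpg")` all produce the same singly encoded key, so the caller no longer needs to know which backend produced the value.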

get hyku importer specs passing in donut

The specs from hyku run in donut successfully now, but not all of them are passing yet. We should get them all green (we may have to modify some of the specs because we aren't going to be using filesystem based ingestion)

  • csv_importer_spec
  • csv_parser_spec
  • image_factory_spec
  • string_literal_processor_spec

figure out why derivatives aren't working on AWS (for batch upload)

Batch uploads are creating derivative images locally, but not when we run them on staging. Look into why! We know this was working before: the import URLs were being double encoded in a way that was easy to fix, and we had it working using Minio in local dev environments.

  • Checked workers, they're running
  • Checked app for errors
  • We have to investigate where the double encoding is in Hyrax and fix it upstream.
  • Test with Bespoke Fedora (maybe it was simultaneous writes)
  • Create more verbose log to dig into
