
A data tracking and analytics app for abalone conservation efforts.

License: MIT License


abalone's Introduction

Abalone Analytics


The Abalone project is a data tracking and analytics system for storing and analyzing data on population trends, mortality rates, and breeding programs. Designed as a multi-tenant application, Abalone will initially serve two stakeholders: the Bodega Marine Laboratory at UC Davis and the Puget Sound Restoration Fund in Washington State.

The Bodega Marine Laboratory's White Abalone captive breeding program is working to prevent the extinction of the White Abalone (Haliotis sorenseni), an endangered marine snail. White abalone are one of seven species found in California and are culturally significant to the native people of the area. White abalone were perilously overfished throughout the 20th century, resulting in a 99 percent population decrease by the end of the 1970s. This group is working to reverse that decline and has already seen some great success; they currently have more abalone in the lab than exist in the wild!

The Puget Sound Restoration Fund works to raise and outplant hatchery-reared Pinto Abalone (Haliotis kamtschatkana), the only abalone species found in Washington waters. This species has cultural and ecological significance, grazing rock surfaces and maintaining the health of rocky reef habitat and kelp beds. The Washington Department of Fish & Wildlife (WDFW) documented a ~98% decline from 1992 to 2017, leading the pinto abalone to be listed as a state endangered species in 2019.

This application will enable groups to add data either through CSV upload or through the web interface. Groups can view reports and visual representations of key data. Future plans include giving groups the ability to generate custom reports on the fly.

Getting Started

Prerequisites

This application is built on the following, which you must have installed before you begin:

  • Ruby (3.0.3)
  • Rails (6.1.4.6)
  • PostgreSQL (tested on 9.x)
  • Yarn

Setup

After forking this repo and cloning your own copy onto your local machine, execute the following commands in your CLI:

gem install bundler
bundle install
yarn install
bin/webpack
rake db:create
rake db:migrate
rake db:seed

Run Test Suite

bundle exec rake

Run Webserver for Abalone

Webpack dependencies can be rebuilt on command with bin/webpack. Alternatively, you can run bin/webpack-dev-server in another terminal window; this will effectively run bin/webpack for you whenever files change.

Then, run bundle exec rails s and browse to http://localhost:3000/.

Login information for white abalone:

Email: [email protected]
Password: password

Login information for pinto abalone:

Email: [email protected]
Password: password

Running Background Jobs

The app uses the gem delayed_job for processing CSVs. To run background jobs, run the following command in your CLI:

rake jobs:work

To confirm background jobs are processing, try uploading a CSV at http://localhost:3000/file_uploads/new. You should see the job complete in your CLI and the file upload results at http://localhost:3000/file_uploads.

To see detailed logs from background jobs, run:

tail -f log/delayed_job.log

To clear background jobs, run:

rake jobs:clear
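
For context, here is roughly how work ends up on the queue that rake jobs:work drains. This is a sketch only: CsvImportJob and its argument are illustrative names, not the app's real job classes (see app/jobs for those).

class CsvImportJob
  def initialize(processed_file_id)
    @processed_file_id = processed_file_id
  end

  # delayed_job invokes #perform on the enqueued object from the worker
  # started with `rake jobs:work`.
  def perform
    # parse the CSV, validate rows, create records ...
  end
end

Delayed::Job.enqueue CsvImportJob.new(42)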

Direct SQL Reporting

This application uses a modified implementation of the Blazer gem to provide direct SQL access with data scoped to an organizational level. This requires some setup to use in your development environment. See the instructions for setting this up locally to get started.

Docker

We are currently experimenting with Docker for development. While we would love for more people to try it out, be forewarned - Docker functionality may not be maintained moving forward. You will need Docker and docker-compose.

  • Docker Desktop is recommended for Windows and Mac computers.
  • The make utility can also make your development life easier. It is usually already installed on Linux and Mac computers. For Windows, an easy way to install it is via Chocolatey, a software package management system for Windows similar to Homebrew. Once Chocolatey is installed, install make with choco install make in a command prompt running as Administrator.
  • If you run into issues using Docker Desktop on Windows, we recommend you view this page for troubleshooting info.

Starting Fresh

To start the application in development mode:

  • docker-compose up --detach db to start the database
  • docker-compose run --rm schema_migrate to bring the database schema up-to-date
  • docker-compose up --detach web delayed_job to start the web and background job processes

Or, run only this:

  • make minty_fresh to do all of the above

The web app will be available on your host at http://localhost:3000. The logs for the web app and delayed_job processes can be seen and followed with the make watch command.

Some Routine Tasks

  • make spec will run the RSpec tests
  • make lint will run the Rubocop linting and style checks
  • make brakeman will run the Brakeman security vulnerabilities checks
  • make test will run spec, lint, brakeman
  • make build will build the Docker image for the abalone application. You'll need to run this occasionally if the gem libraries for the project are updated.
  • make database_seeds will seed the database according to seeds.rb.
  • make nuke will stop all Abalone docker services, remove containers, and delete the development and test databases. This is also used in the make minty_fresh command to restart the development and test environment with a clean slate.

Only the Database

Some developers prefer to run the Ruby and Rails processes directly on their host computers instead of running everything in containers. It might still be convenient for those developers to run the database in a container and not deal with the installation of yet another server on their computer. To do so:

  • set an environment variable on your host: export DATABASE_URL="postgres://dockerpg:supersecret@localhost:54321"
  • start the database with make database_started

Development

We have included the Annotate gem in this project for a better development experience. It annotates models, model specs, and factories with their table attributes.

The annotate task will run automatically when running migrations. Please see lib/tasks/auto_annotate_models.rake for configuration details.

If it does not run automatically, you can run it manually from the project root with:

annotate

Check out their GitHub page for more running options.
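
For reference, the comment block Annotate writes at the top of each model looks like this (the table and columns below are made up for illustration):

# == Schema Information
#
# Table name: animals
#
#  id         :bigint           not null, primary key
#  tag        :string
#  created_at :datetime         not null
#  updated_at :datetime         not null
#
class Animal < ApplicationRecord
end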

Architectural Constraints

In submitting features or bug fixes, please do not add new infrastructure components — e.g. databases, message queues, external caches — that might increase operational hosting costs. We are hosting on a free Heroku instance and need to keep it this way for the foreseeable future. Come talk to us if you have questions by posting in the Ruby for Good #abalone slack channel or creating an issue.

Other Considerations

We want it to be easy to understand and contribute to this app, which means we like comments in our code! We also want to keep the codebase beginner-friendly. Please keep this in mind when you are tempted to refactor that abstraction into an additional abstraction.

Get Familiar with the App

Application Overview

The Problem

Our stakeholders, the Bodega Marine Laboratory and the Puget Sound Restoration Fund, work with large amounts of data collected as part of their abalone captive breeding programs. They need a system that can act as a central repository for all of this data and provide robust reporting capabilities to help them examine trends and combine data collected across their research efforts.

The Solution

We are building a multi-tenant application which has the following capabilities:

  1. Store Data: There are several types of measurement data collected that should be stored in the system and retrievable by each organization.
  2. Import CSVs: Users are able to import single and bulk CSVs. Users should generally submit cleaned CSVs, but the app should alert users when there are parsing problems and indicate which row(s) need to be fixed.
  3. Display Charts and Analytics: Display charts and analytics to meet the reporting needs of each organization. Allow organizations to directly query their data.
  4. Export CSVs: TBD.

Key Definitions

  • Tag number(s), date = e.g. Green_389 from 3/4/08 to 4/6/15. We sometimes tag individuals; however, not all individuals have tags. We can't tag individuals until they are older than one year old because they are too small. A tag is generally a color, a 3-digit number, and the dates that tag was on. Sometimes tags fall off. It can be logistically challenging to give them the same tag, so they sometimes get assigned new tags. Also, occasionally tags have another form besides color_### (e.g., they have 2 or 4 digits and/or have no color associated with them), and sometimes they are something crazy like "no tag" or "no tag red zip tie" for animals that lived long ago ... though I suppose we could re-code those into something more tractable.
  • Shellfish Health Lab Case Number (shl number) = SF##-## Animals from each spawning date and from each wild collection have a unique case number created by California's state Shellfish Health Laboratory (SHL). Sometimes animals from a single spawning date have more than one SHL number.
  • Cohort = place_YYYY This is how the lab colloquially refers to each of their populations spawned on a certain date. It's basically a note/nickname for each group of animals with a particular SHL #/spawning date.
  • Enclosures = e.g. Juvenile Rack 1 Column A Trough 3 from 3/4/15 - 6/2/16 This is the tank space by date. This is a note. The types of input will vary significantly within a facility and over time.
  • Locations = facility_name - location_name Animals may be located in different locations within a single facility.
  • Facilities = e.g. BML from 6/5/13 - 11/20/14 Animals move around among a finite number of partner institutions (it is possible for new facilities to be added, but it only happens about once every few years).
  • Organizations = e.g. Bodega Marine Laboratory. Organizations act as the tenants within the application for the purpose of walling off data.
  • MortalityEvents = a way to track the mortality event of either a specific animal or a cohort. If the mortality event is related to a specific animal, the mortality_count is expected to be nil; if the mortality event is related to a cohort, the mortality_count is expected to be present, but the animal id is expected to be nil (see the sketch below).

See a full data dictionary here.
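
As an example of the MortalityEvents rule above, a minimal model-validation sketch; the attribute and association names are assumptions taken from the definition, not the actual schema:

class MortalityEvent < ApplicationRecord
  belongs_to :animal, optional: true

  # Animal-specific event: animal present, mortality_count nil.
  validates :mortality_count, absence: true, if: -> { animal_id.present? }
  # Cohort-level event: mortality_count present, animal nil.
  validates :mortality_count, presence: true, if: -> { animal_id.blank? }
end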

And Don't Forget...

...that Gary needs you.

a white abalone

Photo credit: John Burgess/The Press Democrat

abalone's People

Contributors

bgarr, colinsoleim, craigjz, darrendc, dependabot-preview[bot], dependabot[bot], ericawinne, glassjoseph, haydenrou, jcavena, josephinef9, jtu0, librod89, marc, mdworken, megantrimble, metamoni, michellemhey, nickschimek, openmailbox, puzzleduck, rafaltrojanowski, robbkidd, rruiz85, rudechowder, smkopp92, thunderheavyindustries, todtb, viniciusgama, wadewinningham


abalone's Issues

Frontend - Render Histogram with Correct Data

BLOCKED by Issue #58: We need the correct data on the backend first.

We need to make sure the Length Histogram is rendering correctly at /reports under the Growth tab. The current report has incorrect data AND incorrect formatting.

Acceptance Criteria:

  • Y-axis: Count
  • X-axis: Length in centimeters.
  • Bins should be in 1 cm increments, so 0-0.99 cm, 1-1.99 cm, etc., from 0 to 30 cm.
  • Title: Size Distribution
  • Tooltip: When hovering over a bar, user should see the count for that bin increment.

We use Highcharts, a JavaScript library for rendering all graphs and charts.

DevOps: Set up CI/CD

  • Look into postgres version... do we need to upgrade?
  • Decide and obtain new domain name
  • Set up continuous integration to run tests automatically
  • Set up continuous deployment (3 applications are: database, background jobs and main application)

Backend - Calculation for Histogram

The histogram is at /reports under the Growth tab.

We need to adjust the calculation we are using for the histogram on the backend. There are some calculations in app/lib/aggregates.rb, but we need to make sure they match the below requirements:

For histograms, bins should be in 1 cm increments, so 0-0.99 cm, 1-1.99 cm, etc. Our largest animal is just over 18 cm and the max listed size for a white abalone is 10 inches (25.4 cm), so it's probably safe to say we wouldn't ever have one over 30 cm. For a specific bar in the histogram, here is a scenario: say I want to know X, the number of animals in the 4-4.99 cm bin, for the SF15-77 cohort animals:

N = the number of animals that were 4-4.99 cm in length in the most recent uploaded spreadsheet (Untagged AND Tagged animals) with length data for that cohort
S = the total number of animals measured in the same spreadsheet, i.e., the most recent uploaded spreadsheet with length data for that cohort (Untagged AND Tagged animals)
T = the total number of estimated animals in the cohort = the total number counted on the most recent count closest to the length measurement date minus the mortalities since then

X = (N/S)*T ... or the proportion of animals that were measured that were in that size bin, scaled up to the total predicted number of animals in that cohort.

The histogram calculations should have the following parameters (you should be able to pass in one or both):

  1. cohort, or multiple cohorts
  2. Optional date range
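
A minimal sketch of the X = (N/S)*T calculation described above, using illustrative names rather than the actual code in app/lib/aggregates.rb:

# n = animals measured in this bin (untagged AND tagged)
# s = total animals measured in the same spreadsheet
# t = estimated total animals in the cohort (latest count minus mortalities)
def scaled_bin_count(n, s, t)
  return 0 if s.zero?

  ((n.to_f / s) * t).round
end

# e.g. 12 of 150 measured animals fell in the 4-4.99 cm bin, and the
# cohort is estimated at 900 animals:
scaled_bin_count(12, 150, 900) # => 72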

Testing - Write Feature/Integration Tests for TaggedAnimalAssessmentJob

Tests need to be written for TaggedAnimalAssessmentJob.

Please use this fixture file for testing: db/sample_data_files/tagged_animal_assessment/Tagged_assessment_12172018 (original).csv

Feature/Integration Tests
The following contexts and expected outcomes should be tested:

  1. Context: The user uploads a CSV that has already been processed.
    Outcome:
  • A new ProcessedFile record should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Failed"
    • File has Errors: "Already processed a file with the same name. Data not imported!"
    • File has Statistics: "{}"
  2. Context: The user uploads a CSV with invalid headers.
    Outcome:
  • A new ProcessedFile record should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Failed"
    • File has Errors: "Does not have valid headers. Data not imported!"
    • File has Statistics: "{}"
  3. Context: The user successfully uploads a CSV with no errors:
    Outcome:
  • A new ProcessedFile record should be created
  • 201 new TaggedAnimalAssessment records should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Processed"
    • File has no Errors
    • File has Statistics: "{row_count: 201, rows_imported: 201, rows_not_imported: 0, shl_case_numbers: {"SF16-9A": 100, "SF16-9B": 21, "SF16-9C": 11, "SF16-9D": 69}}"
  4. Context: The user successfully uploads a CSV with errors for 2 rows:
    Outcome:
  • A new ProcessedFile record should be created
  • 199 new TaggedAnimalAssessment records should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Processed"
    • File has Errors: "Does not have valid headers. Data not imported!"
    • File has Statistics: "{row_count: 201, rows_imported: 199, rows_not_imported: 2, shl_case_numbers: {"SF16-9A": 100, "SF16-9B": 21, "SF16-9C": 11, "SF16-9D": 69}}"

Testing - Write Unit Tests for ImportJob

Tests need to be written for ImportJob, which is a module used by all the CSV import jobs.

Unit Tests
The following methods should be unit tested:

  • #initialize_processed_file
  • #validate_headers
  • #import_records
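
A skeleton to start from for the methods listed above, assuming ImportJob can be mixed into a bare host class; verify the module's real method signatures before building on this:

require "rails_helper"

RSpec.describe ImportJob do
  # A minimal host class so the module's methods can be exercised in isolation.
  let(:host_class) { Class.new { include ImportJob } }
  let(:job) { host_class.new }

  describe "#validate_headers" do
    it "is falsey when the CSV headers do not match the expected set" do
      # Point the job at a fixture from db/sample_data_files/ with
      # deliberately wrong headers, then:
      #   expect(job.validate_headers).to be_falsey
    end
  end

  describe "#initialize_processed_file" do
    it "creates a ProcessedFile record for the upload"
  end

  describe "#import_records" do
    it "creates one record per valid CSV row"
  end
end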

CSVs with invalid rows should fail on upload

This resulted from a conversation in this PR: #88

Currently, when a CSV is processed with some invalid rows, we reject the invalid rows and save the valid rows. However, we should not save any rows to the db, because this risks the user duplicating data on re-upload. Let's reject the entire CSV and tell the user to fix the invalid rows and re-upload the file.
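
One way to get that all-or-nothing behavior, sketched here with TaggedAnimalAssessment standing in for any import model (variable names are illustrative):

# Validate every row first; import only if all rows are valid.
records = rows.map { |row| TaggedAnimalAssessment.new(row) }
errors = records.each_with_index
                .reject { |record, _i| record.valid? }
                .map { |record, i| "Row #{i + 1}: #{record.errors.full_messages.join(', ')}" }

if errors.empty?
  ActiveRecord::Base.transaction { records.each(&:save!) }
else
  # Reject the whole file and surface the row-level errors to the user.
end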

Backend/Frontend - Create Multiple Select List and Date Picker for Histogram Parameters

There should be an area where the user can select parameters on the upper right-hand side of the histogram chart at /reports under the Growth tab. The user should have the ability to select a measuring event (i.e. a certain population on a certain date) or a group of measuring events (different populations near the same date).

Acceptance Criteria:

  • Backend - returns a list of cohorts
  • Frontend - Multiple select list of cohorts
  • Frontend - Date picker

We use Highcharts, a JavaScript library for rendering all graphs and charts.

Analytics - Backend (can be split into multiple issues)

Analytics:

  • spawning history of the broodstock (i.e., when we attempted to spawn them, were they successful in releasing gametes)
  • total egg, larval, or juvenile production by year (esp. how many year-old animals are produced annually)
  • mortality within a cohort/population over time
  • size distribution of animals within a population or within the entire captive breeding program
  • average growth rates of tagged individuals within populations or size classes over time

Add SpawningSuccess model validations and test CSV upload job

Acceptance Criteria:

  • As a user, I want to be able to upload a CSV of the category found in sample_data_files/spawning_success on the page /file_uploads/new.
  • Tests should be written for this job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations for the SpawningSuccess model. Please refer to the data dictionary and the notes below for coding these. For an example, please look at: Example of WildCollection model validations and spec

**If there is a column for Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

SpawningSuccess.rb

  • Required: tag, shl_case_number, spawning_date, date_attempted, spawning_success
  • See data dictionary for specific formats for tag, shl_case_number, spawning_success
  • spawning_date and date_attempted should be valid dates with month, day, and year
  • #_of_eggs spawned should be an Integer
  • #_of_eggs spawned should be populated in the sample data CSV (lab forgot to do this)
  • Note from lab:
    My primary concern here is how to differentiate between the date that an animal was
    spawned (i.e., its birthday) and the date the animal spawned (i.e., it released gametes)
    without causing confusion.

    --> We should probably change the header name for spawning_date for this CSV...
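
A sketch of the validations listed above; number_of_eggs_spawned is an assumed column name for "#_of_eggs spawned", and the format check is illustrative rather than the data dictionary's exact rule:

class SpawningSuccess < ApplicationRecord
  validates :tag, :shl_case_number, :spawning_date,
            :date_attempted, :spawning_success, presence: true
  # Assumed column name for "#_of_eggs spawned":
  validates :number_of_eggs_spawned,
            numericality: { only_integer: true }, allow_nil: true
  # Illustrative SF##-## check; use the data dictionary's actual format:
  validates :shl_case_number, format: { with: /\ASF\d{2}-\d{2}\z/ }
end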

Add PopulationEstimate CSV upload job and model validations

Acceptance Criteria:

  • As a user, I want to be able to upload a CSV of the category found in sample_data_files/population_estimate on the page /file_uploads/new.
  • Tests should be written for this job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations for the PopulationEstimate model. Please refer to the data dictionary and the notes below for coding these. For an example, please look at: Example of WildCollection model validations and spec

**If there is a column for Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

PopulationEstimate.rb

  • Required: sample_date, shl_case_number, spawning_date, lifestage, abundance and facility
  • shl_case_number, lifestage, and facility should have specific format/options (see example CSV)
  • sample_date and spawning_date should be valid dates with month, day, and year
  • abundance should be an Integer

Add MortalityTracking CSV upload job and model validations

Acceptance Criteria:

  • As a user, I want to be able to upload a CSV of the category found in sample_data_files/mortality_tracking on the page /file_uploads/new.
  • Tests should be written for this job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations for the MortalityTracking model. Please refer to the data dictionary and the notes below for coding these. For an example, please look at: Example of WildCollection model validations and spec

**If there is a column for Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

MortalityTracking.rb

  • Required: mortality_date, cohort, shl_case_number, spawning_date, # of morts (some might be unknown; see note from lab)
  • See data dictionary for specific formats for cohort, shl_case_number, tag
  • mortality_date and spawning_date should be valid dates with month, day, and year
  • Note from lab:
    These are the data that are probably going to need the most attention and QA/QC on
    our end. One challenge here is that mortalities are currently being tracked on this
    datasheet via a single entry for each population/location collected from at a single
    time point. When there are multiple tagged animals collected at a single time point,
    the tagged animals are all lumped together in a single entry on this data sheet, and
    their tags are listed in the Notes. One other challenge is that sometimes we don't
    know exactly which population a dead animal was from (e.g., it was found dead on the
    floor or in a sump), but we have some guesses based on the animal size and/or where
    it was located. Not sure the best way to code for that.

    --> This brings up one question: for those animals that are lumped together in a single entry (#_of_morts > 1), can we create multiple entries in the db for each individual animal, and can we extract each animal's tag from the Notes column?

Use the db for temp file storage instead of ActiveStorage

ActiveStorage will not work on Heroku for temp file storage for reasons listed here:
https://devcenter.heroku.com/articles/active-storage-on-heroku. Currently file uploading breaks on production for this reason :(

Let's just use a good ole' PostgreSQL table instead, called TemporaryFile, to temporarily store the CSV (a sketch follows the workflow below). The workflow should be:

  1. When a user uploads a file, add the raw data to this table
  2. Kick off the job that processes the file and saves the cleaned data to the db
  3. Delete the raw data when the job finishes
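
A sketch of that workflow; TemporaryFile's columns and the CsvProcessingJob name are assumptions:

# 1. On upload, stash the raw CSV in a TemporaryFile row.
temp_file = TemporaryFile.create!(filename: upload.original_filename,
                                  contents: upload.read)

# 2. Kick off the job that processes the file, passing a reference.
Delayed::Job.enqueue CsvProcessingJob.new(temp_file.id)

# 3. Inside the job, after the cleaned data is saved:
#    TemporaryFile.find(temp_file_id).destroy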

Add TaggedAnimalAssessment model validations

Acceptance Criteria:

  • As a user, I want to be able to upload a CSV of the category found in sample_data_files/tagged_animal_assessment on the page /file_uploads/new with correct data formats. I should see errors if there are incorrect data formats.
  • Model Validations
  • Unit tests for model validations

Please refer to the data dictionary and the notes below for coding the model validations. For an example, please look at: Example of WildCollection model validations and spec

TaggedAnimalAssessment.rb

  • Required: measurement_date, shl_case_number, spawning_date, tag, length
  • See data dictionary for specific formats for shl_case_number, tag, predicted_sex and gonad_score
  • length should be a float not exceeding 100
  • measurement_date and spawning_date should be valid dates with month, day, and year
  • Columns E-J should be treated as notes/strings

**If there is a column for Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

Histogram for Animal Sizes

What sizes are animals from each spawning date? I'd like to be able to select a measurement event (a certain population on a certain date) or a group of measuring events (different populations near the same dates or same population over time) to generate a histogram of lengths ideally binned in 1-cm increments.

Admin can create new users

This issue builds on the devise work done in #76

We need a way for new users to be created through the app. This should be as lightweight and minimal as possible. There is no guarantee we can send email reliably; otherwise devise-invitable would be a good choice.

  • Add administrate to Gemfile, bundle
  • Create a User-admin dashboard: rails generate administrate:dashboard
    • You might need to create a user model
    • users table was already created in PR 76
    • If you need to create the user model, it should include name and email. Devise will probably want password and maybe some other stuff.
    • Make sure you can still log in with the test user (from db/seed)

Confirm that you can do the following:

  • Can log in and log out with the test user from db/seed

As an authenticated user:

  • Can create a new user and supply a name and email
  • Can delete a user

Don't worry too much about roles - do whatever is simplest and least code. If that means everyone is an admin, start there.

Please pop into the #abalone channel in rubyforgood.slack.com if you have any questions or want to grab this issue.

WildCollection upload job and model validations

Acceptance Criteria:

  • As a user, I want to be able to upload all of the file types found in sample_data_files/wild_collection on the page /file_uploads/new.
  • Tests should be written for each job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations to the WildCollection model. Please refer to the data dictionary and the notes below for coding these.

**For columns that have the Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

WildCollection.rb

  • Required: columns A-E, N
  • See data dictionary for specific formats for tag, gonad score, predicted sex, initial holding facility, and final holding facility
  • collection date, date of arrival, and OTC treatment completion date should be valid dates with month, day, and year
  • collection depth, length, and weight should be floats/integers
  • Note from lab:
    I redacted some of the specific location info here, as the federal government is worried about disclosing where these endangered animals still exist in the wild, but I might not want that info in this database anyway, depending on how accessible it ends up being.

1 Year Old Survivors Bar Graph

Related to counts, one graph I’d love to be able to produce easily is one that shows the total number of animals that make it to ~1 year old each year. We have a “first count” when the animals are between 10-12 months old.

Render Length Histogram Correctly

We need to make sure the Length Histogram is rendering correctly: http://abalone.blrice.net/reports.

This should create bins of 1 cm increments, so 0-0.99 cm, 1-1.99 cm, etc., up to 30 cm.

We also need to be able to input parameters of 1) Cohort or multiple cohorts and 2) Date range (optional - default is the most recent).
--> Will need to model a select dropdown of cohorts

Backend/Frontend: Better CSV Upload Errors

Acceptance Criteria:
As a user, when I upload a CSV and it fails to process, I go to the page /file_uploads and:

  1. See the failed file with name, date, category, 'failed' status, stats and initial errors
  2. Click on the failed file
  3. See detailed errors: which rows failed and why
  4. Fix the errors on the original CSV on my machine
  5. Click a button to re-upload my edited CSV.

Testing - Write Feature/Integration Tests for UntaggedAnimalAssessmentJob

BLOCKED: UntaggedAnimalAssessmentJob must be created first.

Tests need to be written for UntaggedAnimalAssessmentJob.

Please use a fixture file for testing in this directory: db/sample_data_files/untagged_animal_assessment/

Feature/Integration Tests
The following contexts and expected outcomes should be tested:

  1. Context: The user uploads a CSV that has already been processed.
    Outcome:
  • A new ProcessedFile record should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Failed"
    • File has Errors: "Already processed a file with the same name. Data not imported!"
    • File has Statistics: "{}"
  2. Context: The user uploads a CSV with invalid headers.
    Outcome:
  • A new ProcessedFile record should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Failed"
    • File has Errors: "Does not have valid headers. Data not imported!"
    • File has Statistics: "{}"
  3. Context: The user successfully uploads a CSV with no errors:
    Outcome:
  • A new ProcessedFile record should be created
  • 201 new UntaggedAnimalAssessment records should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Processed"
    • File has no Errors
    • File has Statistics: "{row_count: 201, rows_imported: 201, rows_not_imported: 0, shl_case_numbers: {"SF16-9A": 100, "SF16-9B": 20, "SF16-9C": 10, "SF16-9D": 71}}"
  4. Context: The user successfully uploads a CSV with errors for 2 rows:
    Outcome:
  • A new ProcessedFile record should be created
  • 199 new UntaggedAnimalAssessment records should be created
  • On the /file_uploads page, the user should see:
    • File has Status: "Processed"
    • File has Errors: "Does not have valid headers. Data not imported!"
    • File has Statistics: "{row_count: 201, rows_imported: 199, rows_not_imported: 2, shl_case_numbers: {"SF16-9A": 100, "SF16-9B": 20, "SF16-9C": 10, "SF16-9D": 69}}"

Backend - Create UntaggedAnimalAssessmentJob

Most of the code for this CSV import is already written in ImportJob. Import this module into an UntaggedAnimalAssessmentJob and write any custom methods; it should look similar to the TaggedAnimalAssessmentJob.
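
In outline, the new job might look like this; copy the hooks ImportJob actually requires from TaggedAnimalAssessmentJob, since the method below is an illustrative name, not a known interface:

class UntaggedAnimalAssessmentJob
  include ImportJob

  # Whatever per-category behavior ImportJob expects, e.g. which model
  # to build records with:
  def category_model
    UntaggedAnimalAssessment
  end
end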

Add UntaggedAnimalAssessment model validations and test CSV upload job

Acceptance Criteria:

  • As a user, I want to be able to upload a CSV of the category found in sample_data_files/tagged_animal_assessment on the page /file_uploads/new.
  • Tests should be written for this job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations for the UntaggedAnimalAssessment model. Please refer to the data dictionary and the notes below for coding these. For an example, please look at: Example of WildCollection model validations and spec

**If there is a column for Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones).

UntaggedAnimalAssessment.rb

  • *cohort column should be changed to shl_case_number on sample data CSVs (this is an error; cohort is something different)
  • Required: measurement_date, shl_case_number, spawning_date, length
  • See data dictionary for specific formats for shl_case_number, predicted_sex and gonad_score
  • length should not exceed 100
  • length/mass should be floats
  • measurement_date and spawning_date should be valid dates with month, day, and year
  • Columns E-G should be treated as notes/strings

Add Pedigree and PedigreeParents CSV upload job and model validations

Acceptance Criteria:

  • As a user, I want to be able to upload all of the file types found in sample_data_files/pedigree/ on the page /file_uploads/new.
  • Tests should be written for each job similar to spec/jobs/tagged_animal_assessment_job_spec.rb
  • Model Validations
  • Unit tests for model validations

Notes:
Much of the logic for the file uploading/parsing is already written in the module concern ImportJob.

The heaviest lifting will be adding appropriate validations to both models. Please refer to the data dictionary and the notes below for coding these.

**If there are columns that have Facility, let's make these foreign keys to the Facility.rb model we already have. There should only be certain facilities (see the seeds.rb file; users can add new ones). For an example, please look at: Example of WildCollection model validations and spec

Pedigree.rb

  • Required: cohort, shl_case_number, spawning_date
  • Mother, Father, and Separate crosses within cohort are lists of tags, and each tag should follow the correct tag format. We could store these in an array (and an array of arrays for the crosses); see the migration sketch at the end of this issue.

PedigreeParents.rb
**A model and migration are needed for this

  • See data dictionary for specific formats for Sex (M/F), Origin, Holding Facility
  • Fertilization date and Collection date should be valid dates with month, day, and year
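
A migration sketch for the array idea mentioned under Pedigree.rb above; column names are assumptions, and the ragged "array of arrays" for crosses is easier as jsonb, since Postgres multidimensional arrays must be rectangular:

class CreatePedigrees < ActiveRecord::Migration[6.1]
  def change
    create_table :pedigrees do |t|
      t.string :cohort, null: false
      t.string :shl_case_number, null: false
      t.date   :spawning_date, null: false
      t.string :mother_tags, array: true, default: []
      t.string :father_tags, array: true, default: []
      t.jsonb  :separate_crosses, default: []
      t.timestamps
    end
  end
end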
