
ilri / dspace


This project forked from alanorth/dspace


Fork of the official DSpace repository. DSpace powers the CGIAR outputs repository (CGSpace). This repository contains local modifications to the DSpace source code.

Home Page: https://cgspace.cgiar.org

License: Other

Shell 0.15% Java 80.45% HTML 0.73% XSLT 11.21% CSS 1.43% JavaScript 3.72% Perl 0.11% PLSQL 0.55% Batchfile 0.03% Ruby 0.01% FreeMarker 0.02% Python 1.21% SCSS 0.35% Handlebars 0.03% sed 0.02%
dspace java open-access repository

dspace's People

Contributors

alanorth, aschweer, benbosman, bram-atmire, christian-scheible, cjuergen, ctu-developers, grahamtriggs, helix84, joao-de-melo, jonas-atmire, kevinvdv, kshepherd, kstamatis, marsaoua, mdiggory, mwoodiupui, peterdietz, philipvis, pnbecker, richard-jones, robintaylor, rtansley, scott-phillips, scottyeadon, stuartlewis, tdonohue, terrywbrady, tomdesair, zuki

dspace's Issues

Certain metadata fields aren't controlled by input-forms.xml

For example, ICARDA's cg.subject.icarda has only been populated by batch import and doesn't have a corresponding value-pairs entry in the input forms. This means we can't use it for Listings and Reports (and probably other things!), and there are probably other places where this causes ill effects.

Other examples might be: cg.subject.cip and cg.subject.drylands...

Add IPv6 DNS records

Our Linode VPSes have IPv6, but we don't have AAAA records, so IPv6 clients have to use IPv4 instead.

Make sure metadata titles use sentence case

Titles should be plural, with every word after the first in lower case (except proper nouns), e.g.:

  • Author affiliations
  • Date issued
  • Output types
  • CGIAR Research Programs
  • Regions
  • Countries
  • AGRIS keywords

Also:

  • Bioversity subjects
  • CIAT subjects
  • CIFOR subjects
  • CIP subjects
  • CTA subjects
  • ICARDA subjects
  • ILRI subjects
  • IWMI subjects
  • CCAFS subjects
  • Drylands subjects
  • Humidtropics subjects
  • WLE subjects

This is possibly related to #32, but because the fields are used elsewhere, it might be as simple as adjusting messages.xml, after which all occurrences should print correctly.
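A rough sketch of the casing rule, in case it helps when reviewing the labels. The KEEP_AS_IS set and the sentence_case function are illustrative assumptions, not part of DSpace; the sketch only handles capitalisation and does not pluralise anything.

```python
# Words that keep their capitalisation (acronyms and proper nouns).
# This set is an assumption for illustration; extend it as needed.
KEEP_AS_IS = frozenset({
    "CGIAR", "Research", "Programs", "AGRIS", "Bioversity", "CIAT",
    "CIFOR", "CIP", "CTA", "ICARDA", "ILRI", "IWMI", "CCAFS",
    "Drylands", "Humidtropics", "WLE",
})


def sentence_case(label, keep=KEEP_AS_IS):
    """Capitalise the first word, lower-case the rest, skip protected words."""
    result = []
    for position, word in enumerate(label.split()):
        if word in keep:
            result.append(word)  # protected word: leave untouched
        elif position == 0:
            result.append(word[:1].upper() + word[1:].lower())
        else:
            result.append(word.lower())
    return " ".join(result)


if __name__ == "__main__":
    for title in ("Author Affiliations", "ILRI Subjects", "CGIAR Research Programs"):
        print(sentence_case(title))
```

Protected words are matched case-sensitively, so a label has to spell an acronym the way the set does for it to be left alone.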

Add custom discovery facets for CRP etc

Sisay had added some but they were based on DSpace 3.x code, and they included a lot of "kitchen sink" changes that had other side effects. We need to do new ones based on the DSpace 4.x dspace/config/spring/api/ discovery files.

Add ICARDA subjects

We need to be able to store, display, browse, and search by ICARDA subject.

  • Add to metadata registry (control panel)
  • Add to browse indexes (dspace.cfg)?
  • Add to search filters in discovery.xml
  • Add to sidebar facets in discovery.xml
  • Add strings to messages.xml
  • Add to Atmire CUA or Listings and Reports?
    • dspace/config/modules/atmire-listings-and-reports.cfg
    • dspace/config/modules/atmire-cua.cfg
  • Add code for clickable links to item-list.xsl and item-view.xsl

After adding to dspace.cfg you need to re-index, and I think the same applies after modifying discovery.xml. I believe both are handled by dspace index-discovery -b. I suspect we only need to add browse indexes to dspace.cfg, as they are used for "browse by blah"; the search indexes that used to live there were Lucene-based and are now handled in discovery.xml.

Add more Discovery sidebar facets

As a follow up to #28, we need to add more sidebar facets for:

  • Author affiliations
  • Authors
  • Date issued
  • Output types
  • CGIAR Research Programs
  • Region
  • Country
  • Subject – should be 'AGRIS keywords'
  • Bioversity subjects (cg.subject.bioversity)
  • CCAFS subjects
  • CIAT subjects
  • CIFOR subjects
  • CIP subjects
  • CTA subjects
  • Dryland Systems subjects
  • Humidtropics subjects
  • ICARDA subjects
  • ILRI subjects (dc.isubject.ilrisubject)
  • IWMI subjects
  • WLE subjects
  • CPWF subjects

Etc etc...

Check which of these are already configured as a searchFilter in dspace/config/spring/api/discovery.xml, then add them as sidebarFacets, ie:

<ref bean="searchFilterCrpsubject" />
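The check itself could be scripted. Below is a minimal sketch that lists the beans referenced under searchFilters but not under sidebarFacets. The SAMPLE document is a simplified stand-in for discovery.xml, and the bean id defaultConfiguration and the function names are assumptions for illustration; verify them against the real file.

```python
import xml.etree.ElementTree as ET

# Spring bean files declare this default namespace.
BEANS_NS = "{http://www.springframework.org/schema/beans}"

# Simplified stand-in for discovery.xml (assumed structure, not the real file).
SAMPLE = """<beans xmlns="http://www.springframework.org/schema/beans">
  <bean id="defaultConfiguration">
    <property name="sidebarFacets">
      <list>
        <ref bean="searchFilterAuthor" />
      </list>
    </property>
    <property name="searchFilters">
      <list>
        <ref bean="searchFilterAuthor" />
        <ref bean="searchFilterCrpsubject" />
      </list>
    </property>
  </bean>
</beans>"""


def refs_in_property(config_bean, property_name):
    """Return the bean names referenced inside one list-valued property."""
    for prop in config_bean.findall(f"{BEANS_NS}property"):
        if prop.get("name") == property_name:
            return [ref.get("bean") for ref in prop.iter(f"{BEANS_NS}ref")]
    return []


def filters_missing_from_sidebar(xml_text, bean_id="defaultConfiguration"):
    """List search filters that are not yet configured as sidebar facets."""
    root = ET.fromstring(xml_text)
    for bean in root.iter(f"{BEANS_NS}bean"):
        if bean.get("id") == bean_id:
            facets = set(refs_in_property(bean, "sidebarFacets"))
            filters = refs_in_property(bean, "searchFilters")
            return [name for name in filters if name not in facets]
    return []


if __name__ == "__main__":
    print(filters_missing_from_sidebar(SAMPLE))
```

To run it against the real configuration, read dspace/config/spring/api/discovery.xml into a string and pass that in instead of SAMPLE.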

Migrate old DC terms to new CG ones

Some of our older custom metadata fields are living in the DC namespace, which is ugly, confusing, and just makes me lose sleep at night. For example:

  • ILRI subject: dc.isubject.ilrisubject
  • CRP subject: dc.crsubject.crpsubject

We learned to behave better, and started prefixing custom fields with "cg", for example:

  • CIAT subject: cg.subject.ciat
  • CIP subject: cg.subject.cip

We should find all instances of these and kill them one by one, including in dspace.cfg, input-forms, XMLUI themes, spreadsheet templates by the content staff... etc.
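To find the instances, something like the following sketch could be run over the contents of each config file or template. It only knows the two legacy fields named above; the pattern and function name are assumptions, and other old fields would need adding.

```python
import re

# Matches only the two legacy fields named in this issue.
LEGACY_FIELD = re.compile(
    r"\bdc\.(?:isubject\.ilrisubject|crsubject\.crpsubject)\b"
)


def find_legacy_fields(text):
    """Return (line_number, field) pairs for every legacy field occurrence."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_FIELD.finditer(line):
            hits.append((number, match.group(0)))
    return hits


if __name__ == "__main__":
    # Hypothetical config lines, for demonstration only.
    sample = (
        "example.index.1 = ilrisubject:metadata:dc.isubject.ilrisubject:text\n"
        "example.index.2 = cip:metadata:cg.subject.cip:text\n"
        "example.fields = dc.crsubject.crpsubject, dc.isubject.ilrisubject\n"
    )
    for number, field in find_legacy_fields(sample):
        print(f"line {number}: {field}")
```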

What happens when Handle increases beyond 99999?

We're somewhere near 50,000 right now... and importing stuff like crazy. Someone brought up a question about what happens when we reach 99999, and I didn't know the answer. I'll have to ask the dspace-tech mailing list.

Strange characters in Discovery search

Some users reported that the search results in DSpace 4.2 display strange characters. For example, searching for "ilri research brief" displays results such as (attached).

[Screenshot: search results showing the strange characters]

Item in the screen shot is: https://cgspace.cgiar.org/handle/10568/51393

I suspect these strange characters are present in the full-text search for that item, which begins:

ILRIRESEARCH
RESEARCHBRIEF
BRIEF N
32O.
I LRI
Date
year
November
2014
6
14

Growing food and feed with less environmental
impact: A dual-crop impact narrative
Did you know?
• 

Investigate robots.txt versus X-Robots-Tag HTTP header

Right now we have both robots.txt and add_header X-Robots-Tag "none"; for DSpace Test... but I don't know which one is respected. Ideally, on DSpace Test we would block access in robots.txt, but we deploy CGSpace from the same code base, so having different settings is tricky.

Maybe we could use robots_test.txt, and redirect requests for robots.txt to it in the nginx vhost for DSpace Test?

Listings and Reports prints in browser instead of downloading

When using Listings and Reports to generate a bibliography, instead of downloading the results as a PDF / PS / RTF, it prints in the browser window. According to Chrome dev tools the content type is text/plain... but it should be application/pdf for PDF at least.

[Screenshot: Listings and Reports PDF export on dspacetest.cgiar.org, viewed in Google Chrome]

Hide "Browse By" block in sidebar navigation

We need screen real estate on the sidebar, so we want to hide the Browse By sidebar block. We can hide the block in XMLUI, but the browse indexes will remain functional for when people click the subject terms in item views and lists. Related to #48.

Re-organize Discovery facets

Change the order of facets on the navigation sidebar so they run from most common to least common. Related to #32 for adding more Discovery facets.

  • Authors
  • Author affiliations
  • Date
  • Output types
  • CRPs
  • Regions
  • Countries
  • "AGROVOC" subjects
  • Myriad of centre specific subjects in alphabetical order
  • Status

We want to keep the same order in the homepage and the main Discovery config.

Add Drylands CRP subjects

List of tasks for Drylands CRP subjects, aka cg.subject.drylands:

  • Add to metadata registry (control panel)
  • Add to browse indexes (dspace.cfg)
  • Add to search filters in discovery.xml
  • Add to sidebar facets in discovery.xml
  • Add strings to messages.xml
  • Add to Atmire CUA or Listings and Reports?
    • dspace/config/modules/atmire-listings-and-reports.cfg
    • dspace/config/modules/atmire-cua.cfg
  • Add code for clickable links to item-list.xsl and item-view.xsl

After adding to dspace.cfg and discovery.xml you need to re-index, ie dspace index-discovery -b.

Add Journal Title to Content and Usage Analysis options

Need to add a new search filter for dc.title.jtitle in the Discovery config, dspace/config/spring/api/discovery.xml, and then add it to the CUA config: dspace/config/modules/atmire-cua.cfg.

Unfortunately we'll need to re-index Solr after this because of adding a new search filter.

Migrate to single, unified submission template

We want to clean up and consolidate the input forms so they are simpler and we don't have 10+ separate input forms with duplicate metadata fields.

There is a rough pull request in #81 which works, but needs one last look to clean up some typos ("IPASTORAL").

OpenSearch behaving strangely

We enabled it in #29 for crpsubject LIVESTOCK+AND+FISH, and it seemed to work... but it never worked for other CRPs, and the RSS feed seemed to stop updating eventually. Also, sometimes there are results in the RSS which shouldn't match!

The configuration for the sort_by parameter in dspace.cfg is currently:

webui.itemlist.sort-option.1 = title:dc.title:title
webui.itemlist.sort-option.2 = dateissued:dc.date.issued:date
webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date
webui.itemlist.sort-option.4 = type:dc.type.output:text
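For debugging, it may help to see those options as a lookup table. The sketch below parses the name:metadata:type values keyed by option number, on the assumption (to be verified) that sort_by refers to an option by its number. The function name is hypothetical.

```python
CFG = """\
webui.itemlist.sort-option.1 = title:dc.title:title
webui.itemlist.sort-option.2 = dateissued:dc.date.issued:date
webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date
webui.itemlist.sort-option.4 = type:dc.type.output:text
"""

PREFIX = "webui.itemlist.sort-option."


def parse_sort_options(cfg_text):
    """Map each sort option number to its name, metadata field and type."""
    options = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line.startswith(PREFIX):
            continue  # skip comments and unrelated properties
        key, _, value = line.partition("=")
        number = int(key.strip()[len(PREFIX):])
        name, field, data_type = (part.strip() for part in value.split(":"))
        options[number] = {"name": name, "field": field, "type": data_type}
    return options


if __name__ == "__main__":
    for number, option in sorted(parse_sort_options(CFG).items()):
        print(number, option)
```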

Need to double check:

  • Indexes for these sort types (Lucene? Solr? Cron jobs?)
  • DSpace 4.3 config for websvc.opensearch.

Switch to controlled vocabularies

During research for #34 I found that controlled vocabularies allow you to split up input-forms.xml by keeping ILRI subjects, for example, in a separate file. The side effect is that this functionality implies a hierarchical taxonomy, like ILRI::Subjects::AGRICULTURE, which is stored and displayed in the user interface as such! That requires us to change a lot of the places where we are printing / aggregating subjects in Discovery and XMLUI item lists...

But there is power in this approach! The hierarchy is obviously powerful, and sites like the World Bank's Open Knowledge use it, see okr.sector etc:

https://openknowledge.worldbank.org/handle/10986/21529?show=full

This, together with the proper Discovery config, is very useful:

https://wiki.duraspace.org/display/DSDOC4x/Discovery#Discovery-Hierarchical(taxonomiesbased)sidebarfacets
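If we go this way, the places that print subjects would need to cope with values like ILRI::Subjects::AGRICULTURE. A minimal sketch of splitting such a value and picking the last term for display, assuming "::" is the separator as in the example above (the function names are hypothetical):

```python
SEPARATOR = "::"


def split_taxonomy(value):
    """Split a hierarchical value into its levels, top level first."""
    return [part.strip() for part in value.split(SEPARATOR)]


def leaf_term(value):
    """Return the most specific term, e.g. for display in item lists."""
    return split_taxonomy(value)[-1]


if __name__ == "__main__":
    stored = "ILRI::Subjects::AGRICULTURE"
    print(split_taxonomy(stored))
    print(leaf_term(stored))
```

A flat value with no separator comes back unchanged from leaf_term, so existing non-hierarchical subjects would not need special handling.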

Add XMLUI strings for AbstractSearch

New Discovery sidebar facets are missing some XMLUI strings, which show up when you use the "View more" from the sidebar facet. So far I've seen:

xmlui.Discovery.AbstractSearch.type_status
xmlui.Discovery.AbstractSearch.type_outputtype
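Rather than waiting to spot more of these in the UI, we could compare the facet names against messages.xml. A sketch, assuming the keys follow the xmlui.Discovery.AbstractSearch.type_ pattern seen above and that messages.xml is a flat catalogue of message elements with a key attribute. The SAMPLE catalogue, its type_author entry and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

KEY_PREFIX = "xmlui.Discovery.AbstractSearch.type_"

# Tiny stand-in for messages.xml; the real file is much larger.
SAMPLE = """<catalogue xml:lang="en">
  <message key="xmlui.Discovery.AbstractSearch.type_author">Author</message>
</catalogue>"""


def missing_facet_keys(messages_xml, facet_names):
    """Return the expected message keys that the catalogue does not define."""
    root = ET.fromstring(messages_xml)
    defined = {message.get("key") for message in root.iter("message")}
    expected = [KEY_PREFIX + name for name in facet_names]
    return [key for key in expected if key not in defined]


if __name__ == "__main__":
    for key in missing_facet_keys(SAMPLE, ["author", "status", "outputtype"]):
        print(key)
```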

Update CCAFS metadata for Phase II

CCAFS is moving into Phase II and needs to update some of their metadata taxonomy. This requires:

  • Updating input-forms.xml for the new terminology
  • Batch updating fields on existing items in CGSpace via "Export metadata" (CSV)
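The CSV step could be scripted along these lines. This is only a sketch: the column name cg.subject.ccafs, the term names and the function name are hypothetical, and it assumes multiple values in one cell are separated by "||" as in DSpace metadata exports. Check the real export headers before using anything like it.

```python
import csv
import io

VALUE_SEPARATOR = "||"  # assumed separator for multi-valued cells


def rewrite_column(csv_text, column, replacements):
    """Replace old terms with new ones in one column of an exported CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.DictWriter(output, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        values = row[column].split(VALUE_SEPARATOR) if row[column] else []
        # Terms without a replacement are written back unchanged.
        row[column] = VALUE_SEPARATOR.join(
            replacements.get(value, value) for value in values
        )
        writer.writerow(row)
    return output.getvalue()


if __name__ == "__main__":
    exported = (
        "id,collection,cg.subject.ccafs\n"
        "1,10568/1,OLD TERM||UNCHANGED\n"
        "2,10568/1,\n"
    )
    print(rewrite_column(exported, "cg.subject.ccafs", {"OLD TERM": "NEW TERM"}))
```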

Disable non-caching cocoon pipeline for XMLUI themes

Some time during DSpace 1.7 or 1.8, we were seeing bizarre caching effects in XMLUI themes, like themes using CSS from one theme, but a banner from another. We initially suspected client-side browser caching but eventually found that the Cocoon pipeline was somehow messing up.

The fix at the time was to use a non-caching pipeline for XMLUI themes. It has been nearly two years since we made this fix, and we need to re-evaluate whether we still want this.

Remove manually generated thumbnails

Since DSpace 4.0 the automatically generated thumbnails are much better quality than the manual ones we had been creating. For items where there is an auto-generated thumbnail we should remove the manual ones.

Enable OpenSearch

We can use the OpenSearch backend in DSpace to provide custom RSS feeds based on OpenSearch queries, ie for certain subject terms like CRPs. This is useful because we can feed them to FeedBurner to get alerts when new items are added.

Add CIP subjects

List of tasks for CIP subjects, aka cg.subject.cip:

  • Add to metadata registry (control panel)
  • Add to browse indexes (dspace.cfg)
  • Add to search filters in discovery.xml
  • Add to sidebar facets in discovery.xml
  • Add strings to messages.xml
  • Add to Atmire CUA or Listings and Reports?
    • dspace/config/modules/atmire-listings-and-reports.cfg
    • dspace/config/modules/atmire-cua.cfg
  • Add code for clickable links to item-list.xsl and item-view.xsl

After adding to dspace.cfg and discovery.xml you need to re-index, ie dspace index-discovery -b.

Recent submissions on homepage

Bring back the recent submissions on the homepage. They were disabled a few months ago because it was kinda messy, but it's the only place where you can see all CG centers working together, so we want to show the latest from the entire CGSpace.

It should be somewhere in dspace/config/spring/api/discovery.xml (look in git history for that file, you'll find it).

Test DSpace 5.x

At the planning meeting in December we discussed the post-4.x roadmap. We agreed we'd try to start testing DSpace 5.x in March and hopefully deploy in June. No testing has happened yet.

We need to:

  • Create a 5_x-dev branch based on upstream's 5_x
  • Rebase our 4_x-prod on top of 5_x-dev
  • See what breaks!

CGSpace fails OAI validation at base-search

We fixed OAI support in #57, but now, testing CGSpace's OAI with the base-search validation tool fails with the following single fatal error:

ERROR: No incremental harvesting (day granularity) of ListRecords: Harvest for reference date 2015-02-14 returned no records. Please make sure that the from and until parameters are evaluated.

I've seen several messages on the dspace-tech mailing list about this, and will have to investigate their solutions. //cc @Jayakananthan
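To reproduce what the validator is doing, we can build the same kind of day-granularity ListRecords request by hand. A small sketch using the standard OAI-PMH parameters (verb, metadataPrefix, from, until); the base URL below is a placeholder, not CGSpace's actual OAI endpoint, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, day, metadata_prefix="oai_dc"):
    """Build a ListRecords request limited to a single day (YYYY-MM-DD)."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "from": day,
        "until": day,
    })
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder endpoint; substitute the repository's OAI request URL.
    print(list_records_url("https://example.org/oai/request", "2015-02-14"))
```

Fetching the resulting URL for a date on which items are known to have changed should show whether the from and until parameters are being honoured.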

Split up input-forms.xml and explore controlled vocabularies

input-forms.xml is one big, ugly file right now. It has many templates for many institutes, which makes it hard to manage, and always conflicts with mainline DSpace during upgrades.

We need to be able to split the institution-specific controlled vocabularies into separate files.
