@martin, @erasche, @bebatut,
Since we migrated the public Galaxy server list on the hub from a monolithic web page to a directory-based approach, I think it would be easy to programmatically generate this CSV from that directory structure. Here's how:
Current columns
name
This is title in the server page metadata.
url
This is url in the server page metadata. This would require checking that every one of these actually points to the server. (I think they do - I'll be visiting every page anyway.)
support
As far as I can tell, these are all email addresses. These do not exist in the current metadata, although sometimes they are in the User Support section of the page content.
Are these all supposed to be a single email address? Are there other options we could support here, like a semicolon-separated list of emails, or a URL?
See email_contacts below.
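If the support column did accept a semicolon-separated list that mixes emails and URLs, parsing it could look roughly like the sketch below. The function name parse_support and the rule "anything starting with http:// or https:// is a URL, everything else is an email" are assumptions for illustration, not something the current CSV defines.

```python
def parse_support(value):
    """Split a support field into (kind, entry) pairs.

    Assumption: entries are separated by semicolons, and an entry is a URL
    if it starts with http:// or https://; otherwise it is treated as an email.
    """
    entries = []
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # ignore empty pieces, e.g. from a trailing semicolon
        kind = "url" if part.startswith(("http://", "https://")) else "email"
        entries.append((kind, part))
    return entries
```

A single email address still parses to a one-element list, so existing rows would not need to change.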
location
This is a standard two letter country code.
See home_country_code below.
tags
I was thinking about adding tags to the server pages and I asked @dannon to look into metalsmith support for tags, but I also told him it was an unimaginably low priority. We can support tags in the page metadata before we do anything with them in the hub. Some of the tags are already on the pages, but with a different name:
server_group: "general"
There are three groups: general, domain, and tool-publishing. general maps to genomics, and tool-publishing maps to tools. Those two are easy.
Domain-specific tags like phage aren't currently supported in the hub.
See tags below.
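To make the column mapping above concrete, here is a minimal sketch of turning one server page's metadata into a CSV row. It assumes the metadata has already been loaded into a dict, that the proposed email_contacts, home_country_code, and tags keys exist, and that tags are joined with commas inside the CSV cell. The function names and the tag delimiter are illustrative assumptions; only the general/tool-publishing mapping comes from the description above.

```python
import csv
import io

# Mapping described above; "domain" has no single CSV tag equivalent.
GROUP_TO_TAG = {"general": "genomics", "tool-publishing": "tools"}

CSV_COLUMNS = ["name", "url", "support", "location", "tags"]


def metadata_to_row(meta):
    """Build one CSV row (as a dict) from a server page's metadata dict."""
    tags = list(meta.get("tags", []))
    group_tag = GROUP_TO_TAG.get(meta.get("server_group"))
    if group_tag and group_tag not in tags:
        tags.insert(0, group_tag)
    return {
        "name": meta["title"],
        "url": meta["url"],
        "support": meta.get("email_contacts", ""),
        "location": meta.get("home_country_code", ""),
        "tags": ",".join(tags),  # assumed delimiter within the tags cell
    }


def rows_to_csv(rows):
    """Serialize row dicts to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Walking the server directory and reading each page's metadata is left out here, since that depends on how the hub stores the pages.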
Proposed new columns and metadata
info_page, in CSV
URL of the server's information page on the hub.
email_contacts, in Hub
Copied from support in CSV.
home_country_code, in Hub
Copied from location in CSV.
But ...
Country codes are not as informative as country names
Displaying "DK" in the hub is not informative. But country names are ambiguous, and 5 names can map to one country.
What say ye?
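One option is to store the code and only translate it to a name at display time. A rough sketch, assuming a small hand-maintained lookup table (the COUNTRY_NAMES dict and display_country function are hypothetical; a fuller table would come from ISO 3166-1 data):

```python
# Hypothetical, partial lookup table keyed by two-letter country code.
COUNTRY_NAMES = {
    "DK": "Denmark",
    "DE": "Germany",
    "US": "United States",
}


def display_country(code):
    """Return a display name for a country code, falling back to the code."""
    return COUNTRY_NAMES.get(code.strip().upper(), code)
```

This keeps the unambiguous code in the metadata and sidesteps the multiple-names problem, since only one display name is ever chosen per code.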
More location?
@bebatut and I have discussed having Event locations be free-form text, but specific enough that we could pass the string to a mapping service and get a geolocation back.
Should we do that with location, or is country all we'll ever care about (or all we care about now :-)?
I'm OK with country code for now.
I just don't want to display it, and it's easy to change this programmatically later if we want to go there.
tags, in Hub
Initially populated from tags in CSV. Combined with server_group when updating tags in CSV.
Mixed Model
We don't have to go fully one way or the other. We could use a mixed model where the file can be updated both programmatically and manually. The program would read in the CSV first, and then update information in place. It would report on any updates it made, and on anything that's in the CSV but not in the Hub.
Differences would be reconciled before the new CSV is pushed.
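The mixed model could be sketched roughly as follows. This assumes rows are matched to Hub pages by url and that the Hub metadata has already been converted to CSV column names. The reconcile function and its return shape are illustrative only.

```python
import csv
import io


def reconcile(csv_text, hub_servers, key="url"):
    """Update CSV rows in place from Hub data and report what changed.

    hub_servers maps a key value (assumed: the server URL) to a dict of
    CSV column name -> value taken from the Hub page metadata.
    Returns (rows, updates, csv_only), where updates is a list of
    (key value, column, old value, new value) tuples and csv_only lists
    key values present in the CSV but missing from the Hub.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    updates = []
    csv_only = []
    for row in rows:
        hub = hub_servers.get(row[key])
        if hub is None:
            csv_only.append(row[key])  # manually added row; leave untouched
            continue
        for column, new_value in hub.items():
            if column in row and row[column] != new_value:
                updates.append((row[key], column, row[column], new_value))
                row[column] = new_value
    return rows, updates, csv_only
```

The updates and csv_only lists are what a person would review before the new CSV is pushed.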