data-lessons / library-sql-deprecated

SQLite lesson for librarians NOW MOVED > https://github.com/LibraryCarpentry/lc-sql

Home Page: https://github.com/LibraryCarpentry/lc-sql

License: Other

Makefile 3.09% HTML 36.30% CSS 3.36% JavaScript 0.95% Python 52.76% R 3.12% Shell 0.20% Ruby 0.22%

library-sql-deprecated's People

njsimons clintonroy c-martinez cmacdonell michielcock danmichaelo scriptotek libads icecjan amsichani thegsi mkuzak mttrb elainewong uio-library emanuelelanzani

library-sql-deprecated's Issues

update the dataset description

The available dataset description in 'Introduction' is quite minimal. Also, it should come after the 'Import' section.

Align objectives with learning outcome language in Bloom's

See http://swcarpentry.github.io/instructor-training/19-lessons/. Please edit the learning outcomes so they fit the language in Bloom’s taxonomy. The most important thing I learnt about this from Instructor Training is that we cannot use 'understand', as this isn't measurable.

Installation of SQL needs to be better explained

I am working through the lesson prior to teaching it and have stumbled across a few gotchas. Setting up SQLite needs to be a documented step-by-step process, as the language used on the SQLite download page is pretty hostile for Windows people. Install binaries? Huh? Install where? And then what? What I got when I installed was a command-line SQLite3, but then the lesson suddenly talks about SQLite Manager, which is a Firefox add-on. We need to walk people through getting that working, as that was slightly complicated as well. It is really important in these lessons that we do not alienate people by assuming knowledge or leaving steps out so that people feel stranded.

CONTRIBUTING.md: restore references to Data Carpentry

When customising the new template, I foolishly removed some references to Data Carpentry. They should be restored. Perhaps I should be the one to restore them.

Proposed change for 00-sql-introduction.md

"Introduction to SQLite Manager" should follow the "Download" section, because it expects you to have a database downloaded in SQLite Manager.

Proposed changes for 01-sql-basic-queries.md

Writing my first query

Requires explicit instructions to go to "Execute SQL" in SQLite

Calculated values

Suggest expanding g to grams and kg to kilograms, and likewise mg to milligrams in the Challenge underneath.
Suggest explaining why we use 1000.0 instead of 1000 when calculating weight in kilograms.
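
One way the 1000.0 point could be illustrated is with a tiny in-memory database. This is a minimal sketch using Python's built-in sqlite3 module; the one-column surveys table and its two weights are made up for illustration and are not the lesson's data. It shows that SQLite performs integer division when both operands are integers.

```python
import sqlite3

# Throwaway in-memory database; table and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (weight INTEGER)")  # weight in grams
conn.executemany("INSERT INTO surveys VALUES (?)", [(40,), (1500,)])

# weight / 1000   -> integer division, the fractional part is dropped
# weight / 1000.0 -> floating-point division, the fraction is kept
rows = conn.execute(
    "SELECT weight, weight / 1000 AS int_kg, weight / 1000.0 AS real_kg "
    "FROM surveys ORDER BY weight"
).fetchall()

for row in rows:
    print(row)
```

With integer division a 40 g animal would appear to weigh 0 kg, which is probably the reason the lesson divides by 1000.0.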

Functions

Suggest introducing ROUND in a little more detail, explaining how ROUND(weight/1000.0, 2) takes the value of weight/1000.0 and rounds it to two decimal places.
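
A short sketch of what such an explanation might show, again using Python's sqlite3 module. The weight of 1234 grams is an arbitrary example value, not taken from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A hypothetical weight of 1234 g, converted to kilograms.
unrounded = conn.execute("SELECT 1234 / 1000.0").fetchone()[0]

# ROUND(value, 2) keeps two digits after the decimal point.
rounded = conn.execute("SELECT ROUND(1234 / 1000.0, 2)").fetchone()[0]

print(unrounded, rounded)
```

The first argument to ROUND is the expression to round and the second is the number of decimal places.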

Escaping STRING

Suggest adding an explanatory box about why we escape STRING but not other data types. Can use both single quotes (') and double quotes (") for escaping.
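
One caveat such a box might want to cover: in SQLite, single quotes mark a string literal, while a double-quoted word is first looked up as an identifier (for example a column name) and is only treated as a string if no such identifier exists. The sketch below uses a made-up one-row articles table to show the difference; it is an illustration, not the lesson's own example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a single column called "title".
conn.execute("CREATE TABLE articles (title TEXT)")
conn.execute("INSERT INTO articles VALUES ('On Matrices')")

# Single quotes: the literal text 'title' is returned.
literal = conn.execute("SELECT 'title' FROM articles").fetchone()[0]

# Double quotes: "title" matches a column name, so the column value is returned.
column = conn.execute('SELECT "title" FROM articles').fetchone()[0]

print(literal, "|", column)
```

For that reason single quotes are the safer habit for text values, and numbers need no quotes at all.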

Filtering

Suggest adding explanatory box about operators i.e. <, <=, >, >=, =, !=
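
As a possible starting point for that box, the following sketch runs each comparison operator against the same small set of values. The articles table, the citation_count column and the four values are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, citation_count INTEGER)"
)
conn.executemany(
    "INSERT INTO articles (citation_count) VALUES (?)",
    [(0,), (3,), (5,), (12,)],
)


def count_where(condition):
    # The condition is a hard-coded string here; never build SQL
    # from untrusted user input this way.
    sql = f"SELECT COUNT(*) FROM articles WHERE {condition}"
    return conn.execute(sql).fetchone()[0]


# How many of the four rows satisfy "citation_count <op> 5"?
results = {
    op: count_where(f"citation_count {op} 5")
    for op in ["<", "<=", ">", ">=", "=", "!="]
}
print(results)
```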

Rephrasing needed for second exercise on aggregation

Had a question today about this exercise: "How many citations that were counted each month a) in total; b) per journal"

"How many citations that were counted each month" can be interpreted as "How many citations were made each month". We don't have any citation time data in the dataset, only article publication time, so that question would be impossible to answer with the data we have, but I still think we should try to rephrase it to make it more clear.

I struggle a little bit to come up with a good way of phrasing it though. Could be because the question is a bit artificial. Perhaps something like "the number of citations per (publication) month; a) ..."? Not sure if that is easy to understand. Help needed :)
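
To make the intended reading concrete, here is a sketch of the two queries the exercise seems to be asking for: citations summed by publication month, in total and per journal. The column names (issns, month, citation_count) and the rows are assumptions for illustration and may not match the lesson's table exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (issns TEXT, month INTEGER, citation_count INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [
        ("1111-1111", 1, 4),
        ("1111-1111", 1, 1),
        ("2222-2222", 1, 2),
        ("2222-2222", 2, 7),
    ],
)

# a) citations received by articles published in each month, in total
per_month = conn.execute(
    "SELECT month, SUM(citation_count) FROM articles "
    "GROUP BY month ORDER BY month"
).fetchall()

# b) the same, split per journal (identified here by its ISSN field)
per_month_and_journal = conn.execute(
    "SELECT month, issns, SUM(citation_count) FROM articles "
    "GROUP BY month, issns ORDER BY month, issns"
).fetchall()

print(per_month)
print(per_month_and_journal)
```

Phrasing the exercise in terms of "articles published in each month" may make it clearer that the month is the publication month, not the month of the citation.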

Add authors to AUTHORS file

We now have a workflow for releasing citable versions of our lessons (with DOIs) every 6 months via Zenodo. This makes our lessons more discoverable and sustainable and ensures that everyone involved gets the credit they deserve. For more on this work see data-lessons/librarycarpentry#5.

In order to make this happen we need to make one crucial change: all AUTHORS files need to change so that they list names of contributors in the following format:

James Allen
James Baker
Piotr Banaszkiewicz
Erin Becker

@jt14den will run a script that strips names from lesson logs and edits AUTHORS across all Library Carpentry repos.

When this is actioned (hopefully, soon!), lesson maintainers are asked to eyeball the AUTHORS file to see if anyone obvious is missing (for example, people who contributed to discussions but didn't edit any lessons). Note: template developers are credited in this process; this is in line with Software Carpentry best practice.

In the future, lesson maintainers are encouraged to ensure that those who contribute to lessons are added manually to AUTHORS files (encourage contributors to do it so they see where and how we give credit!)

expand the learning objectives of each episode

It would be useful to expand the learning objectives of each episode.

SQLite Manager not compatible with latest Firefox

Firefox 57 is great, but it comes with a new extension system, and it seems there is no easy way to migrate SQLite Manager to it (see lazierthanthou/sqlite-manager#75), so we might need to find a replacement.

Lesson episode headers

I've added headers to all the lessons, but they are very basic and based on no knowledge of SQL... Could someone who knows more take a look? (Who is maintaining this lesson, @mpfl @mkuzak?)

Rewrite sections "Filtering" and "Building more complex queries"

02-basic-queries SAYS:

Databases can also filter data – selecting only the data meeting certain criteria. For example, let’s say we only want data for a specific ISSN for the Theory and Applications of Mathematics & Computer Science journal, which has a ISSN code 2067-2764|2247-6202. We need to add a WHERE clause to our query:

SELECT *
FROM articles
WHERE issns='2067-2764|2247-6202';

But 2067-2764|2247-6202 is not an ISSN code; it is a combination of two ISSNs, pipe-separated.
If you want to match both at the same time, this is not the way, because:

in a row of the table they might be ordered differently within the field
these two could be among three or more
If you are also looking for rows that match only one of the two, this query would not return them either.

Perhaps focus first on fields that have a single entry?
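
The three problems can be reproduced with a few made-up rows. The sketch below assumes an articles table with title and issns columns; the LIKE-based workaround is only one possible approach, shown to make the issue concrete, and is not something the lesson currently does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (title TEXT, issns TEXT)")
conn.executemany(
    "INSERT INTO articles VALUES (?, ?)",
    [
        ("A", "2067-2764|2247-6202"),            # same order as the query
        ("B", "2247-6202|2067-2764"),            # reversed order
        ("C", "2067-2764"),                      # only one of the two
        ("D", "1234-5678|2247-6202|2067-2764"),  # among three
    ],
)


def titles(sql, params=()):
    return [row[0] for row in conn.execute(sql, params)]


# The lesson's equality test only finds the row with the exact same string.
exact = titles(
    "SELECT title FROM articles WHERE issns = ? ORDER BY title",
    ("2067-2764|2247-6202",),
)

# Wrapping the field in pipes lets LIKE match one complete ISSN anywhere.
has_issn = "'|' || issns || '|' LIKE '%|' || ? || '|%'"
contains_one = titles(
    f"SELECT title FROM articles WHERE {has_issn} ORDER BY title",
    ("2067-2764",),
)
contains_both = titles(
    f"SELECT title FROM articles WHERE {has_issn} AND {has_issn} ORDER BY title",
    ("2067-2764", "2247-6202"),
)

print(exact, contains_one, contains_both)
```

This is arguably too much for a first WHERE example, which supports the suggestion to start with single-valued fields.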

remove Data types and SQL Data Type Quick Reference sections

Starting the lesson with over twenty data types and the differences between database systems is very likely to overwhelm learners. I suggest removing those altogether and introducing data types along the way through examples.

00-sql-introduction.md: steps under "import" not clear

The steps under "Import" are not completely clear: they don't mention that you need to specify the location where your database will be saved.
Also, steps 6 and 10 assume that you have seen the content of the database: how do I know whether the first row has column headings, or which columns contain which data type, if I haven't seen the file? Better explain how to get a quick overview of the contents.
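
Opening the CSV in a text editor or spreadsheet is probably the simplest way to get that overview. For anyone who prefers a script, a minimal sketch in Python follows; the sample text stands in for a real file and its column names are invented for illustration.

```python
import csv
import io
from itertools import islice

# Stand-in for open("some-file.csv", newline=""); the content is made up.
sample = io.StringIO(
    "title,issns,citation_count\n"
    "Paper A,2067-2764,4\n"
    "Paper B,2247-6202,0\n"
    "Paper C,1234-5678,12\n"
)

# Read only the first three rows to see whether row 1 holds column
# headings and what kind of values each column contains.
preview = list(islice(csv.reader(sample), 3))

for row in preview:
    print(row)
```

If the first row contains names and the following rows contain values, the "first row contains column names" option applies, and the values hint at which columns are numeric.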

update set up page

The main lesson page links to a setup page that covers the lesson infrastructure setup but not the SQL setup. We could use the Software Carpentry setup instructions.

Library-centric dataset

Might be nice to find a dataset that's more library-centric. Perhaps some fake bib records and a fake patron borrowing history?

Put "Why use SQL?" section on top

Readers want "is this for me?" answered as soon as possible, or they might leave before reaching it.

Readme

Please ensure README.md is consistent with template elsewhere, e.g. https://github.com/data-lessons/library-data-intro/blob/gh-pages/README.md

One key thing missing at present is maintainers.

Library challenges

Challenges on lessons 02-sql-aggregation.md and 03-sql-joins-aliases.md are still in ecology speech. Should be migrated to library speech.

non- 'single atomic value'

http://data-lessons.github.io/library-sql/05-supplement/ mentions:

"Every row-column combination contains a single atomic value, i.e., not containing parts we might want to work with separately"

However, the example has a column named ISSNs (plural, pipe-separated).
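
For comparison, this is roughly what an atomic layout could look like: each ISSN moves to its own row in a separate table. The table and column names (articles, article_issns, article_id, issn) are hypothetical; the sketch only illustrates the principle quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE article_issns (article_id INTEGER, issn TEXT)")

# Source rows in the current, non-atomic shape (made-up data).
source = [("Paper A", "2067-2764|2247-6202"), ("Paper B", "2247-6202")]

for title, issns in source:
    cur = conn.execute("INSERT INTO articles (title) VALUES (?)", (title,))
    # One row per ISSN, linked back to the article by its id.
    conn.executemany(
        "INSERT INTO article_issns VALUES (?, ?)",
        [(cur.lastrowid, issn) for issn in issns.split("|")],
    )

issn_rows = conn.execute("SELECT COUNT(*) FROM article_issns").fetchone()[0]

# A single ISSN can now be matched with a plain equality test.
matches = [
    row[0]
    for row in conn.execute(
        "SELECT a.title FROM articles AS a "
        "JOIN article_issns AS i ON i.article_id = a.id "
        "WHERE i.issn = ? ORDER BY a.title",
        ("2247-6202",),
    )
]
print(issn_rows, matches)
```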

backup when SQL not working

Sometimes things don't work on an attendee's laptop: the software is installed incorrectly, they've installed the wrong thing, or it all just unfathomably doesn't work.

Now, pair programming is great from a pedagogical point of view, so "work with someone else" is a good option. But prompted by @weaverbel, we should consider adding a backup to our Instructor Notes.

For shell (data-lessons/library-shell-DEPRECATED#53) and OpenRefine (data-lessons/library-openrefine-DEPRECATED#153) there are web-based options we can use. What is possible with this lesson? (If nothing, fine; it is probably worth putting that in the notes!)

Proposed changes for 00-sql-introduction.md

Suggested corrections/clarifications

Relational databases

The introduction should be fleshed out to include the concept of relationships between tables.

Dataset description

Perhaps a diagram of how the three tables relate to each other.

Import

Step 1: There are four CSV files at the target URL. The three to use (plots, species, surveys) should be explicitly named in step 1 (they appear again in step 5)
Step 1: Link to Portal Project via DOI for long-term link stability: https://dx.doi.org/10.6084/m9.figshare.1314459.v5
Step 2: Does not mention that SQLite will ask for a database name. Suggest "portal"
Step 2: Does not mention that SQLite will want to save the database to a folder.
Step 6: How do we know if the first row contains column headers?
Step 10: INTEGER, not INT. What about the Primary Key? What about data types for columns not mentioned?
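
For reference, the table that step 10 builds through the import dialog corresponds to a CREATE TABLE statement along these lines. This is a sketch assuming the Portal plots table has the two columns plot_id and plot_type; the exact columns and types should be checked against the CSV.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed layout of the plots table: an integer key plus a text label.
conn.execute(
    "CREATE TABLE plots ("
    "  plot_id   INTEGER PRIMARY KEY,"
    "  plot_type TEXT"
    ")"
)

# PRAGMA table_info lists (cid, name, type, notnull, default, pk) per column.
info = conn.execute("PRAGMA table_info(plots)").fetchall()
summary = [(name, col_type, pk) for _, name, col_type, _, _, pk in info]
print(summary)
```

In SQLite the exact spelling INTEGER PRIMARY KEY also makes the column an alias for the built-in rowid, which is one reason to prefer INTEGER over INT for the key column.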

Data Types

Some explanatory text required.

00-sql-introduction.md: data types not clear

Some of the entries in the table of "Data types" have the same description, e.g. INTEGER(p), SMALLINT, INTEGER and BIGINT are all described as "Integer numerical (no decimal)". That's a little confusing: why have different data types that appear to be the same?

Also, perhaps add examples of data types (at least, for some).
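
Part of the answer, for SQLite at least, is that these declared types really do behave the same: any declared type containing "INT" gets integer affinity. Other database systems use the different names to set storage size and value range. The sketch below uses a throwaway table t with invented columns to show the SQLite side of this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three differently named integer types in one hypothetical table.
conn.execute("CREATE TABLE t (a SMALLINT, b INTEGER, c BIGINT)")
conn.execute("INSERT INTO t VALUES (1, 2, 3)")

# typeof() reports how SQLite actually stored each value.
types = conn.execute("SELECT typeof(a), typeof(b), typeof(c) FROM t").fetchone()
print(types)
```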

replace the SQLite download with SQLmanager

Following a previous issue about better explanation of SQL installation, we suggest avoiding the 'SQLite download' step by simply using the SQLite Manager plugin for the Firefox web browser; for instructions see also http://www.datacarpentry.org/sql-ecology-lesson/setup/.

Add new admin for this repo

Hi @tracykteal - can you also please add @c-martinez as an admin on this repo?

Proposed changes for 03-sql-joins-aliases.md

Joins

Specify that USING needs to have brackets
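
A small sketch of the point, assuming two made-up tables that share an issns column: with the brackets the join works, and without them SQLite reports a syntax error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables sharing a column named issns.
conn.execute("CREATE TABLE articles (title TEXT, issns TEXT)")
conn.execute("CREATE TABLE journals (issns TEXT, journal_title TEXT)")
conn.execute("INSERT INTO articles VALUES ('Paper 1', '2067-2764')")
conn.execute("INSERT INTO journals VALUES ('2067-2764', 'Journal A')")

# Correct: the shared column name goes inside brackets.
joined = conn.execute(
    "SELECT title, journal_title FROM articles JOIN journals USING (issns)"
).fetchall()

# Without brackets the statement does not parse.
try:
    conn.execute("SELECT title FROM articles JOIN journals USING issns")
    brackets_required = False
except sqlite3.OperationalError:
    brackets_required = True

print(joined, brackets_required)
```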

Lesson maintainers

This lesson has no maintainers listed. Maintainers perform a number of important tasks:

make sure their lesson is consistent with the other Library Carpentry lessons. For example, that the Readme or License pages are correct and consistent (indeed the readme does need a little work data-lessons/library-webscraping-DEPRECATED#28)
address any issues that are raised against the lesson
deal with any pull requests that are made for the lesson.
after a lesson is taught, make sure that suggestions for improvement from learners and instructors are integrated
as this is a new lesson, helping it get through the (new) incubator process data-lessons/librarycarpentry#22
and, ideally, keep up with general Library Carpentry chatter at https://gitter.im/weaverbel/LibraryCarpentry

The lesson needs two maintainers, but the more the merrier, especially if we can ensure a good mix of timezones. Anyone up for it?

Proposed changes for 02-sql-aggregation.md

The HAVING keyword

Why do we suddenly reference table.column (surveys.species_id)? Not needed until we start using JOIN
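
To back this up, the sketch below runs the same HAVING query with and without the table prefix on a made-up single-table example and gets identical results. The surveys rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species_id TEXT)")
conn.executemany(
    "INSERT INTO surveys VALUES (?)",
    [("DM",), ("DM",), ("DM",), ("DO",), ("DO",), ("PP",)],
)

# Plain column name: enough while only one table is involved.
plain = conn.execute(
    "SELECT species_id, COUNT(*) FROM surveys "
    "GROUP BY species_id HAVING COUNT(*) > 1 ORDER BY species_id"
).fetchall()

# table.column form: same result, only needed once a JOIN makes names ambiguous.
qualified = conn.execute(
    "SELECT surveys.species_id, COUNT(*) FROM surveys "
    "GROUP BY surveys.species_id HAVING COUNT(*) > 1 "
    "ORDER BY surveys.species_id"
).fetchall()

print(plain, qualified)
```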

Saving queries for future use

In the Challenge, perhaps remind students of ORDER BY ... ASC/DESC
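
A possible reminder, assuming the section saves queries as views (as the Data Carpentry ecology lesson does): a saved view can be sorted like any table. The view name species_counts and the data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species_id TEXT)")
conn.executemany(
    "INSERT INTO surveys VALUES (?)",
    [("DM",), ("DM",), ("DM",), ("DO",), ("DO",), ("PP",)],
)

# Save an aggregation query under a name for later reuse.
conn.execute(
    "CREATE VIEW species_counts AS "
    "SELECT species_id, COUNT(*) AS n FROM surveys GROUP BY species_id"
)

# ASC sorts smallest first (the default); DESC sorts largest first.
ascending = conn.execute(
    "SELECT species_id, n FROM species_counts ORDER BY n ASC"
).fetchall()
descending = conn.execute(
    "SELECT species_id, n FROM species_counts ORDER BY n DESC"
).fetchall()

print(ascending)
print(descending)
```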

Instructor Notes

This lesson has no Instructor Notes at http://data-lessons.github.io/library-sql/guide/. These are helpful for passing on lesson-specific tips to potential instructors. See http://data-lessons.github.io/library-data-intro/guide/ for an example of how this might be done.

add a learner's profile

Add a learner's profile (see also the learner's profile example in instructor training).

Handout

This lesson might benefit from making a handout of reference materials.

To do this, add detail of commands/terminology under the keypoints headers for each lesson: for example, https://github.com/data-lessons/library-data-intro/blob/gh-pages/_episodes/04-regular-expressions.md. This then effectively builds a handout (for example, at http://data-lessons.github.io/library-data-intro/reference/) which can be printed out in advance of the session (librarians love handouts!)

Make sure you make a note of this in your Instructor Notes #49

Install sqlite3 command line tools or Firefox sql manager plugin as prerequisite

It might be useful to acoit

different joins, other than rows 1:1 , name collisions for rows

In http://data-lessons.github.io/library-sql/04-joins-aliases/

I see nothing mentioned about different types and naming for joins, as in https://www.w3schools.com/sql/sql_join.asp

I'd like a more explicit explanation of what happens when the match is other than a 1:1 row match.

Also, two tables can have exactly the same row names. I'd like to see an explanation, or at least a remark.
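
The kind of example that could address both points is sketched below, reading "row names" as column names (an assumption). Two made-up tables both have a title column; one journal has two articles, one has none, and one article has no journal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Both tables have columns called id and title (hypothetical schema).
conn.execute("CREATE TABLE journals (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, journal_id INTEGER)"
)
conn.executemany(
    "INSERT INTO journals VALUES (?, ?)", [(1, "Journal A"), (2, "Journal B")]
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(1, "Paper 1", 1), (2, "Paper 2", 1), (3, "Paper 3", None)],
)

# An unqualified shared column name is rejected as ambiguous.
try:
    conn.execute(
        "SELECT title FROM journals JOIN articles "
        "ON articles.journal_id = journals.id"
    )
    ambiguous = False
except sqlite3.OperationalError:
    ambiguous = True

# Inner join: one output row per matching pair, unmatched rows are dropped.
inner = conn.execute(
    "SELECT j.title, a.title FROM journals AS j "
    "JOIN articles AS a ON a.journal_id = j.id ORDER BY a.id"
).fetchall()

# Left join: every journal is kept, with NULL where no article matches.
left = conn.execute(
    "SELECT j.title, a.title FROM journals AS j "
    "LEFT JOIN articles AS a ON a.journal_id = j.id ORDER BY j.id, a.id"
).fetchall()

print(ambiguous, inner, left)
```

Journal A appears twice because it matches two articles, and aliases (j, a) are what keep the two title columns apart.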