data-lessons / library-sql-deprecated Goto Github PK
View Code? Open in Web Editor NEWSQLite lesson for librarians NOW MOVED > https://github.com/LibraryCarpentry/lc-sql
Home Page: https://github.com/LibraryCarpentry/lc-sql
License: Other
SQLite lesson for librarians NOW MOVED > https://github.com/LibraryCarpentry/lc-sql
Home Page: https://github.com/LibraryCarpentry/lc-sql
License: Other
The available dataset description in 'Introduction' is quite minimal. Also, it should come after the 'Import' section.
See http://swcarpentry.github.io/instructor-training/19-lessons/ Please edit learning outcomes so they fit the language in Bloom’s taxonomy. The most important think I learnt about this from Instructor Training is that we cannot use 'understand' as this isn't measurable.
I am working through the lesson prior to teaching it and have stumbled across a few gotchas. Setting up SQLite needs to be a documented step-through process as the language used on the SQLite download page is pretty hostile for Windows people. Install binaries? Huh? Install where? And then what? What I got when I installed was a command line SQLite3 but then the lesson suddenly talks about SQLite Manager which is a Firefox Add on - we need to walk people through getting that working as that was slightly complicated as well. It is really important in these lessons that we do not alienate people by assuming knowledge or leaving steps out so people feel stranded.
When customising the new template, I foolishly removed some references to Data Carpentry. They should be restored. Perhaps I should be the one to restore them.
"Introduction to SQLite Manager" should follow the "Download" section, because it expects you to have a database downloaded in SQLite Manager.
Writing my first query
Calculated values
Functions
Escaping STRING
Filtering
Had a question today about this exercise: "How many citations that were counted each month a) in total; b) per journal"
"How many citations that were counted each month" can be interpreted as "How many citations were made each month". We don't have any citation time data in the dataset, only article publication time, so that question would be impossible to answer with the data we have, but I still think we should try to rephrase it to make it more clear.
I struggle a little bit to come up with a good way of phrasing it though. Could be because the question is a bit artificial. Perhaps something like "the number of citations per (publication) month; a) ..."
? Not sure if that is easy to understand. Help needed :)
We now have a workflow for releasing citable versions of our lessons (with DOIs) every 6 months via Zenodo. This makes our more discoverable and sustainable and ensures that everyone involved gets the credit they deserve. For more on this work see data-lessons/librarycarpentry#5
In order to make this happen we need to make one crucial change: all AUTHORS files need to change so that they list names of contributors in the following format:
James Allen
James Baker
Piotr Banaszkiewicz
Erin Becker
@jt14den will run a script that that strips names from lesson logs and edit AUTHORS across all Library Carpentry repos.
When this is actioned (hopefully, soon!), lesson maintainers are asked to eyeball the AUTHORS file to see if anyone obvious is missing (for example, people who contributed to discussions but didn't edit any lessons). Note: template developers are credited in this process; this is in line with Software Carpentry best practice.
In the future, lesson maintainers are encouraged to ensure that those who contribute to lessons are added manually to AUTHORS files (encourage contributors to do it so they see where and how we give credit!)
It would be useful to expand the learning objectives of each episode.
Firefox 57 is great, but it comes with a new extension system, and it seems like there is no easy way to migrate SQLite Manager to it: lazierthanthou/sqlite-manager#75 (comment) , so we might need to find a replacement.
02-basic-queries SAYS:
Databases can also filter data – selecting only the data meeting certain criteria. For example, let’s say we only want data for a specific ISSN for the Theory and Applications of Mathematics & Computer Science journal, which has a ISSN code 2067-2764|2247-6202. We need to add a WHERE clause to our query:
SELECT *
FROM articles
WHERE issns='2067-2764|2247-6202';
But 2067-2764|2247-6202 is not a ISSN code, it's a combination of 2 (pipe-separated).
In case you want to match both at the same time this is not the way, cause
First focus on fields that have 1 entry, perhaps ?
Starting the lesson with over twenty data types and differences between different database systems is very likely to overwhelm learners. I suggest removing those all together and introduce data types along the way through examples.
The steps under "Import" are not completely clear: it doesn't mention that you specify the location where your database needs to be saved.
Also, it assumes in step 6 and 10 that you have seen the content of the database: how do I know whether the first row has column headings or what columns contain which data type if I haven't seen the file? Better explain how to get a quick overview of the contents.
The main lesson page links to a set up page which is the lesson infrastructure set up but not the SQL set up. We could use the software carpentry set up instructions.
Might be nice to find a dataset that's more library-centric. Perhaps some fake bib records and a fake patron borrowing history?
Reader wants "is this for me?" answered asap or might leave before
Please ensure README.md is consistent with template elsewhere, e.g. https://github.com/data-lessons/library-data-intro/blob/gh-pages/README.md
One key thing missing at present is maintainers.
Challenges on lessons 02-sql-aggregation.md
and 03-sql-joins-aliases.md
are still in ecology speech. Should be migrated to library speech.
http://data-lessons.github.io/library-sql/05-supplement/
mentions
" Every row-column combination contains a single atomic value,i.e.,not containing parts we might want to work with separately "
However, the example has a column name ISSNs (plural, pipe-separated)
Sometimes things doesn't work on an attendees laptop: the software is installed incorrectly, they've installed the wrong thing, it all just unfathomably doesn't work.
Now, peer programming is great from a pedagogical point of view, so "work with someone else" is a good option. But prompted by @weaverbel, we should consider adding a backup to our Instructor Notes.
For shell data-lessons/library-shell-DEPRECATED#53 and refine data-lessons/library-openrefine-DEPRECATED#153 there are web based options we can use. What is possible with this lesson? (if nothing, fine - it is probably worth putting that in the notes!)
Suggested corrections/clarifications
Relational databases
Dataset description
Import
Data Types
Some of the entries in the table of "Data types" have the same description, e.g. INTEGER(p), SMALLINT, INTEGER and BIGINT are all described as "Integer numerical (no decimal)". That's a little confusing: why have different data types that appear to be the same?
Also, perhaps add examples of data types (at least, for some).
Following a previous issue about better explanation of SQL installation, we suggest to avoid the 'SQLite download' by simply using the SQLite Manager plugin for the Firefox web browser; for instructions see also http://www.datacarpentry.org/sql-ecology-lesson/setup/ .
Hi @tracykteal - can you also please add @c-martinez as an admin on this repo?
Joins
USING
needs to have bracketsThis lesson has no maintainers listed. Maintainers perform the following tasks:
Maintainers perform a number of important tasks:
The lesson needs two maintainers, but more the merrier, especially if we can ensure a good mix of timezones. Anyone up for it?
The HAVING
keyword
surveys.species_id
)? Not needed until we start using JOIN
Saving queries for future use
ORDER BY
... ASC/DESC
This lesson has no Instructor Notes http://data-lessons.github.io/library-sql/guide/. These are helpful for passing on lesson specific tips to potential instructors. See http://data-lessons.github.io/library-data-intro/guide/ for an example of how this might be done.
Add a learner's profile (see also the learner's profile example) in instructor training.
This lesson might benefit from making a handout of reference materials.
To do this add detail of commands/terminology under the keypoints headers for each lesson: for example, https://github.com/data-lessons/library-data-intro/blob/gh-pages/_episodes/04-regular-expressions.md. This effectively then builds a handout at - for example http://data-lessons.github.io/library-data-intro/reference/ - which can be printed out in advance of the session (librarians love handouts!)
Make sure you make a note of this in your Instructor Notes #49
Please delete the text below before submitting your contribution.
Thanks for contributing! If this contribution is for instructor training, please send an email to [email protected] with a link to this contribution so we can record your progress. You’ve completed your contribution step for instructor checkout just by submitting this contribution.
Please keep in mind that lesson maintainers are volunteers and it may be some time before they can respond to your contribution. Although not all contributions can be incorporated into the lesson materials, we appreciate your time and effort to improve the curriculum. If you have any questions about the lesson maintenance process or would like to volunteer your time as a contribution reviewer, please contact Kate Hertweck ([email protected]).
In http://data-lessons.github.io/library-sql/04-joins-aliases/
I see nothing mentioned about different types and naming for joins, as in https://www.w3schools.com/sql/sql_join.asp
I'd like it more explicitly explained what happens when it's other than 1:1 row-match
Also 2 tables can have exact same row names. I'd like to see an explanation, or a remark at least
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.