Giter Site home page Giter Site logo

dassl / classdb Goto Github PK

View Code? Open in Web Editor NEW
7.0 5.0 2.0 2.86 MB

An open-source system to let students experiment with relational data

Home Page: https://dassl.github.io/ClassDB/

License: Other

PLpgSQL 79.41% TSQL 20.59%
foss education database lab sql postgresql students instructors

classdb's Introduction

ClassDB

ClassDB is an open-source database application to help instructors provide students an environment to experiment with relational data. With ClassDB, each student gets their own space (think schema) with all rights to that space. The instructor is able to read from any student's space, but no student has access to another student's space. Further, the instructor can create additional common spaces with selected rights for students.

Instructors can use ClassDB to accept class assignments and term projects in both introductory courses on data management and upper-level courses where students program against a database.

ClassDB is developed at the Data Science & Systems Lab (DASSL, read dazzle) at the Western Connecticut State University (WCSU) by Sean Murthy, Andrew Figueroa, and Steven Rollo.

Documentation

The documentation for the latest release is on the ClassDB web site.

The documentation for the version in development is on the ClassDB Wiki.

Contributing

Contributions and ideas are welcome. Mail a summary of your thoughts to murthys at wcsu dot edu. Please mention "ClassDB" in the subject line.

Legal Stuff

(C) 2017- DASSL. ALL RIGHTS RESERVED.

ClassDB is distributed under Creative Commons License BY-NC-SA 4.0.

PROVIDED AS IS. NO WARRANTIES EXPRESSED OR IMPLIED. USE AT YOUR OWN RISK.

classdb's People

Contributors

afig avatar kevinkelly25 avatar s-aithal avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

Forkers

kevinkelly25

classdb's Issues

Using Postgres objects rather than Information_Schema (W)

metaFunctions.sql uses pg_views, prepareClassDB.sql uses pg_roles. Using Information_Schema increases the use of standards compliant code (desirable), and makes it easier to port the scripts to another implementation.

Evaluate the suitability of the applicable views provided by Information_Schema and implement them wherever possible.

Automate adding users to pgAdmin (M)

pgAdmin stores its user data in a separate sqlite database. We should see if a script can add user data to this database, and if user information can be synced between postgres and pgAdmin.

Can be hard to figure out the initial password for a student (E)

Presently, a student's password is either something explicitly supplied or a school ID if it exists, or the same as the user name. However, after a student is created, it is not clear what the initial password is. This will especially be a problem if students are created in batch by reading a roster.

I propose we remove the automatic choice of school ID (if it exists) as the initial password. Specifically, use the supplied value if provided or use the username. This revision will make it easier for instructors to communicate initial password to students and make the logic uniform for all kinds of users.

All privileges needed for functions are also given to DBManagers (W)

Since all SECURITY DEFINER functions are currently set to be owned by DBManager, all privileges necessary for the functions to operate are also given to users with the DBManager role.

Some of these privileges can lead making changes that cause anomalies in an instance of ClassDB. One issue in particular is the ability to modify the userName column in the classdb.Student and classdb.Instructor tables (or alternatively, the potential future classdb.User table). We cannot restrict that privilege to only one function, but rather only to a particular role.

It seems that most straightforward approach to solve this would be to create another role that would own the functions marked as SECURITY DEFINER. This new role would need the createrole privilege, CREATE on current_database, and ALL on the classdb.Student and classdb.Instructor / classdb.User tables. The DBManager and Instructor roles would then only have SELECT granted on the userName column in those tables. This new role could also become the owner of the classdb schema. (A potential name for this new role could also be "ClassDB".)

I believe that Dr. Murthy had mentioned a set up similar to this previously.

Effective user permission levels have not been checked (M)

Tests should be run to ensure users only have permissions on the appropriate objects and schemas for their role(s). These tests should be run on a complete instance of the application using roles that have been created by the application.

No easy way to create a ClassDB database (M)

I propose a createClassDB.sql to create a DB with the role ClassDB as owner. This script should be run after running prepareClassServer.sql but before running prepareClassDB.sql.

  • Assert current user has CREATEDB privilege
  • Test if app-specific roles are already created (as in prepareClassDB.sql)
  • Create a database with the given name and assign ownership to role ClassDB
  • Assign privileges to ClassDB for PUBLIC schema etc. if that is required

Parameterizing the script to accept a database name probably fixes psql as the client, but it can be instructive to consult the docs for the createdb program.

Repo directory structure needs revision (W)

Directories need to be urgently reorged. It will be nice if we can create the directory structure, move files, and merge to master before making any more code changes.

Here is a summary of the top-level dirs based on e-mail conversations on this topic. I am making it a to do list so it can be updated.

  • docs can directly contain .MD files or sub-dirs md and html if we have both MD and equivalent HTML files. We should start with the assumption that we will have only MD files. So, it is OK to start by putting MD files directly in the docs folder
  • examples can contain all files related to example schemas. For now we will have only the Shelter schema. As long as we have only one schema, we can put all the files directly inside the examples folder with a README
  • src can contain all the .sql files (excluding those intended for testing). At this point, there is no reason to maintain sub-dirs
  • tests can contain all files related to testing. It should have a README

Student roles need to have a statement timeout (M)

In order to prevent confusion and unnecessary long-running queries, student roles should have a timeout on execution time for queries sent by them. This will mitigate frustration if students do not know how to cancel a query, and communicate to users that a long running time is likely to be an issue with the query, rather than a connection/server-client issue.

The amount of time to allow queries also needs to be decided. The setting is measured in milliseconds.

Some password scripts may be redundant due to psql's password command (E)

The \p (password) command provided by the psql client provides a better way for users to change their password. It is also more secure than the functions created by the scripts, since the password is never displayed or transmitted in clear-text when using the command. Consider whether all functions in passwordProcedures.sql are necessary.

testPrepareClassDB fails at classdb.createStudent() (W)

testPrepareClassDB.sql throws an exception when executing classdb.createStudent() for the first example student. There error appears to be from the ALTER DEFAULT PRIVILEGES statement. The docs state that, "You can change default privileges only for objects that will be created by yourself or by roles that you are a member of," which I think may be the root of the issue. We may be able to solve this with SET ROLE or similar.

prepareClassDB does not grant CONNECT to current DB if user is not the owner of the DB or a superuser (W)

Only the owner of the current database or a superuser can grant roles CONNECT on the current database. The script needs to be updated to declare this, and check to see if the current user is either the owner of the current database or a superuser.

In its current state, the script does not fail if the current user is not the DB owner or a superuser, but does result in several WARNINGS, along with not allowing the proper roles to connect to the DB.

Tab and line widths are inconsistent in .sql files (W)

Not being format-police, but:

At least in prepareClassDB.sql, some tabs are 3 spaces and others are 4 spaces; some tabs are spaces; others are tab characters (or so it seems). Likewise, some code is broken to be under ~80 chars/line, but other lines have 120+ chars.

Any decision is OK as long we have consensus and follow it reasonably. Specifically, line width does not have to be exactly adhered, but we shouldn't also exceed it frequently by more than 5-10 chars.

Student table does not capture query activity (M)

Rename attr LastActivity to lastDDLStartTime; add attr such as ddlStartCount which maintains a count of ddl starts; consider capturing some information about the last ddl that was run, e.g. object name, operation, etc. (say, DROP TABLE X)

Sean Murthy (smurthys)

PVFC schema is not populated (M)

The script createPVFC.sql does not currently contain any meaningful queries. We will have to port the existing schema over to Postgres.

Functions are not Dropped Prior to Creation (W)

Most functions in ClassDB are created with CREATE OR REPLACE and are not dropped prior to creation. We should find all instances of CREATE OR REPLACE and replace them with CREATE and a preceeding DROP FUNCTION IF EXISTS. This will help in making the setup scripts idempotent.
We can maintain a list of instances that need to be/have been fixed here:

Table definition is not modular (N)

This issue is based on a comment by @srrollo.

Club creation of logging-related fields in table Student with the appropriate logging-code: use ALTER table to add the fields lastConnection and connectionCount just before the procedure to import server log. Also, add the fields lastDDL* just before the function to update DDL activity.

At end only the core fields should remain in CREATE TABLE STUDENT.

FInd a replacement for the NAME data type (W)

NAME seems to be heavily discouraged for non-internal operations, however, there does not seem to be a straightfoward way to implement the userName field, without placing a constraint on the rolenames allowed. Andrew Figuero(@afig)

prepareClassDB: Privileges on public schema are not set if the user running the script is not a superuser (W)

Currently, if the script is run as a user who is not a superuser, privileges on the public schema are not revoked and the following warning is reported:

prepareClassDB.sql:53: WARNING: no privileges could be revoked for "public"

In short, only the owner of a schema can revoke privileges on it. By default, the owner of the public schema is the postgres user, not the owner of the database.

Two possible solutions:

  1. Require the user running the script to be a superuser (easiest, but most restrictive, and might not be a good idea in general).

  2. Require the person setting up the ClassDB instance to ensure that the user they are running the script as has been set as the owner of the public schema. Issue a descriptive warning/exception if they are not the owner.

Tests for existence of objects use unnecessary variables/code (E)

File prepareClassDB.sql in branch pginstance-2ndrev-dev contains code to test existence of users, schemas, etc. These tests use variables such as valueExists which is then used in IF statements. Instead, the SELECT statements could directly be used in the IF statement in conjunction with the EXISTS operator. This change eliminates the need for variables and the use of EXISTS makes the code’s intent much clearer.

Maximum number of connections for student roles is not set (M)

It is a good idea to limit the number of connections that a student is allowed to make, in order to prevent idle or abandoned connections from causing too many issues. This can be set with the SET CONNECTION LIMIT setting when creating or altering a role. The default for that setting is -1 (no limit). According to @smurthys , 5 connections seems like a good limit, and I would agree.

Imprecise data types used occasionally for function parameters (W)

Just as example, functions in metaFunctions.sql use TEXT for parameters instead of using narrower types.
Not sure if it is ISO SQL, but consider using the table_name.column_name%TYPE to reference the type of a column. (Also not sure if this type can be used to refer to view columns.)

In general, review all functions for possible improvement.

Also see discussion in Issue #4.

prepareClassDB: Latest version in comments-consistency-fixes-M1 results in errors when run (W)

Attempting to run the prepareClassDB script results in the transaction being aborted due to a few syntax errors in the file, and an incompatibility with an older version. The specific problems are listed below:

  • CREATE OR REPLACE does not work if the functions have different parameter names. This means that in its current state the script cannot be run over an older version with different parameter names. This can be fixed by adding a DROP FUNCTION IF EXISTS prior to the affected functions.

  • Line 102 uses %1 to refer to a parameter in the createUser function. This resulted in an error when running the script. It can be fixed by using $1 instead. Here is a copy of the line:
    format('CREATE USER %I ENCRYPTED PASSWORD %L', %1, COALESCE($2, $1));

  • Lines 294, 347, and 380: It seems like there is a missing AND between the two boolean expressions. Here is an example:
    WHERE pg_catalog.pg_has_role($1, oid, 'member') rolname != $1

  • The ON CONFLICT clause resulted in an error when attempting to execute the createStudent and createInstructor functions. The error is copied below. It seems like there was an issue with username being both the name of a parameter and of a column in each table. A quick fix was to change the name of the parameter, but we may want to choose a different solution.

ERROR:  column reference "username" is ambiguous
LINE 2:           ON CONFLICT (username) DO UPDATE SET studentname =...
                              ^
DETAIL:  It could refer to either a PL/pgSQL variable or a table column.
QUERY:  INSERT INTO classdb.Student VALUES($1, $2, $3)
          ON CONFLICT (username) DO UPDATE SET studentname = $2
CONTEXT:  PL/pgSQL function createstudent(

Only a superuser has full access to pg_stat_activity (W)

Related to #27 and #23. classdb.listConnections(username) can't be owned by dbManager or similar, because only a superuser can read all rows and columns in pg_stat_activity. I will do some research to understand how the row/column permissions are set in it, and if there is any way to grant full read permissions to other roles.
Strangely, pid for ALL connections can be read by anyone, only details like the timestamps and ip address are protected by default.

prepareUserLogging.sql is not modular (M)

DB-level

  • Move logging-related code from prepareUserLogging.sql to a new addLogMgmt.sql
  • Move DDL-monitoring code from prepareUserLogging.sql to a new addDDLMonitors.sql
  • Use helper functions and assertions

Server-level

  • Move out the ALTER SYSTEM statements and the call to pg_reload_conf(); from prepareUserLogging.sql to enableServerLogging.sql with a clear note about running the script from psql or running each statement individually

prepareUserLogging.sql should be no more after this revision.

Unable to easily alter/configure certain DB defaults and settings (M)

Information such as connection limit are presently hard coded. Perhaps the (default) value for such settings could be maintained in a "Settings" table and those values could be used instead of hard coding.

For example, the following schema permits maintaining role-specific default settings related to connections:

Role Category Setting Value
Student Connection Timeout 1000
Student Connection Number 5

A thorough discussion is needed. No need to implement now.

Privileged users should have a function to kill user connections (M)

As we saw today, it will be helpful to provide a function to dbManagers, and possibly instructors, to kill the connections for a given user. We should discuss which role should be required for this functionality. Below is the query I used to do this today. It should be sufficient to wrap this in a function to start:

SELECT pg_terminate_backend(pid)
FROM pg_stat_activity 
WHERE usename = '<username>';

Reading classDB.Student returns timestamps in UTC (M)

Storing timestamps in UTC is the right thing to do, but it is typical for systems to return timestamps in "local timezone"

Perhaps a function such as getStudents(optional timezone): will perform SELECT * on the table but convert timestamps to the specified timezone or to "local" timezone if parameter is missing.

Developing a view is an alternative, but views cannot be parameterized. A view that reuses the aforementioned function to always apply the local timezone is an option. However, a view seems redundant if a function is available.

classdb.killUserConnections will not work unless prepareClassDB is run as a superuser (W)

Currently, classdb.killUserConnections requires that its definer have permission to kill any user's connections. If prepareClassDB.sql is not run as superuser, then classdb.killUserConnections will only be able to target connections belonging to its defining user.

Postgres provides a facility to fix this problem. There is a predefined role pg_signal_backend, which allows a granted user to kill connections from any non-superusers. I propose that we grant this to the ClassDB role.

One requirement for this fix is that the user running prepareClassDB.sql must have ADMIN rights on pg_signal_backend. It seems common for non-superuser 'admin' accounts, for example the ones for AWS and Bluemix, to have ADMIN rights on pg_signal_backend to other users. We will definitely want to document this requirement.

No way to know which version of ClassDB is presently deployed on a server (M)

We need a means to know which version of ClassDB scripts were used to create a DB. For example, we do not presently know which version was used to create the dassl DB on campus.

I propose creating and populating a settings table in the DB prepared by prepareClassDB.sql and other .sql files used to set up the DB. Specifically, the hash of the file used should be maintained any time that file is run.

Could we automate getting git's hash for files? I imagine yes.

addUserMgmt.sql (used to be prepareClassDB.sql) is too long (W)

prepareClassDB.sql is now 500+ lines long and I expect it to get close to 600 lines. At this point, it is rather hard to see where code for any aspect begins/ends. For now, I am placing a dashed line between code segments so I can see the boundaries, but that is just band-aid.

Perhaps it can help to split the file into multiple files and define a sequence of execution for the files.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.