
icssc / peterportal-public-api


An API that provides easy access to UC Irvine data such as courses, professors, grade distributions, the schedule of classes, and more

Home Page: https://api.peterportal.org/

License: MIT License

JavaScript 4.84% Shell 0.05% Python 4.04% TypeScript 91.06%
icssc uci

peterportal-public-api's People

Contributors

akins1, alam7989, alanchangxyz, brandonvu12, chasec99, coffee-snake, dependabot[bot], ecxyzzy, edwu29, ileenf, jakegerber, kirbster6, nathantoannguyen, pranavmreddy, ramanxg, tisuela, tjhu, uci-mars, y-dejong, ym-aung

peterportal-public-api's Issues

Improve new developer experience

Goals:

  • Allow new developers to easily set up the project by following instructions in the documentation.
  • Working on the project and running it should not require any credentials.

Todo:

  • Running the app locally should not require any credentials.
  • Update the documentation on how to set up and run the project.
  • Add more contributing guidelines to CONTRIBUTING.md

Implement grades distribution for GraphQL

Currently implemented in REST.

It should be helpful to check out these files for how it is implemented in REST.

/rest/v0/grades.helper.js
/rest/v0/grades.js

We would need it for both calculated and raw grades.

GraphQL endpoint that can filter for a certain class:

{
  # Is it possible to add multiple arguments here?
  grades(year: "2018-19", instructor: "PATTIS, R.") {
    year
    quarter
    department
    ...
  }
}
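
GraphQL fields accept several arguments at once, so filtering on both `year` and `instructor` in one call is valid syntax. As a minimal sketch, assuming a `grades` field that takes string arguments, a client could assemble the query like this. `build_grades_query` is a hypothetical helper, not part of the project, and the argument names that work depend on the schema that ends up being implemented.

```python
import json


def build_grades_query(filters: dict[str, str], fields: list[str]) -> str:
    """Build a GraphQL query string for a hypothetical `grades` field.

    Each filter becomes one argument. json.dumps quotes and escapes the
    values, which is compatible with GraphQL string literals.
    """
    args = ", ".join(f"{name}: {json.dumps(value)}" for name, value in filters.items())
    arg_part = f"({args})" if args else ""
    selection = "\n    ".join(fields)
    return f"{{\n  grades{arg_part} {{\n    {selection}\n  }}\n}}"


query = build_grades_query(
    {"year": "2018-19", "instructor": "PATTIS, R."},
    ["year", "quarter", "department"],
)
print(query)
```

For anything beyond quick experiments, GraphQL variables are usually a safer way to pass user input than string interpolation.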

Logging Requests and Tracking Usage

We want to be able to track the referers of requests to our API.
From the logs, we want to find out:

  • Unique users
  • Number of invocations/calls per user
  • Total Calls in General

We currently send logs to AWS CloudWatch. We should stick with CloudWatch going forward and create dashboards based on these logs.

Once we have detailed logs on requests, we can move to creating dashboards in #106

I believe this should be good enough to log everything: https://docs.aws.amazon.com/lambda/latest/dg/nodejs-logging.html
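
To make the three metrics concrete, here is a rough sketch of the aggregation, assuming each logged request is available as a dictionary with a `referer` key. The record shape, the `summarize_requests` name, and the sample referers are all assumptions for illustration; in practice the same numbers would likely come from queries over the CloudWatch logs.

```python
from collections import Counter


def summarize_requests(records: list[dict]) -> dict:
    """Aggregate per-request log records into the metrics listed above.

    Records without a referer are grouped under "unknown".
    """
    calls_per_user = Counter(r.get("referer") or "unknown" for r in records)
    return {
        "unique_users": len(calls_per_user),
        "calls_per_user": dict(calls_per_user),
        "total_calls": sum(calls_per_user.values()),
    }


# Placeholder log records; the referer values are made up.
sample = [
    {"referer": "https://app-one.example", "path": "/rest/v0/courses/COMPSCI161"},
    {"referer": "https://app-one.example", "path": "/rest/v0/courses/CHEM1A"},
    {"referer": "https://app-two.example", "path": "/rest/v0/courses/CHEM1A"},
    {"path": "/rest/v0/courses/COMPSCI161"},
]
summary = summarize_requests(sample)
print(summary)
```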

Improve Helper Methods to be more modular for REST and GraphQL

The helper methods, especially those for Grades, are designed for REST and take REST-specific parameters such as req and res.
Instead, we should move all the helper methods out of the REST directory and into their own helper directory.
Route handling should be done in the routes, outside of the helper methods.

|-- helpers
    |-- courses.helper.js
    |-- professors.helper.js
    |-- grades.helper.js
    |-- errors.helper.js
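
The separation described above can be sketched as follows. This is written in Python only for brevity and is not the project's code; the function names and the list of allowed filters are assumptions. The point is that the helper receives plain data, while the REST handler and the GraphQL resolver each do their own input handling.

```python
def filter_grades(records: list[dict], **filters: str) -> list[dict]:
    """Pure helper: no request or response objects, only data in and data out."""
    return [r for r in records if all(r.get(k) == v for k, v in filters.items())]


def rest_grades_handler(query_params: dict, records: list[dict]) -> dict:
    """Thin REST-style handler: parses input, delegates, shapes the response."""
    allowed = {"year", "quarter", "department", "instructor"}
    filters = {k: v for k, v in query_params.items() if k in allowed}
    return {"status": 200, "body": filter_grades(records, **filters)}


def graphql_grades_resolver(args: dict, records: list[dict]) -> list[dict]:
    """GraphQL-style resolver reusing the same helper with no REST-specific code."""
    return filter_grades(records, **args)


# Placeholder records for illustration.
records = [
    {"year": "2018-19", "quarter": "Fall", "instructor": "PATTIS, R."},
    {"year": "2019-20", "quarter": "Fall", "instructor": "PATTIS, R."},
]
rest_result = rest_grades_handler({"year": "2018-19", "unknown": "x"}, records)
graphql_result = graphql_grades_resolver({"year": "2018-19"}, records)
print(rest_result["body"] == graphql_result)
```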

Course Offering Schema

The following is a brainstorm and proposal for a course offering schema and beyond. Future discussion, ideas, and passionate debate can be put here.

What is a Course Offering?

It is an "instance" of a course, such as a lab, lecture, or discussion. From what I've seen, some colleges/APIs call them "classes" or "sections". We'll be using Course Offering as the name in this discussion, although this may differ from our future implementation.

For clarity, a college-agnostic definition of Course Offering fits the following criteria:

  • A consistent meeting time each week, lasting the duration of the term. Due to remote learning, this may show up as TBD.
  • A finite number of seats.
  • One or more instructors who are consistent for the course.
  • A unique identifier (course offering code) defined by the college for every registration season.
  • A type (usually seminar, discussion, lecture, or lab).
  • A status for registration (usually FULL, OPEN, CLOSED, WAITLISTED).

The Problem

For many course planning applications, it is important to have access to all course offerings of a course. More likely, an application would want to see all open lectures for a course.

Use cases:

  • Seeing which course offerings have the smallest waitlist
  • Comparing professor ratings across all course offerings of a course
  • Tracking "hot" course offerings with the highest increase in enrollment in a given time period
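
To ground the discussion, here is one possible shape for a course offering, together with two of the queries mentioned above. This is a sketch under assumptions: the field names, the `waitlist_count` field, and the sample codes are hypothetical and not a proposed final schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CourseOffering:
    code: str                   # unique identifier for the registration season
    course_id: str              # the course this offering is an instance of
    offering_type: str          # e.g. "lecture", "lab", "discussion", "seminar"
    status: str                 # e.g. "OPEN", "FULL", "CLOSED", "WAITLISTED"
    max_seats: int              # finite number of seats
    meeting_time: str = "TBD"   # may be TBD, e.g. due to remote learning
    instructors: list[str] = field(default_factory=list)
    waitlist_count: int = 0     # assumed field, needed for the waitlist use case


def open_lectures(offerings: list[CourseOffering], course_id: str) -> list[CourseOffering]:
    """All open lectures for one course."""
    return [
        o for o in offerings
        if o.course_id == course_id and o.offering_type == "lecture" and o.status == "OPEN"
    ]


def smallest_waitlist(
    offerings: list[CourseOffering], course_id: str, offering_type: str
) -> Optional[CourseOffering]:
    """The offering of the given type with the fewest people waitlisted, if any."""
    candidates = [
        o for o in offerings
        if o.course_id == course_id and o.offering_type == offering_type
    ]
    return min(candidates, key=lambda o: o.waitlist_count, default=None)


# Placeholder data; the codes and counts are made up.
offerings = [
    CourseOffering(code="10001", course_id="COMPSCI161", offering_type="lecture",
                   status="OPEN", max_seats=200),
    CourseOffering(code="10002", course_id="COMPSCI161", offering_type="lecture",
                   status="FULL", max_seats=150, waitlist_count=12),
    CourseOffering(code="10003", course_id="COMPSCI161", offering_type="discussion",
                   status="WAITLISTED", max_seats=40, waitlist_count=3),
]
print([o.code for o in open_lectures(offerings, "COMPSCI161")])
```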

Add parameter to select specific field to view

Rather than returning all of the data at once, add a parameter that selects which fields (instructor, quarter, year, etc.) to include in the response, so that the payload is much smaller.
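
A minimal sketch of such a parameter, assuming it arrives as a comma-separated string. The `select_fields` name and the parameter format are assumptions for illustration.

```python
from typing import Optional


def select_fields(records: list[dict], fields: Optional[str] = None) -> list[dict]:
    """Keep only the requested fields of each record.

    `fields` is a comma-separated string such as "instructor,year".
    When it is missing or empty, the records are returned unchanged.
    Requested fields that a record does not have are skipped.
    """
    if not fields:
        return records
    wanted = [name.strip() for name in fields.split(",") if name.strip()]
    return [{k: r[k] for k in wanted if k in r} for r in records]


# Placeholder record for illustration.
rows = [{"year": "2018-19", "quarter": "Fall", "instructor": "PATTIS, R.", "department": "I&C SCI"}]
print(select_fields(rows, "instructor, year"))
```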

Follow guidelines of UCI Directory Privacy Policy

Privacy Policy:

This directory has been compiled for the use and convenience of the faculty, staff, students, and affiliates of the University of California, Irvine and others dealing with UC Irvine. It is the property of the Regents of the University of California. In accordance with the California Information Practices Act, neither this directory nor the information contained herein may be used, rented, distributed, or sold for commercial purposes. Compilation or redistribution of information from this directory is strictly forbidden. Upon using this directory, you submit to these terms of use. For more details, please see the University guidelines for assuring the privacy of personal information in mailing lists and telephone directories.

We should follow the privacy policy so that we do not get in trouble and the API can continue to be used. I would consider at least removing the phone number, since that's the most sensitive information. Most other fields for instructors can be found on their own websites or other pages.

Things to do:

  • Update the Elasticsearch db
    • This can be a simple run of npm refresh
  • Update the GraphQL schema
    • We will need to remove the phone number field from Instructor types.
  • Change documentation that includes the phone number
    • Change any examples in the documentation that show the phone number. These are probably in the REST docs for instructors.
    • Remove any mention of compiling from the UCI Directory, like in the comment below.
    • Rebuild the docs

Reimplement API key authentication

Implementing API Key

Requirements

  • Create a pipeline to generate and store key in FaunaDB
  • UI for registering API key
  • Email template for confirming the API key
  • Add hashing for the API key
  • Documentation for using the API key
  • Decide on a rate limiting method and implement it.
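
For the hashing requirement, one common approach is to store only a digest of the key and compare digests at request time. The sketch below assumes keys are long random strings, for which a plain SHA-256 digest is generally considered adequate. The function names are hypothetical, and storing the digest in FaunaDB is left out.

```python
import hashlib
import hmac
import secrets


def hash_api_key(key: str) -> str:
    """Return the hex SHA-256 digest of an API key."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple[str, str]:
    """Create a new random key.

    Returns (plaintext, digest): the plaintext is shown to the user once,
    and only the digest is stored.
    """
    plaintext = secrets.token_urlsafe(32)
    return plaintext, hash_api_key(plaintext)


def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(hash_api_key(presented), stored_digest)


key, digest = generate_api_key()
print(verify_api_key(key, digest), verify_api_key("wrong-key", digest))
```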

Schema for API Keys

{
  "key": <string>,
  "first_name": <string>,
  "last_name": <string>,
  "email": <string>,
  "app": {
    "name": <string>,
    "description": <string>,
    "url": <string>
  },
  "status": <string>,
  "created_on": <Date>
}

Rate limiting can be done in different ways, depending on the method of deployment:

  • Serverless: throttling is handled for us through API Gateway, and we can adjust the settings.
  • Server (Heroku): packages such as express-rate-limit keep track of requests in memory. Because the counts are held in memory, this approach cannot be used with a serverless architecture.
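
To illustrate why the in-memory approach needs a long-lived server, here is a minimal fixed-window limiter. It is a sketch of the general technique, not of how express-rate-limit is implemented; the class name and parameters are assumptions. All state lives in one process, so it is lost whenever a serverless instance is recycled and is not shared between instances.

```python
import time
from typing import Callable


class FixedWindowRateLimiter:
    """Allow at most `max_requests` per key within each window of `window_seconds`."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows: dict[str, tuple[float, int]] = {}  # key -> (window start, count)

    def allow(self, key: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the previous window expired, start a new one
        if count >= self.max_requests:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True


# A fake clock keeps the demonstration deterministic.
fake_now = [0.0]
limiter = FixedWindowRateLimiter(max_requests=2, window_seconds=60, clock=lambda: fake_now[0])
first_three = [limiter.allow("some-api-key") for _ in range(3)]
fake_now[0] = 61.0
after_window = limiter.allow("some-api-key")
print(first_three, after_window)
```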

Set up AWS Cloudwatch monitoring

Create an easy-to-read dashboard in AWS to monitor API requests.

Some important data points to know:

  • Server initialization time
  • Server response time
  • Total Costs

Create CI/CD pipeline for deploying.

Set up a CI/CD pipeline, for example with GitHub Actions, to test every pull request and every merge to master.

  • Update the tests for REST
  • Research how to set it up, whether through GitHub Actions, Travis, or another service.
  • More...

Find a better method to check if a request comes from graphql-playground

Problem

Currently, to check whether a request comes from the GraphQL playground, the code checks whether the referer header contains "graphql-playground". However, this header can easily be forged, which allows any user to bypass our API key authentication.

Doing something like this works:

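import requests

# Any REST endpoint on a locally running server works for this demonstration.
url = "http://localhost:8080/rest/v0/courses/COMPSCI161"

# Forge the referer header that the server uses to detect the GraphQL playground.
headers = {
    "referer": "graphql-playground",
}

# No API key is sent with this request.
response = requests.get(url, headers=headers)

print(response.text)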
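
The weakness is easier to see from the server's side. The sketch below models the check as described above; it is not the project's actual middleware, and both function names are hypothetical. Because the decision depends only on a value the client controls, any client can opt out of the API key check.

```python
def is_playground_request(headers: dict[str, str]) -> bool:
    """Naive check modeled on the description above: trust the referer header."""
    return "graphql-playground" in headers.get("referer", "")


def requires_api_key(headers: dict[str, str]) -> bool:
    """Requests that look like they come from the playground skip the key check."""
    return not is_playground_request(headers)


forged = {"referer": "graphql-playground"}
honest = {"referer": "https://app-one.example"}  # placeholder referer
print(requires_api_key(forged), requires_api_key(honest))
```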

What To Do

Find a way to let requests from the GraphQL playground bypass the API key check that is secure and cannot be forged the way the current method can.

What has been tried

  • Adding a header to the endpoint for an API key
    • The API key becomes exposed in the GraphQL playground

Update Tests for REST

The current Jest tests are outdated and also need updating for the new features that have been added.

Main Tests that need to be updated:

  • Grades (Add unit tests for the grades helper methods)
  • API key authentication
    • Test Cases:
      • Valid API Key
      • Invalid API Key
      • Invalid header
      • No headers in request
  • Websoc api
    • Correct API calls
    • Invalid parameters
    • Correct data in Response

On branch unit-tests, there is an update to the tests that separates the different tests into separate files. Start working from this branch to update the REST tests.

Remove Moesif

Remove Moesif. It was added a while ago, but we don't need it because we're using AWS CloudWatch.

  • Identify where moesif is used (app.js)
  • Remove these lines of code, and fix it so we don't need them

Create Instructor Name Map

For some queries, like grades and schedule, the instructor name (e.g. "PATTIS, R.") is returned instead of the ucinetid. As a result, we are unable to connect that instructor field with the instructor type.

To fix this, we should create a dictionary that maps the instructor name to their ucinetid.
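
A minimal sketch of building such a map, assuming each instructor record carries a `shortened_name` in the same format as the grades data along with a `ucinetid`. Both field names and the sample values are assumptions. Since two instructors can share a shortened name, each name maps to a list of candidates rather than a single id.

```python
from collections import defaultdict


def build_name_map(instructors: list[dict]) -> dict[str, list[str]]:
    """Map each shortened instructor name to every ucinetid that uses it."""
    name_map: dict[str, list[str]] = defaultdict(list)
    for instructor in instructors:
        name_map[instructor["shortened_name"]].append(instructor["ucinetid"])
    return dict(name_map)


# Placeholder records; the ucinetid values are made up.
instructors = [
    {"shortened_name": "PATTIS, R.", "ucinetid": "netid1"},
    {"shortened_name": "DOE, J.", "ucinetid": "netid2"},
    {"shortened_name": "DOE, J.", "ucinetid": "netid3"},
]
name_map = build_name_map(instructors)
print(name_map["PATTIS, R."], name_map["DOE, J."])
```

Names that map to more than one ucinetid would need extra information, such as the department, to resolve.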

Fix average GPA calculations with P/NP

As mentioned by Mr. Zotistics, there are errors in the average GPA calculation, and we can fix them by excluding P/NP grades from those calculations. We will probably add a parameter to the request, and we should also investigate why the GPA calculation is different.
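
As a worked example of how P/NP counts could distort the average, the sketch below compares a calculation over letter grades only with one that also counts P/NP enrollments in the denominator. Whether this is what the current code does is only a guess; the field names and the simplified grade scale without plus and minus grades are assumptions.

```python
from typing import Optional

# Simplified scale; plus and minus grades are ignored in this sketch.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def total_points(counts: dict[str, int]) -> float:
    return sum(points * counts.get(grade, 0) for grade, points in GRADE_POINTS.items())


def average_gpa(counts: dict[str, int]) -> Optional[float]:
    """Average over letter grades only; P and NP are excluded entirely."""
    graded = sum(counts.get(grade, 0) for grade in GRADE_POINTS)
    return total_points(counts) / graded if graded else None


def average_gpa_with_pnp(counts: dict[str, int]) -> Optional[float]:
    """Suspected faulty variant: P and NP inflate the denominator."""
    everyone = sum(counts.get(grade, 0) for grade in (*GRADE_POINTS, "P", "NP"))
    return total_points(counts) / everyone if everyone else None


# Made-up distribution: 20 letter grades and 20 P grades.
counts = {"A": 10, "B": 5, "C": 5, "D": 0, "F": 0, "P": 20, "NP": 0}
print(average_gpa(counts), average_gpa_with_pnp(counts))
```

Here the letter grades contribute 10×4 + 5×3 + 5×2 = 65 points. Dividing by the 20 letter grades gives 3.25, while dividing by all 40 enrollments gives 1.625.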

Course endpoint returns misnamed field

The /courses/{courseID} endpoint returns a field called dependencies. This name implies that the listed courses are prerequisites of the class, when in fact the class is a prerequisite for these courses. I think the field should be renamed to prerequisite_for.

Fix GraphQL Error message for nonexistent courses.

For requests to the grades endpoint that also ask for instructors, some courses that do not exist in the current catalogue throw an error that is confusing to the user, like the one in the image below. Instead, let's check whether the course is null first and return {name}. That way the request doesn't throw an error that makes no sense to the user, who never asked for a department field.

image

Add GraphQL Tests

Create Tests for GraphQL. If possible, the feature should also be included as part of the tests for REST, so it's all together.

Dump old WebSOC data into sqlite db

In reference to this comment, we could put old WebSoc data into the SQLite database so that queries are faster. This would save us from making a new request to WebSoc for data that will not change.

We can start by doing this with the course offerings from 2019-20 and earlier. It might be helpful to also keep track of query times so we know how much our api is improving for speed. These requests might not occur very often, but in the case that they do, we won't have to make more requests to websoc.

Todo

  • Request and dump the old data into a csv file. We can do this with our own api.
  • Create a table in our sqlite with appropriate fields and put csv file data into it.
  • Restructure both rest endpoints and graphql so that for any requests for older years, the sqlite file is queried instead of the websoc-api.
  • Test and compare speeds
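
The steps above can be sketched as follows, using an in-memory SQLite database and an inline CSV string in place of real files. The table layout, column names, cutoff handling, and sample rows are all assumptions for illustration.

```python
import csv
import io
import sqlite3

COLUMNS = ("year", "quarter", "department", "number", "section_code", "instructor")

# Placeholder rows standing in for a CSV dump produced from the API.
CSV_DUMP = """year,quarter,department,number,section_code,instructor
2018-19,Fall,DEPT A,101,00001,"DOE, J."
2019-20,Winter,DEPT A,102,00002,"ROE, R."
"""


def load_offerings(conn: sqlite3.Connection, csv_text: str) -> int:
    """Create the table if needed and insert every CSV row; returns the row count."""
    column_defs = ", ".join(f"{column} TEXT" for column in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS offerings ({column_defs})")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [tuple(row[column] for column in COLUMNS) for row in reader]
    conn.executemany("INSERT INTO offerings VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def is_archived(year: str, cutoff: str = "2019-20") -> bool:
    """Years in "YYYY-YY" form sort lexicographically, so string comparison works."""
    return year <= cutoff


def find_offerings(conn: sqlite3.Connection, year: str, department: str) -> list[tuple]:
    """Query the local table; newer years would still go to WebSoc."""
    if not is_archived(year):
        raise ValueError(f"{year} is not archived; query WebSoc instead")
    cursor = conn.execute(
        "SELECT section_code, instructor FROM offerings WHERE year = ? AND department = ?",
        (year, department),
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_offerings(conn, CSV_DUMP)
print(loaded, find_offerings(conn, "2018-19", "DEPT A"))
```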

Documentation: Meta & Rest

Meta

  • Documentation on running the documentation site locally (via mkdocs)
  • Resources for styling via Material for MkDocs

Rest

  • Adding more to the REST/start-here page. (Resources on how to test API, like via postman)
  • Reordering navigation via mkdocs.yml

Reduce Server Initialization Time

Every time the server has a cold start, initialization takes around 1.5 seconds. Try to reduce this to less than 1 second. This will require identifying which part of the code takes the longest to initialize and how to speed it up.

Possible Ideas:

  • Loading the cached data into memory each time.
  • Middleware that takes time to load

GPA calculation is off on grades

cc: @Maybe14

i've been noticing a lot of data variability between mine and peterportal's api

i know for the instructors, you used the column that i scraped from websoc rather than the original, but the gpa is really different...

my guess is that you guys included the p/np in the calculation

SSL Error on REST API courses endpoint

Sending a GET request to an endpoint such as https://api.peterportal.org/rest/v0/courses/CHEM1A throws the following error, according to Postman:
Error: write EPROTO 3179117272:error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR:../../third_party/boringssl/src/ssl/tls_record.cc:592:SSL alert number 80
For reference, we are using the x-api-key header with an API key generated for AntAlmanac.

Edit: Sending the request over HTTP, as in http://api.peterportal.org/rest/v0/courses/CHEM1A, works fine.

Setup CI/CD pipeline for deploying to Serverless

Creating a new issue and closing issues #21 and #46, since they are similar and can be combined.

Research how to deploy to serverless, and use CircleCI to deploy. We need CircleCI because it creates a Docker environment that can build our dependencies, without us having to use Docker explicitly and complicate our development. We can also set this up so that a new deployment of our API is triggered automatically on master.

Heroku Plugins:
Some of our plugins are tied to Heroku. We can instead create our own project in these plugins and use a new API key.

What is needed in CI/CD Pipeline:

  • Build application
  • Run npm test
  • Deploy serverless application

To Do

  • Set up serverless in the project
  • Set up CircleCI in the project
  • Determine costs of Lambda and compare to Heroku
  • Deploy with domain name

Add Error Logging and API Analytics

Currently, if a user encounters an error, a developer has to go to Papertrail and look for the request made by that user in order to find it. Finding errors this way may be difficult, especially when traffic starts to increase.

Utilizing a third-party application specifically to monitor errors can help us identify errors and possibly trace them back.
https://sentry.io/welcome/

It would also be helpful to use an API Analytics tool that would allow us to keep track of traffic and which endpoints may be called on more than others. I think it would be cool to look at how our users use our API.
https://www.moesif.com/

Add Summer 2021-22 Grades

We need to add these grades, so that when the react version of Zotistics comes out, it will display the grades.

This entails:

Implement endpoint to fetch data from WebSoC using the websoc-api module

Ref: https://github.com/icssc-projects/websoc-api

Endpoint: /schedule/soc?...

Params:

  • term
    • Formatting: [Year] ['Fall'|'Winter'|'Spring'|'Summer1'|'Summer2'|'Summer10wk']
    • Example: '2017 Fall'
    • Default: ' '
    • Notes: Required. Schedule for your selected term must be available on WebSoc.
  • ge
    • Formatting: ['ANY'|'GE-1A'|'GE-1B'|'GE-2'|'GE-3'|'GE-4'|'GE-5A'|'GE-5B'|'GE-6'|'GE-7'|'GE-8']
    • Example: 'GE-1B'
    • Default: ' '
    • Notes: Must specify at least one of department, GE, courseCodes, or instructorName
  • department
    • Formatting: List of available departments to search available in file depts.txt
    • Example: 'I&C SCI'
    • Default: ' '
    • Notes: Must specify at least one of department, GE, courseCodes, or instructorName
  • courseNumber
    • Formatting: Any valid course number or range
    • Example: '32A' OR '31-33'
    • Default: ' '
  • division
    • Formatting: ['ALL'|'LowerDiv'|'UpperDiv'|'Graduate']
    • Example: 'LowerDiv'
    • Default: 'ALL'
  • sectionCodes
    • Formatting: Any valid 5-digit course code or range
    • Example: "36531" OR "36520-36536"
    • Default: ' '
    • Notes: Must specify at least one of department, GE, courseCodes, or instructorName
  • instructorName
    • Formatting: Any valid instructor last name or part of last name
    • Example: 'Thornton'
    • Default: ' '
    • Notes: Enter last name only
  • courseTitle
    • Formatting: Any text
    • Example: 'Intro'
    • Default: ' '
  • sectionType
    • Formatting: ['ALL'|'ACT'|'COL'|'DIS'|'FLD'|'LAB'|'LEC'|'QIZ'|'RES'|'SEM'|'STU'|'TAP'|'TUT']
    • Example: 'LAB'
    • Default: 'ALL'
  • units
    • Formatting: Any integer or decimal with only tenths place precision, or 'VAR' to look for variable unit classes only.
    • Example: '5' OR '1.3'
    • Default: ' '
  • days
    • Formatting: ['M'|'T'|'W'|'Th'|'F'] or a combination of these days
    • Example: 'T' OR 'MWF'
    • Default: ' '
  • startTime
    • Formatting: Any time in 12 hour format
    • Example: '10:00AM' OR '5:00PM'
    • Default: ' '
    • Notes: Only enter sharp hours
  • endTime
    • Formatting: Any time in 12 hour format
    • Example: '12:00AM' OR '6:00PM'
    • Default: ' '
    • Notes: Only enter sharp hours
  • maxCapacity
    • Formatting: Exact number like '300' or modified with '<' or '>' to indicate less than specified or greater than specified.
    • Example: '>256' OR '19' OR '<19'
    • Default: ' '
  • fullCourses
    • Formatting: ['ANY'|'SkipFullWaitlist'|'FullOnly'|'OverEnrolled']
      • 'SkipFullWaitlist' means that full courses will be included if there's space on the wait-list
      • 'FullOnly' means only full courses will be retrieved
      • 'OverEnrolled' means only over-enrolled courses will be retrieved
    • Example: 'SkipFullWaitlist'
    • Default: 'ANY'
  • cancelledCourses
    • Formatting: ['Exclude'|'Include'|'Only']
    • Example: 'Include'
    • Default: 'EXCLUDE'
  • building
    • Formatting: Any valid building code
    • Example: 'DBH'
    • Default: ' '
    • Notes: The value is a building code. Building codes found here: https://www.reg.uci.edu/addl/campus/
  • room
    • Formatting: Any valid room number
    • Example: '223'
    • Default: ' '
    • Notes: You must specify a building code if you specify a room number

Response:
No need to modify the output of the websoc-api query.
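
The notes in the parameter list imply a few checks the endpoint could run before calling the websoc-api module. The sketch below is one way to express them; `validate_soc_params` is a hypothetical helper, and treating the "courseCodes" mentioned in the notes as the `sectionCodes` parameter is an assumption.

```python
def validate_soc_params(params: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the query is valid."""

    def has_value(name: str) -> bool:
        return bool(params.get(name, "").strip())

    errors = []
    if not has_value("term"):
        errors.append("term is required")
    if not any(has_value(name) for name in ("ge", "department", "sectionCodes", "instructorName")):
        errors.append("specify at least one of ge, department, sectionCodes, or instructorName")
    if has_value("room") and not has_value("building"):
        errors.append("building is required when room is specified")
    return errors


ok = validate_soc_params({"term": "2017 Fall", "department": "I&C SCI"})
bad = validate_soc_params({"room": "223"})
print(ok, bad)
```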

Documentation Update

List of things that need to be updated in the documentation

  • Sample cURL requests should no longer mention the API key.
  • Add code samples with responses to the GraphQL documentation. This section is currently sparse, and it should contain enough that users who aren't familiar with GraphQL can understand how to use it from our documentation.
  • Update our Contributing page. We are progressing towards being an open source project, so it's important that this page reflects our recent codebase. Here are some of the things it can include:
    • New repo overview
    • Initial setup
    • How to contribute to open source.
  • New details/information about compression options. In particular, that you can add the x-no-compression header so that the response is not compressed.
  • Remove the /graphql-docs/ endpoint and files, in reference to issue #85.

There are a lot of changes and these can be done in multiple PRs.
