chrisvilches / kattis-scraper

Kattis Scraper

Scrapes the entire Kattis website, downloads all problems and helps you perform complex queries to find interesting problems.

Install

Install using:

npm install

Make sure you've installed the appropriate Node version:

nvm use

# or

fnm use

Note: The required Node version is specified in .nvmrc.

How to Run

Scraped URLs are cached and downloaded only once, so you can run the script multiple times without using additional network resources.

Remove the .cache folder if you wish to clear the cached data.
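The download-once behaviour described above can be sketched as a small read-through disk cache. This is only an illustration: the function names, the cacheDir parameter, and the one-file-per-URL layout are assumptions, not the project's actual implementation or the real structure of the .cache folder.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical helpers; names and file layout are assumptions,
// not taken from the project's source.

// Map a URL to a file inside the cache directory.
function cachePathFor(cacheDir: string, url: string): string {
  return path.join(cacheDir, encodeURIComponent(url));
}

// Return the cached body for a URL, or undefined on a cache miss.
function readCached(cacheDir: string, url: string): string | undefined {
  const file = cachePathFor(cacheDir, url);
  return fs.existsSync(file) ? fs.readFileSync(file, "utf8") : undefined;
}

// Store a downloaded body so that later runs can skip the network.
function writeCached(cacheDir: string, url: string, body: string): void {
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(cachePathFor(cacheDir, url), body, "utf8");
}

// Download a URL only if it has not been cached yet.
// `download` is supplied by the caller (for example, a wrapper around fetch).
async function fetchCached(
  cacheDir: string,
  url: string,
  download: (url: string) => Promise<string>,
): Promise<string> {
  const hit = readCached(cacheDir, url);
  if (hit !== undefined) return hit;
  const body = await download(url);
  writeCached(cacheDir, url, body);
  return body;
}
```

With a layout like this, deleting the cache directory forces every URL to be downloaded again on the next run, which matches the effect of removing the .cache folder.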

Export as CSV

Run the scraper and export the data as a CSV file:

npm run csv

Populate a SurrealDB Database

Make sure you have installed SurrealDB before starting.

Start the SurrealDB server (this command also starts the built-in API):

surreal start --log debug --user root --pass root memory

Scrape the Kattis website and populate the database:

SURREALDB_USER=root SURREALDB_PASS=root npm run surrealdb

You can then query the SurrealDB API. See the SurrealDB documentation to learn how to query it with cURL or Postman. Note that the NS and DB headers must both be set to kattis. For example:

DATA="INFO FOR DB;"
curl --request POST \
	--header "Accept: application/json" \
	--header "NS: kattis" \
	--header "DB: kattis" \
	--user "root:root" \
	--data "${DATA}" \
	http://localhost:8000/sql
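The same request can also be sent from TypeScript. The sketch below only builds the request, copying the URL, header names, and credentials from the cURL command above; buildSqlRequest is a hypothetical helper and not part of the project. One difference to be aware of: cURL's --data sends a form-encoded Content-Type by default, whereas this sketch leaves the content type unset, on the assumption that the /sql endpoint reads the raw request body.

```typescript
// Hypothetical helper (not part of the project): builds the request
// that the cURL command above sends to the SurrealDB HTTP API.
function buildSqlRequest(query: string, user = "root", pass = "root") {
  // HTTP Basic auth: base64("user:pass"), as produced by curl --user.
  const credentials = btoa(`${user}:${pass}`);
  return {
    url: "http://localhost:8000/sql",
    init: {
      method: "POST",
      headers: {
        Accept: "application/json",
        NS: "kattis", // namespace header, as in the cURL example
        DB: "kattis", // database header, as in the cURL example
        Authorization: `Basic ${credentials}`,
      },
      body: query, // the raw query text, e.g. "INFO FOR DB;"
    },
  };
}

const request = buildSqlRequest("INFO FOR DB;");
console.log(request.url, request.init.headers);
```

On Node 18 or later, the result can be passed to the global fetch, for example fetch(request.url, request.init), and the JSON response read with response.json().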

To learn how to perform advanced queries in SurrealDB, you should refer to the official documentation.

Example #1: Filter by Difficulty

SELECT slug, minDifficulty FROM problem WHERE minDifficulty > 9.3 LIMIT 5;

"result": [
  {
    "minDifficulty": "9.6",
    "slug": "connectdots"
  },
  {
    "minDifficulty": "9.6",
    "slug": "magicalmysteryknight"
  },
  {
    "minDifficulty": "9.5",
    "slug": "cameramakers"
  },
  {
    "minDifficulty": "9.4",
    "slug": "textprocessor"
  },
  {
    "minDifficulty": "9.4",
    "slug": "callacab"
  }
]
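In the output above, minDifficulty comes back as a string. If you post-process results on the client, a conversion step along these lines may be useful. This is a sketch only: ProblemRow and hardestFirst are hypothetical names, and the row shape is inferred from the sample output rather than from the project's schema.

```typescript
// Row shape inferred from the sample output above (an assumption).
interface ProblemRow {
  slug: string;
  minDifficulty: string;
}

// Convert difficulties to numbers and sort the hardest problems first.
// Rows whose difficulty cannot be parsed are dropped.
function hardestFirst(rows: ProblemRow[]): { slug: string; minDifficulty: number }[] {
  return rows
    .map((row) => ({
      slug: row.slug,
      minDifficulty: Number.parseFloat(row.minDifficulty),
    }))
    .filter((row) => !Number.isNaN(row.minDifficulty))
    .sort((a, b) => b.minDifficulty - a.minDifficulty);
}

const sample: ProblemRow[] = [
  { slug: "textprocessor", minDifficulty: "9.4" },
  { slug: "connectdots", minDifficulty: "9.6" },
  { slug: "cameramakers", minDifficulty: "9.5" },
];
console.log(hardestFirst(sample).map((row) => row.slug));
```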

Example #2: Find Geometry Problems

SELECT subdomain, slug FROM problem
WHERE statement CONTAINS "coordinate"
AND statement CONTAINS "distance"
AND statement CONTAINS "polygon"
LIMIT 4;

"result": [
  {
    "slug": "randommanhattan",
    "subdomain": "open"
  },
  {
    "slug": "marshlandrescues",
    "subdomain": "open"
  },
  {
    "slug": "puzzle2",
    "subdomain": "open"
  },
  {
    "slug": "tracksmoothing",
    "subdomain": "open"
  }
]

Example #3: Full Text Search

Under construction. At the time of writing, full-text search is not supported by SurrealDB.

Example #4: Count the Number of Scraped Problems

SELECT subdomain, count(subdomain) AS total
FROM problem
GROUP BY subdomain;

"result": [
  {
    "subdomain": "icpc",
    "total": 112
  },
  {
    "subdomain": "open",
    "total": 3551
  }
]

Subdomains

Currently, problems are downloaded from the following URL patterns:

https://icpc.kattis.com/problems/*
https://open.kattis.com/problems/*

More subdomains can easily be added by modifying the source code.
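To illustrate what such a scope amounts to, the sketch below checks whether a URL falls under one of the patterns listed above. It is not the project's code: SUBDOMAINS and isInScope are hypothetical names, and the real source may organise its list of subdomains differently.

```typescript
// Hypothetical configuration mirroring the two scopes listed above.
const SUBDOMAINS: readonly string[] = ["icpc", "open"];

// Return true if rawUrl matches https://<subdomain>.kattis.com/problems/*
// for one of the given subdomains.
function isInScope(rawUrl: string, subdomains: readonly string[] = SUBDOMAINS): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  const host = url.hostname;
  return (
    url.protocol === "https:" &&
    subdomains.some((subdomain) => host === `${subdomain}.kattis.com`) &&
    url.pathname.startsWith("/problems/")
  );
}

console.log(isInScope("https://open.kattis.com/problems/hello"));
console.log(isInScope("https://open.kattis.com/contests/example"));
```

In a design like this, supporting another subdomain would mean appending one entry to the list.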

Tools Used

  • Node
  • TypeScript
  • SurrealDB
