dc2015

Clojure scripts that digest the results of the 2015 Hong Kong District Council Election.

Overview

Elected Ratio Analysis

[Figures: Elected Ratio Pan Democracy; Elected Ratio Pan Establishment; Elected Ratio Others]

Processing

Step 1

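Create some utility functions to simplify parsing HTML tables: read-hdrs reads the header cells of a table, and read-data returns each data row as a map keyed by those headers.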

(def parse-slurp (comp parse slurp))

(defn- read-hdrs [tbl data]
  (->> (extract-from data tbl [:x] "tr th" text)
       first vals first))

(defn- read-data [url]
  (let [d    (parse-slurp url)
        hdrs (read-hdrs "table" d)]
    (->> (extract-from d "table tr" [:x] "td" text)
         (map :x)
         (remove nil?)
         (map #(zipmap hdrs %)))))
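The last two steps of read-data can be tried in isolation. The sketch below uses only clojure.core; rows->maps is a hypothetical helper name, the headers and rows are made-up sample data, and it assumes that a row without td cells (such as the header row) is extracted as nil.

```clojure
;; Hypothetical helper (not part of the original scripts):
;; turn extracted cell rows into header-keyed maps.
(defn rows->maps
  [hdrs rows]
  (->> rows
       (remove nil?)              ; assumption: rows without td cells arrive as nil
       (map #(zipmap hdrs %))))   ; pair each cell with its column header

;; Made-up sample data, for illustration only.
(def sample-hdrs ["Code" "Name"])
(def sample-rows [nil ["A01" "Candidate A"] ["A02" "Candidate B"]])

(rows->maps sample-hdrs sample-rows)
;; => ({"Code" "A01", "Name" "Candidate A"} {"Code" "A02", "Name" "Candidate B"})
```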

Step 2 - results

Ungroup the results table, i.e. split the merged (rowspan) cells so that every row carries the values of the group it belongs to.

(->> (extract-from data "table.contents2 tr" [:x :y] "td[rowspan]" text "td" text)
     (remove #(every? nil? (vals %)))
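     ;; Carry the most recent merged (rowspan) cells forward: a row without its
     ;; own rowspan cells gets the previous group's cells prepended.
     (reduce (fn [[prev res] {:keys [x y]}]
               (if (nil? x)
                 [prev (conj res (concat prev y))]
                 [x    (conj res y)]))
             [nil []])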
     last
     (map #(zipmap hdrs %))
     to-dataset)
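A small worked example may make the ungrouping step easier to follow. This is a sketch using only clojure.core: ungroup is a hypothetical name for the remove/reduce part of the pipeline, and the rows are made-up sample data shaped the way the extraction is assumed to return them (:x holds the merged cells of a row or nil, :y holds all td cells of that row).

```clojure
;; Hypothetical wrapper around the remove/reduce steps of Step 2.
(defn ungroup
  [rows]
  (->> rows
       (remove #(every? nil? (vals %)))          ; drop rows with no td cells at all
       (reduce (fn [[prev res] {:keys [x y]}]
                 (if (nil? x)
                   [prev (conj res (concat prev y))] ; continuation row: reuse merged cells
                   [x    (conj res y)]))             ; new group: remember its merged cells
               [nil []])
       last))                                    ; keep the accumulated rows only

;; Made-up sample: "A01" spans two rows, "A02" spans one.
(def sample-rows
  [{:x nil     :y nil}
   {:x ["A01"] :y ["A01" "1" "Candidate A"]}
   {:x nil     :y ["2" "Candidate B"]}
   {:x ["A02"] :y ["A02" "1" "Candidate C"]}])

(mapv vec (ungroup sample-rows))
;; => [["A01" "1" "Candidate A"] ["A01" "2" "Candidate B"] ["A02" "1" "Candidate C"]]
```

After this step every row has the same number of cells, so it can be zipped with the headers as in Step 1.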

Step 3 - noms

Use the utility functions created in Step 1 to parse the master nomination table.

(->> "http://www.elections.gov.hk/dc2015/pdf/2015_DCE_Valid_Nominations_C.html"
     read-data
     to-dataset)

Step 4 - noms2

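Collect all other nomination data: find the links to the per-district nomination pages, turn them into absolute URLs, and parse each page with read-data.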

(->> (extract-from (parse-slurp "http://www.elections.gov.hk/dc2015/chi/nominat2.html")
                   "table tr td"
                   [:x] "a" (attr :href))
     (map :x)
     (remove nil?)
     (apply concat)
     (filter #(and (string? %) (re-matches #"\.\./pdf/nomination.*html" %)))
     (map #(->> % (drop 2) (apply str) (str "http://www.elections.gov.hk/dc2015")))
     (mapcat read-data)
     to-dataset)
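The link filtering and URL rewriting can be checked without fetching anything. The sketch below uses only clojure.core; nomination-link? and absolute-url are hypothetical helper names, the file name nomination_X.html is a made-up example, and subs is used in place of the (drop 2) / (apply str) combination, which should give the same string.

```clojure
(def base-url "http://www.elections.gov.hk/dc2015")

;; Hypothetical helper: true for relative links to nomination pages.
(defn nomination-link?
  [href]
  (boolean (and (string? href)
                (re-matches #"\.\./pdf/nomination.*html" href))))

;; Hypothetical helper: strip the leading ".." and prepend the site root.
(defn absolute-url
  [href]
  (str base-url (subs href 2)))

;; Made-up hrefs, for illustration only.
(->> ["../pdf/nomination_X.html" "index.html" nil]
     (filter nomination-link?)
     (map absolute-url))
;; => ("http://www.elections.gov.hk/dc2015/pdf/nomination_X.html")
```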

Step 5 - mm

Join the tables created in Step 3 (noms) and Step 4 (noms2) on constituency and candidate name.

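;; Column names: 選區代號 = constituency code, 獲提名人士姓名 (姓氏先行) = name of nominee (surname first),
;; 選區號碼 = constituency number, 姓名 = name.
($join [["選區代號" "獲提名人士姓名 (姓氏先行)"] ["選區號碼" "姓名"]] noms2 noms)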
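To see what a join on two differently named key pairs does, the sketch below uses clojure.set/join from the standard library as a stand-in for $join. It is an inner join, so its handling of unmatched rows may differ from $join. It also assumes that the first key vector names columns of noms2 and the second names columns of noms. All rows and the "noms2-extra" / "noms-extra" columns are made up.

```clojure
(require '[clojure.set :as set])

;; Made-up rows; only the key column names come from the join above.
(def noms2-sample
  #{{"選區代號" "A01" "獲提名人士姓名 (姓氏先行)" "Candidate A" "noms2-extra" "p"}
    {"選區代號" "A02" "獲提名人士姓名 (姓氏先行)" "Candidate B" "noms2-extra" "q"}})

(def noms-sample
  #{{"選區號碼" "A01" "姓名" "Candidate A" "noms-extra" "r"}})

;; Map each noms2 key column to the matching noms key column.
(def joined
  (set/join noms2-sample noms-sample
            {"選區代號" "選區號碼"
             "獲提名人士姓名 (姓氏先行)" "姓名"}))

(count joined)
;; => 1   (only the A01 / Candidate A rows match)
(select-keys (first joined) ["noms2-extra" "noms-extra"])
;; => {"noms2-extra" "p", "noms-extra" "r"}
```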

Step 6

Final step: join the nomination data (mm) with the election results (results) and save the output as an Excel file.

;; Column names: 選區號碼 = constituency number, 候選人編號 = candidate number.
(-> ($join [["選區號碼" "候選人編號"] ["Constituency Code" "Candidate Number"]]
           mm
           results)
    (save-xls "output.xls"))

License

Copyright © 2015 RMCV

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.
