tier2ui's Introduction

Please join the UKTiersponsors Slack.

This project is the front-end for https://uktiersponsors.co.uk

Sorry, I haven't open-sourced the entire project yet.

Nitty Gritty

1 A service checks for the PDF every day and determines whether it is new. If the PDF is new, it is downloaded.

2 Each page of the PDF is extracted, and the information for each company is put into the database. The database is SQL 2016.

3 This scan also deletes companies from the database if they no longer sponsor. The deleted data is moved to a separate table so that it can still be displayed to users.
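Step 3 can be sketched roughly as follows. This is only an illustration, not the project's actual code: it uses Python with an in-memory SQLite database as a stand-in for SQL 2016, and the table names (`companies`, `deleted_companies`), the columns, and the `apply_scan` function are all hypothetical.

```python
import sqlite3


def apply_scan(conn, latest_names):
    """Move companies that are missing from the latest scan into deleted_companies.

    Table and column names are assumptions made for this sketch.
    """
    current = {row[0] for row in conn.execute("SELECT name FROM companies")}
    gone = sorted(current - set(latest_names))
    with conn:  # copy and delete inside one transaction
        for name in gone:
            conn.execute(
                "INSERT INTO deleted_companies (name, town) "
                "SELECT name, town FROM companies WHERE name = ?",
                (name,),
            )
            conn.execute("DELETE FROM companies WHERE name = ?", (name,))
    return gone


def make_demo_db():
    """Create a tiny in-memory database with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE companies (name TEXT PRIMARY KEY, town TEXT)")
    conn.execute("CREATE TABLE deleted_companies (name TEXT, town TEXT)")
    conn.executemany(
        "INSERT INTO companies VALUES (?, ?)",
        [("Acme Ltd", "Leeds"), ("Beta Ltd", "London")],
    )
    return conn


if __name__ == "__main__":
    db = make_demo_db()
    # "Beta Ltd" is absent from the latest scan, so it is moved.
    print(apply_scan(db, ["Acme Ltd"]))
```

Keeping the removed rows in their own table means the main table only ever holds current sponsors, while the website can still list companies that dropped off.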

4 Another service fetches the company address and SIC codes from the Companies House API: https://developer.companieshouse.gov.uk/api/docs/

This is done to augment the existing data with the industry in which the company operates.

This also helps in finding the exact company address and other financial details, such as whether the company is bankrupt or in arrears. These are currently stored in the database but are not shown on the website yet.
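As a rough illustration of the augmentation in step 4, the sketch below pulls the industry codes, address, and status out of a company profile that has already been fetched and decoded from JSON. The field names (`sic_codes`, `registered_office_address`, `company_status`) are my understanding of the Companies House company profile resource and should be checked against the API documentation; the `summarise_profile` function and the sample values are made up.

```python
def summarise_profile(profile):
    """Extract the fields this project cares about from a company profile dict.

    Field names are assumed from the Companies House company profile
    resource; missing fields are tolerated.
    """
    address = profile.get("registered_office_address", {})
    parts = [
        address.get(key)
        for key in ("address_line_1", "address_line_2", "locality", "postal_code")
    ]
    return {
        "sic_codes": list(profile.get("sic_codes", [])),
        "address": ", ".join(part for part in parts if part),
        "status": profile.get("company_status"),
    }


if __name__ == "__main__":
    # Made-up sample data, shaped like a decoded API response.
    sample = {
        "company_status": "active",
        "sic_codes": ["62020"],
        "registered_office_address": {
            "address_line_1": "1 Example Street",
            "locality": "Leeds",
            "postal_code": "LS1 1AA",
        },
    }
    print(summarise_profile(sample))
```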

5 Another service crawls search engines such as Bing, Yahoo and DuckDuckGo to extract the company website. It searches on the company name and town combination and takes the union of the page 1 results from the search engines. Results that appear in all three are counted and ranked after applying exclusions (e.g. listings from common company-name aggregators such as companieshouse and bizstats, to name a few, are avoided).

This data is put into a MongoDB database, which contains a list of common URLs (website and social websites).
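One way to picture the ranking in step 5 is the sketch below. It is not the project's algorithm, only a minimal version under stated assumptions: results are already collected as lists of URLs per search engine, candidates are compared by host name, and the `EXCLUDED` set holds placeholder aggregator domains. All names here are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Placeholder aggregator domains to skip; the real exclusion list is not published.
EXCLUDED = {"companieshouse.gov.uk", "bizstats.co.uk"}


def domain(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def is_excluded(host):
    """True if the host is an excluded domain or one of its subdomains."""
    return any(host == d or host.endswith("." + d) for d in EXCLUDED)


def rank_candidates(results_by_engine):
    """Count how many engines returned each host and rank by that count."""
    counts = Counter()
    for urls in results_by_engine.values():
        # A host counts once per engine, however many of its pages appear.
        for host in {domain(u) for u in urls}:
            if host and not is_excluded(host):
                counts[host] += 1
    # Most widely returned first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    page_one = {
        "bing": ["https://www.acme.co.uk/", "https://beta.companieshouse.gov.uk/company/123"],
        "yahoo": ["http://acme.co.uk/about", "https://www.linkedin.com/company/acme"],
        "duckduckgo": ["https://acme.co.uk", "https://www.linkedin.com/company/acme"],
    }
    print(rank_candidates(page_one))
```

A host returned by all three engines gets a count of 3, so the top entry is the most likely company website, and social links can be picked out of the remaining entries.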

6 Another service fetches the first common URL and social website link from MongoDB and puts them back into the SQL database that contains the Tier 2 companies, as described in step 2. This consolidates the data and avoids multiple database calls.

This has the added advantage that if I want to change the website-finding algorithm after analysing the existing MongoDB search set, I can easily reinsert the results into the main database.

7 A service generates CSV files for all companies, recent companies and deleted companies, and stores them in Azure Blob Storage, from where they can be downloaded via the website.
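The CSV generation in step 7 could look something like the sketch below, written in Python for illustration. The column names and the `companies_to_csv` function are assumptions, and the upload to Azure Blob Storage is left out; the function only builds the CSV text that such a service would then store.

```python
import csv
import io


def companies_to_csv(rows, fieldnames=("name", "town", "website")):
    """Render a list of company dicts as CSV text with a header row.

    Keys not listed in fieldnames are ignored; missing keys become empty cells.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up rows standing in for a query against the main database.
    companies = [
        {"name": "Acme Ltd", "town": "Leeds", "website": "https://acme.co.uk"},
        {"name": "Beta Ltd", "town": "London"},
    ]
    print(companies_to_csv(companies))
```

The same function can be called three times with different row sets to produce the all, recent, and deleted files.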

Technical details

  • Website front end - Ported from ASP.NET MVC 4 to ReactJS, with Web API 2.0 and ASP.NET Core hosting to serve data from SQL 2016.
  • PDF extractor service - .NET 4.5, SQL 2016.
  • Industry finder service - .NET 4.5, SQL 2016.
  • Website extractor service - .NET Core, MongoDB.
  • Website inserter service - .NET Core, MongoDB, SQL 2016.
  • CSV generator service - .NET 4.5, SQL 2016.

The entire website is hosted on an aspnethosting VPS.

tier2ui's People

Contributors

rohithnair
