dino's People

Contributors

bookcliff, russeii

dino's Issues

Improve input data

Right now we have 2500 URLs, obtained by querying an endpoint from https://defillama.com/docs/api and collecting each protocol's URL.

However, some protocols provided their landing page (root domain or www.domain, for example ribbon.finance), while others provided their app domain (app.domain.com). This makes the data slightly misleading and less informative, as companies are usually set up in one of two ways:

  1. A shared landing page and app domain. They use paths for routing and usually the same codebase. https://llamapay.io/ is an example of this.
  2. A separate domain for the landing page and the app. I think this is the most common setup. https://ribbon.finance and https://app.ribbon.finance are an example of this.

We should somehow segment the data between 'landing pages' and 'apps'.

If the site uses method 1, we can count it as an app.
If it uses method 2, we should count its app domain as an app and its landing page as a landing page.

As the output of this issue, I'm imagining we have all the same graphs as step 1, but as two sets: one for all 'landing pages' and another for all 'apps'.

  • How do we know if a site uses method 1 or method 2?
  • How to get the app domains for sites using method 2? (maybe we just use app.domain since that seems to be most common)

Maybe we can write a quick script using ChatGPT where we feed in a bunch of root websites, and it pings the app domain and a few other common subdomains to see if each site has a separate app?
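
A minimal sketch of what such a probing script might look like, written in JavaScript since that is the language suggested elsewhere in these issues. Everything named here is an assumption for illustration: the subdomain list (only `app` comes from the issue), the function names, and the rule that any HTTP response at all means the host is "reachable". `probeHost` relies on the global `fetch` and `AbortSignal.timeout` available in Node 18+.

```javascript
// Hypothetical list of subdomains to try; only "app" is mentioned in the issue.
const COMMON_APP_SUBDOMAINS = ["app", "dapp", "trade"];

// Reduce a URL to its root host: drop "www." and a leading app-style subdomain,
// so ribbon.finance, www.ribbon.finance and app.ribbon.finance all map to ribbon.finance.
function rootHost(url) {
  const host = new URL(url).hostname.replace(/^www\./, "");
  const [first, ...rest] = host.split(".");
  return COMMON_APP_SUBDOMAINS.includes(first) && rest.length >= 2 ? rest.join(".") : host;
}

// Build the candidate app hostnames for a site.
function candidateAppHosts(url) {
  const root = rootHost(url);
  return COMMON_APP_SUBDOMAINS.map((sub) => `${sub}.${root}`);
}

// Decide which method a site uses, given the set of hosts that responded.
// Method 2: a separate app host exists. Method 1: landing page and app share a domain,
// so the site is counted as an app only.
function classifySite(url, reachableHosts) {
  const appHost = candidateAppHosts(url).find((h) => reachableHosts.has(h));
  return appHost
    ? { method: 2, landing: rootHost(url), app: appHost }
    : { method: 1, landing: null, app: rootHost(url) };
}

// Network probe: true if the host answers with any HTTP response before the timeout.
async function probeHost(host, timeoutMs = 5000) {
  try {
    await fetch(`https://${host}/`, { method: "HEAD", signal: AbortSignal.timeout(timeoutMs) });
    return true;
  } catch {
    return false;
  }
}

// Probe every candidate for one site, then classify it.
async function detectMethod(url) {
  const reachable = new Set();
  for (const host of candidateAppHosts(url)) {
    if (await probeHost(host)) reachable.add(host);
  }
  return classifySite(url, reachable);
}
```

Keeping `classifySite` separate from the network code means the decision rule can be checked without making requests. One caveat: a domain with wildcard DNS would answer on every subdomain, so results from a probe like this probably need a manual spot check.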

Create data graphs

We have a .csv and a .json file in the data folder in the project. The .json is probably easier to work with.

  • Clean the data so we can use the better-formatted data for analytics. For example, we don't currently care about the Next.js version numbers being used, so that data can and should be removed. The data cleaning we need is simple, so I'd probably just use JavaScript for this.
  • Create pie graphs for each of the columns / fields that we care about. Seems like https://charts.ant.design/en/examples/pie/basic/#basic would be good for visualizing the data.
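
A rough sketch of the cleaning step, assuming the .json is an array of objects keyed by column name. The key names below are copied from the column list in this issue and may not match the file exactly, and the version pattern (a trailing number such as "13.1.6" or "v2.0") is a guess at how versions appear in the data.

```javascript
// Technology columns where a trailing version number should be dropped.
// Key names are an assumption based on the column list in this issue.
const TECH_COLUMNS = ["CDN", "PAAS", "Static Site Generator", "Page Builders",
  "UI Frameworks", "Web frameworks", "Web servers"];

// Columns kept as-is (numbers here are data, not versions).
const OTHER_COLUMNS = ["Country", "Responsive", "Technology Spend"];

// Remove a trailing version such as " 13.1.6" or " v2.0" from a technology name.
function stripVersion(value) {
  return String(value).replace(/\s+v?\d+(\.\d+)*\s*$/i, "").trim();
}

// Keep only the columns we chart; missing values become empty strings.
function cleanRecord(record) {
  const cleaned = {};
  for (const column of TECH_COLUMNS) {
    cleaned[column] = record[column] == null ? "" : stripVersion(record[column]);
  }
  for (const column of OTHER_COLUMNS) {
    cleaned[column] = record[column] == null ? "" : String(record[column]).trim();
  }
  return cleaned;
}

// Clean the whole dataset without mutating the input.
function cleanData(records) {
  return records.map(cleanRecord);
}
```

Usage would be along the lines of `cleanData(JSON.parse(rawJsonText))`, where `rawJsonText` is the contents of the .json file in the data folder.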

Here are the columns we care about.

The data needs cleaning in many of these columns, and a pie graph is probably the best way for us to visualize them.

  • CDN
  • PAAS
  • Static Site Generator
  • Page Builders
  • UI Frameworks
  • Web frameworks
  • Web servers (what exactly is different between this and web frameworks?)
  • Country
  • Responsive
  • Technology Spend

Notes

I'd probably put an "unknown" entry for each blank entry instead of ignoring them completely.
We should be able to add more data and have this still work. For example, I'll want to add ~5000 more records to this.
Some of the URLs are landing pages and others are the apps themselves (root domain vs app.root), which will distort the technology data, as landing pages often use a different stack. We can adjust for this in the future (separate landing pages and apps).
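
The first two notes could be handled by a small tally function like the sketch below. It makes no assumption about how many records there are, and it labels blanks as "unknown" rather than dropping them. The `{ type, value }` output shape is an assumption, picked because category/value pairs are what pie charts generally consume; the field names should be checked against whichever chart library ends up being used.

```javascript
// Count how often each value appears in one column, labelling blanks as "unknown".
function tallyColumn(records, column) {
  const counts = new Map();
  for (const record of records) {
    const raw = record[column];
    const text = raw == null ? "" : String(raw).trim();
    const label = text === "" ? "unknown" : text;
    counts.set(label, (counts.get(label) || 0) + 1);
  }
  // One { type, value } entry per distinct label, largest slice first.
  return [...counts.entries()]
    .map(([type, value]) => ({ type, value }))
    .sort((a, b) => b.value - a.value);
}
```

Because the function only loops over whatever array it is given, adding ~5000 more records should need no code changes; the slices are simply recomputed.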

Bonus:

  • Turn the JavaScript Frameworks column into some useful graph somehow
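
One possible approach for the bonus, sketched under a big assumption: that a cell in this column holds several framework names separated by commas or semicolons, optionally with version numbers. The column key and the separators are guesses and would need checking against the actual file.

```javascript
// Split a multi-valued cell such as "React; Next.js 13.1.6" into clean names.
function splitFrameworks(cell) {
  if (cell == null) return [];
  return String(cell)
    .split(/[;,]/)
    .map((name) => name.replace(/\s+v?\d+(\.\d+)*\s*$/i, "").trim())
    .filter((name) => name !== "");
}

// Count sites per framework; a site using two frameworks counts toward both.
// The default column key is hypothetical.
function frameworkUsage(records, column = "Javascript Frameworks", topN = 10) {
  const counts = new Map();
  for (const record of records) {
    // A Set avoids double-counting a framework listed twice for the same site.
    for (const name of new Set(splitFrameworks(record[column]))) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([framework, sites]) => ({ framework, sites }))
    .sort((a, b) => b.sites - a.sites)
    .slice(0, topN);
}
```

Since one site can appear under several frameworks, the counts do not sum to the number of sites, so a bar chart of the top entries is likely a better fit here than a pie.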
