velocityzen / meta-extractor

Super simple and fast html page meta data extractor with low memory footprint

License: MIT License

JavaScript 100.00%
atom extractor feed html meta metadata nodejs opengraph parser rss

meta-extractor's Issues

support for google maps and request timeout functionality

Hello, great work. Can you please add support for Google Maps meta tags? They are currently not being parsed. Below are a URL, the Facebook Open Graph result for it, and the meta-extractor result for comparison. Please also add request timeout functionality. Thanks.
URL: https://www.google.com.bd/maps/place/Shaheed+Suhrawardy+Medical+College+and+Hospital/@23.7688139,90.360714,15z/data=!4m5!3m4!1s0x0:0x515136eca08d1278!8m2!3d23.769162!4d90.3710112?hl=en

Facebook:

capture3

meta-extractor:

capture4
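
Until a timeout option exists, a caller could enforce a deadline around any callback-style call. The following is a minimal sketch, not part of meta-extractor: the helper name withTimeout and the task signature are hypothetical, and it only stops waiting for the result; it does not abort the underlying HTTP request.

```javascript
// Hypothetical helper: runs `task(cb)` and reports a timeout error
// through `done` if `cb` has not been called within `ms` milliseconds.
function withTimeout(task, ms, done) {
  let settled = false;

  const timer = setTimeout(() => {
    if (settled) return;
    settled = true;
    done(new Error(`timed out after ${ms} ms`));
  }, ms);

  task((err, result) => {
    if (settled) return; // a late or repeated result is ignored
    settled = true;
    clearTimeout(timer); // lets the process exit without waiting for the timer
    done(err, result);
  });
}

// Usage sketch (assumes a callback-style extract(opts, cb) function):
// withTimeout((cb) => extract({ uri: 'http://example.com' }, cb), 5000,
//   (err, meta) => console.log(err, meta));
```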

Adding keywords

Hi,
I am happy to open a pull request, but I am wondering whether you would accept a patch that also extracts keywords.
It only requires adding keywords to rxMeta:
let rxMeta = /charset|keywords|description|twitter:|og:|theme-color/im;

Let me know,

Philippe
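
The effect of the proposed pattern can be checked in isolation. This sketch assumes the pattern is tested against each meta tag's name or property attribute, which is an assumption about the library's internals; the sample attribute values are made up for illustration.

```javascript
// Pattern from the issue, with `keywords` added to the alternation.
const rxMeta = /charset|keywords|description|twitter:|og:|theme-color/im;

// Hypothetical attribute values as they might appear on <meta> tags.
const names = ['keywords', 'og:title', 'viewport', 'Description'];

// No `g` flag is used, so repeated .test() calls carry no lastIndex state.
const matched = names.filter((name) => rxMeta.test(name));

console.log(matched); // [ 'keywords', 'og:title', 'Description' ]
```

Because the pattern is unanchored, it matches anywhere in the string, so a name that merely contains one of the alternatives would also pass.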

CORS error

Trying to extract from an external site returns:

XMLHttpRequest cannot load "site-url". No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'localhost-url' is therefore not allowed access.

Is there any workaround?
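
One common workaround is to run the extraction on a server you control and let the browser call that server instead of the external site, since CORS is enforced by browsers and not by Node.js. The sketch below is an assumption-laden illustration: createMetaHandler, the /meta?url=... query shape, and the allowed origin are hypothetical, and the extractor is injected as a callback-style function (opts, cb) rather than tied to any particular library API.

```javascript
// Hypothetical request handler factory for a small metadata proxy.
// `extractFn(opts, cb)` is injected; `allowedOrigin` is your front-end origin.
function createMetaHandler(extractFn, allowedOrigin) {
  return function handler(req, res) {
    const json = { 'Content-Type': 'application/json' };
    const target = new URL(req.url, 'http://localhost').searchParams.get('url');

    // Reject missing or non-HTTP(S) targets before doing any work.
    if (!target || !/^https?:\/\//i.test(target)) {
      res.writeHead(400, json);
      res.end(JSON.stringify({ error: 'missing or invalid url parameter' }));
      return;
    }

    extractFn({ uri: target }, (err, meta) => {
      // This header is what allows the browser to read the response.
      const headers = { ...json, 'Access-Control-Allow-Origin': allowedOrigin };
      if (err) {
        res.writeHead(502, headers);
        res.end(JSON.stringify({ error: String(err.message || err) }));
        return;
      }
      res.writeHead(200, headers);
      res.end(JSON.stringify(meta));
    });
  };
}

// Usage sketch (assumes extract is a callback-style extractor function):
// require('http')
//   .createServer(createMetaHandler(extract, 'http://localhost:3000'))
//   .listen(8080);
```

A proxy like this fetches arbitrary URLs on behalf of clients, so in practice it would also need restrictions on which hosts may be requested.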

Does meta-extractor protect against LARGE HTML files?

Let's assume I have a 2GB file called index.html on my web server, and my HTTP server serves it with the usual Content-Type: text/html, so everything looks normal.

If I pass such a link (http://mydomain/index.html) to the extract() function of meta-extractor, will it attempt to download 2GB of data?

Obviously this is something like an attack. Does meta-extractor prevent it by limiting the total HTML file size or something similar?
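
For reference, the general defence is to count bytes while streaming and stop reading once a cap is exceeded. The sketch below illustrates that technique only; it is not taken from meta-extractor's source, and the name readWithLimit and its signature are hypothetical. It works with any Node.js readable stream, such as an HTTP response.

```javascript
// Hypothetical helper: buffers a stream, but gives up as soon as more
// than `maxBytes` bytes have arrived, so a huge body is never fully read.
function readWithLimit(stream, maxBytes, done) {
  const chunks = [];
  let received = 0;
  let finished = false;

  const finish = (err, body) => {
    if (finished) return; // report only the first outcome
    finished = true;
    done(err, body);
  };

  stream.on('data', (chunk) => {
    if (finished) return;
    received += chunk.length; // byte length when chunks are Buffers
    if (received > maxBytes) {
      // Tear down the stream so no further data is read.
      if (typeof stream.destroy === 'function') stream.destroy();
      finish(new Error(`response exceeded ${maxBytes} bytes`));
      return;
    }
    chunks.push(chunk);
  });

  stream.on('end', () => finish(null, Buffer.concat(chunks)));
  stream.on('error', (err) => finish(err));
}
```

With a cap of a few megabytes, the 2GB file in the scenario above would be abandoned after the first chunks instead of being downloaded in full.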

add typescript

Your project is very cool, but it is a bit difficult to use in a TypeScript project without type definitions.
It would be very nice if you could provide types for the project.
