
ogpparser's Introduction

Open Graph Protocol Parser

The Japanese README is available here.

"ogp-parser" is a Node.js library that extracts metadata such as OGP and SEO information from a website.

[IMPORTANT]
As of v0.5.0, the library uses axios and can extract oEmbed information. As a result, v0.5.0 contains breaking changes.

Release Information

  • 2023-06: v0.8.1 released
    • remove axios from package.json (I forgot...)
  • 2023-06: v0.8.0 released
    • removed some dependent modules such as axios
    • modified the type definition of oEmbed structure
    • refactored some code
  • 2023-03: v0.7.1 released TypeScript upgrade (to 4.8, because the v0.7.0 build failed)
  • 2023-03: v0.7.0 released translated README to English and some security updates
  • 2021-08: v0.6.0 released support TypeScript
  • 2021-04: v0.5.6 released security updates
  • 2021-02: v0.5.5 released update axios
  • 2020-08: v0.5.4 released fix bug
  • 2020-07: v0.5.3 released fix bug that prevented installing ogp-parser from npm
  • 2020-07: v0.5.2 released security update (#27)
  • 2020-04: v0.5.1 released same as v0.5.0 (because publishing v0.5.0 to npm failed)
  • 2020-04: v0.5.0 released support extracting oEmbed, and update interface to use axios
  • 2020-03: v0.4.7 released add npm keyword
  • 2020-03: v0.4.6 released security update
  • 2020-02: v0.4.5 released bug fix (thanks to @RyosukeCla)
  • 2019-08: v0.4.4 released library update
  • 2018-01: v0.4.1 released refactoring (to use ES2015 syntax)
  • 2016-08: v0.4.0 released refactoring (support Promise)
  • 2016-07: v0.3.1 released refactoring (support charset that is not UTF-8)
  • 2015-05: v0.3.0 released
  • 2015-04: support redirect option
  • 2015-03: support https
  • 2015-03: support page title
  • 2014-06: fix data format
  • 2014-06: add SEO tag information

Dependencies

See package.json for the list of dependencies.

Install

npm install -S ogp-parser

Test

npm test

If you want to see coverage:

npm run test-cov

Usage

JavaScript

const ogp = require('ogp-parser');

TypeScript

import ogp from 'ogp-parser'

Example

Since v0.5, the library can extract oEmbed information. To do so, it uses the href attribute of any link tag whose type is one of the following:

  • application/json+oembed
  • text/xml+oembed

const ogp = require("ogp-parser");
const url = "https://twitter.com/ukyoda";
console.log("URL:" + url);

ogp(url).then(function(data) {
    console.log(JSON.stringify(data, null, "    "));
}).catch(function(error) {
    console.error(error);
});

Result

{
    "title": "うきょう(@ukyoda)さん | Twitter",
    "ogp": {
        "al:ios:url": [
            "twitter://user?screen_name=ukyoda"
        ],
        "al:ios:app_store_id": [
            "333903271"
        ],
        "al:ios:app_name": [
            "Twitter"
        ],
        "al:android:url": [
            "twitter://user?screen_name=ukyoda"
        ],
        "al:android:package": [
            "com.twitter.android"
        ],
        "al:android:app_name": [
            "Twitter"
        ]
    },
    "seo": {
        "robots": [
            "NOODP"
        ],
        "description": [
            "うきょう (@ukyoda)さんの最新ツイート 独立系SIer。ビッグデータや機械学習を使ったシステム開発によく携わっています。 最近はPythonが多いですが、JavascriptとかPHPとかJavaとかC/C++での開発もやってます。 https://t.co/y8iW4rQ7lD ザクソン村"
        ],
        "msapplication-TileImage": [
            "//abs.twimg.com/favicons/win8-tile-144.png"
        ],
        "msapplication-TileColor": [
            "#00aced"
        ],
        "swift-page-name": [
            "profile"
        ],
        "swift-page-section": [
            "profile"
        ]
    },
    "oembed": {
        "url": "https://twitter.com/ukyoda",
        "title": "",
        "html": "<a class=\"twitter-timeline\" href=\"https://twitter.com/ukyoda?ref_src=twsrc%5Etfw\">Tweets by ukyoda</a>\n<script async src=\"https://platform.twitter.com/widgets.js\" charset=\"utf-8\"></script>\n",
        "width": null,
        "height": null,
        "type": "rich",
        "cache_age": "3153600000",
        "provider_name": "Twitter",
        "provider_url": "https://twitter.com",
        "version": "1.0"
    }
}
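
Because every entry under ogp and seo in the result above is an array of strings, reading a single value takes a small amount of unwrapping. The sketch below shows one way to do that. The firstValue helper and the trimmed data object are hypothetical names introduced only for illustration; they are not part of ogp-parser.

```javascript
// Return the first value stored under `key`, or `fallback` when the key is absent.
// Assumes each entry is an array of strings, as in the result shown above.
function firstValue(section, key, fallback = null) {
  const values = section && section[key];
  return Array.isArray(values) && values.length > 0 ? values[0] : fallback;
}

// Trimmed copy of the result above, used only for illustration.
const data = {
  title: "うきょう(@ukyoda)さん | Twitter",
  ogp: { "al:ios:app_name": ["Twitter"] },
  seo: { robots: ["NOODP"] },
};

console.log(firstValue(data.ogp, "al:ios:app_name"));    // Twitter
console.log(firstValue(data.seo, "robots"));             // NOODP
console.log(firstValue(data.ogp, "og:image", "(none)")); // (none)
```

In an application, the same helper could be applied to the object that the parser's promise resolves to.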

If you don't need oEmbed information

Extracting oEmbed information requires an additional HTTP request on top of the normal one. If your application doesn't need oEmbed information, you can skip that step with the skipOembed option.

const parser = require("ogp-parser");
const url = "https://twitter.com/ukyoda";
parser(url, { skipOembed: true }).then(function(data) {
    console.log(JSON.stringify(data, null, "    "));
}).catch(function(error) {
    console.error(error);
});

Result (no oEmbed)

{
    "title": "うきょう(@ukyoda)さん | Twitter",
    "ogp": {
        "al:ios:url": [
            "twitter://user?screen_name=ukyoda"
        ],
        "al:ios:app_store_id": [
            "333903271"
        ],
        "al:ios:app_name": [
            "Twitter"
        ],
        "al:android:url": [
            "twitter://user?screen_name=ukyoda"
        ],
        "al:android:package": [
            "com.twitter.android"
        ],
        "al:android:app_name": [
            "Twitter"
        ]
    },
    "seo": {
        "robots": [
            "NOODP"
        ],
        "description": [
            "うきょう (@ukyoda)さんの最新ツイート 独立系SIer。ビッグデータや機械学習を使ったシステム開発によく携わっています。 最近はPythonが多いですが、JavascriptとかPHPとかJavaとかC/C++での開発もやってます。 https://t.co/y8iW4rQ7lD ザクソン村"
        ],
        "msapplication-TileImage": [
            "//abs.twimg.com/favicons/win8-tile-144.png"
        ],
        "msapplication-TileColor": [
            "#00aced"
        ],
        "swift-page-name": [
            "profile"
        ],
        "swift-page-section": [
            "profile"
        ]
    }
}
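
To give a rough idea of the discovery step that skipOembed avoids, the sketch below collects the href of link tags whose type is application/json+oembed or text/xml+oembed from an HTML string. It is not the library's implementation: findOembedLinks, getAttr, and the sample HTML are assumptions made for illustration, and the regular-expression matching is deliberately simple (it does not decode HTML entities or handle unquoted attributes).

```javascript
const OEMBED_TYPES = ["application/json+oembed", "text/xml+oembed"];

// Read one quoted attribute value (double or single quotes) from a tag string.
function getAttr(tag, name) {
  const match = tag.match(new RegExp(`\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)')`, "i"));
  if (!match) return null;
  return match[1] !== undefined ? match[1] : match[2];
}

// Collect the href of every <link> tag whose type is one of the oEmbed types.
function findOembedLinks(html) {
  const tags = html.match(/<link\b[^>]*>/gi) || [];
  return tags
    .filter((tag) => OEMBED_TYPES.includes((getAttr(tag, "type") || "").toLowerCase()))
    .map((tag) => getAttr(tag, "href"))
    .filter((href) => href !== null);
}

// Hypothetical page head used only for this example.
const sampleHtml = `
<head>
  <link rel="alternate" type="application/json+oembed" href="https://example.com/oembed?format=json">
  <link rel="stylesheet" type="text/css" href="/style.css">
  <link rel="alternate" type="text/xml+oembed" href="https://example.com/oembed?format=xml">
</head>`;

console.log(findOembedLinks(sampleHtml));
```

Each discovered URL would then need its own HTTP request, which is the extra cost the skipOembed option removes.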

Disclaimer Note

This library is published under the MIT License.
I place no restrictions on its use beyond the terms of that license.
I make no guarantees and accept no liability for any problems that arise from using this library.

ogpparser's Issues

ogp[], seo[], must be array?

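I think the returned ogp and seo values would be better off not being arrays. What do you think?
Even when the returned arrays contain data, their length is 0, so they are often treated as empty arrays during processing.

Wouldn't it be better either to fix them so that the array length is correct, or to change them to objects?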
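
The behavior described in this issue matches how JavaScript arrays treat string keys in general. The sketch below reproduces it in isolation; whether the library built its result this way at the time is an assumption inferred from the report, and the variable names are illustrative only.

```javascript
// Assigning a string key to an array adds a property, not an element,
// so `length` stays 0 even though the value is reachable by key.
const asArray = [];
asArray["og:title"] = "Example";

console.log(asArray.length);          // 0
console.log(asArray["og:title"]);     // Example
console.log(JSON.stringify(asArray)); // []

// A plain object supports the same lookup and serializes as expected.
const asObject = {};
asObject["og:title"] = ["Example"];

console.log(Object.keys(asObject).length); // 1
console.log(JSON.stringify(asObject));     // {"og:title":["Example"]}
```

The Result samples in the README above show ogp and seo as plain objects whose values are arrays.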

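I want to extract h1 tags and link tags

I want to extract h1 tags and a tags. I'm still considering how to extract them.

  • As before, extract them as a list of <h1> elements and a list of <a> elements
    • Nothing is missed when extracting
  • HTML5-style: group by article, header, section, and so on, then extract
    • Nothing can be extracted from pages that do not use these elements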

TypeError: "NetworkError when attempting to fetch resource."

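I'm planning to fetch OGP information and build my own blog cards.

The environment is WordPress.
The WordPress editor (Gutenberg) appears to be built with JavaScript, and I'm also using JavaScript to create custom blocks.

I was planning to use this parser to get the OGP information, but the error in the title occurred when fetching information for external links.
(Information for internal links is fetched without any problem.)

The message says that the CORS header 'Access-Control-Allow-Origin' is missing, and it seems to occur when reading information from a URL on a different domain. It appears to be an error caused by the same-origin policy, but is there any way to get the OGP information of external links?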
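
One common way around this kind of browser restriction is to run the parser on a server and let the page call that server instead. The following is a minimal sketch under stated assumptions, not a recommendation from the library author: readTargetUrl, createOgpProxy, the url query parameter, and port 3000 are all hypothetical, and the parser function is passed in as an argument rather than required directly. A real deployment would also need to restrict which hosts may be fetched, and to serve the proxy from the same origin as the page or add its own CORS headers.

```javascript
const http = require("http");

// Validate the `url` query parameter; returns the target URL string or null.
function readTargetUrl(requestUrl) {
  const target = new URL(requestUrl, "http://localhost").searchParams.get("url");
  if (!target) return null;
  try {
    const parsed = new URL(target);
    // Only plain web URLs are accepted.
    return parsed.protocol === "http:" || parsed.protocol === "https:" ? parsed.href : null;
  } catch (_err) {
    return null;
  }
}

// `parse` is injected so the sketch stays independent of a specific parser version;
// in practice it would be the function exported by ogp-parser.
function createOgpProxy(parse) {
  return http.createServer(async (req, res) => {
    const target = readTargetUrl(req.url);
    if (!target) {
      res.writeHead(400, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: "missing or invalid url parameter" }));
      return;
    }
    try {
      const data = await parse(target);
      res.writeHead(200, { "Content-Type": "application/json; charset=utf-8" });
      res.end(JSON.stringify(data));
    } catch (err) {
      // The upstream fetch or parse failed.
      res.writeHead(502, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ error: String(err) }));
    }
  });
}

// Usage (assumed port): createOgpProxy(require("ogp-parser")).listen(3000);
```

The browser code would then request something like /?url=<encoded external URL> from this server, so the cross-origin fetch happens in Node.js, where the same-origin policy does not apply.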

Install error: ENOENT: no such file or directory, open './package-lock.json'

PS C:\Users\user> npm i -global ogp-parser

> [email protected] preinstall C:\Users\user\AppData\Roaming\npm\node_modules\ogp-parser
> npx npm-force-resolutions

npx: installed 5 in 1.729s
ENOENT: no such file or directory, open './package-lock.json'
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] preinstall: `npx npm-force-resolutions`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] preinstall script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\user\AppData\Roaming\npm-cache\_logs\2020-07-21T02_58_56_906Z-debug.log

I downloaded the repository from GitHub and installed it locally:

PS C:\Users\user> cd C:\Users\user\Downloads\ogpParser-master
PS C:\Users\user\Downloads\ogpParser-master> npm install -g

> [email protected] preinstall C:\Users\user\AppData\Roaming\npm\node_modules\ogp-parser
> npx npm-force-resolutions

npx: installed 5 in 1.813s

> [email protected] postinstall C:\Users\user\Downloads\ogpParser-master\node_modules\fast-xml-parser
> node tasks/postinstall.js || exit 0

Love fast-xml-parser? Check https://amitkumargupta.work for more projects and contribution.

+ [email protected]
added 29 packages from 67 contributors in 4.193s
PS C:\Users\user\Downloads\ogpParser-master>

That worked, but the dependencies are missing.

PS C:\Users\user\AppData\Roaming\npm> node.exe .\test.js
internal/modules/cjs/loader.js:1033
  throw err;
  ^

Error: Cannot find module 'he'
Require stack:
- C:\Users\user\Downloads\ogpParser-master\ogpParser.js
- C:\Users\user\AppData\Roaming\npm\test.js
    at Function.Module._resolveFilename (internal/modules/cjs/loader.js:1030:15)
    at Function.Module._load (internal/modules/cjs/loader.js:899:27)
    at Module.require (internal/modules/cjs/loader.js:1090:19)
    at require (internal/modules/cjs/helpers.js:75:18)
    at Object.<anonymous> (C:\Users\user\Downloads\ogpParser-master\ogpParser.js:9:12)
    at Module._compile (internal/modules/cjs/loader.js:1201:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1221:10)
    at Module.load (internal/modules/cjs/loader.js:1050:32)
    at Function.Module._load (internal/modules/cjs/loader.js:938:14)
    at Module.require (internal/modules/cjs/loader.js:1090:19) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    'C:\\Users\\user\\Downloads\\ogpParser-master\\ogpParser.js',
    'C:\\Users\\user\\AppData\\Roaming\\npm\\test.js'
  ]
}

Please help. orz
Is English OK?

403 Forbidden

var parser = require('ogp-parser')
var url = 'http://anond.hatelabo.jp'
parser(url, true)
.then(function (data) {
  console.log(data)
})
.catch(function (error) {
  console.error(error)
})

403 Forbidden was returned - node 7.1.0

{ title: '403 Forbidden', ogp: {}, seo: {} }
