makepad-fr / fbjs
Tooling that automates your Facebook interactions.
Home Page: https://www.npmjs.com/package/@makepad/fbjs
License: GNU General Public License v3.0
Posts and Author are incorrect. It's saving the Author as the post and the post as the author. :)
For instance, we do not have much information on our README. Also, we do not have proper documentation. We need to update this to help newcomers.
Hello.
I'm excited to try out this tool. I just installed it and ran it but got this error:
(node:47502) UnhandledPromiseRejectionWarning: TimeoutError: waiting for selector "#login_form" failed: timeout 30000ms exceeded.
This is the command I used:
fgps --group-ids ##########
I then ran it with the --headful param and can see the browser open and load facebook.com, but it doesn't fill in the username/password.
The groupIds in Options is unused. We need to remove it, as it needlessly complicates the FB.init function.
Hi! Is it possible to set a limit on the number of posts to scrape?
I've been searching and I found this browser API: captureStream().
It allows you to capture a stream from HTML video/audio/canvas elements:
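const video = document.querySelector('video');
// Capture a MediaStream from the video element. Note: the frame-rate argument
// (e.g. 30) is only defined for canvas.captureStream(); a video element takes none.
const stream = video.captureStream();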
This approach can be used to scrape Facebook videos by recording the captured stream with the MediaRecorder browser API.
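A minimal sketch of the recording idea, assuming a browser context where MediaRecorder is available and the video is not cross-origin tainted. The helper name recordStream and its RecorderCtor parameter are hypothetical and not part of fbjs; the constructor is injectable only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: record a MediaStream for a fixed duration and
// resolve with the collected chunks (Blob parts in a real browser).
// RecorderCtor defaults to the browser's MediaRecorder.
function recordStream(stream, durationMs, RecorderCtor = globalThis.MediaRecorder) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    const recorder = new RecorderCtor(stream);
    // Collect every non-empty chunk the recorder hands back.
    recorder.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };
    recorder.onerror = (event) => reject(event.error || new Error('recording failed'));
    recorder.onstop = () => resolve(chunks);
    recorder.start();
    // Stop after the requested duration; onstop then resolves the promise.
    setTimeout(() => recorder.stop(), durationMs);
  });
}

// Browser usage (assumption: a <video> element is playing on the page):
// const stream = document.querySelector('video').captureStream();
// recordStream(stream, 5000).then((chunks) => {
//   const blob = new Blob(chunks, { type: 'video/webm' });
//   console.log('recorded bytes:', blob.size);
// });
```

Whether Facebook's player exposes a capturable stream in every case is not something this sketch verifies.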
The post id is useful as a kind of hash for detecting changes in a post. For instance, if a post changes over time, we just scrape that one again. The post id can be obtained from the href attribute of the date a element below the author's name. Once the post id is obtained, the GroupPost class should be updated.
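A rough sketch of pulling the id out of such an href. The helper name extractPostId is hypothetical, and the two URL shapes it matches (/groups/<group>/permalink/<id>/ and /groups/<group>/posts/<id>/) are assumptions about Facebook's link format that may not hold for every page.

```javascript
// Hypothetical helper: pull a numeric post id out of a permalink href.
// Assumption: group post links look like
//   /groups/<group>/permalink/<postId>/  or  /groups/<group>/posts/<postId>/
// Facebook's markup changes often, so treat these patterns as examples.
function extractPostId(href) {
  // A base URL lets relative hrefs parse too.
  const url = new URL(href, 'https://www.facebook.com');
  const match = url.pathname.match(/\/groups\/[^/]+\/(?:permalink|posts)\/(\d+)/);
  return match ? match[1] : null;
}
```

Returning null for unrecognized links lets the caller decide whether to skip the post or fall back to another identifier.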
Hi there! I can't run the example script on my local machine.
Steps I took:
1. git clone [email protected]:mihailthebuilder/fbjs.git && cd fbjs/example
2. npm install inside the example folder.
3. FACEBOOK_USERNAME="<your_facebook_username>" FACEBOOK_PASSWORD="<your_facebook_password>" FACEBOOK_2FA_CODE="<facebook_2fa_code>" FACEBOOK_GROUP_ID="<facebook_group_id>" npm start
4. cd .. && npm install to fix above errors (takes a good few minutes to install)
5. Back in the example folder: cd example
6. FACEBOOK_USERNAME="<your_facebook_username>" FACEBOOK_PASSWORD="<your_facebook_password>" FACEBOOK_2FA_CODE="<facebook_2fa_code>" FACEBOOK_GROUP_ID="<facebook_group_id>" npm start
but I get a MODULE_NOT_FOUND error
Local environment:
Is there any chance you could add a Docker image?
fbjs-cli repository content in this repository as a cli folder
package.json to compile both library and CLI with a single script
package.json to publish the CLI
For instance, CircleCI is not working for deployment. It only checks for linter errors. We need to deploy automatically via CircleCI.
For instance, all our functions are in the FB class. If we keep adding new features, this will get very ugly. We need to move the existing functions related to group posts into a FacebookGroup class that will contain all features related to Facebook groups.
Add max post per group option so that it won't download all posts in very large groups.
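One way such a limit could work, as a sketch only. Both collectPosts and the loadNextBatch callback are hypothetical names: the callback stands in for whatever scrolls the page and returns newly found posts, and is not an existing fbjs function.

```javascript
// Hypothetical sketch: keep loading batches until `maxPosts` is reached
// or the source runs dry. `loadNextBatch` is an assumed async callback
// that returns an array of new posts (an empty array means no more).
async function collectPosts(loadNextBatch, maxPosts = Infinity) {
  const posts = [];
  while (posts.length < maxPosts) {
    const batch = await loadNextBatch();
    if (batch.length === 0) break;
    // Only take as many posts as the limit still allows.
    posts.push(...batch.slice(0, maxPosts - posts.length));
  }
  return posts;
}
```

Checking the limit before each load means a very large group stops scrolling as soon as enough posts are collected.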
Given a user id, we want to add this user as a friend if they are not already on our friend list. If the given user is already on our friend list, do nothing.
fgps --group-ids 610355872400171 --output C:/Users/xx/Documents/
Cookie banner did not appear
(node:9572) UnhandledPromiseRejectionWarning: TimeoutError: waiting for XPath "//div[@data-pagelet="Stories"]" failed: timeout 30000ms exceeded
at new WaitTask (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\DOMWorld.js:549:28)
at DOMWorld._waitForSelectorOrXPath (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\DOMWorld.js:478:22)
at DOMWorld.waitForXPath (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\DOMWorld.js:441:17)
at Frame.waitForXPath (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\FrameManager.js:642:47)
at Frame.<anonymous> (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\helper.js:112:23)
at Page.waitForXPath (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Page.js:1131:29)
at facebookLogIn (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:366:14)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async main (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:612:10)
(Use `node --trace-warnings ...` to show where the warning was created)
(node:9572) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:9572) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
(node:9572) UnhandledPromiseRejectionWarning: Error: Page crashed!
at Page._onTargetCrashed (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Page.js:213:24)
at CDPSession.<anonymous> (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Page.js:122:56)
at CDPSession.emit (events.js:315:20)
at CDPSession._onMessage (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Connection.js:200:12)
at Connection._onMessage (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Connection.js:112:17)
at WebSocket.<anonymous> (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\WebSocketTransport.js:44:24)
at WebSocket.onMessage (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\ws\lib\event-target.js:120:16)
at WebSocket.emit (events.js:315:20)
at Receiver.receiverOnMessage (C:\Users\xx\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\ws\lib\websocket.js:789:20)
at Receiver.emit (events.js:315:20)
(node:9572) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
ERROR: The process with PID 2340 (child process of PID 9572) could not be terminated. Reason: There is no running instance of the task.
Hi I'm experiencing this error after installing 2.4.0.
Note: This solution applies to the desktop version of the Facebook website, like the other solutions I'm providing to improve this library. You should switch from the mobile version first; then I'll start making some pull requests.
Scraping text from posts on the desktop version is much more complicated than on the mobile version, since it comes in the form of HTML elements rather than plain text. The key here is finding the right selector for the post body. The other elements we need to scrape (images, videos, submission permalink, and other stuff) need a separate issue and a deeper discussion.
Anyway, using the browser inspector we can see what it looks like under the hood:
You'll notice that it's located between two pseudo-elements (::before and ::after). We just need to copy the .innerHTML of the parent element, then convert it to markdown. There is a very good library for that called turndown, and as you can see from the image below, we MADE IT!
Another issue is the See More button: you should click it first to allow more text to appear:
And that's all, I hope that this information will help <3
fb groups:get:posts -i 689598851175466 --headfull --output=C:\Users\myname\Desktop\fc.json -c 10
Running in headless mode ? true
Спомени за родната казарма* публична група | Facebook
Facebook group's posts scraped: 156 posts found
Error: ENOENT: no such file or directory, open 'C:\Users\apollo\Desktop\fc.jsongroupname* публична група | Facebook.json'
Code: ENOENT
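The failing path above looks like the --output value with the page title and .json appended to it, and the title contains characters (* and |) that Windows does not allow in file names. A minimal sketch of one way to avoid that, assuming the output may be either a .json file or a directory; resolveOutputFile is a hypothetical helper, not the library's actual code.

```javascript
const path = require('path');

// Hypothetical helper: decide where the scraped posts should be written.
// - If `output` already names a .json file, use it unchanged.
// - Otherwise treat it as a directory and derive a file name from the
//   group title, replacing characters Windows forbids in file names.
function resolveOutputFile(output, groupTitle) {
  if (path.extname(output).toLowerCase() === '.json') {
    return output;
  }
  const safeName = groupTitle.replace(/[<>:"/\\|?*]/g, '_').trim();
  return path.join(output, `${safeName}.json`);
}
```

Checking the extension first also avoids producing names that end in .json.json.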
Given a user id, we want to return public information for the user.
UserProfile which contains this information
profile module which will contain all functions related to the user profile.
.json file ends with .json.json.
cookieOutputPath
Facebook is a heavy beast. You guys use puppeteer and wait a random timeout before scrolling down. This seems to be slow AF.
mbasic.facebook.com does not use any JS files; it's simply an HTML page that can be parsed. No need to load all of the images and videos while scrolling.
#m_group_stories_container > section > article selector.
#m_group_stories_container > section + div > a.
Also it would make implementing #15 much easier.
Only downside I can see, is one additional request per post to get an url of full size image.
Can you please give an example of how this could be used in a script?
I'd love to have RSS feeds generated from private Facebook groups. This repo has pretty much everything I need, though instead of forking it and maintaining a modified version, I'd rather use it as a module.
For instance, there are no tests at all except for manual tests. We need to have tests.
Running with this example: fgps --group-ids 196322762283544 --output /path
get this error:
(node:45896) UnhandledPromiseRejectionWarning: TimeoutError: waiting for XPath "//button[@data-cookiebanner="accept_button"]" failed: timeout 30000ms exceeded
at new WaitTask (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:549:28)
at DOMWorld._waitForSelectorOrXPath (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:478:22)
at DOMWorld.waitForXPath (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:441:17)
at Frame.waitForXPath (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/FrameManager.js:642:47)
at Frame.<anonymous> (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/helper.js:112:23)
at Page.waitForXPath (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/Page.js:1131:29)
at facebookLogIn (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/src/index.js:331:14)
at processTicksAndRejections (internal/process/task_queues.js:85:5)
at async main (/Users/assafelovic/.nvm/versions/node/v12.9.1/lib/node_modules/facebook-group-posts-scraper/src/index.js:608:10)
(node:45896) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:45896) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
On CentOS 8 without GUI:
(node:1136594) UnhandledPromiseRejectionWarning: Error: Failed to launch the browser process!
/home/grzegorz.kowalski/facebook-group-posts-scraper/node_modules/puppeteer/.local-chromium/linux-722234/chrome-linux/chrome: error while loading shared libraries: libX11-xcb.so.1: cannot open shared object file: No such file or directory
To fix it (and a host of similar errors) I did:
sudo dnf install libX11-xcb libXcomposite libXcursor libXdamage libXi libXtst libXss cups libXScrnSaver alsa-lib atk at-spi2-atk pango gtk3
Installed just now with
npm i facebook-group-posts-scraper -g --unsafe-perm
Version:
# fgps --version
2.3.0
(node:35) UnhandledPromiseRejectionWarning: TimeoutError: waiting for selector "#login_form" failed: timeout 30000ms exceeded
at new WaitTask (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:549:28)
at DOMWorld._waitForSelectorOrXPath (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:478:22)
at DOMWorld.waitForSelector (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/DOMWorld.js:432:17)
at Frame.waitForSelector (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/FrameManager.js:627:47)
at Frame.<anonymous> (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/helper.js:112:23)
at Page.waitForSelector (/usr/local/lib/node_modules/facebook-group-posts-scraper/node_modules/puppeteer/lib/Page.js:1122:29)
at facebookLogIn (/usr/local/lib/node_modules/facebook-group-posts-scraper/src/index.js:335:14)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
at async main (/usr/local/lib/node_modules/facebook-group-posts-scraper/src/index.js:599:10)
(node:35) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:35) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
In the example folder, the static file for the output of the group messages is present.
.gitignore
I don't quite see how I'm going to pass the 2FA code every time I want to scrape the group. Also one may perceive inputting password as a security threat. Especially if it's stored in plaintext (I have no idea if it is.)
CI is broken for now.
For instance, we're scraping only messages published in a Facebook Group. We need to get comments on this message and comments on a comment.
{
id: String,
parent: String,
author: String,
content: String
}
Here id is the id of the comment (obtained, as for a post, from the href of the date's link), parent is the id of the parent, author is the name of the author, and content is the content of the comment.
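With flat records of that shape, comments on a comment can be rebuilt into a tree by following parent. A minimal sketch, assuming a top-level comment's parent is the post id (or any value that is not the id of another comment in the list); buildCommentTree and the replies field are hypothetical names.

```javascript
// Hypothetical sketch: turn flat { id, parent, author, content } records
// into a nested tree. Assumption: a top-level comment's `parent` is the
// post id, i.e. not the id of any comment in the list.
function buildCommentTree(comments) {
  const byId = new Map();
  for (const comment of comments) {
    // Copy each record and give it an empty list for its replies.
    byId.set(comment.id, { ...comment, replies: [] });
  }
  const roots = [];
  for (const node of byId.values()) {
    const parentNode = byId.get(node.parent);
    if (parentNode) {
      parentNode.replies.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}
```

Keeping the stored records flat and nesting them only on demand keeps the scraped output simple to serialize.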
README
For the README lines you're right. The written version is the version that I'm using on my computer, but if your node v14.17.2 works great, I'll update this. Do you mind sharing your npm version too?
@kaanyagci Sure, my npm version is 6.14.13, but this doesn't change the fact that it may work on a lower version. Technically you should target the minimum version supported by puppeteer (node v8.x should work fine, check this).
On the other hand, I couldn't install the dev dependencies with node v10 (which I had before upgrading to v14), so the minimum supported version for a development machine is higher than the minimum version for production.
Originally posted by @iMrDJAi in #55 (comment)
I want to download all the memes in this group
C:\Users\Abdullah>fgps --group-ids sadnibbahourshitposting --output "C:\Users\Abdullah\Desktop\fb test" --headful
Cookie banner did not appear
node:internal/process/promises:246
triggerUncaughtException(err, true /* fromPromise */);
^
Error: No node found for selector: input#email
at assert (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\helper.js:283:11)
at DOMWorld.focus (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\DOMWorld.js:376:5)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async facebookLogIn (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:349:3)
at async main (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:612:10)
-- ASYNC --
at Frame.<anonymous> (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\helper.js:111:15)
at Page.focus (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\node_modules\puppeteer\lib\Page.js:1071:29)
at facebookLogIn (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:349:14)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async main (C:\Users\Abdullah\AppData\Roaming\npm\node_modules\facebook-group-posts-scraper\src\index.js:612:10)
Node.js v17.2.0
I'm using headful mode because if I don't pass the --headful flag, I don't see anything in my output folder, so the program is essentially doing nothing.
We want to be able to post a comment on a given post. We can do that with a function void comment(String content) in the Post class.
Hi all,
Great lib, works very well. Can we add a way to scrape the post date too? I tried to look into the lib's inner workings, but cannot figure out the right selector to extract the date. Otherwise, I've written a function that properly parses the date from a string into an actual Date object.
There are 3 sorting settings: RECENT_ACTIVITY (default), CHRONOLOGICAL and TOP_POSTS.
You can apply them by adding an extra query to the group URL:
https://www.facebook.com/groups/YOUR_GROUP_ID/?sorting_setting=NAME
Check this out: einaregilsson/Redirector#137.
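A small sketch of building that URL in code, using the three setting names listed above. The helper name buildGroupUrl is hypothetical, and whether Facebook still honors the sorting_setting parameter is not verified here.

```javascript
// The three sorting settings named in the issue text.
const SORTING_SETTINGS = ['RECENT_ACTIVITY', 'CHRONOLOGICAL', 'TOP_POSTS'];

// Hypothetical helper: build a group URL with the sorting_setting query.
function buildGroupUrl(groupId, sorting = 'RECENT_ACTIVITY') {
  if (!SORTING_SETTINGS.includes(sorting)) {
    throw new RangeError(`Unknown sorting setting: ${sorting}`);
  }
  const url = new URL(`https://www.facebook.com/groups/${encodeURIComponent(groupId)}/`);
  url.searchParams.set('sorting_setting', sorting);
  return url.toString();
}
```

Rejecting unknown names early avoids silently falling back to the default ordering.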
Thank you for this very useful package! I would be happy to contribute.
Would be very helpful if you guys could put up your documentation again, https://fbjs.dev/
Should we log in to see posts from public groups? No.
I guess you should add an option to skip authentication; it is pointless in this case, unless there are limitations I'm not aware of.
Good job btw, I'm very excited for the NodeJS module support! <3
CONTRIBUTING.md file following the example
As #66 is now merged, the example app should be updated with the refactored version.