facebook-scraper's People

Contributors

asymness, barakplasma, belalhamdy, bipsen, dependabot[bot], ethan353, girvinjunod, ianneee, is3ka1, jacobmas, johnliu-tw, josx, jwesheath, kevinzg, kinshukdua, krzygorz, lazanet, lennoxho, lucasmrdt, masalha-alaa, moda20, neon-ninja, nielsoerbaek, nubpro, pierremesure, qdii, roma-glushko, senexus, tbuytaer, themulti0

facebook-scraper's Issues

Contributing to this repo

Hello @moda20, I tried to find your email to contact you but couldn't, so I'm opening this issue instead.

Thank you for rescuing this repo from being abandoned.

I have been using it for 4 days and already like it, but with a few catches. Some necessary features and enhancements need to be implemented. To mention a few:

  • Add ensure_ascii=False to the json.dump call to support other languages like Arabic.
  • Add a proper throttle mechanism (very important) to prevent our accounts from getting banned.
  • Add multiple ways to extract missing information (such as username and comments).
  • Deprecate the write_posts_to_csv function, because it gets your account banned in no time (it happened to me).
  • Add a proper resume mechanism (important) so that a scraping session can continue over multiple days.
  • Support multiple cookies to rotate accounts (to avoid bans).
  • Support multiple proxies out of the box.
  • Make mbasic.facebook.com the main way to scrape.
  • Implement a way to rotate between multiple accounts (cookies) and proxies, combined with proper throttling, to maximize scraping without getting banned.
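
To illustrate the first point, here is a minimal sketch using only the standard json module. The sample post dictionary is made up for the example; only the ensure_ascii parameter itself is part of the json API.

```python
import json

# Hypothetical post, used only to show the effect of ensure_ascii.
post = {"post_id": "1", "text": "مرحبا بالعالم"}

escaped = json.dumps(post)                       # default: ensure_ascii=True
readable = json.dumps(post, ensure_ascii=False)  # keeps Arabic characters as-is

print(escaped)   # non-ASCII characters appear as \uXXXX escapes
print(readable)  # non-ASCII characters appear unchanged
```

Both strings decode to the same data; the difference is only in how readable the file is. When writing to disk with ensure_ascii=False, it is safer to open the file with encoding="utf-8".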

I discovered these possible enhancements while trying to scrape a single group; I think there is a lot more to discover.
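
As a rough idea of what throttling and rotation could look like, here is a sketch in plain Python. It is not part of facebook-scraper: the names rotate and throttled, and the cookie file names, are hypothetical, and the delay values are placeholders rather than tested safe limits.

```python
import itertools
import random
import time


def rotate(items):
    """Cycle endlessly through cookie files or proxy URLs."""
    if not items:
        raise ValueError("need at least one item to rotate")
    return itertools.cycle(items)


def throttled(source, min_delay, max_delay, sleep=time.sleep):
    """Yield items from source, pausing before every fetch except the first."""
    iterator = iter(source)
    first = True
    while True:
        if not first:
            # Random pause so requests are not evenly spaced.
            sleep(random.uniform(min_delay, max_delay))
        first = False
        try:
            item = next(iterator)
        except StopIteration:
            return
        yield item


# Demo with a fake sleep function, so nothing actually waits.
recorded = []
for item in throttled(["post 1", "post 2"], 5.0, 15.0, sleep=recorded.append):
    print(item)
print(f"would have slept {len(recorded)} times")

cookie_files = rotate(["account_a.txt", "account_b.txt"])  # hypothetical names
print(next(cookie_files), next(cookie_files), next(cookie_files))
```

The pause only helps if the wrapped source is lazy, that is, if it performs a network request when the next item is pulled rather than up front.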

Now, a little introduction: I have been a PHP developer for more than 8 years. Python is not my main language, but I can find my way around it.

I'm willing to contribute and dedicate some time to improve and enhance this project and keep it alive.

Could you please provide some guidance on how to get started contributing, how to know which elements to scrape, and so on? Keep in mind that I'm fairly new to Python, but I have good programming knowledge.

Waiting to hear from you.

Still cannot get full text

I have updated my checkout to revision b0b242f and tried the example:

for post in get_posts('NintendoAmerica', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/NintendoAmerica?v=timeline", cookies=self.fb_cookie_path, pages=1):
    print(post)

The December 6th post is truncated on the web page, and the result I got also contains only the truncated 'text' instead of a 'full_text'.

The 'text' is shown below:
There's no need to say goodbye just yet! The Seattle Aquarium’s partnership with Nintendo of America is continuing through the end of March 2024. 💙

That means you'll have more time to see some of your favorite characters from the Animal Crossing: New Horizons game all while exploring the Aquarium...
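
Until the scraper returns the full text, truncated posts can at least be flagged on the caller's side. The sketch below is a heuristic, not library behavior: looks_truncated is a hypothetical helper, and the markers it checks (a trailing ellipsis, or an inline "… More" link text) are assumptions based on the outputs quoted on this page.

```python
def looks_truncated(text):
    """Guess whether a scraped post text was cut off by the page."""
    stripped = text.rstrip()
    # A trailing ellipsis or an inline "… More" link suggests a cut-off post.
    return stripped.endswith(("...", "…")) or "… More" in stripped


print(looks_truncated("all while exploring the Aquarium..."))
print(looks_truncated("I'm a sad cake"))
```

A post that genuinely ends with an ellipsis would be flagged too, so the result is only a hint about which posts to re-fetch.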

Failed to get comments and reaction counts when scraping page posts

I only receive the post text and image link in the output JSON file. I cannot fetch the comments, or the complete comments via the comments_full parameter. I already have the latest code from the master branch, which I pulled recently. Here is my output JSON file for a single page post.

[
    {
        "post_id": "912430496918994",
        "text": "Tonight's the night! The Geminids, the most prolific and reliable meteor shower of the year, peaks. All you need to view them are clear skies, dark surroundings, and proper clothing for the weather. Get more tips here: https://bit.ly/\n3uVFdWH\n\u0986\u099c\u0995\u09c7\u09b0 \u09b0\u09be\u09a4! \u099c\u09c7\u09ae\u09bf\u09a8\u09bf\u09a1\u09b8, \u09ac\u099b\u09b0\u09c7\u09b0 \u09b8\u09ac\u099a\u09c7\u09af\u09bc\u09c7 \u09b8\u09ae\u09cd\u09ad\u09be\u09ac\u09a8\u09be\u09ae\u09af\u09bc \u098f\u09ac\u0982 \u09a8\u09bf\u09b0\u09cd\u09ad\u09b0\u09af\u09cb\u0997\u09cd\u09af \u0989\u09b2\u09cd\u0995\u09be \u099d\u09b0\u09a8\u09be, \u09b6\u09bf\u0996\u09b0\u0964 \u0986\u09aa\u09a8\u09be\u09b0 \u09af\u09be \u09a6\u09c7\u0996\u09a4\u09c7 \u09b9\u09ac\u09c7 \u09a4\u09be \u09b9\u09b2 \u09aa\u09b0\u09bf\u09b7\u09cd\u0995\u09be\u09b0 \u0986\u0995\u09be\u09b6, \u0985\u09a8\u09cd\u09a7\u0995\u09be\u09b0 \u099a\u09be\u09b0\u09aa\u09be\u09b6, \u098f\u09ac\u0982 \u0986\u09ac\u09b9\u09be\u0993\u09af\u09bc\u09be\u09b0 \u099c\u09a8\u09cd\u09af \u0989\u09aa\u09af\u09c1\u0995\u09cd\u09a4 \u09aa\u09cb\u09b6\u09be\u0995\u0964 \u098f\u0996\u09be\u09a8\u09c7 \u0986\u09b0\u0993 \u099f\u09bf\u09aa\u09b8 \u09aa\u09be\u09a8: https://bit.ly/3uVFdWH\n\u0987\u0982\u09b0\u09c7\u099c\u09bf \u098f\u09b0 \u09a5\u09c7\u0995\u09c7 \u0985\u09a8\u09c1\u09ac\u09be\u09a6 \u0995\u09b0\u09be \u09b9\u09df\u09c7\u099b\u09c7\n\nTonight's the night! The Geminids, the most prolific and reliable meteor shower of the year, peaks. All you need to view them are clear skies, dark surroundings, and proper clothing for the weather. Get more tips here: https://bit.ly/\n3uVFdWH\n\n\u0986\u099c\u0995\u09c7\u09b0 \u09b0\u09be\u09a4! 
\u099c\u09c7\u09ae\u09bf\u09a8\u09bf\u09a1\u09b8, \u09ac\u099b\u09b0\u09c7\u09b0 \u09b8\u09ac\u099a\u09c7\u09af\u09bc\u09c7 \u09b8\u09ae\u09cd\u09ad\u09be\u09ac\u09a8\u09be\u09ae\u09af\u09bc \u098f\u09ac\u0982 \u09a8\u09bf\u09b0\u09cd\u09ad\u09b0\u09af\u09cb\u0997\u09cd\u09af \u0989\u09b2\u09cd\u0995\u09be \u099d\u09b0\u09a8\u09be, \u09b6\u09bf\u0996\u09b0\u0964 \u0986\u09aa\u09a8\u09be\u09b0 \u09af\u09be \u09a6\u09c7\u0996\u09a4\u09c7 \u09b9\u09ac\u09c7 \u09a4\u09be \u09b9\u09b2 \u09aa\u09b0\u09bf\u09b7\u09cd\u0995\u09be\u09b0 \u0986\u0995\u09be\u09b6, \u0985\u09a8\u09cd\u09a7\u0995\u09be\u09b0 \u099a\u09be\u09b0\u09aa\u09be\u09b6, \u098f\u09ac\u0982 \u0986\u09ac\u09b9\u09be\u0993\u09af\u09bc\u09be\u09b0 \u099c\u09a8\u09cd\u09af \u0989\u09aa\u09af\u09c1\u0995\u09cd\u09a4 \u09aa\u09cb\u09b6\u09be\u0995\u0964 \u098f\u0996\u09be\u09a8\u09c7 \u0986\u09b0\u0993 \u099f\u09bf\u09aa\u09b8 \u09aa\u09be\u09a8: https://bit.ly/3uVFdWH\n\n",
        "post_text": "Tonight's the night! The Geminids, the most prolific and reliable meteor shower of the year, peaks. All you need to view them are clear skies, dark surroundings, and proper clothing for the weather. Get more tips here: https://bit.ly/\n3uVFdWH\n\u0986\u099c\u0995\u09c7\u09b0 \u09b0\u09be\u09a4! \u099c\u09c7\u09ae\u09bf\u09a8\u09bf\u09a1\u09b8, \u09ac\u099b\u09b0\u09c7\u09b0 \u09b8\u09ac\u099a\u09c7\u09af\u09bc\u09c7 \u09b8\u09ae\u09cd\u09ad\u09be\u09ac\u09a8\u09be\u09ae\u09af\u09bc \u098f\u09ac\u0982 \u09a8\u09bf\u09b0\u09cd\u09ad\u09b0\u09af\u09cb\u0997\u09cd\u09af \u0989\u09b2\u09cd\u0995\u09be \u099d\u09b0\u09a8\u09be, \u09b6\u09bf\u0996\u09b0\u0964 \u0986\u09aa\u09a8\u09be\u09b0 \u09af\u09be \u09a6\u09c7\u0996\u09a4\u09c7 \u09b9\u09ac\u09c7 \u09a4\u09be \u09b9\u09b2 \u09aa\u09b0\u09bf\u09b7\u09cd\u0995\u09be\u09b0 \u0986\u0995\u09be\u09b6, \u0985\u09a8\u09cd\u09a7\u0995\u09be\u09b0 \u099a\u09be\u09b0\u09aa\u09be\u09b6, \u098f\u09ac\u0982 \u0986\u09ac\u09b9\u09be\u0993\u09af\u09bc\u09be\u09b0 \u099c\u09a8\u09cd\u09af \u0989\u09aa\u09af\u09c1\u0995\u09cd\u09a4 \u09aa\u09cb\u09b6\u09be\u0995\u0964 \u098f\u0996\u09be\u09a8\u09c7 \u0986\u09b0\u0993 \u099f\u09bf\u09aa\u09b8 \u09aa\u09be\u09a8: https://bit.ly/3uVFdWH\n\u0987\u0982\u09b0\u09c7\u099c\u09bf \u098f\u09b0 \u09a5\u09c7\u0995\u09c7 \u0985\u09a8\u09c1\u09ac\u09be\u09a6 \u0995\u09b0\u09be \u09b9\u09df\u09c7\u099b\u09c7\n\nTonight's the night! The Geminids, the most prolific and reliable meteor shower of the year, peaks. All you need to view them are clear skies, dark surroundings, and proper clothing for the weather. Get more tips here: https://bit.ly/\n3uVFdWH\n\n\u0986\u099c\u0995\u09c7\u09b0 \u09b0\u09be\u09a4! 
\u099c\u09c7\u09ae\u09bf\u09a8\u09bf\u09a1\u09b8, \u09ac\u099b\u09b0\u09c7\u09b0 \u09b8\u09ac\u099a\u09c7\u09af\u09bc\u09c7 \u09b8\u09ae\u09cd\u09ad\u09be\u09ac\u09a8\u09be\u09ae\u09af\u09bc \u098f\u09ac\u0982 \u09a8\u09bf\u09b0\u09cd\u09ad\u09b0\u09af\u09cb\u0997\u09cd\u09af \u0989\u09b2\u09cd\u0995\u09be \u099d\u09b0\u09a8\u09be, \u09b6\u09bf\u0996\u09b0\u0964 \u0986\u09aa\u09a8\u09be\u09b0 \u09af\u09be \u09a6\u09c7\u0996\u09a4\u09c7 \u09b9\u09ac\u09c7 \u09a4\u09be \u09b9\u09b2 \u09aa\u09b0\u09bf\u09b7\u09cd\u0995\u09be\u09b0 \u0986\u0995\u09be\u09b6, \u0985\u09a8\u09cd\u09a7\u0995\u09be\u09b0 \u099a\u09be\u09b0\u09aa\u09be\u09b6, \u098f\u09ac\u0982 \u0986\u09ac\u09b9\u09be\u0993\u09af\u09bc\u09be\u09b0 \u099c\u09a8\u09cd\u09af \u0989\u09aa\u09af\u09c1\u0995\u09cd\u09a4 \u09aa\u09cb\u09b6\u09be\u0995\u0964 \u098f\u0996\u09be\u09a8\u09c7 \u0986\u09b0\u0993 \u099f\u09bf\u09aa\u09b8 \u09aa\u09be\u09a8: https://bit.ly/3uVFdWH\n\n",
        "shared_text": "",
        "original_text": null,
        "time": "2023-12-13T12:00:00",
        "timestamp": null,
        "image": null,
        "image_lowquality": "https://scontent.fdac135-1.fna.fbcdn.net/v/t39.30808-6/411139464_912430480252329_4124049930765544502_n.jpg?stp=cp0_dst-jpg_e15_q65_s240x240&_nc_cat=1&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoiYiJ9&_nc_ohc=PZe5VqLD430AX-XY6bd&_nc_ht=scontent.fdac135-1.fna&oh=00_AfAy-0N5Sx0bFz8GKdn_tw5XWweTnGU8xtzVWe8ntUg6Sw&oe=658039A9",
        "images": [
            null
        ],
        "images_description": [],
        "images_lowquality": [
            "https://scontent.fdac135-1.fna.fbcdn.net/v/t39.30808-6/411139464_912430480252329_4124049930765544502_n.jpg?stp=cp0_dst-jpg_e15_q65_s240x240&_nc_cat=1&ccb=1-7&_nc_sid=ab7367&efg=eyJpIjoiYiJ9&_nc_ohc=PZe5VqLD430AX-XY6bd&_nc_ht=scontent.fdac135-1.fna&oh=00_AfAy-0N5Sx0bFz8GKdn_tw5XWweTnGU8xtzVWe8ntUg6Sw&oe=658039A9"
        ],
        "images_lowquality_description": [
            "In this long exposure, a meteor streaks across a dusty blue star-spangled sky. Along the horizon, the bright lights of the Baikonur Cosmodrome glow yellow, illuminating buildings and a launch pad. Credit: NASA/Joel Kowsky"
        ],
        "video": null,
        "video_duration_seconds": null,
        "video_height": null,
        "video_id": null,
        "video_quality": null,
        "video_size_MB": null,
        "video_thumbnail": null,
        "video_watches": null,
        "video_width": null,
        "likes": 0,
        "comments": 0,
        "shares": 0,
        "post_url": "https://facebook.com/story.php?story_fbid=pfbid0hhxU7odSLSBK9ip1NK8dNEcYLRKpN76PP2JLBmbYqBwBGXotVKvp2rq8W4M98ep1l&id=100044561550831",
        "link": "https://bit.ly/3uVFdWH?fbclid=IwAR1FomDql03dWi8yTkTaLSfzKVLxbngiljCgArKNfKUJXwtdgUQ4r0dPl6E",
        "links": [
            {
                "link": "https://lm.facebook.com/l.php?u=https%3A%2F%2Fbit.ly%2F3uVFdWH%3Ffbclid%3DIwAR1FomDql03dWi8yTkTaLSfzKVLxbngiljCgArKNfKUJXwtdgUQ4r0dPl6E&h=AT2BWFUArXXPl1cfSPLDDc6pEmnA-Rns50dcane48z5Qm-F8f6Inj4auqc82PvJLzCtd-gVHjdUs0kTFHxDPviDap3v3T1FNyrZK60yd6kd_zREc5eK8Z0ITY832Uwqr-zMFKLmyRxK-Qidjg7iPeGguc6II_a4lY2y0yU2BS1UypkoqMPygt7e792W22b_loOPzv_4VtD3-N7wbuJaepmEqMMJ_KNme7H5RNgEQYBppAyklSemq1shh2Xn5HAbXYlHqUQG0Kc656srGz4MHn_7_VvmMFfG9LH_MAl8ehZirzQYUmbE4BgNLnyfC-B8G2DBrwp-1Qx_sTr_v_xBXvS1aIpxpl6WE9DOo3YR-RFSGb7tXM1w3AsPYP4huqu8UKFI3szfQ_USKpspFjcYj4pTqtqdmm3eaRj89L6w_W-HzpHGIWakRa4rd1hXp2tos1Fvj2bZjpHOdZvZHXP65-aun5lgxCoOE_h4IIYEb4j6B_E3c5jNl_kgVS4ZkrOaz1q6hJ1HCZifSOjzjYqQI6BUi_89jDjrwGzhE6Ruhfk-Zs5J5Cf53JDpB6HWiKHJdsndjL743gfk8wLOMscp9F38tjOSuCRJnLC1dEOXyINh27Ebtb6xmMTKBdUpFsjYONsJWZ0iL33WNvdgf5yx982koR2SxqxcdwhKKFez4kqjO0cj69VONR-H5XGD2oovJiqKRZZQnjbZ_hrVkiw2XKT_BBSvbD_laxW4M1LJQiVTzBN0K5rIPijfDp08xaYrR0fqpXCb810fDJXHVUgJc-XB_crQm7SFD-y1Tl03Ry6DzT-csjUSzZngwUWMV8rha4vnrz6mwDClWpEByaWDccW_XANvVAAyBMr0RW4TtUF2WzSBxiq2GXA43AUgRpE_rAwTKTNceg4C49yII9zzS-jqKFE0LAeLnX4k66947boiK9KXA7BKD8vsSAoGxqQ29LYXcfFuGfLbdQa7AmMLOY4C4WccBUo2xuhrVfuCUPoEEgOJ9DzQKbU0lY4WSi64Y3q_ifVbbb3wrS9048YxYewQbp79Ix2TCAa80WIO1rMGycCiZln_WiuDG2qRnVLwjHAHVN1FrrO12wVwYuq4evEDg24x_gKcIX6y4glM-KRcy_Oimr7McoLUTUoqT235NhOq6RhHt1U2Pt0uiTGeE988c8unZin_X1W2pay_RlPz7cppbaAF4TOQBJ2V9hswArE3qm5652-Ttk7XaaYZqLzKbzlutruRnbkYr8FVPKemFq43_M1KWCnLsFob4RDsD6YXwziWqwuAkYXbAaGbhJcD53CSGTkj489j1jQr8UoYhTEbbDgcCQlB99DKsu65KljxVK428g_WiMQmFjvrnPOEyElBa8Om_VLMPczJ_vtpsw1xZeAnXw5a8aDfR9WkO9no5hKRohvpJAenXwvraDIYoCKBc3L0ncPDl2woF_ZFyxOCrtPsa4xBEOXCL_UdDS7f4nzpWDEiBZuR_pLb5Km7itnuMDkWoH4wJMiTJc4WnNRZkzwG2aQhdRET_b2lZzWydXFxj5MRQEpArjbRGeVIz_jWUKY_mp0YUpK1NZlvz50T8R5qAmzmmdPNhQ0gUiRostxyOf9yAuQGnMN0FiMxVWdO3nrFeEQ4k6j-aAq49Mg0ILnXGiNvXS95CGKOsCq0pL7ZofnnzK_X_7PIP4yieibOqaCkRM7UtmRxWDSpBNp8mO-a4LzSjbQhgBrxPitNiTO644Ny2E9i8g5ANnXzA9BNRPi1vvRt8GVH-pbBO1XPpvwt2artQtQcoCHWiquNz3UYFBrVEfBesPGD9XqP3isyKEUnXawglY98jOkzHo
niVgN68esFqqI4hS73Kwsd9N0Iw8cQDgbWCtMTb44MzyThkMzMY9bv2ADB0mTd5KdmXMjhrz3Oeu8Jd1NpLJtscyTxMRphl2Hwyz6w9nowupQ9or2LGrfdTTnwstqfKyZA1jzarQ-A6lyp_E2dExDdkrmOrnVf3A06MEspBjJ1U0fqPJbEQjM8oETB3NDoWFwH-OVkEnLYRdQSx2ZzrDUOJgp0IiLkYl_2qmMh5u-GeCnRrt_NXev9SbQ9D7KSiYJ8Ls9XYPPRq6pOCS7DyPRIig3tVF0jeAU3arqSUhh3UFPSGZr1uDODlvxyKkM8qSpSjP3JVJOpi",
                "text": "https://bit.ly/\n3uVFdWH"
            },
            {
                "link": "https://lm.facebook.com/l.php?u=https%3A%2F%2Fbit.ly%2F3uVFdWH%3Ffbclid%3DIwAR1KhNbDFDTN7j_DiVnJD7bRfrLDq54u469SWmmzD8u_p1f-mTAjUHL2iWY&h=AT3d5wlE3sStPKwYi1d9UcHetNjH-AJG52cAU9LLh3KO1m1s-DYPKPOYzbN4-1oi-8_CPVUMAsGf-hkEmKNsskF6LPh3Gdwihp_KZjpafyVLPNGD3HWLK4U31YMj_fGughrOk8qgp9EVaHcE0k0W-W8r6sG1A0a6DTJ5A9ELk0cqHxia-C9HslZLodaeDDcbVZvm9wEjR7GPI9ylb2BkJeBaQSONab5NJ6DPkYvQPPWtxby_tTUHOCTBtv351RrZcdaq6RDb5frwJNYEHsaJa7f_G-BK6PBdAl8ynXvq5km18qbHU8KrXG61DwftavI2wev-Eoh4C_ww7bkvy1BKDFi3Z8hXFdrIr4yy1aRZlLhy6k0K-DnYnN9riWGStKLB5K1Lx1GkSRC-g8xrB8HaYczWTfM9_WQnwqupP8MFJLDJiEl-z3zaqXFDN7qwJW_6yuKIHWAhsqyVB_Mgelx11PbUGPPyOzCZJvAO3q0Tag6wgRtkbU7dPqghpzxwqoVqWg7_-jyG4Mx3Z2fofm5xN3BSn4sDGS2O9eW9bkkyYAaZebi-Z81dp9oeYWPZFV--vX7votEzPVtogq2bgl2lkUdKPRmapLkfvKczDIcSjqbDuwuU9xNQI7e_RmpaDkWh5wX_easI-D4kAfUGbib02mAcj9mTHkxntp9-1VeA1T6Lytqi5jIlTqUqf-dTu05xKtXD9uqp1Z5Sz2CY_dNbUheR7y3lak7eld4i0sE9IhgzecrqzSXRPFIzTThY_d8mP6mz2KpE8s7EkXLAUUiFCWefb_CNryKwlEkzct17i89OA8V5ElH-j-Mk8isW6Q4uSvMp6kIXs04mjrN8dBLyY5GJIc84gqUsZcIA6H1VJp4B2t29caPE8dfi3paka864XdBp5HSbLqzCuqYaQCY7dQxRxuuISYXuUklERn3kP_TpR0IfNXHbgxPeBA96Ek7fMrYIfwT1XkCAbSvn--XfXNNZ4LUc0XdqR71jST6gtAa8eF0KnVdAuhYCc23mwoYiu2JYK5-ICZd60A6nWtAk7JE122zeSRyDxN1GjFakeezJch6C0swObqRcSYB6xLY4I-Xi4_d174CHblrEZysX7QVNK-pLb6HDxou_WJfw9uGu9RviVSz9ab0YqRaNh1VLqNCwEM3-NXqN28g-3OlOjRij_3t2OXHEsmMWcAQ85fRvkcpAFTRd4bWBnENDMXjWcC0rZoPwenp8M9OChWPT9mgxYYmKtzPx3TtZdTLHBPCKb-ZD1WfWdtzDgXN7cWSteGEWeAGobRfiSA3e20HmKr7Xsd3wGW9DMhARF4ugom4lalLy7peLruTJKxvSxB8qrlPqLrQqntsQrM5EvVqpNT1ViTZOGfPp4dgIRB1S3CjMmfQ8wUPZNByBhyOLIGSi5Lc-Vu5lMOHmXsphJYYAISHQdsg_tR71Q-_2_D5o9-zSs-eHV50uRD8QAJm72W8pd4Eond14xl4XCXLUtEc86wn8E9QEP6Sq9FaI--mPHaWL97_SpbS5tmvkQmeNb3p3siROUIARRK0y3CX4PoayEXKlr0EKednwszy3xgcS6DzTA_TG468LUffBM3NHCERxBItEw1mhy4J0zpLT_D45_cZHZ5lSxPZlhl8ghibg_ssLgJox6HgtnvxSLW9c4KWJ8BC2e-TRAkG9IS9d_GZTcxEMUKYIOB9NZRhYMTH2mvntRYN2ojwxbtaldDHsScKsrfqWwXfuz1SgG014e3hEPAsnU6efQGPX-Nz07EZx06AQ5FUJYtfsph_Is7oEcigevtXSKTy2v9oporMuZJIoy0OKSEuwt3bcJ_UqJgw-SBg3C
qOw__GbhTs99SmNXZ_RADRNqjOwnvw5OVQ9lE32BepYheXK5hwz1bRLCETSuOwzVGUgFyv9lThQ7EccYg0HZaZfWfmHeWXWp6GlFvpDbr70wivvSKCA7vMOlyJ29t7yy-iFqoWX1GtEmVTEp63jl3k3LM2vXN-t2_8cjqsXyyPP-Lz0rcrjjeP7KGSHz7w6wLiEdkTrTYBj2gDdGX-q2wPnu1nPcO490ej-yldTwMEyp-YyiCUiWoV7Jm3Ef5aZCtpd6H-joICKy7uTYHcRuCc7u5fNV53pM63lMtohwQzHQFdzfUjTZz1Py9uO_1nLq2JpwLPiKrI5",
                "text": "https://bit.ly/3uVFdWH"
            },
            {
                "link": "/basic/translation_preferences/?target_id=912430496918994&auto_translation_mode=1&refid=17&_ft_=encrypted_tracking_data.0AY9J9kYTH4cJDALE0KXf1vCFRGEeQeew72WLGahvMRMNQTgbLiddqx0NPuxkOoR0EhcXjIk4j7jJ3yYAKEH2yQQ9oChjNYBh6k3Tsz0RHCZzgLNWh-Zyy2WnuJ4wVaXlQ34IFsUNNc48eajwPC2pGLbilRAGaFZKANkSA02WVyDkSrSOrJTLGCoCRzVxmwZqIYZ5lK5gDH-llzqAWSFBPTuzW4ygQ6s8PfDjqExc5Dd49Vc44beGQTRsw3lKBhJWScXmimA2_gsOPS8nh2QF06datMVXaghGfyPXeTz9FmtPt-HX905O8zMAhNClqPDIvm76ARC4Bl4AKiG3e5ajTVoev-yO4GGUYPJXOos01Z65kQUmaSlE17gp17WZJWksV1rU9-EC7ahcQ0jdVxTNmUthFFrHhphF5dDaAeDb6k9Vh2YsxAVOyH6hwOYgya5nRGLHbMroe9ITMDy-J7GBitDcrAbjfkljPOVA-KzFxzuaLGq-csLeVPF2LAJy8yLA0PhpQSyfAc_-A6rSQ9hDGNeP2Kco8O_RtND2pBz5g-EOqN9Gvt4J7snUc7g-rP62rbbNXjUJIrR2ZbZSBts8PinFOU_jE8mQBp-4dfOmMX5HSV_zQj3Rh7qHYa4WMjLS7vWtykrTNJLq-NJbtVAWZVn3NXlV0DbnQiSg97RKqqeK8Lfm6HYLU5VYV16z-uI0rgZGzA_1BzHHW0GF_i3WYiVtFz1I8T2UX2-z642_KhaZ2IUmTMQeoeGNSBOXmCrhmms8jGQuKPZOGpLyagBFdDE6owHSlt7cL_NeHUuw9RmsEkQk8XRhk5gx2BTExNSNyxQGt3b4PCNyWrtFlrur8ds7ojvJrSIeHQKVm4xmaWdoilRC2eSel8w8VrTuVDP87JmDPjo_FK2J4f-BJQ5AcR6qdE1j_e9HG6CRCLsmMbLTzwLk3wsaWUOoE9C1yRP_aP6u8eY-SEkpZ4eVs7dz9jujKMUjwazjDlroUXZtxkqW_ClgJ4FaEIe8BsK8Sdc81p0qGR4lehI17zaRpwvz95Wg12KElbNeUdWz71fDYCrUkHqozFVxj9BHWefhRpwLMWyYUIsrc74xM92v1gS08ljJWE1_mva0HpFhb8ZkSkBgK5JdjcI_ZkGubCu2wAMXpd4gkpPtRaCMInv9TgLaXJhM_Kyv7YpU4uTvVp8q-QPbYGuAl1C386jdjHcqdxTW1kIs3HIvvUU5IOKkcbUQXu_43qssVTpwLo8CJzYGz0r9-hwFxoJO_adFYN2swnth_X1N1E0MxQTfB3ppkLh-uMijk8YXoXAifthZAKxvzGbqv6asMcJnYwP4_NqfL84szseqzjlWk6lLjdm1Q80_oi49nxFktybv9Ysj0cXxZsBdfEzfaPbeXyUYDtrBH37JaUCiZ7UjSjFFebCbyt2BFwt3FkcRmETs-uT3W7rHYuA4e2vPtzNKtOhXC0M5J3k1&__tn__=%2As%2As-R&paipv=0&eav=AfYA8_tqAykPoYtmSq6oGPxujdxIKRPr20chEC8yWEMhLuDMKgSCKNSRcewCLVFkQiU",
                "text": "\u0987\u0982\u09b0\u09c7\u099c\u09bf \u098f\u09b0 \u09a5\u09c7\u0995\u09c7 \u0985\u09a8\u09c1\u09ac\u09be\u09a6 \u0995\u09b0\u09be \u09b9\u09df\u09c7\u099b\u09c7"
            },
            {
                "link": "/photo.php?fbid=912430483585662&id=100044561550831&set=a.416661013162614&eav=AfaEih11Dq7Px1PZczzUm-At1z6dTk3gJXyhAtdZmmxBNzvgXGzXtzglDI9HtpwMJFA&paipv=0&refid=17&_ft_=encrypted_tracking_data.0AY9J9kYTH4cJDALE0KXf1vCFRGEeQeew72WLGahvMRMNQTgbLiddqx0NPuxkOoR0EhcXjIk4j7jJ3yYAKEH2yQQ9oChjNYBh6k3Tsz0RHCZzgLNWh-Zyy2WnuJ4wVaXlQ34IFsUNNc48eajwPC2pGLbilRAGaFZKANkSA02WVyDkSrSOrJTLGCoCRzVxmwZqIYZ5lK5gDH-llzqAWSFBPTuzW4ygQ6s8PfDjqExc5Dd49Vc44beGQTRsw3lKBhJWScXmimA2_gsOPS8nh2QF06datMVXaghGfyPXeTz9FmtPt-HX905O8zMAhNClqPDIvm76ARC4Bl4AKiG3e5ajTVoev-yO4GGUYPJXOos01Z65kQUmaSlE17gp17WZJWksV1rU9-EC7ahcQ0jdVxTNmUthFFrHhphF5dDaAeDb6k9Vh2YsxAVOyH6hwOYgya5nRGLHbMroe9ITMDy-J7GBitDcrAbjfkljPOVA-KzFxzuaLGq-csLeVPF2LAJy8yLA0PhpQSyfAc_-A6rSQ9hDGNeP2Kco8O_RtND2pBz5g-EOqN9Gvt4J7snUc7g-rP62rbbNXjUJIrR2ZbZSBts8PinFOU_jE8mQBp-4dfOmMX5HSV_zQj3Rh7qHYa4WMjLS7vWtykrTNJLq-NJbtVAWZVn3NXlV0DbnQiSg97RKqqeK8Lfm6HYLU5VYV16z-uI0rgZGzA_1BzHHW0GF_i3WYiVtFz1I8T2UX2-z642_KhaZ2IUmTMQeoeGNSBOXmCrhmms8jGQuKPZOGpLyagBFdDE6owHSlt7cL_NeHUuw9RmsEkQk8XRhk5gx2BTExNSNyxQGt3b4PCNyWrtFlrur8ds7ojvJrSIeHQKVm4xmaWdoilRC2eSel8w8VrTuVDP87JmDPjo_FK2J4f-BJQ5AcR6qdE1j_e9HG6CRCLsmMbLTzwLk3wsaWUOoE9C1yRP_aP6u8eY-SEkpZ4eVs7dz9jujKMUjwazjDlroUXZtxkqW_ClgJ4FaEIe8BsK8Sdc81p0qGR4lehI17zaRpwvz95Wg12KElbNeUdWz71fDYCrUkHqozFVxj9BHWefhRpwLMWyYUIsrc74xM92v1gS08ljJWE1_mva0HpFhb8ZkSkBgK5JdjcI_ZkGubCu2wAMXpd4gkpPtRaCMInv9TgLaXJhM_Kyv7YpU4uTvVp8q-QPbYGuAl1C386jdjHcqdxTW1kIs3HIvvUU5IOKkcbUQXu_43qssVTpwLo8CJzYGz0r9-hwFxoJO_adFYN2swnth_X1N1E0MxQTfB3ppkLh-uMijk8YXoXAifthZAKxvzGbqv6asMcJnYwP4_NqfL84szseqzjlWk6lLjdm1Q80_oi49nxFktybv9Ysj0cXxZsBdfEzfaPbeXyUYDtrBH37JaUCiZ7UjSjFFebCbyt2BFwt3FkcRmETs-uT3W7rHYuA4e2vPtzNKtOhXC0M5J3k1&__tn__=EH-R",
                "text": ""
            }
        ],
        "user_id": "100044561550831",
        "username": "NASA - National Aeronautics and Space Administration",
        "user_url": "https://facebook.com/NASA?lst=100006358183755%3A100044561550831%3A1702538223&eav=AfYrRWiG5i2lkKgIgMAZM0BwPSur-VJCbrJMHk_4lzI0m-KRU32-zDfs8WH8RsYHbsw&refid=17&_ft_=encrypted_tracking_data.0AY9J9kYTH4cJDALE0KXf1vCFRGEeQeew72WLGahvMRMNQTgbLiddqx0NPuxkOoR0EhcXjIk4j7jJ3yYAKEH2yQQ9oChjNYBh6k3Tsz0RHCZzgLNWh-Zyy2WnuJ4wVaXlQ34IFsUNNc48eajwPC2pGLbilRAGaFZKANkSA02WVyDkSrSOrJTLGCoCRzVxmwZqIYZ5lK5gDH-llzqAWSFBPTuzW4ygQ6s8PfDjqExc5Dd49Vc44beGQTRsw3lKBhJWScXmimA2_gsOPS8nh2QF06datMVXaghGfyPXeTz9FmtPt-HX905O8zMAhNClqPDIvm76ARC4Bl4AKiG3e5ajTVoev-yO4GGUYPJXOos01Z65kQUmaSlE17gp17WZJWksV1rU9-EC7ahcQ0jdVxTNmUthFFrHhphF5dDaAeDb6k9Vh2YsxAVOyH6hwOYgya5nRGLHbMroe9ITMDy-J7GBitDcrAbjfkljPOVA-KzFxzuaLGq-csLeVPF2LAJy8yLA0PhpQSyfAc_-A6rSQ9hDGNeP2Kco8O_RtND2pBz5g-EOqN9Gvt4J7snUc7g-rP62rbbNXjUJIrR2ZbZSBts8PinFOU_jE8mQBp-4dfOmMX5HSV_zQj3Rh7qHYa4WMjLS7vWtykrTNJLq-NJbtVAWZVn3NXlV0DbnQiSg97RKqqeK8Lfm6HYLU5VYV16z-uI0rgZGzA_1BzHHW0GF_i3WYiVtFz1I8T2UX2-z642_KhaZ2IUmTMQeoeGNSBOXmCrhmms8jGQuKPZOGpLyagBFdDE6owHSlt7cL_NeHUuw9RmsEkQk8XRhk5gx2BTExNSNyxQGt3b4PCNyWrtFlrur8ds7ojvJrSIeHQKVm4xmaWdoilRC2eSel8w8VrTuVDP87JmDPjo_FK2J4f-BJQ5AcR6qdE1j_e9HG6CRCLsmMbLTzwLk3wsaWUOoE9C1yRP_aP6u8eY-SEkpZ4eVs7dz9jujKMUjwazjDlroUXZtxkqW_ClgJ4FaEIe8BsK8Sdc81p0qGR4lehI17zaRpwvz95Wg12KElbNeUdWz71fDYCrUkHqozFVxj9BHWefhRpwLMWyYUIsrc74xM92v1gS08ljJWE1_mva0HpFhb8ZkSkBgK5JdjcI_ZkGubCu2wAMXpd4gkpPtRaCMInv9TgLaXJhM_Kyv7YpU4uTvVp8q-QPbYGuAl1C386jdjHcqdxTW1kIs3HIvvUU5IOKkcbUQXu_43qssVTpwLo8CJzYGz0r9-hwFxoJO_adFYN2swnth_X1N1E0MxQTfB3ppkLh-uMijk8YXoXAifthZAKxvzGbqv6asMcJnYwP4_NqfL84szseqzjlWk6lLjdm1Q80_oi49nxFktybv9Ysj0cXxZsBdfEzfaPbeXyUYDtrBH37JaUCiZ7UjSjFFebCbyt2BFwt3FkcRmETs-uT3W7rHYuA4e2vPtzNKtOhXC0M5J3k1&__tn__=C-R&paipv=0",
        "is_live": false,
        "factcheck": null,
        "shared_post_id": null,
        "shared_time": null,
        "shared_user_id": null,
        "shared_username": null,
        "shared_user_url": null,
        "shared_post_url": null,
        "available": true,
        "comments_full": null,
        "reactors": null,
        "w3_fb_url": null,
        "reactions": null,
        "reaction_count": 0,
        "with": null,
        "page_id": null,
        "sharers": null,
        "translated_text": "",
        "image_id": null,
        "image_ids": [],
        "was_live": false
    }
]

@moda20 can you please provide any solution?
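
When reporting this kind of problem, it can help to list exactly which fields came back empty. The following sketch is a hypothetical helper, not part of facebook-scraper; the field names are taken from the JSON above.

```python
def empty_fields(post, fields):
    """Return the names of fields that are missing, None, or empty."""
    return [name for name in fields if post.get(name) in (None, "", [], {})]


# Subset of the post shown above.
post = {
    "post_id": "912430496918994",
    "comments": 0,
    "comments_full": None,
    "reactions": None,
    "images_description": [],
}

print(empty_fields(post, ["post_id", "comments_full", "reactions", "images_description"]))
```

Zero counts such as "comments": 0 are deliberately not treated as empty here, because a post can legitimately have no comments.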

Get post problem

I'm trying to crawl data from a Facebook page, but unfortunately it doesn't return anything. I'm really desperate, please help. I'm using the standard get_posts syntax:

from facebook_scraper import get_posts

post_list = []
for post in get_posts(FANPAGE_LINK,
                      options={"comments": True, "reactions": True, "allow_extra_requests": True},
                      extra_info=True, pages=PAGES_NUMBER, cookies=COOKIE_PATH):
    print(post)
    post_list.append(post)
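
When a loop like this prints nothing, debug logging is usually the first thing to check. The sketch below uses only the standard logging module and assumes the library logs under a logger named "facebook_scraper"; that name is an assumption based on the package name.

```python
import logging

# Send all log records, including DEBUG, to the console.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# Assumed logger name; child loggers such as "facebook_scraper.extractors"
# inherit this level unless they set their own.
logging.getLogger("facebook_scraper").setLevel(logging.DEBUG)
```

Run this before calling get_posts, then include the resulting log lines in the issue.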

Only getting a few posts

Hi, thanks for providing the new way to get the posts.

Some problems showed up:

  1. If I set the group parameter in the get_posts function, it raises an error.

"""
File ~\anaconda3\Lib\site-packages\facebook_scraper\extractors.py:108 in __init__
self.scraper = kwargs['scraper']

KeyError: 'scraper'

"""

  2. The problem above was solved after I removed the group parameter and just passed the group ID. But even when I set the pages parameter of get_posts to 20, I still only get 8 posts.
    It seems that mbasic.facebook.com only displays 8 posts per page. To see more, I have to click the "see more posts" button.

Is there any way to fix this problem? Thanks a lot.

Here is the code with which I can only get 8 posts from a group:

from facebook_scraper import get_posts

for post in get_posts('817620721658179',
                      base_url="https://mbasic.facebook.com/groups",
                      start_url="https://mbasic.facebook.com/groups/817620721658179?v=timeline",
                      pages=50,
                      cookies="www.facebook.com_cookies.txt"):
    print(post['text'][:50])
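
To check whether pagination really stops after the first page or merely returns repeats, the results can be counted after removing duplicates. This is a sketch with a hypothetical helper, collect_unique, that works on any iterable of post dictionaries; it does not change how the scraper paginates.

```python
def collect_unique(posts, limit=None):
    """Collect posts, skipping ones whose post_id (or post_url) was already seen."""
    seen = set()
    unique = []
    for post in posts:
        key = post.get("post_id") or post.get("post_url")
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        # Posts without any identifier are kept, since they cannot be compared.
        unique.append(post)
        if limit is not None and len(unique) >= limit:
            break
    return unique


sample = [{"post_id": "1"}, {"post_id": "2"}, {"post_id": "1"}, {"post_id": None}]
print(len(collect_unique(sample)))
```

If the count stays at 8 no matter how large pages is, the next-page link is probably not being followed.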

post_id = None

I can't get post_id when parsing

My code:

from facebook_scraper import get_posts, set_user_agent

user_agent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"
set_user_agent(user_agent)
for post in get_posts('100038659142270', cookies='cookie2.json'):
    print(post)
    print(post['text'][:50])

Output:

{'post_id': None, 'text': 'У меня информация для родителей прекрасного города Мюнхена и всех, кому до него рукой подать\nУже в это воскресенье у вас там будет семинар с Еленой Журавлевой "Что нам делать с этой домашкой?", конечно, не только про домашку, а вообще про роль родителей в отношениях ребенка со школой, мотивацию и все то, про что вы всегда спрашиваете:)\nЛена про это… More рассказывает офигенно и очень умеет сделать так, чтобы родителей поотпустило, а договариваться с детьми стало легче.\nВ общем, кому актуально - сделайте подарок своей нервной системе и своим отношениям с детьми\nСсылка в первом комментарии', 'post_text': 'У меня информация для родителей прекрасного города Мюнхена и всех, кому до него рукой подать\nУже в это воскресенье у вас там будет семинар с Еленой Журавлевой "Что нам делать с этой домашкой?", конечно, не только про домашку, а вообще про роль родителей в отношениях ребенка со школой, мотивацию и все то, про что вы всегда спрашиваете:)\nЛена про это… More рассказывает офигенно и очень умеет сделать так, чтобы родителей поотпустило, а договариваться с детьми стало легче.\nВ общем, кому актуально - сделайте подарок своей нервной системе и своим отношениям с детьми\nСсылка в первом комментарии', 'shared_text': '', 'original_text': 'У меня информация для родителей прекрасного города Мюнхена и всех, кому до него рукой подать\nУже в это воскресенье у вас там будет семинар с Еленой Журавлевой "Что нам делать с этой домашкой?", конечно, не только про домашку, а вообще про роль родителей в отношениях ребенка со школой, мотивацию и все то, про что вы всегда спрашиваете:)\nЛена про это… More рассказывает офигенно и очень умеет сделать так, чтобы родителей поотпустило, а договариваться с детьми стало легче.\nВ общем, кому актуально - сделайте подарок своей нервной системе и своим отношениям с детьми\nСсылка в первом комментарии', 'time': datetime.datetime(2023, 11, 22, 16, 13), 'timestamp': None, 'image': None, 'image_lowquality': None, 'images': 
[], 'images_description': [], 'images_lowquality': [], 'images_lowquality_description': [], 'video': None, 'video_duration_seconds': None, 'video_height': None, 'video_id': None, 'video_quality': None, 'video_size_MB': None, 'video_thumbnail': None, 'video_watches': None, 'video_width': None, 'likes': 517, 'comments': 6, 'shares': 19, 'post_url': 'https://facebook.com/story.php?story_fbid=pfbid037k8TVCCTbJM32kjHh2K6iWvdprSgmFjvxik8KogUx6UXfPVMYPiGQqfoS6Y5L1Ufl&id=100038659142270', 'link': None, 'links': [{'link': '/story.php?story_fbid=pfbid037k8TVCCTbJM32kjHh2K6iWvdprSgmFjvxik8KogUx6UXfPVMYPiGQqfoS6Y5L1Ufl&id=100038659142270&eav=AfYbBWQwOZPvO9oMASU4RqiPm00e-Zc64d2iVDj6qkX_wuXAa9EjL8Hbe1ZtWqRad2k&refid=17&paipv=0', 'text': 'More'}, {'link': '/story.php?story_fbid=pfbid037k8TVCCTbJM32kjHh2K6iWvdprSgmFjvxik8KogUx6UXfPVMYPiGQqfoS6Y5L1Ufl&id=100038659142270&eav=AfYbBWQwOZPvO9oMASU4RqiPm00e-Zc64d2iVDj6qkX_wuXAa9EjL8Hbe1ZtWqRad2k&m_entstream_source=timeline&refid=17&_ft_=encrypted_tracking_data.0AY-HW9xzbWptir5dWvRKhRNOjmGw9R42jBChOUVE-YjmHP3CmJLVD79ZyvGaoEEtQ_g9jwZv2CpcBMywwTcZJ6M96wxRm9B88q_moTHR36f_gsE-ns82yFvalMbnDDv1HfAervDsIjTetQwu9ywBAoTyPAbI65HoQORFOsYBCaXfyDS-CK7Uej7dKCqbz7Qfu8xdYDRDAiCJMrEFzR_fiAqZVxMcg3iKKFQPxw66c16qDXOVwPzvhL62ZEOHDRooV7HcN9v8ZvzdCTVNOTC-Rz0fowsshMrNhVbN67QYRV6rLgAML18WEWWtpMAq55uJWszh0zaL_vE6-fTK6EonSeqKox8glqlL1FCWNto0i3bscSN9mtvMwf_g8oSNm6UhoocUFMd7QuhUhJpRk1WM1YO8Bt76E9QGzl7cLo5ge8mZfaKDiNf5oKVAffk8ABIK7kMBaQVj9OWTmYCChek7e60-ICO12TGe-7qYuwrXGQXVmR55FZ1GiFt0TnoKorGvf3bVkCjIQWX75Lj1hdZXZIKfkayV33S0Rz-h5rFgnEZkSs1ozWLqmzZjtx8DIhzWbwNSBhbG_Rbqg7XnNCThDtZdf9oF5r7yKrUDhAFS74-3HTLPuN7GvV2ivyU3MSYQWc_fAu0AOEEd&__tn__=%2As%2As-R&paipv=0', 'text': ''}], 'user_id': None, 'username': 'Людмила Петрановская', 'user_url': 
'https://facebook.com/lv.petranovskaya?lst=61553879472338%3A100038659142270%3A1702119234&eav=AfZbXyHoRHuMJQAou_VCGC4gY_1dqYi6vgNRNosqJSLTAXXa2bwyjg4UK84d5Ka7Yqg&refid=17&_ft_=encrypted_tracking_data.0AY-HW9xzbWptir5dWvRKhRNOjmGw9R42jBChOUVE-YjmHP3CmJLVD79ZyvGaoEEtQ_g9jwZv2CpcBMywwTcZJ6M96wxRm9B88q_moTHR36f_gsE-ns82yFvalMbnDDv1HfAervDsIjTetQwu9ywBAoTyPAbI65HoQORFOsYBCaXfyDS-CK7Uej7dKCqbz7Qfu8xdYDRDAiCJMrEFzR_fiAqZVxMcg3iKKFQPxw66c16qDXOVwPzvhL62ZEOHDRooV7HcN9v8ZvzdCTVNOTC-Rz0fowsshMrNhVbN67QYRV6rLgAML18WEWWtpMAq55uJWszh0zaL_vE6-fTK6EonSeqKox8glqlL1FCWNto0i3bscSN9mtvMwf_g8oSNm6UhoocUFMd7QuhUhJpRk1WM1YO8Bt76E9QGzl7cLo5ge8mZfaKDiNf5oKVAffk8ABIK7kMBaQVj9OWTmYCChek7e60-ICO12TGe-7qYuwrXGQXVmR55FZ1GiFt0TnoKorGvf3bVkCjIQWX75Lj1hdZXZIKfkayV33S0Rz-h5rFgnEZkSs1ozWLqmzZjtx8DIhzWbwNSBhbG_Rbqg7XnNCThDtZdf9oF5r7yKrUDhAFS74-3HTLPuN7GvV2ivyU3MSYQWc_fAu0AOEEd&__tn__=C-R&paipv=0', 'is_live': False, 'factcheck': None, 'shared_post_id': None, 'shared_time': None, 'shared_user_id': None, 'shared_username': None, 'shared_user_url': None, 'shared_post_url': None, 'available': True, 'comments_full': None, 'reactors': None, 'w3_fb_url': None, 'reactions': None, 'reaction_count': 517, 'with': None, 'page_id': None, 'sharers': None, 'is_truncated_text': 'true', 'full_post_url': 'https://mbasic.facebook.com/story.php?story_fbid=pfbid037k8TVCCTbJM32kjHh2K6iWvdprSgmFjvxik8KogUx6UXfPVMYPiGQqfoS6Y5L1Ufl&id=100038659142270&eav=AfYbBWQwOZPvO9oMASU4RqiPm00e-Zc64d2iVDj6qkX_wuXAa9EjL8Hbe1ZtWqRad2k&refid=17&paipv=0', 'translated_text': 'I have information for the parents of the beautiful city of Munich and everyone to whom I can help\nAlready this Sunday you will have a seminar there with Elena Zhuravleva "What should we do with this homework?" 
", of course, not only about homework, but generally about the role of parents in the child\'s relationship with school, motivation and everything you always ask about :)\nLena talks about it awesomely and is very good at making it so that parents will be released and it became easier to negotiate with children.\nGenerally, to whom it may concern - make a gift to your nervous system and your relationship with children\nLink is in first comment', 'translated_post_text': 'I have information for the parents of the beautiful city of Munich and everyone to whom I can help\nAlready this Sunday you will have a seminar there with Elena Zhuravleva "What should we do with this homework?" ", of course, not only about homework, but generally about the role of parents in the child\'s relationship with school, motivation and everything you always ask about :)\nLena talks about it awesomely and is very good at making it so that parents will be released and it became easier to negotiate with children.\nGenerally, to whom it may concern - make a gift to your nervous system and your relationship with children\nLink is in first comment', 'translated_shared_text': '', 'image_id': None, 'image_ids': [], 'was_live': False}
У меня информация для родителей прекрасного города
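
As a workaround while post_id is None, the story_fbid query parameter of post_url can serve as a stand-in identifier. The sketch below is a hypothetical helper using only urllib.parse. Note that the pfbid value is an opaque token, not the numeric post ID, so it is useful for deduplication but not as a replacement for post_id.

```python
from urllib.parse import parse_qs, urlparse


def fallback_post_id(post):
    """Return post_id if present, else the story_fbid from post_url, else None."""
    if post.get("post_id"):
        return post["post_id"]
    url = post.get("post_url") or ""
    values = parse_qs(urlparse(url).query).get("story_fbid")
    return values[0] if values else None


# post_url copied from the output above.
post = {
    "post_id": None,
    "post_url": "https://facebook.com/story.php?story_fbid=pfbid037k8TVCCTbJM32kjHh2K6iWvdprSgmFjvxik8KogUx6UXfPVMYPiGQqfoS6Y5L1Ufl&id=100038659142270",
}
print(fallback_post_id(post))
```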

get_posts returning only 5 posts

posts = get_posts(current_page, start_url="https://mbasic.facebook.com/" + current_page + "?v=timeline", pages=10, cookies="cookies.txt", options={"posts_per_page": 10, "allow_extra_requests": True, "comments": True})

I am using this code to scrape posts, but it returns only 5 posts: the 5 most recent ones.

The log says:
Looking for next page URL
Page parser did not find next page URL

Unable to get posts from pages and groups

I have attempted to extract posts from Facebook pages and groups, but to no avail. The for loops return no results.

The cookies JSON file was exported with the "Get cookies.txt LOCALLY" browser extension.

### To extract from pages

for post in get_posts('nintendo', base_url="https://mbasic.facebook.com", start_url="https://mbasic.facebook.com/nintendo?v=timeline", pages=3, cookies='facebook_cookies.json'):
    print(post['text'][:50])
    print(post)

# This fails with error "facebook_scraper.exceptions.LoginRequired: A login (cookies) is required to see this page"
# for post in get_posts('nintendo', base_url="https://mbasic.facebook.com", 
#         start_url="https://mbasic.facebook.com/nintendo?v=timeline", pages=1,
#         cookies='fb_json_cookies.json'):
#     print(post['text'][:50])

### To extract from groups

for post in get_posts(group='702649679892269', pages=1):
    print(post)
for post in get_posts(group='702649679892269', pages=1, cookies="fb_json_cookies.json"):
    print(post)

Any ideas on how to resolve this are appreciated.
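
Since a LoginRequired error often points at the cookie file, one quick check is whether the export contains the session cookies at all. The sketch below assumes that c_user and xs are the cookies that carry the Facebook login, and that the JSON export is a list of objects with a "name" key; both are assumptions, and missing_session_cookies is a hypothetical helper.

```python
import json

# Assumed names of the cookies that carry the login session.
REQUIRED_COOKIES = ("c_user", "xs")


def missing_session_cookies(cookies):
    """Return the required cookie names that are absent from the export."""
    names = {cookie.get("name") for cookie in cookies}
    return [name for name in REQUIRED_COOKIES if name not in names]


# Made-up export with placeholder values; a real file would be read with json.load.
exported = json.loads(
    '[{"name": "datr", "value": "placeholder"}, {"name": "c_user", "value": "placeholder"}]'
)
print(missing_session_cookies(exported))
```

An empty list means both cookies are present; it does not prove that the session is still valid.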

post_text contains duplicated text

Most text fields are duplicated:

import json
from facebook_scraper import get_posts
from datetime import datetime

def datetime_serializer(o):
    if isinstance(o, datetime):
        return o.isoformat()
    raise TypeError("Type not serializable")

all_posts = []

for post in get_posts(group='890901752414740', base_url="https://mbasic.facebook.com/groups", 
                      start_url="https://mbasic.facebook.com/groups/890901752414740?v=timeline", 
                      pages=3, cookies="cookies.txt"):
    all_posts.append(post)


print(json.dumps(all_posts, default=datetime_serializer, indent=4))

[
    {
        "post_id": "893131828858399",
        "post_text": "How are you today my fellow dev?\n\nHow are you today my fellow dev?",
        "time": "2023-12-15T23:59:00"
    },
    {
        "post_id": "890902099081372",
        "post_text": "I'm a sad cake",
        "time": "2023-12-12T21:36:00"
    },
    {
        "post_id": "890901879081394",
        "post_text": "",
        "time": "2023-12-12T21:36:00"
    }
]
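
Until the duplication is fixed in the scraper, it can be cleaned up after the fact. This is a sketch of one possible workaround, with a hypothetical helper name: it only collapses text whose second half repeats the first half exactly, separated by a blank line, as in the first post above.

```python
def collapse_repeated(text, separator="\n\n"):
    """If text is 'X<separator>X', return 'X'; otherwise return text unchanged."""
    parts = text.split(separator)
    half, remainder = divmod(len(parts), 2)
    if remainder == 0 and half > 0 and parts[:half] == parts[half:]:
        return separator.join(parts[:half])
    return text


duplicated = "How are you today my fellow dev?\n\nHow are you today my fellow dev?"
print(collapse_repeated(duplicated))
print(collapse_repeated("I'm a sad cake"))
```

A post whose author really wrote the same paragraph twice would also be collapsed, so this is best applied only while the bug is present.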
