dannyvassallo / insta_scrape

The Instagram Swiss Army knife. Restores all deprecated hashtag functionality and grants public API access through Instagram's front end without any authorization.

Home Page: https://rubygems.org/gems/insta_scrape

License: MIT License

Ruby 99.15% Shell 0.85%

insta_scrape's Introduction

InstaScrape

[NO LONGER MAINTAINED]

The Instagram Swiss Army knife. Restores all deprecated hashtag functionality and grants public API access through Instagram's front end without any authorization.

With include_meta_data: true, each post returns its image, link, text, date, username, hi_res_image, and likes. See the examples and usage pages for the other methods and how to use them.

Note [ PLEASE READ ]

The number of results may vary when using certain methods as this IS NOT an official endpoint.

Installation

Add this line to your application's Gemfile:

gem "insta_scrape"

For bleeding edge, install from the development branch:

gem "insta_scrape", :git => "https://github.com/dannyvassallo/insta_scrape.git", :branch => "develop"

And then execute:

$ bundle

Or install it yourself as:

$ gem install insta_scrape

How To Use

You'll probably want to use the whenever gem or something similar in order to create hashtag widgets like you once could. Scheduling a job (polling) and storing each post's information in your database/cache is one way to do it.
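The polling idea above can be sketched roughly as follows. This is a minimal illustration, not part of the gem: HashtagPostStore and its poll method are hypothetical names, the store is in memory only (a real app would likely use a database table or cache), and the fetcher is passed in as a callable so the sketch does not depend on a live scrape.

```ruby
require "set"

# Hypothetical in-memory store (not part of insta_scrape).
# It remembers which post links it has already seen.
class HashtagPostStore
  attr_reader :posts

  # fetcher: any callable that takes a hashtag and returns objects
  # responding to #link (as the posts in the examples do).
  def initialize(fetcher)
    @fetcher = fetcher
    @seen_links = Set.new
    @posts = []
  end

  # Runs one polling pass and returns only the posts not stored before.
  def poll(hashtag)
    fresh = @fetcher.call(hashtag)
                    .reject { |post| post.link.nil? || @seen_links.include?(post.link) }
                    .uniq(&:link)
    fresh.each { |post| @seen_links << post.link }
    @posts.concat(fresh)
    fresh
  end
end

# With the gem installed, the fetcher would wrap the documented call:
#   store = HashtagPostStore.new(->(tag) { InstaScrape.hashtag(tag) })
#   store.poll("test") # run this from a scheduled job
```

Keying on the post link is an assumption; any field that uniquely identifies a post would do.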

Standard Hashtag Scrape Example:

# basic use case
scrape_result = InstaScrape.hashtag("test")
scrape_result.each do |post|
  puts post.image
  puts post.link
  puts post.text
end

Long Scrape a hashtag and get additional metadata:

# you can set include_meta_data to false if
# you want to speed up the scrape
scrape_result = InstaScrape.long_scrape_hashtag('test', 1, include_meta_data: true)
scrape_result.each do |post|
  puts post.image
  puts post.link
  puts post.text
  puts post.date
  puts post.username
  puts post.hi_res_image
  puts post.likes
end

See the InstaScrape Wiki to learn the rest of InstaScrape's features.

Problems? Need Help?

Create an issue and I'll respond as soon as I can. If it's a feature request and you've got some free time -- PRs are gladly welcomed.

Contributing

❗️Bug reports and pull requests are ALWAYS welcome on GitHub❗️. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.

License

The gem is available as open source under the terms of the MIT License.

insta_scrape's People

Contributors

beydogan, dannyvassallo, jmiller656, maruware, mbueti, oguzcanhuner, wdruzkawiecki, yohendry

insta_scrape's Issues

Capybara::Poltergeist::JavascriptError when trying anything

I added insta_scrape 1.1.2 to my current Rails project, Rails 5.0.3 with ruby 2.3.1p112. When trying to run any insta_scrape methods from a console session, I get the following error:

Capybara::Poltergeist::JavascriptError: Capybara::Poltergeist::JavascriptError
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/poltergeist-1.9.0/lib/capybara/poltergeist/browser.rb:351:in `command'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/poltergeist-1.9.0/lib/capybara/poltergeist/browser.rb:34:in `visit'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/poltergeist-1.9.0/lib/capybara/poltergeist/driver.rb:95:in `visit'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/capybara-2.7.1/lib/capybara/session.rb:233:in `visit'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/capybara-2.7.1/lib/capybara/dsl.rb:52:in `block (2 levels) in <module:DSL>'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/insta_scrape-1.1.2/lib/insta_scrape.rb:105:in `scrape_user_info'
    from /Users/me/.rvm/gems/ruby-2.3.1/gems/insta_scrape-1.1.2/lib/insta_scrape.rb:35:in `user_info'

In this particular instance I ran InstaScrape.user_info("foofighters").

Interestingly, if I go to line 105 of insta_scrape.rb and run extend Capybara::DSL followed by visit "https://www.instagram.com/foofighters/", occasionally it works, and occasionally it throws the same error.

What could be the issue?

InstaScrape.long_scrape_hashtag only returns 4 posts

Hello,

I'm trying to retrieve all of the usernames that are associated with a specific hashtag. Here is the code.

require "csv"
require "insta_scrape"

def appendRowToCsv(row)
  CSV.open("instagram_posts.csv", "a+") do |csv|
    csv << row
  end
end

appendRowToCsv(['username'])

scrape_result = InstaScrape.long_scrape_hashtag('dripmoreicedteas', 0, include_meta_data: true)
scrape_result.each do |post|
  appendRowToCsv([post.username])
end

The code writes 4 usernames to the CSV file. I cannot figure out how to retrieve more than 4 usernames. Any assistance would be greatly appreciated.

Thank you!
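As a side note on the CSV part of the snippet above, here is a small sketch that opens the file once and writes each distinct username a single time. write_usernames_csv is a hypothetical helper, not part of the gem, and it does not change how many posts the scrape itself returns.

```ruby
require "csv"

# Hypothetical helper: writes a header plus one row per distinct username.
# posts only need to respond to #username.
def write_usernames_csv(posts, path)
  usernames = posts.map(&:username).compact.uniq
  CSV.open(path, "w") do |csv|
    csv << ["username"]
    usernames.each { |name| csv << [name] }
  end
  usernames.size # number of usernames written
end

# Possible usage with the call from the issue:
#   posts = InstaScrape.long_scrape_hashtag('dripmoreicedteas', 0, include_meta_data: true)
#   write_usernames_csv(posts, "instagram_posts.csv")
```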

Method user_info does not return any value

If I call:
InstaScrape.user_info("foofighters")
I get:
Capybara::ElementNotFound: Unable to find css "h2"
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/node/finders.rb:44:in `block in find'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/node/base.rb:85:in `synchronize'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/node/finders.rb:33:in `find'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/session.rb:699:in `block (2 levels) in <class:Session>'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/insta_scrape-1.1.4/lib/insta_scrape.rb:123:in `block in scrape_user_info'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/session.rb:291:in `within'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/capybara-2.7.1/lib/capybara/dsl.rb:52:in `block (2 levels) in <module:DSL>'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/insta_scrape-1.1.4/lib/insta_scrape.rb:116:in `scrape_user_info'
from /Users/x/.asdf/installs/ruby/2.3.3/lib/ruby/gems/2.3.0/gems/insta_scrape-1.1.4/lib/insta_scrape.rb:35:in `user_info'
from (irb):6
from /Users/x/.asdf/installs/ruby/2.3.3/bin/irb:11:in `<main>'

Running on OS X 10.11.6, Ruby 2.3.3.
It was working until a couple of weeks ago.

General Test Failing Due To 2018 IG css change by the looks of it

Running rake spec gives:

Finished in 1 minute 3.27 seconds (files took 0.30543 seconds to load)
16 examples, 12 failures

Failed examples:

rspec ./spec/insta_scrape_spec.rb:16 # InstaScrape connects to user's instagram scrapes and maps their info
rspec ./spec/insta_scrape_spec.rb:21 # InstaScrape connects to user's instagram scrapes and maps their info and posts
rspec ./spec/insta_scrape_spec.rb:78 # InstaScrape connects to instagram hashtag long_scrapes a user info with posts and gets all of them
rspec ./spec/insta_scrape_spec.rb:89 # InstaScrape connects to a user and checks their follower count
rspec ./spec/insta_scrape_spec.rb:94 # InstaScrape connects to a user and checks their following count
rspec ./spec/insta_scrape_spec.rb:99 # InstaScrape connects to a user and checks their post count
rspec ./spec/insta_scrape_spec.rb:104 # InstaScrape connects to a user and checks their description
rspec ./spec/insta_scrape_spec.rb:35 # InstaScrape#user_posts returns extra data for each post
rspec ./spec/insta_scrape_spec.rb:46 # InstaScrape#long_scrape_hashtag connects to instagram hashtag long_scrapes 'test' hashtag and gets over 200 posts
rspec ./spec/insta_scrape_spec.rb:51 # InstaScrape#long_scrape_hashtag returns extra data for each post
rspec ./spec/insta_scrape_spec.rb:62 # InstaScrape#long_scrape_users connects to instagram hashtag long_scrapes a user and gets all posts
rspec ./spec/insta_scrape_spec.rb:68 # InstaScrape#long_scrape_users returns extra data for each post

Returning 'created at' date from posts

Hey, thanks for creating this gem; it's been really useful!

It would be great to scrape the time that posts were created at. I'm happy to submit a PR with that change if that's okay with you :)

insta scrape save?

Hi! I downloaded the gem and ran a few cases, but where do the images get saved? Also, it stalled on the command line and had not finished after 3 hours. Is it in a queue? If the hashtag has millions of posts, how long should I expect to wait? I tried a hashtag with 13 posts and it stalled the same way for quite a while.

Thanks!
Tayler

Add post video scrape as well

Is there a way you can update the gem to scrape the video source files as well as other regular image posts? (from Instagram) I'm trying to edit it myself but am not having much success. I'm still learning how to edit your ruby gem to do so. Can you write the correct way to add this feature? Thanks for your time and consideration. :)

Non standard image sizes

insta_scrape will always return a url to an image that is square since these URLs come from Instagram's grid view. Most of the time this is acceptable except if the user decides to upload an image that is not the standard size. In this case the grid version crops the edges off the image.

Example:
Original image: https://scontent-sjc2-1.cdninstagram.com/t51.2885-15/e35/p480x480/16465323_259532871141640_7137122661011816448_n.jpg?ig_cache_key=MTQ0NjcxMjQyNTU1MDQ5NDQ4Mw%3D%3D.2

Image returned by insta_scrape: https://scontent-sjc2-1.cdninstagram.com/t51.2885-15/e35/c0.45.638.638/16465323_259532871141640_7137122661011816448_n.jpg?ig_cache_key=MTQ0NjcxMjQyNTU1MDQ5NDQ4Mw%3D%3D.2.c

This creates a problem for the app I'm working on because a lot of the images I need have text embedded.

Is it possible to grab the full size image URL from the HTML? If not, do you know if there's an easy way to modify the grid image URL to get the original version?

I'm still learning Ruby, but if you point me in the right direction I can try to find a solution and submit a PR.

Thanks so much! :)
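Based only on the two example URLs above, the cropped grid version appears to carry a path segment of the form c0.45.638.638. The sketch below merely detects such a segment so cropped results can be flagged; cropped_grid_url? is a hypothetical helper, the pattern is inferred from this single example, and nothing here is confirmed about how Instagram builds its URLs or whether the uncropped URL can be derived from the cropped one.

```ruby
require "uri"

# Hypothetical helper: true if any path segment looks like "c0.45.638.638".
# The pattern is an assumption inferred from one example URL.
def cropped_grid_url?(url)
  URI.parse(url).path.split("/").any? do |segment|
    segment.match?(/\Ac\d+\.\d+\.\d+\.\d+\z/)
  end
end

# Possible usage: post.image would come from a scrape result.
#   puts "cropped: #{post.link}" if cropped_grid_url?(post.image)
```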

undefined method `image' for nil:NilClass

Calling InstaScrape.user_info_and_posts(username) gives this error:

NoMethodError: undefined method `image' for nil:NilClass
from /usr/local/opt/rbenv/versions/2.3.3/lib/ruby/gems/2.3.0/gems/insta_scrape-1.1.4/lib/insta_scrape.rb:191:in `log_posts'

Thanks

Ambiguous match Capybara

Hi,
I'm hitting the following error in long_scrape_posts. It may be because Instagram has updated its HTML tags.

Ambiguous match, found 3 elements matching css "div section span span" (Capybara::Ambiguous)

Get text

Hello,
Great job with the gem. I am a Python developer and am using this gem because I couldn't find a better alternative in Python. Please let me know how to get the text for every image. I saw a PR that tried to do this, but I don't see it reflected in the output (even with include_meta_data: true). I am a complete newbie to Ruby, so any reply would help.
Thanks.

User info for hashtag search

Hi! Thanks for a great package!

I was wondering if it's possible to get user info when using the hashtag search (InstaScrape.long_scrape_hashtag and InstaScrape.hashtag). Looks like each post has only:
@image
@link
@date
@text

I'd like to get the info about the user of each post as well.

Thank you!

Instagram Public_content vs scrape?

Sorry to bother you about insta_scrape, but I had a question. I work with a group that has been unable to get public_content access granted to pull post data for specific hashtags. Is that what your script does? Any tips on getting approved for public_content? Have you had any issues with getting IP blocked or anything similar? Thanks in advance.

Trevyn

Capybara::Ambiguous: Ambiguous match, found 14 elements matching css "time"

Hi, maybe you could help me with this error:
Capybara::Ambiguous: Ambiguous match, found 14 elements matching css "time"
I get it when I run this code:

scrape_result = InstaScrape.long_scrape_hashtag("test", 1, include_meta_data: true)
scrape_result.each do |post|
  if Foto.find_by(name: post.text).nil?
    Foto.create(name: post.text, otro: post.image, user: post.username)
  end
end

Thanks

Date and location

Hi!
Also, how can I get the date and the place where the picture was taken or uploaded?
Thanks

Great job

Hi Danny,

Love what you did with insta_scrape. I started using it and was pretty amazed at how easy it was to start scraping profiles.

I'm using this issue to drop a suggestion. Having number of likes and comments per post would be super cool.

I'm thinking of creating a web app that allows users to calculate their engagement rate vs other people so this would be super useful !

long_scrape does not work

Hello,

I'm trying to use the long_scrape_hashtag method, but it does not work.

The following code returns only 21 images.

require "insta_scrape"

scrape_result = InstaScrape.long_scrape_hashtag("test", 60)
scrape_result.each do |post|
  puts post.image
  puts post.link
end

insta_scrape gets frozen when running in resque

insta_scrape (version 1.1.3) sometimes freezes when running the method "user_info_and_posts". It doesn't matter whether the user has public or private posts.

I use it in an application that runs via resque 1.27.4.

Neither setting a PhantomJS timeout nor wrapping the call in the Timeout module helped.
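One generic Ruby workaround for a call that ignores in-process timeouts is to run it in a forked child process and kill that process when a deadline passes. The sketch below is untested against this gem or resque: run_with_hard_timeout is a hypothetical helper, it needs a platform that supports fork, the block's result must be serializable with Marshal, and any PhantomJS process started by the child may need separate cleanup.

```ruby
require "timeout"

# Hypothetical helper: runs the block in a child process and returns its
# result, or nil if the child does not finish within `seconds`.
def run_with_hard_timeout(seconds)
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield)) # send the block's result to the parent
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close

  begin
    Timeout.timeout(seconds) do
      data = reader.read # returns once the child closes its end of the pipe
      Process.wait(pid)
      data.empty? ? nil : Marshal.load(data)
    end
  rescue Timeout::Error
    begin
      Process.kill("KILL", pid) # the child is stuck, so stop it forcibly
      Process.wait(pid)
    rescue Errno::ESRCH, Errno::ECHILD
      # the child already exited or was already reaped
    end
    nil
  ensure
    reader.close
  end
end

# Possible usage with the method from the issue:
#   info = run_with_hard_timeout(60) { InstaScrape.user_info_and_posts("foofighters") }
```

Because the child is a separate process, killing it works even when the hang sits inside a blocking call that Timeout cannot interrupt in-process.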
