simplecto / screenshots

Simple Website Screenshots as a Service (Django, Selenium, Docker, Docker-compose)

License: MIT License

Dockerfile 1.61% Shell 0.54% Python 74.20% HTML 13.70% CSS 9.67% JavaScript 0.28%
django docker python screenshot-as-a-service screenshots selenium worker-processes

screenshots's People

Contributors

dependabot[bot], elliotwutingfeng, glovebx, undernewmanagement

screenshots's Issues

url format validation on input

There is no input validation on the URL for screenshots; you can enter anything.

Malformed URLs will crash the screenshotting process because it assumes well-formed URLs.

Force a prefix of http:// or https://.
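
A minimal sketch of what such validation could look like, using only the standard library. The function name normalize_url, the https default, and the rule that a host must contain a dot are assumptions for illustration and are not taken from the project.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Return a cleaned URL, or raise ValueError if the input is unusable."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("empty URL")
    # Force a scheme prefix when the user typed a bare host such as "example.com".
    if "://" not in candidate:
        candidate = f"{default_scheme}://{candidate}"
    parts = urlsplit(candidate)
    if parts.scheme.lower() not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    host = parts.hostname  # None when the URL has no network location
    # Assumption: public sites always have a dotted host name.
    if not host or "." not in host or any(ch.isspace() for ch in host):
        raise ValueError(f"invalid host in URL: {raw!r}")
    return candidate
```

Calling normalize_url("example.com") returns "https://example.com", while inputs such as "ftp://example.com" or "not a url" raise ValueError before they reach the worker.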

screenshots should save as JPG to save space

Currently we get a warning from the Selenium driver about needing to save as PNG:

/Users/sheraz/src/screenshot/venv/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py:1031: UserWarning: name used for saved screenshot does not match file type. It should end with a `.png` extension
  "type. It should end with a `.png` extension", UserWarning)

Enable multiple screenshots of a domain

There should be a screenshot_file table that points to the screenshot table.

That way, every time a new screenshot is made, it does not overwrite the previous ones.

This will be helpful when we want to track the screenshots over time.

screenshot_file table:
id (uuid)
screenshot_id
created_at
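
A rough sketch of the proposed table, written against the standard-library sqlite3 module so it runs on its own. The project itself uses Django, so a model class would be the natural form there. The minimal screenshot table, its url column, and the helper names add_capture and captures_for are assumptions.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE screenshot (
    id  TEXT PRIMARY KEY,
    url TEXT NOT NULL
);
CREATE TABLE screenshot_file (
    id            TEXT PRIMARY KEY,                       -- uuid
    screenshot_id TEXT NOT NULL REFERENCES screenshot(id),
    created_at    TEXT NOT NULL                           -- ISO 8601, UTC
);
"""


def add_capture(conn: sqlite3.Connection, screenshot_id: str) -> str:
    """Record one new capture without touching earlier ones."""
    file_id = str(uuid.uuid4())
    created_at = datetime.now(timezone.utc).isoformat(timespec="microseconds")
    conn.execute(
        "INSERT INTO screenshot_file (id, screenshot_id, created_at) VALUES (?, ?, ?)",
        (file_id, screenshot_id, created_at),
    )
    return file_id


def captures_for(conn: sqlite3.Connection, screenshot_id: str) -> list:
    """Return the ids of all captures of one screenshot, oldest first."""
    rows = conn.execute(
        "SELECT id FROM screenshot_file WHERE screenshot_id = ? ORDER BY created_at",
        (screenshot_id,),
    )
    return [row[0] for row in rows]
```

Each call to add_capture inserts a new row, so the history of a domain is kept instead of being overwritten.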

add delay (secs) to screenshots per entry

We can see from the screenshots that some of them need a little more time to finish rendering once the screen is resized. Let's add a delay, defaulting to 5 seconds, with the option to retake the screenshot with a longer delay.

delete files when screenshot is deleted

Create a worker that only does delete house-cleaning.

When a screenshot is marked as deleted, the worker should first delete the original screenshot folder or the individual screenshot files.

Once the actual images are removed, it should delete the entry from the database.
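
The two-step order described above could be sketched as follows. The records dictionary stands in for the database table, and the "DELETED" status value, the "folder" key, and the function name purge_deleted are all hypothetical.

```python
import shutil
from pathlib import Path


def purge_deleted(records: dict, media_root: Path) -> list:
    """Remove files for records marked DELETED, then drop the records.

    `records` maps a screenshot id to {"status": ..., "folder": ...},
    where "folder" is relative to `media_root`.
    """
    purged = []
    for shot_id, record in list(records.items()):
        if record["status"] != "DELETED":
            continue
        if not record.get("folder"):
            # Guard: never fall through to deleting media_root itself.
            continue
        folder = media_root / record["folder"]
        # Step 1: remove the image files from disk.
        shutil.rmtree(folder, ignore_errors=True)
        # Step 2: drop the entry only once the files are really gone.
        if not folder.exists():
            del records[shot_id]
            purged.append(shot_id)
    return purged
```

If removing the files fails, the record stays in place and is retried on the next pass, so the database never loses track of files that still exist.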

prevent duplicates with (www vs bare domain)

Duplicates are recorded when the same domain is entered both with and without www.

The create_screenshot method should first look up the domain and redirect to the existing screenshot if there is one.

If it does not exist, we continue as planned.
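
One possible way to do the lookup is to reduce every URL to a canonical domain key before querying. This is only a sketch: canonical_domain is a hypothetical helper, and it assumes the www and bare hosts serve the same site, which is not guaranteed for every domain.

```python
from urllib.parse import urlsplit


def canonical_domain(url: str) -> str:
    """Return the host of `url`, lowercased and without a leading "www."."""
    # urlsplit already lowercases the hostname; it is None for bad input.
    host = (urlsplit(url).hostname or "").rstrip(".")
    if host.startswith("www."):
        host = host[len("www."):]
    return host
```

Both "https://www.Example.com/page" and "http://example.com" map to "example.com", so create_screenshot could look up that key first and only create a new entry when nothing is found.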

[worker] - unhandled exception (WebDriverException)

selenium.common.exceptions.WebDriverException: Message: Failed to decode response from marionette

It seems that Sentry has a hard time pulling out the context variables, such as the actual input that caused the error. We should catch and wrap this exception, send it to Sentry, and then continue on.

When this is done, the screenshot status should also be updated to error (or a similar state).
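
A sketch of the catch, report, and continue pattern. To keep it self-contained the screenshot call and the reporter are passed in as callables; in the worker the caught type would be selenium's WebDriverException, and report_error could wrap sentry_sdk.capture_exception. The function name process_one and the "ERROR" and "SUCCESS" status values are assumptions.

```python
import logging

logger = logging.getLogger("screenshot_worker")


def process_one(shot: dict, take_screenshot, report_error=None) -> bool:
    """Take one screenshot; on failure record the error and keep going."""
    try:
        take_screenshot(shot["url"])
    except Exception as exc:  # in the worker: WebDriverException
        shot["status"] = "ERROR"
        logger.exception("screenshot failed for %s", shot["url"])
        if report_error is not None:
            # Pass the input along explicitly so the report has context.
            report_error(exc, {"url": shot["url"], "id": shot.get("id")})
        return False
    shot["status"] = "SUCCESS"
    return True
```

Because the exception is handled inside process_one, the worker loop can move on to the next screenshot instead of dying.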

worker needs to handle cleanup when receiving KILL or ctrl-c

Right now a screenshot can be left in the PENDING state if the worker is killed while making the screenshot, because the worker does not handle a kill or shutdown properly.

It should listen for the termination signal (SIGTERM, or SIGINT from ctrl-c; SIGKILL cannot be caught) and return the screenshot status to NEW.
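
A minimal sketch of that cleanup. A process cannot intercept SIGKILL, so the handlers below cover SIGTERM and SIGINT (ctrl-c) only. The names ShutdownRequested, install_handlers, and process are hypothetical, and the screenshot is modelled as a plain dictionary.

```python
import signal


class ShutdownRequested(Exception):
    """Raised inside the worker when a termination signal arrives."""


def _raise_shutdown(signum, frame):
    raise ShutdownRequested(signum)


def install_handlers() -> None:
    # Must be called from the main thread, once, at worker start-up.
    signal.signal(signal.SIGINT, _raise_shutdown)
    signal.signal(signal.SIGTERM, _raise_shutdown)


def process(shot: dict, take_screenshot) -> None:
    """Take one screenshot, handing it back to the queue on shutdown."""
    shot["status"] = "PENDING"
    try:
        take_screenshot(shot["url"])
    except ShutdownRequested:
        # Interrupted mid-shot: make it available to the next worker run.
        shot["status"] = "NEW"
        raise
    shot["status"] = "SUCCESS"
```

The outer worker loop would catch ShutdownRequested, stop pulling new work, and exit cleanly.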

[screenshot view] - if shot is pending then show spinner

After the user kicks off a screenshot request and lands on the view page, they should see a notification asking them to wait while the screenshot is finished.

The page should refresh every 5 seconds until the screenshot is done.

If the screenshot status is neither pending nor success, the page should show an error to the user.

worker handle with while loop

Hello,

I see that screenshot_worker.py uses a while True loop in handle:

 def handle(self, *args, **options):

        while True:

When I run manage.py screenshot_worker in a terminal, I think it will never stop. How do I run manage.py runserver 0.0.0.0:8000?

[worker] - aggressively attempt to get screenshots

Some websites are not configured properly with regard to SSL and redirection. It is entirely possible to hit a page like https://azcentral.com from outside the US and have their GDPR/protection routers drop traffic, provide invalid SSL settings, or simply hang.

However, when you hit http://azcentral.com, it will work as expected and redirect you to https://eu.azcentral.com

We should make a number of attempts before giving up.

If http fails, then stop.

If https fails, then try http.
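
The fallback order above can be sketched like this. The fetch callable stands in for whatever actually drives the browser, and the names candidate_urls, fetch_with_fallback, and attempts_per_url are assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_urls(url: str) -> list:
    """https URLs fall back to http; http URLs have no fallback."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return [url, urlunsplit(parts._replace(scheme="http"))]
    return [url]


def fetch_with_fallback(url: str, fetch, attempts_per_url: int = 2):
    """Try each candidate a few times; return (url_used, result)."""
    last_error = None
    for candidate in candidate_urls(url):
        for _ in range(attempts_per_url):
            try:
                return candidate, fetch(candidate)
            except Exception as exc:  # narrow this to the driver's errors
                last_error = exc
    raise RuntimeError(f"all attempts failed for {url}") from last_error
```

For https://azcentral.com this tries the https URL twice, then http://azcentral.com, and gives up only when every attempt has failed.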

restful api call with webhook

Build a REST endpoint to request a screenshot.

It receives a POST with a JSON body.

Request payload (POST):

  • url
  • webhook endpoint where the notification should be sent (optional)

Response payload:

  • url
  • id of the object created
  • timestamp

Webhook payload (POST):

  • url
  • id
  • status
  • screenshot encoded as base64
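
A sketch of the three payloads as plain functions, independent of any web framework. The JSON key names "webhook" and "screenshot", the UUID id, and the ISO 8601 timestamp are assumptions; the issue only lists the fields.

```python
import base64
import json
import uuid
from datetime import datetime, timezone


def parse_request(body: str) -> dict:
    """Validate the POST body; `webhook` is optional."""
    data = json.loads(body)
    if not isinstance(data, dict) or not isinstance(data.get("url"), str) or not data["url"]:
        raise ValueError("'url' is required")
    return {"url": data["url"], "webhook": data.get("webhook")}


def build_response(url: str) -> dict:
    """Payload returned immediately to the caller."""
    return {
        "url": url,
        "id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def build_webhook_payload(url: str, shot_id: str, status: str, image: bytes) -> dict:
    """Payload POSTed to the webhook once the screenshot is done."""
    return {
        "url": url,
        "id": shot_id,
        "status": status,
        "screenshot": base64.b64encode(image).decode("ascii"),
    }
```

All three return plain dictionaries, so they can be passed straight to json.dumps by whichever view or worker sends them.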

show stats on front page

Show a little window with the following:

  • number of images waiting
  • number of images rendering
  • average wait time for an image (last 5 minutes)
  • average time to render an image (speed of rendering)
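
These four numbers could be computed along the following lines. The sketch assumes each screenshot carries created_at, started_at, and finished_at timestamps and that NEW means waiting while PENDING means rendering; none of these field names are confirmed by the project.

```python
from datetime import datetime, timedelta
from statistics import mean


def queue_stats(shots: list, now: datetime, window: timedelta = timedelta(minutes=5)) -> dict:
    """Summarise queue depth and recent timings for the front page."""
    waiting = sum(1 for s in shots if s["status"] == "NEW")
    rendering = sum(1 for s in shots if s["status"] == "PENDING")
    # Only screenshots finished inside the window count towards averages.
    recent = [
        s for s in shots
        if s.get("finished_at") is not None and now - s["finished_at"] <= window
    ]
    waits = [(s["started_at"] - s["created_at"]).total_seconds() for s in recent]
    renders = [(s["finished_at"] - s["started_at"]).total_seconds() for s in recent]
    return {
        "waiting": waiting,
        "rendering": rendering,
        "avg_wait_seconds": mean(waits) if waits else None,
        "avg_render_seconds": mean(renders) if renders else None,
    }
```

The averages are None when nothing finished inside the window, which lets the template show a placeholder instead of a misleading zero.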
