oduwsdl / mementoembed

A service that provides archive-aware oEmbed-compatible embeddable surrogates (social cards, thumbnails, etc.) for archived web pages (mementos).

License: MIT License

Python 32.08% HTML 39.96% JavaScript 25.47% Shell 1.89% CSS 0.09% Dockerfile 0.14% Makefile 0.37%
memento surrogate embed web-archives social-cards flask docker thumbnails

mementoembed's People

Contributors

himarshaj, ibnesayeed, machawk1, shawnmjones

Forkers

databill86 min2ha

mementoembed's Issues

Include images from loaded CSS files

Sometimes all images on a page are loaded via CSS. MementoEmbed should interrogate the CSS files for URIs found in the background-image property.
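A minimal sketch of how such interrogation might look, assuming the CSS text has already been downloaded. The function name `background_image_uris` and the regular expression are illustrative, not part of MementoEmbed, and the pattern only captures the first `url(...)` in each `background` or `background-image` declaration.

```python
import re
from urllib.parse import urljoin

# Matches the first url(...) inside a background or background-image declaration.
_BACKGROUND_URL = re.compile(
    r"background(?:-image)?\s*:[^;}]*?url\(\s*['\"]?([^'\")]+)['\"]?\s*\)",
    re.IGNORECASE,
)


def background_image_uris(css_text, stylesheet_uri):
    """Return absolute URIs of images referenced by background declarations.

    Relative references are resolved against the URI of the stylesheet,
    as CSS requires.
    """
    return [
        urljoin(stylesheet_uri, match.strip())
        for match in _BACKGROUND_URL.findall(css_text)
    ]
```

For example, calling it with the text of a stylesheet served from `http://www.example.com/css/site.css` would resolve `url("img/bg.png")` to `http://www.example.com/css/img/bg.png`.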

First attempt to create a card fails

Run the server, load http://localhost:5550/, enter a URI-M, and click the "Create a Social Card" button. The first card creation request is not processed. Instead, the page is redirected to http://localhost:5550/?# (notice the added ?# in the URL), the form is cleared, and the console logs the message "we have failure with data: [object Object]". Further attempts are processed as expected, unless the ?# is removed from the URL again.

Missing thumbnail image in cards

Every time I try to request a card for a URI-M like https://web.archive.org/web/20180604110141/http://www.example.com/, the card is missing the thumbnail image and a 404 is reported in the developer console of the browser which points to a resource at http://localhost:5550/undefined. This might be a duplicate of #30.

Use unobtrusive JavaScript for event listeners

Placing JS inline as an HTML attribute value is generally not considered good practice. Binding handlers externally keeps behavior out of the markup, so that once all the JS is extracted into an external file, none of it leaks into the HTML. The click handlers (such as onclick="requestEmbed();") in the following code can be removed and bound externally in an unobtrusive way.

<button type="button" class="btn btn-primary" onclick="requestEmbed();">Create a Social Card</button>
<button type="reset" class="btn btn-secondary" onclick="clearEmbed();">Clear URL</button>

Add line ending to all files

Many files in this repo currently lack a trailing newline character, which is not standard practice.

Test with URI-Ms from Bibliotheca Alexandrina Web Archive

This requirement will be delayed until we have URI-Ms to test.

"Please note that the Bibliotheca Alexandrina is currently migrating the web archive collection to a new storage system. Therefore, availability of archived webpages is not guaranteed for the time being. We appreciate your patience, and we look forward to the collection being fully available once again soon."

Dockerfile improvements

FROM python:3.6.4-stretch

The base image tag is so specific that it will not automatically pick up even non-breaking security patches (i.e., the third component, the patch release, in the semver scheme). For example, python:3.6.4-stretch is already stale and python:3.6.5-stretch has been published. We should perhaps use python:3.6-stretch instead to accommodate patch releases. Better yet, if Python 3.6 is not specifically required and any later version would work, we can use python:3-stretch instead.

# TODO: publish archiveit_utilities so that we don't need to do this
RUN git clone https://github.com/shawnmjones/archiveit_utilities.git

RUN cd archiveit_utilities && pip install .

Each RUN instruction adds a layer to the image. It is better to put related tasks and any associated cleanup in a single layer for a cleaner image. Hence, the above instructions can be consolidated into a single one.

FutureWarning for the way a None object is checked

After the server boots, the first card creation request causes it to log a warning.

/app/mementoembed/mementosurrogate.py:785: FutureWarning: The behavior of this method will change in future versions. Use specific 'len(elem)' or 'elem is not None' test instead.
  if maxpara:
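The warning text matches the one emitted when an element object (for example from `xml.etree.ElementTree` or lxml) is tested for truth directly. The sketch below shows the explicit `is not None` test the warning recommends. The variable name `maxpara` comes from the log line above, but the helper `first_paragraph_text` and the lookup it performs are hypothetical, since the surrounding code is not shown in this issue.

```python
import xml.etree.ElementTree as ET


def first_paragraph_text(root):
    """Return the text of the first <p> element under root, or None."""
    maxpara = root.find(".//p")
    # Compare against None explicitly. A found element that has no child
    # elements has len() == 0, so a bare truth test can misreport it.
    if maxpara is not None:
        return maxpara.text
    return None


if __name__ == "__main__":
    print(first_paragraph_text(ET.fromstring("<div><p>hello</p></div>")))
```

If the intent at that line is instead "the element has children", `len(maxpara) > 0` would be the test to use.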

ResourceWarning: unclosed ssl.SSLSocket

Is there anything we can do about this?

ResourceWarning: unclosed <ssl.SSLSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.86.26', 50357), raddr=('207.241.225.186', 443)>

This only shows up during unit tests and appears to be related to:
psf/requests#3912
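One possible workaround, sketched under the assumption that the warning is only noise in the test output, is to filter `ResourceWarning` for the duration of each test. The class names here are illustrative. This hides the symptom rather than closing the socket, so explicitly closing any HTTP sessions the code opens may still be the better fix.

```python
import io
import unittest
import warnings


class QuietResourceWarningTestCase(unittest.TestCase):
    """Hypothetical base class that hides ResourceWarning during each test."""

    def setUp(self):
        # Save the current warning filters and restore them after the test,
        # so the "ignore" filter does not leak into other tests.
        ctx = warnings.catch_warnings()
        ctx.__enter__()
        self.addCleanup(ctx.__exit__, None, None, None)
        warnings.simplefilter("ignore", ResourceWarning)


class ExampleTest(QuietResourceWarningTestCase):
    def test_resource_warning_is_filtered(self):
        with warnings.catch_warnings(record=True) as caught:
            warnings.warn("unclosed <ssl.SSLSocket ...>", ResourceWarning)
        self.assertEqual(caught, [])


def run_example():
    """Run ExampleTest quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```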

Support for CarbonDate service

MementoEmbed should at least link to the CarbonDate service for a given URI. Perhaps the CarbonDate endpoint could be made configurable?
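A configurable endpoint could look roughly like the sketch below. The environment variable name `CARBONDATE_ENDPOINT`, the default URL, and the convention of appending the target URI to the endpoint path are all assumptions made for illustration, and they would need to be checked against the actual CarbonDate deployment.

```python
import os

# Hypothetical placeholder; replace with the real CarbonDate deployment.
DEFAULT_CARBONDATE_ENDPOINT = "https://carbondate.example.org/cd/"


def carbondate_link(uri, environ=os.environ):
    """Build a link to the CarbonDate service for the given URI.

    The endpoint is read from the (assumed) CARBONDATE_ENDPOINT setting and
    falls back to the placeholder default when it is not set.
    """
    endpoint = environ.get("CARBONDATE_ENDPOINT", DEFAULT_CARBONDATE_ENDPOINT)
    if not endpoint.endswith("/"):
        endpoint += "/"
    return endpoint + uri
```

Passing the configuration mapping as a parameter keeps the function easy to test without touching the real process environment.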

Improve the error message for non-mementos

Right now, the following error message appears on a red background:

The URL you supplied (https://www.flexispy.com/) is not a memento or comes from an archive that is not Memento-Compliant.

For a live web resource, you can create a memento that resides on the web in the following ways:

* Using the Internet Archive's Save Page Now button.
* Saving the web page at Archive.is
* Using the ArchiveNow service.
* Using a browser plugin, like Mink.

Happy Memento Making!

The red background is not friendly and there should be links to the recommended services.

Fix issue with #? in URI

It appears that MementoEmbed does not execute properly at first load. If a user submits a URI-M at /, the system reloads the page to /#? rather than submitting the request. Once this reload has occurred, subsequent URI-M submissions are successful.

Support for MementoDamage service

MementoEmbed should at least link to the MementoDamage service for a given URI. Perhaps the MementoDamage endpoint could be made configurable?

Time zone missing from card date

At the bottom of the card, there is a datetime and a source, but the time is ambiguous because it lacks a time zone. "GMT", "Z", or whatever is applicable ought to be appended.
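As a rough sketch, assuming the card's date is derived from the memento's Memento-Datetime header (which RFC 7089 expresses in GMT), the zone could be made explicit when formatting. The function name `card_datetime` and the output format are illustrative choices, not the project's.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def card_datetime(memento_datetime_header):
    """Format an HTTP-date string for display with an explicit GMT suffix."""
    parsed = parsedate_to_datetime(memento_datetime_header)
    # Normalize to UTC first so the literal "GMT" suffix is always accurate,
    # even if the input carried a different offset.
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M:%S GMT")


if __name__ == "__main__":
    print(card_datetime("Mon, 04 Jun 2018 11:01:41 GMT"))
```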

MementoEmbed 6563c01 using Docker.

[Screenshot of the card, taken 2018-06-19 at 5:19:36 PM]

Force cache hits during automated testing

In spite of using a custom heuristic with cachecontrol, the system still skips the cache for some requests. Because web archives will likely block too many requests for the same URI-M, this will need to be alleviated for automated testing to be used with Travis CI.
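One way to guarantee that tests never reach a web archive is to put a replay layer in front of the fetching code. The sketch below is a stand-in written with the standard library only. It does not use the cachecontrol API, and the class name `ReplayCache` and its interface are hypothetical.

```python
class ReplayCache:
    """Serve recorded responses by URI and refuse network access on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable(uri) -> bytes; used only when recording
        self._store = {}           # uri -> recorded response body
        self.network_calls = 0     # how many times the real fetcher was invoked

    def get(self, uri, record=False):
        """Return the cached body for uri.

        With record=False (the mode intended for automated tests) a miss
        raises LookupError instead of silently going to the network.
        """
        if uri in self._store:
            return self._store[uri]
        if not record:
            raise LookupError(f"cache miss for {uri}; refusing network access")
        self.network_calls += 1
        self._store[uri] = self._fetch(uri)
        return self._store[uri]
```

In a CI run, the store would be populated once in record mode (or loaded from fixtures), after which every request for the same URI-M is answered locally.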

Implement thumbnail generation

A feature of oEmbed is the generation of thumbnails based on given height/width requirements. MementoEmbed needs to support this.
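The oEmbed consumer request parameters for these limits are `maxwidth` and `maxheight`. A minimal sketch of the size calculation follows, assuming the thumbnail should keep the source aspect ratio and never be upscaled. Both of those are design assumptions, and the helper name `fit_within` is illustrative.

```python
def fit_within(width, height, maxwidth=None, maxheight=None):
    """Return (width, height) scaled down to fit the given oEmbed limits.

    The aspect ratio is preserved, and images already within the limits
    are returned unchanged (no upscaling).
    """
    scale = 1.0
    if maxwidth is not None:
        scale = min(scale, maxwidth / width)
    if maxheight is not None:
        scale = min(scale, maxheight / height)
    # Truncate to whole pixels, but never return a zero-sized dimension.
    return max(1, int(width * scale)), max(1, int(height * scale))
```

The resulting dimensions could then be handed to whatever image library performs the actual resize.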

Ensure that this project works with archive.is

This project has issues with resolving content from archive.is URI-Ms. A potential solution may exist in the use of the ZIP URI containing the content that is rendered within the archive.is banner.

Add a LICENSE file

We usually use the MIT license for most of our code, but you can choose whichever feels more suitable to you.
