gildas-lormeau / singlefile

Web Extension for saving a faithful copy of a complete web page in a single HTML file

License: GNU Affero General Public License v3.0

JavaScript 83.19% CSS 4.41% HTML 12.19% Shell 0.21%
browser archive auto-save chrome firefox offline-reading osint chrome-extension firefox-addon puppeteer

singlefile's Introduction

SingleFile

SingleFile is a Web Extension (and a CLI tool) compatible with Chrome, Firefox (Desktop and Mobile), Microsoft Edge, Safari, Vivaldi, Brave, Waterfox, Yandex browser, and Opera. It helps you to save a complete web page into a single HTML file.

Install

SingleFile can be installed on:

You can also download the zip file (https://github.com/gildas-lormeau/SingleFile/archive/master.zip) of the project and install it manually by unzipping it somewhere on your disk and following these instructions:

Getting started

  • Click on the SingleFile button in the extension toolbar to save the page.
  • You can click again on the button to cancel the action when processing a page.

Additional notes

  • Open the context menu by right-clicking the SingleFile button in the extension toolbar or on the webpage. It allows you to save:
    • the current tab,
    • the selected content,
    • the selected frame.
  • You can also process multiple tabs in one click and save:
    • the selected tabs,
    • the unpinned tabs,
    • all the tabs.
  • Select "Annotate and save the page..." in the context menu to:
    • highlight text,
    • add notes,
    • remove content.
  • The context menu also allows you to activate the auto-save of:
    • the current tab,
    • the unpinned tabs,
    • all the tabs.
  • With auto-save active, pages are saved automatically each time they finish loading (or just before they are unloaded if they did not finish loading).
  • Right-click on the SingleFile button and select "Manage extension" (Firefox) / "Options" (Chrome) to open the options page.
  • Enable the option "Destination > save to Google Drive" or "Destination > upload to GitHub" to upload pages to Google Drive or GitHub respectively.
  • Enable the option "Misc. > add proof of existence" to prove the existence of saved pages by linking the SHA256 of the pages into the blockchain.
  • You can use the customizable shortcut Ctrl+Shift+Y to save the current tab or the selected tabs. Go to about:addons and select "Manage extension shortcuts" in the cogwheel menu to change it in Firefox. Go to chrome://extensions/shortcuts to change it in Chrome.
  • The default save folder is the download folder configured in your browser, cf. about:addons in Firefox and chrome://settings in Chrome.
  • See the extension help in the options page for more detailed information about the options and technical notes.
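
The "add proof of existence" note above relies on the SHA256 of a saved page. As a minimal sketch of that first step only, assuming Node.js and its built-in crypto module, the digest of a page's content could be computed as follows. The function name sha256Hex and the sample page are hypothetical, and how SingleFile actually links the digest into a blockchain is not shown here.

```javascript
const { createHash } = require("crypto");

// Returns the SHA-256 digest of a saved page as a lowercase hex string.
// "sha256Hex" is a hypothetical helper name, not part of SingleFile.
function sha256Hex(pageContent) {
  return createHash("sha256").update(pageContent, "utf8").digest("hex");
}

// Hypothetical saved page content, used only for illustration.
const savedPage = "<!DOCTYPE html><html><body>Hello</body></html>";
console.log(sha256Hex(savedPage)); // 64 hexadecimal characters
```

Any later change to the file, even a single byte, produces a different digest, which is what makes a published digest usable as evidence that this exact content existed.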

FAQ

See https://github.com/gildas-lormeau/SingleFile/blob/master/faq.md

Release notes

See https://addons.mozilla.org/firefox/addon/single-file/versions/

Known Issues

  • All browsers:
    • For security reasons, you cannot save pages hosted on https://chrome.google.com, https://addons.mozilla.org and some other Mozilla domains. When this happens, 🛇 is displayed on top of the SingleFile icon.
    • For security reasons, SingleFile is sometimes unable to save the image representation of canvas and snapshots of video elements.
    • The last saved path cannot be remembered by default. To circumvent this limitation, disable the option "Misc > save pages in background".
    • The following characters are replaced with _ in file names: ~, +, \, ?, %, *, :, |, ", <, >. This is done to maintain compatibility with various OSs and file systems. If you don't need that level of compatibility and know what you are doing, you can change the list of forbidden characters in Hidden options.
  • Chromium-based browsers:
    • You must enable the option "Allow access to file URLs" in the extension page to display the infobar when viewing a saved page, and to save or to annotate a page stored on the filesystem.
    • If the file name of a saved page looks like "56833935-156b-4d8c-a00f-19599c6513d3.html", disable the option "Misc > save pages in background". Reinstalling the browser may also fix this issue. You can find more info about this bug here.
    • Disabling the option "File name > open the "Save as" dialog to confirm the file name" will work if and only if the option "Ask where to save each file before downloading" is disabled in chrome://settings/downloads.
  • Firefox:
    • The "File name > file name conflict resolution" option does not work if set to "prompt for a name".
    • Sometimes, SingleFile is unable to save the contents of sandboxed iframes because of this bug.
    • When processing a page from the filesystem, external resources (e.g. images, stylesheets, fonts etc.) will not be embedded into the saved page. You can find more info about this bug here. This bug has been closed by Mozilla as "WontFix". But there is a simple workaround proposed here.
  • Waterfox Classic:
    • User interface elements displayed in the page (progress bar, logs panel) won't be displayed unless dom.webcomponents.enabled is enabled in about:config.
    • When opening pages saved with the option "Images > group duplicate images together" enabled, some duplicate images might not be displayed. It is recommended to disable this option.
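
The file-name rule listed under "All browsers" above (certain characters are replaced with _) can be illustrated with a short sketch. This is not SingleFile's own code: the function name sanitizeFileName and its parameters are assumptions, and only the list of characters comes from the text above.

```javascript
// Characters the list above says are replaced by default.
const FORBIDDEN_CHARS = ["~", "+", "\\", "?", "%", "*", ":", "|", "\"", "<", ">"];

// Hypothetical helper: replaces every forbidden character with the replacement.
// Passing a different "forbidden" list mimics changing the hidden option.
function sanitizeFileName(name, forbidden = FORBIDDEN_CHARS, replacement = "_") {
  const forbiddenSet = new Set(forbidden);
  return Array.from(name, (ch) => (forbiddenSet.has(ch) ? replacement : ch)).join("");
}

console.log(sanitizeFileName("What is 50% of 10? <draft>.html"));
```

Passing a shorter list as the second argument keeps more characters, at the cost of file names that may be rejected by some operating systems or file systems.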

Troubleshooting unknown issues

Please follow these steps if you find an unknown issue:

  • Save the page in incognito.
  • If saving the page in incognito did not fix the issue, reset SingleFile options.
  • If resetting options did not fix the issue, restart the browser.
  • If restarting the browser did not fix the issue, try to disable all other extensions to see if there is a conflict.
  • If there is a conflict then try to determine against which extension(s).
  • Please report the issue with a short description on how to reproduce it here: https://github.com/gildas-lormeau/SingleFile/issues.

Command Line Interface (SingleFile CLI)

You can save web pages to HTML from the command line interface. See here for more info: https://github.com/gildas-lormeau/single-file-cli.

Integration with user scripts

You can execute a user script just before (and after) SingleFile saves a page. For more info, see https://github.com/gildas-lormeau/SingleFile/wiki/How-to-execute-a-user-script-before-a-page-is-saved.

File format comparison

HTML Self-extracting ZIP MHTML Webarchive (Safari) HTML+folder
Pages are saved as a single file
HTML and styles are minified
Unused HTML and styles are removed from files
Binary resources are not encoded in base 64
Files are compressed
Files can be viewed without installing any extension ✓¹ ✓² ✓³
Files can be viewed without running JavaScript
Files can be unzipped to extract page resources n/a
Files contain the text of the page (plain or formatted) which can be indexed ✓⁴

Footnotes:

¹ When using the "universal" self-extracting file format.

² Only in Chromium-based browsers, and Internet Explorer.

³ Only in Safari.

⁴ An option must be enabled in the extension.

Projects using/compatible with SingleFile

Privacy Policy

See https://github.com/gildas-lormeau/SingleFile/blob/master/privacy.md

Contributors

Code derived from third party projects

Icons

License

SingleFile is licensed under AGPL. Code derived from third-party projects is licensed under MIT. Please contact me at gildas.lormeau <at> gmail.com if you are interested in licensing the SingleFile code for a commercial service or product.

Suggestions are welcome :)

singlefile's People

Contributors

arvindsv, baloe, bannmann, blackspirits, cristianofromagio, dependabot[bot], egor-duda, fastbyte01, fgrehm, fletcherhaz, fregante, frostblazergit, geek-prince, gildas-lormeau, jakseb, kamil-cy, krasnayaploshchad, mikaelmorvan, mikkovedru, musabgultekin, perdolka, pirate, rstp14, scruel, shitennouji, solokot, strel, totalcaesar659, xesarni, yfdyh000

singlefile's Issues

Stability problems in Chromium at Linux

Hi,
I am using Arch Linux 64-bit with Chromium 19.0.1084.46. I have tried changing the settings, but I have problems with every combination, for example:

  1. If I use the "display save banner" option, I get the banner above the original page after processing finishes, but without text: it is all blank.

  2. If I use the "display save notification" option, I get the notification window with a link inside that only works sometimes.

So I display nothing and keep using Ctrl+S to save the page after it has been processed by SingleFile, but it seems a bit odd to have to use Chromium's built-in "Save as" for this, and I get the option to process in background.

SingleFile not working behind a proxy

During the day I'm working at office behind a proxy.
I could install SingleFile to a Chrome standalone version and "SingleFile Core 0.3.6". Unfortunately:

  • the icon is grey,
  • the options are not saved.

Can you see what's wrong?

css data urls confuse SingleFile

The following page can't be saved with SingleFile:

<link type="text/css" rel="stylesheet" href="data:text/css; charset=utf-8,div {width:100%}">
<div>Food</div>

An exception is thrown in decodeDataURI when trying to decode "div {width:100%}"
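
A plausible cause, stated here as an assumption rather than a diagnosis of SingleFile's code, is that the payload "div {width:100%}" contains a "%" that is not followed by two hexadecimal digits, so a strict decodeURIComponent call throws a URIError. The sketch below shows one tolerant way to decode such a data URI in Node.js. The function decodeDataURI here is a hypothetical stand-in written for this example, not the extension's function of the same name.

```javascript
// Tolerant data URI decoder (illustrative sketch, not SingleFile's implementation).
function decodeDataURI(uri) {
  const match = /^data:([^,]*),([\s\S]*)$/.exec(uri);
  if (!match) {
    throw new Error("Not a data URI");
  }
  const [, meta, payload] = match;
  const params = meta.split(";").map((part) => part.trim());
  const mediaType = params[0] || "text/plain";
  const isBase64 = params.slice(1).some((part) => part.toLowerCase() === "base64");
  if (isBase64) {
    return { mediaType, text: Buffer.from(payload, "base64").toString("utf8") };
  }
  let text;
  try {
    // Strict decoding works when every "%" starts a valid escape.
    text = decodeURIComponent(payload);
  } catch (error) {
    // Fallback: decode only well-formed %XX runs and keep stray "%" as-is.
    text = payload.replace(/(?:%[0-9a-fA-F]{2})+/g, (sequence) => {
      try {
        return decodeURIComponent(sequence);
      } catch {
        return sequence;
      }
    });
  }
  return { mediaType, text };
}

console.log(decodeDataURI("data:text/css; charset=utf-8,div {width:100%}"));
```

The fallback trades strictness for robustness: a page author who writes an unescaped "%" in a stylesheet still gets the stylesheet text back unchanged.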

Unable to save page - Display Save Banner not shown

Hello,
please go to "http://www.suntension.com/de/home.htm" and try to save any page there with Single File. The Display Banner won't show (only a small white banner with no 'Save..." text in it is shown). I also tried disabling the "Display Save Banner" (which also automatically disables "process in background") - but if i then try to save the page in Chrome as a "single webpage" it also doesn't save the content of the page.
I can remember that there was an option of single file in an earlier version to "show a popup save dialog" - why is this option not available anymore - this maybe could solve the problems of not showing up the save banner (there are a few sites more that will not show the save banner after using single file button, but i don't remember the links at the moment)
Greetings from Germany

Local file usage

Hello,
why is it not possible to process locally stored html files? Do you plan to add this feature? It would be ideal use case as a lot of other programs allow html export of content, but images and styles are as separate files and not contained in one file. Thank you.

Does not work in Incognito Mode

Congratulations for the wonderful extension. I instantly fell in love with it and that rarely happens for me. One thing I have observed, which is hampering my use of the extension is that it does not work in Incognito Mode.

After doing all the background processing and popping up the save dialog, the associated item in the download bar says Failed - No file (as shown in the screenshots below) and nothing gets saved. The only workaround (if it can be called as such) I have found is to open the page in normal mode (something I didn't want to do initially) and then save from there.

Additional Details

  • Google Chrome Version 36.0.1985.125 m
  • Microsoft Windows 7 Home Premium Service Pack 1 (32-bit)

Why not file:/// URLs?

I see in the Known Issues for SingleFile that file:/// URLs cannot be processed for security reasons. I'm curious about those reasons - how would it give me access to things I wouldn't already have access to?

I was hoping to use SingleFile to make periodic local archives of some web pages created with Org Mode, but they don't exist on a server anywhere.

Edit Filename does not work

Love the extension !!

I can't get the edit filename feature to work during the processing procedure - any tips ?

Also is there any way of disabling Chrome's annoying security keep/discard banner ?

Cheers

HTTPS support

SingleFile fails saving webpages over HTTPS protocol.

Support Firefox

It would be nice to have this available as a Firefox add-on

Disable keyboard shortcut

Hi there!

I'm currently having issues with the control+alt+shift+S shortcut and found no way to disable it.
How can I?

PS: it is used by Netflix

Does not work on WordPress posts

I tried to use this on a single post on a wordpress page, it only gives the wordpress sites' 404 page as the generated HTML file

Disqus not compatible (blank page)

disqus.com
and all web sites using it (like filehippo.com or tomshw.it for example)
If I try to save the page, all I get is a blank space where there are user comments.
please fix it.

missing img data

Latest Chrome Version 67.0.3396.99 (64-bit), Windows 10, latest version of SingleFile.

I have a local file with an img tag like this:

<img src="../images/Cyber Threat Controls Ontology.png" alt="Cyber Threat Controls Ontology.png" />

The image appears, and I have enabled "Allow access to file URLs" for this extension.

When I save it with SingleFile, the img data is missing:

<img src="data:base64," alt="Cyber Threat Controls Ontology.png">

This is a regression, there was no such problem in previous versions.

Thanks in advance for fixing this!

.html !!!

Please, how I can configure my SingleFile plugin to save a page with "html" instead of "htm" extension? "html" is international standard, while "htm" is a Microsoft fad.

Aww Snap!

Cannot process almost all pages.
It said Aww Snap!

Google Chrome 18.0.1017.2 (Official Build 118867)
OS Windows 7 x64
WebKit 535.19 (@105663)
JavaScript V8 3.8.7.1
Flash 11,2,202,160
User Agent Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1017.2 Safari/535.19
Command Line "C:\Users\Rizki Kurniawan\appdata\local\google\chrome\application\chrome.exe" --flag-switches-begin --flag-switches-end http://www.modemberry.com/huawei/ec1260.php?mdn=3616017374&meid=a000002e4dfe3b
Executable Path C:\Users\Rizki Kurniawan\appdata\local\google\chrome\application\chrome.exe
Profile Path C:\Users\Rizki Kurniawan\AppData\Local\Google\Chrome\User Data\Default

Sometimes it is not saved address of the page

Why is the page address (URL) not always saved? It is very important to me.
If I process and save the same page again, the address is saved successfully.
P.S. Thank you very much for SingleFile. I like it even more than the MHTML format.

(Only on facebook) SingleFile Core has crashed

I hadn't used SingleFile for a long time, but the last few times I did, this problem appeared.
I just want to make it clear that I have this problem with the facebook.com website only.

The download completes to 100%, but the yellow bar at the top doesn't appear. After about 10 to 20 seconds, I get this notification:
SingleFile Core has crashed, click here to reload.
I checked whether the Core was running and found that it had really stopped working and crashed.

I've deleted it, reinstalled it, and tested it again, with no change. I've also tested it in Chrome's incognito mode, but the same notification kept showing.

Chrome V: 33.0.17

[Feature Request] pre-parse regex replace, preferably on a domain/site basis

This would allow removing junk such as:

<meta ...>

<style type="text/css"></style>

and also configuring responsive pages before saving. For example, these 5 replacement sets for an udemy.com page would expand all the sections:

<span class="js-wrapper-simple-collapse-more-btn"> <button
class="js-simple-collapse-more-btn btn-sm btn btn-link">+
View More</button> </span>

(replace with null - remove the whole thing)

---

<div class="header-right"> <span class="js-toggle-all">
<div class="collapsed-text"> <a class="sections-toggle"
user-tracker-click="" data-user-tracker-schema="action-logs"
data-user-tracker-object-id="1791224" data-user-tracker-action="full-curriculum-read"
data-user-tracker-user-id=""> Expand All </a> </div>

<div class="header-right"> <span class="js-toggle-all js-toggle-all--expanded">
<div class="expanded-text"> <a class="sections-toggle">
Collapse All </a> </div>

---

in"> <span class="lecture-title-toggle-minus">–</span>

in" aria-expanded="true"> <span class="lecture-title-toggle-minus">–</span>

---

<div class="lectures-container collapse js-curriculum-collapse lecture-1791224-1 in">

<div class="lectures-container collapse js-curriculum-collapse lecture-1791224-1 in" aria-expanded="true">

---

"><span class="lecture-title-toggle-plus">+

collapse in"\aria-expanded="true" style=""> <span class="lecture-title-toggle-minus">–
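
The request above amounts to a table of search-and-replace rules keyed by site and applied to the HTML before it is saved. A minimal sketch of that idea follows. Everything in it is hypothetical: the RULES table, the applyRules function, and the choice of hostname as the key are assumptions made for illustration, and the two rules are simplified versions of the replacements listed in the issue.

```javascript
// Hypothetical rule table: hostname -> ordered list of regex replacements.
const RULES = {
  "www.udemy.com": [
    // Remove empty style elements.
    { pattern: /<style type="text\/css"><\/style>/g, replacement: "" },
    // Switch a wrapper from its collapsed to its expanded class.
    { pattern: /class="collapsed-text"/g, replacement: 'class="expanded-text"' }
  ]
};

// Applies the rules registered for the page's hostname, in order.
// Pages from hosts without rules are returned unchanged.
function applyRules(html, pageURL, rules = RULES) {
  const host = new URL(pageURL).hostname;
  return (rules[host] || []).reduce(
    (result, { pattern, replacement }) => result.replace(pattern, replacement),
    html
  );
}

const sample = '<style type="text/css"></style><div class="collapsed-text">Expand All</div>';
console.log(applyRules(sample, "https://www.udemy.com/course/example/"));
```

Note that in String.prototype.replace a "$" in the replacement string has special meaning, so replacement text containing "$" would need escaping (as "$$") or a replacer function.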

[Feature request] Ability to embed items manually

The Save Page WE extension has more flexible options to choose which items can be saved into single HTML, such as CSS images, web fonts, etc. If you get this extension, then you will see it when you open their options at add-on manager.
It would be nice if SingleFile can implement similar functionality.

SingleFile does not work on local files

I love SingleFile, there's only one use-case I wish it would deal with.

Some programs when exporting to html will export to a html file and then store resources separately. SingleFile does not handle making a single-file from local files.

I don't know if it's by design or a bug, but please consider fixing.

Cannot save file name

It looks like it should be possible to edit the file name from the save banner.

This does not work.

Editing the file name has no effect.

Am I using it wrongly ?

Thanks

[Feature Request] support for page source ="file:"

I have installed SingleFile on Macintosh Chrome. It works some of the time. Some attempts to use it result in a "download blocked" on the right-hand side of top of screen URI window. Closing the tab and starting over with a new tab seems to solve the problem.

Unfortunately SingleFile does not completely meet my needs. It has one major advantage over the browser command "Save Page As..." raw HTML: SingleFile saves the current HTML rather than the original HTML. (Macintosh Chrome under some mysterious condition will save the current HTML rather than the original.) My first problem is that SingleFile only works with HTTP or HTTPS files. When the page source is "file://", SingleFile won't process the page. Perhaps the documentation should mention this.

There is one aspect which I consider to be a bug.
Given a link of the form <a href="#anchorname">text</a>, the link is modified to
<a href="sourcefile#anchorname">text</a>
However this is wrong because the anchorname might not exist in the original file.
IMHO modifying a local link is always wrong.
Macintosh Chrome version: Version 67.0.3396.99 (Official Build) (64-bit)
Cheers,
Roger D. Moore
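
The behaviour requested in this issue, leaving fragment-only links alone while still resolving other relative links, can be sketched with the standard URL class. The function name resolveHref and the example URLs are hypothetical. This illustrates the suggestion and is not a description of how SingleFile rewrites links.

```javascript
// Resolves an href against the page URL, but keeps fragment-only links
// (such as "#anchorname") untouched so they keep pointing inside the saved file.
function resolveHref(href, baseURL) {
  if (href.startsWith("#")) {
    return href;
  }
  return new URL(href, baseURL).href;
}

const pageURL = "https://example.com/docs/page.html"; // hypothetical source page
console.log(resolveHref("#anchorname", pageURL));
console.log(resolveHref("other.html#top", pageURL));
```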

menuToggle for html menus and javascript not preserved

menuToggle buttons and the like for collapsing HTML menus do not work in the generated single file (clicking them doesn't trigger the event listeners). Also, it seems that JavaScript (contained in the <script> tags) is not kept in the single file (though it is simple enough to paste it into the file manually).

Is there a way to do this, or can these features become available in future versions of SingleFile? Having JavaScript and menu CSS actions come to life would make the single file much more professional.

License

Just discovered this plugin and find it very useful. Thanks for maintaining it!
Can you post a clear license?

please add this for reducing memory use

First off thanks for this wonderful extension.

Please read this link and add these api and rules to your extension to make it use less memory, many extensions now use it and the result is that chrome uses much less memory:
Please also can you remove access of the extension to Your tabs and browsing activity and Your data on all websites for privacy reasons?

http://www.ghacks.net/2012/06/22/chrome-extensions-eventually-will-use-less-memory-than-before/

http://blog.chromium.org/2012/06/put-your-extensions-on-diet-with-event.html

http://developer.chrome.com/stable/extensions/event_pages.html

Thank you again for this wonderful extension

[Feature Request] DataURI Deduplication

Hello again. The data URIs should be reducible via CSS/JavaScript rules, and I believe this would greatly improve the size of the file, memory usage, and speed of capture.
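
As a rough illustration of the first step such a feature would need, the sketch below counts data URIs that occur more than once in a saved page. The function name findDuplicateDataURIs is hypothetical, and the regular expression only looks at double-quoted attribute values, which is a deliberate simplification. How duplicates would then be shared (for example through a CSS rule) is left open.

```javascript
// Returns a Map of data URI -> number of occurrences, limited to URIs
// that appear more than once in double-quoted attribute values.
function findDuplicateDataURIs(html) {
  const counts = new Map();
  const pattern = /"(data:[^"]+)"/g;
  for (const [, uri] of html.matchAll(pattern)) {
    counts.set(uri, (counts.get(uri) || 0) + 1);
  }
  return new Map([...counts].filter(([, count]) => count > 1));
}

// Hypothetical page with the same (tiny, fake) image embedded twice.
const page =
  '<img src="data:image/png;base64,AAAA">' +
  '<img src="data:image/png;base64,AAAA">' +
  '<img src="data:image/png;base64,BBBB">';
for (const [uri, count] of findDuplicateDataURIs(page)) {
  console.log(count + " copies of a " + uri.length + "-character data URI");
}
```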

Chrome Save As... no longer works

This is known, as it is commented on the Chrome store description. But it is a huge issue... if you want to save the page into a file system location, you need to use the Save As to get a file dialog. Using the new Single File method always puts the file into the Downloads. Yuk.

new version 0.3 does not work with gmail

hi there, i'm a great fan of singlefile, i've been using it for over a year. however, the new version 0.3 does not work with gmail.

normally i save single gmail web emails by clicking on 'Print' in gmail, which opens up 2 new tabs, the email itself and another tab showing print options. then i use singlefile on the email, and 'save as'.

in the new version 0.3 singlefile, when i click on singlefile it works. but when i try to save it, it does not work:

  1. when click on the banner, it saves a webpage but with an extra folder containing all the images
  2. when i uncheck 'display save banner' and 'display save notification' (according to your instructions), then click 'save as' in chrome, it saves a 0 kb file.

i have also tried looking for a place to reinstall the previous version of singlefile (0.2.33) but i cannot find this.

please help!! :) thank you for creating a great product :D

yy

sent 2011-11-21

[Feature request] Saving from Android

I have just installed SingleFile in Firefox for Android, but it doesn't work at all when I press Save Page from the menu. It would be nice if this add-on allowed saving a single HTML file from Firefox for Android.

Need option to add source URL link to saved page

There is a frequent situation when I want to browse to original URL from which page has been saved with SingleFile.

It will be very nice if only we can have source URL link added to saved page in some way (at the top or bottom).

Thanks!

Date in new file name

Is there any way I can omit the date inserted in the file name? It is making my file names too long.

Processed page cannot be saved.

Reproduce steps:

  1. Choose "process page" from the right-click menu;
  2. Click save prompt bar as below:
    save page

Result: The prompt bar disappeared and nothing got saved.

If I right click the prompt bar and choose "open in a new tab". I can see the page (probably the processed one) with strange URL like this:

blob:chrome-extension://mpiodijhokgodhhofbcjdecpffjipkle/7d2e641e-2306-43bf-9edb-9a552e3b31e1

I am using:

Chrome 65.0.3325.181 64bit version
Windows 10 version 1607.

suddenly stopped saving to download folder (or anywhere else)

I've been using singlefile for over a year. Great extension! I run Windows 7, Chrome browser.
Just today, for the first time, the file is processed, the banner presents, I click Download, and nothing gets downloaded. The banner goes away as usual. Searching the whole machine can't find the target files.
(I can't uncheck "process in background" as it is greyed out and checked.)
Apologies if I'm overlooking something silly.
Thanks!
