Capture web pages to a local device or a backend server for future retrieval, organization, annotation, and editing.
WebScrapBook is a browser extension that captures web pages faithfully in various archive formats with customizable configurations, for future retrieval, organization, annotation, and editing. This project inherits from the legacy Firefox add-on ScrapBook X.
1. Capture faithfully: A web page shown in the browser can be captured without losing any subtle detail. Metadata such as source URL and timestamp are also recorded.
2. Customizable capture: WebScrapBook can save a selected area of a page, save the source page (before it is processed by scripts), or save a page as a bookmark. How images, audio, video, fonts, frames, styles, scripts, etc. are captured is also customizable. A web page can be saved as a folder, a ZIP-based archive file (HTZ or MAFF), or a single HTML file.
3. Organizable collections: Captured pages can be organized in the browser sidebar using one or more "scrapbooks". A scrapbook holds a hierarchical tree structure to organize data items, and can be further indexed for a feature-rich search (using a combination of title, full-text keywords, custom comments, source URL, or other metadata). (*)
4. Page editing: A web page can be highlighted, annotated, or edited before or after a capture. You can additionally create and manage notes in HTML or Markdown format. (*)
5. Remote access: Captured data can be hosted with a central backend server and be read or edited from other devices. Alternatively, a static site index can be generated for a scrapbook, which can therefore be hosted on a shared web server that doesn't support dynamic web hosting. (*)
6. Mobile support: WebScrapBook supports mobile browsers such as Firefox for Android and Kiwi Browser. You can capture and edit web pages from a mobile phone or tablet.
7. Legacy ScrapBook support: Scrapbooks created with legacy ScrapBook or ScrapBook X can be converted into a WebScrapBook-compliant format for use. (*)
* All or part of the functionality of a starred (*) feature above requires a running collaborating backend server, which can be easily set up using PyWebScrapBook.
* An HTZ or MAFF archive file can be viewed using the built-in archive page viewer, with PyWebScrapBook or other assistant tools, or by opening the index page after unzipping.
* For further information and frequently asked questions, visit the documentation wiki: https://github.com/danny0838/webscrapbook/wiki/Intro
* For discussion, please open an issue in the source repository: https://github.com/danny0838/webscrapbook/issues
* Donate to support us if you find this tool helpful: https://www.paypal.me/danny0838/5usd
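
As noted above, an HTZ or MAFF archive is ZIP-based and can be read by unzipping it and opening its index page. The following is a minimal Python sketch of that manual route, not part of WebScrapBook or PyWebScrapBook. The helper name `extract_archive` is hypothetical, and the sketch assumes the entry page is named `index.html` and sits either at the top level of the archive or one folder down; archives laid out differently would need a different lookup.

```python
import zipfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unzip a ZIP-based archive and return the index pages found in it.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(dest)

    # Assumption: the entry page is called index.html and is located
    # either at the archive root or inside a single top-level folder.
    top_level = sorted(dest.glob("index.html"))
    one_level_down = sorted(dest.glob("*/index.html"))
    return top_level + one_level_down
```

Each returned path can then be opened in a browser, for example with `webbrowser.open(path.as_uri())` from the standard library. Only extract archives from sources you trust.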