Gwtar: A New HTML Archival Format That Solves the Static, Single-File, Efficient Trilemma

By theblazehen · 3mo ago · 24 min read · en · Insight

Summary

Gwtar is an experimental new HTML archival format that solves the trilemma of creating a format that is simultaneously static (self-contained with all assets), single-file (zero additional files), and efficient (lazy-loads assets only as needed). The format uses JavaScript in the HTML header to make HTTP range requests, allowing browsers to efficiently load only necessary portions of large HTML archives while maintaining a single, self-contained file structure. It's currently used on Gwern.net for serving large HTML archives.

Key quotes
Archiving HTML files faces a trilemma: it is easy to create an archival format which is any two of 'static' (self-contained ie. all assets included, no special software or server support), 'single-file' (when stored on disk, zero additional files or modifications), and 'efficient' (lazy-loads assets only as necessary to display to a user), but no known format allows all 3 simultaneously.
We introduce an experimental new format, Gwtar (logo; pronounced 'guitar', .gwtar.html extension), which achieves all 3.
Gwtar is a new polyglot HTML archival format which provides a single, self-contained, HTML file which still can be efficiently lazy-loaded by a web browser.
This is done by a header's JavaScript making HTTP range requests. It is used on Gwern.net to serve large HTML archives.
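The range-request idea in the quotes above can be illustrated with a small sketch. It is not Gwtar's actual layout or code: the index of byte offsets, the helper names (`build_archive`, `range_header`, `serve_range`, `fetch_asset`), and the in-memory "server" are all assumptions made for illustration. In a real browser, the header's JavaScript would send the `Range` header over HTTP; here the same arithmetic is simulated in Python so it can run offline.

```python
from typing import Dict, Tuple

# Hypothetical index: asset name -> (byte offset, length) inside the archive.
# The source does not describe how Gwtar stores this; it is an assumption.
AssetIndex = Dict[str, Tuple[int, int]]


def build_archive(header_html: bytes, assets: Dict[str, bytes]) -> Tuple[bytes, AssetIndex]:
    """Concatenate an HTML header and its assets into one blob, recording offsets."""
    index: AssetIndex = {}
    blob = bytearray(header_html)
    for name, data in assets.items():
        index[name] = (len(blob), len(data))
        blob += data
    return bytes(blob), index


def range_header(offset: int, length: int) -> str:
    """Build an HTTP Range header value for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends, hence the -1.
    return f"bytes={offset}-{offset + length - 1}"


def serve_range(archive: bytes, header: str) -> bytes:
    """Simulate a server answering a single 'bytes=start-end' range request."""
    unit, _, spec = header.partition("=")
    if unit != "bytes":
        raise ValueError("only the 'bytes' range unit is supported in this sketch")
    start_s, _, end_s = spec.partition("-")
    start, end = int(start_s), int(end_s)
    return archive[start:end + 1]


def fetch_asset(archive: bytes, index: AssetIndex, name: str) -> bytes:
    """Fetch only the bytes of one asset, leaving the rest of the file untouched."""
    offset, length = index[name]
    return serve_range(archive, range_header(offset, length))


if __name__ == "__main__":
    archive, index = build_archive(
        b"<html>header</html>",
        {"a.png": b"PNGDATA", "b.css": b"body{}"},
    )
    print(range_header(*index["b.css"]))
    print(fetch_asset(archive, index, "b.css"))
```

The point of the design is that the file stays a single blob on disk, while a reader only pays for the byte ranges of the assets it actually displays.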