<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	    xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	     xmlns:dc="http://purl.org/dc/elements/1.1/"
	   xmlns:atom="http://www.w3.org/2005/Atom"
	     xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	  xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>
<channel>
	<title>compression &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/compression/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Mon, 17 Nov 2025 07:16:17 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>compression &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Would adding Brotli Compression help shrink ePubs?]]></title>
		<link>https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/</link>
					<comments>https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sat, 26 Jul 2025 11:34:35 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[epub]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=62085</guid>

					<description><![CDATA[The ePub format is the cross-platform way to package an eBook. At its heart, an ePub is just a bundled webpage with extra metadata - that makes it extremely easy to build workflows to create them and apps to read them.  Once you&#039;ve finished authoring your ePub, you&#039;ve got a folder full of HTML, CSS, metadata documents, and other resources.  The result is then stored in a standard Zip file and is…]]></description>
										<content:encoded><![CDATA[<p>The ePub format is the cross-platform way to package an eBook. At its heart, an ePub is just a bundled webpage with extra metadata - that makes it extremely easy to build workflows to create them and apps to read them.</p>

<p>Once you've finished authoring your ePub, you've got a folder full of HTML<sup id="fnref:x"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fn:x" class="footnote-ref" title="OK! It is actually XHTML, but let's not quibble." role="doc-noteref">0</a></sup>, CSS, metadata documents, and other resources.  That folder is then stored in a standard Zip file and renamed to <code>.epub</code>.  This packaging is known as the <a href="https://www.w3.org/TR/epub-33/#sec-ocf">Open Container Format</a> (OCF).</p>

<p>There are actually a few different compression schemes for Zip files, but <a href="https://www.w3.org/TR/epub-33/#sec-zip-container-zipreqs">the specification says</a>:</p>

<blockquote><p>OCF ZIP containers MUST include only stored (uncompressed) and Deflate-compressed ZIP entries within the ZIP archive.</p></blockquote>

<p>The Deflate algorithm is venerable<sup id="fnref:old"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fn:old" class="footnote-ref" title="That's a fancy way of saying &quot;old&quot;." role="doc-noteref">1</a></sup> and, while incredible for its time, has been superseded by more modern compression schemes. For example, <a href="https://brotli.org/">Brotli</a>.</p>

<p>What happens if we unzip an ePub and then recompress it with Brotli?  Will that dramatically reduce the file size?</p>

<h2 id="steps"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#steps">Steps</a></h2>

<ul>
<li>Unzip the book

<ul>
<li><code>unzip book.epub -d book/</code></li>
</ul></li>
<li>Brotli files can't contain directories, so tar the directory without any compression

<ul>
<li><code>tar -cvf book.tar book/</code></li>
</ul></li>
<li>Create a Zip file with maximum compression

<ul>
<li><code>zip -9 book.tar.zip book.tar</code></li>
</ul></li>
<li>Create a Brotli file with maximum compression

<ul>
<li><code>brotli -k -q 11 book.tar</code></li>
</ul></li>
</ul>
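<p>For anyone who would rather script those steps, here is a rough Python sketch of the same experiment. It is an approximation rather than the exact commands above: <code>zlib</code> at level 9 stands in for <code>zip -9</code>, the tar is built in memory, and the function name <code>compare_epub</code> is made up for illustration. The <code>brotli</code> module is a third-party package which is assumed to be installed; without it, only the Deflate figure is reported.</p>

```python
import io
import tarfile
import zipfile
import zlib

try:
    import brotli  # third-party package; assumed to be installed separately
except ImportError:
    brotli = None


def compare_epub(epub_bytes: bytes) -> dict:
    """Unpack an ePub, tar its files uncompressed, then compress the tar."""
    tar_buffer = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as epub, \
         tarfile.open(fileobj=tar_buffer, mode="w") as tar:
        for name in epub.namelist():
            if name.endswith("/"):
                continue  # directory entries carry no data
            payload = epub.read(name)
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    # Both archives are closed here, so the tar is complete.
    tar_bytes = tar_buffer.getvalue()

    # "contents" is the size of the tar, which includes tar's own headers.
    sizes = {
        "contents": len(tar_bytes),
        "deflate": len(zlib.compress(tar_bytes, 9)),
    }
    if brotli is not None:
        sizes["brotli"] = len(brotli.compress(tar_bytes, quality=11))
    return sizes
```

<p>Calling <code>compare_epub(open("book.epub", "rb").read())</code> returns the sizes in bytes. Expect small differences from the command line tools, because tar and Zip each add their own headers.</p>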

<h2 id="results"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#results">Results</a></h2>

<p>I took a random(ish) sample from <a href="https://standardebooks.org/">Standard eBooks</a> and a few from my personal stash<sup id="fnref:auto"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fn:auto" class="footnote-ref" title="I couldn't be bothered automating this. Go ahead and run it on every ePub if you want something more representative." role="doc-noteref">2</a></sup>.</p>

<table>
<thead>
<tr>
  <th align="right"></th>
  <th align="right">Book 1</th>
  <th align="right">Book 2</th>
  <th align="right">Book 3</th>
  <th align="right">Book 4</th>
</tr>
</thead>
<tbody>
<tr>
  <td align="right">Contents</td>
  <td align="right">768KB</td>
  <td align="right">911KB</td>
  <td align="right">389KB</td>
  <td align="right">594KB</td>
</tr>
<tr>
  <td align="right">Deflate</td>
  <td align="right">250KB</td>
  <td align="right">248KB</td>
  <td align="right">103KB</td>
  <td align="right">175KB</td>
</tr>
<tr>
  <td align="right">Brotli</td>
  <td align="right">190KB</td>
  <td align="right">187KB</td>
  <td align="right">82KB</td>
  <td align="right">137KB</td>
</tr>
</tbody>
</table>

<p>The good news is that ePubs compress pretty well already! That isn't much of a surprise - compression algorithms love the repetitious nature of HTML and human-readable text.  Obviously Brotli is better - roughly 20% to 25% smaller than Deflate on these books - but, on the file sizes we're talking about, not <em>dramatically</em> better. Saving 20KB to 60KB is OK - but in a world of terabyte-sized SD cards does it matter?</p>

<p>Brotli is also computationally harder to decompress, which makes it slightly less attractive for low-powered eReaders.</p>

<p>It's also possible to make a small saving by reducing the complexity and verbosity of the CSS and HTML.</p>

<p>However, that's not the <em>real</em> problem.</p>

<h2 id="i-lied-to-you"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#i-lied-to-you">I lied to you</a></h2>

<p>An ePub contains more than just text and text-based metadata. It can contain web fonts, images, even music.  The above books had all their fonts and media stripped out.  Let's run the experiment again but, this time, including <em>everything</em> in the original book.</p>

<table>
<thead>
<tr>
  <th align="right"></th>
  <th align="right">Book 1</th>
  <th align="right">Book 2</th>
  <th align="right">Book 3</th>
  <th align="right">Book 4</th>
</tr>
</thead>
<tbody>
<tr>
  <td align="right">Contents</td>
  <td align="right">23MB</td>
  <td align="right">3.8MB</td>
  <td align="right">0.76MB</td>
  <td align="right">0.93MB</td>
</tr>
<tr>
  <td align="right">Deflate</td>
  <td align="right">22MB</td>
  <td align="right">1.7MB</td>
  <td align="right">0.46MB</td>
  <td align="right">0.51MB</td>
</tr>
<tr>
  <td align="right">Brotli</td>
  <td align="right">22MB</td>
  <td align="right">1.5MB</td>
  <td align="right">0.43MB</td>
  <td align="right">0.47MB</td>
</tr>
</tbody>
</table>

<p>All of a sudden, Brotli makes next to no difference. Yes, the textual compression is still there, but it is overshadowed by the huge cost of the media files.</p>

<h2 id="mixed-media"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#mixed-media">Mixed Media</a></h2>

<p>The <a href="https://www.w3.org/TR/epub-33/#sec-core-media-types">ePub 3.3 specification lays out which multimedia formats are acceptable</a>. As well as older formats like GIF, PNG, and JPEG, newer formats like WebP are acceptable. Similarly, TTF fonts are listed in the standard along with WOFF2.</p>

<p>Modern image and font formats have better compression than their ancestors. Indeed, WOFF2 uses Brotli as its compression scheme.</p>

<p>The biggest filesize saving in ePubs comes from properly compressing images and fonts.</p>

<h2 id="can-you-picture-that"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#can-you-picture-that">Can You Picture That?</a></h2>

<p>It is a matter of opinion as to what resolution is best suited to an ePub. Most modern eReaders have, at best, 300ppi resolution. They're also normally monochrome. But eBooks aren't always read on low-resolution, black and white eInk screens - so it probably makes sense to have high-resolution colour images in order to future-proof books.</p>

<p>But the <em>compression</em> of those images is <em>not</em> a matter of opinion. Lossless compression algorithms are well supported for legacy and modern image formats.</p>

<p>Let's take a specific example.  <a href="https://standardebooks.org/ebooks/jane-addams/twenty-years-at-hull-house/downloads/jane-addams_twenty-years-at-hull-house_advanced.epub">Twenty Years at Hull House</a> is the 22MB book above. Less than a MB of that is for text, the rest is images.</p>

<p>The largest illustration in the book is a 1937x1971, transparent PNG weighing in at 1MB.  Increasing the lossless compression level takes it down to 840KB. Reducing the palette to something more suitable takes it to 640KB. If you were releasing this as an ePub 3.3 file, using WebP would take the image to a hair over 600KB.</p>

<p>Basically, a 20%-40% filesize reduction with no loss of fidelity.</p>

<p>Across all the PNG images in the ePub, I was able to easily get the filesize from 20MB to 16MB.</p>

<p>Converting to lossless WebP got it down to 13MB.</p>

<h2 id="what-the-font"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#what-the-font">What The Font?</a></h2>

<p>Fonts can be shrunk in a number of ways.  The most obvious way is to compress to WOFF2 which, as described above, uses Brotli compression.</p>

<p>Based on my quick tests, a typical ePub's TTF will see about a 50% reduction in file size when compressed to WOFF2. For typical "English" language fonts, that's a reduction from 30KB to 15KB. So a big relative saving, but a small absolute one.</p>

<p>Complex decorative fonts can go from 800KB to 80KB. But it is rare for a font to exceed a megabyte.</p>

<p>If it does, that usually means that it has more glyphs than strictly necessary.  If your book is written entirely in the Latin alphabet, do you really need all those fancy accents, Chinese ideographs, and emoji? Probably not.</p>

<p>I've previously written about <a href="https://shkspr.mobi/blog/2013/05/subsetting-chinese-fonts/">Subsetting Fonts</a> and the perils of <a href="https://shkspr.mobi/blog/2015/11/premature-subsetting-of-web-fonts/">excessive trimming</a>.</p>

<h2 id="back-to-basics"><a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#back-to-basics">Back to Basics</a></h2>

<p>Brotli is magic - but changing the compression algorithm for the ePub standard is probably a false economy. The text portion of modern eBooks is already fairly small and compresses with reasonable efficiency.</p>

<p>The best compression gains come from either using next-generation image and font formats or, if legacy compatibility is necessary, using the most aggressive compression settings for traditional images.</p>

<div id="footnotes" role="doc-endnotes">
<hr aria-label="Footnotes">
<ol start="0">

<li id="fn:x">
<p>OK! It is actually <strong>X</strong>HTML, but let's not quibble.&nbsp;<a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fnref:x" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:old">
<p>That's a fancy way of saying "old".&nbsp;<a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fnref:old" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:auto">
<p>I couldn't be bothered automating this. Go ahead and run it on every ePub if you want something more representative.&nbsp;<a href="https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/#fnref:auto" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/07/would-adding-brotli-compression-help-shrink-epubs/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Compressing Text into Images]]></title>
		<link>https://shkspr.mobi/blog/2024/01/compressing-text-into-images/</link>
					<comments>https://shkspr.mobi/blog/2024/01/compressing-text-into-images/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sat, 13 Jan 2024 12:34:11 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=49184</guid>

					<description><![CDATA[(This is, I think, a silly idea. But sometimes the silliest things lead to unexpected results.)  The text of Shakespeare&#039;s Romeo and Juliet is about 146,000 characters long. Thanks to the English language, each character can be represented by a single byte.  So a plain Unicode text file of the play is about 142KB.  In Adventures With Compression, JamesG discusses a competition to compress text…]]></description>
										<content:encoded><![CDATA[<p>(This is, I think, a silly idea. But sometimes the silliest things lead to unexpected results.)</p>

<p>The text of Shakespeare's Romeo and Juliet is about 146,000 characters long. Thanks to the English language, each character can be represented by a single byte.  So a plain Unicode text file of the play is about 142KB.</p>

<p>In <a href="https://jamesg.blog/2023/12/29/compression-adventures/">Adventures With Compression</a>, JamesG discusses a competition to compress text and poses an interesting thought:</p>

<blockquote><p>Encoding the text as an image and compressing the image. I would need to use a lossless image compressor, and using RGB would increase the number of values associated with each word. Perhaps if I changed the image to greyscale? Or perhaps that is not worth exploring.
</p></blockquote>

<p>Image compression algorithms are, generally, pretty good at finding patterns in images and squashing them down. So if we convert text to an image, will image compression help?</p>

<p>The English language and its punctuation are not very complicated, so the play only contains 77 unique symbols. The ASCII value of each character falls in the range 0 - 127. So let's create a greyscale image in which each pixel has the same greyness as the ASCII value of the character.</p>

<p>Here's what it looks like when losslessly compressed to a PNG:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2024/01/ascii_grey.png" alt="Random grey noise." width="512" height="277" class="aligncenter size-full wp-image-49360">

<p>That's down to 55KB! About 40% of the size of the original file. It is slightly <em>smaller</em> than ZIP, and about 9 bytes larger than Brotli compression.</p>

<p>The file can be read with the following Python:</p>

<pre><code class="language-python">from PIL import Image

# Each greyscale pixel value is the ASCII code of one character.
image  = Image.open("ascii_grey.png")
pixels = list(image.getdata())
text   = "".join(chr(pixel) for pixel in pixels)

# Write the recovered text back out.
with open("rj.txt", "w") as file:
    file.write(text)
</code></pre>
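<p>Going the other way, here is a minimal sketch of how such an image could be written using only Python's standard library. This isn't necessarily how the image above was made: the function name <code>text_to_grey_png</code>, the default width of 512 pixels, and the NUL padding of the final row are all assumptions made for illustration.</p>

```python
import struct
import zlib


def text_to_grey_png(text: str, width: int = 512) -> bytes:
    """Pack non-empty ASCII text into an 8-bit greyscale PNG, one character per pixel."""
    data = text.encode("ascii")                 # raises if the text is not pure ASCII
    height = -(-len(data) // width)             # ceiling division
    data = data.ljust(width * height, b"\x00")  # pad the last row with NUL pixels

    # Every PNG scanline starts with a filter-type byte; 0 means "no filter".
    rows = b"".join(
        b"\x00" + data[y * width:(y + 1) * width] for y in range(height)
    )

    def chunk(kind: bytes, body: bytes) -> bytes:
        # A chunk is: length, type, body, then a CRC over type + body.
        crc = zlib.crc32(kind + body)
        return struct.pack(">I", len(body)) + kind + body + struct.pack(">I", crc)

    # Width, height, bit depth 8, colour type 0 (greyscale), then three zero flags.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", zlib.compress(rows, 9))
        + chunk(b"IEND", b"")
    )
```

<p>Any padding pixels decode to NUL characters, so they would need stripping after reading the image back.</p>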

<p>But, even with the latest image compression algorithms, it is unlikely to compress much further; the image looks like random noise.  Yes, you and I know there is data in there. And a statistician looking for entropy would probably determine that the file contains readable data. But image compressors work in a different realm. They look for solid blocks, or predictable gradients, or other statistical features.</p>

<p>But there you go! A lossless image is a pretty efficient way to compress ASCII text.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/01/compressing-text-into-images/feed/</wfw:commentRss>
			<slash:comments>12</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[What's the smallest file size for a 1 pixel image?]]></title>
		<link>https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/</link>
					<comments>https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 05 Jan 2024 12:34:10 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[images]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=49135</guid>

					<description><![CDATA[There are lots of new image compression formats out there. They excel at taking large, complex pictures and algorithmically reducing them to smaller file sizes.  All of the comparisons I&#039;ve seen show how good they are at squashing down big files.  I wanted to go the other way. How good are modern codecs at dealing with tiny files?  Using GIMP, I created an image which was a single white pixel,…]]></description>
										<content:encoded><![CDATA[<p>There are lots of new image compression formats out there. They excel at taking large, complex pictures and algorithmically reducing them to smaller file sizes.  All of the comparisons I've seen show how good they are at squashing down big files.</p>

<p>I wanted to go the other way. How good are modern codecs at dealing with <em>tiny</em> files?</p>

<p>Using GIMP, I created an image which was a single white pixel, and saved it as a PNG. I then used <a href="https://squoosh.app">Squoosh</a> to convert it to a variety of modern formats using different encoding options. This is what I found:</p>

<table>
<thead>
<tr>
  <th>Filetype</th>
  <th align="right">Bytes</th>
</tr>
</thead>
<tbody>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.avif">AVIF</a></td>
  <td align="right">303</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.jpg">JPG</a></td>
  <td align="right">155</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.bmp">BMP</a> <sup id="fnref:BMP"><a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fn:BMP" class="footnote-ref" title="The BMP was created in ImageMagick and compressed with FileFormat.app." role="doc-noteref">0</a></sup></td>
  <td align="right">126</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.ico">ICO</a> <sup id="fnref:ICO"><a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fn:ICO" class="footnote-ref" title="The ICO was created with convert -size 1x1 canvas:white w.ico" role="doc-noteref">1</a></sup></td>
  <td align="right">70</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.png">PNG</a> <sup id="fnref:PNG"><a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fn:PNG" class="footnote-ref" title="See Evan Hahn's post about why this is the smallest PNG." role="doc-noteref">2</a></sup></td>
  <td align="right">67</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.gif">GIF</a>  <sup id="fnref:GIF"><a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fn:GIF" class="footnote-ref" title="It is possible to go slightly smaller if you don't care about colour." role="doc-noteref">3</a></sup></td>
  <td align="right">35</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.webp">WEBP</a></td>
  <td align="right">30</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.jxl">JXL</a></td>
  <td align="right">24</td>
</tr>
<tr>
  <td><a href="https://shkspr.mobi/blog/wp-content/uploads/2024/01/1.qoi">QOI</a></td>
  <td align="right">23</td>
</tr>
</tbody>
</table>

<p>This is designed to be the "minimum viable <em>viewable</em> image".  I loved <a href="https://www.phpied.com/minimum-viable-no-image-image-src/">Stoyan Stefanov's "Minimum viable no-image image src"</a>. That creates a 42 byte SVG with no image data in it - so I thought I'd see what happens if you make a displayable image.</p>

<p>Some important things to note:</p>

<ul>
<li>Older image formats like BMP and GIF are smaller than newer formats like AVIF.</li>
<li>Some compression options make files larger in unexpected ways. The lossy WebP was <em>larger</em> than the lossless version.</li>
<li>Similarly, increasing the effort on an AVIF can also result in a larger filesize.</li>
<li>Neither <a href="https://en.wikipedia.org/wiki/JPEG_XL">JPEG XL</a> nor <a href="https://qoiformat.org/">QOI</a> is supported in mainstream browsers yet.</li>
<li><a href="https://aomediacodec.github.io/av1-avif/#image-item-properties">AVIF has a rather long and complex header</a> - that makes sense for large images, but bloats it for smaller ones.</li>
<li>Using <a href="https://github.com/google/brotli">Brotli</a>, it is possible to further compress the AVIF (203 Bytes), JPG (69 Bytes), BMP (63 Bytes), and ICO (30 Bytes) files.</li>
<li>Of the formats which browsers widely support, WebP is the smallest file if you still need a <a href="https://en.wikipedia.org/wiki/Spacer_GIF">Spacer.gif</a>!</li>
</ul>
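<p>To see where the bytes go in the smallest entry, here is a hand-assembled one pixel QOI file. It is a sketch based on my reading of the QOI specification, not the file Squoosh produced, so the two may not match byte for byte. The function name <code>one_white_pixel_qoi</code> and the choice of 3 channels with the sRGB flag are assumptions.</p>

```python
import struct


def one_white_pixel_qoi() -> bytes:
    """Assemble a 1x1 white QOI image by hand."""
    # 14 byte header: magic, width, height, channels (3 = RGB), colourspace (0 = sRGB).
    header = b"qoif" + struct.pack(">IIBB", 1, 1, 3, 0)

    # Decoding starts from a "previous pixel" of opaque black (0, 0, 0, 255).
    # White is -1 away on every channel once the 8-bit values wrap around,
    # so a single QOI_OP_DIFF byte describes it: a 2-bit tag of 01, then
    # three 2-bit differences, each stored with a bias of 2.
    dr = dg = db = -1
    op_diff = 0b01000000 + (dr + 2) * 16 + (dg + 2) * 4 + (db + 2)

    # 8 byte end marker: seven 0x00 bytes followed by a single 0x01.
    end_marker = b"\x00" * 7 + b"\x01"
    return header + bytes([op_diff]) + end_marker
```

<p>That is 14 bytes of header, 1 byte of pixel data, and 8 bytes of end marker - the same 23 bytes as the QOI row in the table.</p>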

<p>Here's my challenge to you - can you do any better? What's the smallest filesize you can find for a <em>viewable</em> image?</p>

<div id="footnotes" role="doc-endnotes">
<hr aria-label="Footnotes">
<ol start="0">

<li id="fn:BMP">
<p>The BMP was created in ImageMagick and compressed with <a href="https://products.fileformat.app/image/compress/bmp">FileFormat.app</a>.&nbsp;<a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fnref:BMP" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:ICO">
<p>The ICO was created with <code>convert -size 1x1 canvas:white w.ico</code>&nbsp;<a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fnref:ICO" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:PNG">
<p>See <a href="https://evanhahn.com/worlds-smallest-png/">Evan Hahn's post about why this is the smallest PNG</a>.&nbsp;<a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fnref:PNG" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

<li id="fn:GIF">
<p>It is possible to go <a href="http://probablyprogramming.com/2009/03/15/the-tiniest-gif-ever">slightly smaller if you don't care about colour</a>.&nbsp;<a href="https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/#fnref:GIF" class="footnote-backref" role="doc-backlink">↩︎</a></p>
</li>

</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/01/whats-the-smallest-file-size-for-a-1-pixel-image/feed/</wfw:commentRss>
			<slash:comments>15</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Sometimes gzip beats Brotli]]></title>
		<link>https://shkspr.mobi/blog/2023/08/sometimes-gzip-beats-brotli/</link>
					<comments>https://shkspr.mobi/blog/2023/08/sometimes-gzip-beats-brotli/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 09 Aug 2023 11:34:25 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[http]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=46387</guid>

					<description><![CDATA[Perhaps this was obvious to you, but it wasn&#039;t to me. So I&#039;m sharing in the hope that you don&#039;t spend an evening trying to trick your webserver into doing something stupid.  For years, HTTP content has been served with gzip compression (gz). It&#039;s basically the same sort of compression algorithm you get in a .zip file. It&#039;s pretty good!  But there&#039;s a new(er) compression algorithm called Brotli…]]></description>
										<content:encoded><![CDATA[<p>Perhaps this was obvious to you, but it wasn't to me. So I'm sharing in the hope that you don't spend an evening trying to trick your webserver into doing something stupid.</p>

<p>For years, HTTP content has been served with gzip compression (gz). It's basically the same sort of compression algorithm you get in a .zip file. It's pretty good!</p>

<p>But there's a new(er) compression algorithm called <a href="https://datatracker.ietf.org/doc/html/rfc7932">Brotli</a> (br). It's Better, Faster, Stronger, Harder than gzip. Mostly.</p>

<p>Looking through my browser's request logs, I noticed everything was being transferred with Brotli compression <em>except</em> for one specific text file, which was being served as gz.</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2023/07/Screenshot-from-2023-07-24-09-46-53.png" alt="Screenshot showing a transfer with the content-encoding as gzip." width="382" height="362" class="aligncenter size-full wp-image-46388">

<p>What's going on?</p>

<p>Well, let's take a look at the file's size.</p>

<p><code>curl -s "https://openbenches.org/api/benches.tsv" | wc -c</code></p>

<p>That downloads the file and counts the number of bytes.  It's 2,727,104 bytes.</p>

<p>Now let's request it as a gzipped file:</p>

<p><code>curl -s -H'Accept-Encoding: gzip' "https://openbenches.org/api/benches.tsv" | wc -c</code>
It's 1,085,372 bytes.</p>

<p>Finally, requesting a Brotli compressed transfer:
<code>curl -s -H'Accept-Encoding: br' "https://openbenches.org/api/benches.tsv" | wc -c</code></p>

<p>That's 1,0<strong>9</strong>8,151 bytes. A whole 12,779 bytes <em>larger</em>!</p>

<p>My server was correct in using gzip rather than Brotli for this specific file.</p>

<p>But that's not the whole story! I manually compressed the full file using different compression levels. Here's a quick graph showing the filesize at different compression strengths:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2023/07/brotli-vs-gzip.png" alt="Graph showing how Brotli is a generally better algorithm, but at lower strengths it is outperformed by gzip's higher strengths." width="718" height="425" class="aligncenter size-full wp-image-46390">

<p>So, in this case, Brotli ≤ 3 is <em>worse</em> than gzip ≥ 5.</p>
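<p>A comparison like the one in the graph can be roughly reproduced with a few lines of Python. This is a sketch rather than the method used for the graph: the function name <code>sizes_by_level</code> is made up, and the <code>brotli</code> module is a third-party package which is assumed to be installed. Without it, only the gzip sizes are returned.</p>

```python
import gzip

try:
    import brotli  # third-party package; assumed to be installed separately
except ImportError:
    brotli = None


def sizes_by_level(data: bytes) -> dict:
    """Return the compressed size of `data` at every gzip and Brotli strength."""
    table = {}
    # gzip levels run from 1 (fastest) to 9 (smallest).
    for level in range(1, 10):
        table[f"gzip-{level}"] = len(gzip.compress(data, compresslevel=level))
    # Brotli qualities run from 0 (fastest) to 11 (smallest).
    if brotli is not None:
        for quality in range(0, 12):
            table[f"br-{quality}"] = len(brotli.compress(data, quality=quality))
    return table
```

<p>Feeding it the bytes of the downloaded file and sorting the result by size shows where the two algorithms cross over.</p>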

<p>I suspect my host's server is configured to prioritise faster compression over absolutely smallest file size. That's probably a reasonable trade-off. I couldn't see a way to tell it to use a higher strength Brotli algorithm all the time - but I would probably be chasing marginal gains.</p>

<p>So, there you go. Don't be surprised if you occasionally see gzip where you expect to see Brotli.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2023/08/sometimes-gzip-beats-brotli/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Selectively Compressed Images - A Hybrid Format]]></title>
		<link>https://shkspr.mobi/blog/2023/06/selectively-compressed-images-a-hybrid-format/</link>
					<comments>https://shkspr.mobi/blog/2023/06/selectively-compressed-images-a-hybrid-format/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 01 Jun 2023 11:34:53 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[images]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=45892</guid>

					<description><![CDATA[I have a screenshot of my phone&#039;s screen. It shows an app&#039;s user interface and a photo in the middle. Something like this:    If I set the compression to be lossy - the photo looks good but the UI looks bad. If I set the compression to be lossless - the UI looks good but the filesize is huge.  Is there a way to selectively compress different parts of an image? I know WebP and AVIF are pretty…]]></description>
										<content:encoded><![CDATA[<p>I have a screenshot of my phone's screen. It shows an app's user interface and a photo in the middle. Something like this:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2023/05/Camera-screenshot-40.jpg" alt="Screenshot of a camera app on a phone. The middle is a photo, the sides show the user interface." width="1024" height="512" class="aligncenter size-full wp-image-45893">

<p>If I set the compression to be lossy - the photo looks good but the UI looks bad.
If I set the compression to be lossless - the UI looks good but the filesize is huge.</p>

<p>Is there a way to selectively compress different parts of an image? I know WebP and AVIF are pretty magical but, as I understand it, the whole image is compressed with the same algorithm and the same settings.</p>

<p>There are two ways to do this. The impossible way and the cheating way.</p>

<h2 id="selective-compression"><a href="https://shkspr.mobi/blog/2023/06/selectively-compressed-images-a-hybrid-format/#selective-compression">Selective Compression</a></h2>

<p>In <em>theory</em> it should be possible to tell an image format to compress some chunks of an image with a different compression algorithm.</p>

<p>And yet... <em>none</em> of the documentation I've found shows that's possible.</p>

<p>GIMP's native XCF and Photoshop's PSD files work; they store different layers, each of which can have a different filetype. I understand that TIFF and .djvu also have that capability.</p>

<p>But those sorts of files don't display in web browsers.</p>

<p>So...</p>

<h2 id="lets-cheat"><a href="https://shkspr.mobi/blog/2023/06/selectively-compressed-images-a-hybrid-format/#lets-cheat">Let's Cheat!</a></h2>

<p>It's possible to use an SVG to embed multiple images of different formats. SVG is used as, effectively, a layout engine.</p>

<p>The syntax is relatively straightforward:</p>

<pre><code class="language-svg">&lt;svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 1080 512"&gt;
   &lt;image width="1080" height="512" x="0" y="0"
      xlink:href="data:image/jpeg;base64,..........."
   /&gt;
   &lt;image width="1080" height="512" x="0" y="0"
      xlink:href="data:image/png;base64,..........."
   /&gt;
&lt;/svg&gt;
</code></pre>

<p>That draws the JPG then draws the PNG on top of it. If the PNG has a transparent section, the JPG will show through. The JPG can be set to as low a quality as you like and the PNG remains lossless.</p>

<p>Here's what it looks like - click for full size:</p>

<p><a href="https://shkspr.mobi/blog/wp-content/uploads/2023/06/Mixed-Compression.svg"><img src="https://shkspr.mobi/blog/wp-content/uploads/2023/06/Mixed-Compression.svg" alt="Screenshot of a phone's camera app with a heavily compressed photo inside it." class="aligncenter size-full wp-image-45906" width="1080" height="512"></a></p>

<p>Embedded images are Base64 encoded, which inflates them by about a third and so loses some of the compression advantage. But, overall, it's smaller than a full PNG and better quality than a full JPG.</p>
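
<p>The overhead is easy to put a number on: Base64 emits 4 characters for every 3 bytes of input, padding the final group. A quick Python check, using random bytes as a stand-in for real image data (the helper name <code>base64_length</code> is invented here):</p>

<pre><code class="language-python">import base64
import os


def base64_length(raw_size):
    """Length of padded Base64 text for raw_size bytes: 4 chars per 3-byte group."""
    return 4 * ((raw_size + 2) // 3)


payload = os.urandom(30_000)  # stand-in bytes, not a real image
encoded = base64.b64encode(payload)

print(len(payload), len(encoded))      # 30000 40000
print(len(encoded) / len(payload))     # ratio of encoded size to raw size
</code></pre>

<p>So every embedded layer costs roughly a third more bytes than the same file served on its own, before any transport compression is applied.</p>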

<p>Look, if it's stupid but it works, it's not stupid.</p>

<p>But surely there must be a way of doing this natively?</p>
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2023/06/selectively-compressed-images-a-hybrid-format/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Energy efficiency of modern codecs]]></title>
		<link>https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/</link>
					<comments>https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sun, 26 Dec 2021 12:34:20 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[compression]]></category>
		<category><![CDATA[power]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=41389</guid>

					<description><![CDATA[How efficient are modern codecs? Can we ever work out whether the power use of compression algorithms is a net gain for global power consumption?  Come on a thought experiment with me.  I have invented a new image compression format. It shrinks images to 50% smaller sizes than AVIF and is completely lossless. Brilliant!  There&#039;s only one problem - it is 1 million times slower.  If it takes your…]]></description>
										<content:encoded><![CDATA[<p>How efficient are modern codecs? Can we ever work out whether the power use of compression algorithms is a net gain for global power consumption?</p>

<p>Come on a thought experiment with me.</p>

<p>I have invented a new image compression format. It shrinks images to 50% smaller sizes than AVIF and is completely lossless. Brilliant!</p>

<p>There's only one problem - it is <strong>1 million times slower</strong>.</p>

<p>If it takes your computer 10 seconds to compress an AVIF, it'll take 115 <em>days</em> to compress using my new format.</p>
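
<p>The arithmetic, as a quick Python check using only the two figures above (10 seconds, and a slowdown of one million):</p>

<pre><code class="language-python">from datetime import timedelta

AVIF_ENCODE_SECONDS = 10       # encode time quoted in the text
SLOWDOWN_FACTOR = 1_000_000    # "1 million times slower"

slow_encode = timedelta(seconds=AVIF_ENCODE_SECONDS * SLOWDOWN_FACTOR)

print(slow_encode.days)                              # 115 whole days
print(round(slow_encode.total_seconds() / 86_400, 2))  # 115.74 days
</code></pre>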

<p>Is that worth it?</p>

<p>There are five aspects to think about here.</p>

<h2 id="time"><a href="https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#time">Time</a></h2>

<p>If you're someone like, say, Netflix - this might be fine. You can throw a whole bunch of cloud servers at the problem, so you're not time-bound.</p>

<p>If you're a smaller service, or if you require instantaneous image compression, this is a pointless format.  The time-cost - let alone processing costs - is prohibitive.</p>

<h2 id="space"><a href="https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#space">Space</a></h2>

<p>If you're a library with billions of images - storage costs are a real concern. Halving those costs could save you more than it would cost to recompress all your images.</p>

<p>But storage is pretty cheap these days. Terabytes of storage are easily in range of home users. Is the cost of encoding worth it given that space isn't a limiting factor?</p>

<h2 id="transit"><a href="https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#transit">Transit</a></h2>

<p>The Netflix logo is downloaded a hundred million times per day. So would a massive saving in bandwidth costs be worth the trade-off? Probably, yes.</p>

<p>If you pay per MB for your Internet access, you'll probably welcome this new codec. Saving money is great.</p>

<p>But how much does bandwidth <em>really</em> cost? Is it <em>that</em> expensive for either party?</p>

<h2 id="user-experience"><a href="https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#user-experience">User Experience</a></h2>

<p>I've only mentioned the speed of encoding so far. Most codecs are asymmetric; they take a long time to encode but are fairly quick to decode.</p>

<p>Image formats like JPG and PNG are displayed practically instantly even on modest hardware. What if my wonderful new image format takes a couple of seconds to decode and display?  That might be unacceptable. Asking the user to buy a new computer or upgrade their software may also be impossible.</p>

<h2 id="power-and-pollution"><a href="https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/#power-and-pollution">Power and Pollution</a></h2>

<p>I think this is what it comes down to. How much energy does it take to create and use these images?  I don't know how much CO<sub>2</sub> it costs to transport a GB of data from one side of the planet to the other. And I don't know how energy efficient data centres are around the world.</p>

<p>I suspect that this is an impossible calculation. This daft image format would be a disaster if run on a coal-powered server which sent data over a solar-powered fibre-optic cable. But, conversely, if we have a wind-turbine powered server sending data over the <a href="https://mars.nasa.gov/msl/mission/communications/">Deep Space Network to Mars</a> (expensive and bandwidth-constrained) then this format is probably less polluting.</p>

<p>I worry that new compression formats offer diminishing returns once their power costs are counted. If the power required to encode and decode images keeps growing, it could outstrip the savings made on transit and storage.</p>
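
<p>One way to frame that trade-off is as a break-even count: how many downloads it takes before the energy saved in transit repays the extra energy spent encoding. The Python sketch below is a deliberately crude model. The function name and all four parameters are assumptions made for illustration, storage savings are left out, and no real-world figures are implied:</p>

<pre><code class="language-python">import math


def break_even_downloads(extra_encode_joules, bytes_saved_per_download,
                         transit_joules_per_byte, extra_decode_joules):
    """Downloads needed before transit savings repay the extra encoding energy.

    Every input is a hypothetical quantity the caller must supply.
    """
    # Net energy saved each time the smaller file is fetched and decoded.
    gain_per_download = (bytes_saved_per_download * transit_joules_per_byte
                         - extra_decode_joules)
    gain_per_download = max(gain_per_download, 0.0)  # clamp losses to zero
    if gain_per_download == 0.0:
        # Decoding costs at least as much as transit saves: never breaks even.
        return math.inf
    return math.ceil(extra_encode_joules / gain_per_download)
</code></pre>

<p>If the extra decoding energy eats the whole transit saving, the function returns infinity, which is the "outstrip the savings" case. The hard part, as noted above, is that nobody seems to have trustworthy values to plug in.</p>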
]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2021/12/energy-efficiency-of-modern-codecs/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
	</channel>
</rss>
