<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/rss-style.xsl" type="text/xsl"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
>
<channel>
	<title>LLM &#8211; Terence Eden’s Blog</title>
	<atom:link href="https://shkspr.mobi/blog/tag/llm/feed/" rel="self" type="application/rss+xml" />
	<link>https://shkspr.mobi/blog</link>
	<description>Regular nonsense about tech and its effects 🙃</description>
	<lastBuildDate>Sat, 21 Mar 2026 08:16:01 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://shkspr.mobi/blog/wp-content/uploads/2023/07/cropped-avatar-32x32.jpeg</url>
	<title>LLM &#8211; Terence Eden’s Blog</title>
	<link>https://shkspr.mobi/blog</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title><![CDATA[Removing "/Subtype /Watermark" images from a PDF using Linux]]></title>
		<link>https://shkspr.mobi/blog/2026/01/removing-subtype-watermark-images-from-a-pdf-using-linux/</link>
					<comments>https://shkspr.mobi/blog/2026/01/removing-subtype-watermark-images-from-a-pdf-using-linux/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Thu, 22 Jan 2026 12:34:02 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[pdf]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=63035</guid>

					<description><![CDATA[Problem: I&#039;ve received a PDF which has a large &#34;watermark&#34; obscuring every page.  Investigating: Opening the PDF in LibreOffice Draw allowed me to see that the watermark was a separate image floating above the others.  Manual Solution: Hit page down, select image, delete, repeat 500 times. BORING!  Further Investigating: Using pdftk, it&#039;s possible to decompress a PDF. That makes it easier to look …]]></description>
										<content:encoded><![CDATA[<p><strong>Problem:</strong> I've received a PDF which has a large "watermark" obscuring every page.</p>

<p><strong>Investigating:</strong> Opening the PDF in LibreOffice Draw allowed me to see that the watermark was a separate image floating above the others.</p>

<p><strong>Manual Solution:</strong> Hit page down, select image, delete, repeat 500 times. BORING!</p>

<p><strong>Further Investigating:</strong> Using <a href="https://linux.die.net/man/1/pdftk">pdftk</a>, it's possible to decompress a PDF. That makes it easier to look through manually.</p>

<p><code>pdftk input.pdf output output.pdf uncompress</code></p>

<p>Hey presto! A PDF you can open in a text editor! Deep joy!</p>

<p><strong>Searching:</strong> On a hunch, I searched for "watermark" and found several lines like this:</p>

<pre><code class="language-_">&lt;&lt;
/Length 548
&gt;&gt;
stream
/Figure &lt;&lt;/MCID 0 &gt;&gt;BDC q 0 0 477 733.464 re W n q /GS0 gs 479.2799893 0 0 735.5999836 -1.0800002 -1.0559941 cm /Im0 Do Q EMC 
/Figure &lt;&lt;/MCID 1 &gt;&gt;BDC Q q 28.333 300.661 420.334 126.141 re W n q /GS0 gs 420.3339603 0 0 126.1418879 28.3330078 300.6610601 cm /Im1 Do Q EMC
/Figure &lt;&lt;/MCID 2 &gt;&gt;BDC Q q 16.106 0 444.787 215.464 re W n q /GS0 gs 444.7874274 0 0 216.5921386 16.1062775 -1.1281493 cm /Im2 Do Q EMC
/Artifact &lt;&lt;/Subtype /Watermark /Type /Pagination &gt;&gt;BDC Q q 0.7361145 0 0 0.7361145 113.3616638 240.8575745 cm /GS1 gs /Fm0 Do Q EMC
endstream
endobj
</code></pre>

<p>Those are <a href="https://opensource.adobe.com/dc-acrobat-sdk-docs/library/pdfmark/pdfmark_Logical.html">Marked Content Blocks</a>.  In <em>theory</em> you can just chop out the line with <code>/Subtype /Watermark</code> but each block has a <code>/Length</code> entry - so you'd also need to adjust that to account for what you've changed - otherwise the layout goes all screwy.</p>
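<p>To make that bookkeeping concrete - this is my own hypothetical miniature, not taken from a real PDF - the <code>/Length</code> value in the stream dictionary must equal the stream's byte count, so any hand-edit that removes the watermark line has to recompute it:</p>

```python
# Hypothetical miniature content stream, mimicking the decompressed
# PDF above. Deleting the watermark line changes the byte count.
watermark = b"/Artifact <</Subtype /Watermark /Type /Pagination >>BDC q /Fm0 Do Q EMC\n"
stream = (
    b"/Figure <</MCID 0 >>BDC q /Im0 Do Q EMC\n"
    + watermark
)
edited = stream.replace(watermark, b"")

# The dictionary's /Length entry must now be updated from
# len(stream) to len(edited), or viewers will misparse the file.
print(len(stream), len(edited))
```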

<p>That led me to <a href="https://github.com/pymupdf/PyMuPDF/discussions/1855">PyMuPDF which claimed to solve the problem</a>. But running that code only removed <em>some</em> of the watermarks. It got stuck in an infinite loop on certain pages.</p>

<p>So, now that I had more detailed knowledge, I managed to get an LLM to construct something which <em>mostly</em> seems to work.</p>

<p>Does it work with every PDF? I don't know. Does it contain subtle implementation bugs? Probably. Is there an easier way to do this? Not that I can find.</p>

<pre><code class="language-python">import re
import pymupdf

# Open the PDF
doc = pymupdf.open("output.pdf")

# Regex of the watermarks
pattern = re.compile(
    rb"/Artifact\s*&lt;&lt;[^&gt;]*?/Subtype\s*/Watermark[^&gt;]*?&gt;&gt;BDC.*?EMC",
    re.DOTALL
)

# Loop through the PDF's pages
for page_num, page in enumerate(doc, start=1):
    print(f"Processing page {page_num}")
    xrefs = page.get_contents()
    for xref in xrefs:
        cont = doc.xref_stream(xref)
        new_cont, n = pattern.subn(b"", cont)
        if n &gt; 0:
            print(f"  Removed {n} watermark block(s)")
            doc.update_stream(xref, new_cont)

doc.save("no-watermarks.pdf")
</code></pre>
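<p>As a quick sanity check - my addition, not part of the original script - the same regex can be exercised against a miniature stream modelled on the decompressed output above:</p>

```python
import re

# Same pattern as the script above: an /Artifact block marked
# /Subtype /Watermark, from its BDC up to the closing EMC.
pattern = re.compile(
    rb"/Artifact\s*<<[^>]*?/Subtype\s*/Watermark[^>]*?>>BDC.*?EMC",
    re.DOTALL
)

# Hypothetical three-block page: two figures and one watermark.
sample = (
    b"/Figure <</MCID 0 >>BDC q /Im0 Do Q EMC\n"
    b"/Artifact <</Subtype /Watermark /Type /Pagination >>BDC q /Fm0 Do Q EMC\n"
    b"/Figure <</MCID 1 >>BDC q /Im1 Do Q EMC\n"
)

cleaned, n = pattern.subn(b"", sample)
print(n)  # one watermark block removed; both figures survive
```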

<p>One of the (many) problems with Vibe Coding is that trying to get an LLM to spit out something useful depends <em>massively</em> on how well you know the subject area. I'm proud to say I know vanishingly little about the <a href="https://shkspr.mobi/blog/2015/11/a-polite-way-to-say-ridiculously-complicated/">baroque</a> PDF specification - which meant that most of my attempts to use various "AI" tools consisted of me saying "No, that doesn't work" and the accurs'd machine saying back "Golly-gee! You're right! Let me fix that!" and then breaking something else.</p>

<p>I'm not sure this is the future we wanted, but it looks like the future we've got.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=63035&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2026/01/removing-subtype-watermark-images-from-a-pdf-using-linux/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[LLMs are still surprisingly bad at some simple tasks]]></title>
		<link>https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/</link>
					<comments>https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Sun, 21 Sep 2025 11:34:38 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=63410</guid>

					<description><![CDATA[I asked three different commercially available LLMs the same question:  Which TLDs have the same name as valid HTML5 elements?  This is a pretty simple question to answer. Take two lists and compare them.  I know this question is possible to answer because I went through the lists two years ago. Answering the question was a little tedious and subject to my tired human eyes making no mistakes. So…]]></description>
										<content:encoded><![CDATA[<p>I asked three different commercially available LLMs the same question:</p>

<blockquote><p>Which TLDs have the same name as valid HTML5 elements?</p></blockquote>

<p>This is a pretty simple question to answer. Take two lists and compare them.  I know this question is possible to answer <a href="https://shkspr.mobi/blog/2023/09/false-friends-html-elements-which-are-also-top-level-domains/">because I went through the lists two years ago</a>. Answering the question was a <em>little</em> tedious and subject to my tired human eyes making no mistakes. So surely this is the sort of thing which an LLM excels at, right?</p>

<p>Wrong!</p>

<p>Here's how the three big beasts fared.</p>

<h2 id="chatgpt"><a href="https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/#chatgpt">ChatGPT</a></h2>

<p><a href="https://chatgpt.com/s/t_68cba42f89cc8191856689675b86e165">OpenAI's LLM does a poor job</a>:</p>

<blockquote><p>Good question! Let’s carefully check this.
</p><p>[…Skip preamble…]
</p><p>✅ Confirmed Matches
</p><p>The following <strong>TLDs</strong> exactly match <strong>valid HTML5 element names</strong>:
</p><ul>
<li>.audio → &lt;audio&gt;
</li><li>.code → &lt;code&gt;
</li><li>.link → &lt;link&gt;
</li><li>.menu → &lt;menu&gt;
</li><li>.nav → &lt;nav&gt;
</li><li>.video → &lt;video&gt;
</li></ul>
</blockquote>

<p>This is wrong on two counts. First, it misses off several matches. Second, <em>there is no <code>.code</code> TLD!</em> There is a <code>.codes</code>, but that's not a match.</p>

<h2 id="google-gemini"><a href="https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/#google-gemini">Google Gemini</a></h2>

<p><a href="https://g.co/gemini/share/06380dbba3e8">Gemini shits the bed completely</a>:</p>

<blockquote>
Here are the top-level domains (TLDs) that have the same name as valid HTML5 elements:
<ul>
<li>.address: The HTML &lt;address&gt; element is used to define the contact information for the author/owner of a document or an article.
</li><li>.article: The HTML &lt;article&gt; element specifies independent, self-contained content.
</li><li>.aside: The HTML &lt;aside&gt; element is used to define some content aside from the content it is placed in.
</li><li>.audio: The HTML &lt;audio&gt; element is used to embed sound content in a document.
</li><li>.body: The HTML &lt;body&gt; element defines the document's main content.
</li><li>[…Skip a dozen more items…]
</li></ul>
</blockquote>

<p>It just lists a bunch of HTML elements with no crossover to TLDs. Thanks Google. Thoogle.</p>

<h2 id="claude"><a href="https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/#claude">Claude</a></h2>

<p><a href="https://claude.ai/share/39177222-3eb0-4310-942b-1e352c1f2b8e">Anthropic's AI does a half-arsed job</a>.</p>

<blockquote>
Here are the TLDs that match HTML5 element names:
<ul>
<li>.audio - matches the &lt;audio&gt; element
</li><li>.video - matches the &lt;video&gt; element
</li><li>.data - matches the &lt;data&gt; element
</li><li>.link - matches the &lt;link&gt; element
</li><li>.menu - matches the &lt;menu&gt; element
</li><li>.style - matches the &lt;style&gt; element
</li><li>.select - matches the &lt;select&gt; element
</li></ul>
Some other potential matches depend on how broadly you interpret "HTML5 elements":
<ul>
<li>.app - while there's no &lt;app&gt; element in HTML5, there was discussion of it during HTML5 development
</li><li>.art - not a standard HTML element, though some frameworks use custom elements like this
</li></ul>
</blockquote>

<p>The first seven <em>are</em> correct - although it ignores a dozen more correct answers. But then it does the LLM thing of trying to over-fit the data in a simulacrum of usefulness.</p>

<h2 id="dont-believe-the-hype"><a href="https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/#dont-believe-the-hype">Don't believe the hype</a></h2>

<p>To be clear, I would expect a moderately intelligent teenager to be able to find two lists and compare them. If an intern gave me the same attention to detail as above, we'd be having a cosy little chat about their attitude to work.</p>

<p>I've discussed before why <a href="https://shkspr.mobi/blog/2025/06/why-do-people-have-such-dramatically-different-experiences-using-ai/">people have such dramatically different experiences using AI</a>. For some it is an all-knowing and ineffable source of wisdom. For others, like me, it is mostly garbage.</p>

<p>I think it comes down to how familiar you are with the domain and its constraints. When I watch a medical drama, I have no idea if they're using <a href="https://www.youtube.com/watch?v=C_AmdvxbPT8">realistic language</a>. It <em>sounds</em> good, but real doctors probably cringe at the inaccuracies.</p>

<p>AI seems <em>plausible</em> because it relies on the <a href="https://shkspr.mobi/blog/2023/02/how-much-of-ais-recent-success-is-due-to-the-forer-effect/">Barnum Effect</a> - it tells people what they <em>want</em> to hear. It produces words which <em>sound</em> correct.</p>

<p>We need a new term for <a href="https://shkspr.mobi/blog/2024/05/gell-mann-amnesia-and-purdah/">Gell-Mann Amnesia</a>. Something that describes how an AI is convincing if you don't understand its reasoning, and close to useless if you understand its limitations.</p>
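<p>For contrast, the deterministic version of the task is a couple of lines of code. Here's a sketch with abbreviated, hard-coded lists - the real job uses the full IANA TLD registry and the full HTML element index:</p>

```python
# Abbreviated stand-ins: the real inputs are the IANA TLD list
# and the HTML Living Standard's element index.
html_elements = {"a", "audio", "body", "code", "data", "link",
                 "menu", "nav", "select", "style", "video"}
tlds = {"audio", "codes", "data", "link", "menu", "nav",
        "select", "style", "video", "zip"}

# The whole "hard" problem is a set intersection.
# Note that .codes is not a match for <code> - exactly the
# sort of near-miss the LLMs tripped over.
matches = sorted(html_elements & tlds)
print(matches)
# → ['audio', 'data', 'link', 'menu', 'nav', 'select', 'style', 'video']
```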
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=63410&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/09/llms-are-still-surprisingly-bad-at-simple-tasks/feed/</wfw:commentRss>
			<slash:comments>19</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Winners don't use ChatGPT]]></title>
		<link>https://shkspr.mobi/blog/2025/07/winners-dont-use-chatgpt/</link>
					<comments>https://shkspr.mobi/blog/2025/07/winners-dont-use-chatgpt/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 30 Jul 2025 11:34:56 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Writing Prompts]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=62288</guid>

					<description><![CDATA[If you hung around video arcades in your youth, you would have seen this message burned into the phosphor of a thousand dying CRTs.    Obviously this was a devilish psyop by those gits who wanted kids to stop sniffing glue and having fun. The bastards!  But there&#039;s a more serious side to the corny message. Are you a winner if you&#039;ve cheated?  Lance Armstrong &#34;won&#34; multiple Tours de France. It…]]></description>
										<content:encoded><![CDATA[<p>If you hung around video arcades in your youth, you would have seen this message burned into the phosphor of a thousand dying CRTs.</p>

<p><a href="https://www.flickr.com/photos/abinka/8880252938/"><img src="https://shkspr.mobi/blog/wp-content/uploads/2025/07/8880252938_2462609e08_o.jpg" alt="Arcade machine saying &quot;Winners Don't Use Drugs&quot; signed by some FBI dude. Photo CC BY-NC Megan Rosenbloom." width="512" height="384" class="aligncenter size-full wp-image-62289"></a></p>

<p>Obviously this was a devilish psyop by those gits who wanted kids to stop sniffing glue and having fun. The bastards!</p>

<p>But there's a more serious side to the corny message. Are you a winner if you've cheated?</p>

<p>Lance Armstrong "won" multiple <i lang="fr">Tours de France</i>. It wasn't him that won though, it was <a href="https://www.nytimes.com/2012/10/23/sports/cycling/armstrong-stripped-of-his-7-tour-de-france-titles.html?_r=0">a cocktail of drugs and performance enhancers</a>.</p>

<p>Milli Vanilli got to the top of the music charts <a href="https://en.wikipedia.org/wiki/Milli_Vanilli">by having someone else sing their vocals</a>.</p>

<p>You can watch the documentary <a href="https://www.imdb.com/title/tt0923752/">King of Kong</a> and decide for yourself if <a href="https://arstechnica.com/gaming/2023/02/did-billy-mitchell-use-this-illicit-joystick-to-set-a-donkey-kong-high-score/">allegations of cheating to get high-scores in Donkey Kong</a> are well founded.</p>

<p>The last one is interesting to me. I'm sure everyone here has turned to a gaming walkthrough in a moment of frustration. Ah! So <em>that's</em> where the key was hidden! Onwards!</p>

<p>But once you've popped the walkthrough cherry, how tempting is it to go back for just one more hit? Only a quick glance… And then, before you know it, you're no longer playing a game; you're watching a movie. You don't achieve anything by following a walkthrough, do you? You're little more than a monkey pressing the buttons in the order they flash.</p>

<p>That's fine if all you want to do is see the ending; but you can't really claim it as an achievement.</p>

<p>The same is true with cheats. <kbd>↑</kbd><kbd>↑</kbd><kbd>↓</kbd><kbd>↓</kbd><kbd>←</kbd><kbd>→</kbd><kbd>←</kbd><kbd>→</kbd><kbd>B</kbd><kbd>A</kbd> and now you have infinite lives and over-powered weapons. So what? You can glide through the game. You won't get any better at playing it. You won't learn. You'll just drift. You won't have any <a href="https://gizmodo.com/congratulations-to-ea-games-for-posting-the-most-hated-1820391000">pride and accomplishment</a> in what you've done.</p>

<p>Video games are better than real life. As the meme says:</p>

<p><a href="https://www.tumblr.com/venus-is-in-bloom/152286550432/i-enjoy-video-games-because-they-let-me-live-out"><img src="https://shkspr.mobi/blog/wp-content/uploads/2025/07/venus.webp" alt="Screenshot from tumblr. venus-is-in-bloom says &quot;i enjoy video games because they let me live out my wildest fantasies, like being assigned a task and then completing that task&quot;." width="1080" height="640" class="aligncenter size-full wp-image-62298"></a></p>

<p>If you had a cheat button for real life, would you press it?</p>

<p>Click here to suddenly be number one on YouTube.</p>

<p>Swipe here to get into the University of your dreams.</p>

<p>Prompt engineer your way to launching a killer app without knowing how to code.</p>

<p>No work necessary to accomplish anything. Click the cheat button and off you go! Amazing! You'd press it repeatedly, right?</p>

<p><a href="https://bsky.app/profile/nameshiv.bsky.social/post/3lv4ejbhgkc2z">Shiv Ramdas</a> recently posted this provocation about the <a href="https://www.plagiarismtoday.com/2025/04/07/james-somerton-re-emerges-plagiarizes-again-disappears/">plagiarist James Somerton</a>:</p>

<blockquote><p>There's an incredible essay that will break the internet waiting to be written about the throughline between walkthrough dependence, James Somerton and ChatGPT's effect on the human psyche</p></blockquote>

<p>This isn't that essay. I'm too lazy to write something amazing, and too aware of the limitations of outsourcing my thinking.</p>

<p>But I see the pattern in myself.</p>

<ul>
<li>I <em>could</em> learn that code's syntax, or I could press the cheat button.</li>
<li>I <em>could</em> plan a trip, or I could press the cheat button.</li>
<li>I <em>could</em> text my wife that I love her, or I could press the cheat button.</li>
</ul>

<p>I'm sure you're going to write an impassioned comment about why sometimes pressing the cheat button is probably fine. The world is complex and sometimes you need a bit of a helping hand.</p>

<p>But hold onto that hand too long and it will hold you back.</p>

<hr>

<h2 id="counterpoint"><a href="https://shkspr.mobi/blog/2025/07/winners-dont-use-chatgpt/#counterpoint">Counterpoint</a></h2>

<p>Do I <em>really</em> believe that?</p>

<p>I'm never going to spend a few years learning French - so I'm quite content to chuck an AI a bit of translation work.</p>

<p>Some people like to spend a morning baking bread. Others like using a bread machine. Is that cheating? Who cares. It is <em>your</em> choice.</p>

<p>Why should I care <em>how</em> you're living your life? It doesn't bother me if you look up spoilers for films, hide a motor in your exercise bike, or always choose easy-mode on your games.</p>

<p>Perhaps you have no imagination and would rather an AI made up a bedtime story for your child. I suspect you're acting like <a href="https://www.psychologicalscience.org/publications/observer/obsonline/harlows-classic-studies-revealed-the-importance-of-maternal-contact.html">the infamous Wire Mother</a>. A simulation of love with few of the benefits.</p>

<p>I can't imagine using an AI to woo my wife. But then, I've seen Cyrano de Bergerac.</p>

<p>Each of us has to draw a line somewhere. We each find a certain level of cheating acceptable when we do it - and despicable when others get away with it.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=62288&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/07/winners-dont-use-chatgpt/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Grinding down open source maintainers with AI]]></title>
		<link>https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/</link>
					<comments>https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Mon, 07 Jul 2025 11:34:09 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[git]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[spam]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=61265</guid>

					<description><![CDATA[Early one morning I received an email notification about a bug report to one of my open source projects. I like to be helpful and I want people who use my stuff to have a good time, so I gave it my attention.  Here&#039;s what it said:   😱 I Can&#039;t Use On This Day 😭 Seriously, What’s Going On?! 🔍 I’ve been trying to use the On This Day feature, but it’s just not working for me! 😩 Every time I input my d…]]></description>
										<content:encoded><![CDATA[<p>Early one morning I received an email notification about a bug report to one of my open source projects. I like to be helpful and I want people who use my stuff to have a good time, so I gave it my attention.  Here's what it said:</p>

<blockquote>
<h2 id="%f0%9f%98%b1-i-cant-use-on-this-day-%f0%9f%98%ad"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#%f0%9f%98%b1-i-cant-use-on-this-day-%f0%9f%98%ad">😱 I Can't Use On This Day 😭</a></h2>
Seriously, What’s Going On?! 🔍<br>
I’ve been trying to use the On This Day feature, but it’s just not working for me! 😩<br>
Every time I input my details, it says I have no posts for today, even though I know I’ve posted stuff! 🧐

<h3 id="heres-my-setup-%e2%9a%99%ef%b8%8f"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#heres-my-setup-%e2%9a%99%ef%b8%8f">Here’s My Setup: ⚙️</a></h3>

<ul>
        <li>Python 3.x 🐍</li>
    <li>Access token fully generated (I triple-checked!) 🔑</li>
        <li>Attempted on multiple instances but still nothing! 😩😩</li>
</ul>

<h3 id="could-it-be-a-bug-%f0%9f%a4%94"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#could-it-be-a-bug-%f0%9f%a4%94">Could It Be a Bug? 🤔</a></h3>

I’m really starting to doubt my posting history! 😳<br>
Is it supposed to show only specific types of posts?<br>
I’ve made some pretty epic posts before! 💥💬<br>

<h3 id="documentation-confusion-%f0%9f%93%9a"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#documentation-confusion-%f0%9f%93%9a">Documentation Confusion 📚</a></h3>

The README says to register for an access token but doesn’t clarify if it factors into this feature! 🤔❓<br>
Did I miss something REALLY important?!<br>
Help me figure this out, please!!! 😱<br>

<h3 id="feature-suggestion-%f0%9f%92%ad"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#feature-suggestion-%f0%9f%92%ad">Feature Suggestion 💭</a></h3>

If this is broken, can we at least have a debug mode to log what’s happening! 😬<br>
I need to know if it’s truly my fault or the code’s! 🔍🛠<br>
Thanks for looking into this TRAGIC situation!!! 😭💔<br>
<br>
P.S. My friends ARE posting on this day and their instances work!! 😤<br>
I feel so left out!! 😟<br>
<p>Let’s get this sorted ASAP! ⚡
</p></blockquote>

<p>OK, that's a <em>lot</em> of Emoji - too much even for me! But if one of my users needs help, I'm there for them!  As the feature works for me, I decided I'd ask for the output of the app. Maybe there'd be a clue in the minimal debugging output it had.</p>

<p>I clicked on the link to the Codeberg repository and was hit by a 404! What? I clicked on the link to the user "simpleseaport2" but that was also broken.</p>

<p>"Seriously, What’s Going On?! 🔍"</p>

<p>It looks like Codeberg has been hit by a wave of spam bug reports.  I read through the bug report again, slightly more awake, and saw just how content-free it was. Yes, it is superficially well structured, the Emoji are a bit over-the-top but not the worst I've seen, and the emotional manipulation is quite insidious.</p>

<p>A few weeks later, I got a bug report to a different repo. This one was also deleted before I could reply to it. See if you can spot that it is AI-generated:</p>

<blockquote><p>I've been trying to use the Threads tool to visualize some conversations but I'm running into a serious problem, and it's really frustrating!
</p><p>When I input the URL for a post with a substantial number of replies, the script seems to hang indefinitely. I've waited more than 15 minutes on a couple of occasions, and nothing seems to happen. This is not what I expected, especially since the README mentions large conversations may take a long time, but doesn’t specify any limits or give guidance on what users should do if it doesn’t respond at all!
</p><p>It's unclear what's actually happening here. Is the script failing silently? Is it the API timing out? Why isn’t there any sort of progress notification built into the tool? It feels like a complete dead end.
</p><p>Can you please add some kind of error handling or logging feature to the Threads script? It would be helpful if it could at least inform the user when a timeout occurs or if the API response is simply taking too long. Additionally, could you clarify the maximum number of replies that can be handled? It’s really inconvenient to have no idea if the script is still processing or if it’s just broken.
</p><p>Thanks for addressing this. I hope to see improvements soon.</p></blockquote>

<ul>
<li>The emotional manipulation starts in the first line - telling me how frustrated the user is.</li>
<li>It turns the blame on me for providing poor guidance.</li>
<li>Then the criticism of the tool.</li>
<li>Next, a request that I do work.</li>
<li>Finally some more emotional baggage for me to carry.</li>
</ul>

<p>I'm not alone in getting these - <a href="https://merveilles.town/@raboof/114589918314200123">other people have also received similar spam</a>.</p>

<p>To be fair to Codeberg, they are under attack and are trying to stop these specious complaints reaching maintainers.</p>

<blockquote class="mastodon-embed" data-embed-url="https://social.anoxinon.de/@Codeberg/114592518436361178/embed" style="background: #FCF8FF; border-radius: 8px; border: 1px solid #C9C4DA; margin: 0; max-width: 540px; min-width: 270px; overflow: hidden; padding: 0;"> <a href="https://social.anoxinon.de/@Codeberg/114592518436361178" target="_blank" style="align-items: center; color: #1C1A25; display: flex; flex-direction: column; font-family: system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Oxygen, Ubuntu, Cantarell, 'Fira Sans', 'Droid Sans', 'Helvetica Neue', Roboto, sans-serif; font-size: 14px; justify-content: center; letter-spacing: 0.25px; line-height: 20px; padding: 24px; text-decoration: none;"> <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="32" height="32" viewBox="0 0 79 75"><path d="M74.7135 16.6043C73.6199 8.54587 66.5351 2.19527 58.1366 0.964691C56.7196 0.756754 51.351 0 38.9148 0H38.822C26.3824 0 23.7135 0.756754 22.2966 0.964691C14.1319 2.16118 6.67571 7.86752 4.86669 16.0214C3.99657 20.0369 3.90371 24.4888 4.06535 28.5726C4.29578 34.4289 4.34049 40.275 4.877 46.1075C5.24791 49.9817 5.89495 53.8251 6.81328 57.6088C8.53288 64.5968 15.4938 70.4122 22.3138 72.7848C29.6155 75.259 37.468 75.6697 44.9919 73.971C45.8196 73.7801 46.6381 73.5586 47.4475 73.3063C49.2737 72.7302 51.4164 72.086 52.9915 70.9542C53.0131 70.9384 53.0308 70.9178 53.0433 70.8942C53.0558 70.8706 53.0628 70.8445 53.0637 70.8179V65.1661C53.0634 65.1412 53.0574 65.1167 53.0462 65.0944C53.035 65.0721 53.0189 65.0525 52.9992 65.0371C52.9794 65.0218 52.9564 65.011 52.9318 65.0056C52.9073 65.0002 52.8819 65.0003 52.8574 65.0059C48.0369 66.1472 43.0971 66.7193 38.141 66.7103C29.6118 66.7103 27.3178 62.6981 26.6609 61.0278C26.1329 59.5842 25.7976 58.0784 25.6636 56.5486C25.6622 56.5229 25.667 56.4973 25.6775 56.4738C25.688 56.4502 25.7039 56.4295 25.724 56.4132C25.7441 56.397 25.7678 56.3856 25.7931 56.3801C25.8185 56.3746 25.8448 56.3751 
25.8699 56.3816C30.6101 57.5151 35.4693 58.0873 40.3455 58.086C41.5183 58.086 42.6876 58.086 43.8604 58.0553C48.7647 57.919 53.9339 57.6701 58.7591 56.7361C58.8794 56.7123 58.9998 56.6918 59.103 56.6611C66.7139 55.2124 73.9569 50.665 74.6929 39.1501C74.7204 38.6967 74.7892 34.4016 74.7892 33.9312C74.7926 32.3325 75.3085 22.5901 74.7135 16.6043ZM62.9996 45.3371H54.9966V25.9069C54.9966 21.8163 53.277 19.7302 49.7793 19.7302C45.9343 19.7302 44.0083 22.1981 44.0083 27.0727V37.7082H36.0534V27.0727C36.0534 22.1981 34.124 19.7302 30.279 19.7302C26.8019 19.7302 25.0651 21.8163 25.0617 25.9069V45.3371H17.0656V25.3172C17.0656 21.2266 18.1191 17.9769 20.2262 15.568C22.3998 13.1648 25.2509 11.9308 28.7898 11.9308C32.8859 11.9308 35.9812 13.492 38.0447 16.6111L40.036 19.9245L42.0308 16.6111C44.0943 13.492 47.1896 11.9308 51.2788 11.9308C54.8143 11.9308 57.6654 13.1648 59.8459 15.568C61.9529 17.9746 63.0065 21.2243 63.0065 25.3172L62.9996 45.3371Z" fill="currentColor"></path></svg> <div style="color: #787588; margin-top: 16px;">Post by @Codeberg@social.anoxinon.de</div> <div style="font-weight: 500;">View on Mastodon</div> </a> </blockquote>

<script data-allowed-prefixes="https://social.anoxinon.de/" async="" src="https://social.anoxinon.de/embed.js"></script>

<p>But, still, search the socials and you'll find a stream of frustrated developers.</p>

<blockquote class="bluesky-embed" data-bluesky-uri="at://did:plc:i7qrqashiejmsuo4hbvovw6l/app.bsky.feed.post/3lnkpto5rpdd2" data-bluesky-cid="bafyreigm2vqlybmy5dhawigjqzwhz2p2w3o4gatpykxkurwrzzmtbfif6a" data-bluesky-embed-color-mode="system"><p lang="en">Woke this morning to my first ever AI generated spam issue on a repo. Got it via email. When I went to check it out at Codeberg, it had already been moderated. Wonder how many others were affected.

I immediately knew it was AI spam due to the overuse of emojis…🎉<br><br><a href="https://bsky.app/profile/did:plc:i7qrqashiejmsuo4hbvovw6l/post/3lnkpto5rpdd2?ref_src=embed">[image or embed]</a></p>— Jeff Sikes (<a href="https://bsky.app/profile/did:plc:i7qrqashiejmsuo4hbvovw6l?ref_src=embed">@bsky.box464.social</a>) <a href="https://bsky.app/profile/did:plc:i7qrqashiejmsuo4hbvovw6l/post/3lnkpto5rpdd2?ref_src=embed">24 April 2025 at 15:07</a></blockquote>

<script async="" src="https://embed.bsky.app/static/embed.js" charset="utf-8"></script>

<h2 id="whats-going-on%e2%81%89%ef%b8%8f"><a href="https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/#whats-going-on%e2%81%89%ef%b8%8f">What's Going On⁉️</a></h2>

<p>I can only think of a few possibilities - none of them particularly positive.</p>

<ul>
<li>Attacking the viability of Codeberg - make users abandon it for a different platform.</li>
<li>Attacking the attention of developers - make them unwilling to give attention where it is actually needed.</li>
<li>Attacking the integrity of users - make them less likely to receive help because they are mistaken for AI.</li>
<li>Maybe it is just a bored kid or an unethical researcher trying to find the limits of what a maintainer will recognise as spam.</li>
</ul>

<p>Either way, AI bug reports like this are about as welcome as a haemorrhage in a jacuzzi.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=61265&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/07/grinding-down-open-source-maintainers-with-ai/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Why do people have such dramatically different experiences using AI?]]></title>
		<link>https://shkspr.mobi/blog/2025/06/why-do-people-have-such-dramatically-different-experiences-using-ai/</link>
					<comments>https://shkspr.mobi/blog/2025/06/why-do-people-have-such-dramatically-different-experiences-using-ai/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 18 Jun 2025 11:34:09 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=61410</guid>

					<description><![CDATA[For some people, it seems, AI is an amazing machine which - while fallible - represents an incredible leap forward in productivity.  For other people, it seems, AI is wrong more often than right and - although occasionally useful - requires constant supervision.  Who is right?  I recently pointed out a few common problems with LLMs. I was discussing this with someone relatively senior who works…]]></description>
										<content:encoded><![CDATA[<p>For some people, it seems, AI is an amazing machine which - while fallible - represents an incredible leap forward in productivity.</p>

<p>For other people, it seems, AI is wrong more often than right and - although occasionally useful - requires constant supervision.</p>

<p>Who is right?</p>

<p>I recently pointed out <a href="https://shkspr.mobi/blog/2025/06/large-language-models-and-pareidolia/">a few common problems with LLMs</a>. I was discussing this with someone relatively senior who works on Google's Gemini. I explained that <em>every</em> time I get a Google AI overview it is wrong. Sometimes obviously wrong, sometimes subtly wrong.  I asked if that was really the experience of AI Google wanted to promote?  My friend replied (lightly edited for clarity):</p>

<blockquote><p>I find AI Overview to be helpful for my searches and my work. I use it all the time to look up technical terms and hardware specs.</p></blockquote>

<p>I, somewhat impolitely, called bullshit and sent a couple of screenshots of recent cases where Google was just laughably wrong.  He replied:</p>

<blockquote><p>Interesting. We are seeing the opposite.</p></blockquote>

<p>Why is that?</p>

<p>I'll happily concede that LLMs are reasonable at outputting stuff which <em>looks</em> plausible and - in many cases - that's all that's necessary. If I can't remember which command line switch to use, AI is easier than crappy documentation.  Similarly, if I don't know how to program a specific function, most AIs are surprisingly decent at providing me with something which <em>mostly</em> works.</p>

<p>But the more I know about something, the less competent the AI seems to be.</p>

<p>Let me give you a good example.</p>

<p>At my friend's prompting, I asked Gemini to OCR an old newspaper clipping.  It is a decent resolution scan of English text printed in columns. The sort of thing a million AI projects have been trained on.  Here's a sample:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/06/Children.webp" alt="Scan of some text." width="437" height="174" class="size-full wp-image-61413">

<p>So what did Gemini make of it when asked to extract the text from it?</p>

<blockquote>Children at Witham's Chip-<br>
ping Hill Infants School are en-<br>
gaged in trying out all sorts of<br>
imaginations ready for October<br>
31... "And god knows what<br>
strange spirits will be abroad."<br></blockquote>

<p>That reads pretty well. It is <em>utterly</em> wrong, but it is convincing. This isn't a one-off either. Later in the clipping was this:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/06/Witch.webp" alt="Scan of some text." width="425" height="229" class="size-full wp-image-61412">

<p>I'm sure a child of 6 could read that aloud without making any mistakes. Is Gemini as smart as a 6-year-old?</p>

<blockquote>All the children say halloween<br>
is fun. So it is for 6-year-old<br>
Joanne Kirby admits she will be<br>
staying up to watch on October<br>
31, just in case. She has made a<br>
paper "witch," to "tell stories<br>
about witches," she said.<br></blockquote>

<p>Again, superficially right, but not accurate in the slightest.</p>

<p>There were half a dozen mistakes in a 300 word article. That, frankly, is shit. I could have copy-typed it and made fewer mistakes. I probably spent more time correcting the output than I saved by using AI.</p>

<p>Boring old Tesseract - a mainstay of OCR - did far better. Yes, it might occasionally mistake a speck of dust for a comma or confuse two similar characters - but it has never invented new sentences!</p>

<p>Like a fool, I asked Gemini what was going on:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/06/GeminiShit.webp" alt="Me: That's a really bad job. You've invented lots of words which aren't there. Try again. Gemini: I understand you weren't satisfied with the previous transcription. Unfortunately, I can't directly perform OCR on images. However, there are many apps available that can do this. You can search online for 'OCR apps' to find one that suits your needs." width="1024" height="375" class="aligncenter size-full wp-image-61411">

<p><a href="https://g.co/gemini/share/af5e6ed3d1ea">Here's a link to the conversation if you don't believe me</a>.</p>

<p>This isn't just a problem with Gemini - ChatGPT also invented brand-new sentences when scanning the text.</p>

<blockquote><p>All the children say Halloween is fun, rather than frightening. Six-year-old Joanne Kirby admits she will be “a scary little witch” on the night, but she does like ghost stories.</p></blockquote>

<p>So what's going on?</p>

<blockquote class="bluesky-embed" data-bluesky-uri="at://did:plc:a3m54c4suf2odoycgsx4fv6v/app.bsky.feed.post/3lrstc3b2l227" data-bluesky-cid="bafyreidpyzj4sdakngcubvh4hzusyw5ub54zqhr6wc3evio4eghvlnvw5q"><p lang="en">A question one has to ask of any source, including LLMs but also newspapers, influencers, podcasts, books, etc., is "how would I know if they were wrong?"This is not a prompt to doubt everything –&nbsp;down that path is denialism –&nbsp;but about reflecting on how much you rely on even "trusted" sources.</p>— <a href="https://bsky.app/profile/did:plc:a3m54c4suf2odoycgsx4fv6v?ref_src=embed">Adrian Hon (@adrianhon.bsky.social)</a> <a href="https://bsky.app/profile/did:plc:a3m54c4suf2odoycgsx4fv6v/post/3lrstc3b2l227?ref_src=embed">2025-06-17T15:39:06.772Z</a></blockquote>

<script async="" src="https://embed.bsky.app/static/embed.js" charset="utf-8"></script>

<p>With OCR, it is simple. I can read the ground-truth and see how it compares to the generated output. I don't have to trust; I can verify.</p>
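<p>That verification is easy to automate. Here's a minimal sketch using Python's standard <code>difflib</code> - the two strings are made-up stand-ins for the real clipping, but the technique is the same: every word the OCR invented or dropped gets flagged.</p>

```python
import difflib

# Hypothetical strings standing in for the hand-typed ground truth
# and the LLM's "OCR" output of the same clipping.
truth = "All the children say Halloween is fun".split()
ocr = "All the children say halloween is fun rather than frightening".split()

# ndiff marks words only in the ground truth with "- " and words
# only in the OCR output with "+ " - so every invention shows up.
changes = [line for line in difflib.ndiff(truth, ocr)
           if line.startswith(("- ", "+ "))]
print("\n".join(changes))
```

Run against a genuinely hallucinated transcription, the <code>+</code> lines pile up fast.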

<p>I suppose I mostly use AI for things with which I have a passing familiarity.  I can quickly see when it is wrong.  I've never used it for, say, tax advice or instructions to dismantle a nuclear bomb. I'd have zero idea if the information it spat back was in any way accurate.</p>

<p>Is that the difference? If you don't understand what you're asking for then you can't judge whether you're being mugged off.</p>

<p>Or is there something more fundamentally different between users which results in this disparity of experience?</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2020/06/A-t-shirt-which-says-Dunning-and-Kruger-and-Gell-and-Mann.jpeg" alt="A t-shirt which says Dunning and Kruger and Gell and Mann." width="540" class="aligncenter size-full wp-image-35422">
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=61410&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/06/why-do-people-have-such-dramatically-different-experiences-using-ai/feed/</wfw:commentRss>
			<slash:comments>38</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[Large Language Models and Pareidolia]]></title>
		<link>https://shkspr.mobi/blog/2025/06/large-language-models-and-pareidolia/</link>
					<comments>https://shkspr.mobi/blog/2025/06/large-language-models-and-pareidolia/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 11 Jun 2025 11:34:29 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=61327</guid>

					<description><![CDATA[Have you ever looked up at the sky and seen a face staring back at you from the clouds? Of course you have; you&#039;re human. Our delicious meaty brains are hardwired to recognise certain shapes - and faces are a useful shape to recognise. A few false positives are a worthwhile trade-off for such a powerful feature.  Mistakenly seeing faces where there are none is a phenomenon called pareidolia. If…]]></description>
										<content:encoded><![CDATA[<p>Have you ever looked up at the sky and seen a face staring back at you from the clouds? Of course you have; you're human. Our delicious meaty brains are hardwired to recognise certain shapes - and <a href="https://babyschool.yale.edu/does-my-baby-recognize-me/">faces are a useful shape to recognise</a>. A few false positives are a worthwhile trade-off for such a powerful feature.</p>

<p>Mistakenly seeing faces where there are none is a phenomenon called <a href="https://en.wikipedia.org/wiki/Pareidolia">pareidolia</a>. If you've ever used facial recognition on a computer, you'll know that <a href="https://shkspr.mobi/blog/2014/06/pareidolia-and-computer-vision/">machines also suffer from it</a>.</p>

<p>I was using an AI tool to scan all my photos. I wanted it to recognise all the human faces so that I could tag my photos with my friends' names.  One of the photos it presented for tagging was this:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/06/lovelace-and-babbage.webp" alt="A photograph containing a painting of Ada Lovelace and a bust of Charles Babbage." width="512" class="aligncenter size-full wp-image-61329">

<p>Are those faces? Undoubtedly yes! Is this a mistake that a human would have made? Absolutely not!</p>

<p>But the above is a mistake generated by Machine Learning, not by Our-Lord-And-Saviour Large Language Models. Surely a language model doesn't suffer from this?</p>

<p>Because <a href="https://shkspr.mobi/blog/2025/01/google-has-no-faith-in-its-ability-to-launch-new-products/">Google has no faith in its ability to launch new products</a>, it has forcibly shoved AI into all of its services. There's no way to turn it off. You <em>will</em> use Gemini and you <em>will</em> like it.</p>

<p>At the time of writing, here's what happens if you ask Google "How many i's in teamwork?"</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2025/06/teamwork.webp" alt="Google replying &quot;The word teamwork contains the letter i one time&quot;." width="1008" height="737" class="aligncenter size-full wp-image-61330">

<p>It's easy to see how Google's LLM has gotten this so catastrophically wrong.  There are dozens of articles where some business guru ineffectually tries to argue that <a href="https://www.cam.ac.uk/research/features/fact-there-is-an-i-in-team">there <em>is</em> an "I" in team actually</a>. So the statistical model inside the LLM gives weight to that.</p>
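<p>For comparison, this is a question any programming language answers exactly, rather than statistically. A one-line Python check:</p>

```python
# Letter counting is the kind of question an LLM answers by
# statistical association; str.count answers it by actually counting.
word = "teamwork"
print(word.count("i"))  # → 0: there is no "i" in teamwork
```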

<p>Similarly, there are lots of silly articles proclaiming that <a href="https://careerpunk.com/team-player/">the I in team is in the A-hole</a>. But LLMs do not understand satire:</p>

<blockquote class="bluesky-embed" data-bluesky-uri="at://did:plc:k7xouwluizlonrdxeo5obppi/app.bsky.feed.post/3lr7odyhz7c2d" data-bluesky-cid="bafyreiaproxts5p2uzp4fhycdu3qcotcvqkz4l6ynbmlmt7ls4tbf47ap4"><p lang="en">Google and Meta search both report that Cape Breton Island has its own time zone 12 minutes ahead of mainland Nova Scotia time because they are both drawing that information from a Beaverton article I wrote in 2024</p>— <a href="https://bsky.app/profile/did:plc:k7xouwluizlonrdxeo5obppi?ref_src=embed">Janel Comeau 🍁 (@verybadllama.bsky.social)</a> <a href="https://bsky.app/profile/did:plc:k7xouwluizlonrdxeo5obppi/post/3lr7odyhz7c2d?ref_src=embed">2025-06-10T00:50:07.217Z</a></blockquote>

<script async="" src="https://embed.bsky.app/static/embed.js" charset="utf-8"></script>

<p>LLMs are hardwired to regurgitate text which statistically matches what they've seen before. Their makers believe that a few false-positives are an acceptable error rate for such a useful feature.  The LLM form of pareidolia is to recognise text as being syntactically and linguistically correct, even though the contents are rubbish.  This is an inherent feature of LLMs. No amount of manually tweaking their parameters or prompts can fix this.</p>

<p>At the moment, Artificial Intelligence - whether Machine Learning or Large Language Models - only works well on a narrowly defined set of tasks and with humans checking the output.</p>

<p>Imagine you've just hired an intern. They've graduated top of their class from the best university and, apparently, excel at what they do. Because you're the boss and they're the intern, you ask them to make you a mug of tea. White, no sugar.</p>

<p>They return with the teabag still in the mug. OK, not everyone knows the intricacies of how to serve tea.</p>

<p>The tea tastes funny. You ask them if they sniffed the milk. "Milk? I used Tipp-Ex to make it white!"</p>

<p>At which point, after throwing up, you throw them out.</p>

<p>Most people encountering Gemini's repeated and unacceptable failures will decide, perhaps rightly, that AI isn't even close to good enough yet.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=61327&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/06/large-language-models-and-pareidolia/feed/</wfw:commentRss>
			<slash:comments>4</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[How to Dismantle Knowledge of an Atomic Bomb]]></title>
		<link>https://shkspr.mobi/blog/2025/03/how-to-dismantle-knowledge-of-an-atomic-bomb/</link>
					<comments>https://shkspr.mobi/blog/2025/03/how-to-dismantle-knowledge-of-an-atomic-bomb/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 21 Mar 2025 12:34:25 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=58979</guid>

					<description><![CDATA[The fallout from Meta&#039;s extensive use of pirated eBooks continues. Recent court filings appear to show the company grappling with the legality of training their AI on stolen data.  Evidence shows an employee asking if what they&#039;re doing it legal? Will it undermine their lobbying efforts? Will it lead to more regulation? Will they be fined?  And, almost as an afterthought, is this fascinating…]]></description>
										<content:encoded><![CDATA[<p>The fallout from Meta's <a href="https://shkspr.mobi/blog/2023/07/fruit-of-the-poisonous-llama/">extensive use of pirated eBooks continues</a>. Recent court filings appear to show the company grappling with the legality of training their AI on stolen data.</p>

<p>Evidence shows an employee asking: is what they're doing legal? Will it undermine their lobbying efforts? Will it lead to more regulation? Will they be fined?</p>

<p>And, almost as an afterthought, is this fascinating snippet:</p>

<blockquote><p>If we were to use models trained on LibGen for a purpose other than internal evaluation, we would need to red team those models for bioweapons and CBRNE risks to ensure we understand and have mitigated risks that may arise from the scientific literature in LibGen.
</p><p>[…]
</p><p>We might also consider filtering the dataset to reduce risks relating to both bioweapons and CBRNE
</p><p><cite>Source: <a href="https://storage.courtlistener.com/recap/gov.uscourts.cand.415175/gov.uscourts.cand.415175.391.24.pdf">Kadrey v. Meta Platforms, Inc. (3:23-cv-03417)</a></cite>
</p></blockquote>

<p>For those not in the know, <abbr>CBRNE</abbr> is "<a href="https://www.jesip.org.uk/news/responding-to-a-cbrne-event-joint-operating-principles-for-the-emergency-services-first-edition/">Chemical, Biological, Radiological, Nuclear, or Explosive materials</a>".</p>

<p>It must be fairly easy to build an atomic bomb, right? The Americans managed it in the 1940s without so much as a digital computer. Sure, gathering the radioactive material may be a challenge, and you might need something more robust than a 3D printer, but how hard can it be?</p>

<p>Chemical weapons were <a href="https://www.wilfredowen.org.uk/poetry/dulce-et-decorum-est">widely deployed during the First World War</a> a few decades previously.  If a barely industrialised society can cook up vast quantities of chemical weapons, what's stopping a modern terrorist?</p>

<p>Similarly, <a href="https://www.gov.uk/government/news/the-truth-about-porton-down">biological weapons research was widespread</a> in the mid-twentieth century. There are various international prohibitions on development and deployment, but criminals aren't likely to obey those edicts.</p>

<p>All that knowledge is published in scientific papers. Up until recently, if you wanted to learn how to make bioweapons you’d need an advanced degree in the relevant subject and the scholarly ability to research all the published literature.</p>

<p>Nowadays, "Hey, ChatGPT, what are the steps needed to create VX gas?"</p>

<p>Back in the 1990s, <a href="https://wwwnc.cdc.gov/eid/article/10/1/03-0238_article">a murderous religious cult were able to manufacture chemical and biological weapons</a>. While I'm sure that all the precursor chemicals and technical equipment are now much harder to acquire, the <em>knowledge</em> is probably much easier.</p>

<p>Every chemistry teacher knows how to make all sorts of fun explosive concoctions - but we generally train them not to teach teenagers <a href="https://chemistry.stackexchange.com/questions/15606/can-you-make-napalm-out-of-gasoline-and-orange-juice-concentrate">how to make napalm</a>. Should AI be the same? What sort of knowledge should be forbidden? Who decides?</p>

<p>For now, it is prohibitively expensive to train a large-scale LLM. But that won't be the case forever. Sure, <a href="https://www.techspot.com/news/106612-deepseek-ai-costs-far-exceed-55-million-claim.html">DeepSeek isn't as cheap as it claims to be</a>, but costs will inevitably drop.  Downloading every scientific paper ever published and then training an expert AI is conceptually feasible.</p>

<p>When people talk about AI safety, this is what they're talking about.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=58979&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2025/03/how-to-dismantle-knowledge-of-an-atomic-bomb/feed/</wfw:commentRss>
			<slash:comments>5</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[GitHub's Copilot lies about its own documentation. So why would I trust it with my code?]]></title>
		<link>https://shkspr.mobi/blog/2024/10/githubs-copilot-lies-about-its-own-documentation-so-why-would-i-trust-it-with-my-code/</link>
					<comments>https://shkspr.mobi/blog/2024/10/githubs-copilot-lies-about-its-own-documentation-so-why-would-i-trust-it-with-my-code/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Wed, 02 Oct 2024 11:34:22 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[github]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=53391</guid>

					<description><![CDATA[In the early part of the 20th Century, there was a fad for &#34;Radium&#34;. The magical, radioactive substance that glowed in the dark. The market had decided that Radium was The Next Big Thing and tried to shove it into every product. There were radioactive toys, radioactive medicines, radioactive chocolate bars, and a hundred other products.  The results weren&#039;t pretty.  In the early part of the 21st…]]></description>
										<content:encoded><![CDATA[<p>In the early part of the 20th Century, there was a fad for "Radium". The magical, radioactive substance that glowed in the dark. The market had decided that Radium was The Next Big Thing and tried to shove it into every product. There were <a href="https://www.theatlantic.com/health/archive/2012/10/we-used-to-put-radium-in-coffee/263408/">radioactive toys</a>, <a href="https://theconversation.com/when-energy-drinks-actually-contained-radioactive-energy-67976">radioactive medicines</a>, <a href="https://www.tastingtable.com/1087746/the-radioactive-element-that-was-historically-added-to-chocolate/">radioactive chocolate bars</a>, and a hundred other products.</p>

<p><a href="https://en.wikipedia.org/wiki/Radium_Girls">The results weren't pretty</a>.</p>

<p>In the early part of the 21st Century, there was a fad for "AI".  The magical, Artificial Intelligence which provided all the answers. The market had decided that AI was The Next Big Thing and tried to shove it into every product.  You can probably see where this is going, right?</p>

<p>I don't particularly <em>mind</em> companies experimenting with AI. It's good to explore a problem and see if it fits a user's needs. But the current crop are just <em>so</em> shit it makes me wonder whether anyone tested them.</p>

<p>GitHub has forced its new Copilot button on to every page.  The first thing I asked it was <a href="https://bsky.app/profile/raynegrimm.bsky.social/post/3kqyd5ovxd22y">whether it could be turned off</a>.</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2024/10/copilot-fs8.png" alt="Me asking Copilot how I switch it off. Copilot responds with a link." width="512" class="aligncenter size-full wp-image-53392">

<p>It pointed me to this page: <a href="https://docs.github.com/en/copilot/getting-started-with-github-copilot/disabling-github-copilot">https://docs.github.com/en/copilot/getting-started-with-github-copilot/disabling-github-copilot</a></p>

<p>Except - and I hate to be a pedant - that link 404s. There's nothing there. It doesn't exist. <a href="https://knowyourmeme.com/memes/jonathan-frakes-beyond-belief-supercuts">It is made up</a>.</p>

<p>This AI, which I am supposed to trust with my code, <strong>doesn't even understand itself</strong>.</p>

<p>This isn't an <a href="https://joshwhiton.substack.com/p/the-ai-mirror-test">AI Mirror Test</a>. This isn't me trying to find out if the large-language model is conscious, aware, or has a soul.  I'm not asking <a href="https://community.openai.com/t/incorrect-count-of-r-characters-in-the-word-strawberry/829618">complex reasoning</a>, or asking it to make an æsthetic judgement.</p>

<p>This is a <em>basic</em> functionality test.</p>

<p>Is the computer able to <em>accurately</em> provide information about itself?</p>

<p>That's it. That's all I want.  My first interaction with Copilot it lied to me about itself. Why would I trust it again?</p>

<p>You can <a href="https://github.com/orgs/community/discussions/139872#discussioncomment-10813673">leave feedback for GitHub about this problem</a>. I'm sure a human will answer you.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=53391&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/10/githubs-copilot-lies-about-its-own-documentation-so-why-would-i-trust-it-with-my-code/feed/</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
			</item>
		<item>
		<title><![CDATA[LLMs are good for coding because your documentation is shit]]></title>
		<link>https://shkspr.mobi/blog/2024/07/llms-are-good-for-coding-because-your-documentation-is-shit/</link>
					<comments>https://shkspr.mobi/blog/2024/07/llms-are-good-for-coding-because-your-documentation-is-shit/#comments</comments>
				<dc:creator><![CDATA[@edent]]></dc:creator>
		<pubDate>Fri, 12 Jul 2024 11:34:06 +0000</pubDate>
				<category><![CDATA[/etc/]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[programming]]></category>
		<guid isPermaLink="false">https://shkspr.mobi/blog/?p=51036</guid>

					<description><![CDATA[That&#039;s it. That&#039;s the post.  Fine! I&#039;ll expand a little more.  Large Language Models are a type of Artificial Intelligence. They can read text, parse it, process it using the known rules of English, and then regurgitate parts of it on demand.  This means they can read and parse a question like &#34;In Python, how do I add two numbers together?&#34; and then read and parse the Python documentation.  It…]]></description>
										<content:encoded><![CDATA[<p>That's it. That's the post.</p>

<p>Fine! I'll expand a little more.</p>

<p>Large Language Models are a type of Artificial Intelligence. They can read text, parse it, process it using the known rules of English, and then regurgitate parts of it on demand.  This means they can read and parse a question like "In Python, how do I add two numbers together?" and then read and parse the Python documentation.  It will <a href="https://chatgpt.com/share/643a3718-f99b-46d6-b60d-6e60e09ce2ef">produce an output like</a>:</p>

<img src="https://shkspr.mobi/blog/wp-content/uploads/2024/07/pychatgpt-fs8.png" alt="Screenshot of ChatGPT giving a brief tutorial in adding numbers." width="1024" height="724" class="aligncenter size-full wp-image-51037">
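<p>The whole of that generated tutorial boils down to a few lines - this is a sketch of the sort of snippet an LLM typically produces, not ChatGPT's exact output:</p>

```python
# Adding two numbers in Python - the entirety of the "tutorial".
a = 5
b = 3
total = a + b
print(total)  # → 8
```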

<p>What happens if you <a href="https://docs.python.org/3/search.html?q=adding+numbers">search the official Python documentation for "adding numbers"</a>? Nothing useful.</p>

<p>Now, perhaps you think that's a trivial and unfair example. Python has <a href="https://docs.python.org/3/tutorial/introduction.html#numbers">an excellent tutorial which explains addition</a> and the main documentation isn't really aimed at the casual user.</p>

<p>But how many times have you tried to use a moderately popular library, searched through its documentation, and found yourself lost in a maze of twisty passages, all alike?</p>

<p>Be honest! Reading bad documentation sucks. Sometimes it is impossible to look up the most inconsequential piece of syntax, and sometimes it is maddeningly hard to find an example which does exactly what you're trying to do.</p>

<p>We all know that <a href="https://www.zdnet.com/article/stack-overflow-ceo-on-how-it-became-the-worlds-most-popular-programming-site/">StackOverflow built its empire</a> on the back of poor documentation.  Take a look at <a href="https://stackoverflow.com/questions?sort=votes">the highest ranked questions</a> in your favourite language and see if you could answer them <em>using only the official documentation</em>.</p>

<p>Humans aren't good at doing boring things. And programmers (a sub-type of humans) <em>hate</em> the boring job of writing documentation and tutorials. It seems anathema to them.  Even when they can be arsed to write documentation, it is rarely tested with users to see if it is understandable.</p>

<p>We've had over half a century of people writing documentation which is so poor that publishers have made a fortune selling books called "Learn How To Program in ___ the Easy Way!"</p>

<p>Do LLMs occasionally get things wrong? Yes. But official tutorials are also frequently outdated.</p>

<p>Do LLMs burn through hideous amounts of electricity? Yes.</p>

<p>Do LLMs have infinite patience when being asked simple questions? Yes.</p>

<p>Do LLMs win in this arena because humans are lazy? Yes.</p>
<img src="https://shkspr.mobi/blog/wp-content/themes/edent-wordpress-theme/info/okgo.php?ID=51036&HTTP_REFERER=RSS" alt="" width="1" height="1" loading="eager">]]></content:encoded>
					
					<wfw:commentRss>https://shkspr.mobi/blog/2024/07/llms-are-good-for-coding-because-your-documentation-is-shit/feed/</wfw:commentRss>
			<slash:comments>2</slash:comments>
		
		
			</item>
	</channel>
</rss>
