Raster. Vector. Generative.

@edent · AI future images · 6 comments · 400 words · Viewed ~207 times.

When I was a kid, I "invented" a brilliant new compression format. Rather than sending a digital image of, say, the Mona Lisa, a user could just send the ASCII characters "Mona Lisa". The receiving computer could look up the full image in its memory-banks and reproduce the work of art on screen. Genius! Of course, it relies on the receiver having a copy of every single image in existence, but that's just details...

It strikes me that AI might now get us part way to that being a reality.

Traditionally, images are stored in raster format - essentially a grid of pixel values. These files tend to be rather large, so compression is used to make them smaller using increasingly complex schemes.

Similarly, vector graphics - which are usually written as text - can be compressed with all sorts of wonderful algorithms.
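To make the raster and vector cases concrete, here is a minimal sketch using Python's standard zlib module. The 64x64 gradient and the little SVG are made-up test data, not anything from a real image, and a general-purpose compressor like this is only a stand-in for the specialised schemes real image formats use.

```python
import zlib

# A tiny 64x64 greyscale "raster": one byte per pixel, a horizontal gradient.
# (Illustrative data only.)
WIDTH, HEIGHT = 64, 64
raster = bytes(x * 4 % 256 for _ in range(HEIGHT) for x in range(WIDTH))

# A simple vector picture, written as SVG text. (Also illustrative.)
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="64" height="64">'
    '<rect width="64" height="64" fill="grey"/>'
    '<circle cx="32" cy="32" r="16" fill="white"/>'
    "</svg>"
).encode("utf-8")


def compress(data: bytes) -> bytes:
    """Losslessly compress bytes with DEFLATE at the highest level."""
    return zlib.compress(data, level=9)


for name, data in (("raster", raster), ("vector", svg)):
    packed = compress(data)
    # Lossless: decompressing gives back exactly what went in.
    assert zlib.decompress(packed) == data
    print(f"{name}: {len(data)} bytes -> {len(packed)} bytes")
```

The raster grid shrinks a lot because every row is identical. Very short text like this SVG has much less redundancy to exploit, so the saving there is small at best.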

So what happens if we add AI to the mix?

A self-hosted AI image generator takes, very roughly, a few hundred GB of disk space, plus a large amount of RAM, and some fairly hefty compute. But, once that's done, you could "compress" images by specifying an AI engine and weighted prompts like this:

<img src="data:image/AI;MidJourney/Mona_Lisa+/Original_Style+++/Ornate_Frame---/600x800/" alt="An AI generated Mona Lisa" />
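For the curious, here is a rough sketch of how a receiver might unpack that made-up URI before handing it to a generator. Everything here is an assumption: no such data:image/AI scheme exists, the names parse_ai_image_uri and AIImageRequest are invented for illustration, and I'm guessing that each trailing "+" raises a prompt's weight by one and each "-" lowers it.

```python
from dataclasses import dataclass


@dataclass
class AIImageRequest:
    """Hypothetical container for the parts of the imaginary URI."""
    engine: str
    prompts: dict[str, int]  # prompt text -> weight ("+" raises, "-" lowers)
    width: int
    height: int


def parse_ai_image_uri(uri: str) -> AIImageRequest:
    """Split the imaginary data:image/AI URI into engine, prompts and size."""
    prefix = "data:image/AI;"
    if not uri.startswith(prefix):
        raise ValueError("not a data:image/AI URI")
    # Drop empty segments, such as the one after the trailing slash.
    parts = [p for p in uri[len(prefix):].split("/") if p]
    if len(parts) < 2:
        raise ValueError("expected at least an engine and a size")
    engine, *terms, size = parts
    width, height = (int(n) for n in size.split("x"))
    prompts = {}
    for term in terms:
        text = term.rstrip("+-")
        suffix = term[len(text):]
        # Assumed convention: net weight is the count of "+" minus "-".
        weight = suffix.count("+") - suffix.count("-")
        prompts[text.replace("_", " ")] = weight
    return AIImageRequest(engine, prompts, width, height)


request = parse_ai_image_uri(
    "data:image/AI;MidJourney/Mona_Lisa+/Original_Style+++/Ornate_Frame---/600x800/"
)
print(request)
```

The parsing is the easy bit, of course. The "decompression" step, where the named engine actually turns those weighted prompts into pixels, is where the hundreds of gigabytes come in.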

Perhaps that's a little far-fetched. But back when DVDs first came out, people had to buy specialised hardware cards to decode the MPEG compression on their computers. Nowadays a cheap Raspberry Pi does the trick. In the future, all our computers will have multi-terabyte AI models baked in.

But people generally want to ensure that the receiver sees roughly the same picture as the sender. Something you can't easily guarantee with a generative model.

What about this idea from Ben Hardill?

Take something like a ThumbHash of an image, which is less than 1KB:

You could, theoretically, embed that in HTML using something like:

<img src="thumbhash.png" width="256" upscale="DALL-E" keyword="Field of grass with cloudy skies" />

And get:

I think of things like the psychoacoustic compression of MP3 files and wonder whether those crappy 64kbps rips I have from the 1990s could be magically restored to sonic perfection by an AI? Or perhaps I'll just ask ChatGPT to sing me a song, in the style of Paul McCartney, called "Maybe I'm Amazed".

I don't know if that's the future. It's certainly one future.

6 thoughts on “Raster. Vector. Generative.”

2023-03-26 12:39

mike says:

Of course no AI generated Mona Lisa will ever match the magnificence of your original.
2023-03-26 12:48

TongLen said on indieweb.social:

@Edent really interesting - thank you 🖖🏽
2023-03-26 13:31

Andrew L said on mastodon.me.uk:

@Edent although wouldn't it be nice if every visitor could be presented with their own unique version of the image? Maybe if their device has on-board processing seeded with their own image style preferences?
2023-03-26 15:20

m says:

Yes, it's almost as if you would point your smartphone camera toward the moon and zoom in, and it would get all its detailed imagery from an 'AI' source and not from its actual lens. I'm sure that would never happen 🙂
2023-03-26 16:39

Sherif Azmy says:

There’s an ongoing research topic right now called “Semantic Communications” that talks about this notion on a broader scheme. It seeks to redefine concepts such as channel capacity and achieve “Beyond Shannon Communication”. Back in the 1950s Shannon and others, in their work on information theory, identified three levels of communication:

Level A: How accurately can the symbols of transmission be transmitted? (The technical problem)
Level B: How precisely do the transmitted symbols convey the desired meaning? (The semantic problem)
Level C: How effectively does the received meaning affect conduct in the desired way? (The effectiveness problem)

There is a good IEEE Magazine paper that explains it further: https://ieeexplore.ieee.org/document/9955312 I think of the AI boom as an enabler of Level B, but things are still too wide for us to develop precise systems. Demystifying the “Black Box” is one key to achieving precise abstraction of communication into semantics.
2023-03-26 20:00

Nicolas Bouilleaud said on mamot.fr:

@Edent This article in the newyorker uses the same "compression" analogy: ChatGPT is a lossy compression of the whole web. https://www.newyorker.com/tech/annals-of-technology/chatgpt-is-a-blurry-jpeg-of-the-web
More comments on Mastodon.
