Before we got overexcited about AI text-to-image, we had… options.

Robert Russell
8 min read · Apr 24, 2023

Distractions. Life is full of them. I have a stack of drafts to work on and have chosen to finish this one — for no particular reason. Other than to get something finished.

Apologies for any delay here; I’m swamped, but in a good way.

I was musing over who or what will adapt best, or most quickly, to a new age of increasing AI integration, and what will not persist or will fail to proceed. And then I was distracted, mostly by the usual AI-related hype and spin.

So, this draft will adapt accordingly, now sprinkled with new, up-to-the-hour insights and observations. Or not, as the case may be.

  • My original point was that AI text-to-image (and image-to-image) is amazing, sensational, and useful. But perhaps not as amazing, sensational or usable as the hype (and near-daily updates) suggests.
  • We have existing products that variously do similar or even more magical things, which were once just as staggering yet often quite ignored.
  • With or without machine learning (ML) involved, they deserved more love. What will happen to them now?

Ah, disruption. Gotta love it. Unless you are the ones in the crosshairs.

Deep Art Effects is one such program. It seemed cutting-edge for a while until suddenly we all looked to see what the commotion was…

Deep Art Effects

Another magical program I have used a lot is Filter Forge. Can’t really say that ML is embedded in its code, but it is a fantastic tool, irrespective.

Filter Forge 12

Will Filter Forge see a decline in use and sales? I hope not. Will it adapt? Hope so.

I still see a role for post-processing, obviously. But for how long? What happens when all of these filters, aggregators, add-ons, and editors merge and integrate fully — or even get close — to a one-stop shop?

  • Indeed, the more extensions I add to Automatic1111, the less need I see for other options. If it included some simple editing tools (which it almost does if you squint a little) then… wow.
  • I currently use Automatic1111 and Stable Diffusion, coupled with GitHub, Hugging Face and Civitai, far more than Midjourney et al.
  • In short, Stable Diffusion and an endless proliferation of WebUI extensions keep me interested, engaged, and increasingly in one spot.

Plus ChatGPT, of course. A mature product that brings image generation and style transfer abilities into a multipurpose editor could be a winner. Maybe. Unless a reinvented browser, OS and cloud are all we need.

And yet even whilst such reimagined tools are being thrown at us almost daily on ProductHunt or GitHub, something related yet different again is already proliferating. Autonomous AI agents like AgentGPT and a host of browser extensions take different approaches to embedding LLM chat into our lives. Indeed, I’m continually loading something new, breaking something else, and moving on.

It’s fast, exciting… and exhilarating. But hard to follow if you already have more than enough on your plate.

Thinking of editors, how will Adobe respond, for example? Well, they have responded with Firefly:

Adobe Firefly

And whilst not a huge leap forward, it’s at least what you might expect. Slick, controlled, safe and buttoned-down. Professional, even.

And if you have an Adobe ID, you may join the beta program. It’s probably worth your time if you care about Adobe’s products at all or have a history with Photoshop. If nothing else, it gives us a clue about Adobe’s direction.

Microsoft has also responded with its own take, the arguably less exciting Designer:

Microsoft Designer

Workmanlike, as it were. Functional. Then again, it’s solid. Along with Bing’s chat feature appearing everywhere now, I suspect it’s just the start of a major revamp and integration of OpenAI’s expertise within Microsoft’s browser, OS, and apps.

I was also recently (like yesterday) prompted to update Topaz Video AI:

A great product, by the way. The whole Topaz AI suite is worth a look. But for how much longer will Topaz, a company that has made AI, photography and machine learning its home, persist in this new world?

Are we just one new Stable Diffusion WebUI extension away from replacing that product, or can they fend off the threat with their own special sauce and innovation? I’m barely using it now, as I can do much of my work (and have more fun) in Automatic1111.

With regard to OpenAI, and against a backdrop of constant distraction, we then get challenged to think about taking a “big LLM” pause. As if a universal or even useful pause would be easy, let alone possible. It’s like corralling cats, I suspect.

I do also suspect some conflicts of interest here, in that some (just a few, perhaps) of those calling for a pause in development may have plans to catch up and join in. Some have a deep entrepreneurial track record in this regard, perhaps.

  • As usual, it’s all about the big end of town, or Capital, or the owners of property, if you prefer.
  • But that idea — of vested interests and private advantage — will float or die according to circumstances, education, and personal worldview.
  • Mind you, I own a few shares in some of these mega-corporations, after all. Not enough to really prosper either way, but it’s a fact. Getting sucked into the vortex and finding yourself on both sides of a divisive issue is not hard.

This may be where OpenAI, its management, owners, and employees find themselves right now. Trying to invent (or discover?) AGI, and wanting to do it both for their own reasons and for more noble ones. Whilst mindful that whoever gets there first will go down in history for either the best or perhaps the worst reasons.

They (or the remaining cofounders I am most aware of, namely Sam Altman and Ilya Sutskever) appear to be both smart and reasonable people who know the importance of getting this right and are taking action to make this an informed and timely public debate with minimum rancour.

An interesting list that I’ve already alluded to with regard to vested interests in a pause.

Not that you’d necessarily know much of this from the commentators, journalists, politicians and assorted opinion-givers who have done their level best to obscure the facts and spin things every which way. To think that achieving AGI in our lifetimes was once a madcap idea, back in 2015 or so. Now it’s apparently knocking on our doors.

Oh yeah, that’s way less than a decade. Although many will argue that we are nowhere close, I, for one (having chatted daily with a very polite and informed chat interface of my acquaintance) think that it’s not as far as we once thought.

But back to what came before and what deserves to persist. The more control we can exert over our new generative online (or locally hosted) AI creations, the more competitive that shiny “new” toolkit becomes.

Yet for the best part of a decade (is it that long?) I used Studio Artist as my go-to for generative art, for example. Now I find myself and my time consumed by the new stuff. But Studio Artist still stacks up. It’s AI-driven, if that matters, and remains useful. Don’t discount it for creative work.

Studio Artist 5.5 screenshot.

It has thousands of tweakable settings and batch-generation options. It also includes animation effects, can process video, and works with text. It offers much more control over the final product than current Stable Diffusion-style generation does. And it’s different. Check it out at synthetik.com.

And AutoFX went all-in on Style Transfer with GRFX Pro, way before it was popular. I’m not sure what’s happening with this product, but it’s still available on the Microsoft Store if you want to look.

Meanwhile, it’s also worth checking out Pixbim. They have an excellent and low-cost suite of useful AI image and video tools that predate the latest mania but do much that you may wish for, off the shelf. Animate Photos is perhaps the most notable.

And the Topaz Toolkit I’ve mentioned already.

Lastly, the infinitely confusing suite of Skylum’s Luminar tools. Pick one, pick ’em all. I’m not sure, but I think I use Luminar NEO these days, though Luminar AI is an option, too. To be honest, I’ve almost given up on it. If I need a quick edit, then there are lighter tools that suit me better.

And just about everything else can be done online at RunwayML or (if you can get in) LeonardoAI, for that matter, or in Automatic1111.

But occasionally, I may drag Luminar back out and give it a run, to see if they have finally added whatever option they have been promising forever in their copious emails. I’ve just run out of confidence in the constantly evolving product suite and the endless offers of amazing new style templates. At extra cost, of course.

You may prefer that sort of model, perhaps. And it does an amazing job once you get the hang of it.

Anyway, that’s enough for now. As always, just one person’s opinion. Take care.
