
Stable Diffusion Has Arrived and I Am Obsessed.


Can You Believe It's Not a Photograph or Artwork?

I confess I am obsessed. The newest iteration of AI drawing has been launched. Stable Diffusion is among the most advanced text-to-image algorithms available, and the beta version has been released on @nightcafestudio.

The release was highly anticipated, and I can say that, for the most part, it hasn't disappointed. The clarity is superb, and the ability to reproduce the human form has improved tenfold from previous versions.



Growing Pains - Is AI at the teenage stage?

Have you ever tried to show a teenager how to use a washing machine, or how to do any other task they may not be interested in learning? In my experience, they play the "you literally said that" game and take it to the extreme. My first experience with Stable Diffusion (SD) was similar: you have to be specific, or it will follow your instructions literally, with not-so-funny results.

As an artist, I am in two minds. The results have been very 'photographic'. This is, of course, what most people were looking forward to. The current beta release is not able to use a source image as part of the prompt, so you need to be skillful with your text input and rather knowledgeable about the modifiers to get the image you want. My first prompt was "seeing myself in a mirror", with a detailed charcoal drawing as the modifier. While the mirror was very good, there was no reflection in it.


The second iteration was impressive. This time the prompt was "Old woman looking at reflection in a mirror." Again, I used a detailed charcoal drawing as the modifier.
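For anyone who likes to keep track of their prompts, here is a minimal sketch of how the two attempts above could be written down and compared. It assumes the modifier is simply added to the end of the prompt text after a comma, which is my guess and not something NightCafe documents here. The `build_prompt` helper is hypothetical and is not part of NightCafe or Stable Diffusion.

```python
# Hypothetical helper: joins a subject and its style modifiers into one prompt.
# Assumption: modifiers are appended to the subject, separated by commas.
def build_prompt(subject: str, modifiers: list[str]) -> str:
    # Tidy the subject: drop surrounding spaces and a trailing full stop.
    parts = [subject.strip().rstrip(".")]
    # Add each non-empty modifier, trimmed of surrounding spaces.
    parts += [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts)


# First attempt: the subject never says who or what should be reflected.
vague = build_prompt("seeing myself in a mirror.", ["detailed charcoal drawing"])

# Second attempt: the subject names the person and the reflection explicitly.
specific = build_prompt(
    "Old woman looking at reflection in a mirror. ",
    ["detailed charcoal drawing"],
)

print(vague)
print(specific)
```

Seeing the two prompts side by side makes the difference plain: the modifier is identical, and only the more literal subject changed.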

Encouraged by the initial results, I continued to play with various prompts, and the output ranged from "hard to believe it's not art" to hilarious.

I then tried "1920s flapper dancing the Charleston in a smoky speakeasy" and ended up with a photograph of female flappers, but the man in the picture was wearing a suit coat and a fringed skirt.

Another small issue came up when I evolved the image using the artistic algorithm. The image immediately lost quality and reverted to the "normal" inability to render the human form. It is a small issue in the scheme of things, and I am sure people will share hints and tips as they try different combinations.


I need more time learning how to instruct the AI before I can fully evaluate it. My observations so far, from my own perspective, are:



  • The results are satisfyingly photographic, for those who want that result. The detail is phenomenal (with the right prompt/text input). I found I needed to go around a few times, evolving the image and being very specific with the artistic modifiers.

  • I discovered that, even if you go right back to the start rather than evolving a previous prompt, the results tend to get stuck in a loop: I ended up with basically the same style of image over and over. To break out of it, you need to use very specific style/artist prompts. This may, of course, be down to my inexperience.

  • Once you evolve an image created with Stable Diffusion by going back to the original algorithms (especially artistic), there seems to be a disconnect, and you end up with a distorted human figure. Again, the AI is only as good as the information it has to work with, so user input is critical.

Overall, I am definitely going to explore some more and see what happens. If you want to see some of the mixed results, you can find them on my YouTube channel.



Links and References

Stable Diffusion Definition: A latent text-to-image diffusion model capable of generating photo-realistic images given any text input.

Author: Anna Buxton, 8th August 2022
