Before we start, let’s just get the basics out of the way - yes, stealing the work of hundreds of thousands if not millions of private artists without their knowledge or consent and using it to drive them out of business is wrong. Capitalism, as it turns out, is bad. Shocking news to all of you liberals, I’m sure, but it’s easy to call foul now because everything is wrong at once - the artists are losing their jobs, the slop being used to muscle them out is soulless and ugly, and the money is going to lazy, talentless hacks instead. With the recent implosion of the NFT space, we’re still actively witnessing the swan song of the previous art-adjacent grift, so it’s easy to go looking for problems (and there are many problems). But what if things were different?

Just to put my cards on the table, I’ve been pretty firmly against generative AI for a while, but I’m certainly not opposed to using AI or Machine Learning on any fundamental level. For many menial tasks like Optical Character Recognition and audio transcription, AI algorithms have become indispensable! Tasks like these are grunt work, and by no means is humanity worse off for finding ways to automate them. We can talk about the economic consequences or the quality of the results, sure, but there’s no fundamental reason this kind of work can’t be performed with Machine Learning.

AI art feels… different. Even ignoring where companies like OpenAI get their training data, there are a lot of reasons AI art makes people like me uneasy. Some of them are admittedly superficial, like the strange proportions or extra fingers, but there’s more to it than that.

The problem for me is baked into the very premise - making an AI to do our art only makes sense if art is just another task, just work that needs to be done. If sourcing images is just a matter of finding more grist for the mill, AI is a dream come true! That may sound a little harsh, and it is, but it’s true. Generative AI isn’t really art - art is supposed to express something, or mean something, or do something, and generative AI is fundamentally incapable of functioning on this wavelength. All the AI works with is images - there’s no understanding of ideas like time, culture, or emotion. The entirety of the human experience is fundamentally inaccessible to generative AI simply because experience itself is inaccessible to it. An AI model can never go on a walk, or mow a lawn, or taste an apple; it’s just an image generator. Nothing it draws for us can ever really mean anything to us, because it isn’t one of us. Oftentimes, I hear people talk about this kind of stuff almost like it’s just a technical issue, as if once they’re done rooting out the racial bias or blocking off the deepfake porn, then they’ll finally have some time to patch in a soul. When artist Jens Haaning mailed in two blank canvases titled “Take the Money and Run” to the Kunsten Museum of Modern Art, it was a divisive commentary on human greed, the nature of labor, and the non sequitur pricing endemic to modern art. The knowledge that a real person at that museum opened the box, saw a big blank sheet, and had to stick it up on the wall, the fact that there was a real person on the other side of that transaction who did what they did and got away with it, the story around its creation - that is the art. If Stable Diffusion gave someone a blank output, it’d be reported as a bug and patched within the week.

All that said, is AI image generation fundamentally wrong? Sure, the people trying to make money off of it are definitely skeevy, but is there some moral problem with creating a bunch of dumb, meaningless junk images for fun? Do we get to cancel Neil Cicierega because he wanted to know how Talking Heads frontman David Byrne might look directing traffic in his oversized suit?

Maybe just a teensy bit, at least under the current circumstances.

I’ll probably end up writing a part 2 about things like data harvesting, not sure yet. I feel especially strongly about the whole “AI is just another tool” discourse when people are talking about using these big models, so don’t even get me started on that.

  • AlicePraxis [any]@hexbear.net · 3 months ago

    preface: I’m about to disagree with something you said, which will read as a defense of generative AI. In fact, I have many criticisms and reservations about the technology, but I also disagree with some of the arguments leveled against it. Keep in mind I’m not some AI-loving zealot, and I actually agree with much of what you’ve said; I just think this one specific point needs to be clarified.

    offloading of all creative responsibility to an algorithm

    Not all of it. Generative AI absolutely does involve human decision making. The choice in what to enter as a prompt is a creative decision. Now… that being said, I don’t think simple text prompting is sufficient to make good art, as it does still leave way too many arbitrary decisions to the algorithm, to your point. Good art is more than just listing a bunch of keywords.

    But the process doesn’t have to be as simple as entering a single text prompt, taking what the algorithm spits out, and calling it a day. It can be an iterative process where the person continuously makes changes based on what they like or don’t like about the last generated result. Instead of just stopping at the first image, you might say: okay, this is a good start, but make that mountain in the back taller and the sky darker. Now remove that person standing in the background. Also, I don’t like the color of the subject’s hair; make it black instead. And so on.

    Current genAI tech is not particularly good at doing what I just described, but the technology is still in its infancy. As the tech continues to develop, it will allow for more and more human input, more iteration, more creative control. But even in its current crude state there are many methods for having finer control over the results than a simple text prompt. Speaking in Stable Diffusion terms, there is img2img for generating an image based on another image. There is inpainting, which allows you to make changes to a small portion of the original image. ControlNet allows you to generate humans with a specific pose. Regional prompter is used to control the composition of an image.

    I once saw an AI generated image of Putin wearing a ballerina dress. This is, of course, a bad piece of art. It’s not bad because of the decisions made by the algorithm, it’s bad because of the poor creative decisions made by a human being.

    • OutrageousHairdo [he/him]@hexbear.net (OP) · 3 months ago

      While I don’t think delineating “real art” from mere images is as simple as coming up with the right math formula, I do think that we can get somewhere by looking at how many decisions a human is making versus the decisions they aren’t making. For the sake of discussion, let’s ignore all the commentary pieces specifically about AI, or those for which the lack of any guiding intention is part of the artistic message, and just talk about plain old art. The thing that’s important to me is that every single piece of an image is a choice - every line, every brush stroke, every pixel. Take, for example, Starry Night - the reason that painting looks the way it does, looks so meaningful, isn’t just because van Gogh painted a really pretty town. It’s not just about the complete image, but the way that each individual stroke swirls and loops into the others. There were thousands of mutually reinforcing decisions that the artist made there; each movement of the brush was chosen deliberately to reinforce the piece’s intended viewing experience. The comparison to current technology is almost comically unfavorable, and while I don’t think images created with AI assistance are categorically incapable of being art, the vast majority of this material is indisputably tripe, and I would argue the use of AI in the process does something to taint the final product in many cases.

      • KobaCumTribute [she/her]@hexbear.net · 3 months ago

        The problem with fetishizing the specific methods and intentionality of details is how that squares with available tools and materials and the purpose of a piece in the first place. That is, how intentional are details in a photograph? How much thought was put into some fuzzy background bit that’s only there because something had to be there? How intentional is some procedurally filled environment in a 3d render? How do mass-produced rote decorations like Roman mosaics and statues, or older lost-wax casting, fit in? So much of art is in practice just pragmatic methodology: things are reused to save labor and resources, rote actions or patterns are used because they simplify the process of creation or are part of a standard abstract language of a genre, things can be present just because they were available to use, large chunks can be handed off to other workers who are just performing some rote task to follow a description, etc.

        Hitting the “give treat” button on some proprietary off-site treatbox with no controls can only produce nonsense, but the actual mechanical stuff that facilitates that can - in an open source and locally-run package - be used in the same manner as prefab models in a render or already-existing places in a photograph.

        the vast majority of this material is indisputably tripe

        At this point the majority of people interacting with generative AI are techbro dipshits, and most AI art is being churned out by treatboxes with a simple prompt as the only control. Even among the “enthusiast” community most people just use local prompt-only treatboxes because even a basic flowchart UI is “too complicated” for them. I said this in another thread, but I’m heavily reminded of the original proliferation of highly accessible 3d renderer software and how that created a flood of low quality garbage where it’s just some prefab model standing in some prefab set with bad lighting and a bad pose, that people tried to turn into “comics” that are just a long progression of single panel pages with speech bubbles. Even now CGI is something that either has to be carefully hidden and blended into film or animation or which occupies its own realm of stylization and accepted conventions, because anything else looks jarring as hell and bad.

        Generative AI is that cranked up to 11: a tool where the barrier to reach “this is a picture with a character in it, sort of doing something I guess” is basically nothing. And just like CGI, most uses of it are bad and the good uses are more trying to shortcut off it and then hide its presence than use it directly.