Before we start, let’s just get the basics out of the way - yes, stealing the work of hundreds of thousands if not millions of private artists without their knowledge or consent and using it to drive them out of business is wrong. Capitalism, as it turns out, is bad. Shocking news to all of you liberals, I’m sure, but it’s easy to call foul now because everything is wrong at once - the artists are losing their jobs, the slop being used to muscle them out is soulless and ugly, and the money is going to lazy, talentless hacks instead. With the recent implosion of the NFT space, we’re still actively witnessing the swan song of the previous art-adjacent grift, so it’s easy to go looking for problems (and there are many problems). But what if things were different?

Just to put my cards on the table, I’ve been pretty firmly against generative AI for a while, but I’m certainly not opposed to using AI or Machine Learning on any fundamental level. For many menial tasks like Optical Character Recognition and audio transcription, AI algorithms have become indispensable! Tasks like these are grunt work, and by no means is humanity worse off for finding ways to automate them. We can talk about the economic consequences or the quality of the results, sure, but there’s no fundamental reason this kind of work can’t be performed with Machine Learning.

AI art feels… different. Even ignoring where companies like OpenAI get their training data, there are a lot of reasons AI art makes people like me uneasy. Some of them are admittedly superficial, like the strange proportions or extra fingers, but there’s more to it than that.

The problem for me is baked into the very premise - making an AI to do our art only makes sense if art is just another task, just work that needs to be done. If sourcing images is just a matter of finding more grist for the mill, AI is a dream come true! That may sound a little harsh, and it is, but it’s true. Generative AI isn’t really art - art is supposed to express something, or mean something, or do something, and generative AI is fundamentally incapable of functioning on this wavelength. All the AI works with is images - there’s no understanding of ideas like time, culture, or emotion. The entirety of the human experience is fundamentally inaccessible to generative AI simply because experience itself is inaccessible to it. An AI model can never go on a walk, or mow a lawn, or taste an apple; it’s just an image generator. Nothing it draws for us can ever really mean anything to us, because it isn’t one of us. Oftentimes, I hear people talk about this kind of stuff almost like it’s just a technical issue, as if once they’re done rooting out the racial bias or blocking off the deepfake porn, then they’ll finally have some time to patch in a soul. When artist Jens Haaning mailed in two blank canvases titled “Take the Money and Run” to the Kunsten Museum of Modern Art, it was a divisive commentary on human greed, the nature of labor, and the non sequitur pricing endemic to modern art. The knowledge that a real person at that museum opened the box, saw a big blank sheet, and had to stick it up on the wall, the fact that there was a real person on the other side of that transaction who did what they did and got away with it, the story around its creation - that is the art. If Stable Diffusion gave someone a blank output, it’d be reported as a bug and patched within the week.

All that said, is AI image generation fundamentally wrong? Sure, the people trying to make money off of it are definitely skeevy, but is there some moral problem with creating a bunch of dumb, meaningless junk images for fun? Do we get to cancel Neil Cicierega because he wanted to know how Talking Heads frontman David Byrne might look directing traffic in his oversized suit?

Maybe just a teensy bit, at least under the current circumstances.

I’ll probably end up writing a part 2 about things like data harvesting, not sure yet. I feel especially strongly about the whole “AI is just another tool” discourse when people are talking about using these big models, so don’t even get me started on that.

  • AlicePraxis [any]@hexbear.net · 3 months ago

    preface: I’m about to disagree with something you said, which will read as a defense of generative AI. In fact, I have many criticisms and reservations about the technology, but I also disagree with some of the arguments leveled against it. Keep in mind I’m not some AI-loving zealot, and I actually agree with much of what you’ve said; I just think this one specific point needs to be clarified.

    offloading of all creative responsibility to an algorithm

    Not all of it. Generative AI absolutely does involve human decision making. The choice in what to enter as a prompt is a creative decision. Now… that being said, I don’t think simple text prompting is sufficient to make good art, as it does still leave way too many arbitrary decisions to the algorithm, to your point. Good art is more than just listing a bunch of keywords.

    But the process doesn’t have to be as simple as entering a single text prompt, taking what the algorithm spits out, and calling it a day. It can be an iterative process where the person continuously makes changes based on what they like or don’t like about the last generated result. Instead of just stopping at the first image, you might say: okay, this is a good start, but make that mountain in the back taller and the sky darker. Now remove that person standing in the background. Also, I don’t like the color of the subject’s hair; make it black instead. And so on.

    Current genAI tech is not particularly good at doing what I just described, but the technology is still in its infancy. As the tech continues to develop, it will allow for more and more human input, more iteration, more creative control. But even in its current crude state there are many methods for having finer control over the results than a simple text prompt. Speaking in Stable Diffusion terms, there is img2img for generating an image based on another image. There is inpainting, which allows you to make changes to a small portion of the original image. ControlNet allows you to generate humans with a specific pose. Regional prompter is used to control the composition of an image.

    I once saw an AI generated image of Putin wearing a ballerina dress. This is, of course, a bad piece of art. It’s not bad because of the decisions made by the algorithm, it’s bad because of the poor creative decisions made by a human being.

    • OutrageousHairdo [he/him]@hexbear.net (OP) · 3 months ago

      While I don’t think delineating “real art” from mere images is as simple as coming up with the right math formula, I do think that we can get somewhere by looking at how many decisions a human is making versus the decisions they aren’t making. For the sake of discussion, let’s ignore all the commentary pieces specifically about AI, or those for which the lack of any guiding intention is part of the artistic message, and just talk about plain old art. The thing that’s important to me is that every single piece of an image is a choice - every line, every brush stroke, every pixel. Take, for example, Starry Night - the reason that painting looks the way it does, looks so meaningful, isn’t just because van Gogh painted a really pretty town. It’s not just about the complete image, but the way that each individual stroke swirls and loops into the others. There were thousands of mutually reinforcing decisions that the artist made there; each movement of the brush was chosen deliberately to reinforce the piece’s intended viewing experience. The comparison to current technology is almost comically unfavorable, and while I don’t think images created with AI assistance are categorically incapable of being art, the vast majority of this material is indisputably tripe, and I would argue the use of AI in the process does something to taint the final product in many cases.

      • KobaCumTribute [she/her]@hexbear.net · 3 months ago

        The problem with fetishizing the specific methods and intentionality of details is how that squares with available tools and materials and the purpose of a piece in the first place. That is, how intentional are details in a photograph? How much thought was put into some fuzzy background bit that’s only there because something had to be there? How intentional is some procedurally filled environment in a 3d render? How do mass-produced rote decorations like Roman mosaics and statues, or older lost-wax casting, fit in? So much of art is in practice just pragmatic methodology: things are reused to save labor and resources, rote actions or patterns are used because they simplify the process of creation or are part of a standard abstract language of a genre, things can be present just because they were available to use, large chunks can be handed off to other workers who are just performing some rote task to follow a description, etc.

        Hitting the “give treat” button on some proprietary off-site treatbox with no controls can only produce nonsense, but the actual mechanical stuff that facilitates that can - in an open source and locally-run package - be used in the same manner as prefab models in a render or already-existing places in a photograph.

        the vast majority of this material is indisputably tripe

        At this point the majority of people interacting with generative AI are techbro dipshits, and most AI art is being churned out by treatboxes with a simple prompt as the only control. Even among the “enthusiast” community most people just use local prompt-only treatboxes because even a basic flowchart UI is “too complicated” for them. I said this in another thread, but I’m heavily reminded of the original proliferation of highly accessible 3d renderer software and how that created a flood of low quality garbage where it’s just some prefab model standing in some prefab set with bad lighting and a bad pose, that people tried to turn into “comics” that are just a long progression of single panel pages with speech bubbles. Even now CGI is something that either has to be carefully hidden and blended into film or animation or which occupies its own realm of stylization and accepted conventions, because anything else looks jarring as hell and bad.

        Generative AI is that cranked up to 11: a tool where the barrier to reach “this is a picture with a character in it, sort of doing something I guess” is basically nothing. And just like CGI, most uses of it are bad and the good uses are more trying to shortcut off it and then hide its presence than use it directly.