I had always seen requests for it on various Swift posts. Someone will post a long lyric acronym, and someone else who isn’t in the know will be frustrated and lament a site like this not existing.

It’s pretty basic right now, but you can search any acronym that’s between 4 and 50 letters long, and it will tell you the song and album of any match.

Behind the scenes, there are 30,000 files, split by the first 4 letters of the acronym, so when you do a search you’ll pull the file that matches the first 4. Eventually I’d like to make it so you can get more letters from a short acronym, but I had to start somewhere.
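
A minimal sketch of how that lookup could work, for anyone curious. The `shards/` path, the `.json` extension, the `{ acronym, song, album }` record shape, and the "starts with" matching rule are all assumptions made for illustration, not necessarily what the site does.

```javascript
// Hypothetical sketch: paths, names and record shape are assumptions.
const MIN_LEN = 4;
const MAX_LEN = 50;

// Map a search query to the shard file named after its first 4 letters.
// Returns null when the query is not a searchable acronym.
function shardPathFor(query) {
  const acronym = query.trim().toUpperCase();
  const valid = /^[A-Z]+$/.test(acronym) &&
    acronym.length >= MIN_LEN && acronym.length <= MAX_LEN;
  return valid ? `shards/${acronym.slice(0, 4)}.json` : null;
}

// Keep only the records in a shard whose acronym starts with the query
// (assumed matching rule).
function findMatches(records, query) {
  const acronym = query.trim().toUpperCase();
  return records.filter(record => record.acronym.startsWith(acronym));
}
```

The client only ever downloads the one small file for its 4-letter prefix and filters it locally, so no server-side code is needed.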

So… Wdyt? I’d love some feedback.

  • devdad@programming.dev
    10 months ago

    I have no skin in the game for the app itself, I just saw your post on the “front page” while scrolling and shitting….

    > there are 30,000 files

    I’m more intrigued why you appear to be managing this with files. Why not use a database?

    Edit: you’re also not handling misses very well. I just tried a random string and got the below error. That doesn’t tell me if it was my fault or a server error.

    An error occurred while processing your request. Please try again later.

    • charles@poptalk.scrubbles.techOP
      10 months ago

      I appreciate the response, even from a non-swiftie.

      Yeah, that error message is left over from an earlier version where I sharded by acronym length instead. At that time, there would always be a file. The problem was the files were getting to be huge at the 50 character length (20MB) and performance went to shit on poor mobile connections. So I refactored to shard by first 4, and files dropped down to a few K each and became a lot snappier.

      As for why files, there are a few reasons:

      • This way, the data are statically hosted, which means I can take advantage of a number of different free hosting services. In this case, I used Firebase.
      • The data doesn’t change frequently. There’s gonna be two more re-records released, and then we’ll probably get back to the old cadence of an album every year or two.
      • Why would I use a whole ass database if I could avoid it? From the client perspective, the request looks basically the same: “hey server, give me this data”, but this way it’s 100% static with a CDN, so the response will be sub-10ms for a lot of people.
      • Not doing any processing server side means I don’t have to worry about it going viral and breaking under the load.
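
      The refactor from length-based shards to first-4 shards could be sketched as a build step like the one below. This is a rough illustration only: the `{ acronym, song, album }` record shape and the `shardByPrefix` name are assumptions.

      ```javascript
      // Hypothetical build step: group records into one bucket per 4-letter prefix.
      // The record shape { acronym, song, album } is an assumption.
      function shardByPrefix(records, prefixLength = 4) {
        const shards = new Map();
        for (const record of records) {
          // Acronyms shorter than the prefix cannot be assigned to a shard.
          if (record.acronym.length < prefixLength) continue;
          const key = record.acronym.slice(0, prefixLength);
          if (!shards.has(key)) shards.set(key, []);
          shards.get(key).push(record);
        }
        return shards;
      }
      ```

      Each bucket would then be serialized with `JSON.stringify` and written out as its own static file, so a search only ever downloads the one small file for its prefix.
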
      • devdad@programming.dev
        10 months ago

        > Yeah, that error message is left over from an earlier version where I sharded by acronym length instead. At that time, there would always be a file. The problem was the files were getting to be huge at the 50 character length (20MB) and performance went to shit on poor mobile connections. So I refactored to shard by first 4, and files dropped down to a few K each and became a lot snappier.

        I don’t think that’s necessarily true. I just tried a random string, and I got the correct 404 response back, but it doesn’t look like the app handles that case and it just prints that error on any error.

                    .catch(error => {
                        console.error("Error fetching or processing the JSON file:", error);
                        displayError("An error occurred while processing your request. Please try again later.", tableBody);
                    });
        

        Anyway, I wasn’t trying to shit on it. Good job :)

        • charles@poptalk.scrubbles.techOP
          10 months ago

          Yeah I sincerely appreciate the feedback.

          The 404 should just say the same message as “acronym not found”. It just means the first 4 letters didn’t match a file on the backend; I didn’t generate blank JSON files for every 4-letter combination of A-Z.

          It was a really challenging project to process all the data. As with most large datasets, there are tons of pain points. Like 60% of the time spent was parsing out the song name from the janky first line of metadata. Some pain I dealt with over the project off the top of my head:

          • Song titles with special characters (Question…?)
          • Song titles that start with special characters (…Ready For It?)
          • Song titles without capitalization (all of folklore/evermore albums)
          • Inconsistent apostrophes ' vs ’
          • Lyric words that start with special characters
          • The god damn ZWJs littered throughout everything
          • For some reason the Cyrillic “e” is used everywhere, which leads to some dupe lyrics
          • How to treat numbers (I decided the acronym should be the first letter of the number; search for “OTTF”)
          • The source (Genius) littering random strings throughout, like “you might also like…”
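
          A rough sketch of what a cleanup pass for several of these could look like. The exact rules here are assumptions pieced together from the list above (the real pipeline isn't shown), and the small number table only covers single digits; anything bigger would need a proper number-to-words step and is simply skipped.

          ```javascript
          // Hypothetical cleanup pass; rules are assumptions based on the list above.
          const NUMBER_WORDS = {
            "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
            "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
          };

          function lyricToAcronym(line) {
            const cleaned = line
              .replace(/[\u200B-\u200D\uFEFF]/g, "") // zero-width chars, incl. ZWJ (U+200D)
              .replace(/[\u2018\u2019]/g, "'")       // curly apostrophes -> straight
              .replace(/\u0435/g, "e")               // Cyrillic small ie -> Latin e
              .replace(/\u0415/g, "E");              // Cyrillic capital ie -> Latin E
            return cleaned
              .split(/\s+/)
              .map(word => word.replace(/^[^A-Za-z0-9]+/, "")) // drop leading punctuation
              .filter(word => word.length > 0)
              .map(word => {
                const digits = word.match(/^\d+/);
                if (!digits) return word[0];
                // Numbers contribute the first letter of their spelled-out form;
                // numbers missing from the table are skipped in this sketch.
                const spelled = NUMBER_WORDS[digits[0]];
                return spelled ? spelled[0] : "";
              })
              .join("")
              .toUpperCase();
          }
          ```

          With these rules, a line like “1, 2, 3, 4” comes out as OTTF, and a title like “…Ready For It?” comes out as RFI.
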
          • devdad@programming.dev
            10 months ago

            > The 404 should just say the same message as “acronym not found”. It just means the first 4 letters didn’t match a file on the backend; I didn’t enumerate all the blank json for A-Z*4.

            Yeah, but it doesn’t translate to the site. That’s what I’m trying to say :) Your catch above doesn’t distinguish between a 404 and anything else (5xx) and displays “An error occurred while processing your request. Please try again later.” for all eventualities. So, regardless of whether the acronym wasn’t found or there was a genuine server error, the same error message is displayed.
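
            Something along these lines would separate the two cases. It's only a sketch: `loadShard`, the returned object shape, and the injectable `fetchImpl` parameter are made up for illustration and aren't taken from the site's code.

            ```javascript
            // Hypothetical sketch: fetchImpl is injectable so this can be tried without a network.
            async function loadShard(url, fetchImpl = fetch) {
              let response;
              try {
                response = await fetchImpl(url);
              } catch (networkError) {
                // The request never completed (offline, DNS failure, etc.).
                return { status: "error", message: "Could not reach the server. Please try again later." };
              }
              if (response.status === 404) {
                // No shard file for this prefix: a miss, not a failure.
                return { status: "not-found", message: "Acronym not found." };
              }
              if (!response.ok) {
                // Any other non-2xx status (e.g. 5xx) is a genuine error.
                return {
                  status: "error",
                  message: "An error occurred while processing your request. Please try again later.",
                };
              }
              return { status: "ok", records: await response.json() };
            }
            ```

            The key point is that `fetch` only rejects on network failures; a 404 resolves normally, so the status has to be checked explicitly before the response is parsed as JSON.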

            > It was a really challenging project to process all the data. As with most large datasets, there are tons of pain points. Like 60% of the time spent was parsing out the song name from the janky first line of metadata. Some pain I dealt with over the project off the top of my head:

            I honestly have no idea what you had to drag it all out from, but it looks well implemented from the small amount I played around with it. I’ve never used Firebase, but it looks like you got it working so that’s a good job too.

            It’s probably just my old man brain that saw you were doing this all with files and it felt odd. That’s not to say it’s wrong, it’s just different to what I would have done.

            There’s a bunch of advantages to databases, like indexes and partial/fuzzy text matching - but I can certainly understand why you went this route if you needed to keep costs down and didn’t want to bother with any DB maintenance.

            Well done :)

            • charles@poptalk.scrubbles.techOP
              10 months ago

              As a fellow old man, at least relative to this fanbase, I fully understand, and this is exactly the kind of feedback I was hoping to get. Thanks!

              As a devops engineer, sometimes the most efficient server is the one that doesn’t exist; next best is the one that someone else pays for. If the Heroku free tier still existed, I’d consider using that to handle queries server side and aggressively cache them in a CDN.