• 2 Posts
  • 48 Comments
Joined 1 year ago
Cake day: June 20th, 2023



  • Without knowing anything about this model or what it was trained on or how it was trained, it’s impossible to say exactly why it displays this behavior. But there is no “hidden layer” in llama.cpp that allows for “hardcoded”/“built-in” content.

    It is absolutely possible for the model to “override pretty much anything in the system context”. Consider any regular “censored” model, and how any attempt at adding system instructions to change/disable this behavior is mostly ignored. This model is probably doing much the same thing except with a “built-in story” rather than a message that says “As an AI assistant, I am not able to …”.

    As I say, without knowing anything more about what model this is or what the training data looked like, it’s impossible to say exactly why/how it has learned this behavior or even if it’s intentional (this could just be a side-effect of the model being trained on a small selection of specific stories, or perhaps those stories were over-represented in the training data).





  • This sounds like a timing issue to me. The thread bunching up may be due to the hook not grabbing the thread or the take-up lever not taking up the slack at the correct time. If it’s missing stitches in zig-zag mode then that would also be due to either hook timing or possibly needle bar alignment.

    Simple things to check:

    • Make sure that the needle is installed correctly, especially that it is oriented the right way and inserted all the way in

    • Make sure that the take-up lever is threaded correctly

    Assuming these are both correct, you can try the following:

    • If possible, insert a fresh needle (at a minimum, the needle must be undamaged and straight from the shank up to the eye)

    • Remove the plate, leave the machine unthreaded

    • On the straight stitch setting, turn the hand wheel slowly and check that the eye of the needle is exactly level with the hook as they pass each other (this should happen close to the bottom of the needle’s stroke but may not be exactly at the bottom)

    • On the widest zig-zag stitch setting, again turn the hand wheel slowly and check that the eye of the needle passes close to the hook on both sides (it won’t be exact because the needle has moved: it should be just slightly early on one side and just slightly late on the other, not noticeably early or late on either side). Also check that the needle does not collide with any solid parts of the machine on either side

    If the eye and the hook are not aligned as they pass each other, then you have either a timing or a needle height alignment issue. If they pass correctly on the straight stitch but the needle is noticeably early or late on one side of the zig-zag stitch (and fine on the other side) then you have an issue with the horizontal alignment of the zig-zag stitch.


  • That machine is a pretty solid choice if it works, and a worthwhile repair project if it doesn’t (it may have seized up if not maintained recently or it may have timing or alignment issues from age).

    Machines like that are quite solidly built compared to modern machines; I would be surprised if it couldn’t get through a few layers of denim for a few stitches (I wouldn’t recommend sewing 6 layers continuously, but crossing over the side seam should be OK). If you’re concerned, you can always hand crank it for that part.

    The lack of a free arm may be somewhat limiting for hems. The “stupid” solution would be to stand the machine up on top of a crate or similar, as long as the circumference of the leg/other fabric is large enough to fit around the bottom metal “plate” of the machine. (These machines have a metal body designed to be built into a cabinet or shelf top. I’m not sure if yours includes a wooden box around the bottom or if it is just the machine itself, but if there is any wood then the machine can be removed from it, leaving just the metal body of the machine, which may give you more flexibility in this regard.)


  • I haven’t come across any significant discussion surrounding this before and I wouldn’t recommend choosing a machine on this basis.

    A front-loading bobbin is only an advantage for changing mid-task if you catch it before the thread runs out; otherwise you’ll be backtracking and starting again anyway once you’ve replaced it. If there is a viewing window and you can see when it is about to run out, then that is an advantage; otherwise you won’t know when to stop and change it until you notice it has already run out.

    In terms of speed, I doubt you will find any typical sewing machine “too slow” unless you plan to sew a lot and want it finished quickly. For a few repairs or alterations and the occasional custom piece, speed is not a priority; most of the time you will want to go slower anyway for more control/accuracy.

    I think you need to put less thought into which machine you get and more thought into getting some machine and starting to sew, without worrying so much about details like how the bobbin is loaded. As a beginner these things don’t matter, and by the time you are experienced enough for them to matter, you will know which aspects are important to you and whether you want to upgrade. As it is, you can’t really jump to making “expert-level” choices because you don’t yet have the experience to know, for example, whether speed is even a priority for you.




  • I tried getting it to write out a simple melody using MIDI note numbers once. I didn’t think of asking it for LilyPond format; I couldn’t think of a text-based format for music notation at the time.

    It was able to produce a mostly accurate output for a few popular children’s songs. It was also able to “improvise” a short blues riff (mostly keeping to the correct scale, and showing some awareness of/reference to common blues themes), and write an “answer” phrase (which was suitable and made musical sense) to a prompt phrase that I provided.
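    For reference, this is roughly what “a melody as MIDI note numbers” looks like. The melody and the helper function below are my own illustration (not the model’s actual output), using the standard MIDI convention where middle C is note 60:

```python
# Convert MIDI note numbers to scientific pitch names (middle C = 60 = C4).
# The melody below is an illustrative example, not actual model output.

NOTE_NAMES = ["c", "cs", "d", "ds", "e", "f", "fs", "g", "gs", "a", "as", "b"]

def midi_to_name(n):
    """Return a pitch name plus scientific octave number for a MIDI note."""
    name = NOTE_NAMES[n % 12]
    octave = n // 12 - 1  # MIDI convention: note 60 is in octave 4
    return f"{name}{octave}"

# Opening phrase of "Twinkle Twinkle Little Star" as MIDI note numbers:
melody = [60, 60, 67, 67, 69, 69, 67]
print([midi_to_name(n) for n in melody])
# -> ['c4', 'c4', 'g4', 'g4', 'a4', 'a4', 'g4']
```

    A text format like this (or LilyPond, which uses pitch names directly) is what makes it possible for a text-only LLM to “write” music at all.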



  • To be honest, the same could be said of LLaMa/Facebook (which doesn’t particularly claim to be “open”, but I don’t see many people criticising Facebook for doing a potential future marketing “bait and switch” with their LLMs).

    They’re only giving these away for free because they aren’t commercially viable. If anyone actually develops a leading-edge LLM, I doubt they will be giving it away for free regardless of their prior “ethics”.

    And the chance of a leading-edge LLM being developed by someone other than a company with prior plans to market it commercially is quite small, as they wouldn’t attract the same funding to cover the development costs.


  • IMO the availability of the dataset is less important than the model, especially if the model is under a license that allows fairly unrestricted use.

    Datasets aren’t useful to most people, and they carry more risk of a lawsuit, or of being ripped off by a competitor, than the model does. Publishing a dataset containing copyrighted content is legally grey at best, while the jury is still out regarding a model trained on that dataset, and the model also carries some short-term plausible deniability.


  • There are only a few popular LLMs. A few more if you count variations such as “uncensored” versions, etc. Most of the others either don’t perform well or don’t differ much from the more popular ones.

    I would think that the difference is likely down to two reasons:

    • LLMs require more effort in curating the dataset for training. Whereas a Stable Diffusion model can be trained by grabbing a bunch of pictures of a particular subject or style and throwing them in a directory, an LLM requires careful gathering and reformatting of text. If you want an LLM to write dialog for a particular character, for example, you would need to try to find or write a lot of existing dialog for that character, which is generally harder than just searching for images on the internet.

    • LLMs are already more versatile. For example, most of the popular LLMs will already write dialog for a particular character (or at least attempt to) just by being given a description of the character and possibly a short snippet of sample dialog. Fine-tuning doesn’t give any significant performance improvement in that regard. If you want the LLM to write in a specific style, such as Old English, it is usually sufficient to just instruct it to do so and perhaps prime the conversation with a sentence or two written in that style.
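    As a sketch of that second point: “prompting instead of fine-tuning” mostly amounts to assembling the character description and a dialog snippet into the context. The template and names below are hypothetical, just to show the shape; real chat models each have their own preferred prompt format:

```python
def build_character_prompt(name, description, sample_lines, user_message):
    """Assemble a prompt asking a general-purpose LLM to write in-character
    dialog, instead of fine-tuning a model on that character's lines.
    The template here is a hypothetical example."""
    sample = "\n".join(f"{name}: {line}" for line in sample_lines)
    return (
        f"You are roleplaying as {name}. {description}\n"
        f"Example dialog:\n{sample}\n"
        f"User: {user_message}\n"
        f"{name}:"
    )

prompt = build_character_prompt(
    "Greta",
    "A gruff old blacksmith who speaks in short, blunt sentences.",
    ["What d'ye want?", "Steel's not cheap."],
    "Can you repair my sword?",
)
print(prompt)
```

    The trailing `Greta:` line is the usual trick: the model continues from there, so its completion is the character’s next line.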


  • WizardLM 13B (I didn’t notice any significant improvement with the 30B version) tends to be a bit confined to a standard output format at the expense of accuracy (e.g. it will always try to give both sides of an argument, even if there isn’t another side or the question isn’t an argument at all), but it is good for simple questions

  • LLaMa 2 13B (not the chat-tuned version): this one takes some practice with prompting, as it doesn’t really understand conversation and won’t know what it’s supposed to do unless you make it clear from contextual clues, but it feels refreshing to use as the model is (as far as is practical) unbiased/uncensored, so you don’t get all the annoying lectures and stuff




  • I would guess that this is an issue due to the model being a “SuperHOT” model. SuperHOT changes the way that the context is encoded (via RoPE scaling), and if the software that loads the model isn’t configured for it, you will get issues such as repeated output or incoherent rambling with words that are only vaguely related to the topic.

    Unfortunately I haven’t used these models myself, so I don’t have any personal experience here, but hopefully this is a starting point for your searches. Check out the contextsize and ropeconfig parameters; if you are using the wrong context size or scaling factor, you will get incorrect results.

    It might help if you posted a screenshot of your model settings (the screenshot that you posted is of your sampler settings). I’m not sure if you configured this in the GUI or if the only model settings that you have are the command-line ones (which are all defaults and probably not correct for an 8k model).
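    For example, assuming you are launching koboldcpp from the command line (adjust to your actual frontend and model filename — both are placeholders here), a SuperHOT 8k model typically needs something like:

```shell
# Hypothetical koboldcpp invocation for a SuperHOT 8k model.
# --contextsize sets the maximum context length (8192 for an 8k model).
# --ropeconfig takes a linear RoPE scale and a base frequency; SuperHOT 8k
# models were trained with 4x context compression, i.e. a scale of 0.25.
python koboldcpp.py --model mymodel-superhot-8k.ggmlv3.q4_0.bin \
    --contextsize 8192 --ropeconfig 0.25 10000
```

    With the default scale of 1.0 the positions fall outside what the model was trained on, which matches the incoherent-rambling symptom you’re describing.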