Python Security Developer-in-Residence – Weekly Report #2

qwop@programming.dev · 11 months ago

The full changelog for this release is here https://docs.python.org/release/3.11.7/whatsnew/changelog.html#python-3-11-7-final

Surprisingly, it isn’t linked very prominently in the release announcements, but I guess that’s fair since most of the changes will have no effect on 99.9999% of people.

qwop@programming.dev · 1 year ago

Sounds fine, they’re both immutable which helps.

qwop@programming.dev · 1 year ago

UTF-8 is an encoding for Unicode, which means it’s a way of representing a Unicode string as actual bytes on a computer.

It is variable length and works by using the first bits of each byte to indicate how many bytes are needed to represent the current character.

Python also uses an encoding, as you describe in the article, but it’s different to UTF-8. Unlike UTF-8, all characters in Python’s representation of a Unicode string use the same number of bytes, which is the maximum that any individual character in the string needs.

I’d probably mess up a more detailed explanation of UTF-8 or Python’s representation, so I’ll let you look into how they work in more detail if you’re interested.
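As a rough illustration of the difference, here is a minimal sketch. The helper names utf8_len and bytes_per_char are made up for this example, and the width estimate relies on sys.getsizeof behaving as it does in CPython 3.3+ (PEP 393), so treat it as an assumption rather than a guarantee.

```python
import sys


def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 needs for this one character (1 to 4)."""
    return len(ch.encode("utf-8"))


def bytes_per_char(ch: str) -> int:
    """Estimate the fixed per-character width CPython uses for a string of `ch`.

    Compares two freshly built strings of different lengths so the fixed
    object header cancels out. CPython-specific (hypothetical helper).
    """
    return (sys.getsizeof(ch * 20) - sys.getsizeof(ch * 10)) // 10


# "a", e-acute, euro sign, grinning-face emoji (written as escapes)
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]
for ch in samples:
    print(repr(ch), "UTF-8 bytes:", utf8_len(ch), "in-memory width:", bytes_per_char(ch))

# In CPython, a single wide character widens every character in the string.
ascii_only = "a" * 100
with_emoji = "a" * 99 + "\U0001F600"
print(sys.getsizeof(ascii_only), sys.getsizeof(with_emoji))
```

The UTF-8 lengths differ per character, while the in-memory width is one value for the whole string.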

qwop@programming.dev · 1 year ago

The article says that CPython represents strings as UTF-8 encoded, which is not correct. The details about how it works are correct, it’s just that what it describes isn’t UTF-8.

That’s just a minor point though, nice article.

qwop@programming.dev · 1 year ago

It’s described in PEP 585, https://peps.python.org/pep-0585/#parameters-to-generics-are-available-at-runtime
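A quick sketch of what that section means in practice, assuming Python 3.9 or later. The variable names are just for illustration.

```python
from typing import get_args, get_origin

# Parameterising a builtin produces an alias object that remembers its parameters.
alias = list[int]
print(get_origin(alias))  # the unparameterised class
print(get_args(alias))    # the parameters as a tuple

# The same information is exposed as attributes on the alias.
print(alias.__origin__, alias.__args__)

# Calling the alias builds a plain list; the parameter is not kept on the instance.
instance = alias()
print(type(instance), instance)
```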

qwop@programming.dev · 1 year ago

Probably no time soon.

qwop@programming.dev · 1 year ago

Well I kept using it until Infinity died, which was only at the start of this month!

If I do decide to go back, it will be by compiling the Infinity APK with my own API key, but I’m not feeling much of an urge to bother at the moment.

qwop@programming.dev · 1 year ago

It’d be nice to have a rule specifically for the use of f-strings and template formatting in the same call, since that can easily be a security vulnerability.
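To show the kind of thing I mean, here is a minimal sketch. The Account class and the greet_unsafe/greet_safe functions are hypothetical names made up for this example.

```python
class Account:
    """Hypothetical object holding both public and sensitive data."""

    def __init__(self, balance: int, secret_token: str) -> None:
        self.balance = balance
        self.secret_token = secret_token


def greet_unsafe(user_name: str, account: Account) -> str:
    # The f-string inserts user_name first, then .format() parses the
    # result again, so any replacement fields inside user_name get evaluated.
    return f"Hello {user_name}, balance: {{acct.balance}}".format(acct=account)


def greet_safe(user_name: str, account: Account) -> str:
    # One formatting step only: user_name is never treated as a template.
    return f"Hello {user_name}, balance: {account.balance}"


account = Account(balance=100, secret_token="s3cr3t")
malicious_name = "{acct.secret_token}"

print(greet_unsafe(malicious_name, account))  # leaks the token
print(greet_safe(malicious_name, account))    # prints the name literally
```

The same problem can show up with % formatting or logging calls when the format string itself is built from user input.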

qwop@programming.dev · 1 year ago

I’m pretty sure most type checkers recognise both forms.

qwop@programming.dev · 1 year ago

It probably really depends on the project, though I’d probably try and start with the tests that are easiest/nicest to write and those which will be most useful. Look for complex logic that is also quite self-contained.

That will probably help to convince others of the value of tests if they aren’t onboard already.

qwop@programming.dev · 1 year ago

Yeah they’ve put them in a couple of places, it’s pretty bad. Had to work out how to create a custom uBlock Origin rule to block them.

qwop@programming.dev · edit-2 1 year ago

I think calling it just like a database of likely responses is too much of a simplification and downplays what it is capable of.

I also don’t really see why the way it works is relevant to it being “smart” or not. It depends how you define “smart”, but I don’t see any proof of the assumptions people seem to make about the limitations of what an LLM could be capable of (with a larger model, better dataset, better training, etc).

I’m definitely not saying I can tell what LLMs could be capable of, but I think saying “people think ChatGPT is smart but it actually isn’t because <simplification of what an LLM is>” is missing a vital step to make it a valid logical argument.

The argument is relying on incorrect intuition people have. Before seeing ChatGPT I reckon if you’d told people how an LLM worked they wouldn’t have expected it to be able to do the things it can do (for example if you ask it to write a rhyming poem about a niche subject, it wouldn’t have a comparable poem in its dataset).

A better argument would be to pick something that LLMs can’t currently do that it should be able to do if it’s “smart”, and explain the inherent limitation of an LLM which prevents it from doing that. This isn’t something I’ve really seen, I guess because it’s not easy to do. The closest I’ve seen is an explanation of why LLMs are bad at e.g. maths (like adding large numbers), but I’ve still not seen anything to convince me that this is an inherent limitation of LLMs.

qwop@programming.dev · 1 year ago

The things I’m doing are mainly just as a hobby at the moment, so the advice of others may be more relevant to you, but I’ve learned a lot from and really enjoyed just creating a really overkill stack for a simple web app I made.

I’m talking setting up grafana for monitoring, using ansible/terraform, setting up backups, etc etc. Lots of just picking cool software I’ve heard about and trying to stuff it into my use case.

qwop@programming.dev · 1 year ago

If you use poetry add it should only update what is necessary, and you can use poetry lock --no-update to lock without updating everything.

qwop@programming.dev · 1 year ago

And whether PEP 703 is accepted will probably depend quite a lot on the result of this poll: https://discuss.python.org/t/poll-feedback-to-the-sc-on-making-cpython-free-threaded-and-pep-703/28540

qwop@programming.dev · 1 year ago

Yeah, my experience with Docker on Windows has been pretty bad: it uses high CPU and RAM at the best of times, and at worst it completely hangs my computer at 100% CPU usage, with a restart as the only fix.

I really don’t understand why people are overcomplicating this. You can install multiple Python versions at once on Windows and it just works fine (you can use the py command to select the one you want).

Virtual environments are designed exactly for this use case. They’ve got integrations for pretty much everything, they’re easy to delete/recreate, they’re really simple to use, they’re fast, and they just work.

If virtual environments alone aren’t quite enough you can use something like poetry or pipenv or the many other package management options, but in many cases even that is overkill.
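For what it’s worth, virtual environments can even be created from Python itself with the standard library venv module. A minimal sketch, where make_env and the ".venv" directory name are just illustrative choices, and pip is skipped to keep it quick:

```python
import tempfile
import venv
from pathlib import Path


def make_env(parent: Path, name: str = ".venv") -> Path:
    """Create a bare virtual environment under `parent` and return its path."""
    env_dir = parent / name
    # with_pip=False skips bootstrapping pip, which keeps creation fast.
    venv.create(env_dir, with_pip=False)
    return env_dir


# Create the environment in a throwaway directory, then let it be deleted.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = make_env(Path(tmp))
    # Every venv contains a pyvenv.cfg file at its root.
    created = (env_dir / "pyvenv.cfg").is_file()
    print("environment created:", created)
```

Deleting the directory is all it takes to get rid of the environment, which is a big part of why they are so easy to recreate.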

qwop@programming.dev · 1 year ago

Same.

We can defederate at any point, and I think it’s too early to say federating would definitely cause harm to our community. I’d prefer to see how things go, keeping our hands close to the big red “defederate” button.

qwop@programming.dev · 1 year ago

I don’t see why you need a singleton, just use a global variable if you really need one. A singleton has all the same downsides but just hides them by not looking like a global.
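A small sketch of the comparison, with Config and SingletonConfig as made-up names for illustration. Both end up as one shared, mutable object that any code can reach:

```python
class Config:
    """Ordinary class; nothing stops tests from making their own instance."""

    def __init__(self) -> None:
        self.settings: dict[str, bool] = {}


# Plain global: obviously shared state, and easy to replace in tests.
config = Config()


class SingletonConfig:
    """Classic singleton: every call returns the same hidden instance."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.settings = {}
        return cls._instance


first = SingletonConfig()
first.settings["debug"] = True

# Looks like a fresh object, but it is the same shared state as `first`.
second = SingletonConfig()
print(second is first, second.settings)

# With the plain class, a separate instance really is separate.
isolated = Config()
print(isolated is config, isolated.settings)
```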

https://nedbatchelder.com/blog/202204/singleton_is_a_bad_idea.html

qwop@programming.dev · 1 year ago

Python Security Developer-in-Residence – Weekly Report #2

qwop@programming.dev · 1 year ago

Thanks for the info on crossposting! I thought I’d seen someone mention a cross posting feature but couldn’t see any button to do it. I’m using the Jerboa app on Android which I guess doesn’t have that button, but I see it on the website now as you say.

It’s also good to know that linking to the original URL is generally better and the rest can be handled by the UI - that does seem nicer.

qwop@programming.dev · 1 year ago

Great TIL, I hate it.

Excellent how the page alludes to other horrible things to imagine, like “don’t pour hot oil into your ear”, and “don’t pour it in if there’s a hole in your eardrum”