From Wikipedia: this is only a 1-sigma result compared to theory using lattice calculations. It would have been 5.1-sigma if the calculation method had not been improved.
Many calculations in the standard model are mathematically intractable with current methods, so improving approximate solutions is non-trivial, and it’s not surprising that improvements have been found.
This seems like more of an achievement for the Barbie brand than for the individual director.
Apparently Inflection AI have bought 22,000 H100 GPUs. The H100 has approximately 4x the compute for transformers as the A100. GPT-4 is rumored to be 10x larger than GPT-3, and GPT-3 takes approximately 34 days to train on 1,024 A100 GPUs.
So with 22,000*4/1024 = 85.9375x more compute than that GPT-3 run, they could easily train a model 10x GPT-4’s rumored size (i.e. ~100x GPT-3) in 1-2 months. Getting to 100x GPT-4’s size would be feasible too, but likely they’re banking on the claimed 3x speedup from FlashAttention-2, which would result in about 6 months of training.
It’s crazy that these scales and timelines seem plausible.
This is an essay about the Barbie brand and its relationship to feminism and capitalism through history and the modern day. The Barbie movie is discussed but it’s not the primary focus.
NGC 1277 is unusual among galaxies because it has had little interaction with other surrounding galaxies.
I wonder if interactions between galaxies somehow convert regular matter to dark matter.
Claude 2 would have a much better chance at this because of the longer context window.
There are plenty of alternate/theorised/critiqued endings for Game of Thrones online, though, so current chatbots should have a better shot at doing a good job than writers who haven’t finished their series in over a decade.
This looks amazing, if true. The paper is claiming state of the art across literally every metric. Even in their ablation study the model outperforms all others.
I’m a bit suspicious that they don’t extend their perplexity numbers to the 13B model, or provide its hyperparameters, even though they reference it in the text and in their scaling table.
Code will be released in a week https://github.com/microsoft/unilm/tree/master/retnet
Why do you say they have no representation? There are a lot of specific bodies operating in the government, advisory and otherwise, with the sole focus of indigenous affairs. And of course, indigenous Australians are currently over-represented among parliamentarians (more than 4% of parliamentarians are of indigenous descent).
While in general I’d agree, look at the damage a single false paper on vaccination did. There were a lot of follow-up studies showing that the paper was wrong, and yet we still have an antivax movement going on.
Clearly, scientists need to be able to publish without fear of reprisal. But to have no recourse when damage is done by a person acting in bad faith is also a problem.
Though I’d argue we have the same issue with the media, where they need to be able to operate freely, but are able to cause a lot of harm.
Perhaps there could be some set of rules which absolve scientists of legal liability. Hopefully those rules would be what’s ordinarily followed anyway, and thus be no burden to your average researcher.
See this comment on another thread about this for some more details.
Taking 89.3% men from your source at face value, and selecting 12 people at random, that gives a 0.893^12 ≈ 25.7% chance (about 1 in 4) that a company of that size would be all male.
Add in network effects, risk tolerance for startups, and the hiring practices of larger companies, and that number likely gets even larger.
What’s the p-value for a news story? Unless this is some trend from other companies run by Musk, there doesn’t seem to be anything newsworthy here.
So, taking the average bicep volume as 1,000 cm³, this muscle could exert 1 tonne of force, contract 8% (1.6 cm for a 20 cm long bicep), would require 400 kV, and must be kept above 29 degrees Celsius.
Maybe someone with access to the paper can double check the math and get the conversion efficiency from electrical to mechanical.
I expect there’s a good trade-off to be made to lower the force but increase the contraction and lower the voltage. Possibly some kind of ratcheting mechanism with tiny cells could be used to overcome the crazy high voltage requirement.
DALL-E was the first development which shocked me. AlphaGo was very impressive on a technical level, and much earlier than anticipated, but it didn’t feel different.
GANs existed, but they never seemed to have the creativity, nor understanding of prompts, which was demonstrated by DALL-E. Of all things, the image of an avocado-themed chair is still baked into my mind. I remember being gobsmacked by the imagery, and when I’d recovered from that, just how “simple” the step from what we had before to DALL-E was.
The other thing which surprised me was the step from image diffusion models to 3D and video. We certainly haven’t gotten anywhere near the same quality in those domains yet, but they felt so far removed from the image domain that I assumed we’d need some major revolution in the way we approached the problem. What surprised me most was just how fast the transition from images to video happened.
If this is a real question, this talk explains the fundamental concepts of atomic operations in a couple of minutes.
The talk itself is over an hour long, as the use of atomic operations has a large number of pitfalls. The joke in the meme leans on a specific type of memory ordering guarantee, known as “relaxed” in C++ parlance, which can be a lot faster, but which is much more likely to violate the default assumptions a programmer may make about order of operations and visibility across threads.
I asked GPT-3.5 the same question and got the response “The former chancellor of Germany has the book.” And also: “The nurse has the book. In the scenario you described, the nurse is the one who grabs the book and gives it to the former chancellor of Germany.” And a bunch of other variations.
Anyone doing these experiments who does not understand the concept of a “temperature” parameter for the model, and who is not controlling for that, is giving bad information.
Either you can say: At 0 temperature, the model outputs XYZ. Or, you can say that at a certain temperature value, the model’s outputs follow some distribution (much harder to do).
Yes, there’s a statistical bias in the training data that “nurses” are female. And at high temperatures, this prior is over-represented. I guess that’s useful to know for people just blindly using the free chat tool from openAI. But it doesn’t necessarily represent a problem with the model itself. And to say it “fails entirely” is just completely wrong.
The greatest fix for all your pointer issues is to use references.
Thanks for the advice, but I actually love code reviews.
I wonder what specifically they’re interested in vs long deployments in Antarctica (people do 12-month rotations at some stations there).
I found this article discussing the psychology of placements in Australian antarctic stations: https://psychology.org.au/for-members/publications/inpsych/2021/february-march-issue-1/life-in-the-australian-antarctic-program.
The differences as I see them are:
Looks like the same guys were doing publicity around 2019 https://www.abc.net.au/news/rural/2019-07-30/australia-joins-lab-grown-meat-industry/11360506
At the time, they claimed the cost to make a single hamburger was $30-$40, and now 4 years later, they claim to have gotten it down to $5-$6 per patty.
The article claims the first demonstration of a lab-grown hamburger was in 2013.
So: 6 years from proof of concept to (probably) first capital raise, then 4 years to the start of regulatory approval, and 1 year for approval itself (the target is March next year).
That reminds me of a joke.
A museum guide is talking to a group about the dinosaur fossils on exhibit.
“This one,” he says, “is 6 million and 2 years old.”
“Wow,” says a patron. “How do you know the age so accurately?”
“Well,” says the guide. “It was 6 million years old when I started here 2 years ago.”