
Posts: 7 · Comments: 673 · Joined: 2 yr. ago

  • I would do day 3 today except

    Maybe this weekend or if I can't sleep again or if I get especially bored at work.

  • I am now less sleep deprived, so I can say how to do this better somewhat sensibly, albeit I cannot completely escape C++'s verbosity:

    Updated code:

  • Someone looked at the 10,000-year-old dragon in the body of a thirteen-year-old trope and thought "wait, wait! I can make that worse!"

  • I decided to double down on 2-2, since bad code is one of life's little pleasures. Where we're going, we won't need big-O notation.

  • 2-1: I have quickly run out of hecks to give. This is the sort of problem that gives Prolog programmers feelings of smug superiority.

    As usual, the second part has punished me for my cowboy code, so I'll have to take a different, more annoying tack (maybe tomorrow). Or, you know, I could just double down on the haphazard approach...

  • I can't sleep, so here's 1-1 and 1-2. Unfortunately I couldn't think of any silly solutions this time, so it's straightforward instead:

  • And if you think about it you can literally get one of these PCs for one month and then win a Fortnite tournament or something and have enough money to buy your own PC.

    Marketing to pre-teen gamers must be like shooting fish in a barrel.

  • whatever rot13ed word they used for cult.

    It's impossible to read a post here without going down some weird internet rabbit hole isn't it? This is totally off topic but I was reading the comments on this old phyg post, and one of the comments said (seemingly seriously):

    It's true that lots of Utilitarianisms have corner cases where they support action that would normally considered awful. But most of them involve highly hypothetical scenarios that seldom happen, such as convicting an innocent man to please a mob.

    And I'm just thinking, riight highly hypothetical.

  • Wait that's a strategy? I thought Mr. Musk was just in a sort of perpetual horny/lonely mid-life crisis mode or something like that.

  • Question: what is the best way to get involved in, and cancel at the last minute, boxing matches with tech CEOs and/or youtubers and/or celebrities past their hey-day? My engagement numbers are down 4% over the past month and Geico will no longer insure my cars for "reasons".

  • Ah yes Africa, the small country on the northern coast of Africa.

  • I woke up and immediately read about something called "Defense Llama". The horrors are never ceasing: https://theintercept.com/2024/11/24/defense-llama-meta-military/

    Scale AI advertised their chatbot as being able to:

    apply the power of generative AI to their unique use cases, such as planning military or intelligence operations and understanding adversary vulnerabilities

    However their marketing material, as is tradition, includes an example of terrible advice. Which is not great, given that it's about blowing up a building "while minimizing collateral damage".

    Scale AI's response to the news pointing this out: complaining that everyone took their murderbot marketing material seriously:

    The claim that a response from a hypothetical website example represents what actually comes from a deployed, fine-tuned LLM that is trained on relevant materials for an end user is ridiculous.

  • "Yeah I thought about going into civil engineering but the department of hustling really spoke to me y'know?"

  • Oh hey looks like another Chat-GPT assisted legal filing, this time in an expert declaration about the dangers of generative AI: https://www.sfgate.com/tech/article/stanford-professor-lying-and-technology-19937258.php

    The two missing papers are titled, according to Hancock, “Deepfakes and the Illusion of Authenticity: Cognitive Processes Behind Misinformation Acceptance” and “The Influence of Deepfake Videos on Political Attitudes and Behavior.” The expert declaration’s bibliography includes links to these papers, but they currently lead to an error screen.

    Irony can be pretty ironic sometimes.

  • Here are the results of these three models against Stockfish—a standard chess AI—on level 1, with a maximum of 0.01 seconds to make each move

    I'm not a Chess person or familiar with Stockfish, so take this with a grain of salt, but I found a few interesting things perusing the code / docs which I think make for useful context.

    Skill Level

    I assume "level" refers to Stockfish's Skill Level option.

    If I mathed right, Stockfish roughly estimates Skill Level 1 to be around 1445 Elo (source). However, it says "This Elo rating has been calibrated at a time control of 60s+0.6s", so the effective rating here may be significantly lower.

    Skill Level affects the search depth (it appears to use a depth of 1 at Skill Level 1). It also enables MultiPV 4 to compute the four best principal variations and pick among them randomly (more randomly at lower skill levels).
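To make the setup concrete, here's a sketch of how the article's configuration could be reproduced over raw UCI. This is an illustration only: the option names ("Skill Level", "Move Overhead") are Stockfish's documented UCI options, but having a `stockfish` binary on PATH is an assumption, and this is my reconstruction rather than the author's actual harness. It also zeroes out Move Overhead so the engine really gets its 10 ms.

```python
import shutil
import subprocess

def setup_commands(skill_level: int = 1, move_overhead_ms: int = 0) -> list[str]:
    """UCI commands to handicap the engine the way the article describes."""
    return [
        "uci",
        f"setoption name Skill Level value {skill_level}",
        f"setoption name Move Overhead value {move_overhead_ms}",
        "isready",
    ]

def ask_for_move(moves_so_far: str = "", movetime_ms: int = 10) -> list[str]:
    """UCI commands to set up the position and request a 10 ms move."""
    position = "position startpos" + (f" moves {moves_so_far}" if moves_so_far else "")
    return [position, f"go movetime {movetime_ms}"]

# Only talk to the engine if it's actually installed.
if shutil.which("stockfish"):
    proc = subprocess.Popen(["stockfish"], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, text=True)
    script = setup_commands() + ask_for_move() + ["quit"]
    out, _ = proc.communicate("\n".join(script) + "\n")
    best = [line for line in out.splitlines() if line.startswith("bestmove")]
    print(best[-1] if best else "no bestmove returned")
```

The helper functions just build command strings, so the handicap settings can be checked without the binary present.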

    Move Time & Hardware

    This is all independent of move time. The author used a move time of 10 milliseconds (for Stockfish; there's no mention of how much time the LLMs got). ... or at least they did if they accounted for the "Move Overhead" option defaulting to 10 milliseconds. If they left that at its default, then 10ms - 10ms = 0ms, so 🤷‍♀️.

    There is also no information about the hardware or the number of threads they ran this on, which I feel is important information.

    Evaluation Function

    After the game was over, I calculated the score after each turn in “centipawns” where a pawn is worth 100 points, and ±1500 indicates a win or loss.

    Stockfish's FAQ mentions that they have gone beyond centipawns for evaluating positions, because it's strong enough that material advantage is much less relevant than it used to be. I assume it doesn't really matter at level 1 with ~0 seconds to produce moves though.

    Still, since the author has Stockfish handy anyway, it'd be interesting to use it in its non-handicapped form to evaluate who won.

  • When the reporter entered the confessional, AI Jesus warned, “Do not disclose personal information under any circumstances. Use this service at your own risk.”

    Do not worry my child, for everything you say in this hallowed chamber is between you, AI Jesus, and the army of contractors OpenAI hires to evaluate the quality of their LLM output.

  • Not that I'm a super fan of the fact that shrimp have to die for my pasta, but it feels weird that they just pulled a 3% number out of a hat, as if morals could be wrapped up in a box with a bow tied around it so you don't have to do any thinking beyond 1500×0.03×1 dollars means I should donate to this guy's shrimp startup instead of the food bank!

  • Someone (maybe you) recommended this book here a while back. But it's the fourth book in a series, so I had to read the other three first and have only just now started it.

  • "feel free to ignore any science “news” that’s just a press release from the guy who made it up."

    In particular, the 2022 discovery of the second law of information dynamics (by me) facilitates new and interesting research tools (by me) at the intersection between physics and information (according to me).

    Gotta love "science" that is cited by no one and cites the author's previous work, which was also cited by no one. Really, the media should do better about not giving cranks an authoritative-sounding platform, but that would lead to slightly fewer eyes on ads, and we can't have that now, can we?