• 0 Posts
  • 189 Comments
Joined 3 years ago
Cake day: June 12th, 2023

  • If we’re going to pull up other people’s pithy phrases that aren’t intended to be taken entirely literally, then the relevant one here is “machine learning is the second best solution to any problem”. In the (approximately, depending on how you define it) century that people have been thinking about computers, we’ve already found better solutions to lots of problems. If a transformer-based neural network can get 99% accuracy in sixty seconds on 92 billion transistors of GPU and billions more for its VRAM, that’s pretty useless if we can also do it with 100% accuracy in sixty microseconds on a $1 microcontroller, or even faster on a less constrained device.

    The “attention is all you need” phrase is specifically in the context of sequence transduction models, and specifically refers to the discovery that they don’t need a combination of attention, recurrence and convolution, but actually only need attention if it’s used in the novel way introduced by the paper. If you don’t need to transduce any sequences, then this isn’t necessarily relevant, and, critically, it’s not a claim that you can do everything by transducing sequences. It was a surprise that applying it to generating new text instead of just converting it worked as well as it did, a surprise that it kept getting better with larger models instead of plateauing around the GPT-1 and GPT-2 era, and a surprise that the text generation could be used to do other things, even ones as basic as addition. These things weren’t predicted by the Attention Is All You Need paper.


  • It’s not damn near anything. There’s loads of stuff that computers can do much more quickly and more accurately without it just by virtue of computers already being fast and effective at maths and obeying logic. With or without the transformer architecture, a neural network is never going to be as fast or reliable at, for example, summing a collection of numbers as just adding them would be, and loads of real-world tasks are like this, hence why we’ve built billions of computers even before the transformer architecture was invented.

    Also, in particular, I didn’t say that the transformer architecture wasn’t useful for things that aren’t LLMs, I said that most of the work done specifically to improve LLMs has no applications outside LLMs, so the next big leap towards making computers intelligent isn’t helped more by working on LLMs than it would be by working on any other kind of AI.


  • It’s actually just a lot of pretty simple maths from decades ago, but it’s a lot of it. The big changes in those decades have been the feasibility of doing enough of that simple maths to achieve anything useful, and domain-specific network architecture stuff that’s rarely transferable, e.g. LLMs are possible because of the invention of the transformer architecture in 2017, and that’s also turned out to be useful for a few things like image generation and protein folding simulation, but not for all neural network based techniques, and then most of the things that have made successive LLMs better haven’t also been useful for the few other transformer-architecture-based neural networks. Most not-LLM AI isn’t going to be meaningfully easier to create than it would have been had the world got bored after GPT-2 and we’d only focussed on doing image and video generation.


  • Fun fact: if you’re self-employed in the UK, you’ve been able to do your taxes online via a simple (ish) form on the HMRC website, but they’re in the process of phasing it out in favour of third-party proprietary software, so some people aren’t allowed to use the free web form anymore, and within two or three years, it’ll become totally unavailable. Everyone always loves it when the government finds things that Americans complain about and copies them here so we get to complain about them, too.


  • Lots of shops have gone out of business because, by having a premises, their operating expenses were higher than an online retailer’s, so places like Amazon could undercut them, and their customers were willing to wait for delivery to save some money. It doesn’t take that many customers leaving before you’ve got to put up prices to cover your overheads, and that just makes more customers leave, and after a couple of decades of online retail being common, you’re left with far fewer physical shops.









  • To refer back to the original post, you are taking things too literally, and in doing so, missing meaning that is present in the symbols. As a rough analogy, DXT1 GPU texture compression has two modes. Both start by storing two colours, then they diverge. They both store a number from zero to three per pixel, but in one mode, zero to three all mean interpolating between the two endpoint colours, and in the other mode, zero to two are for interpolation, and three means that the pixel is transparent. There’s no bit explicitly storing which mode’s being used, but the information is there. The two stored colours should also be interpreted as two numbers, and if the higher one is first, then you use the first mode, and otherwise (the higher one is second, or they’re equal) you use the second mode. If the colours were interpreted too literally, they’d only be seen as colours, but an implementation can see that there was a choice to put the colours in a particular order, and read into that. There’s no ambiguity, people just need to know about the rule and apply it.
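
    The mode rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference decoder: the function names (`rgb565_to_rgb888`, `decode_dxt1_block`) are made up for this example, and the integer rounding of the interpolated colours is an assumption, as real hardware may round slightly differently.

```python
import struct


def rgb565_to_rgb888(c):
    """Expand a 16-bit 5:6:5 colour to 8 bits per channel by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_dxt1_block(block):
    """Decode one 8-byte DXT1 block into 16 (r, g, b, a) tuples, row by row."""
    if len(block) != 8:
        raise ValueError("a DXT1 block is exactly 8 bytes")

    # Two little-endian 16-bit endpoint colours, then 16 two-bit indices.
    c0, c1, indices = struct.unpack("<HHI", block)
    r0, g0, b0 = rgb565_to_rgb888(c0)
    r1, g1, b1 = rgb565_to_rgb888(c1)

    if c0 > c1:
        # Higher number first: all four indices are opaque colours.
        palette = [
            (r0, g0, b0, 255),
            (r1, g1, b1, 255),
            ((2 * r0 + r1) // 3, (2 * g0 + g1) // 3, (2 * b0 + b1) // 3, 255),
            ((r0 + 2 * r1) // 3, (g0 + 2 * g1) // 3, (b0 + 2 * b1) // 3, 255),
        ]
    else:
        # Higher number second (or equal): index 3 means transparent.
        palette = [
            (r0, g0, b0, 255),
            (r1, g1, b1, 255),
            ((r0 + r1) // 2, (g0 + g1) // 2, (b0 + b1) // 2, 255),
            (0, 0, 0, 0),
        ]

    # Pixel i uses bits 2i and 2i+1 of the index word.
    return [palette[(indices >> (2 * i)) & 0b11] for i in range(16)]
```

    Feeding it the same two colours in opposite orders, with every pixel set to index three, gives an opaque blend in one case and a transparent pixel in the other, even though no bit in the block names the mode.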

    For communicating with the public, there are enough people that are barely literate that asking the simplest version of a question is going to cover more of the population than one that adds all the necessary qualification to ensure someone that takes everything literally knows it’s a hypothetical.


    • it wasn’t my argument
    • the question has an implicit “in a hypothetical scenario where you were having a conversation where it would be relevant” aspect that most people would recognise even though the words don’t literally include it, and if you did literally want to ask them whether they’d start such a conversation out of the blue, you’d have to add extra words to say so. The literal interpretation would be an absurd thing to ask about, and people subconsciously recognise that, so don’t consider it.

  • Neurotypical people wouldn’t feel that there was any guesswork, as all the context and details are already covered by the words in the sentence, the situation the sentence is being said in, or the subtext of that sentence being the one they chose to say. You wouldn’t be disambiguating anything, just redundantly restating things.


  • That’s a proprietary software problem rather than a being-connected-to-the-internet problem. One of the send-a-notification-when-it’s-done devices I set up took about as much effort as setting the right time on a phone alarm about ten times because the device’s firmware was open source with no companies’ bullshit involved, so all I had to do was navigate to the right page in Home Assistant and pick the right phone from a dropdown and the right event for the notification to trigger on from a dropdown. That’s not wildly different from picking the right time from a dropdown on a phone.


  • AnyOldName3@lemmy.world to Lemmy Shitpost@lemmy.world: Future
    1 month ago

    Again, that’s specific to it being proprietary software. I’ve got some devices in my home that are connected to the local network (but not the internet), and have configured Home Assistant (which I’ve got running on an old desktop PC) to send a notification to my phone when it detects that those devices report that they’re finished with what they do. That’ll keep working until I turn off the Home Assistant server or replace the devices.


  • AnyOldName3@lemmy.world to Lemmy Shitpost@lemmy.world: Future
    1 month ago

    That’s more effort per wash instead of being something that only needs setting up once and then will work forever. Also, it’s common for post-90s appliances to include sensors and vary the cycle time based on how dirty the water gets. Except for the data privacy and security concerns, which are mainly because it’s proprietary software rather than inherent in Internet-connected devices, there’s no advantage to using your phone timer over getting a notification.



  • Artemis also has the premise that stripping away all the safety regulation that a rich country would add to its space program would make a poorer country able to rapidly develop a superior space program and become a rich country with nothing at all going wrong except the one time

    spoiler

    the protagonist accidentally chloroforms everyone

    when it all works out fine in the end anyway because of ignoring the few rules that they did have. It’s not a stretch to say that it promotes elements of Objectivism, although it’s a lot more pro-state than Ayn Rand was.