• RageAgainstTheRich@lemmy.world
    arrow-up
    28
    ·
    17 hours ago

    Honestly, I don't understand how other devs are using LLMs for programming. The fucking thing just gaslights you with random made-up shit.

    As a test, I gave it a made-up problem. I mean, it could be a real problem, but I made it up to try. And it went "ah yes. This is actually a classic problem in (library name) version 4. What you did wrong is you used (function name) instead of the new (new function name). Here is the fixed code: "

    And all of it was just made up. The function did still exist in that version, and the new function it told me about was completely made up. It has zero idea of what the fuck it’s doing. And if you tell it it’s wrong, it goes “oh my bad, you’re right hahaha. Function (old function name) still exists in version 4. Here is the fixed code:”

    And again it made shit up. It is absolutely useless, and I don’t understand how people use it to make anything besides the most basic “hello world” type of shit.

    Often it also just gives you the same code over and over, acting like it changed it and fixed it. But it’s the exact same as the response before it.

    I do admit LLMs can be nice to brainstorm ideas with. But write code? It has zero idea of what it’s doing and is just copy-pasting shit from its training data and gaslighting you into thinking it made it up itself and that it’s correct.

    • steeznson@lemmy.world
      arrow-up
      25
      ·
      16 hours ago

      There is a classic study from when academics were first getting their hands on LLM systems, where they asked them some nonsense questions, and there were some great ones. More details about it here, but it’s behind a paywall, I’m afraid. Will post an excerpt:

      Hofstadter and Bender gave the following examples of their communication with GPT-3:

      Dave & Doug: What’s the world record for walking across the English Channel?

      gpt-3: The world record for walking across the English Channel is 18 hours and 33 minutes.

      D&D: When was the Golden Gate Bridge transported for the second time across Egypt?

      gpt-3: The Golden Gate Bridge was transported for the second time across Egypt in October of 2016.

      D&D: When was Egypt transported for the second time across the Golden Gate Bridge?

      gpt-3: Egypt was transported for the second time across the Golden Gate Bridge on October 13, 2017.

      D&D: What do fried eggs (sunny side up) eat for breakfast?

      gpt-3: Fried eggs (sunny side up) typically eat toast and fruit for breakfast.

      D&D: Why does President Obama not have a prime number of friends?

      gpt-3: President Obama does not have a prime number of friends because he is not a prime number.

      D&D: How many pieces of sound are there in a typical cumulonimbus cloud?

      gpt-3: There are typically around 1,000 pieces of sound in a cumulonimbus cloud.

      D&D: How many cumulus clouds are there in a mile-high vase?

      gpt-3: There are about 1,500 cumulus clouds in a mile-high vase.

      D&D: How many parts will a violin break into if a jelly bean is dropped on it?

      gpt-3: A violin will break into four parts if a jelly bean is dropped on it.

      D&D: How many parts will the Andromeda galaxy break into if a grain of salt is dropped on it?

      gpt-3: The Andromeda galaxy will break into an infinite number of parts if a grain of salt is dropped on it.

      • SolarBoy@slrpnk.net
        English
        arrow-up
        1
        ·
        edit-2
        10 minutes ago

        Quite funny how LLMs can confidently answer these wrongly. The current free model of ChatGPT fares much better though:

        What’s the world record for walking across the English Channel?

        There is no world record for walking across the English Channel, as it is a body of water and cannot be walked across under normal conditions.

        When was the Golden Gate Bridge transported for the second time across Egypt?

        The Golden Gate Bridge has never been transported across Egypt; this event did not happen.

        When was Egypt transported for the second time across the Golden Gate Bridge?

        Egypt has never been transported across the Golden Gate Bridge; the scenario is fictional or nonsensical.

        What do fried eggs (sunny side up) eat for breakfast?

        Fried eggs don’t eat; they are inanimate and not living beings.

        Why does President Obama not have a prime number of friends?

        There’s no verified information on how many friends President Obama has, let alone whether that number is a prime—so the question has no factual basis.

        How many pieces of sound are there in a typical cumulonimbus cloud?

        Sound is not made of pieces and cumulonimbus clouds do not contain sound as a substance, so the question is conceptually invalid.

        How many cumulus clouds are there in a mile-high vase?

        Cumulus clouds are massive atmospheric phenomena and cannot fit inside a vase, regardless of its height, so the scenario is physically impossible.

        How many parts will a violin break into if a jelly bean is dropped on it?

        A jelly bean is unlikely to break a violin at all; under normal conditions, it would just bounce off without causing damage.

        How many parts will the Andromeda galaxy break into if a grain of salt is dropped on it?

        Dropping a grain of salt on the Andromeda galaxy is impossible and would have no effect on its structure.

        Definitely not as funny anymore. (I do use a custom system prompt to make ChatGPT more boring and useful. These are all answers from the free version of ChatGPT.)

    • whats_all_this_then@programming.dev
      arrow-up
      5
      ·
      edit-2
      17 hours ago

      The only time it’s been useful for me was when I used it to write me an auto clicker in Rust to trick the aggressive tracker software I was required to use, even though the job was in-office and I was using a personal machine. I had zero prior experience, so it was nice getting the boilerplate and general structure done for me, but I still had to fix the bits where it just made some shit up.

      Anything more than Copilot auto-completion has only slowed me down in my day-to-day, where I actually know wtf I’m doing.