• Dr. Moose@lemmy.world
    English
    arrow-up
    37
    ·
    7 months ago

    It doesn’t. No network is capable of that and if they say they are you’re being lied to.

    • LemmyQuest@lemm.eeOP
      arrow-up
      6
      ·
      7 months ago

      Reddit does have a system to fight it.

      Capable or not, a bad solution is better than no solution.

      • Dr. Moose@lemmy.world
        English
        arrow-up
        33
        ·
        edit-2
        7 months ago

        No, it’s not better. It’s extremely invasive, as you have to fingerprint users and store their fingerprints on your servers indefinitely. Not only that, but all of this can be avoided by anyone with half a brain cell. Lemmy should not waste its resources on something like this; it’s extremely hard to do, to the point where literally nobody has a good system, even giants like LinkedIn. Source: I work in bot detection.

        Lemmy would never get this right no matter how many people contributed, and it would just cause overall harm to the platform through privacy invasion and false positives.
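
        To make "fingerprint and store" concrete, here is a minimal sketch, assuming a fingerprint is nothing more than a hash over a few client attributes. The attribute names and the `fingerprint` function are hypothetical, not something Lemmy or Reddit is known to ship. It illustrates both complaints: the server has to keep the hash to compare against later, and changing a single attribute yields a different fingerprint.

```python
import hashlib
import json


def fingerprint(attrs: dict[str, str]) -> str:
    """Hash a dict of client attributes into a stable hex digest.

    Hypothetical example: real systems combine many more signals.
    """
    # Sort keys so the same attributes always serialize the same way.
    canonical = json.dumps(attrs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


banned_user = {
    "user_agent": "ExampleBrowser/1.0",
    "accept_language": "en-US",
    "screen": "1920x1080",
    "timezone": "UTC",
}
# The server has to keep this value around to recognise the user later.
stored_fingerprints = {fingerprint(banned_user)}

# Same attributes in a different order: still matches.
same_user = dict(reversed(list(banned_user.items())))
# One attribute changed (e.g. a different browser): no longer matches.
evader = {**banned_user, "user_agent": "OtherBrowser/2.0"}

print(fingerprint(same_user) in stored_fingerprints)  # True
print(fingerprint(evader) in stored_fingerprints)     # False
```

        Fuzzy matching would catch the second case more often, but that is also where the false positives come from.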

        • LWD@lemm.ee
          arrow-up
          7
          ·
          7 months ago

          Lemmy has quite a few unfortunately invasive qualities of its own, including generally needing an email address from you (Reddit does not), having poor privacy and data retention practices, and generally being very messy with who gets to decide what happens with your data and how easily it can be scraped.

          Sure, Reddit sells it… But Lemmy gives it to any web scraper for free.

          • Dr. Moose@lemmy.world
            English
            arrow-up
            15
            ·
            7 months ago

            But Lemmy gives it to any web scraper for free

            Which is good. You either have an open system or a closed one. There’s no in-between.

            If you want the advantages of a public, free, decentralized network, you can’t obfuscate and centralize bits and pieces of it. Also, it’s 2024; we need to stop this misinformation that an email address is supposed to be private. What is private is the association between an email address and its owner, and Lemmy doesn’t leak or infringe on that. The address is literally called an address because it’s supposed to be public.

            • LWD@lemm.ee
              arrow-up
              2
              ·
              edit-2
              7 months ago

              …And attitudes like this towards privacy will keep Lemmy from progressing to a point where those issues will be fixed.

              I have a fundamental problem with giant corporations scraping user data without user consent. That’s a system-level issue. It doesn’t become “good” just because they get to scrape without consent for free.

              • Dr. Moose@lemmy.world
                English
                arrow-up
                2
                ·
                7 months ago

                Nah, it has nothing to do with attitude but with practicality. This would mean people’s fingerprints need to be public and shared between servers, or some other hack. It’s just not possible to do safely, and it’s not really a hill worth dying on. Do we really care about users dodging subreddit bans that much? It’s silly.

                • LWD@lemm.ee
                  arrow-up
                  1
                  ·
                  7 months ago

                  I have a few suggestions for development concerns off the top of my head:

                  • Scrub post metadata* after users request its deletion
                  • Auto-purge deleted content* rather than letting it sit behind a “deleted” flag (something Facebook got a ton of flak for doing)
                  • Auto-purge deleted media*
                  • Consider seriously limiting opening data wide for scraping, since the problem is non-consensual scraping, not payment for non-consensual scraping

                  * either immediately or, to prevent spam, after some time
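
                  The "purge after some time" idea from the footnote could look roughly like this. A minimal sketch, assuming deleted posts carry a deletion timestamp; the `Post` type, its fields, and `purge_deleted` are hypothetical and not Lemmy’s actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Post:
    id: int
    body: str
    deleted_at: Optional[datetime] = None  # None means the post is live


def purge_deleted(posts: list[Post], now: datetime, grace: timedelta) -> list[Post]:
    """Return the posts to keep; deleted posts older than `grace` are dropped."""
    kept = []
    for post in posts:
        expired = post.deleted_at is not None and now - post.deleted_at >= grace
        if not expired:
            kept.append(post)
    return kept


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
posts = [
    Post(1, "still live"),
    Post(2, "deleted yesterday", deleted_at=now - timedelta(days=1)),
    Post(3, "deleted a month ago", deleted_at=now - timedelta(days=30)),
]
remaining = purge_deleted(posts, now, grace=timedelta(days=7))
print([p.id for p in remaining])  # [1, 2]
```

                  A grace period of zero gives the "immediately" variant; a longer one leaves moderators time to look at spam before it disappears.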

                  • can@sh.itjust.works
                    arrow-up
                    1
                    ·
                    7 months ago

                    I agree with your first few points, but I’m unsure about the scraping. This is a public forum; what could be done to mitigate scraping that wouldn’t take away from that?

          • andrew_bidlaw@sh.itjust.works
            arrow-up
            8
            ·
            7 months ago

            Instances can enable or disable the email verification and other measures, like asking why you want to join that instance.

            I don’t recall Reddit being so liberal. I haven’t used an account I didn’t verify the same day, so I can’t say if it works, but I suspect they can enable different protocols for inspecting unverified accounts.

            As a side note to that discussion: my VPN works with most services I can’t access otherwise, while Reddit blocked me when I tried to access it to see for myself. I’m surprised.

        • Ð Greıt Þu̇mpkin@lemm.ee
          arrow-up
          4
          ·
          7 months ago

          Keeping a list of “fingerprints” of users is hardly invasive, and it’s only dangerous without proper database security.

          It can throw up false positives, but the key there is to make it as good at not doing that as possible, and having a reasonable means for users who feel like they were unfairly tagged as evaders to appeal the flag.

          Also, don’t do it automatically. Use it as a tool to identify possible cases and have a review team check which ones need the most immediate action, with help from a separate algorithm that prioritizes user reports by how reliably a user’s reports have pinged actionable content.

          That’s the entire game of security, not being perfect, but being good enough for the adversary to decide you might as well be perfect for all that their efforts would be worth. Ban evasion protection and bot prevention are no different.
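
          The review-queue idea above could be sketched like this, assuming each reporter has a track record of how many of their past reports led to action. The `Reporter` type, the smoothing rule, and `review_queue` are all hypothetical choices for illustration, not an existing moderation tool.

```python
from dataclasses import dataclass


@dataclass
class Reporter:
    name: str
    reports_made: int
    reports_actioned: int

    def reliability(self) -> float:
        # Laplace smoothing: a brand-new reporter starts at 0.5
        # instead of being trusted fully or not at all.
        return (self.reports_actioned + 1) / (self.reports_made + 2)


def review_queue(flags: dict[str, list[Reporter]]) -> list[tuple[str, float]]:
    """Rank flagged accounts by the summed reliability of their reporters."""
    scored = [
        (account, sum(r.reliability() for r in reporters))
        for account, reporters in flags.items()
    ]
    # Highest score first, so humans look at the most credible flags first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


alice = Reporter("alice", reports_made=18, reports_actioned=17)  # reliability 0.9
bob = Reporter("bob", reports_made=8, reports_actioned=0)        # reliability 0.1
carol = Reporter("carol", reports_made=0, reports_actioned=0)    # reliability 0.5

flags = {
    "suspect_a": [bob],
    "suspect_b": [alice],
    "suspect_c": [carol, bob],
}
print([account for account, _ in review_queue(flags)])
# ['suspect_b', 'suspect_c', 'suspect_a']
```

          Nothing here bans anyone; it only orders the pile for the humans doing the review.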

          • Dr. Moose@lemmy.world
            English
            arrow-up
            6
            ·
            7 months ago

            That’s the entire game of security, not being perfect, but being good enough

            Yes, and good enough is so hard to reach that it is in no way achievable with Lemmy’s volunteer resources. We literally have full-time people and massive AI-driven systems doing this professionally. This is in no way achievable in Lemmy if centralized Reddit with multi-million dollar budgets can’t even get close to “good enough”.

            • Ð Greıt Þu̇mpkin@lemm.ee
              arrow-up
              3
              ·
              7 months ago

              TBF, Reddit isn’t exactly trying all that hard, since ban evaders tend to be good for engagement metrics. Like half the measures they do employ, they only employ because they feel they have to in order to not look like they just blatantly don’t give a shit so long as the investor-watched metrics keep going up.

      • Dave@lemmy.nz
        arrow-up
        9
        ·
        7 months ago

        Lemmy has a system whereby admins talk to each other and share details of ban evaders, but different instances decide what is a bannable offence and not all of the 1000+ instances are involved.

      • Dr. Moose@lemmy.world
        English
        arrow-up
        21
        ·
        7 months ago

        Nope. You can do IP analysis to ban IPs that belong to a particular VPN, but you can’t ban VPN tech. There are so many VPN services and so many proxies, and it’s so easy to set up your own VPN, that even Netflix struggles with that.
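
        For reference, the IP-analysis part is the easy bit. A minimal sketch using Python’s standard `ipaddress` module, assuming you somehow already have a list of ranges belonging to a VPN provider. The ranges below are reserved documentation addresses, not real VPN exits, and `is_known_vpn` is a hypothetical helper.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), standing in for a
# provider's published or observed exit ranges.
KNOWN_VPN_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_known_vpn(ip: str) -> bool:
    """Return True if `ip` falls inside any listed range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in KNOWN_VPN_RANGES)


print(is_known_vpn("203.0.113.42"))    # True: inside the first range
print(is_known_vpn("198.51.100.200"))  # False: outside the /25
print(is_known_vpn("192.0.2.7"))       # False: not listed at all
```

        The check only knows what is on the list, so a self-hosted VPN on a fresh cloud IP passes straight through.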

          • lucullus@discuss.tchncs.de
            arrow-up
            9
            ·
            7 months ago

            I think these nations set up the ISPs to look for packets using a VPN protocol. This protocol is only used between the user and the VPN provider, so the target website doesn’t see it. Though I think this can be evaded too with a bit of work (masking the packets as normal web traffic). That is one reason why repressive regimes also want to control the devices of the user.

            • Call me Lenny/Leni@lemm.ee
              English
              arrow-up
              1
              ·
              7 months ago

              There are ways to ban them even if it evades detection though. Incompatible formatting and mobile applications come to mind. Charleston in South Carolina having the highest concentration of diehard privacy maintainers goes over nobody’s head, having it come up as one’s location is like wearing a label that says “I am not who I appear to be” and is the source of the most common geoblock in the free world. Probably a giveaway I help keep the peace in a few sites.

              • eyy@lemm.ee
                arrow-up
                2
                ·
                6 months ago

                Isn’t that just a game of whack-a-mole though? Ban VPNs ending in Charleston, people hop to another location. Rinse and repeat.

                • Call me Lenny/Leni@lemm.ee
                  English
                  arrow-up
                  1
                  ·
                  6 months ago

                  The right metaphor would be popping a pimple. Apply pressure and the oil just scatters and becomes aimless. They represent a powerhouse.

          • Monkey With A Shell@lemmy.socdojo.com
            arrow-up
            2
            ·
            6 months ago

            From the client to the VPN host it’s feasible to do protocol/port identification and prevent it that way. Some are significantly more difficult to do that for though, particularly when it uses something like HTTPS to blend in with the general flow. It’s possible to set up a national level proxy gateway, but that would require a user’s system to trust some alternate CA which would be really hard to enforce.

            Short version, there’s always a way around, but they can make it real tough for the average user.
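
            A toy illustration of the port-identification point, assuming the filter only looks at protocol and destination port. The table lists commonly used default ports for a few VPN protocols; `classify_flow` is a hypothetical helper, and a real filter would inspect packet contents as well.

```python
from typing import Optional

# Commonly used default ports; any of these can be reconfigured by the user.
WELL_KNOWN_VPN_PORTS = {
    ("udp", 1194): "OpenVPN",
    ("udp", 51820): "WireGuard",
    ("udp", 500): "IPsec IKE",
    ("udp", 4500): "IPsec NAT traversal",
    ("tcp", 1723): "PPTP",
}


def classify_flow(proto: str, dst_port: int) -> Optional[str]:
    """Return the VPN protocol suggested by the port, or None if nothing matches."""
    return WELL_KNOWN_VPN_PORTS.get((proto.lower(), dst_port))


print(classify_flow("udp", 1194))  # OpenVPN
print(classify_flow("UDP", 51820))  # WireGuard
print(classify_flow("tcp", 443))   # None: looks like ordinary HTTPS
```

            The last line is the hard case: a tunnel running over TCP 443 is indistinguishable from web traffic at this level.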