@[email protected]

Mid 2022, a friend of mine helped me set up a selfhosted Vaultwarden instance. Since then, my “infrastructure” has not stopped growing, and I’ve been learning each and every day about how services work, how they communicate and how I can move data from one place to another. It’s truly incredible, and my favorite hobby by a long shot.

Here’s a map of what I’ve built so far. Right now, I’m mostly done, but surely time will bring more ideas. I’ve also left out a bunch of “technically revelant” connections like DNS resolution through the AdGuard instance, firewalls and CrowdSec on the main VPS.

Looking at the setups that others have posted, I don’t think this is super incredible - but if you have input or questions about the setup, I’ll do my best to explain it all. None of my peers really understand what it takes to construct something like this, so I am in need of people who understand my excitement and proudness :)

Edit: the image was compressed a bit too much, so here’s the full res image for the curious: https://files.catbox.moe/iyq5vx.png And a dark version for the night owls: https://files.catbox.moe/hy713z.png

  • jkrtn@lemmy.ml
    link
    fedilink
    English
    arrow-up
    5
    ·
    11 months ago

    I’ve seen Caddy mentioned a few times recently, what do you like about it over other tools?

    • 7Sea_Sailor@lemmy.dbzer0.comOP
      link
      fedilink
      English
      arrow-up
      7
      ·
      11 months ago

      In addition to the other commenter and their great points, here’s some more things I like:

      • ressource efficient: im running all my stuff on low end servers, and cant afford my reverse proxy to waste gigabytes of RAM (kooking at you, NPM)
      • very easy syntax: the Caddyfile uses a very simple, easy to remember syntax. And the documentation is very precise and quickly tells me what to do to achieve something. I tried traefik and couldn’t handle the long, complicated tag names required to set anything up.
      • plugin ecosystem: caddy is written in go, and very easy to extend. There’s tons of plugins for different functionalities, that are (mostly) well documented and easy to use. Building a custom caddy executable takes one command.
      • jkrtn@lemmy.ml
        link
        fedilink
        English
        arrow-up
        2
        ·
        11 months ago

        I think the two of you have convinced me to check it out! It is sounding pretty great, so thank you in advance.

    • xantoxis@lemmy.world
      link
      fedilink
      English
      arrow-up
      6
      ·
      11 months ago

      I can answer this one, but mainly only in reference to the other popular solutions:

      • nginx. Solid, reliable, uncomplicated, but. Reverse proxy semantics have a weird dependency on manually setting up a dns resolver (why??) and you have to restart the instance if your upstream gets replaced.
      • traefik. I am literally a cloud software engineer, I’ve been doing Linux networking since 1994 and I’ve made 3 separate attempts to configure traefik to work according to its promises. It has never worked correctly. Traefik’s main selling point to me is its automatic docker proxying via labels, but this doesn’t even help you if you also have multiple VMs. Basically a non-starter due to poor docs and complexity.
      • caddy. Solid, reliable, uncomplicated. It will do acme cert provisioning out of the box for you if you want (I don’t use that feature because I have a wildcard cert, but it seems nice). Also doesn’t suffer from the problems I’ve listed above.
      • OSH@lemmy.ml
        link
        fedilink
        English
        arrow-up
        2
        ·
        11 months ago

        Fully agree to this summary. traefik also gave me a hard time initially, but once you have the quirks worked out, it works as promised.

        Caddy is absolutely on my list as an alternative, but the lack of docker label support is currently the main roadblocker for me.

      • jkrtn@lemmy.ml
        link
        fedilink
        English
        arrow-up
        2
        ·
        11 months ago

        I feel so relieved reading that about traefik. I briefly set that up as a k8s ingress controller for educational purposes. It’s unnecessarily confusing, brittle, and the documentation didn’t help. If it’s a pain for people in the industry that makes me feel better. My next attempt at trying out k8s I’ll give Kong a shot.

        I really like solid, reliable, and uncomplicated. The fun part is running the containers and VMs, not spending hours on a config to make them accessible.

        • notfromhere@lemmy.ml
          link
          fedilink
          English
          arrow-up
          2
          ·
          11 months ago

          I have traefik running on my kubernetes cluster as an ingress controller and it works well enough for me after finagling it a bit. Fully automated through ansible and templated manifests.

          • xantoxis@lemmy.world
            link
            fedilink
            English
            arrow-up
            1
            ·
            11 months ago

            Heh. I am, as I said, a cloud sw eng, which is why I would never touch any solution that mentioned ansible, outside of the work I am required to do professionally. Too many scars. It’s like owning a pet raccoon, you can maybe get it to do clever things if you give it enough treats, but it will eventually kill your dog.

            • notfromhere@lemmy.ml
              link
              fedilink
              English
              arrow-up
              1
              ·
              11 months ago

              Care to share some war stories? I have it set up where I can completely destroy and rebuild my bare metal k3s cluster. If I start with configured hosts, it takes about 10 minutes to install k3s and get all my services back up.

              • xantoxis@lemmy.world
                link
                fedilink
                English
                arrow-up
                2
                ·
                edit-2
                11 months ago

                Sure, I mean, we could talk about

                • dynamic inventory on AWS means the ansible interpreter will end up with three completely separate sets of hostnames for your architecture, not even including the actual DNS name. if you also need dynamic inventory on GCP, that’s three completely different sets of hostnames, i.e. they are derived from different properties of the instances than the AWS names.
                • btw, those names are exposed to the ansible runtime graph via different names i.e. ansible_inventory vs some other thing, based on who even fuckin knows, but sometimes the way you access the name will completely change from one role to the next.
                • ansible-vault’s semantics for when things can be decrypted and when they can’t leads to completely nonsense solutions like a yaml file with normal contents where individual strings are encrypted and base64-encoded inline within the yaml, and others are not. This syntax doesn’t work everywhere. The opaque contents of the encrypted strings can sometimes be treated as traversible yaml and sometimes cannot be.
                • ansible uses the system python interpreter, so if you need it to do anything that uses a different Python interpreter (because that’s where your apps are installed), you have to force it to switch back and forth between interpreters. Also, the python setting in ansible is global to the interpreter meaning you could end up leaking the wrong interpreter into the role that follows the one you were trying to tweak, causing almost invisible problems.
                • ansible output and error reporting is just a goddamn mess. I mean look at this shit. Care to guess which one of those gives you a stream which is parseable as json? Just kidding, none of them do, because ansible always prefixes each line.
                • tags are a joke. do you want to run just part of a playbook? --start-at. But oops, because not every single task in your playbook is idempotent, that will not work, ever, because something was supposed to happen earlier on that didn’t. So if you start at a particular tag, or run only the tasks that have a particular tag, your playbook will fail. Or worse, it will work, but it will work completely differently than in production because of some value that leaked into the role you were skipping into.
                • Last but not least, using ansible in production means your engineers will keep building onto it, making it more and more complex, “just one more task bro”. The bigger it gets, the more fragile it gets, and the more all of these problems rears its head.
                • notfromhere@lemmy.ml
                  link
                  fedilink
                  English
                  arrow-up
                  2
                  ·
                  edit-2
                  11 months ago
                  • Dynamic inventory. I haven’t used it on a cloud api before but I have used it against kube API and it was manageable. Are you saying through kubectl the node names are different depending on which cloud and it’s not uniform? Edit: Oh you’re talking about the VMs doh

                  • I’ve tried ansible vault and didn’t make it very far… I agree that thing is a mess.

                  • Thank god I haven’t ran into interpreter issues, that sounds like hell.

                  • Ansible output is terrible, no argument there.

                  • I don’t remember the name for it, but I use parameterized template tasks. That might help with this? Edit: include_tasks.

                  • I think this is due to not a very good IDE for including the whole scope of the playbook, which could be a condemnation of ansible or just needing better abstraction layers for this complex thing we are trying to manage the unmanageable with.

                  • xantoxis@lemmy.world
                    link
                    fedilink
                    English
                    arrow-up
                    2
                    ·
                    11 months ago

                    Really all of these have solutions, but they’re constantly biting you and slowing down development and requiring people to be constantly trained on the gotchas. So it’s not that you can’t make it work, it’s that the cost of keeping it working eats away at all the productive things you can be doing, and that problem accelerates.

                    The last bullet is perhaps unfair; any decent system would be a maintainable system, and any unmaintainable system becomes less maintainable the bigger your investment in it. Still, it’s why I urge teams to stop using it as soon as they can, because the problem only gets worse.

    • krash@lemmy.ml
      link
      fedilink
      English
      arrow-up
      2
      ·
      11 months ago

      I see everyone else have already chimed in on whats so great about Caddy (because it is!), one thing that has been a thorn in my side though is the lack of integration of fail2ban since Caddy has moved on from the old common log format and moved on to more modern log formats. So if you want to use a IPS/IDS, you’ll have to either find a creative hack to make it work with fail2ban or rely on more modern (and resource heavier) solutions such as crowdsec.

        • krash@lemmy.ml
          link
          fedilink
          English
          arrow-up
          2
          ·
          11 months ago

          Cool, thanks for this! As a user of Caddy through Docker, I suppose I need to find a way to build a docker image to be able to do this?

          Sometimes new simple technologies makes things simple - but only as long as one intends to follow how they are used… 🙃

          • xinayder@infosec.pub
            link
            fedilink
            English
            arrow-up
            2
            ·
            11 months ago

            I think so, but if you check the official image you can definitely find out how to include custom plugins in it. I think the documentation might mention a thing or two about it too.