• AllHailTheSheep@sh.itjust.works
    edited · 15 hours ago

    according to that page the issue stemmed from an underlying system responsible for health checks in load balancing servers.

    how the hell do you fuck up a health check config that bad? that’s like messing up smartd.conf and taking your system offline somehow

    • tatterdemalion@programming.dev
      2 hours ago

      If your health check is broken, then you might not notice that a service is down and you’ll fail to deploy a replacement. Or the opposite, and you end up constantly replacing it, creating a “flapping” service.
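      Both failure modes can be sketched in a few lines. This is a hypothetical orchestrator reconcile loop (all names invented, not any real AWS internals): a check that never fails hides dead services, and a check that always fails triggers constant replacement, i.e. flapping.

      ```python
      # Minimal sketch of an orchestrator's health-check decision (hypothetical names).

      def reconcile(service_is_up, health_check):
          """Decide whether to replace an instance based on its health check."""
          if health_check(service_is_up):
              return "keep"     # looks healthy: leave it alone
          return "replace"      # looks unhealthy: deploy a replacement

      honest_check = lambda up: up     # reports the real state
      always_ok    = lambda up: True   # broken: never fails, dead services go unnoticed
      always_fail  = lambda up: False  # broken: always fails, constant replacement ("flapping")

      print(reconcile(False, always_ok))    # dead service kept: outage undetected
      print(reconcile(True, always_fail))   # healthy service replaced anyway: flapping
      ```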

    • ayyy@sh.itjust.works
      14 hours ago

      Well, you see, the mistake you are making is believing a single thing the stupid AWS status board says. It is always fucking lying, sometimes in new and creative ways.

    • flux@lemmy.ml
      14 hours ago

      I mean, if your OS was “smart” enough not to send IO to devices that indicate critical failure (e.g. by marking them read-only in the array?), and then it thinks all devices have failed critically, wouldn’t the same thing happen in that kind of system as well…