• ikidd@lemmy.world
    5 months ago

    I’ve been running OPNsense on Proxmox for years now, and it just plugs along. I run ZFS for the datastores and take a snapshot before updates, but I’ve never had to use one.
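
    A minimal sketch of the snapshot-before-update habit, in Python. The dataset name and the "pre-update" label are hypothetical placeholders, not taken from the setup described here; the only real command assumed is `zfs snapshot pool/dataset@name`. It defaults to a dry run that just prints the command.

```python
import subprocess
from datetime import datetime, timezone
from typing import List, Optional


def snapshot_command(dataset: str, label: str = "pre-update",
                     now: Optional[datetime] = None) -> List[str]:
    """Build a `zfs snapshot` command with a timestamped snapshot name."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d-%H%M%S")
    return ["zfs", "snapshot", f"{dataset}@{label}-{stamp}"]


def take_snapshot(dataset: str, dry_run: bool = True) -> List[str]:
    """Print the command; only execute it when dry_run is False.

    Actually running it needs root privileges and an existing ZFS dataset.
    """
    cmd = snapshot_command(dataset)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical dataset name; substitute the one backing your VM disk.
    take_snapshot("rpool/data/vm-100-disk-0")
```

    If an update does go wrong, the matching rollback would be `zfs rollback` against the snapshot name printed above, run with the VM stopped.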

    I recently got it working with HA and inadvertently tested it when a drive failed on my primary node. I remoted in for something else and realized it had failed over to the second node about a week before, and I’d never heard a word from the family about the internet being down.

    • picnicolas@slrpnk.net
      5 months ago

      That’s great. It’s been chugging along beautifully with no downtime for me too. It’s just that one failed update attempt, losing internet and network while it was down and needing to plug Ethernet directly into the box to do the snapshot rollback late at night, made me afraid to try again. Last night it took me two hours to update everything: first Proxmox 7 to 8, then OPNsense, which needed 4 rounds of update and reboot, but each one was seamless.

      I’m also on ZFS with two primary mirrored drives. Do you have to check zpool status regularly to see if a drive has failed? Or is there some kind of warning system when logging in via SSH?
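
      For a scripted version of that check: `zpool status -x` prints the single line "all pools are healthy" when nothing is wrong, and details of the affected pool otherwise. A rough Python sketch built on that behavior follows; the function names are made up for illustration, and what to do on failure (email, push notification) is left out.

```python
import subprocess

# Exact one-line message `zpool status -x` prints when every pool is fine.
HEALTHY_MESSAGE = "all pools are healthy"


def pools_healthy(status_output: str) -> bool:
    """Return True only if the output is the 'all healthy' summary line."""
    return status_output.strip() == HEALTHY_MESSAGE


def check_pools() -> bool:
    """Run `zpool status -x` and interpret its output (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "status", "-x"],
        capture_output=True, text=True, check=False,
    )
    return pools_healthy(result.stdout)
```

      Run from cron or a systemd timer, a False result from `check_pools()` would be the trigger to send yourself a message.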

      I’m thinking of turning my rarely used Windows gaming PC into a Proxmox host with a Linux gaming VM for my next adventure.

      Edit: realized it was a whole node that failed, not just a drive. Cool setup! I’m not there yet. I’m curious about your setup, what’s between the modem and the router?

      • ikidd@lemmy.world
        5 months ago

        Proxmox will report SMART errors via email if you set that up. You could also run a system like Nagios to run the checks from another box. I actually run Home Assistant with the Proxmox HACS extension to monitor it. It’s on a VM so that isn’t ideal, so I now also run Node-RED on the little i5 PBS box to send alerts if it can’t contact Proxmox itself. The node going down without me realizing it was a bit of a wakeup call, though it failed my docker host and router over so seamlessly it was astounding.
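
        The "alert if it can’t contact Proxmox" part boils down to a reachability check run from a separate box. This is not the actual Node-RED flow, just a rough Python equivalent: it assumes the Proxmox web UI/API is listening on its default port 8006, and the host address in the example is a placeholder.

```python
import socket


def is_reachable(host: str, port: int = 8006, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder address; replace with your Proxmox node's IP or hostname.
    node = "192.168.1.10"
    if not is_reachable(node):
        print(f"ALERT: cannot reach Proxmox node {node} on port 8006")
```

        A TCP connect only shows the node is up and listening, not that the VMs on it are healthy, so it complements the SMART emails rather than replacing them.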

        I have nothing between the router and the modem except a switch, so each Proxmox node can have a NIC on the external network and a failover or migration can pick up the modem and use it. I suppose I could VLAN, but the servers have 2 network ports anyway so that works fine.