So, a few days ago, I scheduled an upgrade of a FortiGate HA pair from 6.2.3 to 6.4.5.
The task was to be performed remotely, but the car was fueled up, permission to access the premises in case of trouble had been given, the maintenance window was long enough, and all the images I was going to need were downloaded to a local VM, including the 6.2.3 version in case I needed to roll back. And of course, a backup of the configuration was taken.
The HA pair was configured as Active-Passive and, according to the design, no more than a single ping should be lost during the upgrade. That is not what happened.
Frustration number one. Since v6, all upgrades involve downtime. Here is how an HA upgrade should work, and how it did work on versions 5… The Secondary device is upgraded first and rebooted. If all is OK, the Primary device is then upgraded and rebooted while the Secondary takes over. When the Primary is up again, it takes over and you may lose a ping or two, but that's it.
I guess HA does not mean High Availability, at least during upgrades, because this is no longer the case. When the Primary unit is rebooted, the Secondary does not take over… Anyway, as I had a long maintenance window, I did not care much, so I issued the command to upgrade to 6.4.2 (the next step on the official upgrade path) and waited the 2 x (3-4) minutes for the Master/Primary unit to boot up. When the unit came up, the Secondary of the HA pair was no longer visible and the unit could not connect to the internet. Excellent… Hop in the car, drive to the location and connect to the Secondary/Slave unit with my trusted USB-to-serial adapter and blue Cisco console cable.
Frustration number two. config system ha and show… Nothing… OK, it lost the HA configuration… get system status reports Current HA mode: STANDALONE. How nice! Two units with exactly the same IP configuration fighting over which one is better… Keep calm and copy-paste… I move the cable to the Primary unit, copy the HA configuration to a notepad, change the priority to something lower and paste it to the Secondary unit. Simple, right? Well… No. The heartbeat interface was set to one of the management ports. set hbdev "MGMT2" throws an error saying that it can't be used as a heartbeat interface. Wait, what? I just copied the config from the Primary unit and it was accepted there…
Anyway, to cut a long story short, I changed the heartbeat interface to something else, HA recovered, and the following upgrades went through without a hitch.
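The copy, edit and paste step described above is easy to get wrong by hand under pressure, so here is a minimal Python sketch of it. It takes the HA block as shown on the Primary, swaps the heartbeat interface name and sets a new priority. The function name, the replacement interface "port5" and the priority values are all hypothetical, and I am assuming the heartbeat interface also has to be changed on the Primary so that both units match.

```python
def secondary_ha_block(primary_block: str, old_hbdev: str,
                       new_hbdev: str, priority: int) -> str:
    """Return a copy of the Primary's 'config system ha' block, edited
    for the Secondary: heartbeat interface swapped, priority replaced.
    (Hypothetical helper, not a Fortinet tool.)"""
    out = []
    for line in primary_block.splitlines():
        stripped = line.strip()
        indent = line[: len(line) - len(line.lstrip())]
        if stripped.startswith("set hbdev "):
            # Swap only the quoted interface name; keep the rest of the line.
            out.append(line.replace(f'"{old_hbdev}"', f'"{new_hbdev}"'))
        elif stripped.startswith("set priority "):
            out.append(f"{indent}set priority {priority}")
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Illustrative block only; values are made up for the example.
    primary = "\n".join([
        "config system ha",
        '    set group-name "lab"',
        "    set mode a-p",
        '    set hbdev "MGMT2" 50',
        "    set priority 200",
        "end",
    ])
    print(secondary_ha_block(primary, "MGMT2", "port5", 100))
```

The output is plain text meant to be reviewed before it is pasted into the console session; nothing here talks to the firewall.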
Conclusion. Do not use the MGMT interfaces for heartbeat if you are running version 6.4.
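Since a configuration backup is taken before the upgrade anyway, a check for this can be run against it beforehand. The following is a rough Python sketch, not an official tool: it looks for the set hbdev line inside config system ha and flags any interface whose name starts with "mgmt". The function names are hypothetical, and the "mgmt" prefix match is an assumption based on the MGMT2 port in this story; other models may name their management ports differently.

```python
import re
from typing import List


def hbdev_interfaces(config_text: str) -> List[str]:
    """Return the interface names on the 'set hbdev' line of the
    top-level 'config system ha' block of a backup file."""
    names: List[str] = []
    depth = 0  # 0 = outside the HA block, 1 = directly inside it
    for line in config_text.splitlines():
        stripped = line.strip()
        if depth == 0:
            if stripped == "config system ha":
                depth = 1
            continue
        if stripped.startswith("config "):
            depth += 1  # nested sub-table inside the HA block
        elif stripped == "end":
            depth -= 1
        elif depth == 1 and stripped.startswith("set hbdev "):
            # Interface names are the quoted tokens on the line.
            names.extend(re.findall(r'"([^"]+)"', stripped))
    return names


def mgmt_heartbeats(config_text: str) -> List[str]:
    """Heartbeat interfaces that look like management ports
    (assumption: their names start with 'mgmt', case-insensitive)."""
    return [n for n in hbdev_interfaces(config_text)
            if n.lower().startswith("mgmt")]


if __name__ == "__main__":
    # Illustrative snippet only; not taken from a real backup.
    sample = "\n".join([
        "config system ha",
        "    set mode a-p",
        '    set hbdev "MGMT2" 50 "port3" 50',
        "    set priority 200",
        "end",
    ])
    print(mgmt_heartbeats(sample))  # ['MGMT2']
```

A non-empty result would be the cue to move the heartbeat to another interface on both units before starting the upgrade.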
Questions for Fortinet.
1. Why is there no documentation (at least I did not find any) stating that the management interface cannot be used for heartbeat?
2. Why does the upgrade not check for commands that will no longer work? Especially with HA, silently converting the Secondary unit to Standalone is dangerous, to say the least.
3. Why did the Primary unit accept the command?
Don't get me wrong. I love the Forti* products. They are good value for money and I recommend them to friends and colleagues, but I ran into some issues that should not have passed their Quality Control.