Category: F5 BigIP

F5 Big-IP LTM expired password issue

Although the issue I am writing about doesn’t exist anymore in version 13.x, it is still relevant to lower versions.

Namely, when a user fails to change their password before it expires completely, they can't log in to the web interface any more. They don't get an error saying that their password has expired, nor do they get a prompt to change it. They actually get an error about invalid credentials.

Initially, when investigating the issue, I changed the affected users' passwords manually. But then I asked one user to try to log in using SSH. What happened was that he was prompted to change his password, and after that he could successfully log in to the web interface again. And no, that user did not have CLI permissions. So if you are not in a hurry to upgrade to version 13.x and up, you still have a workaround.
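
The workaround from the user's side is simply an interactive SSH login, which triggers the password change prompt even for accounts without CLI permissions. A hypothetical example, where "jdoe" and 192.0.2.10 are placeholders for the affected account and the management address:

# jdoe and 192.0.2.10 are placeholders for the affected user and the management IP
ssh jdoe@192.0.2.10
# the login prompts for a new password; once it is set, the Web UI login works again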

F5 Big-IP password policy behavior

As it turns out, F5 Big-IP LTM devices apply and check the password policy only when a user changes their password. What that means is that users who existed prior to the policy being applied will not have their passwords expire, and so on.

I know that checking password strength after the password has already been set is “kind of hard”. But the least the device could do is set the passwords to expire according to the policy: if no expiry time exists, one should be applied to all users, so that the device actually complies with the policy it has configured. So, in my opinion, that is an oversight on F5’s part.

So in order to actually enforce the policy, you must take care that your users change their passwords after the password policy changes, so that the new settings actually apply to them.
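
If you want to check what the device is actually configured to enforce, the policy can be inspected and adjusted with tmsh. A minimal sketch, where the 90-day maximum password age is just an example value:

tmsh list auth password-policy
# set the maximum password age (in days) enforced by the policy
tmsh modify auth password-policy max-duration 90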

F5 BigIP health checks mark host resource down although it’s up

A couple of times I have run across a strange issue on some F5 Big-IP LTM clusters where one of the nodes marks some resources as down although they are actually up, which can cause quite a lot of confusion and trouble.

At least in the cases that I have seen, TMM seems to start interpreting the output of health checks backwards for some hosts: in the logs you can see that the health check returned that the host is up, yet the host was marked as down. I have had it happen a couple of times with the 11.x series LTM software, and it has also happened with the 12.x versions, even with the latest patch levels. But I have not seen it happen with the 13.x version (yet).
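
If you suspect you are hitting this, the monitor status messages in the LTM log are the place to look; something along these lines (the grep pattern is only an illustration):

# monitor up/down transitions are logged in /var/log/ltm
grep -i "monitor status" /var/log/ltm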

So in order to get around the issue I have usually just restarted the TMM process on the affected device, and everything has gone back to normal afterwards.

Basically, to restart TMM, just log in to the device using SSH and issue the following command:

tmsh restart /sys tmm

Beware that restarting TMM will cause the device to stop processing traffic. So, if the issue is on a device that is currently processing traffic and you are running a Big-IP cluster, just do a fail-over first if you haven't already done so.
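
To fail over, you can force the currently active device to standby with the following tmsh command before restarting TMM:

tmsh run sys failover standby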

As with many other issues, the phrase “have you tried turning it off and on again” comes to mind and saves the day.

Renewing F5 BigIP LTM expired device certificates

Every once in a while it is necessary to renew the device certificates on your BigIP devices, which are used for the Web UI (XUI) connection. It's easy enough to do using the web interface. When the certificate hasn't expired yet, just log in to the Web UI using any web browser you like; but once the certificate has already expired, Edge/Chrome/Firefox won't let you in (no, there is no “proceed” button, since the management interface uses strict settings), while Internet Explorer will still work. If you don't have Internet Explorer available, it can also be done via the command line interface.

To renew the device certificate using the web interface, just log in to the management interface, go to the page System ›› Device Certificates : Device Certificate ›› Device Certificate and click the Renew button. There you can choose whether you want to create a new self-signed certificate or generate a certificate request for your company's internal CA, or some external CA if you prefer.

In a clustered environment, after you renew the certificate on one device, you need to sync the configurations between the devices before proceeding to update the others. If you don't do a config sync in between, you may end up having to renew the already-renewed certificates again, as the config sync will push the old certificates back into active state on the other devices, since it has no information about the peers' new certificates.
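
The sync itself can be done from the Device Management section of the web interface or from the command line; a minimal sketch, where "dg-failover" is a placeholder for your sync-failover device group name:

tmsh run cm config-sync to-group dg-failover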

When renewing device certificates using the command line, you will need to use openssl to generate the new RSA private key and certificate request, and then use tmsh to activate the newly created key/certificate pair.

OpenSSL command example for generating a new RSA key and creating a certificate request:

openssl req -out CSR.csr -new -newkey rsa:2048 -nodes -keyout privateKey.key

OpenSSL command example for generating a new self-signed certificate:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout privateKey.key -out certificate.crt

The newly created private key should be placed in the /config/httpd/conf/ssl.key/ directory and the newly created certificate in the /config/httpd/conf/ssl.crt/ directory. After you have placed them there, the command to activate the new key/certificate pair using tmsh is:

tmsh modify /sys httpd ssl-certkeyfile /config/httpd/conf/ssl.key/new-private.key ssl-certfile /config/httpd/conf/ssl.crt/new-certificate.crt
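
After activating the new pair it is a good idea to save the configuration, and restarting httpd makes sure the management interface starts serving the new certificate; these are standard tmsh commands, but verify the behaviour on your own version first:

tmsh save sys config
tmsh restart sys service httpd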

 

Why using VMware vMotion on an active F5 BigIP LTM VE cluster member can be a bad idea

Although F5 states that, starting from version 11.5, it supports using vMotion to move a BigIP LTM VE instance between physical hosts (K15003222), sometimes it can still cause issues, even in the newer 12.x series software. For those that don't want to click on the link and read what F5 has to say about it, here are their recommendations for using vMotion:

  • You should perform a live migration of BIG-IP VE virtual machines on idle BIG-IP VE virtual machines. Performing a live migration of the BIG-IP VE system while the virtual machine is processing application traffic may produce unexpected results, such as dropped connections.
  • Using the vMotion feature to migrate one member of a high availability (HA) pair should not cause a failover in most cases. However, F5 recommends that you thoroughly test vMotion migration for HA systems, as you may experience different results, depending on your environment.

Well, having tested it, I have to say that yes, moving an active member is a bad idea, since it can have “nice” side effects in certain cases. I like their “unexpected results” statement: I have seen one BigIP LTM instance drop half its inbound connections after a vMotion, in such a way that even after a reboot and an upgrade to a newer patch level it still dropped connections from certain IP addresses. The dropped connections didn't even show up in tcpdump, and no, they didn't go to the standby node, they just vanished; as soon as I forced that device to standby, they re-appeared on the other node. So be very careful about what you migrate during the night, as unexpected things might happen…

But at least in my case, running vMotion on the BIG-IP VE virtual machine again, this time while it was in standby mode, and then making it active again got traffic flowing normally once more.

ConfigSync issue upon resource pool member IP address change

Well, as it turns out, changing a node's IP address in a clustered environment doesn't go as smoothly as one would expect. Not only that, F5 have made an annoyingly complex procedure out of something as simple as changing one back-end server's IP address. What I mean by that is that in order to change the IP address of a node you actually have to delete the node and then re-create it. But when that node runs a ton of services on different ports and is part of a large number of resource pools, you have to remove it from all of those resource pools before you can delete it. And then, after creating the node again, you have to put it back into the pools you need it to be in.

As it turns out, in a clustered environment, when you do the aforementioned procedure and then try to sync the cluster member settings manually, it will fail with an error message saying something like this:

0107003c:3: Invalid pool member modification. An IP address change from (192.168.1.1) to (192.168.1.2) is not supported.

So in order to avoid that error, you need to sync the devices after you have finished all the deletion steps for the node; only after that config sync should you proceed with creating the node on the new IP address. A rough outline of the whole procedure is sketched below.
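
A minimal sketch of the order of operations in tmsh, assuming a node named app-server-01 that is a member of a single pool web-pool on port 443 and a device group named dg-failover; all names and addresses are placeholders:

# repeat the member deletion for every pool the node belongs to
tmsh modify ltm pool web-pool members delete { app-server-01:443 }
tmsh delete ltm node app-server-01
# sync the deletion to the peer before re-creating the node
tmsh run cm config-sync to-group dg-failover
tmsh create ltm node app-server-01 address 192.168.1.2
tmsh modify ltm pool web-pool members add { app-server-01:443 }
tmsh run cm config-sync to-group dg-failover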

If you are already in the state where it refuses to sync, what helps is deleting the troublesome node on the secondary device as well and then performing the config sync.
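
From the command line that would be something along these lines; as on the primary, the node may first have to be removed from any pools that still reference it (placeholder names again):

# on the secondary device that refuses to accept the sync
tmsh delete ltm node app-server-01
# then push the configuration again from the device holding the desired changes
tmsh run cm config-sync to-group dg-failover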

Removing stubborn client connections on F5 BigIP

On F5 BigIP LTM devices the connection table can be viewed with the “tmsh show sys connection” command, which prints out the entire connection table. To get more specific results, it has the following parameters available for filtering:

age connection-id cs-client-addr cs-client-port cs-server-addr cs-server-port protocol ss-client-addr ss-client-port ss-server-addr ss-server-port type

The cs-* parameters relate to connections on the external side of your load balancer, the client side in F5 terms, while the ss-* parameters relate to the server side, i.e. the connection between the load balancer and the pool member. To see a single client's connections to your device you could issue the following command:

tmsh show sys connection cs-client-addr 172.16.1.100

Which would produce the following output in my case:

Sys::Connections

172.16.1.100:12727  192.168.32.20:443  192.168.1.254:12727  192.168.1.10:443  tcp  213  (tmm: 0)  none

Total records returned: 1

The output shows that the client with the IP address 172.16.1.100 is connected to the virtual server running on IP address 192.168.32.20 and port 443, and that the connection itself has been sent to the back-end server with the IP address 192.168.1.10.

Let's say you have disabled that node in your load balancer, but the client is still connected to that server and you want to remove the client's connection so that it gets sent to a new resource pool member. You can remove the connection with the following command:

tmsh delete sys connection cs-client-addr 172.16.1.100 cs-server-addr 192.168.32.20 cs-server-port 443

You can get even more specific about which connection you want to delete by using the other parameters mentioned in the beginning, such as cs-client-port.
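
For example, using the values from the output above, you could target that single connection by also specifying the client port; the port 12727 is simply the one shown in the example output:

tmsh delete sys connection cs-client-addr 172.16.1.100 cs-client-port 12727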