
“According to the policy the packet should not have been decrypted” after management upgrade to R80

It seems that Check Point has changed the way policies are compiled/handled. At one installation I saw a VPN connection break right after the policy was installed for the first time from the upgraded management server. Right after that, “According to the policy the packet should not have been decrypted” started appearing in the logs and traffic started being dropped.

A bit of searching on the Check Point support site for the message pointed to multiple objects having similar VPN domains. Going through the object database, I managed to find a “copy” of the peer gateway object: two gateway objects had the same external IP address and similar VPN domains defined. Fortunately that “duplicate” entry wasn’t in use, so I deleted it. After deleting the duplicate entry and pushing the policy again, the traffic started flowing again.
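
If you need to hunt for such duplicates yourself on an R80 management, the management API can dump all gateway objects with their IP addresses. A minimal sketch, assuming you can copy the JSON over to a machine that has jq installed:

mgmt_cli -r true show gateways-and-servers limit 500 details-level full --format json > gws.json
# list any IP address that belongs to more than one gateway object
jq -r '.objects[] | select(.["ipv4-address"] != null) | .["ipv4-address"]' gws.json | sort | uniq -d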

I wonder why the upgrade verification process showed no errors or warnings. It seems to me it should have flagged this, as it breaks things.


Nice feature/bug in Check Point NAT

It seems that I have stumbled upon an interesting Check Point firewall NAT behavior: the firewall does something it was not told to do and translates the source IP to a different address than the one in the policy.

I had a static inbound NAT rule where the client’s source address was supposed to be rewritten to another static address. The policy installed fine, and the client IP address was indeed changed and hidden from the service, but it was changed to the wrong address. The rule said to translate the client IP to, for example, 172.16.2.10, yet it was translated to 172.16.2.100, and the new address did not exist anywhere in the policy database.

And after upgrading that particular management server to R80.20, it refused to compile the policy with that NAT rule in place. Fortunately, recreating the NAT rule removed the issue.

The error for that issue in the FWM debug logs was:
“Invalid Object in Source of Address Translation Rule 57. The range size of Original and Translated columns must be the same.”

The interesting thing is that it was fixed simply by deleting the rule and re-adding it. And yes, it was a one-address-to-one-address translation, not a network-to-one-address one.
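
For reference, the error itself is normally legitimate: with Static NAT the Original and Translated columns must cover the same number of addresses. A made-up rule that would genuinely trigger it would look something like this:

Original source:   Net_192.0.2.0/24  (256 addresses)
Translated source: Host_172.16.2.10  (1 address)

In my case both columns were single hosts, so presumably the rule’s entry in the object database had gone bad somewhere along the way, which would also explain the bogus translated address.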


Upgrading a secondary management server to R80.20 GA issue

The Check Point upgrade guide for R80.20 is quite clear in saying that on a secondary server you should perform a clean install. All is fine and dandy with that. I don’t know if I am the first one to actually run into a problem with it.

The CPUSE clean install feature in the WebUI seemed to work, but only after I had resized the partitions to give it enough space for the upgrade: it was arguing that my 30 GB of free space was actually 29.83 GB. After making the currently active install partition smaller, all seemed well. Until the reboot, that is.
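
In case you need to do the same resizing, Gaia ships an interactive tool for its logical volumes; as far as I know the rough procedure is to check the usage first and then let the tool walk you through shrinking the active image volume:

# Expert mode: see how much space each partition uses
df -h
# interactive menu for resizing Gaia's logical volumes
lvm_manager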

After the server rebooted, I could see the new WebUI and was excited that it had actually kept the original IP address. Since CPUSE had warned that all settings would be wiped, that was a nice surprise.

Unfortunately, the nice things ended there: I was unable to log in. Neither the default “admin/admin” combo nor the previous credentials I had worked, and I found no hint on the Check Point support site either.

After a bit of googling around, I ended up just downloading the clean install ISO file and re-installing the machine. Had I used the ISO instead of CPUSE from the start, I would have saved a lot of time. I guess the CPUSE clean install feature is not that clean yet.


No logs after upgrade from R77.30 to R80.20?

Are you getting no logs on your Check Point log server after an upgrade from R77.30 to R80.20?

Are you sure you did the “Install Database” step on your SmartCenter right after it came back up? Until you do, no logs will be indexed besides some SmartCenter-local messages.
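
The step lives under Menu > Install database in SmartConsole. If I remember the R80.20 management API correctly, it can also be done from the CLI; the log server name below is made up:

# push the management database to the named log server
mgmt_cli -r true install-database targets.1 "my-log-server"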


F5 Big-IP LTM expired password issue

Although the issue I am writing about no longer exists in version 13.x, it is still relevant to lower versions.

Namely, when a user fails to change their password before it expires completely, they can’t log in to the web interface any more. They don’t get an error saying their password has expired, nor a prompt to change it; they get an error about invalid credentials.

When first investigating the issue, I changed the affected users’ passwords manually. But then I asked one user to try logging in over SSH: he was prompted to change his password, and after that he could successfully log in to the web interface again. And no, that user did not have CLI permissions. So if you are not in a hurry to upgrade to 13.x or later, you still have a workaround.
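
If you would rather not wait for the user, the password can also be reset from an admin tmsh session; the username here is made up:

# prompts you to type a new password for the locked-out account
tmsh modify auth user jsmith prompt-for-password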


F5 Big-IP password policy behavior

As it turns out, F5 Big-IP LTM devices apply/check the password policy only when a user changes their password. This means that users who existed before the policy was applied will not have their passwords expire, etc.

I know that checking password strength after a password has already been set is “kind of hard”. But the least the device could do is set existing passwords to expire according to the policy: where no expiry time exists, one should be applied to all users so the device actually complies with the policy it has configured. So, in my opinion, that is an oversight on F5’s part.

So in order to actually enforce the policy, you must make sure your users change their passwords after the password policy changes, so it actually applies to them.
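
For reference, the policy itself lives under auth password-policy in tmsh, so checking and setting the expiry goes roughly like this (the values are just examples):

# show the current password policy
tmsh list auth password-policy
# expire passwords after 90 days, warn 7 days in advance
tmsh modify auth password-policy max-duration 90 expiration-warning 7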


F5 Big-IP health checks mark a host resource down although it’s up

A couple of times I have run across a strange issue on some F5 Big-IP LTM clusters where one of the nodes marks some resources as down although they are actually up, which can cause quite a lot of confusion and trouble.

At least in the cases I have seen, TMM seems to start interpreting the output of health checks backwards for some hosts: in the logs you can see the health check returning that the host is up, yet the host is marked as down. I have had it happen a couple of times with the 11.x series LTM software, and it has also happened with the 12.x versions even at the latest patch levels. But I have not seen it happen with the 13.x version (yet).

To get around the issue, I have usually just restarted the TMM process on the affected device, after which everything has gone back to normal.

To restart TMM, just log in to the device over SSH and issue the following command:

tmsh restart /sys service tmm

Beware that restarting TMM will cause the device to stop processing traffic. So if the issue is on the active device of a Big-IP cluster, do a failover first if you haven’t already, as shown below.
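
If I remember the tmsh syntax correctly, forcing the active unit over is a one-liner:

# run on the currently active unit; the peer takes over traffic
tmsh run /sys failover standby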

As with many other issues, the phrase “have you tried turning it off and on again” comes to mind and saves the day.


Check Point 1400 series SMB device VPN debug log fast rotation work-around

If you have ever had to debug VPNs on a Check Point SMB device, you might have noticed that they rotate their logs at every 1 MB, which means you might actually miss the information you were looking for. At least for me it was a problem when trying to get debug-level information on some VPN issues that occurred randomly.

So in order to get the required output, I added a 32 GB SD card to the firewall to extend its small storage, made some symlinks, and wrote a little script to capture all the output I needed for debugging.

So on to the details. After you have mounted your SD card, it is accessible at:

/mnt/sd

Before you enable debugging, you should replace the ikev2.xmll and ike.elg files with symbolic links pointing to the SD card, so you won’t run out of space on the built-in flash. You can do that with the following commands:

touch /mnt/sd/ike.elg /mnt/sd/ikev2.xmll
rm -f /opt/fw1/log/ike.elg /opt/fw1/log/ikev2.xmll
ln -s /mnt/sd/ike.elg /opt/fw1/log/ike.elg
ln -s /mnt/sd/ikev2.xmll /opt/fw1/log/ikev2.xmll

Now enable debugging like you usually would (see the relevant SK on the Check Point support site):

vpn debug trunc
vpn debug on TDERROR_ALL_ALL=5

And here is the script I used to copy the logs to the SD card as they were rotated:

#!/bin/bash
# Copy sfwd.elg.0 to the SD card whenever it has just been rotated.
while true
do
    # time of the last modification of the rotated log
    fmtime=$(stat -c %Y /opt/fw1/log/sfwd.elg.0)
    curtime=$(date +%s)
    # if the file changed within the last second, grab a copy
    if test $((curtime - fmtime)) -le 1
    then
        cp /opt/fw1/log/sfwd.elg.0 /mnt/sd/sfwd.elg-$fmtime
    fi
    sleep 1
done

So basically, it checks every second whether the sfwd.elg.0 file has changed and copies the changed file to the SD card. I also experimented with using logger to ship the log to a central server via syslog, but that just didn’t work: it sent the first one fine, then the subsequent changes were simply dropped, so I opted for the copying.
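
To keep it running while you reproduce the issue, save the script on the SD card (the file name here is my own invention) and start it in the background, assuming nohup exists on your build:

chmod +x /mnt/sd/copylogs.sh
nohup /mnt/sd/copylogs.sh &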


Fixing SmartDashboard crashing after receiving the “Disconnected_Objects already created by another user” error

Today I happened upon an error in SmartDashboard after it randomly crashed and refused to start again: after the crash it kept showing the error “Disconnected_Objects already created by another user” and then crashing again. A quick lookup on Check Point’s support site gave me the idea that the SmartMap cache might be corrupted. So here is a quick copy-paste of the commands needed to reset the SmartMap cache on R77.30 on Gaia.

mkdir -p /var/tmp/SmartMap_Backup/
cpstop
cd $FWDIR/conf/SMC_Files/vpe/
mv mdl_version.C /var/tmp/SmartMap_Backup/mdl_version.C
mv objects_graph.mdl /var/tmp/SmartMap_Backup/objects_graph.mdl
cd $FWDIR/conf/
mv applications.C /var/tmp/SmartMap_Backup/applications.C
mv CPMILinksMgr.db /var/tmp/SmartMap_Backup/CPMILinksMgr.db
cpstart
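
These are Expert mode commands; $FWDIR resolves to the firewall installation directory, typically /opt/CPsuite-R77/fw1 on an R77.30 Gaia box.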

After doing that I was able to start SmartDashboard again and continue working! 🙂

If you are running your management server on Windows or are using a Multi-Domain Server, you can find the commands for those systems in “sk92142”, which is titled “SmartDashboard crashes when loading SmartMap data, after upgrading the Security Management Server”.


Check Point unable to delete IKE/IPSEC SA on an SMB device cluster

On a Check Point SMB 1400 series appliance cluster running R77.20.75, I ran into an issue where, after changing the peer gateway’s IP address, the VPN did not want to come up again and “vpn tu” showed SAs relating to the old peer IP address. The “vpn tu” delete command did not remove them. Disabling the VPN community and removing the gateways from it did nothing either; the stubborn SAs remained, and even waiting for the timeouts to occur did nothing.

What in the end actually removed the stuck SAs was running “cpstop” and “cpstart” on both devices, with a manual failover in between. After that, “vpn tu” no longer showed the stuck SAs and the VPN started working again with the peer’s new IP address.
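
In short, the recovery sequence looked roughly like this, run in Expert mode:

# on the standby member first
cpstop && cpstart
# fail the cluster over, then repeat on the (now standby) other member
cpstop && cpstart
# verify that the stale SAs are gone
vpn tu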
