Finding missing free disk space in Linux, the power of lsof

There may be times when your Linux machine’s disk seems to be full and you can’t find the reason for it. You try to find the culprit with the du (disk usage) command, but without success: the numbers just don’t add up. In that case the problem might be deleted files that some program still holds open. This can happen with a faulty logrotate configuration that doesn’t tell the program writing the log to release the file, or when you manually delete a file that some program is still writing to.

In such cases the “lsof” command comes to the rescue. It does what the name says: it lists open files, including ones that have been deleted but are still in use. Here is a command that I sometimes use to check for deleted files that are still open:

lsof | grep deleted | awk '{$7=$7/1048576 "MB"; print}'

The command lists the deleted files that are still open, the process that is writing to them, and the size of the files. This is some example output from the last time I had to look for missing space:

java 32511 32646 tom 1w REG 12980.00024128MB 19510390447 6341662 /var/log/tomcat/log/catalina.out (deleted)
java 32511 32646 tom 2w REG 0.00024128MB 19510390447 6341662 /var/log/tomcat/log/catalina.out (deleted)

To reclaim the disk space, you simply need to kill or restart the program that is writing to the deleted file.
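
To see the effect for yourself, the situation can be reproduced in a few lines of shell. This is only a sketch and assumes a Linux system with /proc mounted; the function name and the use of file descriptor 3 are my own choices for illustration. It opens a temporary file, deletes it while it is still open, and then asks the kernel what the descriptor points to:

```shell
#!/bin/bash
# Illustrative sketch (Linux only, needs /proc): show that a file deleted
# while a process still holds it open stays around, flagged as "(deleted)".
# The function name and descriptor number are arbitrary choices.
demo_deleted_open_file() {
    local tmpfile
    tmpfile=$(mktemp) || return 1
    exec 3>"$tmpfile"           # hold the file open for writing on fd 3
    echo "some log data" >&3    # write to it, like a daemon writing its log
    rm -f "$tmpfile"            # remove the name; the open descriptor remains
    readlink /proc/self/fd/3    # readlink inherits fd 3 and prints its target
    exec 3>&-                   # closing the descriptor finally frees the space
}

demo_deleted_open_file
```

The printed path ends in “(deleted)”, which is the same marker the grep in the lsof command matches on. lsof also has a +L1 option that selects open files with a link count below one, that is, unlinked files, so you don’t have to match on the word “deleted”.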


What’s up with all the bad passwords out there

A bit over a week ago the list of the worst passwords of the year (2018) was released by SplashData. You can review it yourself at https://www.teamsid.com/100-worst-passwords-top-50/.

After having a look at it I found myself amazed at people’s choice of passwords. It baffles me that people still use passwords like “password” or “1234”, and that when websites require longer passwords they just keep counting up: instead of “1234” it’s now “12345678..”.

Do people still actually think that their passwords don’t matter? That no one will guess their username and password? By now almost everybody must have heard of the constant takeovers of people’s social media accounts through simple password guessing. If not that, then surely people must have come across someone trying to log in to their account at some point, for example through warnings from Gmail or similar services. Surely that should make people think.

To resist simple brute-force attacks a password doesn’t have to be complicated and hard to remember, like “x1Ds$!abFrdc?”. You can just use your favorite quote from somewhere, which would be very easy to remember and much more secure than the ones on the list. To be on the safer side you can add something to the beginning or end of it. That is a precaution against attackers who actually do some research on you, so that an attacker who sees that The Simpsons is your favorite TV show can’t simply guess that your password is “Eatmyshorts!”


F5 BigIP health checks mark host resource down although it’s up

A couple of times I have run across a strange issue on F5 Big-IP LTM clusters where one of the nodes marks some resources as down although they are actually up, which can cause quite a lot of confusion and trouble.

At least in the cases that I have seen, TMM seems to start interpreting the output of health checks backwards for some hosts. In the logs you can see that the health check reported the host as up and that the host was then marked as down. I have had it happen a couple of times with the 11.x series LTM software, and it has also happened with 12.x versions, even at the latest patch levels. I have not seen it happen with 13.x (yet).

So in order to get around the issue I have usually just restarted the TMM process on the affected device and all has gone back to normal after it.

To restart TMM, log in to the device using SSH and issue the following command:

tmsh restart /sys service tmm

Beware that restarting TMM will cause the device to stop processing traffic while it restarts. So if you are seeing the issue on the device that is currently processing traffic and you are running a Big-IP cluster, do a failover first if you haven’t already.

Like with many other issues the phrase “have you tried turning it off and on again” comes to mind and saves the day.


Check Point 1400 series SMB device VPN debug log fast rotation work-around

If you have ever had to debug VPNs on a Check Point SMB device, you might have noticed that they rotate their logs every 1 MB, which means that you might miss the information you were looking for. For me it was a problem when trying to get debug-level information on VPN issues that occurred randomly.

So in order to get the required output I added a 32 GB SD card to the firewall to extend its small storage, made some symlinks, and wrote a little script to capture all the output I needed for debugging.

So on to the details. After you have mounted your SD-card you have access to it on the path:

/mnt/sd

Before you enable debugging, you should replace the ikev2.xmll and ike.elg files with symbolic links that point to the SD card, so that you don’t run out of space on the built-in flash. You can do that with the following commands:

touch /mnt/sd/ikev2.xmll && touch /mnt/sd/ike.elg
ln -sf /mnt/sd/ike.elg /opt/fw1/log/ike.elg
ln -sf /mnt/sd/ikev2.xmll /opt/fw1/log/ikev2.xmll

Now enable debugging like you usually would (see the relevant SK on the Check Point support site):

vpn debug trunc
vpn debug on TDERROR_ALL_ALL=5

And here is the script I used to copy the logs to the SD-card as they were rotated:

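#!/bin/bash
# Copy sfwd.elg.0 to the SD card whenever it has just been rotated.
while true
do
    # Modification time of the rotated log and the current time, in epoch seconds
    fmtime=$(stat -c %Y /opt/fw1/log/sfwd.elg.0)
    curtime=$(date +%s)
    diff=$((curtime - fmtime))
    # Modified within the last second: save a copy named after its mtime
    if [ "$diff" -le 1 ]
    then
        cp /opt/fw1/log/sfwd.elg.0 /mnt/sd/sfwd.elg-$fmtime
    fi
    sleep 1
done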

So basically, every second it checks whether the sfwd.elg.0 file has changed and copies the changed file to the SD card. I also experimented with using logger to send the log to a central server via syslog, but that didn’t work: it sent the first file fine, but later changes were dropped, so I opted for copying.
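
A possible variation on the script above, sketched under my own assumptions: instead of comparing the file’s modification time with the clock, it names each copy after the modification time and skips the copy if that name already exists on the SD card. That way a rotation is not missed when the loop is delayed by more than a second. The function name copy_if_rotated is hypothetical, and it relies on the same stat -c %Y call as the original script.

```shell
#!/bin/bash
# Hypothetical helper: copy SRC into DESTDIR as <name>-<mtime>, once per mtime.
copy_if_rotated() {
    local src="$1" destdir="$2" mtime dest
    [ -f "$src" ] || return 0                  # nothing rotated yet
    mtime=$(stat -c %Y "$src") || return 1     # mtime in epoch seconds
    # sfwd.elg.0 -> sfwd.elg-<mtime>, matching the naming used above
    dest="$destdir/$(basename "$src" .0)-$mtime"
    [ -e "$dest" ] && return 0                 # this version is already saved
    cp "$src" "$dest"
}

# Usage on the appliance (paths as in the script above):
# while true; do copy_if_rotated /opt/fw1/log/sfwd.elg.0 /mnt/sd; sleep 1; done
```

Like the original, this still misses a rotation if the file is rotated twice between two checks.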


Fixing Smart Dashboard crashing after receiving “Disconnected_Objects already created by another user” error

Today I ran into an error in SmartDashboard after it randomly crashed and refused to start again. After the crash it kept showing me the error “Disconnected_Objects already created by another user” and crashing again. A quick lookup on Check Point’s support site gave me the idea that the SmartMap cache might be corrupted. So here is a quick copy-paste of the commands needed to reset the SmartMap cache in R77.30 on Gaia.

mkdir -p /var/tmp/SmartMap_Backup/
cpstop
cd $FWDIR/conf/SMC_Files/vpe/
mv mdl_version.C /var/tmp/SmartMap_Backup/mdl_version.C
mv objects_graph.mdl /var/tmp/SmartMap_Backup/objects_graph.mdl
cd $FWDIR/conf/
mv applications.C /var/tmp/SmartMap_Backup/applications.C
mv CPMILinksMgr.db /var/tmp/SmartMap_Backup/CPMILinksMgr.db
cpstart

After doing that I was able to start Smart Dashboard again and continue working! 🙂

If you are running your management server on Windows or are using Multi-Domain Server, you can find the commands needed to do the same on those systems in “sk92142”, which is about “SmartDashboard crashes when loading SmartMap data, after upgrading the Security Management Server”.


Windows Offline files not syncing in Windows 10

Usually I don’t have many issues with Windows 10, but after the last Windows update I somehow lost control over the contents of the “Documents” folder, which was being synced with a file server. I was able to add files but never delete them, getting the error “Permission Denied”. I talked to the domain admin, who looked over the permissions on the file server, and all seemed fine there. Resetting the offline files sync cache and so on (the usual hints you get when googling offline files sync issues) got me back the permissions on my files, or so I thought. After leaving the office I noticed in the evening that I had no Documents at all. It turned out that after the reset Offline Files was not syncing at all, and I could access my files only when I had connectivity to the file server. The issue was that Offline Files was stuck in a “sync pending” state and would not actually start the sync.

I tried the classic of rebooting the computer, but the sync would not start. I tried resetting the offline files cache again, with no success. What actually worked for me was running:

gpupdate /force

After the group policy had been re-applied, I clicked the sync offline files button and voilà, it synced like a charm again.


Check Point unable to delete IKE/IPSEC SA on a SMB device cluster

On a Check Point SMB 1400 series appliance cluster with R77.20.75 installed I ran into an issue where, after changing the peer gateway’s IP address, the VPN did not want to come up again and VPN TU showed SAs relating to the old peer IP address. The VPN TU delete command did not remove them. Disabling the VPN community and removing the gateways from it did nothing either; the stubborn SAs remained, and even waiting for the timeouts to occur did not help.

What in the end removed the stuck SAs was running “cpstop” and “cpstart” on both devices, with a manual failover in between. After that VPN TU no longer showed the stuck SAs, and the VPN started working again with the peer’s new IP address.


Windows 10 CPU speed not down clocking like it should / Intel speedstep issue and fix

On my Lenovo T480 I ran into a nice issue where the fan was constantly blowing and the CPU was always over 4 GHz, never clocking itself down. The Windows power plan was set to Balanced, as it should be, and switching between power plans had no effect: the CPU always ran at max speed. When I started googling I ran into various forum threads describing the same issue, with people saying that Windows power management settings sometimes get corrupted and that the only fix they had found was a re-install of Windows, which for me was an unacceptable solution…

What actually helped me get around the issue was changing the CPU minimum and maximum speeds in power management. I changed the minimum to 1% instead of 5% and the maximum to 90%, applied the settings, and noticed the CPU instantly clocked down to 0.89 GHz like it was supposed to. I then reset the “Balanced” power plan back to its default settings, and now the CPU speeds are as they should be: 0.89 GHz while idling and over 4 GHz under load.

For those who need more exact directions here is the step by step:

  • Open the Start menu by pressing the Windows key, type Edit Power Plan, and press Enter (or just click the result with the mouse).
  • Click on “Change advanced power settings”.
  • Go to the “Processor Power Management” subsection, open “Minimum Processor State”, and change the values for both “On battery” and “Plugged in” from their default value of 5% to 1%.
  • Next, change the “Maximum Processor State” values for “On battery” and “Plugged in” from their default value of 100% to 90%.
  • Click “Apply” to apply the changed power plan.
  • After applying the altered power plan, click on “Restore Plan Defaults” and then “Yes” in the prompt that warns you that the power plan settings will be reverted to their default values. Finally, click “Apply”.

And that should be it: now your computer should change its CPU speed based on real needs.


Windows 10 Media Creation Tool error 0x80004005 fix

When trying to create a Windows 10 USB installation disk you may get errors starting with code 0x80004005 and end up scratching your head, wondering why it isn’t working. When I got that error, what helped me get around it was emptying the Windows Update cache by doing the following:

  • Open Command Prompt with administrator rights:
    Click on the Start menu button and type cmd. A best match of "Command Prompt" will appear; right-click on it and select “Run as administrator”.
  • In that Command Prompt, stop the “Windows Update” service by typing the following:
    net stop wuauserv
  • If your computer is part of a Windows domain, it might be running the “Update Orchestrator Service” rather than the “Windows Update” service. In that case stop that service by typing the following:
    net stop "Update Orchestrator Service"
  • Next, stop the Background Intelligent Transfer and Cryptographic services by typing the following commands:
    net stop bits
    net stop cryptsvc
  • Now rename the folders used by Windows Update so that it re-creates them:
    ren %systemroot%\System32\Catroot2 Catroot2.old
    ren %systemroot%\SoftwareDistribution SoftwareDistribution.old
  • Now start the previously stopped services back up:
    net start wuauserv
    net start bits
    net start cryptsvc
    rem and, if necessary, also the Update Orchestrator Service
    net start "Update Orchestrator Service"
  • And that’s it: close the Command Prompt and retry creating your Windows 10 installation media.

If you are getting “Access is denied” on the “ren %systemroot%\SoftwareDistribution SoftwareDistribution.old” command and you haven’t stopped the “Update Orchestrator Service”, try stopping that.

 


Check Point R77.30 management interface crypto hardening (WebUI and SSH Cipher change)

By default the management interfaces (WebUI/SSH) of a Check Point firewall use crypto settings that are not that great (MD5, SSLv3, etc. are enabled), but fortunately it is possible to change them.

The SSH daemon is configured as in a normal Linux distribution, by editing /etc/ssh/sshd_config. Check Point’s support site recommends that you also modify the SSH client configuration in /etc/ssh/ssh_config. To change the encryption algorithms available when connecting to the firewall using SSH, add the following lines to both configuration files using vi in Expert mode:

Ciphers aes256-ctr,aes256-cbc,aes128-ctr,aes192-ctr,aes128-cbc,aes192-cbc
MACs hmac-sha1

After modifying the config file restart the SSH server using the following command:

service sshd restart

If everything is fine, your connection survives. If for some strange reason your SSH connectivity breaks and you can’t log back in, you can undo the changes using the terminal access available in the WebUI.
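
If you would rather not hand-edit the file, the same change can be sketched as a small shell function. This is only an illustration: the function name is made up, it assumes the file needs no special handling beyond replacing the Ciphers and MACs lines, and it keeps a copy of the original next to it so the change is easy to undo. It drops any existing Ciphers and MACs lines and puts the new ones at the top of the file:

```shell
#!/bin/bash
# Hypothetical helper: set the Ciphers/MACs lines from this post in an
# SSH config file, keeping the original as <file>.orig.
set_ssh_crypto() {
    local cfg="$1"
    cp -p "$cfg" "$cfg.orig" || return 1
    {
        echo 'Ciphers aes256-ctr,aes256-cbc,aes128-ctr,aes192-ctr,aes128-cbc,aes192-cbc'
        echo 'MACs hmac-sha1'
        # keep everything else, minus any old Ciphers/MACs settings
        grep -v -i -E '^[[:space:]]*(Ciphers|MACs)[[:space:]]' "$cfg.orig"
    } > "$cfg"
    return 0
}

# Usage: set_ssh_crypto /etc/ssh/sshd_config && sshd -t && service sshd restart
```

The sshd -t in the usage line asks sshd to check the configuration without starting, assuming the sshd on the appliance supports it the way stock OpenSSH does, so a typo is caught before the restart rather than after it.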

Now that the SSHD settings have been changed, let’s change the cipher suites available for HTTPS, which the WebUI uses. Connect to the command line using SSH and do the following in Expert mode.

  1. Backup the current file /web/templates/httpd-ssl.conf.templ:
    [Expert@HostName:0]# cp /web/templates/httpd-ssl.conf.templ /web/templates/httpd-ssl.conf.templ_ORIGINAL
  2. Edit the current /web/templates/httpd-ssl.conf.templ file:
    [Expert@HostName:0]# vi /web/templates/httpd-ssl.conf.templ
  3. Find the line containing the SSLCipherSuite parameter and change the value after it, for example to ECDHE-RSA-AES256-SHA384:AES256-SHA256:!ADH:!EXP:RSA:+HIGH:+MEDIUM:!MD5:!LOW:!NULL:!SSLv2:!SSLv3:!eNULL:!aNULL:!RC4
  4. Save and close the editor with :wq! (the ‘!’ at the end overrides the fact that the file has read-only permissions).
  5. Update the current configuration of HTTPD daemon based on the modified configuration template:
    [Expert@HostName:0]# /bin/template_xlate : /web/templates/httpd-ssl.conf.templ /web/conf/extra/httpd-ssl.conf < /config/active
  6. To activate the configuration changes restart the HTTPD daemon by using the “tellpm” command:
    [Expert@HostName:0]# tellpm process:httpd2
    [Expert@HostName:0]# tellpm process:httpd2 t

To find out what you actually want to use as the SSLCipherSuite value, you can use cpopenssl to see which algorithms are available with a given value. Example:

[Expert@HostName:0]# cpopenssl ciphers -v 'ECDHE-RSA-AES256-SHA384:AES256-SHA256:!ADH:!EXP:RSA:+HIGH:+MEDIUM:!MD5:!LOW:!NULL:!SSLv2:!eNULL:!aNULL:!RC4' | sort -k1

Expected output:

AES128-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(128) Mac=SHA1
AES256-SHA SSLv3 Kx=RSA Au=RSA Enc=AES(256) Mac=SHA1
DES-CBC3-SHA SSLv3 Kx=RSA Au=RSA Enc=3DES(168) Mac=SHA1
