WD-gtk defunct processes

With build 16 of WD-gtk I am noticing more and more defunct processes starting and hanging around. Sometimes they seem to occur after the “access violation” errors that show up in the programerrorlog.txt or in STDOUT. The system seems to be performing okay but this is a concern from a sysop viewpoint.

Mike - N7DQ


WeatherD-defunct.jpg

Hi Mike,

In the image, the second column is the parent process id. When you `grep' for it, who is the `parent'?
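For anyone following along, here is a minimal sketch of that lookup. It assumes a `ps` that accepts POSIX-style `-o` format options (procps on Linux does); the function names `list_zombies` and `parent_name` are made up for illustration and are not part of WD-gtk.

```shell
#!/bin/bash
# List zombie (defunct) processes with their parent PID.
# Output columns: PID, PPID, STAT, COMMAND. A STAT starting with "Z" marks a zombie.
list_zombies() {
    ps -A -o pid= -o ppid= -o stat= -o comm= | awk '$3 ~ /^Z/ { print }'
}

# Print the command name of a given PID, e.g. a PPID reported by list_zombies.
parent_name() {
    ps -o comm= -p "$1"
}

list_zombies
```

Feeding the PPID of any line printed by `list_zombies` to `parent_name` should show which program is leaving the children unreaped.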

One work-around is to run the application under `nohup':

nohup my_command &

pablo,

The parent process is the WeatherD process.

How does nohup impact defunct processes? The way things are starting up is fine, it is just that WeatherD has defunct processes hanging around.

Mike - N7DQ

Hi Mike,

Hmmm, I don’t run the GUI version so bear with me.

When you start WD-gtk, isn't this the `WeatherD' process? And isn't this the `defunct' process?

pablo,

You use GoWeather.sh to start the process: ./GoWeather.sh starts the WeatherD process. I run it as ./GoWeather.sh >myconsole.log 2>&1 & so that it runs in the background as a GUI app, freeing up the terminal session. The defunct processes are child processes of the WeatherD process. See the attached image.


Hi Mike,

We’re getting closer. :)

In the `./GoWeather.sh' script, please cut and paste (no need to use an image) how the `WeatherD' process is started.

Cheers.

pablo,

It is the default shell script that comes with the GUI release.

Mike - N7DQ

Hi Mike,

As I mentioned earlier, I don’t run the GUI version so I don’t have the script installed on my machines.

pablo,

Here it is:

#!/bin/bash

# Function to establish user id of current user.
# We call this function later on to test if they are ROOT and fail if they are.
getUID() {
id $1 | sed -e 's/(.*$//' -e 's/^uid=//'
}

# Test if user is root and deny access if they are.
# It is not a good idea to be running software as root.
if [ "`getUID`" = 0 ] ; then
echo "ERROR: You can't run this as ROOT. Change to another user and try again.";
exit;
fi
# End of User test.

DIR_PATH=`dirname $0`

# setup the path to include our directory
export PATH="$DIR_PATH":"$PATH"

# If we made it this far we mustn't be root.
# Just set the library path.
export LD_LIBRARY_PATH=$DIR_PATH/deploy/

# Then run the program.
$DIR_PATH/WeatherD

exit $?

Mike

Hi Mike,

Change this part of the code:


# Then run the program.
$DIR_PATH/WeatherD

to this:


# Then run the program.
nohup $DIR_PATH/WeatherD > /dev/null

Let’s see if that helps.

Cheers.

pablo,

Tried this, no luck. The nohup just keeps the process active if the terminal session ends, and the >/dev/null sends stdout to the null device. I did a ps aux and found the defunct processes are zombies.

Mike

Hi Mike,

Yup, glad you’re starting to understand `nohup'. As you are probably figuring out, the zombies are there because no process is catching the `death of the child process' signal (SIGCHLD) and reaping the children.

Please add `&' so it looks as follows:


# Then run the program.
nohup $DIR_PATH/WeatherD > /dev/null &

Let’s see if that gives us what we want. I think it will.

Let me know.

Cheers.

pablo,

The parent process is there, though. In this case I think the zombies are being created due to an issue with build 16, as build 15 did not have this issue. Zombies are dead processes that stay in the process table until the parent process waits on them and collects their exit status. I do know that once enough zombie processes accumulate they can cause a problem.

Mike - N7DQ

Hi Mike,

Zombies are processes whose parents have not set up a handler to reap, or explicitly ignore, the death of the child.
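To make that concrete, here is a rough sketch that creates a zombie on purpose. It assumes bash and a `ps` that accepts POSIX-style `-o` options; `zombie_demo` is just an illustrative name. The same thing happens whether or not the parent is started under `nohup' or with `&', since `nohup' only arranges for SIGHUP to be ignored and does nothing about SIGCHLD.

```shell
#!/bin/bash
# Sketch: produce a zombie on purpose and print its process state.
zombie_demo() {
    # The subshell forks a short-lived child, then exec's into a command
    # that never calls wait(), so nobody reaps the child when it exits.
    ( sleep 1 & exec sleep 3 ) > /dev/null 2>&1 &
    local parent=$!

    sleep 2   # by now the child has exited but has not been reaped

    # Print the STAT column of every process whose PPID is our parent.
    ps -A -o pid= -o ppid= -o stat= | awk -v p="$parent" '$2 == p { print $3 }'

    wait "$parent"   # after the parent exits, the zombie is handed to init for reaping
}

zombie_demo
```

On a typical Linux box the printed state should begin with `Z`. The zombie goes away only when its parent waits on it or exits, which is why the real fix has to come from the parent program.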

Brian was having the same problems fork()'ing the cron* processes. I wrote a little C library for him to use to avoid the issue with those processes. So I `kinda' know a little bit about zombie processes. :)

Have you tried using the `&'?

If you see `libutils', that’s the library I wrote … well, I sorta found it on the Web and tweaked it for Brian.

pablo,

I am trying it (should have mentioned that in the previous reply). I will let you know shortly. The zombies start getting created fairly quickly.

Mike

I can’t find it right now but I think there was a thread about this same issue quite some time ago.

Hi,

Was the issue resolved?

pablo,

Sorry for the delayed response – traveling again this week.

No, the problem was not resolved. The ‘&’ just placed the process in the background and had no impact on the creation of zombies. I have build 15, and when I get home I may drop back to that release to check whether the problem occurs there, as the changes made in build 16 could be causing it.

Can you explain to me how nohup may or may not impact the creation of zombies? The way I start the software, the GUI session is always active, which I expect in my environment, and I run the primary process in the background anyway to free up the GUI terminal session for other uses. Also, the ‘>/dev/null’ dumps the STDOUT messages to null, so I lose them, which does not help in monitoring the status of the environment.

Mike - N7DQ

Bummer on the `&'. I was hoping it'd cause `nohup' to set up a SIGCHLD trap.

On the `/dev/null', you could also redirect the output to a file to aid in debugging. I didn’t think you’d want a file hanging about.
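A sketch of what that could look like, in case it is useful. The helper name `start_logged' and the log file name in the usage comment are made up for illustration; neither comes from the shipped GoWeather.sh.

```shell
#!/bin/bash
# Start a command under nohup in the background, appending both stdout
# and stderr to a log file instead of discarding them.
# Usage: start_logged LOG_FILE COMMAND [ARGS...]
start_logged() {
    local log_file=$1
    shift
    nohup "$@" >> "$log_file" 2>&1 &
    echo $!   # PID of the background process
}

# Hypothetical use at the end of GoWeather.sh (log file name is an assumption):
# start_logged "$DIR_PATH/weatherd.log" "$DIR_PATH/WeatherD"
```

This keeps the console messages available for monitoring, but like the other variants it does nothing about the zombies themselves.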

I think Brian needs to chime in and fix the bug. I was just trying to get a work-around for you.