<br><font size=2 face="sans-serif">Hi,</font>
<br><font size=2 face="sans-serif">I have tested the patch and it works.</font>
<br><font size=2 face="sans-serif">However, the main thread could be the second one in the LWP list </font>
<br><font size=2 face="sans-serif">(as in my previous example).</font>
<br><font size=2 face="sans-serif">So it is hard to be certain, but consider it working.</font>
<br><font size=2 face="sans-serif">Thipadin.</font>
<br>
<br>
<br>
<br>
<table width=100%>
<tr valign=top>
<td>
<td><font size=1 face="sans-serif"><b>Ashley Pittman &lt;ashley@pittman.co.uk&gt;</b></font>
<p><font size=1 face="sans-serif">12/09/2009 12:41 PM</font>
<br>
<td><font size=1 face="Arial"> </font>
<br><font size=1 face="sans-serif"> To: thipadin.seng-long@bull.net</font>
<br><font size=1 face="sans-serif"> cc: florence.vallee@bull.net, francois.wellenreiter@bull.net, padb-devel@pittman.org.uk, Sylvain.JEAUGEY@bull.net</font>
<br><font size=1 face="sans-serif"> Subject: Re: tiny bug with --proc-summary</font></table>
<br>
<br><font size=2 face="Courier New">On Wed, 2009-12-09 at 11:29 +0100, thipadin.seng-long@bull.net wrote:<br>
> <br>
> Hi, <br>
> With the --proc-summary option, padb displays a pid which is actually a<br>
> thread ID (LWP) <br>
> for a process that has several threads, as shown: <br>
<br>
> What do you think? <br>
<br>
I can confirm there's a bug here; I can see it locally when I target a<br>
multi-threaded application on my laptop.<br>
<br>
What is happening is that the show_proc function is reporting data for<br>
all tasks in the program. This is probably the right thing for<br>
--proc-info, but for --proc-summary it's incorrect: many entries are<br>
recorded more than once for the same process, pid being one of them.<br>
This duplicate data is then passed back through the network to<br>
the outer process.<br>
<br>
At this point the tree_from_namespace function re-assembles the data<br>
on the assumption that each key has only one value from a given rank.<br>
Here that isn't true, so it picks one at random and reports it, which<br>
is what you see.<br>
<br>
Attached is a basic patch which fixes the issue by ensuring that only<br>
data from the first thread is forwarded back. This makes padb<br>
deterministic and causes it to show the pid you'd expect.<br>
<br>
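To illustrate the two paragraphs above, here is a rough sketch in Python<br>
rather than padb's Perl. The names collapse, all_threads and<br>
first_thread_only are hypothetical and are not taken from padb or from<br>
the patch; in this sketch the last duplicate wins, whereas in padb the<br>
choice is effectively arbitrary.<br>
<pre>
```python
def collapse(entries):
    """Collapse (rank, key, value) entries into one value per (rank, key).

    A later duplicate silently replaces an earlier one, so the result
    depends on arrival order when a rank reports a key more than once.
    """
    merged = {}
    for rank, key, value in entries:
        merged[(rank, key)] = value
    return merged


def all_threads(tasks_by_rank):
    """Yield entries for every task of every rank (the buggy behaviour)."""
    for rank, tasks in tasks_by_rank.items():
        for task in tasks:
            for key, value in task.items():
                yield rank, key, value


def first_thread_only(tasks_by_rank):
    """Yield entries for the first task of each rank only (the fix)."""
    for rank, tasks in tasks_by_rank.items():
        if not tasks:
            continue
        for key, value in tasks[0].items():
            yield rank, key, value


# Hypothetical data: rank 0 has two threads, rank 1 has one.
tasks = {0: [{"pid": 4242}, {"pid": 4243}], 1: [{"pid": 5000}]}
unfiltered = collapse(all_threads(tasks))
filtered = collapse(first_thread_only(tasks))
```
</pre>
With the sample data, unfiltered reports the second thread's id for rank 0,<br>
while filtered always reports the first one. Note that this only helps if<br>
the first task listed really is the main thread.<br>
<br>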
The wider issue here is how to handle multi-threaded programs. For<br>
example, I don't know how to calculate memory usage across threads. I'd<br>
assume they all have the same memory maps, with the possible exception of<br>
TLS, which means the value is probably both common to all threads and<br>
correct for the process as a whole. The percent cpu usage calculation,<br>
however, is almost certainly wrong: it would need to be calculated<br>
for each thread and summed across threads to get the true value.<br>
<br>
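As a minimal sketch of the per-thread summing, assuming Linux-style stat<br>
lines where utime and stime are fields 14 and 15: the function names are<br>
hypothetical and this is Python, not how padb currently computes the value.<br>
<pre>
```python
def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one stat line."""
    # The command name (field 2) is parenthesised and may itself contain
    # spaces or ')', so split on the last ')' before splitting on spaces.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state), so fields 14 and 15 are at 11 and 12.
    return int(fields[11]) + int(fields[12])


def process_cpu_ticks(thread_stat_lines):
    """Sum the CPU ticks of every thread belonging to one process."""
    return sum(cpu_ticks(line) for line in thread_stat_lines)


# Hypothetical stat lines for two threads of the same process.
main_thread = "4242 (a.out) S 1 4242 4242 0 -1 4194304 100 0 0 0 30 12 0 0"
worker = "4243 (my (app)) R 1 4242 4242 0 -1 4194304 5 0 0 0 7 1 0 0"
total = process_cpu_ticks([main_thread, worker])
```
</pre>
On Linux the per-thread lines would come from /proc/PID/task/TID/stat.<br>
Turning ticks into a percentage needs two samples and the elapsed time<br>
between them.<br>
<br>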
Ashley,<br>
<br>
-- <br>
<br>
Ashley Pittman, Bath, UK.<br>
<br>
Padb - A parallel job inspection tool for cluster computing<br>
http://padb.pittman.org.uk<br>
</font>
<br>
<br>