This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Integration of a NPTL Trace Tool into the glibc


Dear glibc hackers,

We have worked for three years on a tool for tracing NPTL internals and for 
measuring the performance and contention of multi-threaded applications. The 
tool is now mature, and we think it would be a good idea to include it in 
glibc. We explain why below.

Version 0.90.0 of the NPTL Trace Tool (http://nptltracetool.sourceforge.net/) 
is available for download at http://sourceforge.net/projects/nptltracetool/.

Our work was motivated by industrial needs. Trace tools are used less by 
desktop users than by professionals handling valuable data in critical 
activities. In an industrial context, people's work is based on methods and 
tools, and they need reliable, available and serviceable (RAS) systems.

Trace tools make it easier to diagnose a system when problems arise, which 
improves serviceability. They also support "First Failure Data Capture", 
another important industrial need: being able to understand a problem the 
first time it occurs.

Unfortunately, trace tools of this kind do not seem to be as popular on 
Linux systems as they could be.

Companies may be reluctant to migrate to Linux because the tools they need 
are missing; this is one reason why they prefer more professional systems 
that are widely used in industry. That is why we strongly believe that trace 
tools are not only useful but necessary in open source systems.

On this point, Linux is behind but improving. Work is already under way at 
the kernel level: tools such as LTT and SystemTap make it possible to trace 
kernel events, even if they are not as well integrated as similar tools in 
other operating systems. However, there is currently no way to trace glibc 
events, even though glibc is a central and critical system component.

Here is an analysis of the main arguments against trace tools in the glibc:

1. "Glibc internals are constantly modified. Any added code might break at the 
next update."
Right, but most parts of glibc are now stable: updates mainly consist of 
minor bug fixes, and so should not break the trace point code.

2. "Glibc is a runtime library. No unnecessary work is done."
Ok, but even if glibc is a high-quality library, bugs and critical 
situations will always remain. Trace tools are needed precisely in these 
circumstances.

3. "Glibc is a high performance library. Trace tools would reduce 
performance."
False. Two versions of the libraries can be built: production libraries 
(free of trace points) and instrumented libraries.

4. "How will maintainers be found for these trace tools?"
Industrial users will be the natural maintainers of the tools they use.
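
As an illustration of point 3, here is a minimal sketch of how one source 
tree could yield both a production and an instrumented build. It is not the 
NPTL Trace Tool's actual mechanism: the PTT_TRACE flag, the TRACE_POINT macro 
and the demo_lock routine are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical build flag: define PTT_TRACE (for example with -DPTT_TRACE)
   to build the instrumented variant; leave it undefined for production.  */
#ifdef PTT_TRACE
static unsigned long trace_count;   /* number of trace points hit */
# define TRACE_POINT(event, obj)                                     \
  do {                                                               \
    ++trace_count;                                                   \
    fprintf (stderr, "trace: %s %p\n", (event), (void *) (obj));     \
  } while (0)
#else
/* Production build: the trace point expands to an empty statement and
   its arguments are not evaluated.  */
# define TRACE_POINT(event, obj) do { } while (0)
#endif

/* Stand-in for a library routine; this is not a real glibc function.
   Returns 0 when the lock is taken, -1 when it is already held.  */
static int
demo_lock (int *lock)
{
  assert (lock != NULL);
  TRACE_POINT ("lock_enter", lock);
  if (*lock != 0)
    return -1;
  *lock = 1;
  TRACE_POINT ("lock_exit", lock);
  return 0;
}
```

Compiling this file without the flag gives code with no trace calls in it, 
while compiling it with the flag defined gives the instrumented variant. The 
routine behaves the same in both builds; only the instrumented one writes 
trace lines to stderr.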

Adoption of Linux in industrial systems will be easier if Linux provides 
tools that answer the specific needs of these customers. Glibc, as part of 
Linux systems, should provide tools such as our NPTL Trace Tool to help 
industrial users develop multi-threaded applications.

Note that Intel provides a tool on Linux similar to our NPTL Trace Tool: the 
Intel Threading Tools. If Intel provides such a product, why shouldn't there 
be an equivalent open source tool? A brief comparison of the two tools 
follows below.

Kind Regards,

Guillaume Duranceau
Tony Reix


------------------------------------------------------------------------------
Features comparison between the NPTL Trace Tool and the Intel Threading Tools
------------------------------------------------------------------------------

ITT: Intel Threading Tools
PTT: Posix Thread Trace Toolkit (NPTL Trace Tool)

                                              - ITT -             - PTT -

No need to rebuild the program                  NO                  YES
Low impact on application runtime behavior      NO (intrusive)      YES
Search errors in source code                    YES                 NO
Trace calls to and exits from thread routines   NO                  YES
Handle large volume of traces                   NO                  YES
Name traced objects                             YES                 YES
Graphical interface                             YES                 YES (via Pajé)
Identify performance bottlenecks                YES                 YES
Measure scalability                             YES                 NO
Handle bad situations (crash...)                NO                  YES
Supported architectures                     ia32, ia64         ia32, ia64, ppc

