This is the mail archive of the libc-hacker@sources.redhat.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
| Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
|---|---|---|
| Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
To improve the start up time of the dynamic linker, I followed an idea
the BSD folks implemented (my view is i386 centric but everything can
be generalized):
Large applications, especially C++ ones, have a number of PC32
relocations against symbols. The lookup of these symbols is quite
expensive. I'm caching these lookups during the initial relocation of
an object (in dl-relocate) now and have decreased the relocation time
by 45 % in one large application (konqueror).
For comparison, here's the normal linker:
gee:/builds/glibc/cross:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
16634:
16634: runtime linker statistics:
16634: total startup time in dynamic loader: 699430072 clock cycles
16634: time needed for relocation: 694116497 clock cycles (99.2)
16634: number of relocations: 54130
16634: time needed to load objects: 4958742 clock cycles (.7)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.
And this is my version, note that number of relocations from cache is
a subset of the other relocations):
gee:/builds/glibc/main-gcc-2.95:[254]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /opt/kde2/bin/konqueror -/
22277:
22277: runtime linker statistics:
22277: total startup time in dynamic loader: 392157353 clock cycles
22277: time needed for relocation: 386720405 clock cycles (98.6)
22277: number of relocations: 54130
22277: number of relocations from cache: 33770
22277: time needed to load objects: 5044967 clock cycles (1.2)
konqueror: Unknown option '-/'.
konqueror: Use --help to get a list of available command line options.
For smaller applications the cache overhead is too expensive and the
application startup time is unfortunatly increased (here 7 %):
Normal:
gee:/builds/glibc/cross:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
22298:
22298: runtime linker statistics:
22298: total startup time in dynamic loader: 1280514 clock cycles
22298: time needed for relocation: 892054 clock cycles (69.6)
22298: number of relocations: 362
22298: time needed to load objects: 245332 clock cycles (19.1)
With cache:
gee:/builds/glibc/main-gcc-2.95:[0]$ LD_DEBUG=statistics elf/ld-linux.so.2 --library-path :math:linuxthreads:login:/usr/lib:/lib /bin/ls /
22299:
22299: runtime linker statistics:
22299: total startup time in dynamic loader: 1399818 clock cycles
22299: time needed for relocation: 960693 clock cycles (68.6)
22299: number of relocations: 362
22299: number of relocations from cache: 100
22299: time needed to load objects: 232376 clock cycles (16.6)
I'm appending a first version as proof of concept to get your
comments.
I do have the following open questions/things to do:
- Provide patches for platforms beside i386
- _dl_symcache and _dl_maxchain should be passed as parameters to
_dl_lookup_symbol/_dl_lookup_versioned_symbol instead of making them
global.
- The code needs reindentation.
- Do I need the version number in the cache? I don't think so (and it
works without) but I'm not sure whether I'm missing something.
- Any other ideas to improve this?
- Is this a post 2.2.4 project? I do think so.
How shall I continue with this?
Btw. I do get at the moment segvs in dlfcn/errmsg1 and elf/loadfail
that I need to investigate, any help/hints would be appreciated.
This work was influenced by Waldo's paper (discussed on libc-alpha)
and I got already some good advice from Andreas Schwab.
Andreas
2001-07-31 Andreas Jaeger <aj@suse.de>
* sysdeps/i386/dl-machine.h (elf_machine_rel): Pass r_info to
RESOLVE.
* sysdeps/generic/ldsodefs.h (Elf_r_info): New.
Adjust declarations of _dl_lookup_symbol and
_dl_lookup_versioned_symbol.
* elf/rtld.c (print_statistics): Print cache relocations.
* elf/dl-reloc.c (_dl_relocate_object): Set up cache.
* elf/dl-lookup.c: New variable _dl_num_cache_relocations.
(_dl_setup_hash): Set l_nchain.
(_dl_lookup_symbol): Change parameter to use r_info.
Use symbol cache.
(_dl_lookup_versioned_symbol_skip): Likewise.
============================================================
Index: elf/dl-lookup.c
--- elf/dl-lookup.c 2001/07/06 04:54:46 1.78
+++ elf/dl-lookup.c 2001/07/31 14:18:46
@@ -60,6 +60,7 @@ struct sym_val
/* Statistics function. */
unsigned long int _dl_num_relocations;
+unsigned long int _dl_num_cache_relocations;
/* During the program run we must not modify the global data of
loaded shared object simultanously in two threads. Therefore we
@@ -80,6 +81,8 @@ __libc_lock_define (extern, _dl_load_loc
#define VERSIONED 1
#include "do-lookup.h"
+extern struct symcache *_dl_symcache;
+extern int _dl_maxchain;
/* Add extra dependency on MAP to UNDEF_MAP. */
static int
@@ -200,18 +203,30 @@ lookup_t
internal_function
_dl_lookup_symbol (const char *undef_name, struct link_map *undef_map,
const ElfW(Sym) **ref, struct r_scope_elem *symbol_scope[],
- int reloc_type, int explicit)
+ Elf_r_info r_info, int explicit)
{
const char *reference_name = undef_map ? undef_map->l_name : NULL;
const unsigned long int hash = _dl_elf_hash (undef_name);
struct sym_val current_value = { NULL, NULL };
struct r_scope_elem **scope;
int protected;
+ int reloc_type = ELFW(R_TYPE) (r_info);
int noexec = elf_machine_lookup_noexec_p (reloc_type);
int noplt = elf_machine_lookup_noplt_p (reloc_type);
+ int symbol_num = ELFW(R_SYM) (r_info);
++_dl_num_relocations;
+ assert (symbol_num < _dl_maxchain);
+ if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+ {
+ current_value.s = _dl_symcache[symbol_num].symbol;
+ current_value.m = _dl_symcache[symbol_num].object;
+ ++_dl_num_cache_relocations;
+ }
+ else
+ {
+
/* Search the relevant loaded objects for a definition. */
for (scope = symbol_scope; *scope; ++scope)
if (do_lookup (undef_name, hash, *ref, ¤t_value, *scope, 0, NULL,
@@ -233,7 +248,7 @@ _dl_lookup_symbol (const char *undef_nam
/* Something went wrong. Perhaps the object we tried to reference
was just removed. Try finding another definition. */
return _dl_lookup_symbol (undef_name, undef_map, ref, symbol_scope,
- reloc_type, 0);
+ r_info, 0);
break;
}
@@ -250,7 +265,8 @@ _dl_lookup_symbol (const char *undef_nam
*ref = NULL;
return 0;
}
-
+ }
+
protected = *ref && ELFW(ST_VISIBILITY) ((*ref)->st_other) == STV_PROTECTED;
if (__builtin_expect (_dl_debug_mask & DL_DEBUG_BINDINGS, 0))
@@ -261,6 +277,13 @@ _dl_lookup_symbol (const char *undef_nam
? current_value.m->l_name : _dl_argv[0],
protected ? "protected" : "normal", undef_name);
+
+ if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+ {
+ _dl_symcache[symbol_num].symbol = current_value.s;
+ _dl_symcache[symbol_num].object = current_value.m;
+ }
+
if (__builtin_expect (protected == 0, 1))
{
*ref = current_value.s;
@@ -378,18 +401,30 @@ _dl_lookup_versioned_symbol (const char
struct link_map *undef_map, const ElfW(Sym) **ref,
struct r_scope_elem *symbol_scope[],
const struct r_found_version *version,
- int reloc_type, int explicit)
+ Elf_r_info r_info, int explicit)
{
const char *reference_name = undef_map ? undef_map->l_name : NULL;
const unsigned long int hash = _dl_elf_hash (undef_name);
struct sym_val current_value = { NULL, NULL };
struct r_scope_elem **scope;
int protected;
+ int reloc_type = ELFW(R_TYPE) (r_info);
int noexec = elf_machine_lookup_noexec_p (reloc_type);
int noplt = elf_machine_lookup_noplt_p (reloc_type);
+ int symbol_num = ELFW(R_SYM) (r_info);
++_dl_num_relocations;
+ assert (symbol_num < _dl_maxchain);
+ if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol != NULL)
+ {
+ current_value.s = _dl_symcache[symbol_num].symbol;
+ current_value.m = _dl_symcache[symbol_num].object;
+ ++_dl_num_cache_relocations;
+ }
+ else
+
+ {
/* Search the relevant loaded objects for a definition. */
for (scope = symbol_scope; *scope; ++scope)
{
@@ -414,7 +449,7 @@ _dl_lookup_versioned_symbol (const char
was just removed. Try finding another definition. */
return _dl_lookup_versioned_symbol (undef_name, undef_map, ref,
symbol_scope, version,
- reloc_type, 0);
+ r_info, 0);
break;
}
@@ -438,6 +473,8 @@ _dl_lookup_versioned_symbol (const char
return 0;
}
}
+ }
+
if (__builtin_expect (current_value.s == NULL, 0))
{
@@ -464,6 +501,12 @@ _dl_lookup_versioned_symbol (const char
protected ? "protected" : "normal",
undef_name, version->name);
+ if (_dl_symcache != NULL && _dl_symcache[symbol_num].symbol == NULL)
+ {
+ _dl_symcache[symbol_num].symbol = current_value.s;
+ _dl_symcache[symbol_num].object = current_value.m;
+ }
+
if (__builtin_expect (protected == 0, 1))
{
*ref = current_value.s;
@@ -591,14 +634,13 @@ internal_function
_dl_setup_hash (struct link_map *map)
{
Elf_Symndx *hash;
- Elf_Symndx nchain;
if (!map->l_info[DT_HASH])
return;
hash = (void *)(map->l_addr + map->l_info[DT_HASH]->d_un.d_ptr);
map->l_nbuckets = *hash++;
- nchain = *hash++;
+ map->l_nchain = *hash++;
map->l_buckets = hash;
hash += map->l_nbuckets;
map->l_chain = hash;
@@ -614,7 +656,7 @@ _dl_do_lookup (const char *undef_name, u
struct link_map *skip, int noexec, int noplt)
{
return do_lookup (undef_name, hash, ref, result, scope, i, skip, noexec,
- noplt);
+ noplt);
}
static int
============================================================
Index: elf/dl-reloc.c
--- elf/dl-reloc.c 2001/07/06 04:54:46 1.54
+++ elf/dl-reloc.c 2001/07/31 14:18:46
@@ -27,6 +27,9 @@
#include "dynamic-link.h"
+struct symcache *_dl_symcache;
+int _dl_maxchain;
+
void
_dl_relocate_object (struct link_map *l, struct r_scope_elem *scope[],
int lazy, int consider_profiling)
@@ -64,12 +67,15 @@ cannot make segment writable for relocat
}
}
+ _dl_symcache = alloca ((l->l_nchain+1) * sizeof (struct symcache));
+ memset (_dl_symcache, 0, (l->l_nchain+1) * sizeof (struct symcache));
+ _dl_maxchain = l->l_nchain;
{
/* Do the actual relocation of the object's GOT and other data. */
/* String table object symbols. */
const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-
+
/* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code. */
#define RESOLVE_MAP(ref, version, flags) \
(ELFW(ST_BIND) ((*ref)->st_info) != STB_LOCAL \
@@ -91,6 +97,8 @@ cannot make segment writable for relocat
#include "dynamic-link.h"
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
+ _dl_symcache = NULL;
+
if (__builtin_expect (_dl_profile != NULL, 0))
{
/* Allocate the array which will contain the already found
============================================================
Index: elf/rtld.c
--- elf/rtld.c 2001/07/26 00:25:54 1.200
+++ elf/rtld.c 2001/07/31 14:18:46
@@ -127,6 +127,7 @@ static hp_timing_t relocate_time;
static hp_timing_t load_time;
#endif
extern unsigned long int _dl_num_relocations; /* in dl-lookup.c */
+extern unsigned long int _dl_num_cache_relocations; /* in dl-lookup.c */
static ElfW(Addr) _dl_start_final (void *arg, struct link_map *bootstrap_map_p,
hp_timing_t start_time);
@@ -1522,6 +1523,8 @@ print_statistics (void)
#endif
_dl_debug_printf (" number of relocations: %lu\n",
_dl_num_relocations);
+ _dl_debug_printf (" number of relocations from cache: %lu\n",
+ _dl_num_cache_relocations);
#ifndef HP_TIMING_NONAVAIL
/* Time spend while loading the object and the dependencies. */
============================================================
Index: include/link.h
--- include/link.h 2001/07/26 00:24:13 1.13
+++ include/link.h 2001/07/31 14:18:47
@@ -1,6 +1,6 @@
/* Data structure for communication from the run-time dynamic linker for
loaded ELF shared objects.
- Copyright (C) 1995-1999, 2000 Free Software Foundation, Inc.
+ Copyright (C) 1995-1999, 2000, 2001 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -158,7 +158,7 @@ struct link_map
struct link_map *l_loader;
/* Symbol hash table. */
- Elf_Symndx l_nbuckets;
+ Elf_Symndx l_nbuckets, l_nchain;
const Elf_Symndx *l_buckets, *l_chain;
unsigned int l_opencount; /* Reference count for dlopen/dlclose. */
============================================================
Index: sysdeps/generic/ldsodefs.h
--- sysdeps/generic/ldsodefs.h 2001/07/06 04:55:49 1.27
+++ sysdeps/generic/ldsodefs.h 2001/07/31 14:18:47
@@ -38,6 +38,14 @@ __BEGIN_DECLS
`ElfW(TYPE)' is used in place of `Elf32_TYPE' or `Elf64_TYPE'. */
#define ELFW(type) _ElfW (ELF, __ELF_NATIVE_CLASS, type)
+#if __ELF_NATIVE_CLASS == 32
+# define Elf_r_info Elf32_Word
+#elif __ELF_NATIVE_CLASS == 32
+# define Elf_r_info Elf64_Xword
+#else
+# error "__ELF_NATIVE_CLASS unknown."
+#endif
+
/* All references to the value of l_info[DT_PLTGOT],
l_info[DT_STRTAB], l_info[DT_SYMTAB], l_info[DT_RELA],
l_info[DT_REL], l_info[DT_JMPREL], and l_info[VERSYMIDX (DT_VERSYM)]
@@ -136,6 +144,14 @@ struct libname_list
};
+/* Data structure to cache symbol lookups. */
+struct symcache
+ {
+ const ElfW (Sym) *symbol; /* Symbol table entry. */
+ struct link_map *object; /* Object defining symbol. */
+ };
+
+
/* Test whether given NAME matches any of the names of the given object. */
static __inline int
__attribute__ ((unused))
@@ -332,7 +348,7 @@ extern lookup_t _dl_lookup_symbol (const
struct link_map *undef_map,
const ElfW(Sym) **sym,
struct r_scope_elem *symbol_scope[],
- int reloc_type, int explicit)
+ Elf_r_info r_info, int explicit)
internal_function;
/* Lookup versioned symbol. */
@@ -341,7 +357,7 @@ extern lookup_t _dl_lookup_versioned_sym
const ElfW(Sym) **sym,
struct r_scope_elem *symbol_scope[],
const struct r_found_version *version,
- int reloc_type, int explicit)
+ Elf_r_info r_info, int explicit)
internal_function;
/* For handling RTLD_NEXT we must be able to skip shared objects. */
============================================================
Index: sysdeps/i386/dl-machine.h
--- sysdeps/i386/dl-machine.h 2001/07/06 04:55:52 1.84
+++ sysdeps/i386/dl-machine.h 2001/07/31 14:18:47
@@ -322,7 +322,7 @@ elf_machine_rel (struct link_map *map, c
#ifndef RTLD_BOOTSTRAP
const Elf32_Sym *const refsym = sym;
#endif
- Elf32_Addr value = RESOLVE (&sym, version, ELF32_R_TYPE (reloc->r_info));
+ Elf32_Addr value = RESOLVE (&sym, version, reloc->r_info);
if (sym)
value += sym->st_value;
--
Andreas Jaeger
SuSE Labs aj@suse.de
private aj@arthur.inka.de
http://www.suse.de/~aj
| Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
|---|---|---|
| Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |