Linkers & Loaders

Sai Rahul Reddy P, M.Tech 2nd year, SIT, IIT Kharagpur

Basics

Figure: compilation pipeline
  • foo.c, bar.c: run preprocessor (cpp) and compiler proper (cc1) with gcc -S <filename>, producing foo.s and bar.s
  • foo.s, bar.s: run assembler (as) with as <filename>, producing foo.o and bar.o
  • foo.o, bar.o: run linker with ld <object files>, producing a.out
  • Compiler in Action…
  • gcc foo.c bar.c -o a.out
  • a.out = fully linked executable

What is a Linker?
  • Combines multiple relocatable object files
  • Produces fully linked executable – directly loadable in memory
  • How?
  • Symbol resolution – associating one symbol definition with each symbol reference
  • Relocation – relocating different sections of input relocatable files
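The relocation step comes down to a little address arithmetic. The sketch below is a minimal illustration, not how ld is implemented: the helper names `reloc_abs32` and `reloc_pc32` are hypothetical, and the formulas are the i386 ELF definitions of R_386_32 (S + A) and R_386_PC32 (S + A - P), where S is the final address of the symbol, A is the addend already stored in the instruction, and P is the address of the field being patched.

```c
#include <stdint.h>

/* R_386_32: absolute 32-bit relocation, patched value = S + A. */
uint32_t reloc_abs32(uint32_t sym_addr, int32_t addend)
{
    return sym_addr + (uint32_t)addend;
}

/* R_386_PC32: PC-relative 32-bit relocation, patched value = S + A - P.
 * Unsigned arithmetic wraps modulo 2^32, which is exactly what a
 * 32-bit displacement field needs for backward references. */
uint32_t reloc_pc32(uint32_t sym_addr, int32_t addend, uint32_t place)
{
    return sym_addr + (uint32_t)addend - place;
}
```

With the numbers from the file1.c/file2.c example in these slides: main is placed at 0x80483a0 and the relocation for bar sits at offset 0x1d, so P = 0x80483bd; the addend is -4 (the bytes fc ff ff ff) and bar ends up at 0x80483e0. `reloc_pc32(0x80483e0, -4, 0x80483bd)` yields 0x1f, which matches the linked instruction `e8 1f 00 00 00`.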

Linkers vs. Loaders

Linkers and loaders perform various related but conceptually different tasks:
  • Program loading: copying a program image from hard disk into main memory so that the program is ready to run. In some cases, program loading also involves allocating storage space or mapping virtual addresses to disk pages.
  • Relocation: compilers and assemblers generate the object code for each input module with a starting address of zero. Relocation merges all sections of the same type into one section and assigns load addresses to the different parts of the program. Addresses embedded in the code and data sections are then adjusted so that they point to the correct runtime addresses.
  • Symbol resolution: a program is made up of multiple subprograms, and one subprogram refers to another through symbols. The linker resolves each reference by noting the symbol's location and patching the caller's object code.

Symbol Resolution

Each input module can contain three kinds of symbols:
  • Global symbols defined by the module and referenced by other modules. All non-static functions and global variables fall in this category.
  • Global symbols referenced by the input module but defined elsewhere. All functions and variables with extern declaration fall in this category.
  • Local symbols defined and referenced exclusively by the input module. All static functions and static variables fall here.
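A small C module makes the three categories concrete. This is an illustrative sketch only; the names `counter`, `step`, `bump`, and `next_id` are hypothetical and do not come from the slides, and `strlen` stands in for any symbol that is defined in another module (here, the C library).

```c
#include <string.h>   /* declares strlen; its definition lives elsewhere */

/* Global symbol defined by this module and usable from other modules. */
int counter = 0;

/* Local symbols: static, so only this file can refer to them. */
static int step = 1;

static int bump(void)
{
    counter += step;
    return counter;
}

/* Global function defined here. It references strlen, a global symbol
 * that this module uses but does not define. */
int next_id(const char *tag)
{
    return bump() + (int)strlen(tag);
}
```

When a file like this is compiled without optimization to an object file, a symbol-table dump would typically show `counter` and `next_id` as global definitions, `step` and `bump` as local symbols, and `strlen` as undefined.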
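The linker resolves symbols using the following rules. Functions and initialized global variables are strong symbols; uninitialized global variables are weak symbols.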
  • Multiple strong symbols are not allowed.
  • Given a single strong symbol and multiple weak symbols, choose the strong symbol.
  • Given multiple weak symbols, choose any of the weak symbols.
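The three rules can be sketched as one small function. This is a simplified model rather than the algorithm of any particular linker: `struct def` and `resolve` are hypothetical names introduced here, and "choose any of the weak symbols" is modeled as choosing the first one encountered.

```c
#include <stddef.h>

enum strength { WEAK, STRONG };

/* One definition of a given symbol name, as found in some input module. */
struct def {
    const char   *module;    /* module that provides the definition */
    enum strength strength;
};

/*
 * Choose among n candidate definitions of the same symbol name.
 * Returns the index of the chosen definition, -1 if there is no
 * definition at all (unresolved reference), or -2 if more than one
 * strong definition exists (rule 1).
 */
int resolve(const struct def *defs, size_t n)
{
    int chosen = -1;
    int strong_seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (defs[i].strength == STRONG) {
            if (strong_seen)
                return -2;       /* rule 1: multiple strong symbols */
            strong_seen = 1;
            chosen = (int)i;     /* rule 2: strong overrides any weak */
        } else if (chosen == -1) {
            chosen = (int)i;     /* rule 3: keep the first weak symbol */
        }
    }
    return chosen;
}
```

For candidates {weak, strong, weak} the function returns index 1; for two strong candidates it returns -2, which a real linker would report as a multiple-definition error.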

Linking

Object files

file1.c:
    #include <stdio.h>
    char a[] = "Hello";
    extern void bar();
    int main()
    {
        bar();
    }
    void baz(char *s)
    {
        printf("%s", s);
    }

Sections:
    Idx Name      Size      VMA       LMA       File off  Algn
      0 .text     0000003e  00000000  00000000  00000034  2**2
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data     00000006  00000000  00000000  00000074  2**2
                  CONTENTS, ALLOC, LOAD, DATA
      2 .bss      00000000  00000000  00000000  0000007c  2**2
                  ALLOC
      3 .rodata   00000003  00000000  00000000  0000007c  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

SYMBOL TABLE:
    00000000 l    df *ABS*            00000000 file1.c
    00000000 l    d  .text            00000000
    00000000 l    d  .data            00000000
    00000000 l    d  .bss             00000000
    00000000 l    d  .rodata          00000000
    00000000 l    d  .note.GNU-stack  00000000
    00000000 l    d  .comment         00000000
    00000000 g     O .data            00000006 a
    00000000 g     F .text            00000023 main
    00000000         *UND*            00000000 bar
    00000023 g     F .text            0000001b baz
    00000000         *UND*            00000000 printf

RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE        VALUE
    0000001d R_386_PC32  bar
    00000030 R_386_32    .rodata
    00000035 R_386_PC32  printf

file2.c:
    #include <stdio.h>
    extern char a[];
    static char b[6];
    void bar()
    {
        strcpy(b, a);
        baz(b);
    }

Sections:
    Idx Name             Size      VMA       LMA       File off  Algn
      0 .text            0000002d  00000000  00000000  00000034  2**2
                         CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data            00000000  00000000  00000000  00000064  2**2
                         CONTENTS, ALLOC, LOAD, DATA
      2 .bss             00000006  00000000  00000000  00000064  2**2
                         ALLOC
      3 .note.GNU-stack  00000000  00000000  00000000  00000064  2**0
                         CONTENTS, READONLY
      4 .comment         00000031  00000000  00000000  00000064  2**0
                         CONTENTS, READONLY

SYMBOL TABLE:
    00000000 l    df *ABS*            00000000 file2.c
    00000000 l    d  .text            00000000
    00000000 l    d  .data            00000000
    00000000 l    d  .bss             00000000
    00000000 l     O .bss             00000006 b
    00000000 l    d  .note.GNU-stack  00000000
    00000000 l    d  .comment         00000000
    00000000 g     F .text            0000002d bar
    00000000         *UND*            00000000 a
    00000000         *UND*            00000000 strcpy
    00000000         *UND*            00000000 baz

RELOCATION RECORDS FOR [.text]:
    OFFSET   TYPE        VALUE
    0000000a R_386_32    a
    0000000f R_386_32    .bss
    00000014 R_386_PC32  strcpy
    0000001f R_386_32    .bss
    00000024 R_386_PC32  baz

Before and After Linking

Before linking (file1.o):
    00000000 <main>:
       0: 55                 push   %ebp
       1: 89 e5              mov    %esp,%ebp
       3: 83 ec 08           sub    $0x8,%esp
       6: 83 e4 f0           and    $0xfffffff0,%esp
       9: b8 00 00 00 00     mov    $0x0,%eax
       e: 83 c0 0f           add    $0xf,%eax
      11: 83 c0 0f           add    $0xf,%eax
      14: c1 e8 04           shr    $0x4,%eax
      17: c1 e0 04           shl    $0x4,%eax
      1a: 29 c4              sub    %eax,%esp
      1c: e8 fc ff ff ff     call   1d <main+0x1d>
      21: c9                 leave
      22: c3                 ret
    00000023 <baz>:
      23: 55                 push   %ebp
      24: 89 e5              mov    %esp,%ebp
      26: 83 ec 08           sub    $0x8,%esp
      29: 83 ec 08           sub    $0x8,%esp
      2c: ff 75 08           pushl  0x8(%ebp)
      2f: 68 00 00 00 00     push   $0x0
      34: e8 fc ff ff ff     call   35 <baz+0x12>
      39: 83 c4 10           add    $0x10,%esp
      3c: c9                 leave
      3d: c3                 ret

After linking (a.out):
    080483a0 <main>:
     80483a0: 55                 push   %ebp
     80483a1: 89 e5              mov    %esp,%ebp
     80483a3: 83 ec 08           sub    $0x8,%esp
     80483a6: 83 e4 f0           and    $0xfffffff0,%esp
     80483a9: b8 00 00 00 00     mov    $0x0,%eax
     80483ae: 83 c0 0f           add    $0xf,%eax
     80483b1: 83 c0 0f           add    $0xf,%eax
     80483b4: c1 e8 04           shr    $0x4,%eax
     80483b7: c1 e0 04           shl    $0x4,%eax
     80483ba: 29 c4              sub    %eax,%esp
     80483bc: e8 1f 00 00 00     call   80483e0 <bar>
     80483c1: c9                 leave
     80483c2: c3                 ret
    080483c3 <baz>:
     80483c3: 55                 push   %ebp
     80483c4: 89 e5              mov    %esp,%ebp
     80483c6: 83 ec 08           sub    $0x8,%esp
     80483c9: 83 ec 08           sub    $0x8,%esp
     80483cc: ff 75 08           pushl  0x8(%ebp)
     80483cf: 68 f0 84 04 08     push   $0x80484f0
     80483d4: e8 ff fe ff ff     call   80482d8 <printf@plt>
     80483d9: 83 c4 10           add    $0x10,%esp
     80483dc: c9                 leave
     80483dd: c3                 ret
     80483de: 90                 nop
     80483df: 90                 nop

Linking: Example

The linked code for main and baz (file1.c) is the "after linking" listing above. The linked code for bar (file2.c) is:
    080483e0 <bar>:
     80483e0: 55                 push   %ebp
     80483e1: 89 e5              mov    %esp,%ebp
     80483e3: 83 ec 08           sub    $0x8,%esp
     80483e6: 83 ec 08           sub    $0x8,%esp
     80483e9: 68 fc 95 04 08     push   $0x80495fc
     80483ee: 68 08 96 04 08     push   $0x8049608
     80483f3: e8 f0 fe ff ff     call   80482e8 <strcpy@plt>
     80483f8: 83 c4 10           add    $0x10,%esp
     80483fb: 83 ec 0c           sub    $0xc,%esp
     80483fe: 68 08 96 04 08     push   $0x8049608
     8048403: e8 bb ff ff ff     call   80483c3 <baz>
     8048408: 83 c4 10           add    $0x10,%esp
     804840b: c9                 leave
     804840c: c3                 ret
     804840d: 90                 nop
     804840e: 90                 nop
     804840f: 90                 nop

Object File Formats
  • .com: contains no relocation information; the program always starts at the fixed location 0x100
  • .exe format
    char signature[2] = "MZ";  // magic number
    short lastsize;            // # bytes used in last block
    short nblocks;             // number of 512 byte blocks
    short nreloc;              // number of relocation entries
    short hdrsize;             // size of file header in 16 byte paragraphs
    short minalloc;            // minimum extra memory to allocate
    short maxalloc;            // maximum extra memory to allocate
    void far *sp;              // initial stack pointer
    short checksum;            // ones complement of file sum
    void far *ip;              // initial instruction pointer
    short relocpos;            // location of relocation fixup table
    short noverlay;            // overlay number, 0 for program
    char extra[];              // extra material for overlays, etc.
    void far *relocs[];        // relocation entries, starts at relocpos

ELF Object Format

ELF Header
    char magic[4] = "\177ELF"; // magic number
    char class;                // address size, 1 = 32 bit, 2 = 64 bit
    char byteorder;            // 1 = little-endian, 2 = big-endian
    char hversion;             // header version, always 1
    char pad[9];
    short filetype;            // 1 = relocatable, 2 = executable, 3 = shared object, 4 = core image
    short archtype;            // 2 = SPARC, 3 = x86, 4 = 68K, etc.
    int fversion;              // file version, always 1
    int entry;                 // entry point if executable
    int phdrpos;               // file position of program header or 0
    int shdrpos;               // file position of section header or 0
    int flags;                 // architecture specific flags, usually 0
    short hdrsize;             // size of this ELF header
    short phdrent;             // size of an entry in program header
    short phdrcnt;             // number of entries in program header or 0
    short shdrent;             // size of an entry in section header
    short shdrcnt;             // number of entries in section header or 0
    short strsec;              // section number that contains section name strings

Example C file considered in the following discussion:
    #include <stdio.h>
    #include <math.h>
    int abc;
    int main()
    {
        int a = 25;
        int b;
        printf("Hello World: %f\n", sqrt(a));
    }

[Figure: sample ELF header]

Section Header
    int sh_name;     // name, index into the string table
    int sh_type;     // section type
    int sh_flags;    // flag bits
    int sh_addr;     // base memory address, if loadable, or zero
    int sh_offset;   // file position of beginning of section
    int sh_size;     // size in bytes
    int sh_link;     // section number with related info or zero
    int sh_info;     // more section-specific info
    int sh_align;    // alignment granularity if section is moved
    int sh_entsize;  // size of entries if section is an array

Symbol Table
    int name;        // position of name string in string table
    int value;       // symbol value, section relative in reloc, absolute in executable
    int size;        // object or function size
    char type:4;     // data object, function, section, or special case file
    char bind:4;     // local, global, or weak
    char other;      // spare
    short sect;      // section number, ABS, COMMON or UNDEF

[Figure: symbol table]

[Figure: ELF program header]

Static libraries
  • Problems
  • Any change to a library requires re-linking the programs that use it.
  • Copying library contents into the target program wastes disk space and memory, especially for commonly used libraries such as the C library.
  • With a large number of active programs, a considerable amount of memory goes to storing these copies of library functions.
  • Statically and dynamically linked executables differ in file size.

Shared libraries
  • The primary difference between static and shared libraries is that shared libraries delay the actual task of linking until runtime, where it is performed by a special dynamic linker-loader.
  • Rather than copying the contents of the libraries into the target executable, the linker simply records the names of the libraries in a list stored in the executable.
  • Example, for a program named file that prints Hello:
    [sairahul@c1 test]$ ldd file
        libc.so.6 => /lib/tls/libc.so.6 (0x001ae000)
        /lib/ld-linux.so.2 (0x00191000)
  • The dynamic linker searches libraries in the same order as they were specified on the link line and uses the first definition of the symbol encountered.
  • Duplicate symbols normally don’t occur.
  • The linking process happens at each program invocation. To minimize the performance overhead, shared libraries use both indirection tables and lazy symbol binding.
  • That is, references to external symbols actually point to table entries, which remain unbound until the application actually needs them.
  • To implement lazy symbol binding, the static linker creates a jump table known as the procedure-linking table and includes it as part of the final executable.
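Lazy binding through an indirection table can be imitated in plain C with a function pointer. The sketch below is only an analogy for what the procedure-linking table does: real table entries are small machine-code stubs emitted by the static linker and patched by the dynamic linker. All names here (`slot`, `resolver_stub`, `real_square`, `call_square`, `resolve_count`) are hypothetical.

```c
/* Counts how often the stand-in "dynamic linker" had to bind the symbol. */
int resolve_count = 0;

/* Stand-in for a routine that lives inside a shared library. */
static int real_square(int x)
{
    return x * x;
}

static int resolver_stub(int x);

/* Table entry for the external symbol. It initially points at the
 * resolver, meaning the symbol is still unbound. */
static int (*slot)(int) = resolver_stub;

/* The first call lands here: bind the table entry to the real function,
 * then forward the call. Later calls never reach this stub again. */
static int resolver_stub(int x)
{
    resolve_count++;
    slot = real_square;
    return slot(x);
}

/* What a call site in the application does: always go through the table. */
int call_square(int x)
{
    return slot(x);
}
```

Calling `call_square` twice resolves the symbol only once: the first call pays for the binding, and the second goes straight to `real_square` through the updated table entry.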

Shared library Linking

[Figure: the internal structure of an executable linked with shared libraries. External library calls point to procedure-linking table entries, which remain unresolved until the dynamic linker fills them in at runtime.]

[Figure: the dynamic binding of library symbols in shared libraries. malloc is bound to the C library, and printf has not yet been used and is bound to the dynamic linker.]

Library Loading
  • Library names are never encoded with absolute pathnames.
  • To locate the libraries, the dynamic linker uses a configurable library search path. This path’s default value is normally stored in a system configuration file such as /etc/ld.so.conf or specified by the user in the LD_LIBRARY_PATH environment variable.
  • Because directory traversal is relatively slow, the loader does not look at the directories in /etc/ld.so.conf every time it runs to find library files, but consults a cache file instead. Normally named /etc/ld.so.cache, this cache file is a table that matches library names to full pathnames.
  • A better solution to the library path problem is to embed customized search paths in the executable itself using special linker options such as -R or -Wl,-rpath. For example:
    $ cc $SRCS -Wl,-rpath=/home/beazley/libs -L/home/beazley/libs -lfoo

Library loading

We can obtain detailed information about how the dynamic linker loads libraries by setting the LD_DEBUG environment variable to libs.

    [sairahul@c1 test]$ LD_DEBUG=libs ./file
    9024: find library=libc.so.6 [0]; searching
    9024:  search cache=/etc/ld.so.cache
    9024:   trying file=/lib/tls/libc.so.6
    9024:
    9024:
    9024: calling init: /lib/tls/libc.so.6
    9024:
    9024:
    9024: initialize program: ./file
    9024:
    9024:
    9024: transferring control: ./file
    9024:
    9024:
    9024: calling fini: /lib/tls/libc.so.6 [0]
    9024:

    [sairahul@c1 test]$ strace ./file
    execve("./file", ["./file"], [/* 32 vars */]) = 0
    uname({sys="Linux", node="c1.grid-iitkgp.com", ...}) = 0
    brk(0) = 0x9a6b000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY) = 3
    fstat64(3, {st_mode=S_IFREG|0644, st_size=183440, ...}) = 0
    old_mmap(NULL, 183440, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7fd3000
    close(3) = 0
    open("/lib/tls/libc.so.6", O_RDONLY) = 3
    read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0 /\34\000"..., 512) = 512
    fstat64(3, {st_mode=S_IFREG|0755, st_size=1512400, ...}) = 0
    old_mmap(0x1ae000, 1207532, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x1ae000
    old_mmap(0x2cf000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x120000) = 0x2cf000
    old_mmap(0x2d3000, 7404, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x2d3000
    close(3) = 0
    old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fd2000
    mprotect(0x2cf000, 8192, PROT_READ) = 0
    mprotect(0x1a6000, 4096, PROT_READ) = 0
    set_thread_area({entry_number:-1 -> 6, base_addr:0xb7fd2940, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
    munmap(0xb7fd3000, 183440) = 0
    fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 4), ...}) = 0
    mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fff000
    write(1, "Hello", 5Hello) = 5
    munmap(0xb7fff000, 4096) = 0
    exit_group(5) = ?

References
  • Article on Linkers & Loaders, Linux Journal (http://www.linuxjournal.com/article/6463)
  • Levine, J., "Linkers and Loaders," Morgan Kaufmann, October 1999, ISBN 1-55860-496-0.
  • Linkers and Libraries Guide from Sun (http://docs.sun.com/app/docs?p=/doc/816-1386)
  • Lecture slides from Columbia University (http://www1.cs.columbia.edu/~sedwards/classes/2003/w4115f/)
  • Beazley et al., "The inside story on shared libraries and dynamic loading," Computing in Science & Engineering, Sep/Oct 2001.
  • Presser, L. and White, J.R., "Linkers and Loaders," ACM Computing Surveys, Sep 1972.

Thank you

    [sairahul@c1 test]$ LD_DEBUG=bindings ./file
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_res' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_IO_file_close' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__morecore' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__daylight' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__malloc_hook' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `h_nerr' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__malloc_initialize_hook' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/ld-linux.so.2: normal symbol `_r_debug' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `stdout' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__rcmd_errstr' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_nl_domain_bindings' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `re_syntax_options' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `argp_program_bug_address' [GLIBC_2.1]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__tzname' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_IO_stdout_' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_IO_funlockfile' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__realloc_hook' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_IO_stderr_' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `malloc' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_nl_msg_cat_cntr' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `optarg' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `loc2' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `h_errlist' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `opterr' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `error_message_count' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `_environ' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `getdate_err' [GLIBC_2.1]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__environ' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `obstack_exit_failure' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/ld-linux.so.2: normal symbol `_rtld_global' [GLIBC_PRIVATE]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `error_print_progname' [GLIBC_2.0]
    9049: binding file /lib/tls/libc.so.6 to /lib/tls/libc.so.6: normal symbol `__after_morecore_hook' [GLIBC_2.0]
    9049: