This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
RE: String Functions for x86-64 (memset)
- From: "Menezes, Evandro" <evandro dot menezes at amd dot com>
- To: "Menezes, Evandro" <evandro dot menezes at amd dot com>, "Ulrich Drepper" <drepper at redhat dot com>, libc-alpha at sourceware dot org
- Cc: "Meissner, Michael" <michael dot meissner at amd dot com>
- Date: Thu, 11 May 2006 18:25:18 -0500
- Subject: RE: String Functions for x86-64 (memset)
On the left are the results for the proposed memset; on the right, those for the current one. I filtered out duplicate lines with sort.
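For illustration only, the duplicate-filtering step could be sketched in Python as below. This is not the command that was actually run for this message. The helper name `unique_rows` and the regular expression are assumptions, and the row format is taken from the tables in this message. The sketch drops exact repeats and orders the remaining rows numerically by length and alignment.

```python
import re

# Assumed row prefix, as it appears in the tables of this message.
ROW = re.compile(r"Length (\d+), alignment (\d+), c (-?\d+):")


def unique_rows(lines):
    """Drop exact duplicate rows and order the rest by (length, alignment)."""
    seen = set()
    rows = []
    for line in lines:
        line = line.strip()
        match = ROW.match(line)
        if match is None or line in seen:
            continue  # skip headers, blank lines, and exact duplicates
        seen.add(line)
        rows.append((int(match.group(1)), int(match.group(2)), line))
    # Numeric ordering, so that "Length 2" comes before "Length 10".
    rows.sort(key=lambda r: (r[0], r[1]))
    return [line for _, _, line in rows]
```

Calling `unique_rows(open(path))` on a saved test log (with `path` being a hypothetical file name) would return the de-duplicated rows as a list of strings.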
First, on an Athlon 64:
memset builtin_memset simpl | memset builtin_memset simpl
Length 1, alignment 0, c -65: 15 16 8 | Length 1, alignment 0, c -65: 9 9 8
Length 1, alignment 1, c -65: 16 18 7 | Length 1, alignment 1, c -65: 7 8 8
Length 2, alignment 0, c -65: 16 18 8 | Length 2, alignment 0, c -65: 9 12 8
Length 2, alignment 2, c -65: 16 18 8 | Length 2, alignment 2, c -65: 9 13 8
Length 3, alignment 0, c -65: 16 18 9 | Length 3, alignment 0, c -65: 15 14 9
Length 3, alignment 3, c -65: 16 18 9 | Length 3, alignment 3, c -65: 15 14 9
Length 4, alignment 0, c -65: 15 16 12 | Length 4, alignment 0, c -65: 15 17 12
Length 4, alignment 4, c -65: 15 16 12 | Length 4, alignment 4, c -65: 15 17 12
Length 5, alignment 0, c -65: 15 16 13 | Length 5, alignment 0, c -65: 18 20 13
Length 5, alignment 5, c -65: 15 16 13 | Length 5, alignment 5, c -65: 18 20 13
Length 6, alignment 0, c -65: 16 17 15 | Length 6, alignment 0, c -65: 21 23 15
Length 6, alignment 6, c -65: 15 17 15 | Length 6, alignment 6, c -65: 21 23 14
Length 7, alignment 0, c -65: 16 17 18 | Length 7, alignment 0, c -65: 24 26 18
Length 7, alignment 7, c -65: 16 17 18 | Length 7, alignment 7, c -65: 24 26 18
Length 8, alignment 0, c -65: 16 18 19 | Length 8, alignment 0, c -65: 12 14 19
Length 9, alignment 0, c -65: 16 18 37 | Length 9, alignment 0, c -65: 17 19 37
Length 9, alignment 1, c -65: 16 18 37 | Length 9, alignment 1, c -65: 38 40 37
Length 10, alignment 0, c -65: 16 18 42 | Length 10, alignment 0, c -65: 18 19 42
Length 10, alignment 2, c -65: 16 18 42 | Length 10, alignment 2, c -65: 41 43 42
Length 11, alignment 0, c -65: 16 18 44 | Length 11, alignment 0, c -65: 21 23 44
Length 11, alignment 3, c -65: 16 18 44 | Length 11, alignment 3, c -65: 44 46 44
Length 12, alignment 0, c -65: 15 16 46 | Length 12, alignment 0, c -65: 24 25 46
Length 12, alignment 4, c -65: 15 16 46 | Length 12, alignment 4, c -65: 26 28 46
Length 13, alignment 0, c -65: 15 16 48 | Length 13, alignment 0, c -65: 27 28 48
Length 13, alignment 5, c -65: 15 16 48 | Length 13, alignment 5, c -65: 27 29 48
Length 14, alignment 0, c -65: 16 17 50 | Length 14, alignment 0, c -65: 30 31 50
Length 14, alignment 1, c -65: 16 17 50 | Length 14, alignment 1, c -65: 53 55 50
Length 14, alignment 6, c -65: 16 17 50 | Length 14, alignment 6, c -65: 30 32 50
Length 15, alignment 0, c -65: 16 17 52 | Length 15, alignment 0, c -65: 33 34 52
Length 15, alignment 7, c -65: 16 17 52 | Length 15, alignment 7, c -65: 33 34 52
Length 16, alignment 0, c -65: 16 17 54 | Length 16, alignment 0, c -65: 15 17 54
Length 17, alignment 0, c -65: 15 17 56 | Length 17, alignment 0, c -65: 17 20 56
Length 17, alignment 1, c -65: 15 17 56 | Length 17, alignment 1, c -65: 39 41 56
Length 18, alignment 0, c -65: 15 17 58 | Length 18, alignment 0, c -65: 19 21 58
Length 18, alignment 2, c -65: 15 17 58 | Length 18, alignment 2, c -65: 42 44 58
Length 19, alignment 0, c -65: 15 17 60 | Length 19, alignment 0, c -65: 23 24 60
Length 19, alignment 3, c -65: 15 17 60 | Length 19, alignment 3, c -65: 45 47 60
Length 20, alignment 0, c -65: 14 15 62 | Length 20, alignment 0, c -65: 25 27 62
Length 20, alignment 4, c -65: 14 15 62 | Length 20, alignment 4, c -65: 29 30 62
Length 21, alignment 0, c -65: 14 15 64 | Length 21, alignment 0, c -65: 28 30 64
Length 21, alignment 5, c -65: 14 15 64 | Length 21, alignment 5, c -65: 29 31 64
Length 22, alignment 0, c -65: 15 16 66 | Length 22, alignment 0, c -65: 31 33 66
Length 22, alignment 6, c -65: 15 16 66 | Length 22, alignment 6, c -65: 32 34 66
Length 23, alignment 0, c -65: 17 18 68 | Length 23, alignment 0, c -65: 34 36 68
Length 23, alignment 7, c -65: 17 18 68 | Length 23, alignment 7, c -65: 35 36 68
Length 24, alignment 0, c -65: 16 17 70 | Length 24, alignment 0, c -65: 17 19 70
Length 25, alignment 0, c -65: 15 17 72 | Length 25, alignment 0, c -65: 20 21 72
Length 25, alignment 1, c -65: 15 17 72 | Length 25, alignment 1, c -65: 41 43 72
Length 25, alignment 2, c -65: 15 17 72 | Length 25, alignment 2, c -65: 41 43 72
Length 26, alignment 0, c -65: 15 17 74 | Length 26, alignment 0, c -65: 21 23 74
Length 26, alignment 2, c -65: 15 17 74 | Length 26, alignment 2, c -65: 44 46 74
Length 27, alignment 0, c -65: 16 17 76 | Length 27, alignment 0, c -65: 24 26 76
Length 27, alignment 3, c -65: 15 17 76 | Length 27, alignment 3, c -65: 47 49 76
Length 28, alignment 0, c -65: 15 16 78 | Length 28, alignment 0, c -65: 27 29 78
Length 28, alignment 4, c -65: 15 16 78 | Length 28, alignment 4, c -65: 30 32 78
Length 29, alignment 0, c -65: 17 16 80 | Length 29, alignment 0, c -65: 30 32 80
Length 29, alignment 5, c -65: 15 16 80 | Length 29, alignment 5, c -65: 31 33 80
Length 30, alignment 0, c -65: 17 17 82 | Length 30, alignment 0, c -65: 33 35 82
Length 30, alignment 6, c -65: 15 17 82 | Length 30, alignment 6, c -65: 34 36 82
Length 31, alignment 0, c -65: 18 19 84 | Length 31, alignment 0, c -65: 36 38 84
Length 31, alignment 7, c -65: 17 19 84 | Length 31, alignment 7, c -65: 37 38 84
Length 32, alignment 0, c -65: 15 17 86 | Length 32, alignment 0, c -65: 19 21 86
Length 64, alignment 0, c -65: 13 14 150 | Length 64, alignment 0, c -65: 17 18 150
Length 64, alignment 4, c -65: 20 21 150 | Length 64, alignment 4, c -65: 48 50 150
Length 128, alignment 0, c -65: 18 18 278 | Length 128, alignment 0, c -65: 22 23 278
Length 256, alignment 0, c -65: 30 29 534 | Length 256, alignment 0, c -65: 31 32 534
Length 512, alignment 0, c -65: 48 47 1046 | Length 512, alignment 0, c -65: 49 50 1046
Length 1024, alignment 0, c -65: 98 99 2070 | Length 1024, alignment 0, c -65: 100 101 2070
Length 1024, alignment 3, c -65: 124 125 2070 | Length 1024, alignment 3, c -65: 131 132 2070
Length 2048, alignment 0, c -65: 165 166 4118 | Length 2048, alignment 0, c -65: 172 173 4118
Length 4096, alignment 0, c -65: 293 294 8214 | Length 4096, alignment 0, c -65: 316 317 8214
Length 8192, alignment 0, c -65: 549 550 16406 | Length 8192, alignment 0, c -65: 604 605 16406
Length 16384, alignment 0, c -65: 1061 1062 32790 | Length 16384, alignment 0, c -65: 1180 1181 32790
Length 32768, alignment 0, c -65: 2085 2086 65558 | Length 32768, alignment 0, c -65: 2332 2333 65558
Length 65536, alignment 0, c -65: 4421 4421 13110 | Length 65536, alignment 0, c -65: 4853 4849 13111
Length 131072, alignment 0, c -65: 33405 33405 26219 | Length 131072, alignment 0, c -65: 68290 68329 26218
Now, on a P4:
memset builtin_memset simpl | memset builtin_memset simpl
Length 1, alignment 0, c -65: 0 8 0 | Length 1, alignment 0, c -65: 0 0 0
Length 1, alignment 1, c -65: 0 8 8 | Length 1, alignment 1, c -65: 0 0 32
Length 2, alignment 0, c -65: 8 0 0 | Length 2, alignment 0, c -65: 0 0 0
Length 2, alignment 2, c -65: 0 8 0 | Length 2, alignment 2, c -65: 0 0 0
Length 3, alignment 0, c -65: 0 8 8 | Length 3, alignment 0, c -65: 8 0 -8
Length 3, alignment 3, c -65: 0 8 0 | Length 3, alignment 3, c -65: 40 8 0
Length 4, alignment 0, c -65: 0 8 0 | Length 4, alignment 0, c -65: 0 0 0
Length 4, alignment 4, c -65: 8 8 0 | Length 4, alignment 4, c -65: 24 0 0
Length 5, alignment 0, c -65: 0 0 0 | Length 5, alignment 0, c -65: 8 16 0
Length 5, alignment 5, c -65: 8 8 0 | Length 5, alignment 5, c -65: 0 0 0
Length 6, alignment 0, c -65: 8 8 0 | Length 6, alignment 0, c -65: 0 0 0
Length 6, alignment 6, c -65: 8 8 0 | Length 6, alignment 6, c -65: 16 8 0
Length 7, alignment 0, c -65: 0 8 8 | Length 7, alignment 0, c -65: 0 0 8
Length 7, alignment 7, c -65: 8 0 0 | Length 7, alignment 7, c -65: 48 24 24
Length 8, alignment 0, c -65: 8 8 8 | Length 8, alignment 0, c -65: 0 0 0
Length 9, alignment 0, c -65: 0 0 0 | Length 9, alignment 0, c -65: 0 0 0
Length 9, alignment 1, c -65: 0 0 8 | Length 9, alignment 1, c -65: 24 16 0
Length 10, alignment 0, c -65: 8 0 8 | Length 10, alignment 0, c -65: 0 0 0
Length 10, alignment 2, c -65: 8 0 0 | Length 10, alignment 2, c -65: 24 16 0
Length 11, alignment 0, c -65: 0 8 0 | Length 11, alignment 0, c -65: 24 24 0
Length 11, alignment 3, c -65: 0 0 8 | Length 11, alignment 3, c -65: 32 24 0
Length 12, alignment 0, c -65: 8 8 8 | Length 12, alignment 0, c -65: 40 0 8
Length 12, alignment 4, c -65: 8 8 16 | Length 12, alignment 4, c -65: 0 0 0
Length 13, alignment 0, c -65: 0 8 16 | Length 13, alignment 0, c -65: 72 24 16
Length 13, alignment 5, c -65: 8 8 16 | Length 13, alignment 5, c -65: 0 32 16
Length 14, alignment 0, c -65: 8 0 24 | Length 14, alignment 0, c -65: 24 32 24
Length 14, alignment 1, c -65: 0 8 24 | Length 14, alignment 1, c -65: 40 32 24
Length 14, alignment 6, c -65: 0 8 32 | Length 14, alignment 6, c -65: 0 0 24
Length 15, alignment 0, c -65: 8 8 32 | Length 15, alignment 0, c -65: 0 8 24
Length 15, alignment 7, c -65: 8 0 32 | Length 15, alignment 7, c -65: 0 8 24
Length 16, alignment 0, c -65: 8 8 32 | Length 16, alignment 0, c -65: 56 0 24
Length 17, alignment 0, c -65: 0 0 40 | Length 17, alignment 0, c -65: 0 24 24
Length 17, alignment 1, c -65: 0 0 32 | Length 17, alignment 1, c -65: 24 24 32
Length 18, alignment 0, c -65: 8 0 40 | Length 18, alignment 0, c -65: 0 24 24
Length 18, alignment 2, c -65: 8 0 48 | Length 18, alignment 2, c -65: 24 24 24
Length 19, alignment 0, c -65: 8 0 144 | Length 19, alignment 0, c -65: 0 24 24
Length 19, alignment 3, c -65: 8 0 144 | Length 19, alignment 3, c -65: 24 24 32
Length 20, alignment 0, c -65: 8 0 80 | Length 20, alignment 0, c -65: 0 24 40
Length 20, alignment 4, c -65: 8 0 88 | Length 20, alignment 4, c -65: 0 0 40
Length 21, alignment 0, c -65: 8 8 88 | Length 21, alignment 0, c -65: 104 64 40
Length 21, alignment 5, c -65: 0 8 96 | Length 21, alignment 5, c -65: 8 16 40
Length 22, alignment 0, c -65: 8 0 96 | Length 22, alignment 0, c -65: 64 0 40
Length 22, alignment 6, c -65: 8 8 96 | Length 22, alignment 6, c -65: 16 8 40
Length 23, alignment 0, c -65: 8 8 104 | Length 23, alignment 0, c -65: 8 16 80
Length 23, alignment 7, c -65: 8 0 104 | Length 23, alignment 7, c -65: 16 16 80
Length 24, alignment 0, c -65: 8 0 112 | Length 24, alignment 0, c -65: 0 8 80
Length 25, alignment 0, c -65: 0 8 120 | Length 25, alignment 0, c -65: 0 0 88
Length 25, alignment 1, c -65: 8 0 120 | Length 25, alignment 1, c -65: 24 24 80
Length 25, alignment 2, c -65: 8 8 96 | Length 25, alignment 2, c -65: 24 24 88
Length 26, alignment 0, c -65: 8 8 120 | Length 26, alignment 0, c -65: 0 0 88
Length 26, alignment 2, c -65: 8 0 120 | Length 26, alignment 2, c -65: 32 24 88
Length 27, alignment 0, c -65: 0 8 128 | Length 27, alignment 0, c -65: 0 24 88
Length 27, alignment 3, c -65: 0 0 128 | Length 27, alignment 3, c -65: 32 32 88
Length 28, alignment 0, c -65: 0 0 128 | Length 28, alignment 0, c -65: 24 152 96
Length 28, alignment 4, c -65: 0 0 136 | Length 28, alignment 4, c -65: 8 0 96
Length 29, alignment 0, c -65: 0 8 144 | Length 29, alignment 0, c -65: 192 0 96
Length 29, alignment 5, c -65: 8 8 144 | Length 29, alignment 5, c -65: 16 8 96
Length 30, alignment 0, c -65: 0 8 144 | Length 30, alignment 0, c -65: 8 8 96
Length 30, alignment 6, c -65: 8 0 144 | Length 30, alignment 6, c -65: 16 16 96
Length 31, alignment 0, c -65: 8 0 152 | Length 31, alignment 0, c -65: 8 16 96
Length 31, alignment 7, c -65: 0 0 152 | Length 31, alignment 7, c -65: 24 24 96
Length 32, alignment 0, c -65: 8 8 112 | Length 32, alignment 0, c -65: 16 0 96
Length 64, alignment 0, c -65: 0 8 168 | Length 64, alignment 0, c -65: 32 16 232
Length 64, alignment 4, c -65: 8 0 168 | Length 64, alignment 4, c -65: 32 32 168
Length 128, alignment 0, c -65: 8 0 296 | Length 128, alignment 0, c -65: 120 0 296
Length 256, alignment 0, c -65: 56 56 552 | Length 256, alignment 0, c -65: 48 48 552
Length 512, alignment 0, c -65: 112 128 1072 | Length 512, alignment 0, c -65: 128 128 1064
Length 1024, alignment 0, c -65: 328 328 2096 | Length 1024, alignment 0, c -65: 320 320 2088
Length 1024, alignment 3, c -65: 336 352 2096 | Length 1024, alignment 3, c -65: 336 336 2088
Length 2048, alignment 0, c -65: 672 672 4136 | Length 2048, alignment 0, c -65: 720 712 4136
Length 4096, alignment 0, c -65: 1248 1248 8240 | Length 4096, alignment 0, c -65: 1416 1416 8232
Length 8192, alignment 0, c -65: 2400 2400 16424 | Length 8192, alignment 0, c -65: 2832 2824 16424
Length 16384, alignment 0, c -65: 4720 4736 32832 | Length 16384, alignment 0, c -65: 5640 5640 32800
Length 32768, alignment 0, c -65: 9328 9328 65576 | Length 32768, alignment 0, c -65: 11280 11272 65568
Length 65536, alignment 0, c -65: 18544 18544 13112 | Length 65536, alignment 0, c -65: 22536 22536 13111
Length 131072, alignment 0, c -65: 36992 36984 78640 | Length 131072, alignment 0, c -65: 98288 98128 26218
If this way of presenting the data is not satisfactory, I'd appreciate suggestions on how to make it easier to digest.
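One possible condensed view, sketched here under stated assumptions, is a single ratio per row: the current memset time divided by the proposed memset time, so that values above 1 favor the proposed version. The function name `speedup` and the regular expression are hypothetical. The sketch assumes each row carries the proposed results first and the current results second, as in the tables above, and it reads only the first timing column (`memset`) of each half. A reading of zero or less for the proposed side yields no usable ratio and is reported as `None`.

```python
import re

# Assumed layout of one half-row: "Length N, alignment A, c C: t1 t2 t3".
ROW = re.compile(
    r"Length (\d+), alignment (\d+), c -?\d+: (-?\d+) (-?\d+) (-?\d+)"
)


def speedup(row):
    """Return (length, alignment, ratio) for one side-by-side row.

    ratio is current memset time / proposed memset time, or None when the
    proposed time is not positive. Returns None if the row does not
    contain exactly two half-rows.
    """
    halves = ROW.findall(row)
    if len(halves) != 2:
        return None
    proposed, current = halves  # left half is proposed, right half is current
    length, alignment = int(proposed[0]), int(proposed[1])
    new_time, old_time = int(proposed[2]), int(current[2])
    if new_time <= 0:
        return (length, alignment, None)
    return (length, alignment, old_time / new_time)
```

For the Athlon 64 row with length 64 and alignment 4, for example, this gives 48 / 20 = 2.4. Because the pattern matches each half independently, it does not depend on the `|` separator being present.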
Thanks,
--
_______________________________________________________
Evandro Menezes AMD Austin, TX