linux 64-bit vs 32-bit
December 14, 2007
http://www.x86-secret.com/popups/articleswindow.php?id=118
http://www.pdc.kth.se/~pek/64vs32bits.txt
http://blog.linux.org.tw/~jserv/archives/001463.html
http://blogger.org.cn/blog/more.asp?name=FoxWolf&id=28695
64 versus 32 bit processors
As manufacturers like Intel and AMD work to bring 64 bit
microprocessors into the mainstream, the question of what makes a 64
bit processor 64 bit and what implications the "bittedness" has on
application performance becomes increasingly relevant.
Strictly speaking, a processor is said to be an "N bit processor" if
the size of the integer registers is N bits. This normally means that
the processor can use no more than N bits in a memory address (since
addresses are usually manipulated as integers in modern processors),
although there is certainly no requirement that an N bit processor
implement N bit addresses.
What are the benefits of a 64 bit processor?
The chief benefit is that a lot of limits are removed transparently.
In a pure 64 bit environment the user doesn't have to think about
compiling with the right flags or working around pointer size limits
using library calls and opaque data types. Large processes and big
memories work in a straightforward manner with a minimum of fuss.
Programs that need to use more than 2 or 4 GB of memory can usually
do that with a simple recompile.
Codes that perform many operations on integer values that do not fit
in 32 bits _may_ also run faster on a 64 bit processor (see below).
However...
Some myths :
- A 64 bit processor is faster than a 32 bit processor.
This may be true, but not on account of the increased number of
bits.
The size of the integer registers does not normally have a large
impact on performance. Integer codes that work with large numbers
could benefit from the increased register size, but memory latency
is the most important performance limiter on modern processors and
will in many cases mask much of the potential gain.
If the addresses in the 64 bit processor are larger than those
in the 32 bit processor the 64 bit processor may actually turn
out to be slightly slower since the larger addresses eat up
a larger portion of caches and memory bandwidth.
Since a 64 bit version of a processor family is in most cases
developed later than a 32 bit version it often benefits from a
lot of other changes as well such as larger caches, improved
memory systems and higher clock frequencies. These changes
can cause a great performance difference that marketing
departments are fond of attributing to the "bittedness" of
the processor.
- A 32 bit processor can only use 2^32 bytes (4GB) of RAM
It is true that the maximum size of an address in a program in a 32
bit processor is usually 32 bits, however it is possible for the
processor to extend the address using other means to enable it to
access a larger physical memory. Some members of the 32 bit Intel
IA32 family of processors (modern Pentiums) can access 2^36 bytes
(64GB) of physical memory using the ESMA (Extended Server Memory
Architecture [2]). A virtual address is still limited to 32 bits but
4 extra bits can be tacked onto it during translation to a physical
address. This means that each process can only access at most 4GB at
once transparently through a pointer but the system as a whole can
use more and the process can keep more data in memory with the help
of the OS. This requires the code to explicitly support this model.
- A 64 bit system can use 2^64 bytes (16EB) of RAM
As previously stated a 64 bit processor may have addresses that use
fewer than 64 bits and usually doesn't actually have memory
interfaces that support the full 64 bit address range. Many systems
also reserve parts of the address range for various uses so that
some bits of the address range are lost there. For the time being the
number of address bits actually supported is usually "large enough"
anyway, often in the petabyte range.
- The precision of a floating point value depends on the "bittedness"
of the processor
The question of the width of the floating point registers is
basically orthogonal to that of the width of the integer
registers. Practically all modern commercial processors with a
floating point unit conform to the IEEE 754 standard, which
specifies that a single precision floating point value is
represented with 32 bits and a double precision floating point value
is represented with 64 bits. For instance, the 32 bit Power2
processor has 64 bit wide floating point registers and the 32 bit
IA32 family of processors actually has 80 bit wide floating point
registers.
Some processors incorporate SIMD (Single Instruction, Multiple Data)
extensions where single precision (32 bit) floating point values can
be packed into 64 or 128 bit registers and operated on in parallel.
Because they operate on several values in parallel, programs that
can use these extensions can improve their floating point
performance quite a bit. Examples of such extensions are SSE/SSE2 in
Intel processors, 3DNow! in AMD Athlon processors and AltiVec in
PowerPC.
- My PlayStation2 has a 128 bit processor
The processor in the PlayStation2, the Emotion Engine, consists of a
MIPS III RISC core and two vector units [3]. The MIPS III is a 64
bit architecture with 64 bit integer registers. The vector units
each have 16 integer registers (16 bits wide) and 32 floating point
registers (128 bits wide). So, ignoring the marketing value of being
able to claim
"128 bit power", the PlayStation2 uses a 64 bit processor.
- I need a 64 bit processor in order to use 64 bit wide integer types
A compiler can readily support integer types that are wider than the
maximum width of the architectural integer registers by using more
than one register to store them. For instance, the "long long" type
in GCC under Linux on an IA32 processor is 64 bits. Using these types
is typically slower than using the "native" types since the compiler
has to generate code to load/store more registers and stitch
together the results.
- The maximum size of a file is tied to the "bittedness" of the
processor
The maximum size of a file in a Unix system is defined by the
maximum size of the type off_t. This type can be 64 bits even on a
32 bit processor. A common solution is to have a 32 bit off_t by
default and a 64 bit off64_t type that can be used if
necessary. Most modern filesystems support 64 bit off_t on 32 bit
systems. Programs often have to define a constant (such as setting
_FILE_OFFSET_BITS to 64 in Linux) or be compiled with some special
flag in order to use the larger version of off_t.
Potential points of contention
The definition of 64 bit processors comes from John Mashey [1],
someone who knows a lot more about it than I do. Feel free to argue
against it but know that you must argue your point more effectively
than Mashey.
There are machines out there that put the lie to what I've written
above. Older Cray machines didn't support IEEE 754 floating point for
instance. I'd be happy to note any other deviations but bear in mind
that I've purposefully glossed some things over to keep it short(ish)
and not to obscure the point too much.
Contact
Corrections, improvement suggestions and questions welcome.
pek@pdc.kth.se
References
[1] John Mashey's comp.arch post about "bittedness" :
http://www.pdc.kth.se/~pek/bittedness.Mashery.txt
[2] Extended Memory Access on IA-32 Platforms :
http://www.intel.com/idf/us/fall2002/presentations/DES124PS.pdf
[3] Masaaki Oka, Masakazu Suzuoki. Designing and Programming the
Emotion Engine. IEEE Micro, Vol. 19, No. 6, pp. 20-28
20030923