Have you ever wondered how fast your computer's Pentium processor really is? Intel quotes processor speeds in gigahertz, but that's the clock speed. Most operations take several clock cycles, so the clock speed is not a true indication of how fast your computer can do useful work.

We use our Pentium 4-based machine to do scientific calculations, and for this kind of work the relevant measure of speed is how many millions of floating-point operations the processor can perform per second (megaflops).

We created a simple test program, written in C and Intel assembler, which actually performs a set of floating-point calculations and reports the speed directly. The program is designed to run under Linux, since that is the operating system we use.

The program tests the major Pentium floating-point opcodes in a fetch-calculate-store cycle, which emulates the kind of work that a real scientific application would perform. As well as addition, multiplication and division, the program tests several special functions: square root, cosine, two-argument inverse tangent, logarithm, exponential, and the `FSINCOS` opcode, which calculates both sine and cosine.
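As a rough illustration of what a fetch-calculate-store cycle looks like, here is a minimal C sketch of a timing loop for additions. It is not the program's actual source: the helper names `now_seconds` and `time_additions` are hypothetical, and it assumes a POSIX system such as Linux where `clock_gettime` is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Wall-clock time in seconds, read from the POSIX monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/*
 * Hypothetical helper, not taken from the real program.
 * Makes niters passes over arrays of length nsize. Each step fetches
 * a[i] and b[i], adds them, and stores the sum in c[i].
 * Returns the elapsed time in seconds.
 * The volatile qualifier on c discourages the compiler from
 * discarding the stores as redundant.
 */
static double time_additions(const double *a, const double *b,
                             volatile double *c,
                             size_t nsize, size_t niters)
{
    assert(a != NULL && b != NULL && c != NULL);

    double start = now_seconds();
    for (size_t iter = 0; iter < niters; iter++) {
        for (size_t i = 0; i < nsize; i++) {
            c[i] = a[i] + b[i];   /* fetch, calculate, store */
        }
    }
    return now_seconds() - start;
}
```

The total number of operations timed is `nsize * niters`. A plain C loop like this is at the mercy of the compiler's optimiser, so its timings should be treated with more caution than those of a hand-written assembler loop.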

You can download the gzipped tar file. Unpack it, then type:

    make a_speedtest
    ./a_speedtest -nsize 1000000 -niters 100

The output will look something like this:

    This is a CPU speed test program which is designed to report your processor's floating-point performance in megaflops.
    It gives the most reliable results for the Intel Pentium family of processors, because the timing loop for these devices is written in IA32 assembler code, but there is an architecture-independent version which will run on any CPU.
    However, the results reported by the generic version are heavily dependent on the optimisation performed by the compiler!

    This software was written by David Harper at obliquity.com
    If you find it useful, please give due credit to the author.

    Allocating data arrays ... done.
    Setting input arrays to random number ... done.

    Running ALL speed tests.

    It took 6.731 seconds to perform 1000.000 million additions.
    That corresponds to 148.567 mflops.

    It took 6.737 seconds to perform 1000.000 million multiplications.
    That corresponds to 148.435 mflops.

    It took 15.258 seconds to perform 1000.000 million divisions.
    That corresponds to 65.541 mflops.

    It took 79.994 seconds to perform 1000.000 million cosines.
    That corresponds to 12.501 mflops.

    It took 15.220 seconds to perform 1000.000 million square roots.
    That corresponds to 65.704 mflops.

    It took 72.307 seconds to perform 1000.000 million arc-tangents.
    That corresponds to 13.830 mflops.

    It took 42.740 seconds to perform 1000.000 million y.log2(x) operations.
    That corresponds to 23.398 mflops.

    It took 71.271 seconds to perform 1000.000 million combined sine/cosines.
    That corresponds to 14.031 mflops.

    It took 34.828 seconds to perform 1000.000 million binary exponentials.
    That corresponds to 28.713 mflops.

That example was run on a 2.8GHz Pentium 4 processor. Your mileage may vary.
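The arithmetic behind these figures is simple enough to sketch in C. The two helpers below are hypothetical names of our own, not part of the test program: `mflops` turns an operation count and an elapsed time into a megaflops rate, and `cycles_per_op` turns that rate and a clock speed into the average number of clock cycles spent on each operation.

```c
#include <assert.h>

/* Millions of floating-point operations per second. */
static double mflops(double operations, double seconds)
{
    assert(seconds > 0.0);
    return operations / seconds / 1e6;
}

/*
 * Average clock cycles per operation for a processor running at
 * clock_hz, given a measured rate in mflops. Because the test times
 * a whole fetch-calculate-store cycle, the result includes the cost
 * of the loads and stores, not just the arithmetic opcode.
 */
static double cycles_per_op(double clock_hz, double mflops_rate)
{
    assert(mflops_rate > 0.0);
    return clock_hz / (mflops_rate * 1e6);
}
```

For the additions in the sample run, `mflops(1000.0e6, 6.731)` comes to a little under 149, matching the reported rate, and `cycles_per_op(2.8e9, 148.567)` comes to roughly 19 cycles per fetch-add-store step. The same calculation for the cosine test gives well over 200 cycles, which shows how far the clock speed can be from the rate of useful work.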