23 January 2007

AbiWord on Sharp Zaurus

Screenshot of AbiWord on Akita AbiWord 2.5.x with the shiny new embedded UI on Sharp Zaurus SL-C1000, running Poky (screenshot courtesy of scap).

(Had to scale the image from 640×480 to 320×240 for the blog; the original capture is here.

22 January 2007

Western Rib on Aonach Mor

Picture 1 Had a great day yesterday on the Western Rib on the west face of Aonach Mor. We intended to have a go at the Ordinary Route in Coire nan Lochan at Glen Coe, but seeing that the lower car park was almost full when we go there before 8am, and knowing that due to the conditions pretty much everyone would be either on the Ordinary Route or Dorsal Arete, we decided to head to the Aonach.

Picture 1 It turned out to be excellent choice, as the west face was quiet, and there were no other parties on our climb. The conditions on the face are not bad; not huge amounts of snow, and not much ice, but excellent for the type of a climb we were on. We soloed the first 150 or so meters, then Bob led one pitch over some trickier ground, the rest we moved on together. Took as around 3.5 hours, which is not bad for a party of three.

The visibility on the top was very poor, so we did the prudent thing and took a bearing; however, thinking that the bearing took us too close to the edge of the north face, we made a small adjustment (the cornices on the top of the north face are presently massive) — you can imagine our surprise when after half an hour walking we arrived at what looked suspiciously like the Aonach Mor summit cairn from which we departed. Goes without saying that we stuck uncompromisingly to the bearing on the second attempt, reaching the top of the ski runs shortly. We were in luck, the gondola was still running, and shortly we were enjoying venison sausages in the Ben Nevis Inn.

19 January 2007

Fast, Fixed Point Square Root Computation

Recently I have been working on Clutter, and more specifically on replacing some floating point math with fixed point. What was needed was a fixed point implementation of sin() and sqrt(). The sin function is the simple one; since sine is a periodic function, it is easy to pre-compute a LUT, and when you have the luxury of choosing how the angle is represented, you can use it directly to index the LUT, ending up with a very fast function indeed.

Sqrt is bit more challenging; there are various implementations of a square root function out there, but two of those stand out. First of these is a LUT based algorithm that I found described in the Allegro library. It exploits the fact that

sqrt (x) == sqrt (x/d) * sqrt (d); 
for d == 2^(n) => sqrt (x) = sqrt (x/2^(2n)) * 2^n

If for a given x we locate a suitable n such that x >> 2n is in <0,255>, we can compute sqrt(x) from sqrt(x>>2n), which we store in a LUT.

While determining a suitable n in C requires a loop, on many modern processors there is an instruction for counting leading zeros (e.g. the arm CLZ), so that a line or two of inline assembly makes it possible to determine n in just couple of clock cycles. Overall, this algorithm provides good compromise between speed and accuracy; it suffers from error < 1%, works well even for numbers in interval <0, 1), and on ARM without FPU is about 20 times faster than the clib sqrt.

The other square root algorithm that stands out for its speed is one sometimes referred to as the Quake algorithm, for it was made famous by its use in Quake (although the correspondence here suggests it is was not invented by the Quake authors). It goes like this:

float magic_sqrt (float number)
{
   long i;
   float f = 1.5, x = number/2, y = number;
   i  = * ( unsigned long * ) &y;
   i   = 0x5f3759df - ( i >> 1 );
   y = * ( float * ) &i;
   y = y * (f - x*y*y);
   return number * y;    

}

(If it makes no sense to you, do not despair, the link above describes how it works.)

The main problem with this algorithm is that in its original form it uses floating point multiplication, which we want to avoid on FPU-less CPUs, if at all possible. Fortunately, with little care the algorithm can be implemented using fixed point math. The key to this is in choosing the fixed point format for the number y; the algorithm does not compute sqrt (number), but 1/sqrt(number), which is only inverted in the final step. This means that for number > 1, y < 1, — unless y is represented with good precision, the final result is significantly affected.

It works pretty well in single precision float, so it should work using 10.22 fixed point, and it indeed does: for an integer x in <0, 131> the error is < 5% (i.e., at most 1 pixel), and for x <5591 less then < 10%. Obviously, this is not as good as the LUT-based method (altough the precission could be improved by additional iterration), but this algorithm is very fast. My tests shown that on my Pentium M laptop it is more than 10x faster than the clib sqrt, and on FPU-less arm (and this is not a typo) more than 800 times faster!

If you want check it out, you will find implementations of both of these sqrt algorithms in the Clutter sources, and perhaps you can make them even faster.

18 January 2007

Tweaking AbiWord for Embedded Devices

One thing that I have wanted since the INdT guys produced the initial Maemo port of AbiWord is a normal (that is, plain gtk-based) version of AbiWord tweaked toward use on embedded devices. Over the past year or so I have been pecking at it, and it is finally coming together in the shape of a bunch of new options that can be passed to the AbiWord configure script:

–enable-embedded: this option results in an alternative user interface that is intended to improve the use of screen real estate:

  • The Normal mode becomes the default mode, rulers are hidden, various ‘dead’ space areas around the page are reduced in size or eliminated,

  • There is no status bar, and no menu bar, only a simple toolbar, with the main menu accessible via a button on the lef side of the toolbar.

The alternative interface looks something like this:

Screen capture of embedded AbiWord

Apart from the UI-oriented –enable-embedded, there are also three other options that can be used to reduce size of the application:

–disable-printing: removes code and dependencies associated with printing.

–disable-spellcheck: removes code and dependencies associated with spell and grammar checking features.

–disable-exports: when this option is used, symbols are not exported from the executable; this reduces it’s size by about 1/8, but the executable compiled with this option cannot use any plug-ins.

An ipk package of AbiWord built with all of these options is only 1.4MB in size, which I think is pretty amazing. A bitbake recipe to build AbiWord from CVS using OpenEmbedded environment can be found here.

10 January 2007

N800 Developer Device Programme

So I have been contemplating this surge in traffic on the maemo-developer mailing list since 8 January and concluded that some people must know something I do not, namely that the 500 lucky developers to get their discount code for the N800 will be picked up by an automated judging system based on the number of emails posted to the maemo list since the initial announcement on the 8th.

Now, my rational self is telling me that this could not possibly be the case, but my spiritual self is advising me to hedge the bets, just in case my rational self happens to be mistaken. So, I have hacked upcarefully crafted this amazing bulk email tool for the 770, that will make sure that my presence is truly felt on the list.

I am very pleased with the results; the engine has been hand-optimised to run on ARM processor, and is achieving throughput that is only slightly behind that of similar (and expensive) commercial tools for the desktop. Furthermore, the automatically generated messages are virtually indistinguishable from human generated text, and fit well among other messages on the list.

In case you are interested, Application Installer-ready packages for the 770 are available here.

P.S. Packages for the N800 will made available as soon as I get my discounted device.

P.S.S. Just discovered a small bug; for some reason sender is being generated at random — will definitely maker sure this is fixed in an upcoming 0.2 release.

9 January 2007

Cross-platform development M$ way

Some insight into how ‘cross-platform’ development works @M$; I felt a bit sorry for the Mac guys, until I read in the update section,

If we had to add support for Open XML to Mac Word 12 without being able to port code from Win Word, the read/write estimates shrinks down to about 8.5 man/years …

That made me really, really laugh; come on, 8.5man/years? I am not surprised nothing from M$ is ever produced on time if it takes 8.5man/years to create an import/export filter from a file format that was tailored to the Word internals, to a file format that was tailored to the Word internals, via (yes, you guessed it) an intermediate file format that was tailored to the Word internals. (Bad news for the OpenDocument though; how could M$ ever manage to support such a complex file format that has no connection with the Word internals whatsoever?)

As I recall, this about half of what it took to add HTML support to Word: 10 or so devs over a release cycle of 2 years.

One less mystery in the Universe: I finally understand why Word HTML is so bloated.

Next Page »