Archive for December, 2005

Tomcat must have noticed my blog

Wednesday, December 28th, 2005

… because it ruined my servlet redesign by crashing the server when a servlet calls java.lang.Class.forName().

Update: Apparently Tomcat is innocent and it is all my fault for causing an infinite recursion (by mapping a single servlet to /* and trying to forward the request to /WEB-INF/some-page.jsp). I’m a bit disappointed that a misbehaving application can bring the whole application server down so easily.

Some Java programs are just LISP in disguise as well

Friday, December 23rd, 2005

The easiest way to write an expression parser that creates a syntax tree from a string is of course to use a suitable parser generator. If that is not desirable for some reason, the second choice is a recursive-descent parser, a few functions calling each other, walking over the string and keeping a pointer to the current position:

Tree parse(String expr) {
    return parseOr(expr, new int[] { 0 });
}

Tree parseOr(String expr, int pos[]) {
    … parseAnd(expr, pos) … pos[0]++; …
}

This works, but passing the parameters to each function is redundant and using pos[0] is ugly. (more…)

Why Java is not the right language for web development

Wednesday, December 21st, 2005

The recent rise in popularity of Ruby on Rails seems to be mainly a backlash against “the enterprise design” that is overengineered for most applications and designed a priori, “by committee”; people are drawn to Rails because the whole framework is so simple to learn and convenient, not because they are persuaded Ruby is a better language for the application.

While I don’t think Ruby is the only language suitable for Web programming, I feel quite confident Java is not suitable for this purpose at all.

(more…)

The ugliest C feature: <tgmath.h>

Thursday, December 8th, 2005

<tgmath.h> is a header provided by the standard C library, introduced in C99 to allow easier porting of Fortran numerical software to C.

Fortran, unlike C, provides “intrinsic functions”, which are a part of the language and behave more like operators. While ordinary (“external”) functions behave similarly to C functions with respect to types (the types of arguments and parameters must match and the result type is fixed), intrinsic functions accept arguments of several types and their return type may depend on the type of their arguments. For example, Fortran 77 provides, among others, an INT function which accepts Integer, Real, Double or Complex arguments and always returns an Integer, and a SIN function which accepts Real, Double or Complex arguments and returns a value of the same type.

This helps the programmer somewhat because the function calls don’t have to be changed if variable types change. On the other hand, user-defined functions can’t behave this way, so the additional flexibility is really limited to single subroutines that don’t need to call user-defined functions.

Some C programmers would call the feature ugly from the above description already, for the same reason integrating printf into the language would be ugly.

This functionality was incorporated in C99 together with other features for better support of numerical computation and it is provided in the above-mentioned <tgmath.h> header. Provided are trigonometric and logarithmic functions, functions for rounding and a few others. The header defines macros that shadow the existing functions from <math.h>; e.g. the cos macro behaves like the cos function when its parameter has type double, like cosf for float, cosl for long double, ccos for double _Complex, ccosf for float _Complex, ccosl for long double _Complex. Finally, when the parameter has any integer type, the cos function is called, as if the parameter were implicitly converted to double.
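The dispatch described above can be observed without calling anything, because sizeof does not evaluate its operand. The following is only a sketch; the function name is made up for illustration, and the size comparison is a weak check on platforms where double and long double have the same size.

```c
#include <tgmath.h>

/* Hypothetical helper: returns 1 if the result type of the cos macro
 * follows the type of its argument, judged by the size of the result. */
int cos_result_type_follows_argument(void)
{
    float f = 0.5f;
    double d = 0.5;
    long double ld = 0.5L;
    double _Complex z = 0.5;
    int i = 1;

    /* sizeof leaves its operand unevaluated; only the types matter here. */
    return sizeof(cos(f)) == sizeof(float)
        && sizeof(cos(d)) == sizeof(double)
        && sizeof(cos(ld)) == sizeof(long double)
        && sizeof(cos(z)) == sizeof(double _Complex)
        && sizeof(cos(i)) == sizeof(double);   /* integers are treated as double */
}
```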

The second reason why this feature is ugly is that it attempts to imitate functions, but the imitation is imperfect and even dangerous: If you try to pass the generic macro cos as a function parameter, you actually always supply the cos function operating on doubles because the macro expansion doesn’t happen when cos is not followed by a left parenthesis.
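A small sketch of that pitfall follows. The names apply and bare_cos_is_the_double_function are invented for this example; the checks again use sizeof, so nothing is actually computed.

```c
#include <tgmath.h>

/* Hypothetical callback-taking function, standing in for any API
 * that accepts a pointer to a math function. */
double apply(double (*fn)(double), double x)
{
    return fn(x);
}

int bare_cos_is_the_double_function(void)
{
    float f = 0.5f;

    return sizeof(cos(f)) == sizeof(float)      /* macro call: acts like cosf */
        && sizeof((cos)(f)) == sizeof(double)   /* name not followed by '(': the plain function */
        && sizeof(apply(cos, f)) == sizeof(double); /* bare cos passed as double (*)(double) */
}
```

Writing (cos)(f) suppresses the macro for the same reason: the name is not immediately followed by a left parenthesis, so the float argument is silently converted to double.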

The final reason why this feature is ugly is that such macros can’t be implemented in strictly conforming C; they have to rely on some kind of compiler support – and experience (e.g. the speed with which bugs in the glibc implementation are discovered) seems to suggest this feature is used very rarely and doesn’t deserve to be a part of the “core language”, especially because the underlying feature is not available. (Contrast this to <stdarg.h>, which is available for portable use.)

Now, if the feature is both ugly and not used in practice, why mention it at all? I’m writing this article because I have examined the glibc implementation and it is such an ingenious hack that I feel it should be recorded for posterity, in some better way than this commit message:

2000-08-01  Ulrich Drepper  <drepper@redhat.com>
            Joseph S. Myers  <jsm28@cam.ac.uk>

* math/tgmath.h: Make standard compliant. Don't ask how.

(more…)

How to destroy Linux

Monday, December 5th, 2005

If you don’t feel strongly about binary-only modules because “nobody gets hurt”, Arjan’s Linux in a binary world… a doomsday scenario might persuade you to change your mind.

Some C programs are just LISP in disguise

Sunday, December 4th, 2005

After learning LISP I’m increasingly noticing open-coded emulations of LISP facilities. While every LISP advocate talks about macros, macros are not the only reason why LISP is worth learning.

Dynamic variables look like an obscure feature inherited from the times of LISP interpreters, but there is no really clean way to emulate the functionality if the language doesn’t provide it. (On the other hand, the comprehensive LISP condition system can be implemented as a set of macros on top of dynamic variables.)
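As an illustration of what such an open-coded emulation tends to look like in C, here is a minimal sketch. The variable verbosity and both functions are hypothetical, chosen only for this example.

```c
/* Hypothetical global playing the role of a LISP special variable. */
int verbosity = 0;

int current_verbosity(void)
{
    return verbosity;
}

/* Runs body() with verbosity temporarily rebound, then restores it. */
int with_verbosity(int temporary, int (*body)(void))
{
    int saved = verbosity;
    int result;

    verbosity = temporary;   /* "bind" the new value */
    result = body();         /* everything called from here sees it */
    verbosity = saved;       /* "unbind"; skipped if body() longjmp()s out */
    return result;
}
```

The save and restore has to be repeated by hand at every binding site, and it is neither safe against non-local exits nor thread-safe, which is why it is hard to call clean.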

(more…)

Fun with C99 compound literals

Friday, December 2nd, 2005

Compound literals allow one to write e.g.:

struct point { int x, y; };
extern void putpixel(const struct point *);

and later

putpixel(&(const struct point){ 1, 2 });
struct point pt = (const struct point){ 37, 42 };

Nice, but not very useful, right? I was amazed at Nikita’s usage:

strtol() Usage Guide

Friday, December 2nd, 2005

strtol() is what you get when you want to be flexible and yet have a simple interface. At first glance it looks like a nice, clean function: you pass it a string and a base and you get the value and optionally a pointer to the rest of the string.

Then you read the documentation.

To convert const char *str to a long, properly checking for overflow, invalid trailing characters and empty input, it is necessary to do the following:

char *p;
errno = 0;
result = strtol(str, &p, base);
if (errno != 0 || *p != 0 || p == str)
  error_handling ();

It is necessary to check both errno and *p; if you don’t check errno, you get 0 for empty input and LONG_MAX or LONG_MIN for overflow or underflow. On empty input the return value is 0 and errno might be set to EINVAL; the portable way of checking for empty input is comparing p and str.

This will still accept strings that start with white space; check for !isspace((unsigned char)*str) if you want to reject them.
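Putting the checks together, a wrapper might look like the sketch below. The name parse_long and its return convention (0 on success, -1 on any error) are assumptions made for this example, not part of any standard interface; the white-space check is the optional one mentioned above.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: stores the converted value in *result and returns 0,
 * or returns -1 and leaves *result untouched if str is not a clean number. */
int parse_long(const char *str, int base, long *result)
{
    char *end;
    long value;

    if (isspace((unsigned char)*str))
        return -1;                  /* optional: reject leading white space */

    errno = 0;                      /* strtol() only sets errno on failure */
    value = strtol(str, &end, base);
    if (errno != 0                  /* overflow, underflow or invalid base */
        || *end != '\0'             /* trailing characters */
        || end == str)              /* empty input, no digits consumed */
        return -1;

    *result = value;
    return 0;
}
```

A caller would then write something like if (parse_long(arg, 10, &n) != 0) error_handling(); instead of repeating the three checks at every call site.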