Matthias Ernst blog@mernst.org Matthias Ernst's Weblog http://mernst.org/blog/rss.xml Mostly geeky stuff 2007-10-05T19:18:00+02:00 Sidebar

About

I'm a passionate, paid tinkerer in all things Java.

Links

jist - sprong - axt - ULC with Spring - heapprofile - ariadna

Syndicate

Atom 1.0

http://mernst.org/blog/rss.xml#sidebar 2007-06-11T13:00:00+02:00 Re: Budget Reservation Confirmation

So it seems I'm exhibiting a huge amount of naivety about how car rental works at Budget Inc. Four strong indications:

Assumption: the online reservation system "knows" about the number of cars available at a rental location. It will not let you make a reservation when no cars are available and it will not send you a "confirmation" mail saying "your rental car has been reserved".
Assumption: Representatives at the rental location are aware of your reservation and are expecting you at the time indicated therein.
Assumption: A representative that - after debunking assumptions 1 & 2 - offers to transfer your reservation to another location nearby has a better overview of availability in her software and will make sure a car will be waiting for you there.
Assumption: The map handed to you by the representative actually carries the correct address of that other location.

Needless to say, all four assumptions have been proven wrong at Budget Embarcadero, San Francisco, and FWIW said representative had already closed shop by the time #3 came to pass. It wasted like 2 hours of a precious Saturday that I had planned to spend on the road - boy am I pissed. And I just refuse to accept the idea of having to question every statement made by a representative and jotting down the name of every person I talk to. Get your shit together, Budget, before I rent with you again.

http://mernst.org/blog/rss.xml#re-budget-reservation-confirmation 2008-02-16T15:17:00-08:00 To Rephrase Reginald

Reginald quotes Blodgett's Law and points to verbos languages:

A development process that involves any amount of tedium will eventually be done poorly or not at all.

If I read him correctly, Reginald is trying to say: "Programming Languages that require a handwritten test suite to prove basic type correctness will eventually be abandoned." Happily trolling away ...

http://mernst.org/blog/rss.xml#to-rephrase-reginald 2008-02-14T20:47:00-08:00 Děkuji Vám

Yet another installment in that long list of pleasant surprises (example made up). Imagine I've written a method that takes a map and prints its entry set. Now I want to generalize this method to print any set and to make the caller pass in the entry set himself. Highlight map.entrySet() and press Apple-Alt-P:

Notice the last option "Remove parameter no longer used"? It makes my refactoring just one consistent operation, one I'm doing quite regularly and this is really just awesome. Steve Yegge dismisses it as "industry excitement about manipulating code as algebraic structures". He's happier pushing around compressed ASCII. Matter of taste.

Anyway, making my employer buy IDEA licenses is already a form of positive feedback, but I think it's about time to extend a BIG thank you to the guys at JetBrains. You have fundamentally and continually improved my work experience over the last >5 years --- I can't even remember anymore when I retired emacs. Cranking out and massaging code is what I do and every friction removed helps keep me sane.

</vendor-promotion>

http://mernst.org/blog/rss.xml#a-big-thanks 2008-01-12T10:44:00+01:00 Actually type inference has been here for a long time

There are so many complaints that variable declarations in Java are written like so: type var = new type();.

Not in my IDE. In my IDE, it's: new type() CTRL-ALT-V/F/C var RETURN. Or type var = new SHIFT-CTRL-SPACE RETURN. And should I want to change the type, I typically rinse and repeat.

This is not to say that it doesn't inhibit readability and that a compiler shouldn't do the job. But people who actually type the type twice today simply don't use the right tool.

[2008-01-06] Kevin writes: You must be joking. That's not type inference. Of course I'm joking and not. Type inference refers to the process performed by the programming environment that fills in -redundant- type declarations that have been omitted from the source code, typically in two cases: variable declarations (type is that of the initialization expression) and function calls (type parameters derived of value parameters' types). This process is hopefully defined by the language specification.

However, we have to keep in mind that this definition is very focused on the textual input that is source code. The underlying assumption is that the programmer types and reads the source code as a sequence of characters --- so we need the source code to be as concise as possible. I'm saying that we need to factor programming environments in more. I'm saying I don't need type inference for locals as badly as other people seem to because my interface to Java is much more than the JLS and a text editor, my IDE helps me.

There are more examples like this. No one argues against long variable names because they are so hard to type. Opponents of extension methods as recently proposed say that a reader of a call site can no longer figure out easily where the method is defined. Not in my world: CTRL-Hover over the symbol and I see the definition site. Smalltalkers don't ever bitch about the class definition syntax since they almost never see it. And why has no one ever proposed getting rid of the package declaration statement in Java although it is predetermined by the directory location? Exactly, no one writes these by hand anyway.

In the end it's a language vs tool argument: what I say makes the IDE great, linguists say they wouldn't need if the language were better. Don't get me wrong, I'm all for local variable type inference. And I loathe code generation. I'm just getting pretty tired of endless my language vs your language when the comparison is based on the textual representations and not on the net programming effort.

[2008-01-07] Rémi and Ben raise an important point below: some types cannot be written down in Java, especially of anonymous inner classes.

Comments

Kevin: You must be joking. That's not type inference.

Rémi Forax: Matthias, yes IDE helps but it will bother me to have to use the mouse over every variables to understand a code. Reading codes is a part of my job (finding bugs in student codes) and i love Java because i am able to do that pretty fast.

Moreover, Java type system own types that are not writeable like the type of null, the intersection of interfaces, wildcard capture. By example, try to find with your IDE, the type of foo in the code below:

 list = null;
foo := null;   // inference here
list.add(foo);
]]>

Rémi

Ben: Rémi makes a good point. What if you have an anonymous inner class with a method that you want to be able to call or a field you want to reference?

AFAIK, this is the main reason local variable inference was added to C#3 - to allow anonymous types for LINQ expressions.

http://mernst.org/blog/rss.xml#ide-type-inference 2008-01-05T22:00:00+01:00 Swimming with the stream

JAXB is great. Not a doubt. Just as an academic exercise, in order to write a big document, can we avoid building the object graph in advance? Could we still profit from strongly typed binding? The best answer I found was this:

Write an interface that corresponds to your complex type. Where JAXB expects a property to read from, have a method that accepts a closure that accepts an instance of the child element type. For elements of simple type, provide a simple setter:

          public interface Artist {
          public Artist name(String name);
          }

          public interface Album {
          public Album artist({Artist=>Void} artist);
          public Album title(String title);
          }

In order to produce an Xml document with this, we need to bootstrap ourselves an Album instance and write through typed calls:

          write("album", Album.class, {Album a =>
          a.title("mein leben")
          .artist({Artist a =>
          a.name("Charlotte Saenger")
          })
          });

Not pretty. The best I could think of. Seems like this won't have a great future... Just for the sake of completeness, an implementation _sketch_:

          public class StreamBinder {
          Model m;
          ContentHandler out;


          <T> void write(String uri, String element, Class<T> c, {T=>void} t) {
          out.startElement(uri, element, "", new AttributesImpl());
          t.invoke(bind(c));
          out.end(uri, element, "");
          }

          <T> T bind(final Class<T> c) {
          return c.cast(Proxy.newProxyInstance(c.getClassLoader(), new Class[] {c}, new InvocationHandler() {
          public Object invoke(Object o, Method method, Object[] args) throws Throwable {
          String uri, element = model.lookup(c, method);
          Class elementType = model.lookup(c, method);

          // ##handle simple content
          write(uri, element, elementType, args[0]);
          return o;
          }
          }));
          }


          public static void main(String[] args) {
          model = reflectOver(Album.class);

          new StreamBinder(out, model)
          .write("album", Album.class, {Album a =>
          a.title("mein leben")
          .artist({Artist a =>
          a.name("Saenger")
          })
          });
          }
          }

http://mernst.org/blog/rss.xml#swimming-with-the-stream 2007-12-22T00:20:00+01:00 S3ad

So I tried Transmit, Forklift and S3cmd and _none_ allows me to set the content-type. Only s3cmd can make items world-readable in an easy way. I can't believe I just had to hack s3cmd to make '--mime-type' work.

Also sad: this file as application/atom+xml would be spec compliant, but Camino only renders application/xml. So I'm going to stay with xml I guess...

http://mernst.org/blog/rss.xml#S3ad 2007-12-21T22:34:00+01:00 PUT to work

I want to second Harry and Patrick's comment over at Bill's blog. I think all of Bill's concerns can be addressed:

Trust: If you have multiple clients, you need to do access control anyway (they might PUT to each other's items, even if created by POST).

Accidental overwriting: handled by 'If-None-Match'.

Conflicts: with PUT, you can route requests according to URI and do efficient conflict resolution/avoidance on the local server shard.

Apart from idempotency: let's remember that Atom is for syndication - that typically entails already existing items with client-specific ids. That an Atom server forces its own id space onto me that I need to map to mine is a nuisance, not a feature. And for the really stupid clients: serve an Atom feed of fresh, unused ids.

On the same level, I've never been into the "primary keys must be database-generated" mantra.

http://mernst.org/blog/rss.xml#PUT-to-work 2007-12-21T20:46:00+01:00 mernst.org moved and outsourced

As a consequence of my move to Zurich, I had to change providers for mernst.org. I was torn between two extremes: a "real" server where I could experiment and put Java services on the web and a minimalistic approach where http file delivery and mail run with the big guys. I have chosen the latter route - a mere domain registration with a german provider, http traffic is CNAMEd to my bucket www.mernst.org at Amazon S3, email lives on Google Apps.

What I'm still looking for is a free/pay per usage service that would accept [captcha'd] [XHR] POST requests for my domain, store them and offer them as a feed that I could process at my discretion, i.e. when I'm online.

http://mernst.org/blog/rss.xml#mernst-org-moved-and-outsourced 2007-11-24T23:24:00+02:00 Non-blocking IO Iterators

When you deal with non-blocking IO your IO tasks turn from methods into state machines. You do some IO on every transition, each state representing what kind of IO you would like to do next. In combination with something like theyielder iterator byte rewriting framework, we can turn these back into iterator methods that yield interest sets, making NIO a little easier again:

          // this is sample code that neither implements proper error handling
          // nor a tunable buffer allocation policy or anything ...
          // it just demonstrate how an NBIO task looks as an iterator of InterestSets
          public class Copy extends Yielder<InterestSet> {
          public final ReadableByteChannel source;
          public final WritableByteChannel target;

          public Copy(ReadableByteChannel source, WritableByteChannel target) {
          this.source = source;
          this.target = target;
          }

          protected void yieldNextCore() {
          try {
          ByteBuffer b = ByteBuffer.allocate(4096);
          while (true) {
          switch (source.read(b)) {
          case -1: // EOF
          yieldBreak(); return; // yieldBreak() actually necessary?
          case 0:
          yieldReturn(new InterestSet().read(source));
          break;
          default:
          b.flip();
          while (b.remaining() > 0)
          if(target.write(b) == 0) yieldReturn(new InterestSet().write(target));
          }
          }
          } catch (IOException e) {
          // hmm what about that
          }
          }
          }

          ... where:

          public class InterestSet {
          public final HashMap<Channel, Integer> interestSet = new HashMap<Channel, Integer>();
          public Long delay;

          public InterestSet withInterest(Channel c, int op) {
          interestSet.put(c, or(interestSet.get(c), op));
          return this;
          }

          public InterestSet waitFor(long delay, TimeUnit u) {
          this.delay = TimeUnit.MILLISECONDS.convert(delay, u);
          return this;
          }

          public InterestSet read(Channel c) {
          return withInterest(c, SelectionKey.OP_READ);
          }

          public InterestSet write(Channel c) {
          return withInterest(c, SelectionKey.OP_WRITE);
          }

          public InterestSet connect(Channel c) {
          return withInterest(c, SelectionKey.OP_CONNECT);
          }

          public InterestSet accept(Channel c) {
          return withInterest(c, SelectionKey.OP_ACCEPT);
          }

          private int or(Integer a, int b) {
          return a == null ? b : a|b;
          }
          }

Disclaimer: I haven't actually run this code yet - just to get an idea how it would look like.

Comments

Aviad Ben Dov: Actually, yieldBreak() isn't really neccessary. Reaching the end of the yieldNextCore method is an implicit yieldBreak. So, you could either yieldBreak, or to break from the while loop - if that's the intention.

http://mernst.org/blog/rss.xml#non-blocking-io-iterators 2007-10-05T19:18:00+02:00 Extract Method Reloaded

Three steps of refactoring I often perform: given a complex expression exprA(exprB)

extract local "b" on exprB: b=exprB; exprA(b)
extract method "a" on exprA: b=exprB; a(b)
inline local "b": a(exprB)

Wouldn't it be great if this were part of the extract method refactoring? I.e. you get to choose subtrees of the expression's AST to form the parameters?

Example at hand: my expression is "interestSet.get(ch)|op" and I want to extract a method "or(Integer a, int b)" where "interestSet.get(ch)" would form the first parameter. In the next step I would change the method to return "a == null ? b : a|b".

http://mernst.org/blog/rss.xml#extract-method-reloaded 2007-10-04T16:15:00+02:00 Definite Initialization?

The Java language specification requires that a call to a super(...) be the first statement in a constructor. That rule has often felt overly restrictive to me. For example, it forbids code like the following:

          Constructor() {
          T local = expr;
          super(local);
          }

However super(expr) is allowed although both versions are provably equivalent. What's the motivation behind that rule? It's twofold: one, it garantuees that we do call a superclass constructor. Two, it (kind of) garantuees that we do not access "this" before that. The second point is kind of moot because the compiler still has to prove that "expr" does not access "this" either.

The whole situation reminds me very much of definite assignment: with JDK 1.1, the rules for initialization of local final variables were relaxed. You do not need to provide an initialization expression at the point of declaration but you can assign a value later, iff the compiler can prove that a) you only assign once and b) you do not access the variable before.

I personally believe :-) that the same relaxed rule could be applied to super constructor invocations. Take http://java.sun.com/docs/books/jls/third_edition/html/defAssign.html, replace the variable by "this", "definetely assigned" by "definetely initialized" and "assignment expression" by "super constructor invocation". Then the following code could be legal:

          Constructor(x) {
          if(x != null) {
          super(x);
          } else {
          super();
          }
          }

Am I missing something?

Comments

Matt: AFAIK, the JRE 1.4+ verifier allows super() to be later in the constructor, but only the language forbids it.

In fact, for anonymous subclasses, the compiler generates assignments to this.static before calling super()! You can see evidence of this if you read through Lower.java in the compiler sources.

Aviad Ben Dov: I totally agree. This obscure rule has made a lot of code into a mess.

Currently, to avoid having crazy expressions at the constructor, I create static methods and call those. But still, this causes ridiculous code.

 max) { throw new IAE(...); }
   super(min, max);
 }
}

Typically validations are single, such as

class Derived extends Range {
 public Derived(int min, int max) {
   super(min, max);
   if (min < 0) { throw new IAE(...); }
   if (max < 0) { throw new IAE(...); }
 }
}

This you can get away with using a validation method, but it's ugly.

class Derived extends Range {
 public Derived(int min, int max) {
   super(isNonNegative(min),
       isNonNegative(max));
 }
 private static int isNonNegative(int val) {
   if (val < 0) { throw new IAE(...); }
   return val;
 }
}

I'm not a language designer, but I'll ask: how about getting multiple return values? Then we could have

class Derived extends Range {
 public Derived(int min, int max) {
   super(validate(min, max));
 }
 private static int validate(int min, int max) {
   if (min < 0) { throw new IAE(...); }
   if (max < 0) { throw new IAE(...); }
   if (max < min) { throw new IAE(...); }
   return (min, max);
 }
}        ]]>

Bharath R: In addition to the motivations that you mentioned, this rule also ensures that (in cases where explicit calls to super() are used) the subclass does not access non-static elements (methods and fields) of the super class before the latter has been completely initialized. Again, if the compiler can prove that this isn't violated, there is no need to mandate that super() be the first statement in the constructor.

Rémi Forax: A simple rule could be:

1) before super() or this(), context of the constructor is static so no access to an instance variable is allowed and "this" doesn't exist.

2) after super()/this() context is classical

Futhermore, only one super() or this() is allowed for an execution path.

Rémi

http://mernst.org/blog/rss.xml#definite-initialization 2007-09-29T12:50:00+02:00 On the Luxury of Walking to Work

I've always enjoyed walking to the office on good weather (in Hamburg that means no rain), a habit that I'm determined to continue in Zurich. Here's a rocky rendition of my round about 25 minute commute, shot from the hip every ten seconds, stitched together with iMovie:

On a side note, I'm amazed how, let's say old fashioned the Youtube upload form is. Empty required fields are flagged only after submit/reload, the categories combo and my location map are reset in that moment. Not to mention that I have to invent yet another user name, even if I log in with my Google account - the obvious ones have already been harvested, of course.

Comments

Björn: I expected to see something from Zürich and was quite astouned to recognize Hamburg after the first few frames. Only then, I realized that you are probably not even started your career in switzerland :)

I wish you all the best and good luck.

http://mernst.org/blog/rss.xml#on-the-luxury 2007-09-12T19:35:00+02:00 New Adventures

Friday was my last day atCoreMedia, after a long long time. I've had a great time and got to work with a bunch of people I highly respect. Fare well!

Later this year, come find me in Google Zurich.

Folks, thanks for all the good wishes!

Comments

Sebastian Sanitz: Glückwunsch - Viel Spaß in Zürich!

Björn Bauer: Not only as a supporter I will miss you! Hope you feel good with Google. Wish you all the best!

Neal Gafter: Yay! Please look me up when you're in Mountain View.

Eberhard: Well, I wish you the very best for your next step in your career. Zürich is quite a nice place - so enjoy your time there!

Daniel: Hey Matthias, congratulations! -- daniel

Bharath R: Congratulations Matthias. Hope you keep the Java engineering flag flying high at Google (and continue your excellent writings). All the best.

Gino: I wish you a good time in Zurich, please post the walk to your new work.

Joerg: Alles Gute und guten Start!

Jochen T: Awesome, Google sounds like it would be fun. They're turning evil though, so watch out. But then again, look at me -- I'm working for Microsoft ;-)

http://mernst.org/blog/rss.xml#new-adventures 2007-09-09T10:40:00+02:00 R as in record

Some interesting stuff I've been working on recently: the never ending story of object-relational mapping. Instead of breaking objects into relations, I've looked into mapping annotated classes directly onto Oracle user-defined object types, or basically records.

What you get is: in combination with VARRAYS you get arbitrary nesting and can persist an object tree in a single column. It's about the same usage model as with XML serialization: if you can get used to non-cyclic, non-shared, trees of value objects you'll like it. Performance is great, on the order of several hundred items per second on a standard PC with Oracle/XE. In contrast to just serializing blobs, the database actually understands your schema and we can define indexes on simple properties or easily have the database server maintain index-organized tables for joining against property paths that may be array-valued. All important operations can be performed in single statements, no hidden roundtrips to maintain several tables or open/close lobs are necessary. (Actually not a 100% true, creating an instance of a struct requires to acquire a StructDescriptor first but that can be reused for all instances in the same batch).

No object identity, no sessions to flush, no second-level cache: this is just about making the database understand structured complex values. I'm beginning to think of it as my JAXB for persistence.

I would like to see this on other databases but the support for structs and arrays and their respective JDBC counterparts doesn't look great. On DB2, it seems like the path leads to their XML store anyway.

http://mernst.org/blog/rss.xml#r-as-in-record 2007-09-09T09:59:00+02:00 Knee Jerk

How simple-minded have I become? Patrick Logan holds a hammer to my knee and my immediate jerk: "by no means does that pattern matching look particularly readable to me" and "I can do it just as well in Java". Although that is indeed the case, it isn't Patrick's point and I also thought I had long given up on contests like these. It must be some kind of allergy striking as soon as one topic becomes too popular. So just for your reference:

 codes = new HashMap() {
    {
      put('A', ".-"); put('B', "-..."); put('C', "-.-."); put('D', "-.."); put('E', ".");
      put('F', "..-."); put('G', "--."); put('H', "...."); put('I', ".."); put('J', ".---");
      put('K', "-.-"); put('L', ".-.."); put('M', "--"); put('N', "-."); put('O', "---");
      put('P', ".--."); put('Q', "--.-"); put('R', ".-."); put('S', "..."); put('T', "-");
      put('U', "..-"); put('V', "...-"); put('W', ".--"); put('X', "-..-"); put('Y', "-.--");
      put('Z', "--..");
    }
  };

  public static List decode(String morse) {
    ArrayList result = new ArrayList();
    decode(morse, "", result);
    return result;
  }

  private static void decode(String morse, String partial, ArrayList result) {
    if (morse.length() == 0) {
      result.add(partial);
    } else {
      for (Map.Entry entry : codes.entrySet())
        if (morse.startsWith(entry.getValue()))
          decode(morse.substring(entry.getValue().length()), partial + entry.getKey(), result);
    }
  }

  public static void main(String[] args) {
    System.out.println("decode(\".-\") = " + decode(".-"));
    System.out.println("decode(\"...---..-....-\").contains(\"SOFIA\") = " + decode("...---..-....-").contains("SOFIA"));
    System.out.println("decode(\"...---..-....-\").contains(\"SOPHIA\") = " + decode("...---..-....-").contains("SOPHIA"));
    System.out.println("decode(\"...---..-....-\").contains(\"EUGENIA\") = " + decode("...---..-....-").contains("EUGENIA"));
    System.out.println("decode(\"...---..-....-\").size() = " + decode("...---..-....-").size());
  }
]]>

http://mernst.org/blog/rss.xml#knee-jerk 2007-09-09T09:40:00+02:00 Generics in Reflection

Generics haven't really arrived in relective libraries yet. Most libraries reflect over class objects, not types. For example, while JAXB knows how to marshal a List<X>, when presented a Holder<X> it will reflect over the erased Holder.class and conclude that its value is of type Object, not X. Note to self: whenever designing a reflective library I will try and accept instances of Type from now on. Gafter's Gadget will be helpful as the user's input device.

Here's some handy code that helps you resolve member types in instantiations of generic types:

 classOf(ParameterizedType pt) {
    return (Class) pt.getRawType();
  }

  private static ParameterizedType cook(Type t) {
    if (t instanceof ParameterizedType) return (ParameterizedType) t;
    if (t instanceof Class) {
      Class c = (Class) t;
      Type[] args = new Type[c.getTypeParameters().length];
      for (int i = 0; i < args.length; i++) {
        throw new UnsupportedOperationException("Can't (yet) construct a wildcard argument to " + t);
      }
      return ParameterizedTypeImpl.make(c, args, c.getEnclosingClass());
    }

    // have I missed a case?
    throw new IllegalArgumentException(String.valueOf(t));
  }
        ]]>

Example: what type has iterator() in a LinkedList<String>?



    java.util.Iterator
      ]]>

Note that in their infinite wisdom, Sun have made the type hierarchy into interfaces with private implementations and you're invited to either re-implement them yourself or cross your fingers that sun.reflect.generics.reflectiveObjects.* is not gonna go away. A public concrete class in java.lang.reflect would have done the job, too, IMHO.

Comments

Eamonn McManus @ 2007-08-02 04:38:06 PDT:

Excellent!

Sergey Malenkov and I addressed a similar problem in the JDK. For JavaBeans (Sergey) and MXBeans (me), we needed the ability to know that the type of iterator() in a List<String> is Iterator<String>, etc. The result of our labours is the class com.sun.beans.TypeResolver in JDK7 (and OpenJDK), which you can see at https://openjdk.dev.java.net/source/browse/openjdk/jdk/trunk/j2se/src/share/classes/com/sun/beans/TypeResolver.java?view=markup

There are actually quite a few nasties, of which the lack of concrete classes that you mention is one, though not too important for us inside the JDK. Another is doing the right thing in the presence of raw types (what is the type of iterator() in a raw List?). Also there's a bug in the way generic arrays are handled which means that you can get a GenericArrayType when it should really be a Class like String[].class.

I was surprised to see that the spec for ParameterizedType doesn't say that you can modify the returned array. The good example of Method.getDeclaredAnnotations() and relations should be generalized. In practice it's hard to imagine that an implementation of ParameterizedType would not clone the array before returning it to you, so you don't really need to clone it again.

http://mernst.org/blog/rss.xml#generic-reflection 2007-07-31T10:40:00+02:00 The Easy ETag Issue

Bill de hÓra:

"... Django will calculate an ETag for each request by MD5-hashing the page content, and it'll take care of sending Not Modified responses, if appropriate." That's it - you're done.

Just an observation: a Java web framework that collects all its output into a string before sending it to the client would be considered a poor, if not amateur implementation while we hail its Python competitor for it. Do Javanese have a premature optimization mindset? Indeed most Java projects never reach the point where they need to scale...

How about this approach: in absence of a more efficient alternative, calculate the ETag as a streaming digest of the content and send it as a trailer field. Iff the client sent an If-None-Match, do the calculation first over the content streamed to /dev/null. If matched, send 304. If not matched, do it again, this time streaming to the client. In an MVC architecture, we could reuse the computed model and just repeat the rendering phase. This way we trade memory consumption against additional processing overhead.

Comments

Damon Hart-Davis @ 2007-07-18 09:59:52 PDT:

...assuming that a page isn't very expensive to generate and is guaranteed to be generated exactly the same way twice.

My largest app generates each (JSP) page on the fly each time, and it is very unlikely that the page content would be identical twice in a row (eg I drop some content if the page is taking too long to send to the user).

On the other hand, generating an ETag on the fly is relatively easy, because I know the timestamps of the important constituents of the page.

Thus: the 'run it twice' approach would for me be slow and wrong!

(Matthias) I absolutely agree (that's why I wrote "in absence...")! My proposal only tries to improve on the memory requirements of the out-of-the box solution people are wishing for so badly. The fundamental weaknesses remain: request must actually be performed in order to compute the ETag and it's based on the volatile rendered representation.

http://mernst.org/blog/rss.xml#easy-etag 2007-07-18T13:30:00+02:00 Ids are forever ...

... until you decide to change them all. JRoller seems to just have done so: 150 unread items

http://mernst.org/blog/rss.xml#jroller-ids 2007-07-17T18:30:00+02:00 Maneifa Makes A NativE InterFAce

So I had to interface with Windows' Lsa* APIs in order to retrieve the password of my machine domain account ("$machine.acc") that allows me to authenticate as the HOST/hostname kerberos service. No problemo. Downloaded the Python Win32 extensions and had the string out in like two minutes.

Now do the same in Java. JNI is a no go. I looked at NLink and found myself annotating most parameters with @MarshalAs(PVOID) and marshalling (e.g. LsaUnicodeString) in and out of ByteBuffers myself.

Now don't get me wrong: I don't expect a native interface to implement the most wicked marshalling scheme for me. I would rather that it embraces marshalling on the Java side in an extensible fashion and only implements the absolute minimum in native code. So I came up with this for Win32:

          /**
          * Push $words words from address $stack. Invoke $function.
          * After invocation, put the result at $stack. $retType defines
          * the result type (0: void, 1: 32bit int/ptr, 2: 64bit, 3: float,
          * 4: double. If $isCdecl, clear up the stack afterwards.
          */
          public static native void invoke(
          long function,
          long stack,
          int words,
          int retType,
          boolean isCdecl
          );

With the very first inline assembly code of my life it was easy to implement:

          JNIEXPORT void JNICALL Java_org_mernst_maneifa_Maneifa_invoke
          (JNIEnv *env, jclass c, jlong function, jlong stack,
          jint words, jint retType, jboolean isCdecl)
          {
          int i, w;
          int *buf = (int*) stack;
          void (*f)() = (void (*))function;

          for(i=words-1; i>=0; i--) {
          w = buf[i];
          __asm push w
          }

          switch(retType) {
          case 0: ((void (_stdcall *)())f)(); break;
          case 1: buf[0] = ((int (_stdcall *)())f)(); break;
          case 2: ((long long *)buf)[0] = ((long long (_stdcall *)())f)(); break;
          case 3: ((float*)buf)[0] = ((float (_stdcall *)())f)(); break;
          case 4: ((double*)buf)[0] = ((double (_stdcall *)())f)(); break;
          }

          if(isCdecl) {
          for(i=0; i<words; i++) {
          __asm add esp, 4
          }
          }
          }

Easy enough. EVERYTHING ELSE CAN BE DONE IN JAVA. Well, almost. Three things I need are there but weren't public: ClassLoader#findNative finds symbols in DLLs, DirectByteBuffer#address and new DirectByteBuffer(address, cap) do the pointer fiddling.

For a different architecture this interface might need to be different (wrt registers) but that's ok. It reminds me of the approach SWT takes in contrast to AWT: instead of mapping a uniform interface to the system in tedious native code, rather use your high-level language.

Once it's settled down a little, I might describe what interfaces I use for marshalling. Now it's time for lunch. Oh, as for the name: Maneifa used to work here at CoreMedia.

Update: a MacOS/X Intel port was easy enough. Removed _stdcall and added stack padding.

Comments

Michael Nischt @ 2007-07-11 23:15:14 PDT:

Definitely nice!
I'd love to see some usage examples soon..

http://mernst.org/blog/rss.xml#maneifa-native-binding 2007-07-11T13:01:00+02:00 Generics and Delegation

Alex hates template methods and I think he's right. I prefer delegation, too, but traditionally it has a drawback: the delegate's type is obscured. In Alex' example, I have to cast to the specific ConnectionFactory type if I wanted to use one of its specific methods. That's why I've started to parameterize delegating classes with the type of their delegates like so:

public class ConnectionPool
          <CF extends ConnectionFactory>
          {
          public final
          CF
          connectionFactory;
          public ConnectionPool(CF connectionFactory, ...) { ... }
          ...
          }

          public interface ConnectionFactory { ... }

Now a client can have aCountingConnectionFactory, and use it into a pool without casting: pool.connectionFactory.numberOfCreatedConnections.

If you don't care about the specific type, just use aConnectionPool<?>. Actually maybe the compiler/IDE shouldn't be so picky about raw usage of (especially bounded) parameterized types and just be happy with ConnectionPool. Neal's call for type inference for constructors would help to make construction more concise, too.

You may have also noticed the use of a public final field. I don't quite see the point of hiding information that my client provided in the first place (ducking and covering).

http://mernst.org/blog/rss.xml#generics-and-delegation 2007-07-07T14:41:00+02:00 Bytecode Instrumentation for Jist

I've added some bytecode instrumentation support to Jist using the always excellent ASM library. Agent#instrument(class, methodName, n) returns an Activity that enters and exits as the method is called/returns. Example:

This code will aggregate statistical data about Thread lifetimes. Unfortunately this code contains some problems: it uses Instrumentation#redefineClasses and thus does not interoperate well with other instrumentation agents. Since Jist only pushes aggregation data every once in a while, thread-local event data may be lost when a thread exits. I guess I should inject a forced publish at thread exit and/or add a forcePublish flag to an event.

Comments

Alan @ 2007-07-01 09:30:39 PDT:

Have you looked at retransformClasses? It was added in Java SE 6 to support multiple agents doing BCI.

(Matthias) I have but unfortunately I'm on a Mac. I guess I should download the old old Mustang build that Apple provides... or look at Parallels and an OpenSolaris Image.

http://mernst.org/blog/rss.xml#bytecode-instrumentation-jist 2007-07-01T15:06:00+02:00 Gmail goes Web 1.0

Just pressed "send" and got greeted with "your connection to gmail has expired. please log in again". Mail lost. Me seriously grumpy. I've had this type of UI with Outlook Web Access long enough.

Comments

Lee @ 2007-07-01 19:59:29 PDT:

have you looked in your 'drafts' folder?

(Matthias) Thanks, I have. It contains an old draft that doesn't help.

Bob Lee @ 2007-07-14 09:19:25 PDT:

When this happens, I log in in a different window, and then I can send the message.

http://mernst.org/blog/rss.xml#gmail-oldschool 2007-06-29T15:28:00+02:00 ariadna for Windows x64?

I'm always happy to hear that someone is usingariadna. Now Neil Ferguson is asking for a Windows x64 build of the agent dll which I cannot provide. Can someone help out? Thanks in advance.

http://mernst.org/blog/rss.xml#ariadna-winx64 2007-06-26T22:28:00+02:00 Bad Refactoring

Case of a refactoring gone bad:

          public volatile
          ByteBuffermessage;

          public ByteBuffer getMessage() {
          return duplicate(message);
          }

          private ByteBuffer duplicate(ByteBuffer message) {
          return (message != null) ? message.duplicate() : null;
          }

IDEA 7.0M1b will happily inline #duplicate so that it reads #message twice (Bug):

          public ByteBuffer getMessage() {
          return (
          message
          != null) ?message.duplicate() : null;
          }

Always watch your tool!

Comments

Damon Hart-Davis @ 2007-06-18 04:11:33 PDT:

Yuck!

Now that's really bad, especially with the huge red warning light that is 'volatile'!

Rgds

Damon

http://mernst.org/blog/rss.xml#bad-refactoring 2007-06-18T10:35:00+02:00 Google Groups Virus Detector

Anyone else get this? I got it on two independent computers now, one Mac, one PC. Of course I can't say for sure my computer isn't 0wned but it looks bogus to me.

Comments

Thomas Darimont @ 2007-06-13 05:32:58 PDT

jep, I also see this message when I browse the guice-google group... May be google just want us to solve some Captchas for them ;-)

As in: Human Computation: http://video.google.com/videoplay?docid=-8246463980976635143

Best regards,
Tom

Björn @ 2007-06-14 00:47:12 PDT

Question is, what's meant by "[..] or your network is infected". Eventually you're part of the internet!

http://mernst.org/blog/rss.xml#google-groups-virus-detector 2007-06-13T14:10:00+02:00 Partial APPdates

James Snell and Bill de hÓra point to the PATCH method to handle partial updates of Atom entries through APP. I don't see that need. PUT with the If-Match: header is just enough to do the work on the client side using optimistic concurrency control. The server side patch model:

          Client Server
          --- GET(uri) -->
          <-- ETag, entry ---
          prepare patch
          --- PATCH(uri, patch) -->
          with locked entry
            apply(patch, entry)

can easily be replaced by this:

          Client Server
          --- GET(uri) -->
          <-- ETag, entry ---
          prepare patch
            do
            newEntry = apply(patch, entry)
          

          --- PUT(uri, newEntry, If-Match: etag) -->
          with locked resource
            compare tag ? update : fail
          
          <-- 200 OK or 412 Precondition Failed ---
          if(Precondition Failed)
          --- GET(uri) -->
          <-- ETag, entry ---
          while(Precondition Failed)

The second version requires the same number of roundtrips if there is no conflict and we assume that you always read an entry before patching it. Optimistic concurrency control breaks down when you have heavy concurrent updates on the same entry, something I'd argue to be rather uncommon. It does require the client to handle the patching but I don't think this is an undue burden.

http://mernst.org/blog/rss.xml#partial-appdates 2007-06-10T11:40:00+02:00 Mobile TV Interop Event

We just spent a great week in Berlin at the BMCO Test Camp to test interoperability of OMA BCAST Mobile TV implementations. Results cannot be disclosed but I can say it was an amazing event. Engineers from all over the world brought handsets, encoders, scramblers, smartcards, key management components and whatnot in emulation, prototype up to product quality, together with assorted IDEs, debuggers, analyzers, flashing equipment. And for a broadcasting event, way too many cables :-) Most participants demonstrated a great spirit of cooperation and were eager to help each other, running high on adrenaline to get their systems to interoperate.

Some observations: my NIO/multicast hack has run rock-solid for days except for one VM freeze. Hot Swapping is a real winner during testing. It's sad to see people programming ad-hoc HTTP clients in C (on a server!) - of course it didn't work. In a stark contrast to Java One, almost noone brought a Mac. Some people can go for days with almost no sleep (not me).

We are very happy with our results. German engineering paid off! Plus we had a lot of fun.

http://mernst.org/blog/rss.xml#mobile-tv-interop-event 2007-06-08T10:46:00+02:00 Type Operators

I was talking to someone about the type system of Tycoon2, our research language back in school, and I mentioned it had a higher order type system with type operators. So they asked me what are type operators good for. And I couldn't say.

Today I could think of something. I often wish for a Map whose key and value types aren't fixed but instead dependent on one another. Example: an XmlTypeSystem contains a map from class objects to marshallers. Given a Class<T> I can get a Marshaller<T>. Signature:
<T> Marshaller<T> getMarshaller(Class<T> clazz).

Trying to implement this method using a hashmap I cannot get away without a compiler warning since
Map<Class<?>, Marshaller<?>> marshallerByClass
is the best I can come up with in Java.

So actually I would like a Map with a signature like this:

          interface DMap<T, KeyOperator extends <T>Object, ValueOperator extends <T>Object> {
          <K extends T> ValueOperator<K> get(KeyOperator<K> key);
          }

          // or rather
          interface DMap<T,
          KeyOperator extends <? super T>Object,
          ValueOperator extends <? super T>Object>

To be read as such: A DMap is parameterized by a base type T and two type operators that map any type below T to a subtype of Object. The get method is parameterized by a type K below T and maps between the application of the first type op to the application of the second. This way we can assign a meaningful type to our marshaller map:

          DMap<Object, :T=>Class<T>, :T=>Marshaller<T>> marshallerByClass;

          // explicit type arguments for demonstration purposes only
          marshallerByClass.<String>get(String.class) => Marshaller<String>
          marshallerByClass.<Feed>get(Feed.class) => Marshaller<Feed>

So this is what type operators can be good for. I do not think they have a broad application though and I do NOT propose we do anything like this to Java.

http://mernst.org/blog/rss.xml#type-operators 2007-05-30T09:09:00+02:00 Comments

I've finally added some crude Response-O-Matic backed comment forms to my blog. Even though they're moderated I hope this lowers the entry barrier for you. Keep them rolling in!

http://mernst.org/blog/rss.xml#comments 2007-05-28T13:04:00+02:00 Cascading: First Patch

Here's a first patch against b13 that survives simple tests. A number of problems remain:

It only cascades method invocations with explicit receiver expressions. Javac does not explicitly compute the receiver at typecheck time for implicit this or outer this targets and so does not have their types ready.
super.m() has the type of the superclass, not of this.

          --- comp/Attr.java.orig.txt 2007-05-26 13:55:53.000000000 +0200
            +++ comp/Attr.java 2007-05-26 14:15:51.000000000 +0200
          
          

          --- comp/TransTypes.java.orig.txt 2007-05-26 13:56:46.000000000 +0200
            +++ comp/TransTypes.java 2007-05-26 14:15:51.000000000 +0200
          
          

          --- jvm/Gen.java.orig.txt 2007-05-26 14:00:07.000000000 +0200
            +++ jvm/Gen.java 2007-05-26 14:16:38.000000000 +0200

http://mernst.org/blog/rss.xml#Cascading-First-Patch 2007-05-26T14:21:00+02:00 Cascading: A Modest Language Proposal

I've hinted at this before - a discussion with my colleague Stefan Aust brought the topic back up: I'm proposing to add cascading for non-static method invocations of type void to the Java language.

Informal Discussion

I propose to make invocations of void methods usable as expressions. The type of a method invocation o.m(args) where m resolves to a void method would not be void but instead the type of o. The result of the method invocation could either be returned, be assigned to a variable, be used as a method argument, or be the target of another method invocation. Cascaded method invocations of void methods, especially of property setters would be possible:

final JFrame jFrame = new JFrame()
          .setBounds(100, 100, 100, 100)
          .setVisible(true);

Implementation: the change can be implemented solely by changing the way javac compiles method invocations. This is advantageous because no old code needs to be recompiled: after computing the receiver, it is simply duplicated on the stack. Either it is used as the result value or it is discarded as in JLS 14.8 "Expression Statements".

Required Specification Changes

JLS 15.12.2.6 Method Result and Throws Types: the type of a method invocation expression needs adjustment. The different forms of method invocation need to be discussed - either the type of the receiver expression, this or outer this.

JLS 15.12.4.1 Compute Target Reference (If Necessary): append "if the method returns void and there is a target reference, the target reference is retained and will become the result value".

There seems to be no wording about the result value in 15.12.4.5 "Create Frame, Synchronize, Transfer Control".

Required Compiler Changes

I don't have the code handy (just browsing) and it looks like two small changes to Gen#visitApply (Gen.java) and to MemberItem#invoke (Items.java) should be enough. Update: I forgot the necessary change to the type checker: that should be TransTypes#visitApply (TransTypes.java).

Conclusion

So far, a void method invocation can only be used as an expression statement, so the semantics of existing valid code remain unchanged. Setter invocation orgies can now be made a little nicer.

If I find time I'll try my first KSL hack. What do you think?

Comments

Rémi Forax @ May 23, 2007 8:19 PM:

Hi matthias,

the typechecker of javac is implemented in Attr and not in TransTypes. Transtypes pass does the erasure (and some tricks like inserting inner class backreference etc.)

cheers, Rémi

Anonymous @ 2007-05-31 09:57:01 PDT

Thats a feature I've thought about too. It seems really logical and can't break old code. There is one disadvantage though. The documentation of the signature sends contradictory signals. And anyway, imagine a immutable class. No use there since that already should do what you propose, but in a mutable class, things can get strange. Is that method returning this, or another object? Not immediately apparent.

Peter Kehl @ 2007-06-18 03:00:24 PDT:

This looks good and usefull.

RE Anonymous - immutable objects: Any methods on immutable objects that return void won't modify the object (by the definition of immutable). So any void method won't have any side effects on the this object.

Then it's 100% legit to have a sequence of such void calls, and cascading them makes sense. Of course, void methods on immutable object won't do a big deal to the object itself - they may either validate the object or register it in some static/threadlocal structures.

You may have a setter/modification method on an immutable object, but that method must return a new object based on this. So the method won't return void - therefore void cascading won't apply to it.

http://mernst.org/blog/rss.xml#AModestLanguageProposal 2007-05-22T23:28:00+02:00 Wave Particle --- Particle Wave?

Ever so often I'm utterly confused over a design choice. Two alternative designs flipping back and forth in my mind with their respective pros and cons and I loathe wasting time over them. Other people are much more pragmatic. Examples in no particular order:

Embedding vs Embedded: Be a webapp or embed Jetty? Is the primary purpose of your app to speak HTTP or does it just happen to?
Synchronization vs Active Object: Thinking of it I couldn't really name you a reason to submit tasks to an object instead of having the relevant method synchronized - you have the same set of challenges and actually extra context switches. Nonetheless pooled-thread-per-object looks attractive and simple.
Mutable Object vs Replacement: Instead of changing an object under synchronization I like setting up and dropping in a replacement instead which can often be done through a volatile write. Of course that means the parent must be mutable and thus the question repeats at a different level.
Remote Service vs Multiply Installed Jar: Redundancy is dangerous. A domain rule should not be implemented twice - but does that mean that only one component in my distributed system may "physically" contain it? Or is it "ok" to have the same jar in many places? Can I promise to update it everywhere? Versioning of algorithms may help.
Multithreaded Service vs Decicated Objects: For those how aren't afraid of garbage collection anymore: how do you choose? service.service(parameters) vs instance.service()? Oh, more confusion: Spring scopes make it easy to inject a proxy representing a "current" entity into a singleton. So service could have a field representing the "current context". Ugh.
instance.destroy() vs factory.destroy(instance): More general: object.doIt() vs service.doItTo(object)
NIH vs dependency hell: Is byteArrayToHexString worth a dependency? What if it uses a different logging framework?

I could go on forever... Are there other examples you're getting grey hairs about? Many of these have been discussed before under various names. Obviously, the pattern books don't have an answer either. They just describe the alternatives more eloquently. I guess the truth is: you just have to choose and it helps to be consistent throughout one piece of software. Maybe you know better afterwards.

http://mernst.org/blog/rss.xml#WaveParticleParticleWave 2007-05-05T18:02:00+02:00 DLR

Microsoft announcesDLR, the Dynamic Language Runtime. More details to come so stay tuned on Jim's blog. If I'm not mistaken it's a pure user-space implementation right now that tries to standardize functionality required for implementations of dynamic language interpreters. Once those interfaces stabilize Microsoft could go ahead and implement optimizations below in future CLR versions.

A crucial one of these optimizations, namely VM support for dynamic method invocations, was postulated by Gilad Bracha inJSR 292. Bears the question, what ever happened to this JSR? Can we expect to hear any more in this direction?

http://mernst.org/blog/rss.xml#DLR 2007-05-01T21:02:00+02:00 JavaOne Secrets

Some time it's always gonna be the first time, and this year I'm going to have my JavaOne initiation. So I built my session schedule and I'm hoping to turn some of my digital encounters into real ones. If you want to meet up, too, or have any tips for the best experience and be it "bring your own sandwich" I would be more than glad. Thanks and see you there!

http://mernst.org/blog/rss.xml#JavaOne-Secrets 2007-04-27T23:31:00+02:00 Wieder Runterkommen

Blood pressure once again at 180. Just who wants to punish me...

Exhibit A

A constructor with a java.lang.Class argument wouldn't harm. Update: to make things worse, there's a "fix by extension" available: "javax.activation.ActivationDataFlavor is a special subclass of java.awt.datatransfer.DataFlavor that allows to set all three values stored by the DataFlavor class via a new constructor." Thanks to Rémi (for the pointer, not the implementation!).

Exhibit B

I wonder when we'll finally get an SPI for implementing java.lang.Integer.

Exhibit C (fixed in Java 6)

I bet that people who understandably avoid this extra cruft are responsible for half of all encoding errors, the other half being those who do not understand encoding. Oh didn't someone want to build CharSet#UTF8()?

Epilogue

If you do decide to use checked exceptions, please go the extra way and provide alternatives that are statically known not to throw. Adrenaline is bad for my health. Blogging helps a little, and this:

 T mute({=>T throws Exception} block) {
  try {
    return block.invoke();
  } catch(RuntimeException e) {
    throw e;
  } catch(Exception e) {
    throw new RuntimeException(e);
  }
}
        ]]>

http://mernst.org/blog/rss.xml#Runterkommen 2007-04-20T18:31:00+02:00 Annotations are not the API

My 2cents towardsmocked annotations: when designing a library, always design a proper model in plain old classes. Make this model a public API and your library a runtime that interprets model instances. Use annotations only as a shorthand, as metadata from which a model instance can be derived. But leave the door open for alternative, programmatic access.

For example, an experimental XML binding API I've played with exposes things like the following:

: binding of a complex type to java objects of type V
JavaProperty: getting/setting values of type C into an object of type P
XmlComplexTypeBinding: binding of instances of complex type C into parent instances of type P
...
        ]]>

Apart from the programmatic aspect, I find that this model is the most precise way to document how the library works. (JAXB does have a model but it's hidden away as implementation detail). My XML binder will never be as complete or stable as JAXB or do XML mapping any "better" but it lays its cards open about how exactly it bridges between XML and Java.

Of course, this all applies just as well to other forms of configuration but we already knew that ...

http://mernst.org/blog/rss.xml#Annotations-are-not-the-API 2007-04-11T09:54:00+02:00 Oh those swedish product names

This one might not sell as good.

fartyg

Johan Liesén: Just wanted to let you know that "fartyg" is the Swedish word for ship or large boat.

http://mernst.org/blog/rss.xml#Those-Swedish-Product-Names 2007-03-28T23:55:00+02:00 Scrolling and Paging

Scrolling and paging: both well accepted and useful forms of visiting an amount of data bigger than your screen. What drives me nuts is that they are often employed in the same application. Click-scroll-click-scroll until I'm steaming. Microsoft Outlook Web Access, Google Mail, Google Search, my Xing contact list are all guilty of that. Letting me configure how many items I want per page is a poor replacement for what I really want: show as many items as fit, stupid. Second worst crime, by the way: changing the position of the "next" button from page to page.

http://mernst.org/blog/rss.xml#Scrolling-and-Paging 2007-03-27T19:33:00+01:00 Fermé pour travaux

Musée national Fernand Léger, Biot, Musée Picasso, Antibes, Musée Matisse, Nice. At least we got some sun.

http://mernst.org/blog/rss.xml#Ferme-au-travaux 2007-03-27T19:30:00+01:00 Pattern, transient final, serialization

I tried to find a better fix for Bug ID: 6436458 that would eliminate synchronization altogether. Basically it boils down to the problem of serializing transient final fields. In this case some fields of the Pattern class hold the compilation information of the regex that we don't want serialized but rather reconstructed on deserialization. But we can't assign them in #readObject since they're final. And we want them final so that Pattern isinitialization safe. Sounds like a Catch-22.

The solution is #readResolve. We can deserialize an intermediate Pattern instance that holds the pattern string and the flags - the transients being null - and then resolve it to a compiled instance using Pattern#compile. #readResolve is not recursive, so we won't end up in an infinite loop.

Mission accomplished: initialization safety, slim serialized form and no synchronization altogether. Patch.

http://mernst.org/blog/rss.xml#Pattern-transient-final-serialization 2007-03-14T17:00:00+01:00 Get the Jist

Toying with dtracing Java apps through JNI was fun but had a number of downsides:

it would only run on Solaris
I had to jump through JNI hoops
there was only one probe
I only had strings
D is not exactly good for building abstractions. I found myself repeating a lot of ...-entry { self->x[y]=... } boilerplate.

So I'm trying a new approach: Jist, Java InSTrumentation. It is heavily DTrace inspired but completely implemented in Java, as a simple 55k no-dependency jar. With Jist, you can define low overhead probes in your code that can be dynamically enabled at runtime. As with DTrace, probes are merely the source of information - you define separately how you want to aggregate the data and correlate probes that were written by independent authors and just happen to run in the same code path. Probes and aggregation functions are thoroughly typed.

In comparison to DTrace, what is Jist not ? It only captures what happens in the JVM, it does not come with all-purpose providers like syscall, it is not easily invoked from the outside on an arbitrary Java process (agent attachment notwithstanding) and there is no garantueed safety from misbehaving aggregations.

Defining and Firing Probes

In contrast to DTrace, Jist distinguishes two kinds of probes: activities and events. Activities span an interval in the program execution, and can be nested; the stack of activity contexts is managed by Jist. This is very similar to the -entry and -return convention in DTrace where you often build contexts in a thread-local map. Events on the other hand are just points in the program execution. Both types of probes can signal detail data. This is how you define and use probes:

 Request = new Activity("Request");

  public void handle(String request) {
    Request.enter("request");
    try {
      ...
    } finally {
      Request.exit();
    }
  }
}

class ConnectionPool {
  public static final Event ConnectionOpened = new Event("ConnectionOpened");

  {
    ...
    ConnectionOpened.fire(dataSource);
  }
}
        ]]>

When an activity starts, a Context record is allocated and pushed onto the thread-local stack. As of now, there is now way to disable this, but I have the feeling that this cost is negligable in most cases - otherwise I'll employ a small thread-local cache. Events on the other hand are disabled by default. Every time an event fires it checks for enabledness through a volatile - this should be ok for most cases. As with logging, in case the event argument is expensive to compute, you can check beforehand: if(ConnectionOpened.isEnabled()) ... .

Simple Aggregation

One of the cornerstones of DTrace is in-place aggregation. Instead of saving all instrumentation data for offline aggregation, you condense the data as much as possible in place and save only the aggregated data. Especially, an aggregation function should have the property that you can feed it data piecemeal: Count, Average, Sum, are all examples of such functions. Jist, like DTrace, capitalizes on the aggregation function property in that it aggregates incrementally in the moment an event happens, and especially, it aggregates events thread-locally. Less frequently, the thread-local aggregation values are aggregated into a global view. Both incremental and thread-local aggregation ensure that the instrumentation overhead stays low. An example: the following code defines an aggregation that will simply count how many times a connection was opened:

 aggregation = Aggregation.of(Count.ofEvents, ConnectionPool.ConnectionOpened);
aggregation.enable();
while (true) {
  Thread.sleep(1000);
  System.out.println(aggregation);
}
        ]]>

On #enable, the aggregation will attach itself as a listener on the Event. When the event fires, it is routed to the aggregation which will increment a thread-local counter. Only sometimes will the counter be propagated to the global counter. In that codepath we only have one volatile-get (read listeners), two invoke-interface, a thread-local get and a counter increment. Every once in a while do we push the counter onto a ConcurrentLinkedQueue. On my MacBook (two threads repeatedly firing events, 1.8Ghz Core2Duo, -server) that amounts to:

40 million events per second amounts to 100 cycles per event, a number that stays stable at 20 threads, too. That's pretty good.

Correlation

Now, incrementing counters is not especially interesting and can be done more easily. It gets more interesting to correlate different, independent probes. For example, we would like to know the number of connection openings per request uri . For that, we can employ a Mapping aggregation function:

> aggregation =
  Aggregation.of(Mapping.of(Server.Request, Count.ofEvents), ConnectionPool.ConnectionOpened);

number of events by Request (updated each ConnectionOpened) = {/uri2=2285635, /uri1=4571279}
number of events by Request (updated each ConnectionOpened) = {/uri2=1833648, /uri1=3667297}
number of events by Request (updated each ConnectionOpened) = {/uri2=1842779, /uri1=3685558}
number of events by Request (updated each ConnectionOpened) = {/uri2=1974581, /uri1=3949161}
        ]]>

This shows the real potential: we can see that requests to /uri1 open twice as many connections as those to /uri2. That's something no profiler and no DTrace will tell you. Another teaser (numbers highly dubious):

> aggregation =
  Aggregation.of(Mapping.of(Server.Request, Statistics.of(Duration.of(Server.Request))), Server.Request.Exit);

statistics for duration in NANOSECONDS of Request by Request (updated each exit from Request) =
  {/uri2={|sample|=1615081, sum=7.43407E8, min=0, max=64524000, avg=460.29084609378725, stddev=96321.15405584917},
   /uri1={|sample|=1615084, sum=9.26188E8, min=0, max=75908000, avg=573.4611945880214, stddev=123748.53849281174}}
statistics for duration in NANOSECONDS of Request by Request (updated each exit from Request) =
  {/uri2={|sample|=1602764, sum=9.54039E8, min=0, max=100171000, avg=595.246087384044, stddev=145823.88277769994},
   /uri1={|sample|=1602763, sum=8.40672E8, min=0, max=77095000, avg=524.5142294899496, stddev=118100.06216280891}}
...
      ]]>

Outlook

One of the perks of DTrace is that you can attach to a process and run D scripts that were not compiled into that process. I'm still pondering how to do that. A scripting solution for the definition of aggregations might be interesting, one that is defined on dynamically loaded agents or something JMX-based. Right now, you'll have to bake in aggregations and export them somehow, i.e. though JMX. Some simple class-redefinition to inject probes into foreign code would be nice, too.

To some extent I hope that aggregation of independent events will reduce the time you spend sorting log records together (".. ok, here it starts .. now lets look for the same thread, ok down here, ... go up one page again").

Get it

Now that you're mouthwatering enough, go download Jist here. Or get the most recent version: hg clone static-http://mernst.org/jist .

http://mernst.org/blog/rss.xml#Get-the-Jist 2007-03-04T12:24:00+01:00 Experience comes with age

Germany's facing fierce resistance about moving the retirement age up to 67. Sun thinks a little more progressively in this context :-)

Years of experience: 55

Anyone? http://www.sun.com/corp_emp/search.cgi?req=548725&p=

http://mernst.org/blog/rss.xml#Experience-comes-with-age 2007-02-03T15:38:00+01:00 SDT probes for Java

The real power of DTrace lies in its ability to correlate independently provided data in ways unforeseen by the developer. Too often have I been asked things like "nice cache eviction statistics but how can I see which algorithm, query, user is responsible for this?". D is the SQL of instrumentation: it enables ad-hoc queries over unprepared data.

As of today, DTracing Java is cool as long as you're doing relatively general performance analysis of the software - questions about "which method?", "which class?", "which monitor?" can be answered - to open the ultimate insight into application space ("which query?", "which dataset?", "which url?") we need to work a little harder.

I'm interested in two kind of probes: the ones that establish a thread-local context for a possibly lengthy period, i.e. "starting request to /url", "starting transform foo.xsl", "importing x" and the corresponding exit in order to correlate these with events such as "evicting y" or "creating temp-file z".

A simple DTrace provider for Java applications

So I want to do something like this (obviously not in one class):

 REQUEST = new dtrace.DTraceContext("request");
  static final dtrace.DTraceContext TEMPLATE = new dtrace.DTraceContext("template");
  static final dtrace.DTraceContext EL = new dtrace.DTraceContext("el");

  public static void main(String[] args) throws Exception {
    while(true) {
      REQUEST.push("/index.html");
        TEMPLATE.push("outer.jsp");
          // do some real work here
          TEMPLATE.push("inner.jsp");
            EL.push("${something.wild}");
            EL.pop("${something.wild}");
          TEMPLATE.pop();
          // do some real work here
        TEMPLATE.pop();
      REQUEST.pop();
      Thread.sleep(1000);
    }
  }
	]]>

The DTraceContext class looks as follows. It does some stack management and ultimately fires entry and exit events through JNI:

 {
    private static native void init();
    private static native void fireEntry(String name, Object top, Object previous);
    private static native void fireReturn(String name, Object top, Object previous);

    public final String name;
    private final ThreadLocal> value = new ThreadLocal>() {
      protected ArrayList initialValue() {
	return new ArrayList(5);
      }
    };

    public DTraceContext(String name) {
      this.name = name;
    }

    public T get() {
      ArrayList ts = value.get();
      return ts.isEmpty() ? null : ts.get(ts.size()-1);
    }

    public void push(T value) {
      T previous = get();
      this.value.get().add(value);
      fireEntry(name, value, previous);
    }

    public void pop() {
      ArrayList values = this.value.get();
      T value = values.remove(values.size() - 1);
      fireReturn(name, value, get());
    }

    static {
      System.loadLibrary("dtracecontext");
      init();
    }
  }

  JNI code:

  // error checking yadda yadda
  JNIEXPORT void JNICALL Java_dtrace_DTraceContext_fireEntry
    (JNIEnv *env, jclass class, jstring name, jobject top, jobject previous)
  {
    if(DTRACECONTEXT_ENTRY_ENABLED()) {
      const jbyte *namestr = (name==NULL) ? NULL : (*env)->GetStringUTFChars(env, name, NULL);

      jobject tops = (top==NULL) ? NULL : (*env)->CallObjectMethod(env, top, tostring);
      const jbyte *topstr = (tops==NULL) ? NULL : (*env)->GetStringUTFChars(env, tops, NULL);

      jobject previouss = (previous==NULL) ? NULL : (*env)->CallObjectMethod(env, previous, tostring);
      const jbyte *previousstr = (previouss==NULL) ? NULL : (*env)->GetStringUTFChars(env, previouss, NULL);

      DTRACECONTEXT_ENTRY(namestr, topstr, previousstr);

      if(previousstr != NULL) (*env)->ReleaseStringUTFChars(env, previouss, previousstr);
      if(previouss != NULL) (*env)->DeleteLocalRef(env, previouss);

      if(topstr != NULL) (*env)->ReleaseStringUTFChars(env, tops, topstr);
      if(tops != NULL) (*env)->DeleteLocalRef(env, tops);

      if(namestr != NULL) (*env)->ReleaseStringUTFChars(env, name, namestr);
    }
  }

  // ... same for pop
  ]]>

Note how the native methods take Objects but the probes carry strings: I do a callback to #toString and extract the UTF-8 sequences only after checking whether the probe is enabled. The overhead of the JNI calls for disabled probes seems good if not used in tight loops (like 22 ns per call on my machine) - if it turns out to be a problem I'll need to hoist the test into the Java code. The DTRACECONTEXT_ENTRY is a macro generated by dtrace from my provider description - see "statically defined tracing" - basically we compile a call to a dummy external C function into our code, and postprocess it with dtrace that will patch this code sequence in order to disable/enable the probe. The provider:

          provider dtracecontext {
          // args (contextname, context entered, previously active context)
          probe ENTRY(char *, char *, char *);
          // args (contextname, context left, context returned to)
          probe RETURN(char *, char *, char *);
          };

A note to the naming: I tried to name the probes "entry" and "return" first - I thought they would be recognized by the "-F" (indent flow) option of dtrace but unfortunately that is not so; and return is a dtrace keyword. So the first thing that came to my mind was capitalization - I will most probably change that.

You will have noticed that this provider is very static - there's only two probes for the whole Java process and the actual type comes as a string argument. All application probes are disabled and enabled together which will have a bigger performance impact - maybe I can come up with some dynamic compilation/library generation scheme later.

I haven't collected really good examples from real software yet - that's why there were no actual dtrace scripts in action but I hope you could see the possibilities. I think the potential is huge. The demise of logging? That would be too good to be true...

http://mernst.org/blog/rss.xml#SDT-probes-for-Java 2007-02-03T12:45:00+01:00 Euphemism of the Day

"HTML allowed" actually means "Comply with those angle brackets or I will swallow all your paragraphs. You have been warned.".

http://mernst.org/blog/rss.xml#Euphemism-of-the-Day 2007-01-17T17:21:00+01:00 Solaris 10 Privileges

The Solaris 10 kernel supports a fine-grained security model, based on "least privilege" . For example, in order to run a server on port 80 you don't need to be root. The PRIV_NET_PRIVADDR privilege is sufficient.

The same applies to coniguration of the IPsec security associations and policy entries. Unfortunately, the respective command line tools 'ipseckey' and 'ipsecconf' do check for uid 0 before they even try to talk to the kernel.

Luckily, w/ Open Solaris being open source I could quickly fix that. Commented out the checks, did the ppriv thing, et voila, a normal user with SYS_NET_CONFIG can now do what I want. Neat!

Update: I've learnt that I shouldn't use privileges for that directly (will become messy over time) but rather a profile that defines the right execution attributes such as the "Network Security" profile. Yet that profile doesn't work right now for the outlined reasons (bug filed). Workaround: use your own profile and exec_attr with uid=0. No need to recompile the tools - just assign that profile to your user.

http://mernst.org/blog/rss.xml#Solaris-10-Privileges 2007-01-17T17:20:00+01:00 2007-02-03T12:49:00+01:00 Atom in Style

WARNING: this is an experiment and may well wreak havoc on your blog-reading experience.

I'm cheap and lazy and thus I'm not willing to pay for PHP hosting nor would I want to administer a blog system. Yet I want to host my blog on my own domain. I've been using Thingamablog until now but their editor has serious limitations, especially when working with pre-formatted code fragments. And I can only edit the blog from the one machine where the database is.

I have thus changed my blog to one major source file in the Atom format. I'm typing this entry directly in atom format into IDEA's XML editor - I have a nice live template to insert the minimum scaffolding. When I finish, this source file is then only post-processed (w/ JAXB) to derive "updated", "link", "id" and "a" elements (a process that I could still do by hand if that little piece of Java code is not handy). I have left all old ids intact and hope that they won't be re-syndicated, even if the feed format changes from RSS to Atom (this file will still be called rss.xml).

No HTML is generated, the feed is directly styled with CSS, not even XSL. A nice learning experience and the capabilities of CSS continue to impress me. As a potential downside, I'm now including all my posts in one file. Certainly more than before but not too huge if all participants make use of the relevant HTTP mechanisms.

I've tried to be as standards compliant as possible but your mileage with various readers may vary:

Firefox renders this feed best. I had to include a lengthy comment header in order to actually enable the stylesheet.
IE 7 fairs worst. It won't use the stylesheet (someone? DOCTYPE?) but is fooled by the comment. So you'll only see the text() nodes.
(Fixed) Both Omea Reader and Google Reader have issues with "bleeding" anchor tags (<a../>). Obviously they're embedding the XHTML pretty directly into quirks HTML without reprocessing it.
RSS Bandit and BottomFeeder look fine.
Syndication to PlanetJDK has stopped :-(

The CSS element selectors don't seem to care about namespaces. I don't know whether this will continue to work with CSS3 Namespace Module .

The sidebar on the right is implemented as a dedicated feed entry that is styled differently.

So much from the low-end side of blog software - please let me know what problems you're having and whether they prohibit you from reading this journal any longer. Enhancements to the stylesheet are welcome, too. You know where you can find it.

http://mernst.org/blog/rss.xml#Atom-in-Style 2007-01-16T23:37:00+01:00 Maybe it's time to think a little further

If you're bored of the "property" word, please skip.

If not, you may note that some of the REST principles apply to properties:

We expect to read a property without visibly affecting the object. Reading properties is supposed to be a safe operation . We also expect setting the same value for a second time to have no tangible effect. Setting properties is supposed to be idempotent . The values we see from properties may or may not be the actual state of an object, they are merely a representation .

In a distributed, concurrent environment, atomicity also is of concern. I feel progressively unsure about using an array of setter invocations for reconfiguration of a service, e.g. through JMX. The REST way of PUTting an entire representation of the -desired- resource's state looks a lot safer to me. It gives the object a chance to implement atomicity. Recently, we built a service that is reconfigurable through PUTting an entire configuration document, and this approach feels really solid compared to fine grained remoted JMX-based management. The admin UI operates not on the object but on a representation of its state and then submits it to take effect.

What is my point? I believe that applying REST principles to OO programming lends itself to separate the "life" objects with their logic from "dumb" state representations that can be manipulated or bound to. We might not want to bind UI elements directly to property methods. We might rather go and implement representations through simple structs w/ fields that are also nicely transported, persisted, versioned, ... Putting that state into side-effects would be of no concern at this stage. Useful linguistic support would be compile-time checked field references so we can bind some UI element to a field of the state representation.

There's a ton of open questions - no question about that. And this may be a pile of bull. But thinking about how representations (external ones, too), state, binding, ... all fit together is definetely a larger topic than just syntactic sugar around field declarations.

Comment by Eamonn McManus @ Jan 15, 2007 2:58 PM:

Atomicity and JMX

Hi Matthias,

Re your entry "May it's time to think a little further", indeed I think configuring a JMX object with a sequence of setAttribute operations is suboptimal. There are a few ways to avoid this within the JMX API:

As of Java SE 6, you can use an MXBean with a single complex attribute representing the collection of values to set. This is basically a "struct". Because it is mapped to a standard Java type (CompositeData) the potential problems with versioning and serialization are reduced.
In earlier versions you can still use the MBeanServer.setAttributes operation to set a bunch of attributes at the same time. But there's no way to require that the attributes all be set together, and there's some work needed to make the set atomic. It doesn't have to be very much work though, for instance using this idea from Nick Stephen.
An alternative to both of the above is to have an MBean operation rather than an attribute, the parameters to which are the values to set. This means there's no longer a way to set just some of the values, which overcomes the chief drawback of the preceding approach; it also works better with management consoles such as JConsole than either of the two alternatives. The operation can be called setSomething without ambiguity.

I'm not sure how this fits into the wider debate about properties, though. I think the questions that arise when setting things remotely are distinct from the questions that arise for local objects, though they do overlap. In particular the idea of extending the JavaBeans conventions so you can have a setter that sets more than one thing seems interesting. One of the advantages of immutable objects is that you can't get an inconsistent object state by an unfortunate combination of setters, but the same advantage could be had by only providing setters that set enough properties to retain consistency.

http://www.mernst.org/blog/archives/01-01-2007_01-31-2007.html#151 2007-01-16T20:51:30.345+01:00 Aren't properties objects?

Surface syntax aside, implementation-wise a property is all object-like and I'm surprised we're still talking generating methods according to a naming convention. A property *has* setter and getter, it can be referenced, especially in order to bind it to a UI component, you can attach listeners, ...

ValueModel anyone?

          public final RWProperty<String> name = new Slot<String>();

          mernst.name.get();
          mernst.name.set("matthias");
          mernst.name.attachListener(...);
          label.bind(mernst.name);

Comment by Stephen Colebourne @ Jan 13, 2007 7:28 PM:

Hi,

See my blogs about this
http://www.jroller.com/page/scolebourne?entry=java_7_property_objects
http://www.jroller.com/page/scolebourne?entry=more_detail_on_property_literals

And responses including
http://blogs.sun.com/abuckley/entry/properties_in_java

Also
http://joda-beans.sf.net

As for whether
> mernst.name.get();
> mernst.name.set("matthias");

should be exposed to the application code, I'm not yet convinced that isn't too big a leap from the current conventions. It would be nice not to have get/set methods though.

http://www.mernst.org/blog/archives/01-01-2007_01-31-2007.html#143 2007-01-12T08:58:59+01:00 Understanding the threats posed by XmlHttpRequest vs SCRIPT

Stefan agreed and received comments that matters just ain't so.

I admit that my experience with XSS is not extensive, so I went reading, too. I'll try and formulate the understanding I've reached now (main source being Jeremiah Grossman ):

The fundamental threat of XSS is that a script from site A (attacker) has the browser talk to site V (victim) (using V's cookies), having V reveal some sensitive data, and consequently transfers this data to A. That's why cross-site XHR is prohibited. In contrast to XHR, SCRIPT tags cannot read the response and thus not make any malicious use of it. Instead, the SCRIPT response is evaluated as executable Javascript code by the browser. If V never responds with the data in an executable format that data is safe from A. Thus SCRIPT tags are allowed to load scripts from a different site.

Now it has become popular to return data in JSON because it is so accessible from Javascript code. When this data is evaluated by a SCRIPT it gets dangerous since a clever script can intercept the inocouusly looking array- and object constructors and intercept the data - or easier, instruct the server to pass the data to a callback function, which seems to be popular, too.

Conclusion: only data in executable format, returned on a GET authenticated through a cookie, is in danger of being hijacked through a SCRIPT tag. But that's more a fault of the server than the browser security model.

http://www.mernst.org/blog/archives/01-01-2007_01-31-2007.html#140 2007-01-06T15:10:43+01:00 Two notes on current controversies

On the JSON v XML debate : what good is a same-domain requirement for XmlHttpRequest if you can do a GET on any host via a SCRIPT element? It's like saying "applets can't contact arbitrary hosts through sockets but URLConnections are ok".

On Closures and LINQ : LINQ introduces the notion of expressions. Restricting the contents of the curly braces to expressions might seem like an interesting approach. It avoids transfer-of-control-problems. But the most interesting part: LINQ allows you to reflect over an expression structure and gives you some interesting options to optimize the query.

http://www.mernst.org/blog/archives/01-01-2007_01-31-2007.html#138 2007-01-05T09:37:16+01:00 Three things hardly anyone wants to know about me

Thank you, Axel . What can I say?

When I helped in the development of the Sather compiler, I analyzed and fixed bugs purely by studying the code - my machine was way too small to bootstrap the compiler.

I met my wife at Marcello's in the Castro district in San Francisco. With funky shoes and hair she thought I was gay and thus safe to talk to - it turned out I wasn't gay but european.

The first linux I installed came by mail-order and was a stack of 3.5'' wrapped in packing tape. I wanted it because of xeyes.

Eberhard , Alan and Grabowski .

http://www.mernst.org/blog/archives/12-01-2006_12-31-2006.html#137 2006-12-21T10:15:07+01:00 NIO & IP Multicast

The common wisdom: the two of them don't go together. Some official Java version in 2008 might have support, some open source derivative earlier. For those of us that need to do multicast today, here's the version that works for me with 5.0u9, Win XP, IPV4. CAVEAT: ugliness reigns and this will most certainly break with newer or non-Sun versions.

In short: we wrap the DatagramChannel's fd in a manually created PlainDatagramSocketImpl, join the multicast group and let go of the socket again:

          // UGLY UGLY HACK: multicast support for NIO
          // create a temporary instanceof PlainDatagramSocket, set its fd and configure it
          Constructor<? extends DatagramSocketImpl> c =
          (Constructor<? extends DatagramSocketImpl>)
          Class.forName("java.net.PlainDatagramSocketImpl").getDeclaredConstructor();

          c.setAccessible(true);
          DatagramSocketImpl socketImpl = c.newInstance();

          Field channelFd = Class.forName("sun.nio.ch.DatagramChannelImpl").getDeclaredField("fd");
          channelFd.setAccessible(true);

          Field socketFd = DatagramSocketImpl.class.getDeclaredField("fd");
          socketFd.setAccessible(true);
          socketFd.set(socketImpl, channelFd.get(channel));

          try {
          Method m = DatagramSocketImpl.class.getDeclaredMethod("joinGroup", SocketAddress.class,
          NetworkInterface.class);
          m.setAccessible(true);
          m.invoke(socketImpl, socketAddress, networkInterface);
          } finally {
          // important, otherwise the fake socket's finalizer will nuke the fd
          socketFd.set(socketImpl, null);
          }

Enjoy or let me know why this is a bad idea.

http://www.mernst.org/blog/archives/12-01-2006_12-31-2006.html#136 2006-12-18T22:37:06+01:00 Finger Activity

We recently celebrated crossing changelist #100000! A great moment to reminisce. We introduced Perforce in 2000 and haven't regretted it once. Fast, reliable, powerful, good support.

I will now reveal my very personal electro-codeo-gram to you:

(p4 changes -u mernst | cut -d ' ' -f 4 | cut -d / -f 1-2 | uniq -c).

I've decided not to read too much into it.

http://www.mernst.org/blog/archives/11-01-2006_11-30-2006.html#135 2006-11-30T23:24:22+01:00 How Far is Fidji?

Inspired by James Snell : Fidji is deliberately Java-incompatible :-)

There's a few things that would be fun to hack into Java, now that we could actually share it with the world (Hats off to Sun!). I would

get rid of checked exceptions. They're such a pain. As far as I understand they're purely a compile-time construct, could be easily fixed in Javac and have nothing to do with either security nor VM integrity.
fill the void. An invocation expression of a method with return type void could be treated as if returning the receiver, enabling cascaded sends. Bean b = new Bean().setA(a).setB(b);
add type inference for locals: todo := problems.size(); next := problems.removeFirst();
add reliable closing of iterators in the for-each loop. for(line : File.lines("log.txt")) soutv<TAB>line (try this today without losing a file handle!). I've actually already done this one: patch
Resource literals . Akin to X.class, a mechanism for compile-time checked references to resources.

If I needn't go to work now I could probably think of many more. The big BUT: these are all futile if no IDE recognizes them ("Das Rote ist gefährlicher Zahnbelag"). I hear NetBeans is javac-based. How much would it take?

What would you do?

http://www.mernst.org/blog/archives/11-01-2006_11-30-2006.html#133 2006-11-15T09:42:06+01:00 Wrong way

I don't understand all the whining about type erasure. Au contraire, an elaborate compile-time type system without burdening the runtime is most elegant and I wish Java would rather move further into that direction instead of getting reified generics .

http://www.mernst.org/blog/archives/11-01-2006_11-30-2006.html#132 2006-11-06T08:20:22+01:00 Climate Matters

Al Gore's film " An Inconvenient Truth " has arrived in Germany. It is a good movie and it deserves that you go, make your friends see it with you and finally start doing something about your energy balance.

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#131 2006-10-26T00:19:17+02:00 SVG.class!

The argument I made when dissing Kirill's SVG to class compilation was the overhead of parsing the class files, resolution and verification as compared to simply parsing an XML representation into an application-level tree representation. I came around to actually measure the difference.

Note that I'm only talking about the speed of loading, not of drawing. The test I performed was: what is the fastest representation to load ? In order to get I/O out of the way, I prepared the source representation in memory.

The contestants:

class file: 1000 byte arrays with different classes, each with a method doing 1000 method calls, are loaded into a new classloader. An instance is created for each class to make sure the class is fully initialized.
serialization: 1000 serialized object graphs, consisting of 1000 small objects each, are deserialized
Binding with XML: 1000 XML documents, consisting of 1000 elements each, are parsed and bound to an object graph using JAXB.
Binding with Fast Infoset: 1000 fast infoset documents, consisting of 1000 elements each, are bound to an object graph using JAXB.
DOM building: 1000 fast infoset documents, consisting of 1000 elements each, are transformed into a DOM tree

So how do you think the contestants fared?

Quite a surprise for me: slowest was the DOM building, followed by XML parsing&binding. Next came FI to Java binding, then serialization. And the winner is: classloading! It is faster to load a compiled representation of method calls into the VM than bind the equivalent Infoset representation to heap objects. So except for handcrafting a binary format, a jar with compiled .svg icons might indeed be the best option.

The numbers after some warmup:

class loading: 359ms
serialization: 500ms
fi->JAXB: 703ms
XML->JAXB: 1516ms
fi->DOM: 1594ms

The test was performed using Java 6b101, server VM w/ 256MB initial and max heap. The usual caveats apply, especially the data is synthetic and simplistic. If you want to take a look at the test code, it's here .

I have to update my results again. Now with a more realistic scenario, the actual bytecode for an actual icon vs the equivalent fast infoset document, DOM and serialization are fastest, about 30% faster than classloading and binding which come out head to head. Binding spends a considerable amount parsing the coordinate and other attributes into doubles. With a more efficient representation it would be ahead of bytecode. Just using String, it is almost twice as fast. At a millisecond per icon, the difference doesn't matter that much anyway.

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#128 2006-10-24T22:36:57+02:00 Ficken ist so ein schönes Wort

Schade, dass man es in den USA nicht schreiben darf .

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#127 2006-10-23T22:14:24+02:00 SVG.class?

I have to say I cringed when I saw Kirill's post because I'm allergic to code generation (unless it's necessary). Graphics are data, not program (see - I neither have a LISP nor Postscript background) - it makes about as much sense as compiling JSPs. In Suns VM, classes end up in a scarce resource called permspace, and I don't expect classloading to be any cheaper than reading a "data" file. My take: instead of generating Java code, rather serialize the render tree and interpret it.

Yet this allergy is highly opion-laden and calls for a comparison. How expensive is loading a class in comparison to parsing an XML file and binding it to a renderer tree representation? What's the memory footprint? Will a "compiled" icon enjoy any benefit from dynamic compilation? Ultimately, would the world want another file format that corresponds 1-1 to the Java 2D API or isn't that just another form of bytecode?

In the end, any means to avoid a heap of Batik jars are justified...

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#125 2006-10-23T09:45:00+02:00 Gilad Bracha is leaving Sun

No one has died but I feel like whatever I'll write will sound like that; please bear with my limited language. I cannot claim that I know Gilad too well. I had the pleasure to meet him seven years ago when I had the audacity to invite myself into his office to talk about our language work at Hamburg University and received the honour of my own StrongTalk tour and a lecture about mixins. I wonder how often Gilad must have cursed that the most advanced virtual machine technology aquired by Sun was used to drive such an ugly language as Java. But it's probably not overrated to say that he's been the pillar for its stability and well-definedness, reason amongst others for the commercial success of Java. Enough said! Hope there's greater adventures waiting for you. Good luck!

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#124 2006-10-17T01:26:25+02:00 Neal Gafter's example is bogus

Neal makes it look like only closures could hide away the strutting of a forEachConcurrently abstraction. I'm all for closures but this is a flawed argument. What he should have compared is his closure-using code:

List<Principal> result = new ArrayList<Principal>(); for eachConcurrently(EventResponse r : getResponses(), threadPool) { if (r.mayAttend()) { Principal attendee = r.getAttendee(); synchronized (result) { result.add(attendee); } } } return result;

to this:

final List<Principal> result = new ArrayList<Principal>(); forEachConcurrently(getResponses(), threadPool, new Runnable1<EventResponse, RemoteException>() { public void call(EventResponse r) throws RemoteException { if (r.mayAttend()) { Principal attendee = r.getAttendee(); synchronized (result) { result.add(attendee); } } } }); return result;

The supposed library implementation (exception handling and such according to Google):

public interface Runnable1<Arg1, Ex extends Throwable> { public void run(Arg1 arg1) throws Ex; } public static <E> void forEachConcurrently( Collection<? extends E> collection, Executor threadPool, final Runnable1<E, ? extends Throwable> fun) { CompletionService<Void> ecs = new ExecutorCompletionService<Void>(threadPool); for (final E e : collection) { ecs.submit(new Callable<Void>() { public Void call() { try { fun.run(e); } catch (Throwable ex) { LOGGER.log(Level.SEVERE, "Uncaught Exception", ex); } return null; } }); } // wait for the tasks to complete for (final E e : collection) { try { /*discard*/ ecs.take().get(); } catch (InterruptedException ex) { throw new AssertionError(ex); // shouldn't happen } catch (ExecutionException ex) { LOGGER.log(Level.SEVERE, "Uncaught Exception", ex); } } }

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#121 2006-10-16T00:07:03+02:00 First Sprouts

After me bitching about the perceived opacity of JDK 7 development, JSR 277 has released an early draft . A pretty long and detailed document (well, not a quality in itself, EJB was, too), too much to have studied it in detail yet. My first impressions:

there are examples and scenarios for a number of use cases

J2SE is expected to become modular itself - whatever the implications are for standards. Au revoir -Xbootclasspath?

modules and their containing repositories are fully reflected at runtime.

the specification predicts that JDC 4670071 bug will finally need fixing

the syntax for module definitions is weird. It resembles annotations but not quite. Then, in a different section, using XML for URL repository descriptions seems ok for the EG.

So far, matters related to language specification have been in conservative, if not too rigid hands, so I expect this spec will be consistent. It's hard to judge how much easier or more complicated developers' lifes will become. According to section 2.18 the authors "anticipate many higher level tools and frameworks will build upon the module system to provide greater convenience and ease of use.". Let's hope so.

So when you send your comments to the JSR comment address and hope for reply, why don't you post them on your blog or up the java.net forums as well?

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#120 2006-10-15T23:05:15+02:00 JDK development from the outside

I'm sure there's a lot of people like me that would like to know where things are moving in Java 7. However, it's hard to find out. Danny points to a list of JSRs slated for inclusion in Java 7. I'm disappointed that not one of the expert groups chose a process as open as JSR166. I'm certainly not expected to join all these as an "expert" (what is that anyway?), just because I'm interested. In the average case, it will take a few months until an EG spits out a zip with JavaDoc and - if I'm lucky - a PDF with a little rationale. But a lot will be missing - what was tried but didn't work out, alternatives considered, ... And the impression I get (the wrong it may be): it's too late in the process to change something fundamental anyway.

I feel this form of development isn't very compatible with a soon-to-be open-source implementation. It may almost provoke forks.

My request: open up your mailing list archives and your CVSs, let us follow your train of thoughts, and you'll receive so much more valuable feedback and contributions. If you want them, that is.

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#74 2006-10-03T16:56:08+02:00 A long term suspicion

An observation: almost all WYSIWIG text editors suck at some point. We've been beaten up repeatedly for our CMS's text editor but in comparison we're actually doing pretty well. The editor I'm typing this text with is Swing-based and gets confused all the time. James Robertson turned off his WYSIWIG comment editor pretty quickly again. And today, after noticing the BLL competing closure proposal (someone with a U want to join that group?), I tried writely. After pasting some dead-simple HTML (just p's and text) BACKSPACE wouldn't join two paragraphs but introduce some br weiredness. A second BACKSPACE would do the job. Not intuitive.

Which leads me back to a long-term suspicion of mine: the base abstraction is wrong. A long time ago computer science decided that documents shall be trees. Things started to go down after that...

http://www.mernst.org/blog/archives/10-01-2006_10-31-2006.html#73 2006-10-03T12:29:59+02:00 volatile is the new synchronized

Bill Pugh and folks have done an outstanding job to clarify and simplify Java's memory model. And they have done a great job to get the word out and raise awareness for the issues. Yet what can happen is that people have now gotten nervous about safe publishing and visibility and start sprinkling volatile pixie dust. Reminds me of the generous application of "synchronized" when one wasn't quite sure ...

http://www.mernst.org/blog/archives/09-01-2006_09-30-2006.html#71 2006-09-22T08:50:39+02:00 Issues of an aging star

Three bug reports ( 36541 , 37356 , 38713 ), some observations: Tomcat is full of clever(TM) and buggy concurrency code. Even experienced maintainers may not know how the memory model works. On a popular project, the more stupid people you're dealing with, the more resistant you get to suggestions - to the point where you refuse to accept that you're wrong. I don't have the scoop on the Tomcat / Glassfish relationship but obviously, there's a lot of bitterness involved, too.

Sad to see, in a way.

http://www.mernst.org/blog/archives/09-01-2006_09-30-2006.html#70 2006-09-21T09:16:45+02:00 Nil is back with a vengeance ...

TL2 , the OO programming language we built in school has no distinction between expressions and statements. Everything is an expression and correspondingly has a type. The Exception class sports a method "raise" - which needs to have a special type: an exception can be thrown at any point in the program, where any type is expected. For example

i :Int := anException.raise()

The return type of raise() is Nil, corresponding to the bottom type. By definition, Nil is subtype of all types in the system, that's why the above line typechecks.

raise :Nil (* the signature of raise *)

Looking back at this I'm now wondering why we chose subtyping over parametric polymorphism. Instead of "raise() produces a value that fulfils every other type", we could as well have said "raise() produces any type you want":

raise(T<:Void) :T (* T is a type parameter bounded by the top type "Void", i.e. any type *)

Why I'm writing this? I came to think of the parametric throw method at a discussion on Remy's blog. And now, Nil has resurfaced in the form of ... java.lang.Unreachable . Do we really need a name for the it?

The situation pretty much resembles the one with empty lists. With generics introduced in Java 5, there is no type to assign to the EMPTY_LIST constant in java.util.Collections. Instead of coming up with a new type we have a polymorphic method emptyList() to give us the empty list of choice. I say we could do the same with the closure that always throws (as in "der Geist, der stets verneint...").

http://www.mernst.org/blog/archives/09-01-2006_09-30-2006.html#68 2006-09-15T08:58:22+02:00 REST is not CRUD, it's COPOPACU
The right explanation of the main HTTP verbs (through Mark Baker ).
http://www.mernst.org/blog/archives/08-01-2006_08-31-2006.html#67 2006-08-26T10:17:44+02:00 An alternative to String#intern()

Ethan blogged about the usage of String#intern in the Xerces XML parser. Let's not delve into the discussion whether interning is particularly useful in this context. What does String#intern do? Basically, it implements a global weak symbol table that unfortunately populates the permspace. As an alternative, let's see what it takes to implement #intern at the application level: Since I'm lazy, we'll define that you do not actually need the weakness property if you have dedicated symbol tables that you can forget and that can be GCed as a whole. Luckily, ConcurrentHashMap already implements everything we need then, so it's pretty easy:

public class SymbolTable<T> { ConcurrentHashMap<T> chm = new ConcurrentHashMap<T>(); public T unique(T t) { T first = chm.putIfAbsent(t, t); return (first == null) ? t : first; } }

Not only does this little class not bloat your permspace but it's actually faster than using String#intern(). Just say no.

Rémi Forax reminds me that String.intern() is handy in order to do identity comparisons against constants. Just don't let yourself be fooled into thinking that '==' makes things faster. Interning the string in the first place requires you to hash the string and compare it to the interned version.

http://www.mernst.org/blog/archives/08-01-2006_08-31-2006.html#59 2006-08-26T00:16:18+02:00 Parsing object streams with JAXB

JAXB is a wonderful tool to bind XML to objects. Unfortunately, used naively, it has the same problem as DOM: you need to load the whole document into memory until you can start consuming the result. If you have a document with one root element and tens of thousands of children it is a good idea to combine object-binding for each child with streaming for the whole list.

I'm using a simple solution to do that. It's a list implementation that does not store its children but notifies a callback handler instead. Use that list in the parent bean and you're all set:

@XmlRootElement(name = "parent") @XmlType(name = "", propOrder = {"children"}) public class Parent { @XmlElement(name = "child") List<Child> children = new CallbackList<Child>(); }

Now, whenever JAXB has finished constructing a child object, it will call #add on my list and the object is immediately handed to the callback and garbage collected afterwards.

You may have noticed that the parent context is not fully constructed yet and unknown to the child, thus inaccessible to the callback handler. Also, I haven't found a proper way to inject objects into JAXB, so the CallbackList needs to find its callback handler through a ThreadLocal which is not exactly pretty. But the net result is great: blazing performance combined with the comfort of binding.

Next morning: alright, I just discovered the UnmarshallerListener. We can use it a) to install a write-only sink as the list implementation and b) listen for child objects without the thread-local. And without having to change any of the generated classes. Neat!

http://www.mernst.org/blog/archives/08-01-2006_08-31-2006.html#62 2006-08-25T23:04:25+02:00 Regaining a name for a wildcard type

When using generics I run into the same problem over and over: a class has a field of a parameterized type but actually doesn't care about the type. A perfect match for using wildcards. However, within a method the using class needs to call methods of the parameterized type that have arguments of the type parameter - of course just derived from that very object. For example: rotate a linked list.

public class RotateList { LinkedList<?> theList; public void rotate() { LinkedList<?> l = theList; theList.addLast(theList.removeFirst()); } }

This does not compile since an important piece of information is lost between calling removeFirst and addLast: not only don't we know the type of the result but we also don't know that it's the very type parameter of the list we're talking about.

At this point I would normally resign and propagate the list's type parameter to the using class. This way, type parameters spread throughout my code. But there is hope! We can pull out the code into a generic method, parameterized by the list's element type. This way we can give a local name to the wildcard that will convince the compiler that we're doing right:

public class RotateList { LinkedList<?> theList; public void rotate() { // what used to be an unknown type // regains a name in the method // and becomes more useful rotate(theList); } private <T> void rotate(LinkedList<T> l) { l.addLast(l.removeFirst()); } }

Probably not a surprise for generics wizards but I had my aha-moment! Now why do we need an extra method? How about a syntax for introducing a type name within a method?

public class RotateList { LinkedList<?> theList; public void rotate() { <T > List < T > l = theList; l.addLast(l.removeFirst()); } }

http://www.mernst.org/blog/archives/07-01-2006_07-31-2006.html#60 2006-07-15T09:37:43+02:00 Defeated
We just got our asses kicked 2:0 in the last three minutes. Trotzdem, Jungs, Ihr wart großartig! Dann fahren wir halt nach Stuttgart.

http://www.mernst.org/blog/archives/07-01-2006_07-31-2006.html#57 2006-07-05T00:54:09+02:00 Seattle Pictures
I spent a few days in Seattle last week. The city sported some fabulous weather - thanks! Here's some pictures:
http://www.mernst.org/blog/archives/07-01-2006_07-31-2006.html#54 2006-07-04T03:26:30+02:00 OSVM in trouble
Esmertec is closing its Denmark subsidiary after having CTO, CEO and two co-founders leave and a dramatic loss in revenue ( press releases ). Sad news since Lars Bak and his team were building OSVM, a highly interesting Smalltalk environment for embedded devices. Hopefully it doesn't just go away. (via Christian Plesner Hansen ) .
http://www.mernst.org/blog/archives/07-01-2006_07-31-2006.html#53 2006-07-03T09:55:14+02:00 Contented Monitor on a T2000

I've ranted about it before, the Jakarta JSP EL-Interpreter up to Tomcat 5.5 / JSTL 1.1 does not parse EL expressions at page compile time but goes through a parser cache on each evaluation in ELEvaluator.java:

static Map sCachedExpressionStrings = Collections.synchronizedMap (new HashMap ());

We've been doing some performance testing on a Sun T2000 machine and this showed to be a bottleneck. A typical thread dump would have many threads waiting for monitor entry on this parser cache here:

// See if it's in the cache Object ret = mBypassCache ? null : sCachedExpressionStrings.get(pExpressionString); if (ret == null) { // Parse the expression Reader r = new StringReader(pExpressionString); ELParser parser = new ELParser(r); try { ret = parser.ExpressionString(); sCachedExpressionStrings.put(pExpressionString, ret); } }

There's a simple trick I've used on several occasions to reduce contention on this monitor. I made the parser cache read-only; any thread that wants to add to the cache must allocate a new copy and substitute it for the old version. Since we're talking about EL expressions here that you'll only have a few hundred of in your application(s), the process will quickly converge. Reading the cache now only costs an additional memory barrier for reading the volatile and is fully concurrent otherwise. In this particular case it boosted the performance by 40% from 400 to 550 page impressions / s.

static volatile Map sCachedExpressionStrings = new HashMap(); ... // copy on write Map updatedCache = new HashMap(sCachedExpressionStrings); updatedCache.put(pExpressionString, ret); sCachedExpressionStrings = updatedCache;

Note that there's a race, if two threads both want to update the cache at once, one might lose. In this scenario it doesn't really matter, though, and the cache will quickly be complete. Also note, that this code is only really legal under the revised Java Memory Model in Java 5. Up to 1.4.2, writing the reference of the new map to the volatile variable 'sCachedExpressionStrings' would not garantuee visibility of the update cache's contents; actually anything could happen. I seem to recall though, that on Sun JVM 1.4.2 this code will actually be correct, too.
Update: it just dawned on me that I could combine copy-on-write with a traditional monitor-protected field instead of the volatile. This would still be a much shorter critical section than the synchronized map AND entirely legal even according to the 1.4 memory model. Something that might have a chance to be accepted as a patch iff the gains are high enough.

I expect more of this kind of problems the more cores we see in servers. Luckily guys like Doug Lea and Bill Pugh make sure we can write code that avoids them. Can't wait for my copy of this ...

http://www.mernst.org/blog/archives/05-01-2006_05-31-2006.html#51 2006-05-19T08:48:07+02:00 Russ Beattie goes off the air

One of the more interesting blogs has closed its last page . Too bad. After going through the website, wiki, blog transformation process, maybe Russ is going to come up with a next greater way of publishing. Goodbye and good luck!

http://www.mernst.org/blog/archives/04-01-2006_04-30-2006.html#50 2006-04-22T16:39:15+02:00 debugger experience

Three annoyances keep coming back to me (at least in the IDEA debugger):

I can't see the result of a method . If a method ends with "return expression ;" I have to step out and hope that the caller stores the result in a variable or introduce an extra local myself.

I can't put breakpoints on expressions, just on lines . If I want to break on #doIt in "if(rarelyTrue) doIt();" I have to reformat the code or introduce a conditional breakpoint.

I can't identify objects very well . I catch myself writing down OIDs (x.y.z@1234) in order to spot them later in the session. It would be great if I could give an object a symbolic name which the debugger could show me later. JVMTI tags should be useful for that.

http://www.mernst.org/blog/archives/04-01-2006_04-30-2006.html#49 2006-04-22T10:56:38+02:00 The last(?) bastion has folded

The latest build of Glassfish, the Sun Java Application Server, now ships with security manager disabled . SJAS always stood out because of this and may have been the #1 issue in getting apps to work.

I must admit I've never paid much attention to making my code security manager enabled. The long list of permissions seemed impossible to maintain. It would have to be deployed differently with every container (there's no J2EE standard for listing the required permissions in the deployment descriptor). And honestly, as long as Sun doesn't come up with some reasonable, concise syntax for [privileged] blocks (PrivilegedActions are ugly) and a suitable runtime environment for running untrusted code I guess I won't. Setting up a new zone seems easier.

http://www.mernst.org/blog/archives/03-01-2006_03-31-2006.html#48 2006-03-15T09:24:37+01:00 How to create uninitialized objects

You never stop learning. A colleague asked me about some code that would get stuck in the finalizer. The objects in question were created like this:

new IWMRMUplinkCollection(invoke_method(null, 0x9, DISPATCH_PROPERTYGET).intVal());

Now it turns out that invoke_method actually throws an exception. So why do any IWRMUplinkCollection instances wind up in the finalizer in the first place? The JLS contains a surprise: "At run time, evaluation of a class instance creation expression is as follows ... Space is allocated for the new class instance .. Next, the actual arguments to the constructor are evaluated, left-to-right ... Next, the selected constructor of the specified class type is invoked."

So surprisingly the instance is allocated before the constructor args are evaluated. And it is put on the finalizer queue which turns out to be a VM bug: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4034630 which recently celebrated its 8th anniversary. This bug is closed as duplicate of 5092933 which is not available for public access. Did someone come up with an exploit?

For the ones that get tired from prose:

public class UnfinishedButFinalized { public static void main(String[] args) { while (true) { try { new Finalizable(new ArrayList(null)); System.out.println("Should never succeed"); System.exit(1); } catch (NullPointerException npe) { } } } static class Finalizable { List list = Collections.EMPTY_LIST; public Finalizable(List list) { this.list = list; } protected void finalize() throws Throwable { if (list == null) System.out.println("I should not be finalized"); } } }

http://www.mernst.org/blog/archives/03-01-2006_03-31-2006.html#46 2006-03-02T23:40:01+01:00 Practical SAX

As a CMS vendor we live on the edge between unstructured text and structured data. And somehow, don't we all build a little CMS once in a while? We deal with beans and business logic as well as rich and plain ole text and of course they interact. This is a writeup of some my experiences with and interpretations of SAX.

Choosing the representation

On the delivery side we deal with practically read-only data. We want to deliver XML content a) fast and b) concurrently, yet dynamically transformed per request. Requirement a) hints at an in-memory data structure like DOM. Now DOM trees tend to use quite some memory but most surprisingly, a DOM tree is not necessarily thread-safe, not even for reading. Try hitting hard on a Xerces document with several threads and you'll be surprised. XML itself is too slow if you want to transform it before delivery. So in order to stay lean and thread-safe, we chose a binary stream encoding similar to what has been done for Fast Infoset (though our format never leaves the JVM). A "Markup" as we call it, is a frozen representation of SAX events that can be concurrently written onto a chain of ContentHandlers. Sizewise, with some naive compression in place (like using indices for repeated namespace uris, for example), it turns out, a Markup takes up about as much space as the original XML.

SAX and Namespaces

SAX namespace support comes in two flavours. Depending on the configuration of a SAX parser's "prefixes" option, it will deliver xmlns* attributes for namespace declarations and optionally qnames with prefixes. We settled on the "no-prefixes" default as it carries the least ambiguity. It turns out this is a big weakness of dealing with SAX. Most APIs that deal with SAX do not specify which namespace flavour they would like their events in. They go with the default (no prefixes) but expect qnames after all (since all parsers deliver them anyway). The correct way would be to deal with the XMLReader interface which offers the aforementioned configuration option, like the TraX SaxSource interface. Yet, J2SE's transformer won't ask a SaxSource for prefixes and will barf if you don't hand them anyway (One of my Mustang contributions fixed that). The SAXResult API doesn't even give you the chance to define what mode it will be operated in. All in all, this is a weak point of SAX. They should have settled for exactly one representation.

Fragments

As we mix templated beans and rich text, we're mostly dealing with XML fragments as opposed to whole documents. SAX is not really well-defined in this area either. Common practice seems to be to send #startDocument and #endDocument anyway as it serves as init and stop event for serializers. Also, can your serializer deal with namespace declarations "around" the fragment? startPrefixMapping, startElement, endElement, startElement, endElement, endPrefixMapping?

Mixing SAX and character data

We want to interop with character-oriented output in both directions. We want to embed JSP-in-SAX-in-JSP. SAX in JSP is not too difficult, it's plain serialization, right? It's a little more complicated. Typically your JSP will declare the outer namespaces using plain text. Now when we serialize a fragment, we want to leverage this context, so our serializer must support *declaration* of namespaces (so it knows which prefixes to generate) without actually printing the xmlns attributes (already there). Nothing that the Apache or JDK serializer will do for us. Vice versa, while we're deep in a chain of filters piping SAX events around, we find that we want to embed another fragment that is rendered with a JSP. We don't want to parse the JSP's output into SAX; we want to simply embed it raw. SAX doesn't give us that feature, so we support an extended SAX protocol to transport chunks of "raw" character data. (The apache serializer actually supports something similar, through the #start/#endNonEscaping protocol.) So we have a really nice embedding option where the JSP writes onto a custom writer that actually sends chunks of non-escaped chars down the pipeline.

Transformations

Although state machines can be difficult to handle, a little framework code can help you out. I find that a filter-based transformation approach works fine for use-cases like link rewriting. You can even easily build filters that listen to the stream and start processing at a certain XPath and call you back with the fragment below. And it's magnitudes faster than XSL. Not to speak of the amount of temporary memory allocated.

Conclusion

SAX is the foster child of the XML APIs. It always ranks as "also", "low-level", "if you need the speed". I was surprised about the number of ambiguities and bugs I ran into with seemingly straightforward tasks. Yet after some persistence, it's the ideal choice for us. It's fast, low-overhead and still gives us all the dynamics we need. I'm looking forward to some benchmarking on a T2000(?) coming to our house soon.

http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#43 2006-02-23T01:12:34+01:00 Is it so difficult?

Again and again , if you're life is too simple, make it complicated. Write a hundred times:

R resource = acquireResource(); try { use(resource); } finally { release(resource); }

Well at least it saved James Roberson's day ...

http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#42 2006-02-13T20:59:31+01:00 Moot Challenge

Of course I should have been more specific . Actually this code does exactly what I want: the class to define extends another class loaded by the "parent" class loader. The invoking code will only refer to it through its superclass. A new classloader seems much cleaner to me than injecting those bytes into the parent class loader itself, for example like java.lang.reflect.Proxy. It makes the generated class eligible for collection, too, if it's no longer being used. Or is there some catch I'm missing here? How expensive is a class loader anyway? Can the VM deal with hundreds or thousands of class loaders? Will the PermSpace blow up? Gotta try...

A valid point raised by Eugene, though. Challenging the blogosphere (can't believe I just wrote this word) if you don't have comments is kind of questionable.

http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#41 2006-02-13T08:39:29+01:00 The shortest way to define a class ...

private static <T> Class<T> defineClass(ClassLoader parent, final String className, final byte[] bytes) { final Class[] result = new Class[1]; new ClassLoader(parent) { { result[0] = defineClass(className, bytes, 0, bytes.length); } }; return (Class<T>) result[0]; }
Can you do it shorter?
Tom Hawtin can:

private static <T> Class<T> defineClass(ClassLoader parent, final String className, final byte[] bytes) { return new ClassLoader(parent) { public Class defineClass() { return defineClass(className, bytes, 0, bytes.length); } }.defineClass(); }
I was surprised but yes, the type of an anonymous class construction expression is the full anonymous type, not just the supertype: http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#15.9.1: The class being instantiated is the anonymous subclass. ... The type of the class instance creation expression is the class type being instantiated.
http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#40 2006-02-12T23:32:25+01:00 Compiled EL: an adventure in bytecode engineering

We have a running gag at work. When a webapp is slow to start, we'll say "ah it needs to compile the JSPs". Actually, nowadays, JSP compilation is pretty much useless. No one uses Java code in JSPs anymore. It is string literals, custom tags and expression language. Now the sad thing is, the EL is interpreted, too, so there's no need to compile JSPs anymore except for legacy compliance.

Interpreted EL turns me off. Too much reflection during evaluation, BeanInfo caching, ... Tomcat's JSP implementation doesn't even remember the parsed expression trees with a page instance, so they parse every expression anew, mitigated by yet another cache. A nice synchronization bottleneck.

So I've been intrigued by the idea of compiling EL to bytecode. Since I didn't want to wait for Gilad to bring out #invokedynamic (which is probably running in the lab already), I decided to go the strongly typed route. With generic type information, you actually have a fair chance to do proper typing for an expression without casting. Thus the compiler interface takes a type environment as follows:

public static CompiledExpression compile( String expressionString, Map<String, ? extends Type> env); public interface CompiledExpression { Type getResultType(); Object evaluate(Map<String, ?> env) throws Exception; }

The expression string is parsed with the Commons EL parser. For bytecode generation, the compiler draws all its information from the runtime environment; no searching/parsing of class files takes place. The java.lang.reflect.Type hierarchy holds all the required information. Resolving all wildcards and type variables correctly proved a fun exercise; the operators and type coercion are a bitch. So back to the fun part; I built a simple interactive toplevel which looks like follows:

string: class java.lang.String = string map: interface java.util.Map[class java.lang.String, class java.lang.String] = {key=value} > map.keySet.iterator.next Code: 0: aload_1 1: ldc #14; //String map 3: invokeinterface #20, 2; //InterfaceMethod java/util/Map.get:(Ljava/lang/Object;)Ljava/lang/Object; 8: checkcast #16; //class java/util/Map 11: invokeinterface #24, 1; //InterfaceMethod java/util/Map.keySet:()Ljava/util/Set; 16: invokeinterface #30, 1; //InterfaceMethod java/util/Set.iterator:()Ljava/util/Iterator; 21: invokeinterface #36, 1; //InterfaceMethod java/util/Iterator.next:()Ljava/lang/Object; 26: checkcast #38; //class java/lang/String 29: areturn type: class java.lang.String value: "key" > map["key"].class.newInstance Code: 0: aload_1 1: ldc #14; //String map 3: invokeinterface #20, 2; //InterfaceMethod java/util/Map.get:(Ljava/lang/Object;)Ljava/lang/Object; 8: checkcast #16; //class java/util/Map 11: ldc #22; //String key 13: invokeinterface #20, 2; //InterfaceMethod java/util/Map.get:(Ljava/lang/Object;)Ljava/lang/Object; 18: checkcast #24; //class java/lang/String 21: invokevirtual #28; //Method java/lang/String.getClass:()Ljava/lang/Class; 24: invokevirtual #34; //Method java/lang/Class.newInstance:()Ljava/lang/Object; 27: areturn type: ? extends [class java.lang.Object] value: ""

As you can see, you can call methods; a nice add-on to EL is the ability to call methods with arguments. The compiler curries one argument a time; a curried call is typed as a Map:

> map.put["key"] ... type: interface java.util.Map[class java.lang.String, class java.lang.String] value: fun { {key=value}.put("key", $0) }

So Map.put is called with one argument, one remaining. This is represented by a special Map implementation that completes the call when #get is called. This part right now requires reflective method invocation at runtime. However, the Expression instance already contains the Method object to invoke and thus profits from java.lang.reflect's bytecode generation. I want to play with functional expressions, too, e.g. collection[map.get] should apply map.get to each collection element.

If you're interested to toy around with it, let me know.

http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#39 2006-02-05T01:05:42+01:00 Code snippet: calling javap programmatically

This just came in handy. I'm doing some bytecode generation, using the javap classes to disassemble the generated code on the go. You need tools.jar in your classpath:

private static Field JAVAP_C = null; static { try { JAVAP_C = JavapEnvironment.class.getDeclaredField("showDisassembled"); // :-( JAVAP_C.setAccessible(true); } catch (NoSuchFieldException e) { } } public static void javap(byte[] classBytes, String methodName) { JavapEnvironment env = new JavapEnvironment(); try { if (JAVAP_C != null) JAVAP_C.set(env, true); } catch (IllegalAccessException e) { throw new RuntimeException(e); } PrintWriter pw = new PrintWriter(System.out); JavapPrinter javapPrinter = new JavapPrinter(new ByteArrayInputStream(classBytes), pw, env); ClassData classData = new ClassData(new ByteArrayInputStream(classBytes)); MethodData[] methods = classData.getMethods(); for (int i = 0; i < methods.length; i++) { MethodData method = methods[i]; if (method.getName().equals(methodName)) javapPrinter.printMethodAttributes(method); } pw.flush(); }

Example output:

Code: 0: iconst_3 1: ineg 2: invokestatic #18; //Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer; 5: areturn

http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#38 2006-02-05T00:15:57+01:00 the j.u.logging default configuration
I don't mind the j.u.logging API. Its default configuration sucks hard, however. How to enable FINE output programmatically?
ExpressionCompiler.logger.setLevel(Level.FINE); // you would think Logger.getLogger("").getHandlers()[0].setLevel(Level.ALL); // #!@#!@$
Why is that? $JRE_HOME/lib/logging.properties decides what goes to your console:
# Limit the message that are printed on the console to INFO and above. java.util.logging.ConsoleHandler.level = INFO
See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4462908 . Looks like I finally arrived in 2001...
http://www.mernst.org/blog/archives/02-01-2006_02-28-2006.html#37 2006-02-03T10:13:48+01:00 My vote for Dolphin

In case I haven't said this publicly enough yet: I consider Isolates the most urgently needed innovation for Java. By far. I'm surprised this doesn't get more attention. I'll celebrate the day when we can start and tear down subsystems without worrying about thread and memory leaks. The day where I can clearly distinguish between a problem in my software or a customer customization. Until then, I have completely given up on webapp redeployment, for example.

A quote from Bill's post: "Isolates are slated for Java 1.6". Sun, please get this out of the door for Dolphin!

http://www.mernst.org/blog/archives/01-01-2006_01-31-2006.html#36 2007-01-16T20:51:30.345+01:00 Mustang's HTTP Server

Update: Alan Bateman has kindly provided me with a link to the HTTP server API documentation. This will soon be hooked up in the Mustang docs.

Several times, I've seen Mustang's inclusion of an HTTP server mentioned. A technical article points to bug 6270015 , which asks for support of a lightweight HTTP server API. This bug is marked "closed, fixed". Yet, you won't find anything in the Mustang JavaDoc on it. What's the deal? Well, if we read carefully, Mustang will include an HTTP server API, not JSE 6 . If we follow David Herrons recommendation , we should therefore forget about this quickly.

However, that API looks like it had some thinking put into it, is documented and itself distinguishes thoroughly between com.sun.* (sort-of presentable?) and sun.* (don't go there?) packages. Here's how you run a simple server:

HttpServer httpServer = HttpServer.create(new InetSocketAddress(8000), 5); httpServer.createContext("/", new HttpHandler() { public void handle(final HttpExchange exchange) throws IOException { Headers requestHeaders = exchange.getRequestHeaders(); exchange.getResponseHeaders().set("Content-Type", "text/plain;charset=utf-8"); exchange.sendResponseHeaders(200, 0); OutputStream responseBody = exchange.getResponseBody(); PrintWriter printWriter = new PrintWriter(new OutputStreamWriter(responseBody, "UTF-8")); for (Map.Entry <String, List<String>> entry : requestHeaders.entrySet()) printWriter.println(entry.getKey() + ": " + entry.getValue()); printWriter.close(); } }); // the server will be single-threaded unless you uncomment this line // httpServer.setExecutor(Executors.newCachedThreadPool()); httpServer.start();

Simple enough. If this thing survives further scrutiny it has the potential to put Jetty out of business for me for embedded HTTP endpoints. If it just weren't for the license...

http://www.mernst.org/blog/archives/01-01-2006_01-31-2006.html#33 2006-01-22T23:01:22+01:00 Mastering NIO

For the umpteenth time, I'm trying to move from an intellectual to a practical understanding of non-blocking network IO and I've come to something working this time. I know about MINA , I've read about emberIO , yet I want to learn it the hard way by building a framework myself that is simple, purely non-blocking and extensible. I'm not actually expecting to reach the fastest or most scalable solution, actually I'm more interested in the programming model. Non-blocking IO flips the roles: you no longer have an inputstream/outputstream pair that you can operate with in a single method call. Instead your code needs to become a state machine; input is pushed into it; output is pulled from it. That's why it's so hard to get the servlet api together with NIO.

So what do I have?

An IODispatcher represents a selector with an associated thread pool. You add channels and associated protocols to a dispatcher:

class IODispatcher { public <C extends SelectableChannel> void addChannel(C channel, Protocol<? super C> protocol) }
Each registered channel runs a protocol, an implementation of the following interface:
public interface Protocol<C> { void setConnection(ChannelConnection<? extends C> connection); int interest(); void run() throws IOException; void destroy(); }

The IODispatcher selects on the channel with the protocol's current interest. When selected it will run the protocol in the thread pool. While running, the channel's interest is set to 0 and the IODispatcher ensures that this protocol instance will not run again concurrently. When #run finishes, the protocol will be entered for selection again with a possibly new interest set. If the protocol closes its channel or throws an IOException, the IODispatcher will remove the protocol and invoke its #destroy method.

A Protocol is a state machine. In order to make writing protocols easier, there's a MultiStageProtocol with individual stages that can indicate successor stages after running. These are the building blocks I hope to build abstractions on; so far I have acceptor, connector and writer stages. Input and conversions are typical next candidates. For reference, the connector stage looks as follows:

public class Connector<C extends SocketChannel> implements ProtocolStage<C> { private ProtocolStage<C> next; public Connector(ProtocolStage<C> next) { this.next = next; } public int interest() { return SelectionKey.OP_CONNECT; } public ProtocolStage<C> run(ChannelConnection<? extends C> connection) throws IOException { return connection.getChannel().finishConnect() ? next : this; } public void destroy(ChannelConnection<? extends C> connection) { } }

The code is running very nicely now; I stumbled over two NIO bugs (4729342,5076772) that I'm preparing fixes for. In the longer run, I'd like to build a renderer for my Axt template engine that implements a pull model and run it on this NIO framework.

That was the morning news, time for a shower now.

http://www.mernst.org/blog/archives/01-01-2006_01-31-2006.html#32 2006-10-13T08:33:19+01:00 SPRONG: a strongly typed spring bean container

Spring is all over the place. For a reason: spring is well-engineered, well-supported and does a lot of things just right.

Everyone would agree if it wasn't for the XML bean definitions. These files define the application structure by defining instances of bean classes and wiring them to each other. So what's wrong with these files? First of all they break too easily. You rename a property, you forget to add a setter, you forget the default constructor. Tools are getting better but they're just not there yet. And sometimes they just cannot know. Second, there is no well-defined extension mechanism. As an ISV, we're shipping software with a bunch of .xml descriptors and it is very difficult to distinguish parts that we want the customer to customize and which we don't. What's the strategy?

Spring does not depend on the XML in any way. You can just as well instantiate all the beans in Java code. So what is it that makes me write XML anyway? I see a number of properties:

You don't need to compile them and you can easily tweak them after deployment. That's a moot point, I think. Au contraire, I'm writing this because I _want_ to compile them.

Auto-wiring: basically, you don't need to define all properties yourself, Spring will find right candidates by name or type. I could use it all the time but I'm wary about this and avoid it.

Automatic singleton-ness. Unless told otherwise, the spring bean factory will instantiate a bean definition only once.

Lifecycle and postprocessing. Spring invokes init/destroy methods and fulfills a number of common dependencies through *Aware interfaces (ServletContextAware, ResourceLoaderAware, ...).

Automatically right order of initialization. If bean A depends on bean B, B must be initialized first.

The last three are important to me. So what I want is a strongly typed, Java mechanism for bean definitions that still supports those "container properties".

My solution looks as follows:

Instead of an XML file, you write a Java class, representing the container.

For every bean definition you write a property getter method. In order to make it a "bean definition" method, you apply a @Bean annotation.

A container api lets you create instances of that class that honor the container properties. How does it do that? It uses CGLib to generate a subclass of the original class and adds fields for the singletons, initialization flags and lifecycle/postprocessing code.

A glimpse:

class App { String driverUrl; @Configuration(key = "db.driverUrl") public void setDriverUrl(String url) { this.driverUrl = url; } @Bean protected DataSource getDataSource() { return new DriverManagerDataSource(driverUrl); } @Bean public ServiceA getServiceA() { return new ServiceA(getDataSource()); } @Bean public ServiceB getServiceB() { return new ServiceB(getDataSource()); } } ... ContainerBeanFactory<App> f = ContainerBeanFactory.newFactory(App.class); App app = f.create(properties); // actually a subclass of App assert app.getServiceA() == app.getServiceA(); assert app.getServiceA().getDataSource() == app.getServiceB().getDataSource();

Before I go, you can see three great things already: configuration properties can be imported (a la ${xxx} in Spring XML), #getDataSource constitutes a container-private bean and App itself is a strongly typed interface of the set of beans exported. I'll use that to actually define the interface the Spring MVC Dispatcher Servlet has with a Spring MVC webapp. So expect more later.

http://www.mernst.org/blog/archives/12-01-2005_12-31-2005.html#31 2005-12-29T10:27:22+01:00 10-Fix Delivered

So, Mustang b63 ships with my Java reimplementation of File#deleteOnExit. Apps using a lot of these will now experience a proper out of heap error. Nothing exciting, still I'm glad it made it.

http://www.mernst.org/blog/archives/12-01-2005_12-31-2005.html#30 2005-12-09T23:19:15+01:00 OutOfMemoryError

I'm more and more convinced that OutOfMemoryErrors are useless. They don't have a clear cause and as such the recipient of the error has no way to correct them. Especially in multithreaded applications a random thread gets hit and will cease to work. As I commented on Alan's Posting I've had several occasions where a Tomcat servlet engine would go mute because the acceptor thread got hit by OOM.

Second, memory allocation is such a fundamental prerequisite on the Java platform that handling such an error in Java is likely to not work anyway. My catch/finally/ShutdownHook runs in a very ill-defined environment. The same way a VM-, Internal- or whatever error is not useful up there. It's like 'No keyboard present. Hit F1 to continue.'

I think a VM that runs into OOM should just panic and terminate. ASAP.

That being said, the fact that I have to tell my JVM in advance how much memory I'm going to need feels so MacOS 8 to me.

http://www.mernst.org/blog/archives/12-01-2005_12-31-2005.html#29 2005-12-04T19:37:44+01:00 Mustang Contributions
I've made two contributions to Mustang. A small fix for a TrAX bug and a "jumbo" patch to replace the native implementation of File#deleteOnExit. The second issue once cost me a lot of time to track down - I didn't want anyone else to fall into that trap again. It actually tackles a number of bug reports. The first one is "fix in progress" now and the second one is coming along well, too. You can track your submissions here . I really like the idea, however I wonder how cost-efficient that process is. It takes a lot of work to review and integrate patches if you want to maintain a certain level of quality and I get the impression that some Sun engineers are very careful people.
http://www.mernst.org/blog/archives/10-01-2005_10-31-2005.html#28 2005-10-29T19:45:41+02:00 Multi-Core CPUs and the concurrency hype
Lately, there's been a lot of buzz about how multi-core CPU's will change the face of the software world. Programmers would be facing a new challenge: to write concurrent software, otherwise the software won't perform. I'm surprised about the level of attention. I think this is rather old news. Multiprocessors have been around for a long time; we've been paying attention to actually using the available CPUs for a long time. And our customers are doing the same - they complain if we don't.
http://www.mernst.org/blog/archives/10-01-2005_10-31-2005.html#27 2005-10-29T19:20:32+02:00 MSNBot hallo ?
Is that what I'm paying my bandwidth for? Get a grip folks!
65.54.188.103 - - [18/Aug/2005:09:40:39 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:09:47:50 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:09:53:02 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:09:58:14 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:02:06 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:07:02 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:12:01 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:18:22 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:28:06 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:33:48 +0200] "GET /robots.txt HTTP/1.0" 404 204 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:37:45 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:45:19 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:53:56 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:10:59:44 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)" 65.54.188.103 - - [18/Aug/2005:11:08:41 +0200] "GET /blog/rss.xml HTTP/1.0" 200 20823 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

http://www.mernst.org/blog/archives/08-01-2005_08-31-2005.html#26 2005-08-25T23:51:42+02:00 ariadna: heap dump without ctrl-break
I just uploaded a new version of ariadna that allows for generating a heap dump from within the application, not just CTRL-break/SIGQUIT. It's still written to fixed files - no added luxury here :-) It's simpler than I thought: an agent dll is automatically searched for JNI entry points, so I didn't have to deal with RegisterNatives().
http://www.mernst.org/blog/archives/08-01-2005_08-31-2005.html#25 2005-08-25T09:27:27+02:00 CLR versions

I just downloaded Avalon & Indigo RC. Avalon would not install - I figure it's because of the Avalon CTP that's still lying around. Trying to UNINSTALL Avalon CTP, I get the following neat error message:

I guess I haven't figured out Microsoft's versioning story yet. I have a more recent 2.0 beta 2. Why can't it just use that one? What do I do? I will _NOT_ deinstall beta 2, install 204xxx in order to deinstall Avalon CTP. .NET's beta track feels much more painful than Java to me. C# Express and SQL Server Express had incompatible CLR requirements as well. I understand that, but why can't I simply install both?

I would appreciate some tips on this.

http://www.mernst.org/blog/archives/05-01-2005_05-31-2005.html#20 2005-05-24T08:20:54+02:00 Class.newInstance and Pawlovian dogs

I'm a Pawlovian dog. My bell is Class#newInstance. Whenever I see a framework that allows you to configure extensions via classname and does Class.forName().newInstance(), I think "you should integrate that with spring".

Here's how I did it for ULC. ULC is Canoo's "Universal Lightweight Client" - a ui framework that runs a swing ui engine on the client but all the ui and application logic on the server. ULC comes as a servlet accepts the application class as an init-parameter. For every new UI session, it will instantiate it and call start() - then you're left alone. Now you can go and find friends - typically via ServletContext. Since I prefer dependency injection over service locators I want Spring to instantiate my application class WITH dependencies.

Unsurprisingly, it's very easy. I wrapped the ULC servlet in a controller which gives it a bootstrap application class name. This bootstrap application grabs the current application context and the application bean name from the request, instantiates the application bean (MUST be prototype bean definition), and delegate all ULC lifecycle calls to it. Alternatively, my code can instantiate a whole child context per UI session if you have more beans you want to set up per session. For example:

<!DOCTYPE beans PUBLIC "-//SPRING//DTD BEAN//EN" "http://www.springframework.org/dtd/spring-beans.dtd"> <beans> <bean id="listModel" class="com.ulcjava.base.application.DefaultListModel"/> <bean id="eventDispatcher" class="org.mernst.ulcjava.EventDispatcher"> <property name="eventQueue" ref="eventQueue"/> <property name="delay" value="5000"/> <property name="handlers"> <map> <entry key="org.mernst.ulcjava.ListUpdateEvent"> <bean class="org.mernst.ulcjava.ListUpdateHandler"> <property name="model" ref="listModel"/> </bean> </entry> </map> </property> </bean> <bean class="org.mernst.ulcjava.Viewer"> <property name="listModel" ref="listModel"/> </bean> </beans>

Net result: ULC applications can be easily set-up in Spring - no more service lookup necessary. The code should appear soon on http://ulc-community.canoo.com/ .

Spring starts to be the all-integrating technology - lately we built an integration to host the WASP container and export Spring-managed web service implementations.

http://www.mernst.org/blog/archives/05-01-2005_05-31-2005.html#19 2005-05-13T12:24:04+02:00 Specs and Implementations

Something's wrong with software specifications. Almost every JSR defines an API in form of Java interfaces. Can I download these interfaces in a jar file? No, the API typically comes in form of JavaDoc that every implementation has to reverse-engineer into Java code and a "reference implementation" that contains _a_ version of the interfaces. As it stands, there are no official servlet-api jars to write software against. There's the Tomcat servlet apis and there's a lot of others. Have you found yourself with two, say, jmx.jar in your app? Nominally the same version, yet different size? Can I drop one, you ask yourself? Will something break?

Even worse are specifications that spell out algorithms over pages in pseudo-formal notation. Bullet points instead of loops. Ambiguous, natural language, open for interpretation. Have a look at the JSF specification:

Examine the FacesContext instance for the current request. If it already contains a UIViewRoot:

Set the locale on this UIViewRoot to the value returned by the getRequestLocale() method on the ExternalContext for this request.

For each component in the component tree, determine if a ValueExpression for binding is present. If so, call the setValue() method on this ValueExpression, passing the component instance on which it was found.

Take no further action during this phase, and return.

This goes on for pages. The whole request cycle spelt out in Jenglish. What the heck? Why don't they invest all the time they've debated about the words in a better base implementation with useful extension points? Instead a verification kit is produced to ensure a vendor did a correct job translating the english back into code. When I look at many bugs I encounter in standards implementations I question the quality of these TCKs.

Well, enough now, I actually wanted to read the JSF spec ...

http://www.mernst.org/blog/archives/05-01-2005_05-31-2005.html#17 2005-05-08T10:37:51+02:00 JSP: reusable chunks of markup

[This is a reply to http://www.theserverside.com/news/thread.tss?thread_id=33326#166473 ; theserverside forums wouldn't let me post this: "invalid html" although there is none]

JSP lacks in so many ways.
* reusable markup: tag files should have an invocation api from Java. I once hacked Jasper to allow that. You could get a SimpleTag instance for a path and invoke it. Spec people weren't interested. Promised me something else exciting - I'm still waiting.
* The grand unified EL still cannot call methods. ${content.isVisibleTo(user)} ?
* JSPs can't be called from somewhere else than the web app. Template staging in a CMS? No way.
* JSPs need compilation. Precompile, ja ja.
* Look ma, no escaping. Oh you're doing markup? Please use a taglib. Otherwise welcome scripting attacks.

I've had it. That's why I'd like to advertise a lightweight, fast, embeddable, XML-centric template engine I've built: https://axt.dev.java.net

http://www.mernst.org/blog/archives/04-01-2005_04-30-2005.html#14 2005-04-16T10:43:56+02:00 BBCB
I'm in!
I thought coming to the CLR team I'd be more popular among my peers, more respected by my native Australians, and maybe even win the "Australian of the year" award for my contributions to society and my display of willingness to put up with this country… Unfortunately, my dreams have not been realized. Instead, whenever I proudly proclaim to my peers that I, Joel Pobar, am a member of the Common Language Runtime team, the immediate response is “Can you tell Chris Brumme to add new stuff to his blog? I've been waiting a year now!. I've reevaluated my goals and ambitions. I've decided to cave in to the wants and needs of our customers who prefer architectural quality meat, instead of sour Australian PM potatoes. I'm campaigning to Bring Back Chris Brumme's Blog!
From Joel Pobar, via BradA .
http://www.mernst.org/blog/archives/04-01-2005_04-30-2005.html#11 2005-04-08T08:55:34+02:00 Axt Design Choices

So what was I thinking? There's few things the world needs less than a new template engine. Here's my feature list that made me roll my own anyway:

A template ist editable in an xml editor. Don't give me the <a href='<c:url value="..."/>'> mess. A template is at least be well-formed xml. This rules out #velocity.hashmichtot#, too.

A template is previewable. A design with template annotations applied stills look exactly the same in a browser. This means you can keep your "blindtext" (placeholder text) around.

A template is fed with a binding of names to (java) objects accessed through an easy yet powerful expression language (OGNL).

A template garantuees well-formed xml output. It does so by generating output to a SAX ContentHandler, not a Writer.

A template is extendable with plugins that can manipulate their body content. Just as a JSP 2.0 SimpleTag can evaluate its body on a custom Writer and apply filtering, ... an axt template can evaluate its body to a custom ContentHandler and perform any manipulation it likes.

Axt is easily embeddable. There are no dependencies on the filesystem, a compiler, a servlet engine. The only requirements are DOM, SAX, OGNL.

Of course, it must be fast. Templates and OGNL expressions are pre-parsed and thread-safe. The OGNL interpreter is also thread-safe and caches its introspection results. I've been able to churn out 1300 (small) pages per second (about 5MB/s) from an 8 processor Sun using Axt to render pages in Tomcat 5. And it scales wonderfully to many clients.

It's simple. 28 files, 1200 LOC. Apache Jasper is 30 times that big. Small things rule.

Interested? Go try it, the project has been approved meanwhile. The documentation is poor, so you want to dig into the source. Drop me a mail and I'll get you started if you have problems.
http://www.mernst.org/blog/archives/03-01-2005_03-31-2005.html#10 2005-03-03T22:57:54+01:00 Templating for XML
How do you create XML programmatically? I've been fed up with the templating options known to me for a number of reasons:

JSP: JSP does not directly target XML. You have to use custom tags to do escaping. The spec regards markup as the exceptional case: (paraphrased) "in the unlikely event you want to do xml, ... use something like c:out". There also is no JSP implementation that neither uses a web container, the filesystem nor compilation, although the JSP 2.0 API and EL would allow for such scriptless, interpreted templates.

Velocity: much easier to embed, but still targeted at producing characters, not infoset.

XSLT: garantuees well-formed output. However, takes XML as input where I'm looking for something that accepts a map of Java beans. Xalan Java extensions make some things possible but it's not a natural integration.

There may be other options, but I bet they're either plaintext or XML transformations. I'm looking for XML production out of beans. So I decided to roll my own, "axt", "attributes-only xml templates". The project at https://axt.dev.java.net has its approval pending. A copy from the description:

"A template processor for producing XML. It takes XML templates as input, garantuees well-formed output unlike JSP or Velocity, is designed to process beans as a data model unlike XSLT. axt uses the OGNL expression language for expressions. axt exclusively uses namespaced attributes as template annotations and no elements, like Zope's tal."

I have to go now but I'll write more about it later.

http://www.mernst.org/blog/archives/02-01-2005_02-28-2005.html#9 2005-02-28T09:54:10+01:00 Spring prototype definitions as a factory

Mike wrote about using spring to create job processors dynamically from job types. We use a similar pattern in our content management system. Documents in our CMS have types and different attributes based on type. Document types are grouped in a hierarchy. It's obvious to map document types to a Java class hierarchy - and it's obvious that an application programmer wants to extend these classes.

We want to provide the developer with a factory interface that you can hand an arbitrary document and you'll receive an instance of the corresponding java class. Easy enough, we thought, we'll have a map from document types to class objects, call #newInstance and #setDocument. However, it's not that easy. The application developer will want to access different application services from within such a bean - we either need to provide a service locator or the services need to be injected into the beans. I disliked the service locator idea and since we're using spring to wire our framework objects it was only natural to let the developer tap into that functionality as well. So instead of having our own map, we simply ask spring's bean factory for an implementation (which must not be a singleton, of course), based on a naming convention:

<bean id="doctypeA" class="ABean" singleton="false"/>  <bean id="doctypeB" class="BBean" singleton="false">  <property name="service">...</property> </bean>

This way, our factory is basically just a thin wrapper around a spring bean factory:

Object getObject(Document document) { result = beanFactory.getBean(document.type.name); // create bean result.document = document; return result; }

Although the principle is so simple, I still find the effect of DI stunning. This factory pattern actually reminds me of the technique of currying: in a way, what the factory holds for "docTypeB" is a parameterless function that is the result of pre-binding the "service" parameter.

http://www.mernst.org/blog/archives/11-01-2004_11-30-2004.html#6 2004-11-15T20:03:47+01:00 ariadna: memory leaks be gone

I've posted ariadna , a tool for dumping a JVM's heap and analyzing it for unwanted reference chains.

The tool, respectively its predecessor heapprofile , grew out of my frustration about commercial profilers. You have a piece of software that you know leaks memory. Go ahead and find a tool that helps you. All I found was all-encompassing profiler packages >50MB that run very slowly. And they try to visualize ten thousands of objects as boxes in a graph on a small canvas. Ridiculous. Also they record where objects are created. Pretty unnecessary, my IDE can tell me that. All I want is a tool that shows me: "there's 200 JSomethings" where there shouldn't and "this JSomething is around because ...". HAT is such a tool and it's been around forever. We used it to debug our editor back with JDK 1.2. Only it took a lot of memory and the agent required in the VM would slow things down, too. Reproducing in slo-mo.

Anyway, ariadna is my attempt at attacking this. I guess tools have become better by now but feeling up the VM from your agent is fun anyway. What it does? At first, I load all classes, tag them and remember their name. Then I iterate over the heap, again tagging each object with a unique id (I wonder how much memory that costs in the VM; have to browse the 1.5 SRL code). Finally I iterate over all references, dumping referrer and referee tag to a file. For each object, I also dump its class tag. An analyzer server maps the file into memory, sorts it by referee's tag. Now the incoming references of each object can be found in O(log(n)) which feels lightning fast on small heaps. It's so fast that the shortest reference chain from the root set is displayed with every object you navigate to. I'll should have tried larger heaps before going public ...

http://www.mernst.org/blog/archives/11-01-2004_11-30-2004.html#2 2004-11-11T00:03:13+01:00 Enter

This is my third attempt at writing my very first blog entry. I decided it would be on the blog software I chose. My web space doesn't provide any CGI/PHP/Perl and I actually don't want to administer it anyway.

Thingamablog is completely standalone, written in Java and will publish my entries as static files along with an RSS 2.0 feed via FTP. Nice. A rich client UI makes it all the better. Kudos so far. I've expected much worse of a Java UI because I've only seen worse.

In the future, I hope to let you know about me and what I'm doing. And if nobody reads this it can be my journal.

http://www.mernst.org/blog/archives/11-01-2004_11-30-2004.html#0 2004-11-10T23:37:12+01:00