27 October 2009

Pythagoras On Screen

Euclid of Alexandria "invented" geometry. Before him, there were no squares or circles. Everything was just a collection of messy shapes. One even wonders how they managed to build houses without straight lines and triangles.

Ok, just kidding...

Pythagoras of Samos actually knew a thing or two about triangles long before Euclid did, so much so that he is credited with discovering the mathematical relationship between the sides of right-angled triangles.
How are these two gentlemen related? Well, they lived a couple of centuries apart, so they are not directly related to each other as such, but in terms of mathematics they are very closely related: the Pythagorean theorem only "works properly" in Euclidean spaces.

What are Euclidean spaces? They are the spaces that satisfy the axioms of Euclidean geometry. Most of us have studied Euclidean geometry at school: we learn that the shortest path between two points is a straight line, that the sum of the angles of a triangle is 180 degrees, and other wonderful rules.
A lot of these rules, however, start falling apart when we try to apply them to non-Euclidean geometries. In general, though, if we stick to flat 2-D surfaces (like a piece of paper), Euclid's rules will survive just fine.

Let's test this.

If we draw on a piece of paper a right-angled triangle, with one leg being 300mm long and the other being 400mm long, Pythagoras taught us that the hypotenuse should be 500mm long, and indeed it is. In general, the hypotenuse will always be longer than either of the two legs.
What else do we have, apart from a piece of paper resting on a desk, that is a flat 2-D surface? Well, there's the flat LCD screen on which I'm writing this article, of course. So if I draw a segment that is 300 pixels long, at a right angle with another segment that is 400 pixels long, then I should be able to join the two open ends of each segment with a third segment that is 500 pixels long, right?

Wrong.

Why? The reason is not that difficult to find: the computer screen is not a continuous surface: it's made out of "tiles" called pixels, and each pixel is either turned on or off. There is no such thing as a fractional pixel, so you can't draw half a pixel, or two-thirds of a pixel: you either draw it entirely or you don't. In other words, the surface of my LCD screen is quantised. So when the two legs of my right-angled triangle on screen are 12 and 8 pixels long, the hypotenuse cannot possibly be 14.42 pixels. So how long is it? 14 pixels? 15 pixels? Does it round up or down?

Well... Neither.

It turns out that it depends on how you want to draw the line so that it appears continuous to the average person looking at the screen. I think two pixels can be considered contiguous if they touch each other one way or another, even if only by one corner. Sticking to this constraint, the minimum number of pixels required to draw the hypotenuse of the above triangle would be 12, which is the same number of pixels required to draw the longest leg of the triangle.
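If you want to see this in action, here's a rough sketch of mine (in Java, nothing official) of how a Bresenham-style line-drawing routine would rasterise that hypotenuse; the pixel count it reports matches the longer leg, not the Euclidean length:

// A rough sketch: count the pixels a Bresenham-style line would light up
// between (0,0) and (dx,dy), treating diagonally touching pixels as contiguous.
// Counts include both end pixels, so a horizontal run of 12 steps lights 13 pixels.
public class PixelHypotenuse {

    static int countPixels(int dx, int dy) {
        // assumes dx >= dy >= 0 (first octant)
        int x = 0, y = 0, count = 1;
        int err = dx - dy;
        while (x < dx || y < dy) {
            int e2 = 2 * err;
            if (e2 > -dy) { err -= dy; x++; }   // horizontal step
            if (e2 <  dx) { err += dx; y++; }   // vertical step in the same pass = diagonal step
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPixels(12, 8));  // 13 pixels for the hypotenuse...
        System.out.println(countPixels(12, 0));  // ...and 13 pixels for the longer leg itself
        System.out.println(Math.hypot(12, 8));   // ~14.42, the Euclidean length
    }
}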

In my head, this means two very important things:
  1. quantised surfaces are not Euclidean in nature, even though they may intuitively appear to be so
  2. the concept of "distance" between two points on a quantised surface is very, very funny
M.

10 September 2009

To Infinity and Beyond

Here's a thought...

Let's take two series, call them A and B.
Let's say each term in A is less than or equal to the corresponding term in B.
Logic would suggest that, if B is convergent, then A is convergent too.


So how about this...

Let B be the geometric series 1 + 2/3 + 4/9 + 8/27 + ... + (2/3)^N + ...
This series converges to 3.

Let A be the series 1 + 1/2 + 1/3 + 1/4 + ... + (1/N) + ...
This series is not convergent.

However, each term of A is less than or equal to the corresponding term in B, i.e.
1 <= 1,  1/2 <= 2/3,  1/3 <= 4/9,  1/4 <= 8/27,  ...  1/N <= (2/3)^N,  ...
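
For anyone who likes to check things numerically, here's a tiny sketch (class name and loop bound are mine) that prints the first few terms of each series side by side:

// Print the first few terms of A (harmonic) and B (geometric, ratio 2/3)
// side by side, and check the claimed inequality term by term.
public class TermCheck {
    public static void main(String[] args) {
        for (int n = 1; n <= 8; n++) {
            double a = 1.0 / n;                     // n-th term of A: 1/n
            double b = Math.pow(2.0 / 3.0, n - 1);  // n-th term of B: (2/3)^(n-1), since the first term is 1
            System.out.printf("n=%d  A=%.4f  B=%.4f  A<=B? %b%n", n, a, b, a <= b);
        }
    }
}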

If B is convergent, doesn't this mean that A should be convergent too?
Hmmmm...
M.

21 August 2009

Ushiro Irimi Nage

No, there's no such thing as ushiro irimi nage, as far as I know... or is there?

I was training as usual on Thursday, starting with one hour of aiki-jo (time to brush up the 13 jo kata), then off to tai jutsu. Without fail, we start with some tai no henko and morote dori kokkyu nage, and that's when I had the idea.

After years of kokkyu nage, I never cease to find inspiration from this technique. It's difficult to explain the technique in words, as most aikidokas will know, but I'll try, so here are the basic concepts (in my opinion) of the morote dori kokkyu nage basic form:

  1. Engage kokkyu.
  2. Slight step to the side, just enough to misalign the direction of attack.
  3. Lower the centre while keeping an upright posture, to break a potential arm lock.
  4. Turn and change hanmi while cutting upwards as if with aiki-ken.
  5. Enter uke's space while extending kokkyu up and to the rear (this breaks uke's balance).
  6. Rotate upper body (this starts the nage as uke starts falling backwards).
  7. Settle extending kokkyu down and to the rear (this completes the nage).
...and there was my revelation: steps 5, 6, and 7 are also the steps that complete all irimi nage techniques (at least in my head they are). Here's what I see happening in the irimi nage basic form, independently of the attack:
  1. Enter uke's space while extending kokkyu up and at the front (this breaks uke's balance).
  2. Rotate upper body as in the fourth jo suburi (this starts the nage as uke starts falling backwards).
  3. Settle extending kokkyu downwards (this completes the nage).
There are technical differences, of course. For example, the nage in irimi nage is always performed to the front, and it uses aiki-jo movements instead of aiki-ken, but both techniques are about entering uke's space strongly, extending kokkyu through uke's space, rotating the upper body and completing the nage by settling the body weight and extending kokkyu downwards.

Now, every time I think of the basic form of kokkyu nage I think of "ushiro irimi nage", or "irimi nage to the rear", and that changes the whole perspective of this technique. Is it right? Is it wrong? It doesn't matter. Every so often in aikido I find something that turns all my understandings upside down, or even throws them out of the window so that I can start again from scratch, and that's why I love it.
M.

31 July 2009

Lost in Reflection

The time for a new project had come. I was, as usual, excited about the idea of sowing the seed of a new technological creature and bearing it through the initial gestation period, until it was mature enough for someone else to take over the incubation and see it through to its autonomous digital life.

This time it was about web services. We had already decided to write them in Java, and we had a rough idea of the interfaces we were going to expose, but the rest was "carte blanche". This was particularly exciting because, even though I had worked with web services on other projects, this was the first time I was actually going to design web services from the start, so there I was, rolling up my sleeves and reading the Axis2 reference pages.

The other exciting part was that someone else was going to develop an application that would consume my web services, and due to certain timing requirements in their development lifecycle they would have to do their work in parallel with mine. This introduced an interesting challenge: the interfaces would have to be pretty much rock-solid long before any actual web services code was written.

No problem, I thought. We already knew more or less what interfaces we were going to expose, so I could literally code them in Java, build the WSDL from the compiled interfaces and ship the WSDL to the developers on the "consumer project". This way, we could then work out the web services implementation and the consumer implementation independently.

First things first: the contract is not just about the information that a specific method takes as arguments and the information it will spit out as a response. There is a bit more to it. For example, we knew that in our implementation of the web services every method should have a set of "standard" arguments in its request, like the id of the application that originated the request, or the id of the specific language requested by the consumer (e.g. EN, DE, FR, etc). We also knew every method response must include some sort of status code and status message. Finally, we also wanted to include support for digitally signing both the request and the response.

Easy, I thought, we would obviously stick all of these in the relevant super-classes in our implementation, so I created the related super-interfaces in the package that I would use to create the WSDL.

Just for clarity, it worked out to be something like this:

// Base interface for all requests and responses
public interface WebServiceMessage {
    public String getSignature();
    public void setSignature(String signature);
}

// Base interface for all requests
public interface WebServiceRequest extends WebServiceMessage {
    public String getLanguageId();
    public void setLanguageId(String languageId);
}

// Base interface for all responses
public interface WebServiceResponse extends WebServiceMessage {
    public int getStatusCode();
    public String getStatusMsg();
}

// The request for the method doSomething() in the web service
public interface SomeRequest extends WebServiceRequest {
    public int getSomeArg();
    public void setSomeArg(int someArg);
}

// The response for the method doSomething() in the web service
public interface SomeResponse extends WebServiceResponse {
    public int getSomeResult();
    public void setSomeResult(int someResult);
}

// This is the actual web service
public interface MyWebService {
    public SomeResponse doSomething(SomeRequest request);
}

Just to prove the concept, I carried on and mocked up a basic implementation for a couple more methods, then packaged and published the mock service and proceeded to give it a simple test through SoapUI. I gave SoapUI the address of my service endpoint and it quickly digested the web service's WSDL, presenting all the right methods, which was good... until I asked SoapUI to generate a sample request for one of the methods.

The request was incomplete. There were some elements completely missing. So I checked the WSDL and... shock, horror... the schema for SomeRequest only showed some members but not others!
<complexType name="SomeRequest">
  <complexContent>
    <extension base="impl:SomeRequest">
      <sequence>
        <element name="someArg" nillable="true" type="xsd:int"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

Where were the inherited members, like the language id? Lost. Gone. No trace.

Did I use the wrong arguments in java2WSDL? Nope. It turned out that java2WSDL did not support interface inheritance. Digging around online forums and googling up "java2WSDL support inheritance" unfortunately confirmed that. Apparently this is a limitation of the reflection engine they use in Axis2, to be rectified in a future release.

This exercise has resulted in a few learning points for me.

  1. even when you think you have identified your assumptions, think again: there are a lot of implicit assumptions (these are the assumptions that are not actually spelled out) in a piece of architecture, and "support for inheritance" is one of those
  2. after you have identified your assumptions, check them out: it turned out that java2WSDL supported class inheritance, but not interface inheritance (see the sketch after this list)
  3. hold off writing your mocks until you've been through the above: I could have saved a whole day of refactoring if I had waited to mock up the web service until I had checked the WSDL generation
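
For reference, this is roughly the shape of the workaround I'm hinting at in point 2: a sketch only, assuming (as the forum posts suggested) that class inheritance does survive WSDL generation. Illustrative names, each class in its own file, and definitely not the code we shipped.

// Fold the super-interfaces into bean-style base classes, so that the
// inherited members travel through the class hierarchy instead.
public abstract class WebServiceMessageBean {
    private String signature;
    public String getSignature() { return signature; }
    public void setSignature(String signature) { this.signature = signature; }
}

public abstract class WebServiceRequestBean extends WebServiceMessageBean {
    private String languageId;
    public String getLanguageId() { return languageId; }
    public void setLanguageId(String languageId) { this.languageId = languageId; }
}

// The concrete request for doSomething() now inherits signature and languageId
// through its class hierarchy, so they should show up in the generated schema.
public class SomeRequestBean extends WebServiceRequestBean {
    private int someArg;
    public int getSomeArg() { return someArg; }
    public void setSomeArg(int someArg) { this.someArg = someArg; }
}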

Mea culpa.
M.

01 July 2009

Cloud Sweet Cloud

What's this cloud thing? Where is it? What does it do? Why should I bother? Why should I care?
I care because I'm a geek, and any new-tech stuff simply gets my unconditional attention, but aside from that I have recently found more reasons to care about it.
I am working on a fairly complex project whose main characteristics are:
  • zero client entry points (all web-based)
  • multi-tier architecture
  • inter-layer messaging based on HTTP or HTTPS
  • scalability is achieved by simply adding more modules
  • pretty much any kind of load balancing solution can be implemented at the client entry point
When we implement it, we figure out how many concurrent users we might expect, and size the implementation accordingly, in terms of number of processors needed, amount of memory needed, dedicated bandwidth, etc.

The key to any implementation is, indeed, the number of concurrent users, but guess what... in many cases getting the projected number of concurrent users is like choosing the numbers to play at the lottery: a wild guess. I've seen the same happen many times throughout my career. It's just the way it goes.

You can call the country or regional manager, ask about the market projections for the first year, then talk with the marketing guys, see if there have been pre-registrations, etc., and finally they all agree on a ballpark estimate of 'N' concurrent users. Perfect! What's the confidence level on this? Hmmm... they say pretty good, with a 20% floor and a 50% ceiling. Excellent!

So what do you do? You size for N x 2 concurrent users and proceed with the implementation. Then, of course, demand turns out to be way over the projections and your business unit has to rush in new equipment to take care of it, but until it does it will effectively be losing business, and inevitably people will start pointing fingers...
Maybe sizing for N x 2 is not enough, so what do you do? Size for N x 3? N x 5? And what if the actual demand ends up sticking to the projections, or comes in lower? You then have a lot of spare capacity that pulls down the return on investment. So what's the magic multiplier for sizing a projection of N concurrent users?

Of course, there is no such thing. But wait a minute...

Enter the cloud.

There has been a lot of discussion about what 'the cloud' actually is. To me, there is no such thing as 'the cloud'. There are service providers. Some of the solutions are geared towards running web applications, like Google App Engine or Windows Azure. Other solutions are geared towards storage and backup, like Nirvanix. Some other solutions are geared towards data centre resources outsourcing, like Amazon EC2.
Amazon EC2 is the stuff I've been playing with and it's brilliant. The cost structure is somewhat exotic: you pay for CPU/hour, TB/month transfer rates, GB/month storage rates, etc. However, once you get to grips with it, I find this model a lot more deterministic than traditional costing. A systems architect will have a good idea of these figures for system usage, so budgeting should not be a serious problem. But budgeting and costing are not why I'm a big fan of this type of cloud solution.
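
Just to illustrate what I mean by deterministic, here's the sort of back-of-the-envelope sum a systems architect can do; the rates below are made up for illustration and are not Amazon's actual prices:

// Back-of-the-envelope monthly cost estimate; all rates are made-up placeholders.
public class CloudCostSketch {
    public static void main(String[] args) {
        int instances = 3;                 // number of server instances
        double hoursPerMonth = 24 * 30;    // each instance running 24/7
        double ratePerCpuHour = 0.10;      // hypothetical $/instance-hour
        double storageGb = 200;            // hypothetical storage footprint
        double ratePerGbMonth = 0.15;      // hypothetical $/GB-month
        double transferGb = 500;           // hypothetical monthly data transfer
        double ratePerGbTransfer = 0.17;   // hypothetical $/GB transferred

        double total = instances * hoursPerMonth * ratePerCpuHour
                     + storageGb * ratePerGbMonth
                     + transferGb * ratePerGbTransfer;
        System.out.printf("Estimated monthly cost: $%.2f%n", total);
    }
}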

The reason why I like it is that in the cloud there are no such things as spare capacity, forecast errors, starved resources, supplier delays, or last-minute server re-allocation traumatic stress disorder.
Mind you, I'm not saying this is the silver bullet of all scalable implementations. It has its pros and cons, and things to consider are the loss of control over parts of your data centre, the (un)reliability of the internet, remote administration costs, staff training, securing the data channels with the cloud, encryption, compression, and so on.

Yes, nobody said it doesn't need any homework, and each project is different, but for me and some of my projects this is a godsend. Need more capacity for the Christmas rush online sales? No problem, let me open my ElasticFox plugin and I'll double your capacity in less than an hour, already configured, already load balanced, and fully operational... oh, and with no downtime of course. Then what? Need to downsize after the Christmas rush? No problem, let me open my ElasticFox plugin and remove some spare capacity.

Hold on, that's too simple. Let's try this... I am going to release an updated version of the system and I am planning a good load test and a stress test to see how far I can push it, so I want a single-stack system for the stress test, and a 3-stack load-balanced system for the load test, then I want the same again but pointing at a different back-end (one has standard test data, the other has localised data in different languages). No problem, let me open my ElasticFox plugin and... well... you know the story.

M.

15 April 2009

Silly Application!

Note to self: always define in detail which parts of which file or resource in which module need to be localised.
We have this JSF-based web application where the UI is a bunch of .xhtml files, and we sent it off for localisation through third parties.

Sure enough, they did a good job with the strings. We switched the locale, and everything came up in a different language. Perfect. Then we noticed that a bunch of forms would not submit any more, often throwing JavaScript errors.

It didn't take long to find the problem: they had localised the names of the controls in the .xhtml files too!
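
Just to make the failure mode concrete, here's a hypothetical sketch of the server side (shown as a plain servlet for simplicity, even though the real application is JSF-based):

// Why localised control names break things (hypothetical names, sketch only).
import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LoginServlet extends HttpServlet {
    @Override
    protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // The English form posts a field named "user"...
        String user = req.getParameter("user");
        // ...but the localised French form now posts "utilisateur" instead,
        // so on the French locale this comes back null and the submit falls over.
        if (user == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "missing parameter: user");
            return;
        }
        resp.getWriter().println("Hello, " + user);
    }
}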
Silly application! If it expects data from a control named "user" on an English locale, why can't it work out that on a French locale the data should be picked up from a control named "utilisateur" instead? :-)

M.

07 April 2009

Architectural Inertia

Inertia: indisposition to motion, exertion, or change (source: Merriam-Webster Online Dictionary)


What happens when you subscribe to some web-based service and then you forget your password? No big deal, you can usually click on a convenient hyperlink next to the logon box labelled "Forgot your password?". That usually takes you to some sort of password-retrieval process involving the service provider sending you an email with a specially crafted URL; you then open that email, click on that URL, and enter a new password in the web form that has just opened up for you. Job done.

There are many variations on the theme, some of which are more involved than others, for example asking a whole set of personal questions to confirm your identity, but the general idea is always the same, and in many cases you can also retrieve your username through a similar procedure. These are good examples of helping users help themselves when a problem arises, or automated technical support.

Web applications, however, are moving away from usernames in favour of some other identifier that is more directly linked with the user, like the user's primary email address.

A number of web application designers, therefore, thought it would be a good idea to keep the same theme for automated tech support by offering right next to the logon box a convenient hyperlink labelled... "Forgot your email address?"

:-)

31 March 2009

The Cloakroom Pattern

I am writing this in the context of a web application.

In the "good old days", browsers were single-viewed: there was no such thing as tabbed browsing. If you wanted to browse multiple web pages at the same time, you had to open multiple browser windows. Web architectures, application servers and frameworks have evolved to satisfy this single view of the world by managing context at three different levels: application, session and request.

A request exists when the client sends a message to the server. The moment I open my browser and google up some stuff, a request springs to life. This can store transient (short-lived) information, like the terms I want to search for, which are unlikely to be re-used after the request has been satisfied.
At the application level, a context exists for the entire life of the application, until the server is restarted. For example, I might need to audit certain user activities to a database, so I might want to load the database connection details when my application starts up and store them in the application context.

Somewhere between the application context and the request context, a web application might also want to manage a user session. For example, when I am browsing my web mail, there must be some kind of information that persists between one request and the next, so that, for example, I don't need to log in again when I move from one mail item to the next.
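
In plain servlet terms (a rough sketch using the standard javax.servlet API, with illustrative values), the three levels look something like this:

// The three context levels in plain servlet terms (illustrative values only).
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ScopesServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Application scope: lives until the server/application is restarted.
        getServletContext().setAttribute("auditDbUrl", "jdbc:mysql://localhost/audit");

        // Session scope: survives across requests from the same user.
        req.getSession().setAttribute("loggedInUser", "marco");

        // Request scope: gone as soon as this request has been answered.
        req.setAttribute("searchTerms", req.getParameter("q"));
    }
}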

How does caching take place, and why?

Let's say I sell pizza, and that part of my web application has a search function that goes through my catalog and returns to the user all the pizza styles that match the user's choices. Let's also say that the application stores the search criteria in an object that can be cached, let's call it searchObj, so if the user refreshes the page (without changing the search criteria), the application saves time and resources by simply re-using the same data instead of making a new round trip to the database.

According to what we said above, if searchObj needs to be persisted across requests, it makes sense to cache it at the session level.
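
In code, the naive session-level cache would look roughly like this (a sketch with names of my own invention), and this is exactly the version that gets into trouble with multiple tabs:

// The naive session-level cache for searchObj (sketch only): one slot per
// session, so a second tab quietly overwrites what the first tab cached.
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class PizzaSearchHelper {

    public SearchCriteria getSearchObj(HttpServletRequest req) {
        HttpSession session = req.getSession();
        SearchCriteria cached = (SearchCriteria) session.getAttribute("searchObj");
        String terms = req.getParameter("terms");
        if (cached != null && (terms == null || terms.equals(cached.getTerms()))) {
            return cached;                        // page refresh: re-use the cached criteria
        }
        SearchCriteria fresh = new SearchCriteria(terms);
        session.setAttribute("searchObj", fresh); // new search: replace the cached criteria
        return fresh;
    }

    // Minimal criteria holder, just for the sketch.
    public static class SearchCriteria {
        private final String terms;
        public SearchCriteria(String terms) { this.terms = terms; }
        public String getTerms() { return terms; }
    }
}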

So here I am as a potential customer using this pizza web application, searching for pizza that contains ham, so I type "ham" in the input box, click the submit button and look at the resulting list. All the listed pizzas have ham in the ingredients. If I happen to refresh the browser, the application simply re-uses the same list without making a new round trip to the database.

Now let's say I open a new browser tab (not a new browser window) to display the results for a different search. This time I want to search for pizza that contains olives, so I type "olives" in the input box, click the submit button and look at the resulting list. All the listed pizzas have olives in the ingredients. Great.
Now I go back to the previous browser tab, the one with ham-based pizzas, and hit the refresh button. All the listed pizzas now have olives in the ingredients.


What happened?

It happened that searchObj was overwritten by the second search, but how?
Let's think of this scenario in a different way. Let's say I need some milk, and that I suffer from a particularly bad memory, so before I forget I decide to write "Milk" on a post-it note and stick it to the door, then I go to get my jacket, car keys, etc. Now let's say my lodger, the sentient invisible pet chimp Iain, needs some fruit juice but instead of writing "Fruit Juice" alongside "Milk" on my post-it note, he decides to replace my post-it note with another one saying "Fruit Juice". Now I'm ready to go out, but of course I have forgotten what I needed to buy, so I pick up the post-it note from the door and happily go on to buy... fruit juice!

In this example, the post-it note is searchObj, Iain and I are the request-scope beans activated from two different tabs of the same browser, and the door is the session. Assume my house only has one door, the entrance/exit door (multiple tabs on the same browser share the same session).


How can we solve the problem?

In terms of "post-it note on the door", it's fairly easy: we draw two squares on the door and label them "Marco" and "Iain". Now we can use our own post-it notes, as long as we stick them in our own designated area on the door.


How does that translate into a web application?

We need to think of this type of context as sitting somewhere between the request scope and the session scope. If we think of each browser tab as a different "view" of our user session, then we can talk of view-level context and view-scoped objects. However, this is not a built-in functionality in the well-known web application frameworks or containers, so we need to simulate it, but how?

In the above example, we said the door represents the session, so we need to stick into the session some kind of container that can hold labelled compartments. How about a Map, for example a Hashtable? Yep, could do with one of those, but how do we actually generate the keys? In other words, how do we make sure that each tabbed view of the same browser unequivocally identifies itself when posting information to the session and retrieving information from the session?

I'm not sure we can handle the "unequivocally" part, but here's what I would do: I would use the coat check pattern, also referred to as the cloakroom pattern. I don't think you'll find that in reputable software engineering books, so don't bother looking.

This is a snippet from Wikipedia's definition of "Cloakroom".
"Attended cloakrooms, or coat checks, are staffed rooms where coats and bags can be stored securely. Typically, a ticket is given to the customer, with a corresponding ticket attached to the garment or item."

In particular, you'll see that tickets are generally handed out in sequential order, and that you don't actually need to show personal ID documents when picking up your coat: you simply produce the ticket that was given to you when you handed in your coat. For our web application, the tickets are issued by some sort of decorator class that intercepts HTTP requests and does something like this...
  1. check if there is a querystring variable called "coatNum" (or whatever you fancy)
  2. if there is one, do nothing, otherwise increment an internal counter and use it to decorate the querystring by appending the "coatNum" variable name and value
For a JSF application, this might be a phase listener (maybe for the RESTORE_VIEW event?). For an IceFaces application, things have already been worked out in the form of the "rvn" querystring variable.
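
Outside JSF, a plain servlet filter could play the ticket office. Here's a rough sketch (the coatNum parameter and viewCounter attribute names are mine) that issues a ticket from a session-bound counter and redirects once so the URL carries it from then on:

// A sketch of the "ticket office" as a servlet filter. Handles GET navigations
// only; if the request has no coatNum, issue the next ticket and redirect.
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpSession;

public class CoatCheckFilter implements Filter {

    public void init(FilterConfig config) { }
    public void destroy() { }

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse resp = (HttpServletResponse) response;

        if (req.getParameter("coatNum") == null) {
            HttpSession session = req.getSession();
            Integer counter = (Integer) session.getAttribute("viewCounter");
            int next = (counter == null) ? 1 : counter + 1;
            session.setAttribute("viewCounter", next);

            String query = req.getQueryString();
            String url = req.getRequestURI()
                    + (query == null ? "?" : "?" + query + "&") + "coatNum=" + next;
            resp.sendRedirect(url);               // decorate the query string with the ticket
            return;
        }
        chain.doFilter(request, response);        // ticket present: carry on as normal
    }
}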

For added security, some might argue that the view number should not be easily guessed, so sequential numbering is actually discouraged, but remember that we are talking about different tabs or views of the same browser window. In any case, just for clarity, I will stick to a counter that gets incremented every time it's used.

There is a second part to this: once we know what view number the request wants to use, how do we use that view number to organise our view-scoped objects? We said we could use a Map, but where does that live? In the real-life coat check scenario, say at a classical concert, there can be multiple coat rooms, with each coat room used by multiple punters. This might suggest that the Map holding view-scoped objects should be application-scoped. Even though it is possible to do so, it would require additional overhead in terms of resources, because it would have to hold *all* view-scoped objects for *all* users in the entire application. Also, we would have to write additional code to manage object expiry and cleanup, otherwise we would see the Map growing to infinity and beyond. There are also some security and privacy concerns, since every request would have access to *all* view-scoped objects.

One solution is therefore to stick the Map in the session, or a session-bound bean. As a result, the internal counter that identifies the view number must also be session-bound, so that it starts at zero every time a new session is generated.

In summary, here is what I would do every time a new request comes in:
  1. use an internal session-bound variable to generate view identifiers (e.g. a counter) and a session-bound Map to cache view-level objects
  2. intercept requests and check for a query string variable that identifies the view number
  3. if it's not present, then decorate the query string with a variable that identifies the view number
  4. if the session-bound Map already contains an object for the given view number, then discard the object received from the request and re-use the cached object instead, otherwise take the object from the request and cache it
  5. process the request and return a response to the client
Hold on a minute... what if the request actually uses more than one object?


In that case we don't simply have a session-bound Map, but rather a Map of Maps. In other words, the session-bound Map will still be keyed by view ID, but the value for a given view ID will be a Map of objects, keyed by object ID. We can therefore talk about a "view map" being a collection of "object maps".
This is the revised workflow:
  1. use an internal session-bound variable to generate view identifiers (e.g. a counter) and a session-bound Map (the view map) to cache view-level object maps
  2. intercept requests and check for a query string variable that identifies the view number
  3. if it's not present, then decorate the query string with a variable that identifies the view number
  4. for the view-level object that should be cached, find its identifier (might well be a simple obj.toString() call)
  5. if the view map contains an object map for the given view ID, then retrieve it, otherwise create a new object map and put it in the view map for the given view ID
  6. if the object map contains an object with the same object ID, then discard the object received from the request and re-use the cached object, otherwise take the object from the request and put it in the object map
  7. process the request and send a response to the client
The mechanism can also be extended with an expiration algorithm that kicks in on steps 5 and 6, so that cached objects are refreshed from time to time if needed, but that is another matter altogether.
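
To make the revised workflow a bit more concrete, here's a rough sketch of the session-bound view map of object maps (all names are mine):

// A sketch of the session-bound view map: one object map per view (tab),
// keyed by the coatNum ticket, each holding that view's cached objects.
import java.util.HashMap;
import java.util.Map;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

public class ViewScopeStore {

    private static final String VIEW_MAP_KEY = "viewMap";

    // Returns the cached object for (viewId, objectId); if there is none,
    // caches and returns the candidate that came in with the request.
    @SuppressWarnings("unchecked")
    public static Object getOrCache(HttpServletRequest req, String objectId, Object candidate) {
        String viewId = req.getParameter("coatNum");   // the ticket issued by the decorator
        HttpSession session = req.getSession();

        Map<String, Map<String, Object>> viewMap =
                (Map<String, Map<String, Object>>) session.getAttribute(VIEW_MAP_KEY);
        if (viewMap == null) {
            viewMap = new HashMap<String, Map<String, Object>>();
            session.setAttribute(VIEW_MAP_KEY, viewMap);
        }

        Map<String, Object> objectMap = viewMap.get(viewId);
        if (objectMap == null) {
            objectMap = new HashMap<String, Object>();
            viewMap.put(viewId, objectMap);
        }

        Object cached = objectMap.get(objectId);
        if (cached != null) {
            return cached;               // re-use the view's own copy, ignore the incoming one
        }
        objectMap.put(objectId, candidate);
        return candidate;
    }
}

With something like this in place, the pizza search would call something along the lines of ViewScopeStore.getOrCache(request, "searchObj", searchObj), and each tab would quietly get its own copy.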