Fair Games

Sep 7, 2008

(I've been neglecting the 'Accessibility' post category at this site for a while now, and it's high time to correct that mistake. Expect some more posts in this area over the coming days.)

There's a really great article that's been posted about the work my dad is doing with Maze Day. Maze Day is a yearly event geared towards children with disabilities, and held at the Computer Science Department at UNC. Each year, the students come out to try out new games and technologies being developed for them. Some really cool stuff has been developed over the past few events, and I know the students always have a blast taking part. Check out the article to learn about the cool things being developed (and given away for free!) for kids with disabilities.

I just completed the seventh and final installment of the Harry Potter series: Harry Potter and the Deathly Hallows. Without a doubt, this final volume is the pinnacle of the series. The narrative is unlike any of the previous installments, and reads like a non-stop action movie. And while I'm on the topic of movies, I will go ahead and predict that the feature film for this story will not do it the appropriate justice. To fully appreciate this story, and the overall arc of the boy wizard, one must spend time with the books.

I'll admit that I was apprehensive about going into this final story; a little scared, even. A great sense of foreboding precedes the reader into this final volume, and never once lets go. My nerves are pretty shot as a result (J. K. Rowling is truly a master of the cliffhanger). Thankfully, I can say that the finale is well worth the journey through seven years in the life of Harry Potter.

Completing this series is difficult. I've become friends with the characters in these books, and to know that their adventures are over is a little sad. But, as I mentioned in a recent post, the reading bug has bitten me once again, and I look forward to delving into other worlds.

If you haven't read this series, pick it up. If you think the series is just for younger readers, think again. And if you're persistent enough to read through all seven volumes, you will be rewarded. I feel safe enough to say that these books now rank among my favorites, and will hold a treasured spot on my bookshelves.

There's an interesting article offering a preview of the next major release of the WordPress blogging platform. It looks like the old administration look and feel will be returning, especially on the write panel. I'm looking forward to having more vertical screen space when writing, but these changes will (yet again) take some getting used to. There are lots more changes in store, so 2.7 should be an interesting release.

Google Chrome

Sep 4, 2008

There's an incredibly insightful comic (hat tip to Dustin) on the new Google Chrome web browser. It explains a number of the design decisions that the Chrome team has made, and the ramifications behind them. There are some very interesting ideas in this new web browser:

  • Instead of taking the pure multi-threaded route, Google has instead opted for a multi-process route. According to their explanation, this requires a larger up-front memory quota, but reduces memory fragmentation over time (the cause of the much misunderstood 'memory leak' in Firefox).
  • The user interface is quite clever, with tabs appearing above all of the other browser chrome. This groups the controls more logically, and reinforces the separate processes model (you can drag tabs from one window to another, for example).
  • Chrome's security model is clever, again thanks to the multi-process model.

As can be expected with this kind of thing, the media is buzzing about this new entry into the browser space. Some people are heralding its arrival, while others are brushing it off. There are several problems I foresee with Chrome that I believe will prevent it from becoming the new de facto web browser:

Accessibility
This is the biggest potential flaw with this web browser. According to one report, Chrome is far from accessible. No matter how good Chrome turns out to be from a functionality point of view, if it's not accessible, it won't be accepted by major corporations or government entities. Given Google's very poor track record, I don't have high hopes for improvements in this area.
No Add-ons
As far as I know, Chrome does not support add-ons like Firefox. That means no Adblock Plus, CoLT, or Firebug. That's a deal breaker for me.
Security Concerns
Google's security model for Chrome is clever, but as security problems are found, how quickly will they be patched? Google has never been prompt on releases (the last Google Talk update was in 2006), so I'm leery of how readily they will respond.
Stupid Name
Chrome is a ridiculous name. How many millions of other things out there have the word 'chrome' in them? It doesn't stand out, and seems a little bland, in my opinion.

I haven't yet downloaded the browser to try it out, but I plan on doing so soon. Have you tried it out? If so, what do you think?

Behind the Demoman

Aug 30, 2008

The Team Fortress 2 Blog has a really insightful article on the design process behind the demoman class. Most interesting is the fact that the demoman started out as a red-haired Caucasian which, in the author's words, was "perhaps a little too much like a groundskeeper in Springfield." These behind-the-scenes looks at the game design process are really interesting to read, making the TF2 blog a worthy bookmark.

Order of the Phoenix

Aug 29, 2008

In order to properly describe my reading experience with Harry Potter and the Order of the Phoenix, I'd like to talk briefly about another, unrelated book.

The seventh and final book in the Chronicles of Narnia series, entitled The Last Battle, is perhaps one of the most frustrating books I've ever read. Strangely enough, it's also the most gratifying of the series. For what must be easily three quarters of the book, the reader is forced through one bad event after another. No hope ever presents itself and all seems lost. At one point, I seriously considered putting the book down for good. Thankfully, I didn't do so. The final quarter of the book makes the difficulty of the first three quarters worthwhile.

The Order of the Phoenix has a similar resonance with me. Much darker than the previous volumes, the fifth installment of the Harry Potter series is a frustrating read at first. Bad fortune seems to follow Harry around, affecting everyone around him. This makes the reader's job an unpleasant one. Thankfully, about two-thirds of the way through the book, things begin to turn around. As with the previous installment of the series (Harry Potter and the Goblet of Fire), this book ends on a rather melancholy note. This makes me a little anxious about what's to come in the final two volumes, especially the final one.

All in all, I'm really enjoying the series; much more than I thought I would originally. These books have rekindled my interest in reading, which I'm excited about. I used to read all the time when I was younger, but my efforts have been focused elsewhere for the past several years. Thankfully, that spark has been ignited again. I'm already thinking about what to read after I conclude the Potter series. Maybe I'll take a stab at the Lord of the Rings books...

Just the other day, after weeks of nagging by the automatic update daemon, I installed Windows XP service pack 3 on my desktop machine. So far, so good. I was quite leery of installing the update pack, seeing as Microsoft has fallen off the beam of competent programming in recent times. In fact, I'm still putting off updating my work laptop, out of fear that something will stop working. I've read reports online (and heard firsthand accounts at work) of automatic updates failing to install properly after installing SP3. Since no updates have been pushed out since I updated my system, I can't say one way or the other.

A few weeks ago, I was in search of something new to read, so I decided to pick up the first six volumes of the Harry Potter series. Having only seen a couple of the movies (the first and third movies, I now recall), I wasn't too familiar with the story line. What better way, then, to read the whole story from the beginning?

I just now completed the fourth book (Harry Potter and the Goblet of Fire), and will begin on the fifth (Harry Potter and the Order of the Phoenix) tonight. Without a doubt, J. K. Rowling is one of the great authors of our time. Every book I've read so far has been a delight, packed with action that runs the gamut of emotions. I've laughed out loud a number of times (there are some very funny scenes in each book), I've been tense with anticipation at the climax of each story, and there have even been a few sad moments, especially in the book I just finished (which is quite dark at the end).

I'm glad that I've finally gotten around to reading this series. If you haven't read them yet, I highly recommend doing so.

Team Fortress 2 Arena

Aug 21, 2008

The new arena game type in Team Fortress 2 is a strange beast. Players are given one life per round, and there's essentially no health to be had in each map. Rounds are played until either an entire team has been killed off, or a team captures the one and only capture point (which becomes active after 60 seconds). If a team wins 5 rounds in a row, the teams are scrambled and play begins again. I like this feature, but I think the limit should be lowered to 3 rounds in a row.

I've been playing this game type each night since its release on Tuesday, and I have to admit that I don't really like it. This most likely stems from the fact that I'm no good at arena mode; I'm often the first to be killed, and standard Team Fortress 2 tactics don't apply. Arena feels nothing like the game I've come to love, and I'm rather disappointed that Valve has expanded the game in this direction. A 'last man standing' approach doesn't make for great team play, which is the entire point of the game. For now, I'll be sticking to the classics: payload, control point, and capture the flag.

In my previous article on Unicode, I discussed a little bit of background on Unicode, how to prep PHP to serve UTF-8 encoded content, and how to handle displaying Unicode characters. There's still a bit more we need to talk about, however, before we can truly claim internationalization support.

Prepping MySQL for Unicode

MySQL allows you to specify a character encoding at four different levels: server, database, table, and column. This flexibility becomes quite useful when working on a shared host (like I do at DreamHost). In my particular case, I do not have control over either the server or database setting (and both are unfortunately set to latin1). As a result, I set my desired character encoding at the table level.

To see what your current server and database settings are, issue the following SQL commands at the MySQL command prompt:

SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'character_set_database';

To see what character set a table is using, issue the following command:

SHOW CREATE TABLE myTable;

If you are fortunate enough to have control over the database-level character set, you can set it using the following command:

(CREATE | ALTER) DATABASE ... DEFAULT CHARACTER SET utf8;

The table-specific commands are similar:

(CREATE | ALTER) TABLE ... DEFAULT CHARACTER SET utf8;

Column level character encoding can be specified when creating a table or by altering the desired column:

CREATE TABLE MyTable ( column1 TEXT CHARACTER SET utf8 );
ALTER TABLE MyTable MODIFY column1 TEXT CHARACTER SET utf8;

I personally recommend setting the character encoding as high up as you have the capability to. That way, you won't have to remember to set it on any new tables or columns (or even databases).

If you have existing tables that do not use the utf8 character encoding, you can convert them with a simple command:

ALTER TABLE ... CONVERT TO CHARACTER SET utf8;

Be very careful when attempting to convert your data. The convert command assumes that the existing data is really encoded in the character set the column currently declares (latin1, in my case). Any UTF-8 bytes that were already stored in those latin1 columns will be treated as latin1 characters and become corrupted during the conversion process. There are some ways to get around this limitation, which may be helpful if you've already got some Unicode data stored in your database.

Communicating with MySQL

Once our tables are ready to accept Unicode data, we need to make some minor changes in the way we connect our application to the database. Essentially, we will be specifying the character encoding that our connection should use. This call needs to be made very early in the order of operations. I personally make this call immediately after creating my database connection. There are several ways we can set the character encoding, depending on the version of PHP and the programming paradigms in use. The first method involves a call to the mysql_query() function:

mysql_query("SET NAMES 'utf8'");

An alternative to this in PHP version 5.2.3 or later involves a call to the mysql_set_charset() function:

mysql_set_charset('utf8',$conn);

And yet another alternative, if you're using the MySQL Improved extension, comes via the set_charset() function. Here's an example from my code:

// Change the character set to UTF-8 (have to do it early)
if(! $db->set_charset("utf8"))
{
    printf("Error loading character set utf8: %s\n", $db->error);
}

Once you have specified the character encoding for your database connection, your database queries (both setting and retrieving data) will be able to handle international characters.

Accepting Unicode Input

The final hurdle in adding internationalization support to our web application is accepting Unicode input from the user. This is pretty easy to do, thanks to the accept-charset attribute on the form element:

<form accept-charset="UTF-8" ... >

Explicitly setting the character encoding on each form that can accept extended characters from your users will solve all kinds of potential problems (see the "Form submission and i18n" link in the Resources section below for much more on this topic).
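Even with accept-charset set, it can be worth checking on the server that what arrived really is UTF-8. Here's a minimal sketch of one way to do that. The is_valid_utf8() helper is a hypothetical name of my own (not a built-in); it relies on the fact that preg_match() with the /u modifier returns false when the subject is not valid UTF-8.

```php
<?php
// Hypothetical helper (not part of PHP): true when $s is valid UTF-8.
// An empty pattern always matches, so the only way this call can fail
// is if the /u modifier rejects the subject as malformed UTF-8.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$good = "caf\xC3\xA9"; // "café" encoded as UTF-8
$bad  = "caf\xE9";     // "café" encoded as ISO-8859-1, invalid as UTF-8

$goodIsValid = is_valid_utf8($good);
$badIsValid  = is_valid_utf8($bad);
```

What to do with input that fails the check (reject it, or attempt a conversion) is a design decision this sketch leaves open.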

Potential Pitfalls

Since PHP (prior to version 6) considers a character just one byte long, there are some potential coding problems that you might run into in your application:

Checking String Length

Using the strlen function to check the length of a given string can cause problems with strings containing international characters. For example, a string comprising 10 characters of a double-byte alphabet would return a length of 20. This might cause problems if you are expecting the string to be no longer than 10 characters. Thankfully, there's an elegant hack that we can use to get around this:

function utf8_strlen($string) {
    return strlen(utf8_decode($string));
}

The utf8_decode function will turn anything outside of the standard ISO-8859-1 encoding into a question mark, which gets counted as a single character in the strlen function (which is exactly what we wanted). Pretty slick!
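If you'd rather not lean on utf8_decode() (it is deprecated as of PHP 8.2), the same count can be had by skipping UTF-8 continuation bytes. This is only a sketch: utf8_char_count() is a name I made up for illustration, and it assumes the input is already valid UTF-8.

```php
<?php
// Hypothetical helper: count characters (code points) in a UTF-8 string.
// Assumes $s is valid UTF-8; malformed input gives a meaningless count.
function utf8_char_count(string $s): int
{
    $count = 0;
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        // Continuation bytes have the bit pattern 10xxxxxx. Every other
        // byte starts a new character, so only those are counted.
        if ((ord($s[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$word  = "caf\xC3\xA9";          // "café": the "é" takes two bytes
$bytes = strlen($word);          // byte count
$chars = utf8_char_count($word); // character count
```

When the mbstring extension is available, mb_strlen($word, 'UTF-8') should give the same answer with less code.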

Case Conversions

Forcing a particular case for string comparisons can be problematic with international character sets. In some languages, case has no meaning. So there's not a whole lot that one can do short of creating a lookup table. One example of such a lookup table comes from the mbstring extension. The Dokuwiki project implemented this solution in their conversion to UTF-8.

Using Regular Expressions

The Perl-Compatible Regular Expression (PCRE) functions in PHP support the UTF-8 encoding, through use of the /u pattern modifier. If you are making use of regular expressions in your application, you'll definitely want to look into this modifier.
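To make the difference concrete, here's a small sketch comparing the same pattern with and without the modifier. The strings are written with explicit byte escapes so the example doesn't depend on how the source file itself is saved.

```php
<?php
$e = "\xC3\xA9"; // "é" in UTF-8: one character, two bytes

// Without /u the dot matches a single byte, so "exactly one character"
// fails on a two-byte sequence.
$byteMode = preg_match('/^.$/', $e);

// With /u the dot matches a whole code point.
$utf8Mode = preg_match('/^.$/u', $e);

// /u also lets Unicode properties such as \p{L} (any letter) work on
// multi-byte text.
$isWord = preg_match('/^\p{L}+$/u', "caf\xC3\xA9");
```

The same applies to quantifiers and character classes: without /u they operate on bytes, which is rarely what you want for user-facing text.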

Additional Resources

In learning about how to add internationalization support to web applications, I gathered a number of excellent resources that I highly recommend bookmarking. Without further ado, here's the list I've created:

Visual Studio 2005 introduced support for doing parallel builds in solutions that contain more than one project. This is a great idea, especially on systems equipped with multi-core processors. Unfortunately, the developers at Microsoft apparently don't know how to program a multi-threaded application.

Suppose we're building two projects within one solution, call them Project A and Project B. If A and B exist in completely different folders, and are mutually exclusive in every way possible, the parallel build option is quite handy (improved build performance). However, if projects A and B share any code, any code at all, you run the risk of build failures. It seems as though Visual Studio doesn't lock files appropriately during the build process. So, if each instance of the compiler tries to build the same file at the same time, one of them will fall over and die, complaining that "no class instances were found."

It's shocking to me that something so seemingly simple could be broken in an application of this caliber.

Heavy Updates

Aug 13, 2008

The official Team Fortress 2 blog has word of the upcoming update to the heavy class. There are tons of additions in the pipeline:

  • New game type (as yet unannounced) with five new maps
  • New payload map
  • New community made map
  • Three unlockable weapons
  • Thirty-five new achievements

An official heavy update site is revealing new information each day this week, as well as Monday and Tuesday of next week. So far, they've revealed the new community map (cp_steel), one of the new unlockable weapons (a pair of boxing gloves), and the list of achievements (though none of them have descriptions yet). As before, a free weekend will be held for folks new to the game. I'm looking forward to this update; it'll give me a reason to jump back into TF2.

One Bad Mutha

Aug 12, 2008

I was saddened to hear that music legend Isaac Hayes died on Sunday. He was an incredible composer and performer, and his additions to the music world will be greatly missed. Of his many works, my personal favorite (and, in my opinion, his best) is the soundtrack to the classic 1971 film Shaft. I highly recommend picking it up; it's top quality stuff, especially if you're a jazz and soul fan. While you're at it, pick up the film too; I consider it one of my top ten favorite movies.

Kayaking Fun

Aug 2, 2008

My dad and I took a 4-hour kayaking class this afternoon, from the folks at Frog Hollow Outdoors. The "intro to kayaking" class was incredibly thorough, and I learned a ton about how to kayak correctly. Topics covered included a number of paddling strokes, wet exits (what to do when your kayak flips upside down with you in it), deep water rescues, and more. It's some of the best $68 I've ever spent. I had a blast, learned way more than I thought I would, and got excellent exercise (I'm so tired at the moment).

If you're in the Triangle area, and you're looking for a way to learn about kayaking (or canoeing for that matter), check out Frog Hollow. Our particular instructor, a guy by the name of Banks, was incredibly knowledgeable and very friendly. It was well worth the trip, and I'd do it again in a heartbeat. My next goal: river kayaking fundamentals. I can't wait.

Fixing Pathfinding

Jul 29, 2008

I just finished reading an excellent article on how to fix pathfinding in games. The author presents a number of excellent examples of how today's pathfinding can break (with examples from legendary games like Oblivion and Half Life 2), and offers a great solution: use a navigation mesh instead of a waypoint graph. Genius.

I ran across another weird and subtle bug in Visual Studio 2005. If you've got a solution with many projects in it, you can set one of those projects to be the default project at startup (i.e. when you open the solution file). But this setting apparently resides in the user options file (.suo), which is something we don't keep in our code repository (since it differs for every user). So how can you set a default startup project that affects anyone working with your code? Simple: hack the solution file.

Thankfully, the solution file is just plain text. Apparently, if there's no user options file for a given solution, Visual Studio 2005 simply selects the first project it comes across in the solution file. Here's a quick example of what a solution file looks like (wrapped lines marked with »):

Microsoft Visual Studio Solution File, Format Version 9.00
# Visual Studio 2005
Project("{3853E850-5CD7-11DD-AD8B-0800200C9A66}") = "ProjectA", »
"projecta.vcproj", "{D9BA97DE-0D09-4C35-99D6-CC4C30A6279C}"
EndProject
Project("{3853E850-5CD7-11DD-AD8B-0800200C9A66}") = "ProjectB", »
"projectb.vcproj", "{E1D73B44-57D9-4202-A92A-0296E3583AC4}"
EndProject
Global
{ ... a bunch of junk goes here ... }
EndGlobal

In this case, Project A will be the default startup project. To make Project B the default, simply move its associated lines above Project A in the file, like so:

Microsoft Visual Studio Solution File, Format Version 9.00
# Visual Studio 2005
Project("{3853E850-5CD7-11DD-AD8B-0800200C9A66}") = "ProjectB", »
"projectb.vcproj", "{E1D73B44-57D9-4202-A92A-0296E3583AC4}"
EndProject
Project("{3853E850-5CD7-11DD-AD8B-0800200C9A66}") = "ProjectA", »
"projecta.vcproj", "{D9BA97DE-0D09-4C35-99D6-CC4C30A6279C}"
EndProject
Global
{ ... a bunch of junk goes here ... }
EndGlobal

Don't forget to grab the end tags of each project (and any child content that may live between them).
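If you have to do this more than once, the move can be scripted. Below is a rough sketch in PHP (simply because that's what I have handy). The promote_project() function is a hypothetical helper of my own; it assumes each project block starts with a line beginning "Project(" and ends with a line containing only "EndProject", as in the examples above, and it works on the file's text rather than touching the disk.

```php
<?php
// Hypothetical helper: return the solution text with the named project's
// block moved ahead of all other project blocks.
function promote_project(string $sln, string $name): string
{
    // One block = from a line starting with "Project(" through the next
    // line that is exactly "EndProject" (child content included).
    $pattern = '/^Project\(.*?^EndProject[ \t]*(?:\R|\z)/ms';
    if (!preg_match_all($pattern, $sln, $matches, PREG_OFFSET_CAPTURE)) {
        return $sln; // no project blocks found
    }

    $firstOffset = $matches[0][0][1];
    foreach ($matches[0] as [$block, $offset]) {
        if (strpos($block, '= "' . $name . '"') === false) {
            continue;
        }
        if ($offset === $firstOffset) {
            return $sln; // already first
        }
        // Cut the block out, then splice it in where the first block began.
        $without = substr($sln, 0, $offset) . substr($sln, $offset + strlen($block));
        return substr($without, 0, $firstOffset) . $block . substr($without, $firstOffset);
    }
    return $sln; // project name not found
}
```

Usage would look something like file_put_contents($path, promote_project(file_get_contents($path), 'ProjectB')), ideally after making a backup of the solution file.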

Dustin and his wife recently uncovered an interesting limitation of my Monkey Album software: characters outside of the ISO-8859-1 (Latin 1) character set don't render properly. This comes as no surprise, seeing as I didn't design for Unicode. Being a rather egregious display error, I decided to set out and fix the problem. In the process, I learned quite a lot about Unicode, and how it affects web applications. This post will be the first of two detailing how to add Unicode support to a web application. I will only be exposing the tip of the Unicode iceberg in these posts. The ideas and practices behind Unicode support can (and do) fill the pages of many books. That said, let's jump in.

Brief Background

For the uninitiated, Unicode is a coded character set. That is, it maps a unique scalar value (a code point) to each character in a character set. ASCII is another example of a coded character set. Each character in a coded character set is intended to be encoded using a character encoding scheme. ISO-8859-1 is an example of a character encoding scheme.

It is important to note that ISO-8859-1 is the default encoding for documents on the web served via HTTP with a MIME type beginning with "text/". So, if you're not set up to specifically serve another encoding, your web pages are most likely using ISO-8859-1. This works just fine if you speak English or a subset of European languages. But because the ISO-8859-1 character encoding uses only 8 bits for its encoding scheme, it is limited to 256 possible characters. It turns out that 256 characters isn't enough for international text representation (the Chinese and Japanese languages come to mind). What can we do?

Thankfully, we have a solution in Unicode. A number of Unicode encoding schemes are available for us to use: UTF-7, UTF-8, UTF-16, and UTF-32. Each has its merits and drawbacks, but it turns out that UTF-8 is the encoding of choice in the computing world (it's a nice trade-off between space allocation and capability). As a bonus, UTF-8 works nicely with ASCII, which makes migrating English-based websites much easier.
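To show what "encoding a code point" actually means, here's a rough sketch of how UTF-8 turns a scalar value into bytes. The codepoint_to_utf8() function is a hypothetical helper written just for this illustration; it does no validation (surrogates and out-of-range values are not rejected).

```php
<?php
// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// Illustration only; no range or surrogate checks are performed.
function codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        // ASCII range: a single byte, identical to plain ASCII.
        return chr($cp);
    }
    if ($cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

$a  = bin2hex(codepoint_to_utf8(0x41));   // "A"
$e  = bin2hex(codepoint_to_utf8(0xE9));   // "é"
$zh = bin2hex(codepoint_to_utf8(0x4E2D)); // "中"
```

Here $a comes out as 41 (the same single byte as in ASCII), $e as c3a9, and $zh as e4b8ad. That first case is why existing English-only content usually survives a move to UTF-8 untouched.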

Unfortunately, we have another major problem to deal with. All PHP releases (prior to the upcoming PHP 6) internally represent a character with 8 bits. That's right: PHP has no native support for international characters (yet)! This means that we have to be extra vigilant in our pursuit of internationalization support. So how do we do it?

Prepping Our PHP for Unicode

In order for our PHP application to properly display Unicode characters, we need to do some preparatory work. This involves setting the appropriate character encoding in a few places. We'll first set the encoding in the header:

header('Content-Type: text/html; charset=UTF-8');

Remember that the header() function must be called before we output any HTML, so it needs to appear early in the chain of events. Note also that the header call incorrectly labels the encoding as a 'charset,' making the naming conventions even more confusing.

We can also specify the encoding through the use of a meta tag (I recommend setting this even if you set the header):

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

If you take this route, make sure this tag is placed near the top of the <head> element in your HTML (before your <title> element, in fact). Otherwise, the browser may select an incorrect encoding.

To verify that the appropriate encoding is being used, you can use the View Page Info feature in Firefox (just right click the page and you'll see it in the context menu). Here's an example:

[Image: Page Info dialog in Firefox showing UTF-8 encoding]

Displaying Unicode Text

One of the primary functions that PHP provides to convert characters into their HTML entity equivalents is the aptly named htmlentities() function. However, since we're converting our application to support UTF-8, we don't need to make use of this function. Why is this? First, HTML entities are generally only understood by web browsers. By converting special characters into HTML entity equivalents, it becomes much harder to move data between the web application and other data sources (RSS feeds, for example). Second, and most importantly, UTF-8 allows us to display extended characters directly. To quote Harry Fuecks [PDF], with UTF-8 "we don't need no stinkin' entities." Instead, we should only worry about the "special five":

  • Ampersand (&)
  • Double Quote (")
  • Single Quote (')
  • Less Than (<)
  • Greater Than (>)

Thankfully, PHP gives us the htmlspecialchars() function to handle these five special characters. One very important thing to note is that this function allows you to specify the character encoding to use when parsing the supplied text. For example:

htmlspecialchars($incomingString, ENT_QUOTES, "UTF-8");

Specifying the character encoding is very important when using this function! Otherwise, you open yourself up to a rather nasty cross-site-scripting vulnerability, something that even Google was susceptible to a while back. In short, the character encoding specified in your htmlspecialchars() call should match the encoding being served by the page.
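Here's a small sketch of what that call does to a string containing all of the "special five" plus an extended character. The input string is made up for illustration, with the "é" written as explicit UTF-8 bytes.

```php
<?php
$incoming = "<a href='x'>Caf\xC3\xA9 & \"Bar\"</a>";

// The five special characters are escaped; the UTF-8 "é" passes through
// untouched because no entity is needed for it.
$safe = htmlspecialchars($incoming, ENT_QUOTES, 'UTF-8');

// With these flags, input that is not valid UTF-8 yields an empty string
// rather than being passed along, so bad bytes never reach the page.
$rejected = htmlspecialchars("Caf\xE9", ENT_QUOTES, 'UTF-8');
```

If silently getting an empty string back is not what you want, the ENT_SUBSTITUTE flag (PHP 5.4 and later) replaces invalid sequences with the Unicode replacement character instead.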

What Next?

In the next article, I'll cover the following topics:

  • Prepping MySQL databases for Unicode
  • Accepting Unicode characters from the user
  • Potential PHP pitfalls
  • Useful resources (loads of helpful links)

As always, if you have suggestions or questions, feel free to post them.

Recently at work, I spent a fair amount of time debugging some strange run-time errors in one of our test tools (after having ported it from Visual Studio 2003 to VS 2005). When starting up a debug build of the tool, I would get the following error message:

An application has made an attempt to load the C runtime library incorrectly. Please contact the application's support team for more information.

This error message turned out to be a red herring, though it pointed me in the direction of the actual culprit: a circular dependency chain of debug and release versions of various Microsoft DLLs. In trying to figure out what was going wrong, I ran across an incredibly helpful article on troubleshooting these kinds of issues. The author presents seven different scenarios that can arise with executables built in Visual Studio 2005, along with solutions for each one. It's a great resource to have if you run into these kinds of problems.

Upcoming Stuff

Jul 22, 2008

Apologies for the infrequent updates: my sister recently came back to the US for about a week (which was fun), then my grandmother came to visit (which I also enjoyed), and now I'm sick. So there you have it.

A two-part series on Unicode support for PHP web applications is coming, provided I can feel better and get rid of my writer's block.

Just yesterday, Microsoft released a 'critical update' for issue 951748, fixing a DNS security hole. It turns out that this fix completely hoses the Zone Alarm software firewall (which I happen to run). In essence, you completely lose your internet connection.

The folks that make the Zone Alarm firewall are aware of the problem. For now, they suggest two workarounds: either uninstall the MS fix or set the firewall security slider to medium (down from high). Hopefully, a true fix will be issued within the next few days.

Update: This problem has even made Slashdot.