Stage IV Breast Cancer Foundation
September 30th, 2009My favorite charitable foundation, the Gal to Gal Foundation will be executing their annual virtual (i.e. online) walk through the month of October with a goal of raising more than $250,000 for education, awareness, and granting wishes to women with stage IV breast cancer.
It only takes $5 and some fun to help make a difference, please visit them here - www.galtogalwalk.org. You can learn more at the www.galtogalfoundation.org website.
Targa (TGA) Plug-in Library
August 24th, 2009I found that the TGA file format is quite useful in game development, and is used extensively by a number of modern game-engines, including Unreal Engine 3. There are a number of online examples and tutorials describing how to bang out some C or C++ code to load a TGA file so that it can be used by a DirectX or OpenGL application. However, I wanted to create my own library that solves the following problems:
- It can load the file into a graphics-library compatible format in a single pass (thus being much faster than the online examples I’ve read).
- Handles arbitrary bit-length encoding.
- Handles run-length encoded (RLE) Targa files.
- Automatically converts any TGA file to my desired format – 32-bit RGBA. If an alpha channel is missing, a default one is automatically created. It also converts from black & white encoding.
- Allows me to apply color-channel masking or replacement by modifying the serialized RGBA value, allowing me to manipulate the image prior to creation as a graphics-library texture.
Given that I’ve never found a real use for Targa color mappings, I dropped support for using them in my reference library. The reference library I created is written in C, and compiles natively on Windows, Linux and Mac OS X. Both the ZIP archive (download it here) and the tarball (download it here) contain a Microsoft Visual Studio solution and project files as well as a simple makefile. You can read the Doxygen-generated documentation here.
Using RSA to Exchange Cryptographic Keys
August 21st, 2009The following is a methodology for programmers to provide a simplified implementation for using the RSA algorithm to exchange a shared private key from another cryptosystem such as AES or Blowfish for further secure communications. This methodology will be described following the typical practice in cryptography as an imaginary conversation between two fictional characters that have names taken from successive letters of the alphabet. For the the following discussion, Alfred represents an entity wishing to instantiate secure communications, and Beatrice is representative of the recipient.
To begin, Alfred has already obtained in a secure fashion a copy of Beatrice’s public RSA key, e.g. she handed him a disk with the key, or he verified it with a hashing algorithm, or he used a trusted 3rd-party mechanism, etc. Alfred begins his conversation with Beatrice by generating his own RSA key with an arbitrary number of bits (though hopefully as least as large as Beatrice’s). Additionally, Alfred should pick a key length that is of sufficient length as to be thought computationally impossible to guess it in a reasonable time frame, say a couple of centuries at least.
At this point, Alfred has a copy of Beatrice’s public RSA key, and Alfred has generated his own RSA key. Alfred then transmits his public key to Beatrice, which has been first encrypted with Beatrice’s public key, and then cryptographically signed with Alfred’s private key. This way, only Beatrice may be able to decrypt Alfred’s public key, and with Alfred’s private key signature, Beatrice can validate that she is indeed speaking to Alfred.
Upon receipt of Alfred’s message, Beatrice will then decrypt Alfred’s public key using her private key. She then will generate a new key that will be used as a private shared key for further communications. She takes her new key, and encrypts it with Alfred’s public RSA key. Beatrice then signs the result with her private key, and transmits the resulting message to Alfred. At this point, Beatrice has a cryptographically-signed message from Alfred, along with his public key, and the new shared private key generated by Beatrice.
Upon receipt of Beatrice’s response, Alfred uses his private key to decrypt the private shared key he received from Beatrice. He can further use Beatrice’s public key to validate her private-key signature of the message she sent. At this point, both Alfred and Beatrice have established each other’s cryptographic but not real identity, and have exchanged a key that they can use to safely encrypt any further messages they wish to exchange.
If Alfred and Beatrice want to eliminate any real possibility of cryptanalytic attack against their further communications, they may use the private shared key only to exchange a single message, and will generate a new shared key every time they wish to send each other a message.
So, how to we easily translate this into programmer-land? Here is a simple example: if you think of Alfred as a computer game client, and Beatrice as a computer game server, you now have a methodology that will allow you to prevent 3rd-parties from listening in on any sensitive communications, e.g. paying a monthly subscription fee, transmission of critical or advertisement material, purchasing merchandise from inside a game, etc.
Whitepaper on Software Development (Part 1)
August 13th, 2009Having worked as the lead-geek on a number of disparate projects over the last decade ranging from small one-off applications to multi-million-lines-of-code major enterprise applications and multiplayer games, I have developed a realistic and rather hard-nosed approach to software development revolving around the following principles, namely:
- Maintainability
- First refer to rule number one.
I name maintainability as the primary (and in the above list, sole) principle to software development for a number of highly-practical reasons, which include:
If the code is easy to read, well organized and does not include sections of “clever” code, then an at time I need to I can go back and optimize it for performance, fix bugs and add new functionality. Through a rigorous practice of unit-test-driven development, and time I want to analyze a bug in a given package, add a new function, hand-off the code to another developer, etc. I can simply use the existing unit-test as a baseline sanity-check to make sure that I have not done something with the code that will make it unusable to other coders or layers in the application hierarchy.
Organization
The organizational principles I follow are largely derive from long years of using Unix and Linux systems, and also many years of intensive development using Java. I like to start by defining a source directory in my project root that then contains a list of directories that match the set of languages I utilize for the project. For example, if I wanted to write a search engine in Java with various data-access and performance-optimization extensions in a language such as C or C++, I would start by creating a “src” directory, from which I would create “src/java” and “src/c” or “src/cpp”. That way I can at-a-glance begin to navigate through my source hierarchy. Additionally, this makes searching, especially command-line tools on a Unix or Linux platform very convenient.
From the root location, I might also define a number of other directories based upon the needs of the project, such as a “bin” directory for compiled binaries, a “conf” directory for configuration files, a “data” directory for application-specific data-files, an “assets” directory for game-engine assets, a “docs” directory for project-documentation, a “plugin” directory for external libraries, a “log” directory for application logs, a “dist” directory for JAR files compiled from Java class files, and a “unit” directory to contain compiled unit-tests.
In a typical project, I usually have a few handy command-line bash-shell scripts in the “bin” directory for various convenience routines I might need, along with symbolic-links to Perl scripts from the “src/perl/tools” directory. In the “conf” directory, I will have my default Doxygen and JavaDoc configuration files, in the “docs/doxygen/[project package name]” and “docs/javadoc/[project package name]” directories I can browse the current source-code documentation. I typically also have business plans, technical design documents and project notes in the “docs/business-plans”, “docs/design” and “docs/notes” directories. The root of the project directory (which also is the root of the CVS or SVN repository for the project) I will have the appropriate Make, Ant or Microsoft Visual studio project files I need to compile and execute the source code, unit tests and applications contained within the project.
Since all of my projects are organized around this same simplified layout, I always know where to go to find anything I might need at any given moment during the software development lifecycle.
Coding Standards
As I mentioned previously, a good coding standard has several primary goals which make your life as a programmer much easier:
following a consistent standard across all source in a given project makes the project much easier to maintain, enhance and hand-off
a good coding standard, by definition should outline practices that enable a programmer to naturally implement policies that cut down on common mistakes
All code should be prefixed with a multi-line set of comments that define the following set of information: the name of the source-code file (which at-a-glance will tell you if you run into a copy-paste nightmare), the appropriate copyright notice, a succinct description of the functionality provided by the code, and the name of the original author (so you can immediately go to him or her with questions without having to check your versioning-system’s history).
When writing code, the use of hard to read or understand – i.e. clever code – should almost always be avoided (there are of course exceptions to every rule, however they should be few and far between and made only for mission-critical reasons, e.g. embedding assembly to optimize a math function for a game engine, etc.).
The source code itself should be self-documented by the use of good naming conventions, including consistent syntax, verb-noun (i.e. English-language-style function, class and variable names) naming, and standardized indentation. Therefore the use of comments should be kept to an absolute minimum. This is because comments often disagree with the actual program logic due to the nature of programmers who over time update the code but not the comments. Then, when it comes time to fix a bug or make an enhancement, even the original author might not know if they should follow the comments or follow the code.
It does not matter if you like variables to be named color_highlight_value or colorHighlightValue as long as they are consistent across the entire project. Additionally, abbreviations, single-character and abstract naming conventions should be entirely avoided. It is very, very difficult to read code where all of the variables are named like “a1”, “c12”, “b3c7”, etc. instead of something that is self-describing. Additionally, it is much easier in the long run to have a variable named “isEnabled” rather than “isEnbld” (or worse, “b12”) - you only have to type two additional characters to make it easier to read.
When it comes to single-character variable or function names, the primary reason to avoid them is to make all of your source easily searchable in your text editor of choice. For example, it is common to use the letter “i” to denote a iterator value in a for-loop. However trying opening up a source-file with a thousand lines of code and search on the letter “i”. It is much easier if the iterator variable is either properly named, or it is at least two characters, such as “ii”. e.g.
void exampleForLoopFunction()
{
int ii = 0;
for(ii = 0; ii < 16; ii++) {
// see how searchable this is
}
}
…additionally, always define variables at the top of a function or method so that the person coming along behind you can read the set of variables the function is dealing with before attempting to understand the logic of the function. Also, it is very important to always initialize variables when they are defined, where possible, even in a language like Java which does this for you order to always clearly indicate to another programmer what you are doing with the state of your set of variables. This also cuts down on common mistakes related to variable initialization.
In order to clearly define your program logic intention, you should always denote the scope of the code you are writing in consistent fashion. For example, if you prefer a newline after a function definition and the curly-brace, always do so. Never use statements such as “if”, “for”, “while”, etc. without following it with a scope definition. Skipping this may easily lead to making simple mistakes which could have been fixed by typing the two extra characters.
Here is an example, copied verbatim from an undisclosed open-source C project intended to return the name of the a state represented by an integer:
const char *
chk_st_str(int st) {
switch(st) {
case 1: return "good";
case 3: return "bad";
case 7: return "unknown";
}
return "???";
}
…here is what I did after a quick refactor pass:
typedef enum _State {
STATE_GOOD = 1,
STATE_BAD = 3,
STATE_UNKNOWN = 7,
STATE_ERROR = -1
} State;
const char *stateToString(State state)
{
char *result = "Error";
switch(state) {
case STATE_GOOD:
result = "Good";
break;
case STATE_BAD:
result = "Bad";
break;
case STATE_UNKNOWN:
result = "Unknown";
break;
}
return result;
}
I’m out of time to continue this post, so here ends part 1 of my thoughts on good practices in software development.
Game Server Browser
July 29th, 2009A part of the Leverage 2 technology suite I’m working on for Pragmatic Solutions, Inc. is an online game monitoring system (GMS) daemon that attempts to query all game servers in its’ known protocol list once every twenty minutes. This information is fed into a search-engined based master game browser system application code-named “Orion” which is capable of tracking all online servers and players simultaneously. Orion uses the attribute-based search-engine (ABSE) core which is a loosely user-defined-indexed search-engine containing data-objects called “containers” which contain attributes called “entities”. For any given entity that matches one or more user-defined indexes the container is automatically indexed using the entity value.
The data is accessible via HTTP and JSON interfaces using the ABSE Query Engine (AQE) which provides an SQL-like syntax to access information within ABSE. This prototype technology to track online game players and servers in real time is showcased here. You may visit the work-in-progress mygameIQ.com to see one of the sites Orion will power. Currently hosted on xaede.com is a once-a-minute snapshot of the current state of online game-play, showing the top 30 game titles by player count, and the top 30 game titles by server count.
AA3 Master Browser Upgrade
July 24th, 2009The lead America’s Army 3 (an Unreal Engine 3 game title, see this blog post for a detailed description the AA3 systems I designed) game programmer and I identified a number of problems with the current AA3 game server browser (called the Master Browser System or MBS in AA3 nomenclature) this past week. These included an issue in the MBS SDK that prevented the multi-threaded server-query game-logic from executing game-server queries cleanly due to forcing the queries into a single-threaded state via an extraneous thread lock. Additionally the UnrealScript that interfaced with the SDK API call did not do so in a very efficient fashion. These problems were compounded by the thread-model confusion in the query pool design.
I began troubleshooting this problem by using the AA3 Arcturus framework to perform rapid application development (RAD) of a multi-threaded server-query testbed application based upon the existing single-threaded unit test design. I then used it to both tune the MBS SDK performance while eliminating the extraneous thread-lock in a thread-safe fashion. Additionally, this provided an ideal framework for optimizing how the SDK API call was utilized. Once this pass had been completed, I made a new SDK build, and also gave a copy of my testbed source code to the lead programmer to show a simple example of how to make the most efficient use of the component.
This process resulted in a radical performance enhancement in the in-game server-browser. Here are the current performance metrics: the SDK is able to obtain 365 servers from the MBS master server in 0.189 seconds, all 365 servers were queried (using a thread-pool of 16 threads) in 6.087 seconds with an average of 0.017 seconds per server. The total execution time, including SDK initialization and cleanup time was 6.459 seconds.
In summary, the performance went from a minute or two to query all of the AA3 game servers, to under 7 seconds. I currently anticipate that this upgrade will be included in the next build of AA3, namely 3.0.6. You can see the current set of online servers here.
World of Warcraft
July 22nd, 2009The concept behind the Combat Performance Reports – www.wowcpr.com.
A friend of mine is an avid World of Warcraft player that has been a long-time fan of both the game as well as reviewing statistics about his game-play. He had been paying for some online 3rd-party services to generate statistical reports from his game combat logs, and he felt that the services he was paying for were becoming stagnant. He began looking for alternative solutions to his game-play performance analytical needs, and determined that the existing websites that provide these services were either difficult to use or simply not reporting the information that he was looking for. He and I decided to work together to create our own website to fulfill this need. This article is a description of the summarization engine I created to populate the RDBMS database that powers the corresponding website – www.wowcpr.com.
The data-summarization engine in overview parses a World of Warcraft combat log into a series of events from which it creates a series of statistical summaries that are then inserted into the database. In order to rapidly develop this application, I began by using my rapid application development (RAD) framework – Project Asgard (www.projectasgard.org).
I wanted to be able to read the data from the combat logs very quickly so I first implemented a background thread that performed optimized reads on the log file and placed the data into an in-memory buffer – i.e. a read-ahead system - that I could parse events from.
I next created a component that could recognize every event type from the combat log, and turn each event into a structure that could be analyzed by higher application layers. To validate these event data structures I created a managed set of b+tree indexes that contained every known player spell, power, event type, event-attribute type, and character classes. I then utilized these indexes to validate every event read from the combat log so that a known data-set will be analyzed by the higher application layers, which made the logic implementation predictive as well as mitigating the garbage-in-garbage-out effect.
Given that I didn’t want to store all of this meta-data in code or configuration files, I created a simple SQL database schema to store both the meta-data as well as the set of statistical summaries I wanted to generate. I determined the summaries by creating a simple spreadsheet of each game-play event, and determined what the various event commonalties are. For example, the Spell Heal, Spell Period Heal, and Spell Leech events all can contribute to the Healing summary. From which, I created a simple SQL API, with a plug-in for the RDMBS I chose to implement the project with, namely PostgreSQL. I next began the implementation of a data-access layer for the specific application-layer touch-points to the database.
I also added a simple system to turn any given event into a unique key so that I could persist event-indexes to an ISAM database for automated log-event merging. How this works is when the engine detects an existing index when processing an event, it performs an SQL lookup of the existing statistical summary or summaries that the event contributed too, and then retains the summary in-memory for application of summary-updates with any additional contributing events the engine discovers later in the log file.
Since a given log file may contain multiple game-play scenarios, such as raids, boss fights, or multiple days worth of play, I created a trigger system to partition each the log file into multiple child summarizations as well as the main log-file summary. The trigger system is designed as a state engine to reflect the current state of the game. From this, the various triggers will automatically activate based upon changes in the state machine, which is set based upon events parsed from the log file.
For each character noted by the engine (including players, pets, and NPCs) a unique summary record is created for the log file as well as each child-summary noted by the trigger system. This summary contains general information about character, including the character’s name, GUID, class and number of seconds spent playing the game.
Finally, I created a simple framework that allowed me to contain multiple summary types, indexed by both the character that initiated the event, as well as the character the event effected. From this framework any summary can be added dynamically by the engine as events parsed from the log file are considered. Once the summarization engine completes parsing of the file, the summaries are obtained from the framework, and inserted into the database for each game-play scenario determined by the trigger system, as well as the full log summary. For any summary that was obtained from the database during for automated log-merging, any summary that had an update applied to it is then updated, and a meta-record is created in the database noting that the parsed log file contributed data to the updated summary.
Here is an example of what the engine reports once it has completed processing a log file:
-=-=-=-=-=-=-=-=- WoW CPR Processing Report -=-=-=-=-=-=-=-=- [CprReport] file statistics: [CprReport] name : wowstats.testfile [CprReport] owner : [email]@[domain] [CprReport] region code : US [CprReport] realm name : Draenor [CprReport] realm ID : 0 [CprReport] event year : 2009 [CprReport] event type : Raid [CprReport] byte length : 9701594 [CprReport] line count : 65537 [CprReport] invalid lines : 0 (0.00 %) [CprReport] avg bytes per line : 148 [CprReport] processing time : 3.771135 [CprReport] event statistics: [CprReport] valid count : 65536 (100.00 %) [CprReport] invalid count : 1 (0.00 %) [CprReport] unknown count : 0 (0.00 %) [CprReport] duplicate count : 0 (0.00 %) [CprReport] total parsing time : 0.108922 [CprReport] avg parsing time : 0.000002 [CprReport] total processing time : 0.423945 [CprReport] avg processing time : 0.000006 [CprReport] key hash time : 0.380303 [CprReport] avg key hash time : 0.000006 [CprReport] merge index time : 0.000000 [CprReport] avg merge index time : 0.000000 [CprReport] player statistics: [CprReport] valid count : 131072 (200.00 %) [CprReport] invalid count : 0 (0.00 %) [CprReport] summary records : 475 [CprReport] player exists count : 0 (0.00 %) [CprReport] new player count : 0 (0.00 %) [CprReport] database access time : 0.000000 [CprReport] avg access time : 0.000000 [CprReport] link index update time : 0.006813 [CprReport] index update time : 0.117868 [CprReport] avg index update time : 0.000002 [CprReport] summarization statistics: [CprReport] valid events count : 30744 (46.91 %) [CprReport] invalid events count : 0 (0.00 %) [CprReport] not-applicable count : 34792 (53.09 %) [CprReport] index updates : 30744 (100.00 %) [CprReport] index update failures : 0 (0.00 %) [CprReport] construction time : 0.117316 [CprReport] avg construction time : 0.000002 [CprReport] index update time : 0.133505 [CprReport] avg index update time : 0.000002 [CprReport] summary records : 0 [CprReport] database merges : 0 (nan %) [CprReport] database updates : 0 (nan %) [CprReport] database update time : 0.000000 [CprReport] avg update time : nan Processing complete, total run time: 3.956268 seconds
The Battle for America’s Army 3
July 10th, 2009This blog details my personal experiences with keeping the various systems that run the Unreal Engine 3 game title America’s Army 3 (AA3) running during and after the initial launch on June 17th (please refer to my previous blog post for a detailed description of the various back-end systems I designed and lead a team of developers to implement).
These systems include the following:
Authentication – the system to persist player account information, player profiles, character information (a “soldier” in AA3) and retrieve statistical summaries.
Attribute Tracking System (ATS) – the database-persisted rule-based character progression management engine.
Statistics Tracking System (STS) – a system for collecting in-game events, applying distributed massively parallel processing (MPP) statistical summarization, and persisting the results to a central database repository.
Master Browser System (MBS) – a system for tracking the state of online players and servers while providing an interface to allow a player to find a server adhering to user-specified criteria to play on.
Dynamic Content Delivery System (DCDS) – a system to delivering game assets to the game engine during play based upon game state and play events. It also provides the network for downloading the game via the Deploy Client application.

The various server components of AA3 use a specialized session-server system code-named Titan designed to both persist player and game related data across the cluster of servers as well as act as a cache for common Oracle queries. One of the first problems we discovered after launch is that the load radically escalated on Titan to around 3,000 transactions per second as the game was downloaded and gamers began to login and play. As a point of comparison, a typical Apache web server installation on Linux will degrade in performance rapidly past the threshold of a few hundred transactions per second.
Given that Titan uses a PostgreSQL database for local data storage, the in-memory caching system for Titan objects was not able to relieve enough stress on the database to prevent Titan from rapidly degrading in performance. In order to correct this, I first performed a hand-tuning pass on the PostgreSQL database configuration file to allow it to take advantage of the 8 GB of RAM on the Linux box it was running on. I then made a pass on the SQL that Titan was using to communicate with PostgreSQL, and enhanced the database schema to only create the indexes required to support the various where clauses. I then restarted Titan and PostgreSQL with the updates and continued monitoring.
Using information gathered from utilities such as “top” and “vmstat” as well as Titan’s real-time status-reporting system, I was able to determine that the performance had not increased to the point where Titan was able to execute low-latency transactions, so I decided to take advantage of the generous amount of RAM on the Titan system. I created a filesystem on Linux that was mounted against the systems’ memory – a ramdisk. I then moved the PostgreSQL database files onto the ramdisk, and resumed monitoring. This resulted in a radical increase in performance, and allowed Titan to keep up with the ongoing production load. Because of the indexed nature of the database schema, I created a script that would be periodically executed to optimize the internal indexes and SQL query execution environment.
I noted that Titan was beginning to handle an even larger number of transactions per second (peaks of around 12,000 transactions per second) due to the performance enhancements (which allowed more throughput). In consequence I wanted to decrease the number of transactions that Titan was required to handle, so I began by identifying redundant Titan queries from the AA3 services. I then designed an in-memory caching system to the Titan C API that allows the various AA3 systems to cache objects obtained from Titan in memory until they were either not used for 20 minutes or had been updated (the session server update code-paths evict the cached object). Once the new AA3 server builds were fully staged to production, the statistics indicated that we had reduced the query-load on Oracle by 34% as well as a 33% reduction in transactions-per-second to the Titan session server. Additionally, the various AA3 systems were running 26% faster.
In order to support AA3, we implemented an Oracle 11g Real Application Cluster (RAC) on two servers connected via dual 6 Gbps fiber-channels to SAN stuffed with 15k RPM SCSI drives. We use the Oracle grid-computing configuration for this implementation. One of the first problems we experienced was that when an application connects to Oracle via the OCI SDK API, every server in the RAC allocates additional system memory to handle the connection, even though the application only executes a given SQL statement on a single server in the RAC at a time. This lead to a condition where Oracle rapidly used up the 8 GB of RAM on the server and began swapping. Once the Linux swap system began executing, we experienced a radical degradation in SAN performance because of the induced IO wait caused by rapid local-swapping of actively in-use memory.
In order to immediately address this problem, we simply shutdown the cluster one server at a time to eliminate outages, and added more RAM. Additionally, we made a pass on the Oracle memory configuration so that the various buckets of memory that Oracle uses were optimized to our usage pattern. This, combined with some monitoring scripts I wrote to monitor Oracles CPU, memory, swap-space and IO allowed us to keep Oracle up and running, and we are automatically notified when a problem begins so that we can take direct action to resolve issues before they become critical.
Once of the issues that I noted in passing was that we were experiencing oddities in the latency of the local area network (LAN) of the hosting center where we run the AA3 systems. Given that the issue was intermittent, and did not appear to affect either ICMP or UDP but only TCP/IP, it was hard to create a trouble ticket in their reporting system to track down the issue. I was able to stage a conference call with the network engineering staff at the hosting center, and from my description of the problem and the diagnostics I was able to run, we were able to track down the problem to a set of faulty rules in one of the managed OSI layer 3 switches on the network.
Additionally I noted that the SNMP used by the hosting company was mis-configured, and around every 60 seconds the various servers were using 50% of the CPU capacity for a period of around 10-20 seconds in reporting on status back to the hosting company’s monitoring system. However we were able to update their SNMP polling configuration and eliminate this problem all together.
By June 19th I had noted that our overall external systems load – i.e. the game client and game server communicating with the back-end systems – resulted in an average load of 3,825 transactions per second, not including any of the web servers. The peak load noted was 10,171 transactions per second, which is very impressive considering that the majority of this load is being served up by 3 fairly wimpy 1U rack-mounted systems. For a comparison here, the average load the back-end systems support for the previous Unreal Engine 2 title America’s Army 2 is around 190 transactions per second, with a peak load of around 300 transactions per second.
This massive load as you might expect resulted in exposing a number of crash bugs in the various AA3 services. However, we were able to rapidly debug and fix these issues on the fly given that the massive load would rapidly reproduce the problem, and reached 100% stability within a few hours after launch. In order to make sure that these systems would continue to perform normally and responsively while we were not actively monitoring them (e.g. when we sleep, etc.) we created scripts that wrapped our server unit-tests that could perform automated monitoring, notification, and restart.
At this point, you might ask: why were we not able to anticipate the amount of load before the game was launched and put in place systems that can adequately handle the load? Ignoring the fact that this rather simple question is actually quite complex to answer in a technically accurate fashion, there were two reasons why we were not anticipating this level of traffic, namely: 1) in the crush to prepare the game for production release, we were never able to put together a testing or beta-testing community of players that could generate more than a fraction of the anticipated production load, and 2) there is no good way to predict how popular a given game title will be.
So in conclusion, while we experienced a number of problems with the initial launch, I’m very proud of my team and what we were able to implement for this game. I’m also happy to report that we rapidly achieved 100% stability with low-latency performance within a few days post initial launch.
Design and Development of America’s Army 3
July 10th, 2009March, 2009
Now that the Unreal Engine 3 game title America’s Army 3 (AA3) is nearly ready for release, it seems like a good time to talk about some of what my team has done to improve on what we did in America’s Army 2 (AA2). I’ll cover several subsystems that Pragmatic Solutions has provided and how they are designed to improve the gameplay experience.
First I’d like to give some background on what we did for AA2 and how it impacted the development of AA3 technology.
When we began work on AA2, the game had already been released. Its only custom component was the authentication system, which we were tasked to replace. If that went well, we’d start replacing some of the game’s off-the-shelf components with custom designed solutions. So our first requirement was to provide a new authentication service that would support the existing communication protocol, and which could be dropped in to seamlessly replace the existing system.
At that time AA2, which was built on the Unreal 2 game engine, ran on Windows, Linux, and Mac OS X, so we needed an application framework that would run transparently on all three operating systems. It also needed to be in a language that would integrate with the game engine which meant either C or C++. In order to address these issues we developed an application framework in C, code-named Andromeda, which provided the APIs that we needed to build and deploy cross-platform software.
We took a hard look at the existing messaging protocol and noticed two serious flaws. The first was that even though it was encrypted using the Blowfish algorithm, it used the same key for every message. This was compounded by the key being vulnerable to a simple dictionary attack. The second problem was that the messages were not self-descriptive, leading to many parsing errors and causing innumerable problems both on the client and server, including garbage data being inserted into the server databases.
We developed a new communications protocol that solved the existing protocol’s problems. The new messages are self-defining, which addressed the parsing error problems. Also, per the Army’s requirements, we developed a new cryptographic communications protocol that used the RSA algorithm with a trusted third-party methodology to exchange either a Blowfish or AES cryptographic key. The exchanged key is used for successive messages.
In order to use the legacy protocol for the initial rollout we created a message translation layer that converted existing messages into our new format. To use our new framework to exchange messages we created both client and server modules that supported the new protocol. Once our new framework had proven itself, we removed the message conversion code and began using the new message formats natively via a set of in-game APIs.
The server module in the framework used a transaction-management design pattern where a different message-handler function was executed for each type of message received. This design methodology has allowed us to use the same messaging protocol and accompanying server component for each new service that we have implemented. From this base we were able to create new modules for AA2, including a match-making solution, an in-game asset delivery system, and a statistics tracking system.
When we were ready to begin AA3 development, we sat down with the game developers, reviewed the systems we were providing for AA2, and discussed what problems they had and where we could develop solutions to those problems. Specifically we focused on any issues they had with our components, on what existing features they no longer needed, and on what new features they anticipated having a use for.
In order to develop the suite of components for AA3 we began refactoring our cross-platform framework by implementing a set of consolidated APIs. We code-named the new framework Arcturus. Arcturus improves on Andromeda in a number of ways, chief among them being a simplified set of functions that followed a common design pattern. This gave Arcturus a more cohesive design than what had evolved in the Andromeda code base and better prepared us for the work ahead.
One of the biggest performance issues with the AA2 architecture is the excessive overhead required to implement the cryptographic communications protocol. The processing time required was extreme enough to impact the player experience as well as game server performance. Additionally the security system implemented to monitor the exchange of messages compounds the problems by rejecting good messages from locations that have high latency, causing them to resend the messages, with all of their cryptographic overhead.
For AA3 we designed two new protocols to address this issue. The first, which we termed the high-security protocol, was a new self-descriptive format that was designed to be used via TCP/IP. This implemented a new cryptographic communications protocol via a known-key, low-latency, stream cipher, which had much lower overhead. Additionally, the encryption can be disabled for any transactions that don’t need it, resulting in even lower overhead. The second protocol, which we termed the high-performance protocol, used a new lightweight, self-descriptive format transported via UDP. The second protocol also supports multi-packet transactions for large messages.
The first system we implemented for AA3 was the authentication service. Because we were not dropping in a replacement for an existing service we were able to design from a clean slate. We started by noting that authentication transactions in AA3 are one of two types, synchronous or asynchronous. In synchronous authentication the player must wait for the authentication to complete before continuing. Asynchronous transactions are used by the game to gather information that requires authentication but that is not directly tied to player interaction. For synchronous transactions we provided an API that blocks until the transaction is completed (the communication with the server has received its response). For asynchronous transactions we implemented a message queue allowing the developers to periodically poll the queue to see if responses have been returned.
We also added new features to the AA3 service that go beyond typical authentication transactions. Once a player is logged in, their session is handed off between the various AA3 systems, so they need not login again as they move from game to web to other game-related systems. We also provide, via this service, for the management of a player’s account and game characters within that account. Additionally other information generated by AA3 services, such as statistical reports, can be accessed through this system.
The next system we implemented was an event-driven, rule-based, scoring engine, known as the Attribute Tracking System (ATS). ATS is designed to provide a mechanism for calculating player scores as well as managing player character progression. It accomplishes this by allowing the developer to define in-game events which are examined by the ATS engine. The engine uses a rule-based design, similar to a scripting language, to determine which player scores should be updated or which character attributes should be modified. This all takes place in real-time, which allows players to view their player scores and character attributes, as they change, while playing the game. Once a round of game-play is completed the set of ATS events are uploaded to the master ATS server where they are processed to create the official scores and attribute changes which are then persisted to the database. The ATS service also provides a web-based interface which allows game developers to update the rules whenever they feel it is necessary, without the need to deploy new game patches.
To further explain how the AA3 ATS system is currently being used, here is a real-world example:
Some AA3 player-attributes:
- Duty
- Respect
- SelflessService
Some AA3 events and event-attributes:
- Mission Completed
- Player’s Team
- Is Fire-team Leader
- Objective Completed
- Player’s Teammate Links
- Is player low on health
The ATS engine can update the above-shown player-attributes via rules such as the following:
- Rule “Mission Completed – Fire-team Leader Bonus”
- Event Name: Mission Completed
- Processing Priority: 10
- Qualifiers:
- Player’s Team: is on the winning team.
- Is Fire-team Leader: true
- Actions:
- Increment “Duty” by the “Mission No Man Left Behind Bonus” multiplied by the “Number of Friendlies Left Alive”.
In the above example, when a “Mission Completed” event occurs, the ATS rule engine will examine the event to determine if if matches any of the qualifying criteria for the current set of defined rules. When the criteria is met, the action of incrementing the “Duty” player-attribute will occur as a product of a bonus variable multiplied by a game-state variable.
Note that in the above rule example, the processing priority allows rules to be executed in a hierarchal fashion, based upon their priority as defined in the ATS system. Additionally, variables may be defined in the system to be used when constructing a rule – for example the “Mission No Man Left Behind Bonus” is a variable used when incrementing the player’s “Duty” score. Another variable is the “Number of Friendlies Left Alive” - which reflects the current state of the game during play. Variables may be set in two ways – one, by defining them in an ATS rule, or by setting them via the SDK API to reflect the current state of the application. The way in which the ATS is used is very open-ended, given that any number of attributes, events, variables and rules can be defined. Additionally, any number of rules can be defined for the same event, any number of qualifiers can be defined for a rule, and any number of actions can be implemented for a given rule.
In AA2 we implemented a match-making solution to replace the off-the shelf component that they started with, this new, custom system, was called MBS, for Master Browser System. Match-making, in a game, is the process of finding a server on which the player wants to play. The decision is often based on where their friends are playing, or may just be based on what servers seem popular at the moment. This system was designed around the concept of providing a filtered list of players and servers to the browsing player based on a set of constraints that they specify. The system suffered from several critical flaws, primarily the overhead of the message cryptography. Additionally, the query engine used a string parsing design pattern which was slow. Lastly it pulled from the MBS server the entire list of online servers and players, including all of their current state information, which turns out to be a huge amount of data to get all at once.
In order to provide a better performing system under AA3, we first switched to our newly designed message protocols, eliminating much of the old protocol’s transactional overhead. Additional speed was gained by the switch from TCP/IP to UDP. Then we implemented the MBS server around a search engine which eliminated the overhead caused by the old string parsing methodology. Finally, we limited the data returned by not including all the state information about the servers, allowing the client software to collect that information directly and only for the filter-reduced list of servers.
The next system we developed for AA3 was the Dynamic Content Delivery System (DCDS). The purpose of DCDS is to allow rapid delivery of data to the client systems and to reduce the load on the Army game servers. There are two main types of data that need to be delivered, static and dynamic. Static data includes such things as the game itself and any patches to the game, which are downloaded once and kept as long as the game is installed. Dynamic data is data that is loaded for use in the game, particularly for temporary use tailored to the individual player, and then, often, discarded when it is no longer needed. To allow for dynamic data, DCDS exposes APIs to the game developers that let them request new assets (such as textures or 3D objects) while the game is being played. To reduce the load on the Army servers both static and dynamic content is delivered by a peer-to-peer protocol that is loosely modeled on the commonly used “torrent” systems.
One of the important tools for enhancing future versions of any game such as AA2 and AA3, and to provide feedback to players about their performance, is to capture statistics about the way the game is played. In AA2 the game developers were required to capture statistical events and summarize them within the game code, uploading the summarized events at the end of a match. This caused problems. Game developers, as a core competency, are not statisticians, so it took several iterations to come up with the right set of information to report. Additionally, the system we developed for them only allowed the upload of known summarizations, so we had to release patches to the game itself in order to gather new information.
We implemented a new service for AA3, called the Statistics Tracking System (STS). We began by providing a simple API that allowed the game developers to notify the STS SDK (software development kit) of any in-game event that might be useful to track. We then implemented a distributed, automated, rule-based, summarization server, following the Massively Parallel Processing (MPP) design pattern. Use of the MPP pattern provides for high scalability, allowing the service to be deployed to fit the needs of the game. This server obtains a bulk upload of game events from the SDK and transforms them into intermediate statistical summaries. At optimal intervals, the intermediate summaries are uploaded to the master database repository, where they are assembled into final summary form. The set of rules, by which the MPP servers summarize events, are defined by the game developers via a web interface, and can be updated at any time without the need to patch the game. This allows the game developers to define new summarizations of game-play events after the game has been launched. They only need to patch the game itself when they determine new events that they want to track.
To further explain how the AA3 STS system is currently being used, here is a real-world example:
Some AA3 STS events and event attributes:
- Enemy Neutralized
- Mission ID
- Mission Variant
- Mission Size
- Linked
- Soldier Name
- Mission Type
- Range
- Player Name
- Target Soldier Name
- Melee Fired
- Weapon
- Mission ID
- Mission Variant
- Mission Size
- Linked
- Soldier Name
- Mission Type
- Velocity
- Player Name
- Posture
From which some rules may be defined to automatically create statistical summaries, such as:
Number of Enemies Neutralized for Live-Variant Large Size
- Event: Enemy Neutralized
- Function: Count
- Identifying Attributes
- Soldier Name
- Target Soldier Name
- Constraints
- Mission Variant: Live
- Mission Size: Large
Number of Melee Fired by Player and Weapon
- Event: Melee Fired
- Function: Count
- Identifying Attributes
- Soldier Name
- Weapon
- Constraints
- None
Average Melee Fired Velocity by Weapon
- Event: Melee Fired
- Function: Average
- Identifying Attributes
- Weapon
- Constraints
- None
- Target Attribute: Velocity
In the above examples, the following statistical summaries would be automatically generated by the STS system:
- The “Number of Enemies Neutralized for Dev-Variant Large Size” will create a count of the number of enemies neutralized on large, “Live” variant maps for each player by each neutralized enemy.
- The “Number of Melee Fired by Player and Weapon” will create a count of of the number of times a weapon is fired in a melee battle for each player by each weapon type.
- The “Average Melee Fired Velocity by Weapon” will create the average velocity of a weapon fired in a melee battle for each weapon type.
Note that as shown in the above examples, each STS rule is for a given event, and will only execute if the set of identifying attributes are set for that event and if the event meets the constraint-rule criteria. The identifying attributes allow the STS system to create summaries for the identifying attribute values found in the event. When a target attribute is specified, the rule’s function is applied to the attribute-value found in the event. Currently count, average, high, low and sum are supported function-types.
These five systems, Authentication, the Attribute Tracking System, the Master Browser System, the Dynamic Content Delivery System, and the Statistics Tracking System, are a suite of solutions that the AA3 game developers are using to help create the next generation gaming experience.
English Stemming Algorithm (2003)
June 14th, 2009English, Joshua S. (2003)
Introduction
This document describes the English Stemming Algorithm, which is intended for reducing the morphological variation of a given stream of tokens through conflation; commonly called stemming. Stemming is used in information retrieval (IR) applications to enhance performance.
Definition
The English Stemming Algorithm (ESA) uses a list of rules to reduce any given token, such as a word, to a common base. The rules are applied iteratively by matching a suffix string against token endings. The rule is flagged so that it may or may not apply to the token depending on if the token has already been modified by a previous rule. Each rule specifies removing, replacing, or adding characters to the end of the token.
The ordering of the rules is significant because a rule may be designed to act upon changes made by a previous rule. Each rule may also be of arbitrary length and consists of one or more instances of the four ESA rule components. The list of rules is processed iteratively until each has had a chance to apply itself to the token. When considering the flag of the second or greater instance in a rule, the flag only applies to changes made by the rule.
The ESA imposes the following boundary conditions. The input must be greater than two characters in length and must contain at least one vowel and one consonant. The output must be greater than two characters in length.
Rule Format
Rules consist of the following four parts:
- A suffix string (token ending) of 1 or more characters.
- A flag (y, Y, n, or N) indicating if the rule can be applied to the token. If the flag is ‘y’ or ‘Y’ and the token has already been modified by another rule or this same rule (for rules with multiple instances) the rule cannot be applied to the token.
- The number of characters to remove.
- The string to append to the token consisting of 0 or more characters.
In the reference implementation, rules were formatted with a comma character as a delimiter. For example:
“ing,n,3,”
The above rule looks for tokens ending in ‘ing’, this rule is applied regardless of token modification, three characters are removed, and no characters are appended. The rule format may be repeated to create more complex rules. For example:
“s,n,1,,ies,n,2,,si,n,1,s”
Note that the rule format is the same but is simply extended using the defined ESA format. It was noted that future implementations might use a different character to delineate the rule sections for ease of modification and readability.
Reference Implementation
The ESA was implemented and tested using a simple rule list containing 67 items that were derived from the rule set used by the Porter Stemming Algorithm (Porter, M.F., 1980).
It was noted during implementation that organizing the rules in a tree structure would lend itself to the highest performance possible. This was due to the fact that a simple tree search could be performed to determine the rule(s) to be used for any given token.
However, for the sake of simplicity the rules were simply organized into an array where they could be tested iteratively against a stream of tokens. It was found that even use of this low-performance design still resulted in quick, powerful stemming.
The ESA test performance time was 0.000058 seconds per word on a Pentium Celeron 500 MHz system. The reference implementation has currently only been compared to the Porter Stemming Algorithm which has a performance time of 0.000072 seconds per word on the same system.
Comparison
The ESA is comparable to the Paice/Husk or Lancaster Stemming Algorithm (Paice, C. Husk, G. 1980) in that it uses a series of rules. However ESA is able to use a shorter (and therefore faster) rule set than Paice/Husk due to its ability to have extended and interdependent rules.
The ESA is also comparable to the Porter Stemming Algorithm in that it uses a similar rule set to accomplish stemming. The ESA is simpler and faster because it uses a single step to process tokens unlike the multi-step process that Porter used.
ESA is superior to both the Lovins (Lovins, J.B. 1968) and Dawson (Dawson, J.L. 1974) Stemming Algorithms in the same fashion that Paice/Husk is superior to them. It is not hampered by post processing (as in Lovins) and ‘partial-matching’ (as found in Dawson). It is also not restricted to a set list of suffixes as found in both of these algorithms.
The ESA is also much faster than such inflectional linguistic morphology algorithms as Krovetz (Krovetz, B. 1993). It is also much more reliable due to the fact that the English language does not lend itself inflectional linguistic morphology because it is a weak morphological language.
Download reference-implementation source here: download









