The Value of Experience, and the Value of not Remembering Everything

Sep 19, 2013

Our last technical article was very popular, especially on our Facebook page. So here’s more information about how we run our technology at Trade-Ideas.

What’s the difference between an experienced programmer and a novice? In particular, what sets apart the person who’s been working with a certain programming language, or other tools, from the person who just started? Some skills and intuitions will easily transfer from one programming language to the next. But someone who’s been working in a specific environment for a while will have an advantage over someone new, even if the second person has more experience with programming in general.

The difference is the custom tool set. Over time each successful programmer will build up his own library of custom tools. There are things that you might think should come with the compiler or operating system, but don’t. The third or fourth time you solve a problem, you ask yourself why you keep starting from scratch. You move the code into your own little library. These things start to take on a very personal feel, because they’re all customized by the person who’s using them.

In our case, it all starts with MiscUtil.h. We build a lot of high performance server code at Trade-Ideas. The back end is mostly written in C++ on Linux. That’s a proven platform. It’s fast, reliable. But sometimes it feels like it’s missing something. Lots of somethings. MiscUtil is at the base of what we’ve added. This is very generic code that can be imported into any number of different programs. Some files are more specific. They deal with databases, networking, etc. MiscUtil is at the base of the tool stack.

I’m a big fan of STL. It gives C++ a lot of the best features of scripting languages. It takes care of the memory management for you. I almost never directly call “new” or “delete”. That problem has been solved. STL keeps me from leaking memory. That’s especially important in the server room. Some of our servers can run for months or even years before they are rebooted. And some of them process thousands of events per second. Even a small memory leak — the kind that you might ignore in a desktop program — would build up and cause problems in the server room.

In fact, one of the first lines in MiscSupport.h is:

typedef std::map< std::string, std::string > PropertyList;

A “property list” maps one string to another. 20 years ago, this was popular, but only in “slow” languages like scripting languages. Now I use this everywhere. Imagine a simple web query, like this:

https://trade-ideas.54solutions.com/StockInfo/_TopListResult.html?sort=MaxRV&WN=Test&MinPrice=5&MinVol5D=100000&XN=on&X_NYSE=on&show0=Price&show1=TV&show2=FCD

The software will ask questions like:

What do we “sort” by? The answer is “MaxRV”. That is, the stock with the highest relative volume goes on top.
What’s the first column to show? Look for “show0”, and you’ll see the field name is “Price”.
Should we see stocks from the NYSE? Look for “X_NYSE” and see that maps to “on”. So this exchange has been turned on.

This type of arrangement is popular in web software, because it is so simple and so powerful. But we also use it in the C++ software. In fact, we allow you to use the exact same configuration string whether you are on the web server or our proprietary C++ software. That’s useful if you switch between the two different environments. (More details on configuration strings.) This is one of so many uses of a “PropertyList” in our code.
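To make the idea concrete, here is a minimal sketch of how a query string like the one above could be turned into a PropertyList. The function names parseQueryString and demoTopListQuery are hypothetical and are not part of MiscUtil. The sketch assumes plain key=value pairs separated by “&” and does not decode %XX escapes, which a real web request would need.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map< std::string, std::string > PropertyList;

// Hypothetical helper for illustration only: splits "a=1&b=2" into a
// PropertyList. It does not decode %XX escapes or '+'.
PropertyList parseQueryString(std::string const &query)
{
  PropertyList result;
  std::string::size_type start = 0;
  while (start <= query.size())
  {
    std::string::size_type end = query.find('&', start);
    if (end == std::string::npos)
    {
      end = query.size();
    }
    std::string const pair = query.substr(start, end - start);
    if (!pair.empty())
    {
      std::string::size_type const equals = pair.find('=');
      if (equals == std::string::npos)
      {
        result[pair] = "";  // A key with no value maps to a blank string.
      }
      else
      {
        result[pair.substr(0, equals)] = pair.substr(equals + 1);
      }
    }
    start = end + 1;
  }
  return result;
}

// Usage sketch with part of the query string shown above.
void demoTopListQuery()
{
  PropertyList const config =
    parseQueryString("sort=MaxRV&WN=Test&MinPrice=5&X_NYSE=on&show0=Price");
  assert(config.find("sort")->second == "MaxRV");
  assert(config.find("show0")->second == "Price");
  assert(config.find("X_NYSE")->second == "on");
  assert(config.find("MaxPrice") == config.end());  // Never specified.
}
```

Once the request is in this form, the rest of the software only needs to ask the map for a name and does not care where the string came from.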

That example describes the way we set up a new request. But we use STL maps even in tight loops, when performance is important. Every time we get a new event or perform any interesting action, we record it in a map. We need this data to fine tune our software. And we need to use an STL map to store the data because these data structures are so flexible.

An STL map is very nice, but it’s incomplete. One of the trickiest things is to pull a value out of the map without changing it. Let’s go back to the top list configuration, described above. What if I ask, what’s the maximum price of a stock that a user wants to see? The simple version of that code looks like config["MaxPrice"]. In this case, the user didn’t specify a maximum price, so the result I get back is blank. That’s correct, but there’s a problem. We have now changed the object that stores the configuration. We are now storing the word “MaxPrice” even though the result is blank. That’s useless and it’s using resources, but that’s just how the [] operator on an STL map works.
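The side effect is easy to see in a few lines. This is a small sketch, not code from our library; the names entriesAddedByBracketRead and demoBracketSideEffect are made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

typedef std::map< std::string, std::string > PropertyList;

// Hypothetical helper: reports how many entries a read through [] adds.
std::size_t entriesAddedByBracketRead(PropertyList &config,
                                      std::string const &key)
{
  std::size_t const before = config.size();
  std::string const value = config[key];  // Inserts key -> "" if missing.
  (void)value;
  return config.size() - before;
}

void demoBracketSideEffect()
{
  PropertyList config;
  config["MinPrice"] = "5";

  // find() is a pure read. The map still has one entry afterwards.
  assert(config.find("MaxPrice") == config.end());
  assert(config.size() == 1);

  // Reading a missing key through [] quietly stores it.
  assert(entriesAddedByBracketRead(config, "MaxPrice") == 1);
  assert(config.size() == 2);

  // Reading a key that already exists adds nothing.
  assert(entriesAddedByBracketRead(config, "MinPrice") == 0);
}
```

Note that the [] version cannot even be called on a const map, which is another hint that it is not really a read.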

How bad is that? This exact example might not be too bad. We have a finite number of filters, so the configuration object won’t keep growing forever. And typically a configuration object is only used for a short time before we throw it out. But we keep some things longer, possibly forever. And we can have a lot more inputs. Imagine we have a map telling us the password associated with each user name. Again, this is server software. It might run for months or even years before being restarted. And the inputs come from the internet, so we can have any number of inputs. Imagine every time someone tries to log in, and they don’t have an account, we will store the bad username in memory forever. That’s not what we want!

So instead of config["MaxPrice"] we call our custom function, getPropertyDefault(config, "MaxPrice"), to look up a value in that table. This gives us similar results, but without storing the bad values.

We have, in fact, five versions of this function. They are all very similar, but it’s C++ so you need to handle different situations. Some versions use pointers and some don’t. One is exactly like another, except for the word “const” sprinkled liberally through it. That’s C++. It always takes a little more effort than you’d expect to write a reusable library. But it’s worth it. These functions are called in hundreds of places in our code.

What happens if you request an item that doesn’t exist? If you use the non-pointer version, you get a default value. You get something like 0 or blank. That’s a common theme in our code. When you look for something that doesn’t exist, you get a reasonable value and the software doesn’t crash. You don’t have to check every time for a missing value; that would be too much work. And we certainly don’t throw an exception or crash the program when we don’t find a value that we’re looking for. That might work for a desktop program, but we can’t have the server crash when we get an unexpected value.
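Here is a small sketch of that behavior, written for this article rather than copied from our library. The helper lookupOrDefault is a hypothetical stand-in for the non-pointer lookup, and the contrast with std::map::at() assumes a C++11 compiler, since at() throws std::out_of_range when the key is missing.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical helper for this sketch: returns the stored value, or a
// value-initialized ValueType (0, blank, a null pointer) on a miss.
template< class KeyType, class ValueType >
ValueType lookupOrDefault(std::map< KeyType, ValueType > const &table,
                          KeyType const &key)
{
  typename std::map< KeyType, ValueType >::const_iterator const item =
    table.find(key);
  if (item == table.end())
  {
    return ValueType();
  }
  return item->second;
}

// For contrast: returns true if std::map::at() throws for this key.
bool atThrowsOnMiss(std::map< std::string, int > const &table,
                    std::string const &key)
{
  try
  {
    (void)table.at(key);
    return false;
  }
  catch (std::out_of_range const &)
  {
    return true;
  }
}

void demoDefaults()
{
  std::map< std::string, int > counts;
  counts["AAPL"] = 3;
  assert(lookupOrDefault(counts, std::string("AAPL")) == 3);
  assert(lookupOrDefault(counts, std::string("MSFT")) == 0);  // Default.
  assert(counts.size() == 1);             // The miss was not stored.
  assert(atThrowsOnMiss(counts, "MSFT")); // at() would have thrown.
}
```

The caller of lookupOrDefault needs no special case for a missing key, which is the point: an unexpected input produces a harmless default instead of an exception that someone has to remember to catch.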

What about the pointer version? That returns NULL if we can’t find the value we’re looking for. That’s another common theme in our C++ software. It allows us to write some beautiful code, like this:

if (UserInformation *info = getProperty(allUsers, username))
{
  // do something with info
}

This is nice because we have the variable called “info” and that variable exists exactly when it has good data. You won’t get into the body of the if statement if we couldn’t find any info for that user. And you can’t try to use the word “info” if you are not in that area; the compiler would tell you that your code makes no sense. Beautiful! As all good code should be.
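A fuller sketch of the same pattern follows. The UserInformation fields, the UserTable typedef, and the functions findUser and recordLogin are all invented for this example; only the shape of the if statement comes from our real code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical record type for this sketch.
struct UserInformation
{
  std::string email;
  int loginCount;
};

typedef std::map< std::string, UserInformation > UserTable;

// Hypothetical stand-in for the pointer-returning lookup: a pointer to
// the stored value, or NULL if the user is unknown. Never inserts.
UserInformation *findUser(UserTable &allUsers, std::string const &username)
{
  UserTable::iterator const item = allUsers.find(username);
  if (item == allUsers.end())
  {
    return NULL;
  }
  return &(item->second);
}

// Returns true and counts the login if the user exists.
bool recordLogin(UserTable &allUsers, std::string const &username)
{
  if (UserInformation *info = findUser(allUsers, username))
  {
    ++info->loginCount;  // "info" is only visible where it is valid.
    return true;
  }
  // "info" is out of scope here, so it cannot be misused.
  return false;
}

void demoRecordLogin()
{
  UserTable allUsers;
  UserInformation alice;
  alice.email = "alice@example.com";
  alice.loginCount = 0;
  allUsers["alice"] = alice;

  assert(recordLogin(allUsers, "alice"));
  assert(!recordLogin(allUsers, "mallory"));
  assert(allUsers.size() == 1);  // The bad username was not stored.
}
```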

Here’s the actual code we use for looking up a value in a map. I hope you find it as useful as we do.

// This looks up a value in a map and returns a default value. The most
// important difference between this and [] is that this function never
// modifies the map. [] will create a new entry if the value you are looking
// for does not exist. This can cause a type of memory leak, if random values
// are continuously added.
template< class KeyType, class ValueType >
ValueType getProperty(std::map< KeyType, ValueType > const &properties,
                      KeyType propertyName,
                      ValueType defaultValue)
{
  typename std::map< KeyType, ValueType >::const_iterator item =
    properties.find(propertyName);
  if (item == properties.end())
  {
    return defaultValue;
  }
  else
  {
    return item->second;
  }
}

// This is similar to the function above, but this only works on pointers, and
// it always returns a pointer to the value in the map, or it returns NULL if
// the item is not in the map. If the data type is already a pointer and
// cannot be null, then the previous version makes a little more sense. This
// version would give you a pointer to a pointer if the value type is a
// pointer.
template< class KeyType, class ValueType >
ValueType const *getProperty(std::map< KeyType, ValueType > const &properties,
                             KeyType propertyName)
{
  typename std::map< KeyType, ValueType >::const_iterator item =
    properties.find(propertyName);
  if (item == properties.end())
  {
    return NULL;
  }
  else
  {
    return &(item->second);
  }
}

// This is similar to the function above, but the table and the resulting
// pointer are not const. This is the only version of the three that lets you
// modify a value in the table.
template< class KeyType, class ValueType >
ValueType *getProperty(std::map< KeyType, ValueType > &properties,
                       KeyType propertyName)
{
  typename std::map< KeyType, ValueType >::iterator item =
    properties.find(propertyName);
  if (item == properties.end())
  {
    return NULL;
  }
  else
  {
    return &(item->second);
  }
}

// This is similar to the first version of getProperty. This version,
// however, creates a default value based on the type of the map. This
// is mostly used when the value type is a pointer. In this case the
// desired default is almost certainly NULL. However, the compiler has trouble
// trying to match NULL to that template for some reason.
template< class KeyType, class ValueType >
ValueType getPropertyDefault(std::map< KeyType, ValueType > const &properties,
                             KeyType propertyName)
{
  return getProperty(properties, propertyName, ValueType());
}

// This rather ugly specialization is required when you have a map with
// std::string as the key type, and a literal string constant as the key.
template< class ValueType >
ValueType getPropertyDefault(std::map< std::string, ValueType > const &properties,
                             char const *propertyName)
{
  return getPropertyDefault(properties, std::string(propertyName));
}
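The trouble with NULL and with literal strings comes from template argument deduction: the compiler tries to deduce KeyType and ValueType from every argument, and a char const * or a NULL does not agree with what the map says. One possible alternative, shown here only as a sketch and not as what our library does, is to spell the key and default parameters through the map’s own key_type and mapped_type. Those are non-deduced contexts, so the types come from the map alone and ordinary conversions apply to the other arguments. The names getPropertyAlt and demoGetPropertyAlt are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Sketch of an alternative signature. KeyType and ValueType are deduced
// from the map only. The other two parameters are non-deduced, so a
// string literal converts to std::string and NULL converts to a pointer.
template< class KeyType, class ValueType >
ValueType getPropertyAlt(
  std::map< KeyType, ValueType > const &properties,
  typename std::map< KeyType, ValueType >::key_type const &propertyName,
  typename std::map< KeyType, ValueType >::mapped_type const &defaultValue =
    ValueType())
{
  typename std::map< KeyType, ValueType >::const_iterator const item =
    properties.find(propertyName);
  if (item == properties.end())
  {
    return defaultValue;
  }
  return item->second;
}

void demoGetPropertyAlt()
{
  std::map< std::string, std::string > config;
  config["MinPrice"] = "5";
  assert(getPropertyAlt(config, "MinPrice") == "5");  // Literal key works.
  assert(getPropertyAlt(config, "MaxPrice") == "");   // Blank default.

  int value = 7;
  std::map< std::string, int * > pointers;
  pointers["seven"] = &value;
  assert(getPropertyAlt(pointers, "seven", NULL) == &value);
  assert(getPropertyAlt(pointers, "eight", NULL) == NULL);
}
```

The trade-off is that one function now covers several of the cases above, at the cost of a signature that is harder to read.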

Again, this was a look at some of the most basic code that we use to run our servers. This is the base on which so many other things are built. We will keep exploring, and I’m sure you’ll see more examples of this code.
