Saturday 20 July 2013

Measuring execution time - A tale of two systems

As I mentioned previously, I've been following Alexander Stepanov's A9 lectures.

After solving my previous problems, I was ready to run some tests. Before testing my version of minmax_element(), I've decided to test both the lecture's version and std's version.

The test shows the number of comparisons and time elapsed between two different strategies:
  • Using a simple approach, function minmax_element_simple(), shown below.
  • Using minmax_element().

template <typename I, typename Compare> // I is ForwardIterator
inline
std::pair<I, I> minmax_element_simple(I first, I last, Compare comp)
{
    return std::make_pair(course::min_element(first, last, comp), 
        course::max_element(first, last, comp));
}


Let's just say the results were not exactly as expected. minmax_element() performed 25% less comparisons. However, using uint64_t, it always took longer; how much longer depended on compiler (I've tested in GCC 4.7 and VC 11). Sometimes it took 35%-45% longer.

Unable to figure out what exactly was going on, I then sent Alexander Stepanov an e-mail. He was kind enough to answer, and told me it had to do with the cost of comparisons. As the cost of comparisons rises, the implementation that performs more comparisons will be penalized.

And that reminded me I had just the right class to run some tests, in my SSH code. Namely, this:

struct SSHSessionKey
{
    explicit SSHSessionKey(std::string const& pHost = "NO_HOST", 
        std::string const& pUser = "NO_USER"
        std::string const& pPort = "22")  
    : host(pHost), user(pUser), port(pPort) {}
 

#ifndef _MSC_VER
    ~SSHSessionKey() = default;
    SSHSessionKey(SSHSessionKey&&) = default;
    SSHSessionKey& operator=(SSHSessionKey&&) = default;
    SSHSessionKey(SSHSessionKey const&) = default;
    SSHSessionKey& operator=(SSHSessionKey const&) = default;
#endif
 
    bool operator<(SSHSessionKey const& other) const;
 

    std::string host;
    std::string user;
    std::string port;
};
 

// THE COMPARISON IS PERFORMED BY ATTRIBUTING A WEIGHT TO EACH FIELD: 
// HOST > PORT > USER.
inline
bool SSHSessionKey::operator<(SSHSessionKey const& other) const
{
    return (this->host < other.host) || (this->port < other.port) ||  
        (this->user < other.user);
}


Definitely, a higher cost than comparing two numbers.

So, after changing the code (and setting a lower limit, because 128 * 1024 * 1024 proved to be a bit too much for my system), I was able to verify that, indeed, for non-trivial cases, minmax_element() performs better. In this particular case, the average improvement was around 33%.

As Alexander Stepanov wrote in his reply, "Timing is tricky". On his system, e.g., he got different results from mine, even with uint64_t. Which drove home something I've been considering for some time.

You should always have a non-production system that's identical to your production system. And whenever you need to measure the efficiency of your software, that's where you should do it. Compile, deploy, and run it just as if it's the real thing. Yes, I know, instrumental code may have an impact, may introduce differences. Nothing's perfect. But while we have to live with the imperfections we can't remove, I see no reason why we should put up with those we can remove.

The way compilers/OSs/processors impact - and rearrange - our code's execution, we have to pay attention to the system where we measure performance; small differences between systems may translate into huge surprises when we finally run our code in production.

Thursday 18 July 2013

Back to libssh2/boost asio

I'm coming back to my libssh2/boost asio code, after taking a (long) break.

Why the break? I didn't like some parts of the design, but I wasn't able to find alternatives. And, more importantly, while I got to a working design in a short time, as I began improving that code, I found my knowledge of the language lacking, so I've decided to pursue other problems, in order to broaden that knowledge.

Now, as I return, my first task is getting a cleaner design. Next, I will be finishing the logging to put the code on GitHub.

I'll be discussing the classes SessionHandle and RemoteSessionHandle in this post. These classes are a bit different now, but the main concepts remain. I'll also talk about the class SessionConnection, which I haven't presented yet.

My first stumbling block appeared when I added move semantics to my SSH session classes.

RemoteSessionHandle has a reference (a pointer, actually) to an instance of SessionHandle. As I said at the time, that's "because the libssh2 calls need the LIBSSH2_SESSION* stored in SessionHandle", i.e., in order to do its work, RemoteSessionHandle needs a LIBSSH2_SESSION*, and this is owned by SessionHandle. Therefore, a RemoteSessionHandle instance needs a SessionHandle instance from which to get the LIBSSH2_SESSION*.

When creating an SSH session on a remote host, I create an instance of SessionConnection, which has, among its data members, an instance of SessionHandle and an instance of RemoteSessionHandle.

After the SessionHandle is successfully constructed, RemoteSessionHandle will be constructed, taking SessionHandle as an argument, and storing its address in a pointer.

So, SessionConnection is performing a "sort-of" dependency injection here. Since RemoteSessionHandle doesn't own this SessionHandle instance (so, no delete necessary), and it needs to be nullable (for the move operation), I've decide to use a naked pointer.

However, I ran into a problem with move. I'l try to demonstrate using my amazing artistic skills. Here's what we have before the move happens:

movedFromSessionConnection           movedToSessionConnection
+------------------------------+     +----------------------------+
| movedFromSessionHandle       |     | movedToSessionHandle       |
|            ^                 |     |            ^               |
|            |                 |     |            |               |
| movedFromRemoteSessionHandle |     | movedToRemoteSessionHandle |
+------------------------------+     +----------------------------+


Both SessionHandles and both RemoteSessionHandles are members of SessionConnection, and each RemoteSessionHandle points to its own SessionHandle instance, and all's well and jolly. Now, this is what we get with a "regular" move ctor, i.e., a move ctor that moves all the data members:

movedFromSessionConnection --- mv ---> movedToSessionConnection
+------------------------------+     +----------------------------+
| movedFromSessionHandle ---- mv ----> movedToSessionHandle       |
|            ^ -----------------------------------+               |
|                                                 |               |
| movedFromRemoteSessionHandle - mv -> movedToRemoteSessionHandle |
+------------------------------+     +----------------------------+


movedToRemoteSessionHandle received the contents of movedFromRemoteSessionHandle, including the pointer to movedFromSessionHandle. Which, after being moved from, is in an unknown state. So, the way this is designed, the move operation shouldn't move the pointer. Which is the bit I don't like very much.

I have considered several alternatives, and I've asked here, trying to get some more. However, I believe I was looking at the problem from the wrong angle - while there are several ways to perform this injection, in the end, it all leads to the same question:
How is RemoteSessionHandle going to store its dependency?

There are two options:
  • Store it as a pointer
The current problem doesn't come from storing a pointer, but rather from the scope of what's being pointed to. If SessionHandle's scope was decoupled from SessionConnection's (meaning, if SessionHandle could outlive SessionConnection), I wouldn't have this problem. OTOH, since these are auxiliary classes for SessionConnection, I like the fact that their scopes are coupled.

  • Store it as a value
None of these classes are copyable. Initially, I considered I'd never implement copy; now, I believe I may, someday, implement it. Still, "no copy" means the only way for RemoteSessionHandle to store its dependency by value is to remove sessionHandle from SessionConnection and place it in RemoteSessionHandle, as a data member. This also means that RemoteSessionHandle becomes the provider of the LIBSSH2_SESSION*, e.g., for creating the channel when executing commands.

For now, I believe the second option is the cleanest, even though I don't like having to use RemoteSessionHandle to get to the LIBSSH2_SESSION*. However, since I can't really come up with a rational reason for it, this is what went for.

I've left an issue to address at a later date: We're not supposed to perform a move while a command is executing. The channel classes don't have either copy nor move ctor/assignment. I may change this in the future.