Thursday 18 July 2013

Back to libssh2/boost asio

I'm coming back to my libssh2/boost asio code, after taking a (long) break.

Why the break? I didn't like some parts of the design, but I wasn't able to find alternatives. And, more importantly, while I got to a working design in a short time, as I began improving that code, I found my knowledge of the language lacking, so I've decided to pursue other problems, in order to broaden that knowledge.

Now, as I return, my first task is getting a cleaner design. Next, I will be finishing the logging to put the code on GitHub.

I'll be discussing the classes SessionHandle and RemoteSessionHandle in this post. These classes are a bit different now, but the main concepts remain. I'll also talk about the class SessionConnection, which I haven't presented yet.

My first stumbling block appeared when I added move semantics to my SSH session classes.

RemoteSessionHandle has a reference (a pointer, actually) to an instance of SessionHandle. As I said at the time, that's "because the libssh2 calls need the LIBSSH2_SESSION* stored in SessionHandle", i.e., in order to do its work, RemoteSessionHandle needs a LIBSSH2_SESSION*, and this is owned by SessionHandle. Therefore, a RemoteSessionHandle instance needs a SessionHandle instance from which to get the LIBSSH2_SESSION*.

When creating an SSH session on a remote host, I create an instance of SessionConnection, which has, among its data members, an instance of SessionHandle and an instance of RemoteSessionHandle.

After the SessionHandle is successfully constructed, RemoteSessionHandle will be constructed, taking SessionHandle as an argument, and storing its address in a pointer.

So, SessionConnection is performing a "sort-of" dependency injection here. Since RemoteSessionHandle doesn't own this SessionHandle instance (so, no delete necessary), and it needs to be nullable (for the move operation), I've decide to use a naked pointer.

However, I ran into a problem with move. I'l try to demonstrate using my amazing artistic skills. Here's what we have before the move happens:

movedFromSessionConnection           movedToSessionConnection
+------------------------------+     +----------------------------+
| movedFromSessionHandle       |     | movedToSessionHandle       |
|            ^                 |     |            ^               |
|            |                 |     |            |               |
| movedFromRemoteSessionHandle |     | movedToRemoteSessionHandle |
+------------------------------+     +----------------------------+


Both SessionHandles and both RemoteSessionHandles are members of SessionConnection, and each RemoteSessionHandle points to its own SessionHandle instance, and all's well and jolly. Now, this is what we get with a "regular" move ctor, i.e., a move ctor that moves all the data members:

movedFromSessionConnection --- mv ---> movedToSessionConnection
+------------------------------+     +----------------------------+
| movedFromSessionHandle ---- mv ----> movedToSessionHandle       |
|            ^ -----------------------------------+               |
|                                                 |               |
| movedFromRemoteSessionHandle - mv -> movedToRemoteSessionHandle |
+------------------------------+     +----------------------------+


movedToRemoteSessionHandle received the contents of movedFromRemoteSessionHandle, including the pointer to movedFromSessionHandle. Which, after being moved from, is in an unknown state. So, the way this is designed, the move operation shouldn't move the pointer. Which is the bit I don't like very much.

I have considered several alternatives, and I've asked here, trying to get some more. However, I believe I was looking at the problem from the wrong angle - while there are several ways to perform this injection, in the end, it all leads to the same question:
How is RemoteSessionHandle going to store its dependency?

There are two options:
  • Store it as a pointer
The current problem doesn't come from storing a pointer, but rather from the scope of what's being pointed to. If SessionHandle's scope was decoupled from SessionConnection's (meaning, if SessionHandle could outlive SessionConnection), I wouldn't have this problem. OTOH, since these are auxiliary classes for SessionConnection, I like the fact that their scopes are coupled.

  • Store it as a value
None of these classes are copyable. Initially, I considered I'd never implement copy; now, I believe I may, someday, implement it. Still, "no copy" means the only way for RemoteSessionHandle to store its dependency by value is to remove sessionHandle from SessionConnection and place it in RemoteSessionHandle, as a data member. This also means that RemoteSessionHandle becomes the provider of the LIBSSH2_SESSION*, e.g., for creating the channel when executing commands.

For now, I believe the second option is the cleanest, even though I don't like having to use RemoteSessionHandle to get to the LIBSSH2_SESSION*. However, since I can't really come up with a rational reason for it, this is what went for.

I've left an issue to address at a later date: We're not supposed to perform a move while a command is executing. The channel classes don't have either copy nor move ctor/assignment. I may change this in the future.

6 comments:

  1. Hello Paulo,

    I wanted to ask if you finally posted the code on Github. I would be very interested in it because I want to integrate libssh2 together with boost::asio which is already in use in my poject.

    Thanks,
    Mathias

    ReplyDelete
    Replies
    1. Hello, Mathias.

      I ended up not posting the code. As I reviewed it, and as I learned more about boost::asio, I realized the design was neither correct nor robust.

      The use case where I can guarantee it works is as follows:
      1. Connect to remote server via SSH.
      2. Execute command/script on remote server
      2.1. This step can be repeated several times, but it maintains no state - e.g., if you change to a different directory to execute a script, you must do it in the same client -> server request, something like "cd ./some/dir ; ./run_some_task.sh"
      3. exit

      After I got this use case working, I began a deeper examination of asio's patterns, in order to improve the design. However, shortly after that, I started taking on tasks at work that required me to learn a wider range of technology, and that hasn't really stopped yet. I love it, but it also means less time to dedicate to my personal work.

      If you can wait for 2-3 weeks, I can go back to the code and post the last stable version on Github. If your scenario fits what I described above, you can probably use it "as is". If your knowledge of asio is solid, I'm sure you can find plenty of ways to improve on my design.

      Thanks.
      Paulo Caetano

      Delete
    2. Hello Paulo,

      thank you very much for your response! I would appreciate very much if you could share your code, but only if you don't have to invest too much time... The use-case is exactly what I want to do.
      By the way: What have you used instead?

      Thanks,
      Mathias

      Delete
    3. No worries, Mathias. It's not so much a matter of investing too much time, but finding the time to do it, that's why I'm asking for 2-3 weeks. I expect it won't take more than 2 hours of actual work, but I'm in the middle of a particularly hectic month, which makes it all more unpredictable.

      I ended up using this code, because, despite its flaws, it worked for the intended use case, and it allowed me to automate a few manual tasks.

      Delete
    4. I've posted a basic example here.

      I've tested it with Qt Creator on MSYS2, using their packages.

      Tell me if you run into any problem. I won't be able to check this again until next Tuesday.

      Delete
    5. Great - thank you very much!
      /Mathias

      Delete