Tuesday, 13 October 2015

Visual Studio 2015, ICU, and error LNK2005

I'll begin by saying that I'm just going to ignore the fact that I haven't written anything in nearly nine months.


While building ICU 56.1 with VS 2015, I was greeted with thousands of errors like this (also described here by someone who came across the same problem):

error LNK2005: "public: static bool const
std::numeric_limits<unsigned short>::is_signed"
(?is_signed@?$numeric_limits@...@std@@2_NB) already defined in

This is defined in <limits>, in a statement like this:

_STCONS(bool, is_signed, false);

Looking at the pre-processor output, we can see its actual definition:

static constexpr bool is_signed = (bool)(false);

If I understood the Standard correctly, this should be OK, and there should be no duplicate symbols during linking. So, I was still missing a logical cause for this.
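To make that expectation concrete, here is a minimal sketch. It is not the actual <limits> source: MyLimits is a made-up stand-in for std::numeric_limits. It assumes the usual rule that a static data member of a class template may be defined in every translation unit that includes the header, with the linker expected to merge the copies.

```cpp
#include <limits>

// Hypothetical, simplified stand-in for a library trait class;
// this is not the real <limits> implementation.
template <typename T>
struct MyLimits {
    // Same shape as the preprocessed line shown above.
    static constexpr bool is_signed = (bool)(false);
};

// Used only as a compile-time constant here, so a conforming toolchain
// has no reason to report a duplicate symbol, no matter how many source
// files include this header.
static_assert(MyLimits<unsigned short>::is_signed ==
                  std::numeric_limits<unsigned short>::is_signed,
              "the stand-in should agree with the real trait");
```

If a header like this links cleanly in several source files without /Za and fails with it, that points at the compiler option rather than at the code.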

The usual internet search for «ICU LNK2005» didn't bring anything useful, except for the link above.

Then, as I concentrated my search on LNK2005, I came across this post. The same mysterious behaviour, but now there was a plausible explanation, in a comment by MS's Stephan T. Lavavej, in a quoted post from an MSDN blog:

We recommend against using /Za, which is best thought of as "enable extra conformance and extra compiler bugs", because it activates rarely-used and rarely-tested codepaths. I stopped testing the STL with /Za years ago, when it broke perfectly conformant code like vector<unique_ptr<T>>.  
That compiler bug was later fixed, but I haven't found the time to go re-enable that /Za test coverage. Implementing missing features and fixing bugs affecting all users has been higher priority than supporting this discouraged and rarely-used compiler option.

So, after removing /Za from all projects in ICU's allinone VS Solution (Project Properties -> Configuration Properties -> C/C++ -> Language -> Disable Language Extensions -> No), I was able to build it with no errors, on all configurations (x86/x64, debug/release).

Apparently, it's one of those rare cases where the error is actually in the compiler, not in the code.

Saturday, 31 January 2015

CA Certificates - The tale of the invisible certificate

I've been through a memory upgrade on my 5-year-old PC. My goal is to set up a few VMs running simultaneously, because I need to widen my scope for experimentation. I found out my BIOS has an incompatibility with the memory DIMMs currently available, but fortunately a friend lent me 8GB, so I can start working now, while I try to sort out this mess.

As I set up each VM, I'm importing my bookmarks, so that I have my net environment available "everywhere". And I've come across a curious situation, regarding certificates.

One of the URLs I have on my bookmarks is https://www.ddo.com/forums. The first time I accessed it on Firefox, I got an error message:
Peer's certificate has an invalid signature. (Error code: sec_error_bad_signature)

Using openssl s_client, I checked that ddo.com sends only its own certificate, not the chain, so I looked up the chain in IE, and checked the intermediate CA on Firefox's certificate store. It was there, but it was a different certificate - different signature, different validity (both valid, because the validities on both certificates overlapped), different issuer, only the subject was the same.

I exported that CA certificate from IE, and ran openssl verify using each CA certificate; using the one from Firefox certificate store, I got an error; using the site's CA certificate, the validation succeeded.

So, I imported the site's CA certificate to Firefox, accessed the site, and all was well again.

Then, I checked Firefox's certificate store. And I only found the exact same certificate that was there already, and which wasn't previously validating ddo.com's certificate. Except that now it was.

And much scratching of head ensued.

Until yesterday, when discussing this at lunch with a friend, he told me the obvious: "Well, if you imported it, and the site's certificate is now correctly validated, then it must be there, even if you can't see it". And that gave me a memory jolt, to an issue I had a little more than a year ago, with Sun One's web server certificate store, where we had two certificates for the same CA, but only one was visible on the web console. In order to correctly see both, I had to use certutil on the command line.

And in this case, the solution was the same:

certutil -L -d sql:path to Firefox profile directory -n certificate subject

Which promptly listed the two certificates.

And another Mystery of the Universe was solved during a meal.

I don't understand why the GUI shows just one certificate. I'm not going to say it's stupid because it may be a reasonable decision, based on knowledge I don't have. But completely hiding the fact that a CA has two simultaneously valid certificates on the store is terribly misleading, and it's definitely not what I'd call a good solution.

In the end, it was command line to the rescue... as usual.

Saturday, 10 January 2015

Visual Studio - Getting to your debugging symbols

I've recently been reformulating my lib environment. Basically, it consists of:
  • A landing zone, where I install the library sources, and run the build process. It has a folder for each lib, with each version in its own sub-folder.
  • A library root folder, with sub-roots for mingw and MSVC, where I store the files resulting from the builds. Again, each lib folder has a sub-folder for each installed version.

There's a bit more to it, but it's not important for today's post.

This reformulation has gone through several iterations, and it's still a work in progress. This time, I had the following goals:
  • G1 Getting some further progress on automating the process of building libraries from source, both release and debug versions. Actually, it was this requirement for debug versions of all the libs that led to my patch to OpenSSL with a debug configuration for mingw32.
  • G2 Correcting some bad decisions regarding the lib folders' names, especially when it comes to version numbers. The goal is to use the same number format each lib uses for its own files, where applicable.
  • G3 Clearing the landing zone after building the libs. Now, when I finish building, I run make (actually, mingw32-make or nmake) clean.

Point G3 led me on another learning experience.

The most common debug option when using MSVC seems to be /Zi, which stores the debug symbols in PDB files. As such, I haven't looked into the other options, which store debugging information in the object files themselves; I won't discuss them in this post, but if I had to hazard a guess, I'd say the behaviour is the same as the one described below.

During a debug build, the compiler/linker stores the absolute path to the PDB file on the executable files (EXE or DLL). When you fire up the debugger, as it loads these executable files, it looks up the PDB using this path, to load the symbols. If it can't find the PDB, it then goes through other steps.

On the open-source projects I've seen, I've come across two default behaviours - either the PDB files are not copied to the library folder at all (e.g., Boost); or they're copied to the folder containing the static libraries/import libraries, but not the DLLs themselves (e.g., ICU). I call these default behaviours, because there may be options for getting a different behaviour; I've looked for these, but found none.

This, combined with point G3 above, is not-so-good news, because make clean deletes the PDB files. So, when we fire up the debugger, it cannot find the symbols for these DLLs. It never happened before because my procedure has always been:
  • build release.
  • clean up.
  • build debug.
  • don't clean up.

So, the PDBs were always available at the landing zone, i.e., the path stored on the executables. And even though some libs copied them to the lib folders, these weren't actually used when the DLLs were loaded by the debugger.

There are several options for dealing with this, including these three:
  • O1 Setting up a local symbol server, which, as I understand it, is more of a local cache to store symbol files.
  • O2 Adding each individual folder where PDB files are installed to the _NT_SYMBOL_PATH env variable, which MS debuggers use to locate symbol files.
  • O3 Manually copying the PDB files to the DLLs folders.

I believe option O1 is the best, but I'll leave it for another iteration. For now, I'll go with option O3. Like I said, this is a work in progress, and I don't feel any particular pressure to go for the optimal solution (which I'd have to test before adopting). I'd rather get to a working solution faster, in order to meet my self-imposed deadline (which ends this weekend).

Of course, once you have all this worked out, you still need your debugger to load the correct DLLs. You may have other versions of those DLLs on your path; with popular libraries, like OpenSSL, this is more common than you may think.

On Qt Creator, assuring that you load the correct DLLs is very simple. On Projects mode (Ctrl + 5), you switch to the Run configuration and edit the PATH on the Run Environment. Since I don't have these libraries on the PATH, I always need to do this; if you usually have your debug libraries on the PATH, you don't need to edit anything.

Visual Studio is a different sort of creature. A larger sort of creature, where everything usually takes a bit more work to find.

I knew it was on Project Properties (Alt + F7). At first, I thought it was on Configuration Properties -> VC++ Directories -> Executable Directories. When I hit help, it took me here, where we can read: "Directories in which to search for executable files. Corresponds to the PATH environment variable". Fine, that's just what we need. Somewhat later, and after a moderate amount of gnashing of the dental (not mental, mind you) persuasion, I noticed that the description on the project Property Pages was a wee-bit more complete, namely "Directories in which to search for executable files while building a VC++ project". Ah... right... building the project... as in, "not running the executable".

Then, I turned to the next obvious choice, Configuration Properties -> Debugging -> Environment. This time, I read the description before going for the help button. It is quite helpful, it says "Specifies the environment for the debugee, or variables to merge with existing environment". After reading this, I knew exactly what to do. Which was hitting the help button, and hoping the help page for this option was more useful.

Fortunately, it was. We control the PATH here, using something like this: PATH=E:\Dev\lib\msvc\openssl\openssl1_0_1j\debug\bin;E:\Dev\lib\msvc\icu\icu54_1\debug\bin;%PATH%.

And, finally, after some gnashing of the mental persuasion, I can say that Visual Studio's debugger and I are finally getting along just fine.

As they say, until next time... when I turn this into reusable project configurations, instead of having to specify these manually on every project.

Thursday, 8 January 2015

OpenSSL - Debug build configuration for mingw32

I've just submitted a patch to OpenSSL, to include a debug build configuration for mingw32, as it currently lacks one.

Not only have I done the obvious - removed optimizations and added the -g compiler option - but I've also used debug-Cygwin as a starting point to include a set of debug #defines for OpenSSL. You can find the patch here.

At first, I sent it to the openssl-dev mailing list. Later, I found out there was already an issue open for this on OpenSSL's RT.

Anyway, the patch makes the changes below.


config is the script called from the command line for configuration, which will generate the makefile.

The patch adds MSYS* to the script's platform guessing mechanism, allowing us to build on MSYS2. It wasn't really necessary for the debug configuration, but I like to have the ability to build on MSYS2.


Configure is a perl script that is called by config.

The patch adds the mingw32 debug configuration (debug-mingw), and changes the part of the script that defines the executable extension, to look for /mingw/ instead of /^mingw/, which then makes it work correctly with both mingw configurations.

As a side note, I think it would have been better to name the debug configurations as <platform-debug>, rather than <debug-platform>, but I don't know the reasons behind this, so I might be wrong.


This file is a base for the final makefile.

The patch changes platform checks from mingw and mingw* to .*mingw and *mingw*. Curious, now that I'm writing this, it's the first time I've noticed the incoherence. I'll probably go back to the testing board and change that to *ming* on both.


This file is used when we're building shared libraries (DLLs on mingw).

This is where I was more bold with the changes. The original file was always passing -Wl,-s to the linker, meaning it would strip the symbols from the resulting binaries. However, on a debug build we want to keep the symbols; that's the main reason for wanting a debug build in the first place.

So, the patch not only adds a check for .*mingw instead of mingw, it also changes the linker arguments for the debug configuration, excluding the -s.

And that's it. Not that much work, but a lot of experimenting. And, since shell script and make are not the most user-friendly debugging experiences, it has required a lot of creative experimenting.

Hope someone finds it useful.

Saturday, 27 December 2014

'Tis the Season...

... to be jolly, and grateful, and hopeful.

On the personal front...

...this year has been very positive, the only clouds on the horizon being some health issues. If I had to choose the most important points of 2014, I'd go with these:
  • The level of complicity and understanding between me and my wife has increased tremendously. I believe we achieved this by undergoing a similar change, which takes us right to the next point...
  • We both began going at life with a more relaxed attitude. It's a daily exercise to keep from slipping, but it's quite worth it. Obviously, it occasionally slips, but not only have we gotten better at identifying these slips, both on ourselves and on each other, we are also quicker to defuse these situations, often by sharing a good laugh at ourselves.
The kids are working their way through college, on what are the first steps of their own journey, their own adventure. It's time they take full control of the pen and start writing their story; that's their part. Our part is hoping everything they've taken in through the years will serve them well in getting their bearings as they set out. More than hopeful, we're confident it will.
I'm a somewhat spiritual guy, although I don't often stop to think about it. However, looking at 2014, I feel blessed; I already knew I'd found someone more understanding of my failings than I ever deserved, but this year took it to a new height. If I have so much to write about in the following point, it is because I was fortunate enough to find someone as understanding as my wife.
If you don't care much for mildly technical stuff, you can jump straight to the end of the post.

On the professional front...

...this has been a year where I continued a trend that began in mid-2013, namely, generalization.
When it all began in 2011, I had picked up C#. Then, because I felt I wasn't learning anything other than the language, I added C++. Actually, my plan was to add C++, but I ended up switching to C++, and C# was left behind. The learning experience was a lot more intense, not just about the language, but also about the whole system - processor, OS, environment, external libs, debugging, assembly, and a whole lot of etc.

Then, I accidentally set out on a task for which I found little help on the web. So, for much of what I was doing, I was on my own. While I did manage to get a working result (which is still a top-hitter on google.com and bing.com), I wasn't totally happy with it, and I suspected I'd have been even less happy if my knowledge about what I was doing hadn't been so lacking. So, I stepped back and went back to basics. And I quickly learned that, indeed, the potential for "improvement" (i.e., correction) in what I'd done was much bigger than I had anticipated.
Then, as I began taking on more technical tasks at work, the generalization began - certificates, network, DNS, managing Linux, setting up environments according to specific constraints, managing web servers, managing Weblogic server, picking up new languages (e.g., Ruby), revisiting familiar languages (e.g., Java).
At the same time, I've taken the learning experience provided by C++ deeper - assembly debugging, system traces, building gcc from source and installing it without touching the system gcc, doing the same with perl. And I've also begun getting my feet wet with Javascript (bootstrap and query.js) and Android development.
And before I noticed, I had not only changed my course, but I was quite satisfied with that change. This is where I see myself going, becoming a generalist. I love solving problems, and you need a widespread knowledge to do it; you don't always find the cause of a problem at the same level, and I don't like getting stuck for lack of knowledge. I also don't like getting stuck for lack of system access, but that's the way things are when you work in a large-ish corporation.
So, here are my New Year's resolutions, a.k.a., goals for 2015:
  • Redesign my libssh2 + asio components, incorporating what I've learned in the meantime. And hoping that two years from now I may look at what I've done, say "What was I thinking??!!", and repeat this goal again, as I keep learning more.
  • Pick up a functional language, probably Haskell or Erlang. It's about time I get my feet wet on functional programming. I love what I've learned about generic programming in C++ (actually, what I'm still learning), but it's time to add something more to the mix.
  • Continue my exposure to Javascript and Android.
  • Deepen my knowledge of systems/network administration.
  • Increase my knowledge of low-level tracing/debugging. It's time to begin some serious experiments with loading up core dumps and getting my bearings, and getting more mileage out of stuff like Windows Performance Analyzer.
Too ambitious, you say? Yes, I know. I won't manage to do all of this? Probably not. Which is a good thing, otherwise 2016 would be a very boring year.

At the end of the day...

...our family wishes you and your loved ones a Merry Christmas and a Happy 2015, filled with love, joy, and peace.

Sunday, 7 December 2014

A simple class to read files

Yes, I'm olde enough to know "simple" is a misnomer.
I have a new addition to my bluesy github repo.
Part of what I do at work involves reading through user-un-friendly log files. How user-un-friendly? Just to give a small example, the cases where a "log record" is on a single line are the exception (I wish it was just the Exceptions), and there's usually no easy way to correlate the beginning/end of operations with a single grep. This means I spend a lot of time creating scripts/small programs to process those log files.
I'm not an enthusiast of shell scripting. Oh, I can persuade it to do some complex stuff, yes, but it always seems to obey me reluctantly, and I'm left with the nagging feeling that, somehow, it's laughing behind my back. Not to mention that the resulting scripts are very... spaghetti-like. I find it hard to think modularly when writing shell scripts. I had a similar issue with Perl. It's still there with Ruby, but being modular with Ruby is a lot easier for me.
However, I don't have this difficulty with C++. Or Java, or C#, or Delphi/Lazarus, for that matter. It must be some sort of mental block, because while I couldn't create a modular design in shell scripting to save my life, that comes as second nature when I'm working in C++.
So, when I have to create some kind of custom tool to process these files, most of the time I turn to C++. Thanks to C++1y/Boost, I don't take much longer to write it than I would with any other tool/language, and I definitely appreciate that when I need to change something a few weeks/months later, I can find my bearings much more quickly.
And, after creating a few very similar programs, all driven by a file-reading loop, I figured I had enough use-cases to create...


A class... actually, a class template, that reads lines from a file. Surprising, heh?
template <typename LineMatcher = SimpleLineMatcher,
    typename LineCounter = SimpleLineCounter<unsigned long>>
class FileLineReader : private LineMatcher, private LineCounter
The concept is simple - read a line, which becomes the current line, and give the caller a way to get it (or copy it). Then, we add some nuggets of convenience:
  • Keep a counter of read lines. Implemented by LineCounter.
  • Supply matching operations, that not only perform matching on the current line, but also allow skipping lines based on matching/non-matching. The matching is implemented by LineMatcher; which is then used by FileLineReader to implement the skipping.

Why inheritance, instead of composition? Because there will be cases where LineMatcher and LineCounter have no state, and a data member is a bit of a waste (yes, a tiny little bit of a waste). Can this be abused? Absolutely, but you know - protect against Murphy, not Machiavelli.
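The size argument can be checked with a small sketch. The types below are made up for illustration and are not the ones in the repo; the only assumption is a policy class with no data members, used once as a data member and once as a private base.

```cpp
#include <string>

// Hypothetical stateless policy: behaviour only, nothing to store.
struct StatelessMatcher {
    bool Matches(const std::string& line, const std::string& text) const {
        return line.find(text) != std::string::npos;
    }
};

// Composition: the empty policy must still have its own address, so it
// takes at least one byte, usually rounded up by padding.
struct ReaderWithMember {
    StatelessMatcher matcher;
    unsigned long line_number = 0;
};

// Private inheritance: the empty base can share its address with the
// first data member, so it adds nothing to the object size.
struct ReaderWithBase : private StatelessMatcher {
    unsigned long line_number = 0;
};

static_assert(sizeof(ReaderWithBase) == sizeof(unsigned long),
              "the empty base should take no space");
static_assert(sizeof(ReaderWithMember) > sizeof(ReaderWithBase),
              "a data member always takes some space");
```

On a typical 64-bit build that is 8 bytes against 16, which is the "tiny little bit" in question.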

Skipping lines 

The first line skipping functions I introduced were SkipMatchingLines(), which skips lines while there's a match, and SkipLinesUntilMatch(), which skips lines until it finds a match.
These functions share a similar trait - their stop condition is met after reading the line that triggers the stopping condition. Suppose we have this file:

[2014-01-01 00:00:00.000] match-1 This is line 0
[2014-01-01 00:00:00.100] match-1 This is line 1
[2014-01-01 00:00:00.200] match-1 This is line 2
[2014-01-01 00:00:00.300] match-2 This is line 3
[2014-01-01 00:00:00.400] match-2 This is line 4
Something like this

FileLineReader<> flr{kFileName};
// Process lines
can only stop when the line "... line 3" is read, because there's no way we can perform a match against a line that hasn't been read yet. This means that when we reach "// Process lines", the current line will be the first line to process, so we should "process-then-read" (do-while), rather than the more intuitive "read-then-process" (while).
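A self-contained sketch of the "process-then-read" pattern follows. It deliberately does not use FileLineReader: SkipWhileContains and CollectAfterSkipping are hypothetical free functions over a plain std::istream, and "matching" is assumed to be a simple substring search.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: skips lines while they contain `text`.
// On success, `current` holds the first non-matching line, which has
// already been consumed from the stream.
bool SkipWhileContains(std::istream& in, const std::string& text,
                       std::string& current) {
    assert(!text.empty());  // an empty search text would match every line
    while (std::getline(in, current)) {
        if (current.find(text) == std::string::npos) {
            return true;  // stop condition met only after reading this line
        }
    }
    current.clear();
    return false;  // every line matched, nothing left to process
}

// Process-then-read: handle the current line first, then fetch the next.
std::vector<std::string> CollectAfterSkipping(std::istream& in,
                                              const std::string& text) {
    std::vector<std::string> processed;
    std::string current;
    if (!SkipWhileContains(in, text, current)) {
        return processed;
    }
    do {
        processed.push_back(current);  // "processing" is just collecting here
    } while (std::getline(in, current));
    return processed;
}
```

With the sample file above and "match-1" as the text to skip, the do-while starts on "... line 3", which a plain while loop would have thrown away by reading past it.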

I've entertained the notion of using tellg()/seekg() to rewind the file (IOW, un-read the line), but I didn't even get started, after reading about how it behaves in text mode. So, I'll stick with "process-then-read", for now.

Another common scenario I encounter is skipping a certain number of lines, usually a header. So, I've added this:
void SkipNumberLines(unsigned int number_lines);

Because we're skipping lines independently of any matching, it didn't make sense to implement this in LineMatcher; so, I've implemented it straight in FileLineReader. I'm not completely happy with this solution, but I figured it's better than no solution.

Skipping dilemmas

I also thought it would make sense to keep semantic coherence between all the line-skipping functions. So, SkipNumberLines() should stop on the first line to process, not on the last line skipped (even though it could, because we're skipping a known number of lines), just like the other functions.
No biggie, we just perform an extra read - so, for SkipNumberLines(3), we actually read 4 lines. Lovely, everything's coherent, little ponies are happy, and fairies are spreading pixie dust and sneezes throughout the realm.
The thing is... nobody expects the Spanish Inquisition... their chief weapon is surprise and SkipNumberLines(0)... their two chief weapons are surprise, an almost fanatical devotion to the Pope, and SkipNumberLines(0)... OK, among their weaponry we can find elements as diverse as... blah, blah, blah, and SkipNumberLines(0).

Yep, SkipNumberLines(0).

Let's say you have a function that calculates how many lines to skip, based on, e.g., an input argument - it could be the type of file being processed, or the phase of the moon; then, you could use it like this:

You're aware CalculateNumberLinesToSkip() could return 0 (you could, say, be processing a file without a header). And I'd venture a guess that you would probably expect that SkipNumberLines(0) would skip, you know, a number of lines kinda equal to... zero. As in, more than -1, but definitely less than 1.

Which left me with two alternatives:
  1. Treat 0 as a special case. Which would force upon the caller the "specialness" of this case.
  2. Indulge in a fanatical devotion to coherence - read one line on SkipNumberLines(0), and consequences be damned.
  3. Forget coherence, and accept that even though they're all line-skipping functions, they can have different semantics/post-conditions.

OK, enough Spanish Inquisition jokes.
I've settled for alternative 3. I'm not entirely sure it's the best option, and as I get more use out of this class, I expect to find new scenarios and patterns, which will provide me with more data to revisit this design later. But, for now, it seems to be the safest choice.
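One possible reading of alternative 3, as a sketch: the counting skip consumes exactly the requested number of lines and nothing more, so the caller goes back to "read-then-process" afterwards. TinyLineReader and FirstLineAfterSkipping are hypothetical stand-ins, not the class from the repo.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the reader; only the counting skip is sketched.
class TinyLineReader {
public:
    explicit TinyLineReader(std::istream& in) : in_(in) {}

    // Reads the next line into the current line; false at end of input.
    bool ReadLine() { return static_cast<bool>(std::getline(in_, current_)); }

    // Consumes exactly number_lines lines (or fewer if the input ends),
    // so SkipNumberLines(0) leaves the stream untouched.
    void SkipNumberLines(unsigned int number_lines) {
        for (unsigned int i = 0; i < number_lines && ReadLine(); ++i) {
        }
    }

    const std::string& CurrentLine() const { return current_; }

private:
    std::istream& in_;
    std::string current_;
};

// Usage sketch: the first line seen after skipping a header of any size,
// including a header of zero lines.
std::string FirstLineAfterSkipping(std::istream& in, unsigned int header_lines) {
    TinyLineReader reader(in);
    reader.SkipNumberLines(header_lines);
    return reader.ReadLine() ? reader.CurrentLine() : std::string();
}
```

The price is the one described above: after this call the current line is the last skipped line, not the first line to process, unlike the matching-based skips.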

And there you have it. Quite modest, yes, but it has been saving me some boilerplate code in these last few weeks.

I don't know yet how it will evolve. I can feel an itch with regards to a text mode rewind, but I'll have to dive into streams and buffers much more deeply than I feel like, at the moment.

And right now I have other challenges awaiting... impatiently, I might add. A crash course in some obscure aspects of Weblogic administration to better diagnose socket problems without resorting to strace.


Sunday, 14 September 2014

Performance puzzle(d)

So, back to programming.
Every now and then, I pick up a programming puzzle. The goal is not only to solve the puzzle, but also to learn more about the performance of my solutions.
A few days ago, I decided to take a shot at making a Cracker Barrel solver. Simple stuff, just brute-force your way through solving the puzzle. First, I started with a standard 15-hole board. Then, I moved on to a configurable board (but never smaller than 15 holes). Then, I implemented getting the board setup (including the list of all possible moves) just from the number of holes. Then, I decided that this list would be kept in an std::array, which meant getting this information at compile-time, which gave me an excuse for a little bit of basic recursive template meta-programming lite.
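For a flavour of what "recursive template meta-programming lite" can look like, here is a minimal sketch. It is not the solver's code: HolesInBoard and Board are hypothetical names, and it goes from the number of rows to the number of holes (the simpler direction), assuming a triangular board.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: holes in a triangular board with Rows rows,
// computed by recursive instantiation (Rows + holes in Rows - 1 rows).
template <std::size_t Rows>
struct HolesInBoard {
    static constexpr std::size_t value = Rows + HolesInBoard<Rows - 1>::value;
};

// Base case that ends the recursion.
template <>
struct HolesInBoard<0> {
    static constexpr std::size_t value = 0;
};

// The result is a compile-time constant, so it can size a std::array.
template <std::size_t Rows>
using Board = std::array<bool, HolesInBoard<Rows>::value>;

static_assert(HolesInBoard<5>::value == 15, "the standard board");
static_assert(HolesInBoard<6>::value == 21, "one row more");
static_assert(HolesInBoard<7>::value == 28, "two rows more");
```

The same idea, a general case that refers to a smaller instantiation plus a specialization that stops the recursion, extends to filling a table of moves, at the cost of longer compile times.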

Yes, I've forced feature-creep on myself. Ah, and no design optimization - e.g., I didn't use STL's bitset<N> (or even C bit-twiddling) for the board, and I've created a good-ole struct for the moves, with three integers.

I've built it with GCC (mingw32/Qt Creator) and MSVC 2013. For a board with 15 holes, both were immediate. As I've moved to 21 holes, the GCC exe took a few seconds, but MSVC took almost... 1 minute?! With 28 holes, Qt Creator takes forever, and so does MSVC.

The fact that this takes forever doesn't surprise me. As I said, I've made no optimizations. However, the difference between GCC and MSVC puzzled me. So, I turned to Windows Performance Analyzer (WPA) to figure out what was going on.

Now, I had a good guess about what was going on - I was sure that, compiler optimizations notwithstanding, there was probably a lot of copying going on. There certainly are a lot of vectors getting constructed, and I was expecting to find my main culprit along those lines.

So, I fired up WPA, loaded the trace file, selected the CPU Usage (Sampled) graph, changed it to Display table only, opened the View Editor and set the Stack (call stack) to Visible.

And this was flagged as the most time-consuming operation of all:

std::_Equal_range<GameMove const *,GameMove,int,
    bool (__cdecl*)(GameMove const &,GameMove const &)

Sometimes, life proves more interesting than anticipated. However, this time it was just me not paying attention.

When I say I've made no optimizations, I didn't even go for the most important optimization of all - an intelligent algorithm. This means that when I calculate all the valid moves after each move, I go through the entire board, including spots for which no move is possible. And I'm using sorted vectors/arrays and equal_range() to perform this search.
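The kind of lookup described here can be sketched as follows. The field names and the comparator are assumptions (the only known facts are that a move is a struct with three integers and that a function-pointer comparator is passed to equal_range), so treat this as an illustration rather than the solver's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout; the field names are made up for this sketch.
struct GameMove {
    int from;
    int over;
    int to;
};

// Orders moves by starting hole only, so all moves from one hole
// compare as equivalent.
bool LessByFrom(const GameMove& lhs, const GameMove& rhs) {
    return lhs.from < rhs.from;
}

// Counts the moves that start at `hole`. `moves` must already be sorted
// with LessByFrom; std::equal_range requires that precondition.
std::size_t CountMovesFrom(const std::vector<GameMove>& moves, int hole) {
    assert(std::is_sorted(moves.begin(), moves.end(), LessByFrom));
    const GameMove key{hole, 0, 0};  // only `from` takes part in the comparison
    const auto range =
        std::equal_range(moves.begin(), moves.end(), key, LessByFrom);
    return static_cast<std::size_t>(range.second - range.first);
}
```

Each call is a cheap binary search, but running one for every hole after every move, including holes that cannot move at all, adds up quickly in a brute-force search.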

So, while the result was surprising, it shouldn't really have been.

Next step - find a more intelligent way to get the list of valid moves.