Explicit type conversion makes no sense
One of the most criticized “features” of C++ is that a constructor callable with a single argument IS an implicit conversion function from the argument type T to the class type U. For example:
struct A {
A(int, int = 4) {}
};
void f(A) {
}
f(3); // f(A) is called
Such a “feature” is terribly bad. It looks like something designed to disable the compiler’s type checking and to mess up the overload set. But all the criticisms target the “implicit” aspect of this feature. I haven’t seen an argument that criticizes the “conversion” part.
My argument, in short:
- Implicit conversion makes sense.
- Explicit conversion makes no sense.
“WTF?” Let me explain.
Let’s define a type as the set of all its possible values plus the set of all the operations that deal with those values. Now, say we have two types, int and list<long>. Is it possible to convert an int n to a list<long>? Oh, maybe a list of n zeros. But wait, why zeros? And what if I expect {n}? See? If we allow one value set to be converted to an arbitrary other value set, the conversion can have many arbitrary semantics, but an explicit conversion can carry only one of them.
So that’s why I say explicit conversion makes no sense. To express the meaning of an arbitrary conversion, the conversion function has to be named in some way to connect the two types involved, like list<long>::from_size(n), while an explicit conversion function is only named in a meaningless way (list<long>(n)).
But in one situation, a conversion is not arbitrary, and it can be safely and implicitly established from type T to U, if the possible value set of T is a subset of U’s. Actually, C++ understands this theory, and that is why C++ allows an object of a derived class to be implicitly converted to an object of the base class, since an object of a derived class is supposed to be able to be used anywhere in place of an object of the base class.[1]
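The subset argument can be illustrated with a minimal sketch. The types animal and dog and the function count_legs are hypothetical names chosen only for this illustration:

```cpp
#include <cassert>

// Hypothetical base class: every animal in this sketch has a leg count.
struct animal {
    int legs = 4;
};

// Hypothetical derived class: every dog is also an animal.
struct dog : animal {
    bool barks = true;
};

// Accepts any animal; a dog binds here through the implicit
// derived-to-base conversion, with no cast and no named function.
inline int count_legs(const animal& a) {
    return a.legs;
}
```

Calling count_legs with a dog needs no conversion syntax at all, because every value of dog is usable as a value of animal.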
As you can see, the problem is not caused by “implicit”; the problem is caused by “conversion” — mixing the conversion semantics into the constructors is a design error in C++.
So here are my suggestions:
- Use implicit conversion for its well-defined semantics;
- Prefer named functions/factories[2] over constructors.
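As a rough sketch of the second suggestion, the two meanings of “an int to a list<long>” discussed above can each get their own name. The functions list_from_size and list_from_value are hypothetical and not part of std::list:

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical named factory: a list of n zero-valued elements.
inline std::list<long> list_from_size(std::size_t n) {
    return std::list<long>(n);
}

// Hypothetical named factory: a list holding exactly one element, v.
inline std::list<long> list_from_value(long v) {
    return std::list<long>{v};
}
```

With the names in place, list_from_size(3) and list_from_value(3) state which of the two conversions is meant, which list<long>(3) alone cannot do.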
Notes:
[1] Unfortunately, this mechanism breaks if the derived class has a different representation: the conversion then amounts to type punning, which simply breaks type safety. Fortunately, pointer conversions and pointer-to-member conversions work.
[2] Like make_shared, and “named constructor”: https://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Named_Constructor
Use array as a tuple
Now that variadic templates are supported by the VC++ Nov 2012 CTP, the feature that will blow away most of our old code is available in all of the major C++ compilers, and the next topic is how to use it well. A frequently raised question is how to expand a pack of arguments held in something like a std::tuple into a function call. That is, instead of writing
std::lock(l1, l2);
we want
std::tuple<std::mutex, std::mutex> t;
vlock(t);
Now we already know the solution, the “indices trick”[1]: expand a variadic pack of indices, not the arguments.
template <size_t... I>
struct indices {};
template <size_t N, size_t... I>
struct build_indices : build_indices<N - 1, N - 1, I...> {};
template <size_t... I>
struct build_indices<0, I...> : indices<I...> {};
template <typename Tuple>
using tuple_indices = build_indices<std::tuple_size<Tuple>::value>;
template <typename Tuple, size_t... I>
void _vlock(Tuple&& t, indices<I...>) {
std::lock(std::get<I>(std::forward<Tuple>(t))...);
}
template <typename Tuple>
void vlock(Tuple& t) {
_vlock(t, tuple_indices<Tuple>());
}
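For comparison, here is a sketch of the same trick assuming a C++14 standard library, where std::index_sequence and std::make_index_sequence play the role of the hand-written indices and build_indices. The names vlock_impl and vlock14 are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <tuple>
#include <utility>

// Expands the index pack I..., not the tuple itself.
template <typename Tuple, std::size_t... I>
void vlock_impl(Tuple& t, std::index_sequence<I...>) {
    std::lock(std::get<I>(t)...);
}

// Locks every element of a tuple-like object that holds at least two
// lockable objects (std::lock requires two or more arguments).
template <typename Tuple>
void vlock14(Tuple& t) {
    vlock_impl(t, std::make_index_sequence<std::tuple_size<Tuple>::value>());
}
```

A tuple of std::unique_lock objects created with std::defer_lock is a convenient way to try it, since owns_lock() reports the result afterwards.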
The best thing about the implementation I referred to is that it works with any type supporting the “tuple-like” protocol, e.g., std::pair, std::tuple, and std::array. The last one is very interesting since it provides both tuple-like access and the STL container interface.
However, std::array has a tiny problem: its size template parameter has to be explicitly specified:
std::array<std::mutex, 2> t {};
So how about using a native array instead? A trivial and evil approach is to add a std::get overload and to specialize std::tuple_size and std::tuple_element before the definition of _vlock:
// unused specializations are omitted
namespace std {
template <size_t I, typename T, size_t N>
auto get(T (&a)[N]) -> T& {
static_assert(I < N, "out of range");
return a[I];
}
template <typename T, size_t N>
class tuple_size<T[N]> : public integral_constant<size_t, N> {};
}
std::mutex t[] = { {}, {} }; // bad example...
vlock(t);
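However, opening the std namespace like this is not permitted by the standard: adding a function overload to std is undefined behavior, and specializing a standard template is allowed only when the specialization depends on a user-defined type. If you consider these specializations to be helpful, maybe you can submit a proposal to LWG.[2]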
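A standard-conforming alternative, sketched here under the assumption of a C++14 library, is to leave std alone and overload on the array type directly. The names vlock_array and vlock_array_impl are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>

// Expands the index pack over the elements of a native array.
template <typename T, std::size_t N, std::size_t... I>
void vlock_array_impl(T (&a)[N], std::index_sequence<I...>) {
    std::lock(a[I]...);
}

// Locks every element of a native array; N is deduced from the argument,
// so the caller never has to spell out the size.
template <typename T, std::size_t N>
void vlock_array(T (&a)[N]) {
    static_assert(N >= 2, "std::lock needs at least two lockables");
    vlock_array_impl(a, std::make_index_sequence<N>());
}
```

This gives up the generic tuple-like protocol, but nothing is added to namespace std.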
Links:
[1] The indices trick http://loungecpp.wikidot.com/tips-and-tricks%3Aindices
[2] How To Submit a Proposal http://isocpp.org/std/submit-a-proposal
Fix unpaired perfect forwarding
To enable perfect forwarding in C++11, you need a template parameter T paired with a function parameter of type T&&. Scott Meyers calls it a `Universal Reference’[1] because, by combining the special deduction rule (14.8.2.1/3) and the reference collapsing rule (8.3.2/6), you get an lvalue reference or an rvalue reference corresponding to the value category of the actual argument of the call. However, a problem was raised on comp.lang.c++:[2]
template<typename T> void foo(T&&, T&&);
At first glance, T&& is still a `universal reference’. But, given an lvalue a of type int, the call foo(1, a) fails since T can’t be deduced as both int and int&. So this is the problem with our fake `universal reference’: in C++03, there was no way to have a cv-qualified or reference type deduced for T, but now an lvalue reference can be deduced, while the rule that compares the deduced types is unchanged.
We can simulate the rule by hand:
template <typename T1, typename T2> void foo(T1&&, T2&&,
typename std::enable_if<std::is_same<
typename std::remove_reference<T1>::type,
typename std::remove_reference<T2>::type>::value>::type* = 0)
{}
However, besides its cryptic, verbose grammar, the big problem is: what if we add more parameters, like T3? A new SFINAE template is needed:
template <typename...>
struct common_deduced_type;
template <typename T>
struct common_deduced_type<T, T> { typedef T type; };
template <typename T>
struct common_deduced_type<T&, T> { typedef T type; };
template <typename T>
struct common_deduced_type<T, T&> { typedef T type; };
template <typename T>
struct common_deduced_type<T&, T&> { typedef T type; };
template <typename T1, typename... T2>
struct common_deduced_type<T1, T2...> {
typedef typename common_deduced_type<T1,
typename common_deduced_type<T2...>::type>::type type;
};
It is similar to std::common_type, but without implicit conversions: it just removes the lvalue references and then compares. The trick fixes the unpaired perfect forwarding pretty well,
template <typename T1, typename T2, typename T3, typename T4>
void foo(T1&&, T2&&, T3&&, T4&&,
typename common_deduced_type<T1, T2, T3, T4>::type* = 0)
{}
but a modified type comparison matching the new deduction rule can be more straightforward.
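One possible variant of the same idea, sketched here rather than taken from the original discussion, is a boolean trait combined with std::enable_if on the return type. The names same_unref and foo_checked are hypothetical, and the ellipsis overload exists only so the rejection can be observed at run time:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical trait: true when all types in the pack are identical once
// references are removed. A pack of zero or one type is trivially "same".
template <typename...>
struct same_unref : std::true_type {};

template <typename T, typename U, typename... Rest>
struct same_unref<T, U, Rest...>
    : std::integral_constant<
          bool,
          std::is_same<typename std::remove_reference<T>::type,
                       typename std::remove_reference<U>::type>::value &&
              same_unref<U, Rest...>::value> {};

// Participates in overload resolution only when the deduced types agree
// after reference removal.
template <typename T1, typename T2, typename T3>
typename std::enable_if<same_unref<T1, T2, T3>::value, bool>::type
foo_checked(T1&&, T2&&, T3&&) {
    return true;
}

// Fallback for this sketch only: chosen when the template is rejected.
inline bool foo_checked(...) {
    return false;
}
```

Like the trait above, this compares types exactly after removing references, so const int& and int still count as different.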
Links:
[1] Universal References in C++11—Scott Meyers http://isocpp.org/blog/2012/11/universal-references-in-c11-scott-meyers
[2] Template argument as rvalue reference https://groups.google.com/forum/#!topic/comp.lang.c++/8YUIqp300ME/discussion
A simplified transactional ScopeGuard
Andrei’s ScopeGuard[1] idiom has been implemented by lots of people with std::function plus a lambda, though most of the time what they need is just a:
unique_ptr<FILE, decltype((fclose))> fguard(fp, fclose);
However, no matter how the ScopeGuard is implemented, it is a code smell to use it as an ad-hoc RAII replacement. The cleanup logic is always the same, and the same code should be refactored. The original article by Andrei already pointed out the main purpose of ScopeGuard: to ease the implementation of transactional operations that provide strong exception safety. Rollback logic is specific to a piece of business logic, and is seldom `refactorable’.
So that is why I design the interface of a ScopeGuard like the following:
// creates an automatically named guard
defer ( <exps | statements> );
// creates a guard with a <name>
defer ( <exps | statements> ) namely ( <name> );
And, to support the cascading undo operations, multiple guards can be disabled within one statement:
defer (op1; op2) namely (undo1);
/* operations may fail */
defer (op3; op4; op5) namely (undo2);
/* more operations may fail */
undo1 = undo2 = false; // commit
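The commit-flag semantics can be approximated without macros. The class below is a hypothetical sketch, not how deferxx is implemented; it uses std::function only for brevity:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical guard: runs the undo action on scope exit unless disarmed.
class rollback_guard {
public:
    explicit rollback_guard(std::function<void()> undo)
        : undo_(std::move(undo)) {}

    ~rollback_guard() {
        if (armed_) undo_();
    }

    rollback_guard(const rollback_guard&) = delete;
    rollback_guard& operator=(const rollback_guard&) = delete;

    // Assigning false commits (disarms the guard). The flag is returned so
    // that assignments chain: undo1 = undo2 = false;
    bool operator=(bool armed) {
        armed_ = armed;
        return armed_;
    }

private:
    std::function<void()> undo_;
    bool armed_ = true;
};
```

Because the flag is a plain bool, a zero return code converts to false and disarms the guard in the same way.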
Compared with Go’s `defer’ keyword, my deferxx[2] does not support `recover’ since exceptions in C++ are not reenterable. Compared with Boost.ScopeExit, deferxx ships with a built-in commit flag and a cleaner grammar (in particular, without the ugly extra semicolon after the curly braces). Compared with D’s scope guard statement, deferxx is not exception-aware, since an in-flight exception is not reliably detectable in C++ (std::uncaught_exception() does not distinguish an exception propagating across a destructor from an exception thrown from the scope)[3]. However, I don’t think it’s a good idea to turn every failure into an exception. And, coincidentally, the success return code and errno of the C library functions are 0, so you can just assign those values to the guards ;)
Links:
[1] Generic<Programming>: Change the Way You Write Exception-Safe Code — Forever. http://www.drdobbs.com/cpp/generic-change-the-way-you-write-excepti/184403758
[2] A Go’s defer-like syntax scope guard idiom in C++11. https://github.com/lichray/deferxx
[3] GotW #47: Uncaught Exceptions. http://www.gotw.ca/gotw/047.htm
Use non-utf8 CJK locale with XTerm
When compiled with wide character support, XTerm is a UTF-8-only terminal emulator even when running under a non-UTF-8 locale (like zh_CN.GB18030): it invokes an external program, `luit`, to perform the encoding conversion. The good part of such a design is that we only need to care about Unicode code points. For example, XTerm’s default double-click selection is based on character classes, so alphanumerics and ‘.’, ‘/’ are not grouped together, but typically we want a whole escape-safe shell argument to be selected. So I add the following to ~/.Xresources
XTerm*charClass: 43-47:48,58:48,64:48
to assign all of the non-metacharacters in csh (except ‘=’, which I don’t want in practice) to the class ALNUM. However, any wide character is also safe without escaping. Since we only work with code points, a simple patch does the trick, https://gist.github.com/2872752 , by removing the sub-classification of the Chinese/Japanese characters.
And, obviously, a font using ISO-10646 mapping always works:
XTerm*font: -*-lucidatypewriter-medium-*-*-*-18-*-*-*-*-*-iso10646-*
XTerm*boldFont: -*-lucidatypewriter-bold-*-*-*-18-*-*-*-*-*-iso10646-*
XTerm*wideFont: -*-wenquanyi bitmap song-bold-*-*-*-17-*-*-*-*-*-*-*
XTerm*forcePackedFont: False
Note that the ‘wenquanyi bitmap song’ font has no bold version now; this one is generated by gbdfed and only supplies 12pt/100dpi/iso10646.
The -iso10646- tags on the normal fonts enable the Unicode line-drawing characters[1]. However, here comes the bad part of the ‘external conversion’ design: luit is not just a wrapper around iconv; it is an ISO-2022[2] interpreter at the same time, and ISO-2022 happens to conflict with vt102’s alternate charset sequences, \E(0 and \E(B (don’t ask me why ncurses’ NCURSES_NO_UTF8_ACS option, which is already set by luit, does not work here; see the title). And XTerm’s vt102 parser does not allow some non-standard escape sequences, like ^N/^O, to enable/disable the line-drawing characters.
So we need to force XTerm to recognize a customized pair of escape sequences, like
# ~/.termcap
# This requires a hack to the XTerm code, to make it accept DLE as ESC.
xterm-iso2022|XTerm CJK:\
        :eA@:as=^P(0:ae=^P(B:tc=xterm:
while using a customized
XTerm.termName: xterm-iso2022
Fortunately, XTerm’s textbook-style, state transition table based vt102 parser is easy to hack: https://gist.github.com/2892430 .
Special note to FreeBSD users: if you use sudo, you have to link your .termcap into /root/, because the ncurses on FreeBSD reads the termcap info from the kernel, not /etc/termcap; only the user-supplied entries work.
Links:
[1] Box-drawing character: Unix, CP/M, BBS. https://en.wikipedia.org/wiki/Box-drawing_character#Unix.2C_CP.2FM.2C_BBS
[2] ISO/IEC 2022 character sets. https://en.wikipedia.org/wiki/ISO-2022#ISO.2FIEC_2022_character_sets