Programming tidbits: 2011

Monday, September 26, 2011

memset in constructors

Once in a while we call memset in a constructor to avoid typing out a long initializer list.

Typical example code is:

memset(this, 0, sizeof (*this));

The behavior depends on whether the class/struct is a POD or a non-POD type. POD stands for plain old data. A POD is essentially a collection of fields; it doesn't have any specific semantics of its own.

If the class is non-POD, memset() causes undefined behavior.
If the class is polymorphic, the virtual table pointer usually becomes 0 as well, leading to problems.
Even with a POD, there is no guarantee that memset-ting with 0 actually sets the members to 0, especially when they are float or double. The representation of 0.0 may not be all bits zero. IEEE 754 does guarantee that all bits zero equals 0.0, but C++ doesn't require IEEE 754 for floating-point numbers, and other representations may not give the same guarantee.
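A minimal sketch of an alternative that avoids these problems: value-initialize each member in the initializer list instead of calling memset. The Pixel and Shape types are hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical POD-like struct used only for illustration.
struct Pixel {
    int   x;
    int   y;
    float intensity;

    // Value-initialize each member instead of memset(this, 0, sizeof(*this)).
    // intensity() yields 0.0f whatever the floating-point representation is.
    Pixel() : x(), y(), intensity() {}
};

// Hypothetical polymorphic class: memset(this, ...) in its constructor
// would typically wipe the hidden virtual table pointer.
class Shape {
    int    sides_;
    double area_;
public:
    Shape() : sides_(), area_() {}
    virtual ~Shape() {}
    virtual int sides() const { return sides_; }
    double area() const { return area_; }
};
```

The initializer list is longer to type, but it stays correct when a virtual function or a non-POD member is added later.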

Reference

http://www.codeguru.com/forum/archive/index.php/t-430848.html

Sunday, August 14, 2011

Rounding up!

Right shifting of positive integers leads to truncation:

9 >> 1
4
9 >> 2
2

Interestingly, right shifting a negative integer rounds toward negative infinity rather than toward zero (on compilers that use an arithmetic shift):

-9 >> 1
-5
-9 >> 2
-3

If we combine the effects, we can get rounding up during right shift for positive integers:

-(-9 >> 1)
5
-(-9 >> 2)
3

So here is a C function for rounding up while shifting:

int rightShiftRoundUp(int x, int n){
    return - ( -x >> n);
}

Voila!
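One caveat: before C++20 (and in C), the result of right shifting a negative signed integer is implementation-defined, so the trick relies on the compiler using an arithmetic shift. The sketch below keeps the function from the post and adds a hypothetical variant, rightShiftRoundUpPortable, that avoids shifting negative values. It assumes x is non-negative and that x + 2^n - 1 does not overflow int.

```cpp
#include <cassert>

// The trick from the post: relies on >> flooring negative values
// (arithmetic shift), which is implementation-defined before C++20.
int rightShiftRoundUp(int x, int n) {
    return -(-x >> n);
}

// Hypothetical portable variant for non-negative x: add 2^n - 1, then shift.
// Assumes x + (2^n - 1) does not overflow int.
int rightShiftRoundUpPortable(int x, int n) {
    return (x + ((1 << n) - 1)) >> n;
}
```

For example, rightShiftRoundUpPortable(9, 2) computes (9 + 3) >> 2, which is 3.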

Friday, August 12, 2011

Safe bool idiom

Note: This is just a rewrite of The Safe Bool Idiom by Bjorn Karlsson in my own words.

How do you provide for boolean tests for your classes in C++?

It is trivial for raw pointers:

if (T* p = get_some_value()) {
  // p is valid and not null; we can use it here
}
else {
  // p is null/invalid.
}

Any rvalue of arithmetic, enumeration, pointer, or pointer to member type, can be implicitly converted to an rvalue of type bool.

Smart pointers

Maybe you have a smart pointer class:

typedef smart_ptr<T> TPtr;
TPtr p(get_some_value());

You might have a member function for testing in boolean contexts:

if(p.is_valid()){
  // p is valid, use it
}
else{
  // p is not valid. Take proper action.
}

Cons

Verbose
p must be declared outside the if block (the scope in which it is really used).
The name is_valid may differ between smart-pointer-like classes.

Objective

It should be possible to convert existing code to make use of smart pointers from raw pointers with minimum change in code base.

The obvious solution (operator bool)

We will use a class Testable in what follows.

Maybe you would code something like this:

class Testable {
  bool ok_;
public:
  explicit Testable(bool b=true):ok_(b) {}

  operator bool() const {
      return ok_;
  }
};

We can use it as follows:

Testable test;

if (test)
  std::cout << "Yes, test is working!\n";
else
  std::cout << "No, test is not working!\n";

Bravo!

But what about:

test << 1;

int i=test;

An operator bool introduces nonsense operations which are legal as far as the C++ compiler is concerned, all thanks to implicit conversion.
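A small sketch of why these expressions compile: the object converts to bool, and the bool is then promoted to int. The helper functions shifted and asInt are hypothetical and only make the results observable.

```cpp
#include <cassert>

class Testable {
    bool ok_;
public:
    explicit Testable(bool b = true) : ok_(b) {}
    operator bool() const { return ok_; }
};

// Both functions compile because Testable converts to bool
// and the bool is then promoted to int.
int shifted(const Testable& t) { return t << 1; }       // (int)true << 1 when t is valid
int asInt(const Testable& t)   { int i = t; return i; } // 1 or 0
```

Neither operation means anything for a "testable" object, yet the compiler accepts both silently.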

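There are more side effects. Objects of unrelated types become comparable:

Testable a;
AnotherTestable b;  // some other class that also has operator bool

if (a==b) {
}

if (a<b) {
}

A possible enhancement here is to declare a private conversion to int:

class Testable {
  bool ok_;
  operator int() const;  // private and never defined: blocks conversion to int
public:
  explicit Testable(bool b=true):ok_(b) {}

  operator bool() const {
      return ok_;
  }
};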

Next comes operator !

Remember the old C trick?

int x = !! y; // maps y to 0 or 1.

We can do something like this in C++:

class Testable {
  bool ok_;
public:
  explicit Testable(bool b=true):ok_(b) {}

  bool operator!() const {
      return !ok_;
  }
};

Now we can write:

Testable test;
Testable test2(false);

if (!!test)
  std::cout << "Yes, test is working!\n";
if (!test2)
  std::cout << "No, test2 is not working!\n";

Pros:

No more implicit conversions
No more overloading issues
Known as the double-bang trick

Cons:

Not as straightforward as if(test)

The innocent void*

We can of course write a conversion function to void*:

class Testable {
  bool ok_;
public:
  explicit Testable(bool b=true):ok_(b) {}

  operator void*() const {
      // this is a const Testable* here, so the const must be cast away
      return ok_ ? const_cast<Testable*>(this) : 0;
  }
};

This can be safely used in if(test).

But what about the following?

Testable test;
delete test;

The idiom is flawed: delete test compiles because of the implicit conversion to a pointer.
It is possible to compare objects of different types, as all of them have an implicit conversion to void*.

Let's go for a nested class

Courtesy Don Box 1996, C++ Report:

class Testable {
  bool ok_;
public:
  explicit Testable(bool b=true):ok_(b) {}

  class nested_class;  // declared, never defined

  operator const nested_class*() const {
      return ok_ ? reinterpret_cast<const nested_class*>(this) : 0;
  }
};

Pros

Rather than a void*, we use a pointer to a nested class.
It is no longer possible to compare objects of different types.
There is no need to define the nested class.

Cons:

Objects of the same type can still be compared, which is rarely meaningful:

Testable b1,b2;

if (b1==b2) {
}

if (b1<b2) {
}

The safe bool idiom

Let's convert to a pointer to member function instead:

class Testable {
  bool ok_;
  typedef void (Testable::*bool_type)() const;
  void this_type_does_not_support_comparisons() const {}
public:
  explicit Testable(bool b=true):ok_(b) {}

  operator bool_type() const {
      return ok_ ?
          &Testable::this_type_does_not_support_comparisons : 0;
  }
};

bool_type is a typedef for a pointer to a const member function of Testable.
this_type_does_not_support_comparisons is a private function with the same signature as bool_type.
We introduce a conversion operator to bool_type. It returns 0 if the Testable is invalid and a pointer to this_type_does_not_support_comparisons if it is valid.

The above code still allows the following:

Testable test;
Testable test2;

if (test==test2) {}
if (test!=test2) {}

Let's close the loop as follows:

template <typename T>
bool operator!=(const Testable& lhs,const T& rhs) {
  lhs.this_type_does_not_support_comparisons();
  return false;
}

template <typename T>
bool operator==(const Testable& lhs,const T& rhs) {
  lhs.this_type_does_not_support_comparisons();
  return false;
}

this_type_does_not_support_comparisons is private.
operator== and operator!= are non-members, so they are not allowed to call it.
Any attempt to call them results in a compiler error.
The error is generated only when these templates are instantiated, so no extra code is added to the executable.
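A compact sketch of the idiom in use, with the comparison operators left out for brevity. The describe function is a hypothetical helper, added only to exercise the conversion in an if statement.

```cpp
#include <cassert>
#include <string>

class Testable {
    bool ok_;
    typedef void (Testable::*bool_type)() const;
    void this_type_does_not_support_comparisons() const {}
public:
    explicit Testable(bool b = true) : ok_(b) {}

    // Converts to a pointer to member function: usable in if () and with !,
    // but not in arithmetic, shifts, or delete.
    operator bool_type() const {
        return ok_ ? &Testable::this_type_does_not_support_comparisons : 0;
    }
};

// Hypothetical helper, for illustration only.
std::string describe(const Testable& t) {
    if (t) return "valid";
    return "invalid";
}
```

With this class, expressions such as test << 1 or int i = test no longer compile. In C++11 and later, an explicit operator bool is the usual way to get the same protection with less machinery.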

Assignments

Write a base class which can implement this idiom as a reusable component.
Study the implementation of boost::scoped_ptr especially boost/smart_ptr/detail/operator_bool.hpp.

References

http://www.artima.com/cppsource/safebool.html

Tuesday, August 9, 2011

How to force your fellow members to use your version of a library function

So you designed your own powerful and scalable and efficient version of malloc and named it my_malloc. Now you wish to make sure that everybody in your team uses this version of malloc rather than calling the standard C library version. How would you enforce that?

#undef  malloc
#define malloc use_my_malloc_please

Put the above code in a common header file that gets included everywhere in your project. Anybody who tries to call the standard library malloc will get a build error that mentions use_my_malloc_please. Hmm, this only applies if you are a C-style programmer. In C++, of course, you can overload the new and delete operators for a class as well as install your own new handlers.

Offset of an attribute within a structure

Sometimes we need to find the offset of a particular field F within a structure T in C. Here is a simple one-line macro that does it:

#define offsetof(T, F) ((size_t)((char *)&((T *)0)->F))

To understand it: we cast the null address to type T* and then take the address of field F. Since the compiler knows the layout of type T, it can compute this value during compilation and fill it in wherever it is required. Nothing is computed at run time. The result is cast to size_t (from <stddef.h>) rather than unsigned int so that it is wide enough on 64-bit platforms.
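In practice the standard library already provides this macro as offsetof in <stddef.h> (<cstddef> in C++), so there is no need to define it by hand. A short sketch, using a hypothetical Header struct:

```cpp
#include <cassert>
#include <cstddef>  // standard offsetof and std::size_t

// Hypothetical struct used only for illustration.
struct Header {
    char   tag;
    int    length;
    double value;
};

// Offsets computed by the standard macro; the exact values of the last
// two depend on the padding the compiler inserts.
const std::size_t kTagOffset    = offsetof(Header, tag);
const std::size_t kLengthOffset = offsetof(Header, length);
const std::size_t kValueOffset  = offsetof(Header, value);
```

The first member always sits at offset 0; the others land wherever alignment requirements put them.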

Monday, August 8, 2011

Clamping short to unsigned char

Here is a standard video processing problem. Your 8-bit pixel values are typically in the range 0-255. If some processing pushes their values out of this range, you need to bring them back into it. Negative values are clamped to 0, while values greater than 255 are clamped to 255.

A typical unoptimized implementation looks as follows.

unsigned char clamp(short value){
  if (value < 0) return 0;
  if (value > 0xff) return 0xff;
  return value;
}

After a long time, today I ended up seeing a much nicer implementation in the FFmpeg code base, which basically looks like:

unsigned char clamp(int a)
{
  /* any bit outside the low 8 set: a is negative or greater than 255 */
  /* (-a)>>31 is 0 for negative a and -1 (0xFF as unsigned char) for large a */
  if (a&(~0xFF)) return (-a)>>31;
  else return a;
}

And I think, with a single if statement, it's wonderful!
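A quick sketch for checking that the two versions agree over the whole range of short. The names clampSimple, clampFast and clampVersionsAgree are hypothetical, and the fast version assumes a 32-bit int and an arithmetic right shift for negative values.

```cpp
#include <cassert>

// Straightforward version from the post.
unsigned char clampSimple(short value) {
    if (value < 0) return 0;
    if (value > 0xff) return 0xff;
    return static_cast<unsigned char>(value);
}

// Single-test version in the style of the FFmpeg snippet.
// Assumes 32-bit int and an arithmetic right shift for negative values.
unsigned char clampFast(int a) {
    if (a & ~0xFF) return static_cast<unsigned char>((-a) >> 31);
    return static_cast<unsigned char>(a);
}

// Returns true when both versions agree on every short value.
bool clampVersionsAgree() {
    for (int v = -32768; v <= 32767; ++v) {
        if (clampSimple(static_cast<short>(v)) != clampFast(v)) return false;
    }
    return true;
}
```

For a negative input, -a is positive and the shift yields 0. For an input above 255, -a is negative, the shift yields -1, and the conversion to unsigned char turns that into 255.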

Sunday, August 7, 2011

A do while in a Macro

Looks like I know very little C. I never thought I would need a do while loop in a macro.

Now consider this. A typical swap function looks like the following:

template <typename T> void swap(T& a, T& b){
  T tmp;
  tmp = a;
  a = b;
  b = tmp;
}

The function depends on three things: the type T and the variables a and b. How can I write a C macro which does exactly the same thing?

#define FFSWAP(T,a,b) do{T SWAP_tmp= b; b= a; a= SWAP_tmp;}while(0)

This is taken liberally from common.h in the FFmpeg source code.

So we are writing a do/while loop which runs exactly once. Inside the loop we create a temporary variable of type T (with a name that is unlikely to clash with other variables in the calling function). The rest is essentially the same three statements that perform the swap. The do/while wrapper turns the whole body into a single statement that needs a trailing semicolon, so the macro behaves like a function call even inside an unbraced if/else. Plain braces would also compile in many places, but not there.
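A minimal sketch of the situation where the do/while wrapper matters. The orderPair function is a hypothetical helper, not part of FFmpeg.

```cpp
#include <cassert>

#define FFSWAP(T, a, b) do { T SWAP_tmp = b; b = a; a = SWAP_tmp; } while (0)

// With a bare-brace macro such as
//   #define SWAP_BRACES(T, a, b) { T SWAP_tmp = b; b = a; a = SWAP_tmp; }
// the call below would expand to "{ ... }; else ..." and fail to compile,
// because the stray ';' ends the if statement before the else.
void orderPair(int& lo, int& hi) {
    if (lo > hi)
        FFSWAP(int, lo, hi);
    else
        (void)0;  // already ordered, nothing to do
}
```

With do { ... } while (0), the semicolon written after the macro call completes the do statement, so the if/else parses the way the caller expects.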

Please note that I am not saying macros are better than function templates. I just want to say that I never knew about this macro magic.

Functional programming and GCC

I was surprised to find that in GCC it's possible to tell the compiler whether a function is pure. A pure function is a function which has no side effects, i.e. its output depends solely on its inputs and it doesn't affect any other resource in the system.

GCC allows one to specify function attributes. There is a specific attribute "pure" to mark a pure function. E.g.:

int square (int) __attribute__ ((pure));

Essentially this enables specific compiler optimizations (common sub-expression elimination). E.g. if a piece of code calls square(2)*square(2), the compiler can rewrite it so that square(2) is called once and the computed value is reused in place of the second call.

Functions like strlen and memcmp are very good examples of pure functions. fclose(), feof() [in a multi-threaded environment], and any function which may run into an infinite loop are examples of non-pure functions.
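A small sketch of the attribute on a function that reads memory but modifies nothing. The attribute syntax is specific to GCC (Clang accepts it too); countChar and countTwice are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>

// GCC/Clang-specific attribute; other compilers may not accept this syntax.
// countChar reads memory through its pointer argument but has no side
// effects, which is what "pure" promises to the compiler.
std::size_t countChar(const char* s, char c) __attribute__((pure));

std::size_t countChar(const char* s, char c) {
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (*s == c) ++n;
    }
    return n;
}

// The compiler is allowed to evaluate countChar(s, c) once and reuse it.
std::size_t countTwice(const char* s, char c) {
    return countChar(s, c) + countChar(s, c);
}
```

GCC also offers a stricter "const" attribute for functions whose result depends only on the argument values and not on any memory they point to; a function like square(int) would qualify for it.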

This is covered in detail at Function Attributes in GCC.

Wednesday, June 8, 2011

Google Chrome Layout Bug

It seems that the em calculations in Google Chrome are slightly off track. The following is a small piece of HTML using the Dojo Toolkit's default CSS. Its font size and line height calculations are based on http://24ways.org/2006/compose-to-a-vertical-rhythm

<!DOCTYPE html>
<html>
  <head>
  <style>
  @import url("http://ajax.googleapis.com/ajax/libs/dojo/1.6/dojo/resources/dojo.css");
  </style>
  </head>
  <body>
<blockquote>
Testing dojo blockquote.
</blockquote>
  </body>
</html>

The relevant CSS is presented below:

body {
  font: 12px Myriad,Helvetica,Tahoma,Arial,clean,sans-serif;
  *font-size: 75%;
}

blockquote {
  font-size: 0.916em;
  margin-top: 3.272em;
  margin-bottom: 3.272em;
  line-height: 1.636em;
  padding: 1.636em;
  border-top: 1px solid #ccc;
  border-bottom: 1px solid #ccc;
}

Compare this blockquote's computed layout in Firefox (4) and in Google Chrome (11.0). Firefox reports the padding as 18 pixels, while Chrome reports 17 pixels. Similarly, Firefox reports the margin as 36 pixels, while Chrome reports 35 pixels.

I think at least one of them is wrong.

Looking at the stylesheet, the computation goes like this:

font-size = 12 * 0.916 = 10.992 pixels

padding = line-height = font-size * 1.636 = 17.983 pixels

margin-top = margin-bottom = font-size * 3.272 = 35.966 pixels

Looking at the computed styles, the font-size comes out as 11 pixels in both Firefox and Chrome. It looks like Chrome truncates margins and padding while it rounds the font size.
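The arithmetic can be replayed in a few lines. This is only a sketch of the hypothesis above, not a description of how either browser engine actually computes lengths; truncatePx and roundPx are hypothetical helpers.

```cpp
#include <cassert>
#include <cmath>

// Values taken from the stylesheet above.
const double kBodyFontPx = 12.0;
const double kFontScale  = 0.916;
const double kPaddingEm  = 1.636;
const double kMarginEm   = 3.272;

double fontSizePx() { return kBodyFontPx * kFontScale; }   // about 10.992
double paddingPx()  { return fontSizePx() * kPaddingEm; }  // about 17.983
double marginPx()   { return fontSizePx() * kMarginEm; }   // about 35.966

// Two candidate ways to turn a fractional length into whole pixels.
int truncatePx(double px) { return static_cast<int>(std::floor(px)); }
int roundPx(double px)    { return static_cast<int>(std::floor(px + 0.5)); }
```

Truncation gives 17 and 35, which matches what Chrome reports, while rounding gives 18 and 36, which matches Firefox.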

Monday, June 6, 2011

Browser-Specific Styles with the Dojo Toolkit

I was fascinated to learn how dojo.uacss.js adds browser-specific CSS classes to the html element. This makes writing maintainable CSS straightforward, because browser-specific customizations can simply be scoped to those classes.

Saturday, June 4, 2011

A python script for windows which works like which command on unix

Ned Batchelder: wh.py

Monday, May 16, 2011

svn: OPTIONS of '...' could not connect to server

I had this weird problem today with SVN. I was able to browse it using the web browser. I was able to update from SVN, but when I was trying to commit something, I got an error:

svn: OPTIONS of 'http://....': could not connect to server.

This looked weird. Finally I disabled my wifi link and restored a wired ethernet connection to my LAN. After this, the commit went fine without any issues.

I am still wondering what the real cause of this problem was.

Thursday, May 5, 2011

The Safe Bool Idiom

The Safe Bool Idiom: "Learn how to validate objects in a boolean context without the usual harmful side effects."

Saturday, February 5, 2011

Building ssl for Python 2.5.4 on Windows

I went through the instructions from this blog post:

In the end, while building ssl, I faced one last problem:


C:\MinGW\bin\gcc.exe -mno-cygwin -shared -s build\temp.win32-2.5\Release\ssl\_ssl2.o build\temp.win32-2.5\Release\ssl\_ssl2.def "-LC:\Program Files (x86)\GnuWin32\lib" -LC:\Python25\libs -LC:\Python25\PCBuild -lssl -lcrypto -lwsock32 -lgdi32 -lgw32c -lole32 -luuid -lpython25 -lmsvcr71 -o build\lib.win32-2.5\ssl\_ssl2.pyd -static
build\temp.win32-2.5\Release\ssl\_ssl2.o:_ssl2.c:(.text+0x1724): undefined reference to `_wassert'
collect2: ld returned 1 exit status
error: command 'gcc' failed with exit status 1

As you can notice, the linker is complaining about _wassert.

After some googling, I ended up opening the ssl\_ssl.c file in the ssl source package and added the following lines at the beginning (after Python.h include).


#ifdef __MINGW32__
#undef assert
#define assert(expr) ((void)0)
#endif

The build went smoothly after that.

Sunday, January 30, 2011

The Skype Problem

Skype doesn't play well with web developers. By default it binds to port 80 (the standard HTTP port) and uses it to listen for incoming connections from other Skype users. In addition, Skype sends outgoing UDP packets from these ports.

To disable this, open Skype:

Go to File -> Options -> Advanced -> Connection
Uncheck the "Use port 80 as an alternative for incoming connections" checkbox
Stop or restart Skype.

You should be good to go. Your apache/nginx web servers can now bind to port 80 without any issues.

For more information see here.

Saturday, January 29, 2011

Testing whether an object is iterable or not in Python

The following is a simple snippet for checking whether an object is iterable or not in Python:

def can_loop_over(maybe):
    """Test value to see if it is list like"""
    try:
        iter(maybe)
    except TypeError:
        # iter() raises TypeError for objects that are not iterable
        return False
    else:
        return True

Courtesy: satchmo_utils [http://www.satchmoproject.com/]

Friday, January 28, 2011

Allocation of an array of objects lacking default constructors in C++

All of us know that in C++, when we create an array of objects of some class T, T must have a default constructor. But what if class T has no default constructor? Is there any way to create an array of objects which only have parameterized constructors?

Let's consider an example:

class Account
{
private:
 double m_security;
 double m_principal;
 float m_rate;

public:
 Account(double security);
 double calculateInterest(int time, float rate);
 void deposit(double amount);
 void withdraw(double amount);
 double checkBalance();
};

Can you write:
Account typeAaccount[100];

Or something like this:
Account typeBaccount[50](500);

Well, the second option, as we all know, is outrageously wrong! Nevertheless, the first option is wrong too...
Had there been no constructor at all, your faithful compiler would have provided a default constructor and made the array allocation possible... But now it will shout with a compile-time error. Well, you can go the long way round: add a static variable, set its value before each array allocation, and use it in a parameter-less default constructor. That is to say:

class Account
{
private:
 double m_security;
 double m_principal;
public:
 static double security;  // set this before creating an array
 Account();
 double calculateInterest(int time, float rate);
 void deposit(double amount);
 void withdraw(double amount);
 double checkBalance();
};
double Account::security = 0.0;
Account::Account()
{
 m_security = security;
 m_principal = security;
}

Now you can create the arrays, supplying the argument through the static variable:

Account::security = 1000.0;
Account typeAaccount[100];
Account::security = 500.0;
Account typeBaccount[100];

That said, there is an elegant solution for dynamically allocated object arrays:

// requires #include <new> for placement new
char* storage = new char[100 * sizeof(Account)];
Account* pvui = (Account*)storage;

for(int i=0; i<100;++i)
{
 new(pvui + i)Account(500);
}

// use the array of Account objects

// call destructors now
for(int i=0; i<100;++i)
{
 (pvui + i)->~Account();
}

// finally delete the allocated buffer.
delete[] storage;

@Shailesh: I must confess, your challenge "What's so special in this piece of C++ code" held me in awe for an hour... :)
Well, if anybody like me is surprised to see the special "new", I must dive into the details. That's the placement new operator; it takes a pointer to an already allocated chunk of memory. So you can allocate raw memory and then construct your objects over it by calling the constructor in a loop. Since the objects are adjacent to each other, you can use them wherever you would have used an array.
Happy!!!
And yes, destructors can be called explicitly. Do take care to call them to clean up the mess you might have created, and finally deallocate the raw memory buffer.

As easy as this!
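For comparison, std::vector can do the same bookkeeping for you: its fill constructor copies one prototype object into every slot, so no default constructor is needed. The sketch below uses a reduced, hypothetical version of the Account class, and makeAccounts is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reduced, hypothetical version of the Account class: no default constructor.
class Account {
    double m_security;
    double m_principal;
public:
    explicit Account(double security)
        : m_security(security), m_principal(security) {}
    double checkBalance() const { return m_principal; }
};

// Builds n accounts, each copy-constructed from Account(security).
std::vector<Account> makeAccounts(std::size_t n, double security) {
    return std::vector<Account>(n, Account(security));
}
```

The vector destroys its elements and releases its storage automatically, so the explicit destructor loop and delete[] are no longer needed.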

Monday, January 24, 2011

Blogofile

Blogofile is an interesting tool: it is a static website compiler.

Sunday, January 23, 2011

Django DRY

If you read any blog or book on Django, you will notice that they keep repeating "Don't Repeat Yourself" in all the discussions. Many a time it's clear that DRY is being followed, but there is no need to keep chanting 'DRY' again and again. It has become a sort of rhetoric in the Django world, I guess.