Worst abuse of the C preprocessor (Jim Hague 1986)
So what is this IOCCC thing?
Obfuscate: tr.v. -cated, -cating, -cates.
1. a. To render obscure.
   b. To darken.
2. To confuse: his emotions obfuscated his judgement.
[LLat. obfuscare, to darken : ob(intensive) + Lat. fuscare, to darken < fuscus, dark.] -obfuscation n. obfuscatory adj.
The International Obfuscated C Code Contest (abbreviated IOCCC) is a computer programming contest for the most creatively obfuscated C code. Its goals are to:
- write the most Obscure/Obfuscated C program within the rules.
- show the importance of programming style, in an ironic way.
- stress C compilers with unusual code.
- illustrate some of the subtleties of the C language.
- provide a safe forum for poor C code. :-)
One of the 1986 winners was Jim Hague (UK), in the "Worst abuse of the C preprocessor" category, with this code:
#define DIT (
#define DAH )
#define __DAH ++
#define DITDAH *
#define DAHDIT for
#define DIT_DAH malloc
#define DAH_DIT gets
#define _DAHDIT char
_DAHDIT _DAH_[]="ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:"
;main DIT DAH{_DAHDIT
DITDAH _DIT,DITDAH DAH_,DITDAH DIT_,
DITDAH _DIT_,DITDAH DIT_DAH DIT
DAH,DITDAH DAH_DIT DIT DAH;DAHDIT
DIT _DIT=DIT_DAH DIT 81 DAH,DIT_=_DIT
__DAH;_DIT==DAH_DIT DIT _DIT DAH;__DIT
DIT'\n'DAH DAH DAHDIT DIT DAH_=_DIT;DITDAH
DAH_;__DIT DIT DITDAH
_DIT_?_DAH DIT DITDAH DIT_ DAH:'?'DAH,__DIT
DIT' 'DAH,DAH_ __DAH DAH DAHDIT DIT
DITDAH DIT_=2,_DIT_=_DAH_; DITDAH _DIT_&&DIT
DITDAH _DIT_!=DIT DITDAH DAH_>='a'? DITDAH
DAH_&223:DITDAH DAH_ DAH DAH; DIT
DITDAH DIT_ DAH __DAH,_DIT_ __DAH DAH
DITDAH DIT_+= DIT DITDAH _DIT_>='a'? DITDAH _DIT_-'a':0
DAH;}_DAH DIT DIT_ DAH{ __DIT DIT
DIT_>3?_DAH DIT DIT_>>1 DAH:'\0'DAH;return
DIT_&1?'-':'.';}__DIT DIT DIT_ DAH _DAHDIT
DIT_;{DIT void DAH write DIT 1,&DIT_,1 DAH;}
WT…?
Think Morse code when you ponder this program. Note how the use of similar variable names can be obfuscating! The author notes that this program implements the international Morse standard.
This program takes a string of ASCII characters typed at the terminal and converts it to Morse code.
So when we compile the program…
The compiler complains a lot, but the program still builds and works (it was written for the IOCCC in 1986, before the first ANSI C standard, so diagnostics from a modern compiler are to be expected).
Now we run the resulting executable.
The program transforms ASCII text to Morse code. Morse code represents letters of the alphabet, numerals, and punctuation marks by an arrangement of dots, dashes, and spaces.
Now we know what it does, but not how it does it…
We need to understand macros. A macro is just a piece of code that is given a name. In the preprocessing step of compilation, every use of the name is replaced with the value it was given. We can ask the compiler to stop after this step and print the result with this command:
gcc -E filename
Let’s try expanding the macros.
Now the code looks like…
Hmm, that isn’t much better, but let’s try to tidy it up and comment the code.
/* lookup table: a character's position determines its Morse code;
   lowercase letters mark unused codes */
char _DAH_[] = "ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:";

/* function main */
main ()
{
  char *_DIT, *DAH_, *DIT_, *_DIT_, *malloc (), *gets ();

  /* outer loop: read one line of input at a time and print a newline
     after each line is finished */
  for (_DIT = malloc (81), DIT_ = _DIT++; _DIT == gets (_DIT); __DIT ('\n'))
    /* middle loop: for each input character, print its Morse code if it
       was found in _DAH_, otherwise print '?'; then print a space */
    for (DAH_ = _DIT; *DAH_;
         __DIT (*_DIT_ ? _DAH (*DIT_) : '?'), __DIT (' '), DAH_++)
      /* inner loop: search _DAH_ for the (uppercased) character while
         counting its code in *DIT_ */
      for (*DIT_ = 2, _DIT_ = _DAH_;
           *_DIT_ && (*_DIT_ != (*DAH_ >= 'a' ? *DAH_ & 223 : *DAH_));
           (*DIT_)++, _DIT_++)
        *DIT_ += (*_DIT_ >= 'a' ? *_DIT_ - 'a' : 0);
}

/* _DAH - convert a code to Morse symbols */
_DAH (DIT_)
{
  __DIT (DIT_ > 3 ? _DAH (DIT_ >> 1) : '\0');
  return DIT_ & 1 ? '-' : '.';
}

/* __DIT - print one character to standard output */
__DIT (DIT_)
     char DIT_;
{
  (void) write (1, &DIT_, 1);
}
We can see how this code works, but it still looks like spaghetti code. We see the three functions we expected: main, _DAH, and __DIT. We also see an external variable, _DAH_, a long string. __DIT is a thin wrapper around the write function that prints a single character. _DAH is a recursive function: as long as the argument is a number that takes more than 2 bits to write, it calls itself again with the last bit stripped off. The output is the argument in binary format, printed with - and . standing for 1 and 0 and with the leading 1 left out. Each call returns the symbol for its own last bit and leaves it to the caller to print. As an example, if we call _DAH(5), 5 being 101 in binary, it will call _DAH(2), which writes a NUL byte and returns '.'; _DAH(5) prints that '.' and returns '-', so the result is .- (the letter A).
If we change the variable names, maybe it will seem more familiar:
char ascii_array[] = "ETIANMSURWDKGOHVFaLaPJBXCYZQb54a3d2f16g7c8a90l?e'b.s;i,d:";

main ()
{
  char *_dot, *_dash, *_2_dot, *_3_dot, *malloc (), *gets ();

  for (_dot = malloc (81), _2_dot = _dot++; _dot == gets (_dot); print ('\n'))
    for (_dash = _dot; *_dash;
         print (*_3_dot ? convert (*_2_dot) : '?'), print (' '), _dash++)
      for (*_2_dot = 2, _3_dot = ascii_array;
           *_3_dot && (*_3_dot != (*_dash >= 'a' ? *_dash & 223 : *_dash));
           (*_2_dot)++, _3_dot++)
        *_2_dot += (*_3_dot >= 'a' ? *_3_dot - 'a' : 0);
}

convert (_2_dot)
{
  print (_2_dot > 3 ? convert (_2_dot >> 1) : '\0');
  return _2_dot & 1 ? '-' : '.';
}

print (_2_dot)
     char _2_dot;
{
  (void) write (1, &_2_dot, 1);
}
In conclusion, writing code that looks obscure or obfuscated is not good practice. It leads to spaghetti code: code that overuses GOTO statements rather than structured programming constructs, resulting in convoluted and unmaintainable programs.
Writing code can sometimes be the most difficult part of any software development process. If you don’t organize everything from the start — especially for big projects — the coding processes and code management afterwards may end up not just time consuming, but also a big headache.
Please follow these tips:
- Use a Coding Standard
- Write Useful Comments
- Avoid Global Code
- Use Meaningful Names
- Use Meaningful Structures
- Try Version Control Software
- Use a Testing Framework