Magic Bytes in Cybersecurity

Hackhoven
Apr 9, 2024

Introduction

In cybersecurity, understanding how files are structured is a bit like deciphering a secret code. At the heart of this puzzle are “magic bytes”, also called “magic numbers” or, more formally, “file signatures”. These bytes, usually found in a file’s header, identify its format and play a part in keeping digital systems secure. Join us as we unravel how magic bytes work and why they matter in the world of cybersecurity.

___________________________________________________________

What are Magic Bytes?

Magic bytes, also referred to as magic numbers or file signatures, are short sequences of bytes, usually located at the beginning of a file, that identify the file’s format or type. Much as a fingerprint helps tell individuals apart, magic bytes help tell file formats apart. They are a convention rather than a guarantee: some formats share a signature, and a few place it at an offset instead of at the very first byte.

File identification is crucial for software to interpret and handle different types of files correctly. Magic bytes provide an early clue about a file’s format: when a file is opened, many programs inspect its first bytes, rather than trusting the extension alone, to decide how to interpret the contents.
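To make this concrete, here is a minimal Python sketch of peeking at the first bytes of a file. The function names `read_header` and `to_hex` are illustrative assumptions, not part of any particular tool.

```python
def read_header(path, length=8):
    """Return the first `length` bytes of the file at `path`."""
    with open(path, "rb") as fh:  # binary mode: we want raw bytes, not text
        return fh.read(length)


def to_hex(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{byte:02X}" for byte in data)
```

Calling `to_hex(read_header(path, 4))` on a file shows its first four bytes in the same notation used by most file signature tables.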

A list of common file signatures is available here: https://en.wikipedia.org/wiki/List_of_file_signatures

Common Magic Bytes and Their Significance

  1. JPEG (Joint Photographic Experts Group): The magic bytes for JPEG files are “FF D8 FF” at the beginning. Recognizing these bytes helps software identify JPEG images, allowing for appropriate rendering and processing.
  2. PDF (Portable Document Format): PDF files typically start with the magic bytes “25 50 44 46” (the ASCII characters %PDF), which indicate the file’s format. This knowledge enables software to open and display PDF documents correctly.
  3. ZIP (Compressed Archive): ZIP archives typically begin with the magic bytes “50 4B 03 04” (the ASCII characters PK followed by the bytes 03 04). Identifying these bytes allows software to extract and decompress the contents of ZIP files accurately.
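The three signatures above can be turned into a small lookup, sketched below in Python. The table `SIGNATURES` and the function `identify` are hypothetical names for illustration, and the table covers only the formats listed in this section.

```python
# Signatures from the list above, written as hex for easy comparison.
SIGNATURES = {
    "JPEG": bytes.fromhex("FFD8FF"),
    "PDF": bytes.fromhex("25504446"),
    "ZIP": bytes.fromhex("504B0304"),
}


def identify(header):
    """Return the format name whose signature prefixes `header`, or None."""
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None
```

A match is only a hint. Formats built on ZIP containers, such as DOCX or JAR, start with the same bytes as a plain ZIP archive, so a real tool needs further checks to tell them apart.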

The Hacker’s Perspective

While magic bytes are invaluable for file identification, they also pose security risks if manipulated maliciously. Attackers can disguise malicious files by altering their magic bytes to masquerade as benign formats. This tactic aims to bypass security measures and deceive users into executing harmful files unknowingly.

Here are some common methods attackers employ to exploit magic bytes:

  1. Spoofing File Extensions: Attackers may change the extension of a malicious file to match a trusted format, optionally altering its magic bytes as well. For example, a malicious PHP file (.php) may be renamed to look like a harmless GIF image (.gif) and given leading bytes that mimic a legitimate GIF. If nothing inspects the file contents, the rename alone may be enough.

Alternatively, an attacker who wants to upload a malicious PHP file and obtain a reverse shell might exploit weak filename validation by using alternative extensions such as php5, phar, phtml, or phps.

  2. Manipulating Content-Type Header: In web-based attacks, adversaries can modify the Content-Type header of an HTTP request to misrepresent the file’s format. The header is supplied by the client, so a server that trusts it can be deceived, especially when the file’s content also carries altered magic bytes. For instance, an attacker might label a malicious script as an innocuous text file (Content-Type: text/plain) to conceal its true nature and evade detection.

  3. Altering Content-Disposition Header: Attackers can also tamper with the Content-Disposition header in HTTP responses to influence how browsers handle downloaded files. By specifying a misleading filename and pairing it with manipulated magic bytes, adversaries can lure users into downloading and executing malicious files without suspicion. For instance, a forged Content-Disposition header might suggest a harmless PDF document even though the file contains malicious code disguised through altered magic bytes.

  4. Adding Fake Magic Bytes: In more sophisticated attacks, adversaries inject fake magic bytes into files to mislead file identification mechanisms. By prepending counterfeit magic bytes to a malicious file, attackers can trick software into treating it as a benign format. For example, a malicious file may be prefixed with bytes resembling those of a harmless image or document format, which can let it slip past checks that only read the file header.
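From a defender’s point of view, the four tricks above share one weakness: each fools a single check, so comparing several signals against each other tends to expose them. The Python sketch below illustrates that idea under stated assumptions. The names `inspect_upload`, `sniff_mime`, and `filename_from_disposition` are hypothetical, the allowlist and signature table are deliberately tiny, and the script-marker scan is a rough heuristic rather than a reliable malware detector.

```python
from email.message import Message

# Assumed policy for this sketch: only a few image types are accepted.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}
MAGIC_TO_MIME = {
    b"\xff\xd8\xff": "image/jpeg",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
}
SCRIPT_MARKERS = (b"<?php",)  # heuristic only; easy to extend or evade


def sniff_mime(data):
    """Guess a MIME type from the leading bytes, or return None."""
    for magic, mime in MAGIC_TO_MIME.items():
        if data.startswith(magic):
            return mime
    return None


def filename_from_disposition(value):
    """Extract the filename parameter from a Content-Disposition value."""
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename() or ""


def inspect_upload(disposition, declared_type, data):
    """Return a list of findings; an empty list means no check fired."""
    findings = []

    # 1. Extension: an allowlist rejects php5, phtml, phar and similar.
    name = filename_from_disposition(disposition)
    _, dot, ext = name.rpartition(".")
    ext = ext.lower() if dot else ""
    if ext not in ALLOWED_EXTENSIONS:
        findings.append(f"extension '{ext}' is not on the allowlist")

    # 2. Declared Content-Type must agree with what the bytes look like.
    declared = declared_type.split(";", 1)[0].strip().lower()
    sniffed = sniff_mime(data)
    if sniffed != declared:
        findings.append(f"declared type '{declared}' but content looks like {sniffed}")

    # 3. Fake magic bytes: a valid-looking header can still hide a script.
    lowered = data.lower()
    if any(marker in lowered for marker in SCRIPT_MARKERS):
        findings.append("script marker found inside the file content")

    return findings
```

A file named `shell.phtml`, declared as `image/gif` and starting with `GIF89a`, passes the header comparison yet still trips the extension and script-marker checks, which is why no single signal is trusted on its own here.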

Defending Against Magic Byte Exploitation

To mitigate the risks associated with magic byte manipulation, security measures must be implemented at various levels:

  • File Integrity Checks: Employ checksums or cryptographic hashes to verify the integrity of files rather than relying solely on magic bytes.
  • Behavioral Analysis: Implement robust behavioral analysis techniques to detect anomalies in file behavior, even if their magic bytes appear legitimate.
  • Strict Input Validation: Ensure that software systems rigorously validate file inputs, irrespective of their magic bytes, to prevent exploitation of vulnerabilities.
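As an illustration of the file integrity check mentioned above, the following Python sketch compares a file’s SHA-256 digest with a known-good value. The function names are hypothetical, and the expected hash is assumed to come from a trusted source such as the publisher’s website.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_known_hash(data, expected_hex):
    """Compare the digest of `data` against a trusted hex digest."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```

Unlike a magic-byte check, a hash covers every byte of the file, so prepending or altering a header changes the digest and the comparison fails.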

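The following example, taken from DVWA (Damn Vulnerable Web Application), layers several checks on an uploaded image: file extension, size, declared MIME type, getimagesize(), and re-encoding of the image: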

<?php

if( isset( $_POST[ 'Upload' ] ) ) {
    // Check Anti-CSRF token
    checkToken( $_REQUEST[ 'user_token' ], $_SESSION[ 'session_token' ], 'index.php' );

    // File information
    $uploaded_name = $_FILES[ 'uploaded' ][ 'name' ];
    $uploaded_ext = substr( $uploaded_name, strrpos( $uploaded_name, '.' ) + 1);
    $uploaded_size = $_FILES[ 'uploaded' ][ 'size' ];
    $uploaded_type = $_FILES[ 'uploaded' ][ 'type' ];
    $uploaded_tmp = $_FILES[ 'uploaded' ][ 'tmp_name' ];

    // Where are we going to be writing to?
    $target_path = DVWA_WEB_PAGE_TO_ROOT . 'hackable/uploads/';
    //$target_file = basename( $uploaded_name, '.' . $uploaded_ext ) . '-';
    $target_file = md5( uniqid() . $uploaded_name ) . '.' . $uploaded_ext;
    $temp_file = ( ( ini_get( 'upload_tmp_dir' ) == '' ) ? ( sys_get_temp_dir() ) : ( ini_get( 'upload_tmp_dir' ) ) );
    $temp_file .= DIRECTORY_SEPARATOR . md5( uniqid() . $uploaded_name ) . '.' . $uploaded_ext;

    // Is it an image?
    if( ( strtolower( $uploaded_ext ) == 'jpg' || strtolower( $uploaded_ext ) == 'jpeg' || strtolower( $uploaded_ext ) == 'png' ) &&
        ( $uploaded_size < 100000 ) &&
        ( $uploaded_type == 'image/jpeg' || $uploaded_type == 'image/png' ) &&
        getimagesize( $uploaded_tmp ) ) {

        // Strip any metadata, by re-encoding image (Note, using php-Imagick is recommended over php-GD)
        if( $uploaded_type == 'image/jpeg' ) {
            $img = imagecreatefromjpeg( $uploaded_tmp );
            imagejpeg( $img, $temp_file, 100);
        }
        else {
            $img = imagecreatefrompng( $uploaded_tmp );
            imagepng( $img, $temp_file, 9);
        }
        imagedestroy( $img );

        // Can we move the file to the web root from the temp folder?
        if( rename( $temp_file, ( getcwd() . DIRECTORY_SEPARATOR . $target_path . $target_file ) ) ) {
            // Yes!
            $html .= "<pre><a href='{$target_path}{$target_file}'>{$target_file}</a> succesfully uploaded!</pre>";
        }
        else {
            // No
            $html .= '<pre>Your image was not uploaded.</pre>';
        }

        // Delete any temp files
        if( file_exists( $temp_file ) )
            unlink( $temp_file );
    }
    else {
        // Invalid file
        $html .= '<pre>Your image was not uploaded. We can only accept JPEG or PNG images.</pre>';
    }
}

// Generate Anti-CSRF token
generateSessionToken();

?>

Conclusion

Magic bytes act as markers of file identity, guiding software in recognizing and processing diverse file formats. Their significance extends beyond identification, though: because they are easy to forge, they carry security implications that demand vigilance and layered, proactive defenses. By understanding how magic bytes work and how they can be abused, cybersecurity professionals can strengthen defenses against emerging threats and help protect the integrity and safety of digital systems.

___________________________________________________________

Happy ethical hacking ❤

Note: This article is for educational purposes only.
