Hill-DNA Encryption and Decryption: Bridging Classical Cryptography with DNA-Based Techniques

Benkaddour Racim
9 min read · Dec 28, 2023

DNA-HILL

Introduction:

In the rapidly evolving landscape of cryptography, DNA-based techniques have emerged as an active research frontier for secure data storage and transmission. Drawing on properties of DNA such as its robustness and very high information density, DNA-based cryptography offers an alternative to traditional approaches and opens the door to new applications.

A good illustration of combining classical and contemporary cryptographic techniques is the hybrid approach that joins the Hill Cipher, a classical symmetric key encryption algorithm, with DNA concepts. This fusion, exemplified by Hill-ADN encryption (ADN is the French abbreviation for DNA, so Hill-ADN and Hill-DNA name the same technique), aims to combine the strengths of both paradigms and improve resilience against potential vulnerabilities.

This article walks through the Hill-ADN encryption technique: its underlying principles, its encryption and decryption processes, and its place in the broader context of cryptographic methods. The aim is to give readers a clear understanding of what the fusion of classical cryptography with DNA-based techniques can offer.

Background:

Hill Cipher:

The Hill Cipher, devised by Lester S. Hill in 1929, is a cornerstone of classical cryptography. It is a symmetric key encryption algorithm that uses matrix-based transformations to encrypt and decrypt data. The Hill Cipher divides plaintext into blocks of a fixed length, and each block is transformed using a key matrix. Crucially, the key matrix must possess an inverse modulo the alphabet size (26 for the English alphabet), because decryption multiplies each ciphertext block by that inverse.

DNA in Cryptography:

Deoxyribonucleic Acid (DNA), the intricate molecule encoding genetic information, has garnered attention for its potential applications in cryptography. Characterized by its robustness, vast storage capacity, and unique nucleotide sequence specificity, DNA offers a novel paradigm for secure data encoding and transmission. In DNA-based cryptographic techniques, the four nucleotide bases, Adenine (A), Thymine (T), Cytosine (C), and Guanine (G), are used to encode and decode information.

Hybridization: Bridging Classical Cryptography with DNA-Based Techniques:

The burgeoning interest in DNA-based cryptography has precipitated endeavors to integrate classical cryptographic algorithms, such as the Hill Cipher, with DNA concepts, giving rise to hybrid techniques like Hill-ADN encryption. This fusion seeks to capitalize on the synergistic potential inherent in the amalgamation of classical and contemporary cryptographic paradigms, thereby enhancing cryptographic robustness, resilience, and versatility.

The imperative for hybridization is underscored by the complementary strengths of classical cryptography and DNA-based techniques. While classical cryptographic algorithms offer established frameworks and methodologies, DNA-based techniques introduce novel mechanisms and capabilities that can augment existing cryptographic paradigms. By fostering interdisciplinary collaboration and innovation, hybridization endeavors aim to propel cryptographic methodologies into new frontiers, engendering enhanced security, efficiency, and applicability across diverse domains.

Methodology:

Encryption:

First Step: Conversion of plaintext to binary.

The initial phase of the encryption process involves the conversion of plaintext into an 8-bit binary representation. The function text_to_8bit_binary accomplishes this by iterating over each character in the plaintext, obtaining its ASCII value, and converting it to an 8-bit binary string using the format function.

Fig.1 Converting to 8bits

Example: For the text “foo”, the binary representation would be “011001100110111101101111”.
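The step above can be sketched in a few lines of Python. This is a minimal sketch, not the article's original code: the function name follows the text, while the rejection of characters above code point 255 is an assumption added here so that every character really fits in 8 bits.

```python
def text_to_8bit_binary(text):
    """Concatenate the 8-bit binary form of every character in `text`."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 255:
            # Assumption: only characters that fit in a single byte are supported.
            raise ValueError(f"character {ch!r} does not fit in 8 bits")
        bits.append(format(code, "08b"))  # zero-padded to 8 binary digits
    return "".join(bits)


print(text_to_8bit_binary("foo"))  # 011001100110111101101111
```

Each character contributes exactly 8 bits, so the output length is always a multiple of 8, and therefore of 2, which is what the pairing in the next step relies on.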

Second Step: Transformation to DNA form.

Subsequently, the binary data is transformed into a DNA sequence utilizing a specified mapping rule, where ‘A’ corresponds to “00”, ‘C’ corresponds to “01”, ‘G’ corresponds to “10”, and ‘T’ corresponds to “11”. The function binary_to_adn accomplishes this transformation by iterating over the binary string, grouping it into pairs, and mapping each pair to its corresponding nucleotide base.

Fig.2 Converting Binary Results to DNA form

Example: For the binary sequence “011001100110111101101111”, the pairs are 01 10 01 10 01 10 11 11 01 10 11 11, so the DNA sequence would be “CGCGCGTTCGTT”.
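A minimal sketch of this mapping, assuming the function name binary_to_adn from the text and the rule A=00, C=01, G=10, T=11. The input validation is an addition made here for illustration.

```python
# Mapping rule taken from the text: A=00, C=01, G=10, T=11.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def binary_to_adn(binary):
    """Map each pair of bits to one nucleotide base."""
    if len(binary) % 2 != 0:
        raise ValueError("binary string must contain an even number of bits")
    if set(binary) - {"0", "1"}:
        raise ValueError("binary string may only contain '0' and '1'")
    return "".join(BITS_TO_BASE[binary[i:i + 2]] for i in range(0, len(binary), 2))


print(binary_to_adn("011001100110111101101111"))  # CGCGCGTTCGTT
```

Since every 8-bit character becomes exactly four bases, "f" maps to CGCG and each "o" maps to CGTT.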

Third Step: Mapping to Amino Acids Table.

The resultant DNA sequence is then mapped to an Amino Acids Table, which associates specific codons (triplets of nucleotide bases) with corresponding amino acids. The function adn_to_amino_acid iterates over the DNA sequence, extracting each codon and mapping it to its respective amino acid based on the provided Amino Acids Table.

Fig.3 Amino Acids
codon_list = [
('TCA', 'S'), ('TCC', 'S'), ('TCG', 'S'), ('TCT', 'S'),
('TTC', 'F'), ('TTT', 'F'), ('TTA', 'L'), ('TTG', 'L'),
('TAC', 'Y'), ('TAT', 'Y'), ('TAA', '*'), ('TAG', '*'),
('TGC', 'C'), ('TGT', 'C'), ('TGA', '*'), ('TGG', 'W'),
('CTA', 'L'), ('CTC', 'L'), ('CTG', 'L'), ('CTT', 'L'),
('CCA', 'P'), ('CCC', 'P'), ('CCG', 'P'), ('CCT', 'P'),
('CAC', 'H'), ('CAT', 'H'), ('CAA', 'Q'), ('CAG', 'Q'),
('CGA', 'R'), ('CGC', 'R'), ('CGG', 'R'), ('CGT', 'R'),
('ATA', 'I'), ('ATC', 'I'), ('ATT', 'I'), ('ATG', 'M'),
('ACA', 'T'), ('ACC', 'T'), ('ACG', 'T'), ('ACT', 'T'),
('AAC', 'N'), ('AAT', 'N'), ('AAA', 'K'), ('AAG', 'K'),
('AGC', 'S'), ('AGT', 'S'), ('AGA', 'R'), ('AGG', 'R'),
('GTA', 'V'), ('GTC', 'V'), ('GTG', 'V'), ('GTT', 'V'),
('GCA', 'A'), ('GCC', 'A'), ('GCG', 'A'), ('GCT', 'A'),
('GAC', 'D'), ('GAT', 'D'), ('GAA', 'E'), ('GAG', 'E'),
('GGA', 'G'), ('GGC', 'G'), ('GGG', 'G'), ('GGT', 'G')
]
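The codon lookup can be sketched as below. To stay self-contained, this sketch rebuilds the same 64-entry table from the standard genetic code ordering (bases in the order T, C, A, G) instead of reusing codon_list. The function name follows the text. How to treat trailing bases that do not fill a codon is not specified in the article, so returning them separately is an assumption made here.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def adn_to_amino_acid(dna):
    """Translate complete codons; return (amino_acids, leftover_bases)."""
    usable = len(dna) - len(dna) % 3
    amino = "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))
    # Assumption: bases that do not form a full codon are handed back untouched.
    return amino, dna[usable:]


print(adn_to_amino_acid("CGCGCGTTCGTT"))  # ('RAFV', '')
```

Because 64 codons map to only 21 symbols, several codons share one amino acid (for example CGC and CGG both give R). The amino acid sequence alone therefore does not determine the DNA sequence it came from.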

Encryption Process:

Fourth Step: Hill Cipher Encryption

After mapping the DNA sequence to the Amino Acids Table, the resultant amino acid sequence serves as the input for the Hill Cipher encryption process. The Hill Cipher utilizes a key matrix for encrypting the amino acid sequence, and the process involves the following steps:

Key Matrix Formation: A key matrix is defined based on the encryption key provided. This matrix should be invertible to ensure the decryption process can retrieve the original plaintext.

Block Formation: The amino acid sequence is segmented into blocks based on the size of the key matrix. Padding may be required for the last block if it does not align with the matrix size.

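Numerical Representation: Each amino acid in the block is converted to its corresponding numerical value, typically by subtracting the ASCII value of ‘A’, so that ‘A’ maps to 0 and ‘Z’ maps to 25. The stop symbol ‘*’ in the Amino Acids Table is not a letter and falls outside this range, so it needs separate handling.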

Matrix Multiplication: The numerical block is transformed into a matrix, and matrix multiplication is performed between the key matrix and the block matrix. The multiplication is typically performed modulo 26 to ensure the resulting values are within the range of the alphabet.

Conversion to Characters: The resultant matrix is converted back to a string of characters, which constitute the encrypted block.

Sequence Compilation: The encrypted blocks are accumulated to form the complete encrypted sequence.
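The six steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the article's implementation: the name hill_encrypt, the padding letter ‘X’, and the example key [[3, 3], [2, 5]] are hypothetical choices, and input containing the stop symbol ‘*’ is simply rejected because it has no value in the range 0 to 25.

```python
def hill_encrypt(amino_acids, key, pad="X"):
    """Encrypt an A-Z string block by block with a square key matrix (mod 26)."""
    n = len(key)
    if n == 0 or any(len(row) != n for row in key):
        raise ValueError("key matrix must be square and non-empty")
    if not all("A" <= ch <= "Z" for ch in amino_acids):
        # Assumption: symbols outside A-Z (such as '*') are not supported.
        raise ValueError("input must contain only the letters A-Z")

    # Block formation: pad the last block up to the matrix size.
    remainder = len(amino_acids) % n
    if remainder:
        amino_acids += pad * (n - remainder)

    encrypted = []
    for start in range(0, len(amino_acids), n):
        # Numerical representation: 'A' -> 0, ..., 'Z' -> 25.
        block = [ord(ch) - ord("A") for ch in amino_acids[start:start + n]]
        # Matrix multiplication modulo 26: one output letter per key row.
        for row in key:
            value = sum(k * v for k, v in zip(row, block)) % 26
            encrypted.append(chr(value + ord("A")))
    return "".join(encrypted)


KEY = [[3, 3], [2, 5]]  # hypothetical key; determinant 9 is coprime with 26
print(hill_encrypt("RAFV", KEY))  # ZIAL
```

For the first block, R=17 and A=0 give (3·17 + 3·0, 2·17 + 5·0) = (51, 34), which is (25, 8) modulo 26, that is "ZI".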

Use Case Diagram:

Fig.5 Use Case Diagram for DNA-HILL

Actors:

User: Interacts with the system by providing plaintext and a matrix key.

Use Cases:

  • Input Plaintext: The user provides a plaintext input.
  • Input Matrix Key: The user supplies a matrix key for the Hill-DNA cipher.
  • Convert to Binary: The system processes the plaintext and converts it into an 8-bit binary format.
  • Convert to DNA Sequence: The binary result is further transformed into a DNA sequence representation: A (00), C (01), G (10), T (11).
  • Convert to Amino Acids: Using a codon table, the DNA sequence is translated into a sequence of amino acids.
  • Encrypt using Hill Cipher: The amino acid sequence undergoes encryption via the Hill cipher process using the provided matrix key.
  • Decrypt using Hill Cipher: The encrypted amino acid sequence is automatically decrypted using the inverse of the matrix key.

System Outputs:

  • Binary Result: The system displays the binary representation of the plaintext.
  • DNA Sequence: The DNA sequence derived from the binary result is shown.
  • Amino Acid Sequence: The sequence of amino acids obtained from the DNA sequence is displayed.
  • Encrypted Message: The system provides the encrypted amino acid sequence.
  • Decrypted Message: Post-decryption, the original amino acid sequence (the input to the Hill cipher step, not the original text) is displayed.

Class Diagram:

Fig.6 Class Diagram for HILL-DNA

The class diagram depicts a system for encrypting and decrypting messages using the Hill ADN cipher. The User provides plaintext and a matrix key for encryption. The HillADNProcessor class performs conversions: from plaintext to binary, then to DNA, and eventually to amino acids. It also manages the encryption and decryption processes using the provided matrix key. Finally, the Output class showcases the encrypted or decrypted messages.

Sequence Diagram:

The provided sequence diagram illustrates the Hill-ADN encryption and decryption process. Here’s a brief overview:

  1. Encryption Process:
  • The User provides plaintext and a matrix key to the InputProcessor.
  • The plaintext is converted into an 8-bit binary format using BinaryConverter.
  • The 8-bit binary is then transformed into a DNA sequence via DNAConverter.
  • AminoAcidTranslator translates this DNA sequence into amino acids.
  • The HillCipherEncryptor encrypts the amino acid sequence using the provided matrix key.
  • Finally, the encrypted result, along with its details, is presented back to the User.
  2. Decryption Process:
  • Automatic Initiation: The System automatically initiates the decryption process.
  • Receiving Encrypted Result: The HillCipherDecryptor component receives the encrypted message for decryption.
  • Decryption Execution: Utilizing the inverse matrix key and the Hill decryption process, the HillCipherDecryptor component processes the encrypted message sequence.
  • Decrypted Result: The output of the decryption process is the amino acid result, not the original plaintext. The system presents this decrypted amino acid sequence back to the user.

Overall, this sequence diagram depicts the flow of data and interactions between the User and various components of the system during both the encryption and decryption phases of the Hill-ADN algorithm.
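The decryption phase can be sketched as follows. To keep the example short it is restricted to 2×2 keys, where the inverse modulo 26 has a closed form. The names inverse_2x2_mod26 and hill_decrypt and the example key are hypothetical, and pow(det, -1, 26) requires Python 3.8 or later.

```python
def inverse_2x2_mod26(key):
    """Inverse of a 2x2 matrix modulo 26; raises ValueError if none exists."""
    (a, b), (c, d) = key
    det = (a * d - b * c) % 26
    # pow raises ValueError when det shares a factor with 26 (no inverse).
    det_inv = pow(det, -1, 26)
    return [
        [(d * det_inv) % 26, (-b * det_inv) % 26],
        [(-c * det_inv) % 26, (a * det_inv) % 26],
    ]


def hill_decrypt(ciphertext, key):
    """Decrypt an A-Z string that was encrypted with a 2x2 Hill key."""
    if len(ciphertext) % 2 != 0:
        raise ValueError("ciphertext length must be a multiple of 2")
    inverse = inverse_2x2_mod26(key)
    decrypted = []
    for start in range(0, len(ciphertext), 2):
        block = [ord(ch) - ord("A") for ch in ciphertext[start:start + 2]]
        for row in inverse:
            value = sum(k * v for k, v in zip(row, block)) % 26
            decrypted.append(chr(value + ord("A")))
    return "".join(decrypted)


KEY = [[3, 3], [2, 5]]          # hypothetical key used for illustration
print(inverse_2x2_mod26(KEY))   # [[15, 17], [20, 9]]
print(hill_decrypt("ZIAL", KEY))  # RAFV
```

The output is the amino acid sequence that entered the Hill step, which matches the diagram's note that decryption returns amino acids rather than the original text.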

Summary:

The encryption process post-mapping to the Amino Acids Table involves segmenting the amino acid sequence into blocks, converting the amino acids to numerical values, performing matrix multiplication with the key matrix, and converting the resulting matrix back to characters. The modular arithmetic ensures that the encryption process operates within the constraints of the alphabet, preserving the cyclical nature of the Hill Cipher.

Key Features of Hill-ADN Cipher:

Hill-ADN Cipher integrates the Hill Cipher algorithm with DNA-based encryption, creating a hybrid cryptographic technique that combines matrix-based algebra with biological sequences.

  1. Polygraphic Substitution over DNA-Derived Symbols: Hill-ADN Cipher encrypts blocks of amino acid letters derived from the DNA encoding of the plaintext, so each ciphertext letter depends on a whole block rather than on a single symbol.
  2. Symmetric Key Encryption: Similar to Hill Cipher, Hill-ADN uses a symmetric key for both encryption and decryption processes, ensuring secure communication between parties.
  3. Matrix-Based Transformation of DNA-Derived Data: The algorithm applies key-matrix multiplication to the amino acid sequence obtained from the DNA encoding, and the inverse matrix to reverse it, merging mathematical operations with biological sequences.
  4. Adaptability in Block Sizes: Hill-ADN Cipher offers flexibility in choosing block sizes, accommodating varying data sizes and encryption requirements.

Strengths of Hill-ADN Cipher:

  1. Enhanced Security with DNA Integration: The incorporation of DNA sequences adds a further encoding layer before the Hill step, which is intended to make the cipher more robust against traditional cryptographic attacks.
  2. Biological Encryption Complexity: The utilization of DNA sequences introduces biological complexity, enhancing the encryption’s resilience against decryption attempts.
  3. Efficiency and Versatility: Hill-ADN Cipher maintains the efficiency of the Hill Cipher, ensuring rapid encryption and decryption processes suitable for diverse applications.

Weaknesses of Hill-ADN Cipher:

  1. Potential for Biological Vulnerabilities: The biological nature of DNA sequences may introduce vulnerabilities related to DNA synthesis errors or biological degradation.
  2. Complexity in Key Management: Managing the symmetric keys for both the matrix-based and DNA-based encryption components may introduce challenges in key distribution and management.
  3. Limited Understanding and Analysis Tools: The unique fusion of mathematical and biological components may require specialized tools and expertise for comprehensive analysis and validation.

Conclusion:

Hill-ADN Cipher represents an innovative convergence of mathematical cryptography with biological encryption techniques. By leveraging the inherent complexities of DNA sequences, it offers a novel approach to secure communication and data protection. While presenting enhanced security features and versatility, it necessitates a nuanced understanding and careful implementation to address potential vulnerabilities and ensure effective key management. As research advances in both cryptography and genetics, the Hill-ADN Cipher holds promising potential for further exploration and application in diverse domains, ranging from healthcare to information security.

FAQs

  1. Is the Hill-DNA Cipher unbreakable?
    No encryption method is entirely unbreakable. However, the Hill-DNA Cipher’s polygraphic nature and matrix operations add complexity that enhances security.
  2. Can the Hill-DNA Cipher be used for large blocks of text?
    While the Hill-DNA Cipher can handle larger blocks, it might become less efficient due to increased computation.
  3. What happens if the key matrix is not invertible?
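    An invertible key matrix is essential for decryption. Encryption still produces output with a non-invertible matrix, but that output cannot be uniquely decrypted. With a 26-letter alphabet, the matrix is invertible exactly when its determinant is coprime with 26.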
  4. Can the Hill-DNA Cipher be used with languages other than English?
    Yes, the Hill-DNA Cipher can be adapted to work with other languages by mapping characters to appropriate numerical values.
  5. Are there other encryption methods that combine matrix operations?
    Partly. Methods like the Playfair Cipher and the Matrix Transposition Cipher arrange text in a grid (matrix), but unlike the Hill Cipher they do not rely on matrix multiplication.
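For question 3 above, a key matrix can be checked before it is used. The sketch below is one way to do that for small square matrices of any size: it computes the integer determinant by cofactor expansion and tests whether it is coprime with the modulus. The names determinant and is_valid_hill_key are hypothetical.

```python
from math import gcd


def determinant(matrix):
    """Integer determinant by cofactor expansion (fine for small matrices)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, value in enumerate(matrix[0]):
        # Minor: drop the first row and the current column.
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total


def is_valid_hill_key(matrix, modulus=26):
    """True if the matrix is square and invertible modulo `modulus`."""
    n = len(matrix)
    if n == 0 or any(len(row) != n for row in matrix):
        return False
    return gcd(determinant(matrix) % modulus, modulus) == 1


print(is_valid_hill_key([[3, 3], [2, 5]]))  # True  (determinant 9)
print(is_valid_hill_key([[2, 4], [6, 8]]))  # False (determinant -8, shares factor 2 with 26)
```

Cofactor expansion grows factorially with the matrix size, so a larger key would call for a different determinant method.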

Thank you for Reading

Happy Cryptography
