A Primer on Protein Engineering
What does it take to engineer the building blocks of life?
Proteins are responsible for most of the work done in cells. From the structure and function to the repair and regulation of tissues and organs, proteins are incredible agents of biology, often considered nano-machines. As you’re reading this, proteins in your body are firing neurons, digesting food, carrying oxygen through your bloodstream, and defending you from infection.
Each protein performs a specific task and has been shaped by nature or, more recently, by biotechnology. Protein engineering is a young and promising discipline, and its market is projected to reach USD 3.9 billion by 2024. To understand current challenges and future trends, it’s important to grasp how and why proteins are engineered in the first place.
Branching out of genetic engineering, protein engineering is the design and modification of proteins to obtain useful properties. The goal is usually to engineer proteins with new properties or functions that don’t exist in nature. Although these proteins are often considered unnatural, they can treat naturally occurring diseases, such as cancer, autoimmune inflammation, and infections.
Although DNA contains the instructions to create proteins, ribosomes in the cytoplasm actually make them. A gene’s DNA is first transcribed into messenger RNA (mRNA), which ribosomes then translate three nucleotides (one codon) at a time. The specific sequence of DNA nucleotides (A, C, G, T) therefore determines the sequence of amino acids, which in turn dictates the chemical nature and function of the protein. During translation, amino acids are joined into a polypeptide chain that folds into a complex, three-dimensional molecule: a protein. Proteins are built from 20 different kinds of amino acids, whose elaborate interactions shape the final structure.
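To make the path from DNA to protein concrete, here is a minimal Python sketch that transcribes a coding-strand DNA sequence into mRNA and translates it with the standard genetic code. The function names and the example sequence are made up for illustration, and the sketch ignores real-world details such as introns, reading-frame detection, and protein folding.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA that matches a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA one codon at a time until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example gene, invented for this sketch.
example_dna = "ATGGCCATTGTAATGGGCCGCTGA"
print(translate(transcribe(example_dna)))  # MAIVMGR
```

Each letter in the output is the one-letter code of an amino acid, so changing a single DNA base can change a single residue in the protein.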
After being engineered, a protein should still fold correctly and perform its intended function efficiently.
Any change to a protein’s amino acid sequence that modifies its structure or function falls under protein engineering.
Protein engineers take many approaches to changing proteins in a favourable way, but they tend to use rational design, directed evolution (irrational design), or both (semi-rational design).
RATIONAL DESIGN
Rational protein design is increasingly popular thanks to advances in bioinformatics and protein analysis software. This approach requires detailed knowledge of a protein’s structure and function to decide which amino acids to change and how.
Rational design predicts protein sequences that will fold into specific structures and have a certain function. In other words, it predicts how a protein’s structure will affect its behaviour. These predictions can be validated through peptide synthesis, site-directed mutagenesis, or artificial gene synthesis experiments.
Peptide synthesis is the production of peptides, short chains of 2–50 amino acids linked by peptide bonds. Each peptide bond forms through a condensation reaction between the carboxyl group of one amino acid and the amino group of another, releasing one molecule of water.
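Because every peptide bond is formed by condensation, a peptide weighs less than the sum of its free amino acids: one water molecule is lost per bond. The sketch below works through that arithmetic for three amino acids using monoisotopic masses. The function name and the tiny mass table are illustrative only, not a complete reference.

```python
# Monoisotopic masses in daltons of three free amino acids (one-letter codes).
FREE_AMINO_ACID_MASS = {
    "G": 75.03203,   # glycine
    "A": 89.04768,   # alanine
    "S": 105.04259,  # serine
}
WATER_MASS = 18.01056  # lost once for every peptide bond formed


def peptide_mass(sequence: str) -> float:
    """Sum the free amino acid masses, then subtract one water per peptide bond."""
    bonds = len(sequence) - 1
    return sum(FREE_AMINO_ACID_MASS[aa] for aa in sequence) - bonds * WATER_MASS


# The dipeptide Gly-Ala has one peptide bond, so one water is removed.
print(round(peptide_mass("GA"), 3))  # about 146.069
```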
Peptide synthesis has encouraged the development of antibodies specific to epitopes (the part of an antigen that binds to a receptor on the surface of a B cell) of pathogenic proteins, the study of protein functions, and the identification and characterization of proteins. Synthetic peptides can resemble natural peptides and act as drugs against major diseases.
Site-directed mutagenesis is used to bring about intentional, targeted changes to the DNA sequence and study the structural and functional properties of a protein. When the DNA is manipulated through site-directed mutagenesis, it’s possible to study the changes in protein activity.
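On a computer, a targeted change like this amounts to swapping one codon in a coding sequence. The sketch below shows the idea under simple assumptions: the sequence starts in frame, and positions count codons from 1. The function name and the example sequence are hypothetical, and the code says nothing about the lab work (primers, PCR) needed to make the change in real DNA.

```python
def mutate_codon(coding_dna: str, codon_position: int, new_codon: str) -> str:
    """Replace the codon at a 1-based codon position with a new codon."""
    if len(new_codon) != 3 or set(new_codon.upper()) - set("ACGT"):
        raise ValueError("new_codon must be three DNA letters")
    start = (codon_position - 1) * 3
    if codon_position < 1 or start + 3 > len(coding_dna):
        raise ValueError("codon_position is outside the sequence")
    return coding_dna[:start] + new_codon.upper() + coding_dna[start + 3:]


# Hypothetical example: change codon 2 from GCC (alanine) to GGC (glycine).
original = "ATGGCCATTGTA"
mutant = mutate_codon(original, 2, "GGC")
print(original, "->", mutant)  # ATGGCCATTGTA -> ATGGGCATTGTA
```

Here a single base change (C to G) swaps alanine for glycine, which is the kind of edit whose effect on protein activity can then be measured.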
Artificial gene synthesis encompasses the methods that are used in synthetic biology to assemble genes from nucleotides de novo. To synthesize genes, DNA oligonucleotides must first be chemically synthesized, then assembled. As the cost of producing high-quality synthetic DNA falls, more biological hypotheses can be explored and the gap between the cost of reading and writing DNA can shrink.
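The "synthesize short pieces, then assemble" idea can be sketched as splitting a gene into overlapping oligonucleotides and stitching them back together by their shared ends. This is a deliberately simplified illustration: the function names, oligo length, and overlap are assumptions, and real assembly methods typically alternate strands and tune overlaps for melting temperature.

```python
def tile_oligos(gene: str, length: int = 20, overlap: int = 8) -> list[str]:
    """Split a gene into oligos of at most `length` bases that share `overlap` bases."""
    if not 0 < overlap < length:
        raise ValueError("overlap must be positive and smaller than length")
    oligos = []
    start = 0
    while True:
        oligos.append(gene[start:start + length])
        if start + length >= len(gene):
            break
        start += length - overlap
    return oligos


def assemble(oligos: list[str], overlap: int = 8) -> str:
    """Rebuild the gene by joining oligos wherever their ends match."""
    gene = oligos[0]
    for oligo in oligos[1:]:
        if gene[-overlap:] != oligo[:overlap]:
            raise ValueError("neighbouring oligos do not overlap as expected")
        gene += oligo[overlap:]
    return gene


# Hypothetical gene fragment, invented for this sketch.
gene = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
pieces = tile_oligos(gene)
print(len(pieces), assemble(pieces) == gene)
```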
Rational design is inexpensive and technically easy, but it requires thorough structural knowledge of a protein, which is often unavailable. Even with that knowledge, the effects of individual mutations can be hard to predict accurately.
DIRECTED EVOLUTION
In contrast to rational protein design, directed evolution is a method that mimics natural selection to guide proteins or nucleic acids toward a particular goal. Either in vivo or in vitro, directed evolution subjects genes to multiple rounds of mutagenesis, selection, and amplification. Random mutagenesis is applied to a protein and variants with desired traits are selected.
Random mutagenesis helps generate enzymes, proteins, entire metabolic pathways, and genomes with desired properties.
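The loop of mutagenesis, selection, and amplification can be imitated in a few lines of Python. In this toy sketch, "fitness" is simply the number of positions that match a made-up target sequence; in a real experiment there is no known target, and fitness would come from a laboratory assay. All names, sequences, and parameters here are illustrative assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(sequence: str, rate: float, rng: random.Random) -> str:
    """Random mutagenesis: each residue is replaced with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in sequence
    )


def fitness(sequence: str, target: str) -> int:
    """Toy stand-in for a screening assay: count positions matching the target."""
    return sum(a == b for a, b in zip(sequence, target))


def directed_evolution(parent, target, rounds=30, library_size=200,
                       keep=5, rate=0.05, seed=0):
    rng = random.Random(seed)
    parents = [parent]
    for _round in range(rounds):
        per_parent = library_size // len(parents)
        # Mutagenesis: build a library of variants from the current parents.
        library = [mutate(p, rate, rng) for p in parents for _copy in range(per_parent)]
        library.extend(parents)  # keep the parents so the best score never drops
        # Selection: the top variants become the parents of the next round.
        library.sort(key=lambda s: fitness(s, target), reverse=True)
        parents = library[:keep]
    return parents[0]


start = "MAAAAAAAAAAAAAAAAAAA"   # hypothetical starting protein
target = "MKTLLVAGSDEFHIRNPQWY"  # hypothetical ideal variant
best = directed_evolution(start, target)
print(fitness(start, target), "->", fitness(best, target))
```

The score can only rise or stay the same from round to round because the parents are carried over into each new library, which loosely mirrors keeping the best clone from the previous round.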
Directed evolution often yields better results than rational design. It doesn’t require prior structural knowledge of a protein or predictions of the impact of mutations. Nonetheless, it depends on high-throughput screening, the rapid testing of thousands to millions of samples for biological activity, to increase the chance of finding variants with the desired properties, and such screening isn’t possible for all proteins.
In 2018, pioneering work on the directed evolution of enzymes earned Frances Arnold, a professor of chemical engineering at Caltech, a share of the Nobel Prize in Chemistry.
“Natural enzymes didn’t spontaneously appear, they evolved to allow organisms they were part of to take advantage of new conditions or food sources, to occupy new niches,” says Frances Arnold. “Recently, new enzymes have evolved to take advantage of man-made chemicals in the environment, to allow bacteria to eat herbicides or pesticides, for example. In directed evolution we provide a new niche in the laboratory, so to speak, and encourage evolution of enzymes to catalyse commercially useful reactions.”
SEMI-RATIONAL DESIGN
For semi-rational protein design, information about protein sequence, structure, and function is combined with predictive algorithms to identify the residues that most influence protein function. (A residue is a single amino acid unit within the protein chain.) Mutating only these residues creates small, focused libraries of mutant proteins that are more likely to have enhanced properties.
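One way to picture such a focused library is to vary only a handful of chosen positions and leave the rest of the protein untouched. The sketch below enumerates every combination of the 20 amino acids at the selected positions. The function name, the example sequence, and the chosen positions are hypothetical; in practice the positions would come from structural data or predictive algorithms.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def focused_library(sequence: str, positions: list[int],
                    alphabet: str = AMINO_ACIDS) -> list[str]:
    """Return every variant that differs only at the given 1-based positions."""
    variants = []
    for combo in product(alphabet, repeat=len(positions)):
        residues = list(sequence)
        for position, amino_acid in zip(positions, combo):
            residues[position - 1] = amino_acid
        variants.append("".join(residues))
    return variants


# Hypothetical example: vary positions 2 and 6 of a seven-residue peptide.
library = focused_library("MAIVMGR", [2, 6])
print(len(library))  # 400, i.e. 20 options at each of 2 positions
```

The library grows as 20 to the power of the number of positions, which is why narrowing the choice of positions keeps screening manageable.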
APPLICATIONS
Pharmaceutical and biotechnology companies have taken an interest in protein-based drug development and heavily invested in synthetic biology. The market of protein engineering is divided into many segments, for which the amount of research and funding greatly varies.
- Monoclonal antibodies
Made in the laboratory, monoclonal antibodies mimic the immune system’s ability to recognize and fight off harmful antigens. For example, monoclonal antibodies can be directed against the spike protein of SARS-CoV-2 to block the virus’s attachment to and entry into human cells.
Demand for monoclonal antibodies has increased for the treatment of cancer, neurological diseases, and infectious diseases.
- Insulin
Made by the pancreas, insulin controls the amount of glucose in your bloodstream, helps store it in your liver, fat, and muscles, and regulates your body’s metabolism of carbohydrates, fats, and proteins.
- Erythropoietin
Erythropoietin is a hormone that is produced by the kidneys and liver and that plays a role in the production of red blood cells by the bone marrow.
- Interferons
When a virus is detected, host cells make and release interferons to defend themselves and neighbouring cells. Interferons alert your immune system to germs or cancerous cells in your body.
- Vaccines
Protein vaccines consist of purified or recombinant antigens from a bacterium or virus. When this type of vaccine is administered, it triggers a protective immune response against the pathogen.
- Colony-stimulating factors
Colony-stimulating factors control the production and some functions of granulocytes and macrophages, immune cells that protect the body against infections. They are of great importance for treating low white blood cell levels following chemotherapy in cancer patients.
- Growth hormones
Produced by the pituitary gland in the brain and then secreted into the bloodstream, growth hormones fuel childhood growth and help maintain tissues and organs.
- Coagulation factors
These are proteins in the blood that help control bleeding and work together to form a blood clot. Coagulation factors are essential and help prevent your body from losing too much blood after an injury.
- Other proteins
“Other proteins” include interleukins, transforming growth factors, epidermal growth factors, tumor necrosis factors, and stem cell factors. Interleukins, transforming growth factors, tumor necrosis factors, and stem cell factors are cytokines, which are small proteins that contribute to cell signaling. Epidermal growth factors, on the other hand, stimulate cellular proliferation.
TOOLS
Mass spectrometry is a technique that measures characteristics of individual molecules. How?
- Vaporize a small sample to allow it to move along the mass spectrometer
- Ionize the sample so that its molecules become cations (positively charged ions)
- Accelerate the cations through an electric field toward negatively charged plates, which attract them
- Separate the ions as they pass through a magnetic field, which deflects ions with a higher mass-to-charge (m/z) ratio less than those with a lower one
- Detect ions that hit the detector
This entire process reveals the m/z ratio of each charged particle, which can then let us know more about the molecular mass. Mass spectrometry is frequently used with other spectrometric methods to corroborate the proposed protein structure and uncover the structure of peptides and chemical compounds.
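The separation step can be put into numbers with basic physics. An ion accelerated through a voltage V gains kinetic energy qV, and in a magnetic field B it follows a circle of radius r = sqrt(2·V·m/q) / B, so a larger m/z means a wider circle and less deflection. The sketch below applies that formula; the function name, the voltage, and the field strength are arbitrary values chosen for illustration, and real instruments are considerably more complex.

```python
import math

DALTON_KG = 1.66053906660e-27           # one dalton in kilograms
ELEMENTARY_CHARGE_C = 1.602176634e-19   # charge of one proton in coulombs


def path_radius(mz: float, accel_voltage: float, b_field: float) -> float:
    """Radius in metres of an ion's circular path in a magnetic field.

    mz is the mass-to-charge ratio in daltons per elementary charge.
    Derived from q*V = m*v**2 / 2 and r = m*v / (q*B).
    """
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE_C  # kg per coulomb
    return math.sqrt(2 * accel_voltage * mass_per_charge) / b_field


# Assumed instrument settings: 3000 V accelerating voltage, 0.5 T magnet.
light = path_radius(500, 3000, 0.5)
heavy = path_radius(1000, 3000, 0.5)
print(heavy > light)  # True: the heavier ion curves less
```

Doubling m/z increases the radius by a factor of the square root of two, which is how ions with different m/z end up at different places on the detector.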
X-ray crystallography determines molecular structure by growing solid crystals of the protein under study and exposing them to X-rays; the resulting diffraction pattern is used to work out the positions of the atoms.
Cryo-electron microscopy images proteins that have been frozen at cryogenic temperatures. An electron beam bombards the sample, and the resulting image is the outcome of the interaction between the sample and the beam.
Because of the high cost of the aforementioned instruments, the growth of protein engineering is, unfortunately, limited.
The popularization of computational approaches to protein engineering is in part due to novel protein design algorithms, advancements in structural bioinformatics, and the growing availability of datasets about 3D protein structures. AlphaFold2, DeepMind’s computational model, shows great promise as a tool to predict protein structure.
A better understanding of protein folding and design will undoubtedly contribute to the development of protein engineering.
Thanks for reading A Primer on Protein Engineering! If you enjoyed my article or would like to connect, you can find me on LinkedIn.
Follow Bioeconomy.XYZ to learn more about all the ways biotech is shaping the world around us.