A reference guide to the core concepts in bioinformatics, bridging biology and computation.
The series of events that take place in a cell that cause it to divide into two daughter cells.
The ability of a cell to receive, process, and transmit signals, both with its environment and within itself.
The process where a cell changes from one cell type to another (e.g., a stem cell becoming a muscle cell).
Proteins are large biological macromolecules made up of one or more long chains of amino acid residues. Their function is largely determined by their 3D shape.
Enzymes are proteins that act as biological catalysts. Binding refers to the interaction between the enzyme and its substrate.
The physical contacts with high specificity established between two or more protein molecules.
Folding is the physical process by which a protein chain acquires its native 3-dimensional structure. Misfolding can lead to inactive or toxic proteins.
The flow of genetic information: DNA makes RNA, and RNA makes Protein.
A codon is a sequence of three consecutive DNA or RNA nucleotides that corresponds to a specific amino acid or to a stop signal during protein synthesis.
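To illustrate how codons map to amino acids, here is a minimal sketch that translates an RNA string three bases at a time. The `CODON_TABLE` is deliberately partial (a few entries from the standard genetic code, picked for the example), and `translateRna` is a hypothetical helper name, not a library function.

```typescript
// Partial codon table (standard genetic code). Only a few of the 64 codons
// are listed here, purely for illustration.
const CODON_TABLE: Record<string, string> = {
  AUG: "M", // Methionine (also the usual start codon)
  UUU: "F", UUC: "F", // Phenylalanine
  GGU: "G", GGC: "G", GGA: "G", GGG: "G", // Glycine
  GCU: "A", GCC: "A", GCA: "A", GCG: "A", // Alanine
  AAA: "K", AAG: "K", // Lysine
  UAA: "*", UAG: "*", UGA: "*", // Stop codons
};

// Hypothetical helper: reads codons starting at position 0, stops at the
// first stop codon, and writes "X" for codons missing from the partial table.
const translateRna = (rna: string): string => {
  let protein = "";
  for (let i = 0; i + 3 <= rna.length; i += 3) {
    const aminoAcid = CODON_TABLE[rna.slice(i, i + 3)] ?? "X";
    if (aminoAcid === "*") break;
    protein += aminoAcid;
  }
  return protein;
};

// Example
console.log(translateRna("AUGUUUGGCAAAUAA")); // MFGK
```

The sketch assumes the reading frame starts at the first base. Real sequences usually require locating the start codon first.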
The process by which information from a gene is used in the synthesis of a functional gene product.
Proteins produced by the immune system to identify and neutralize foreign objects like bacteria and viruses.
The immense variety of antibodies and T-cell receptors produced by the immune system to recognize a wide range of pathogens.
The complete set of genes or genetic material present in a cell or organism.
Methods used to determine the exact sequence of bases in a DNA molecule (e.g., Illumina, Nanopore).
Differences in DNA sequence between individuals or populations.
A statistic used in breeding and genetics that estimates the degree of variation in a phenotypic trait in a population that is due to genetic variation between individuals in that population.
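One common form of this statistic is broad-sense heritability, H² = V_G / V_P, where the phenotypic variance V_P is taken as genetic variance plus environmental variance. The sketch below assumes that simple additive split (no gene-environment interaction or covariance). The function name and the example variances are made up for illustration.

```typescript
// Broad-sense heritability: H^2 = V_G / (V_G + V_E).
// Assumption: phenotypic variance is just genetic plus environmental
// variance, with no interaction or covariance terms.
const broadSenseHeritability = (
  geneticVariance: number,
  environmentalVariance: number
): number => {
  if (geneticVariance < 0 || environmentalVariance < 0) {
    throw new Error("Variances cannot be negative");
  }
  const phenotypicVariance = geneticVariance + environmentalVariance;
  if (phenotypicVariance === 0) {
    throw new Error("Total phenotypic variance must be positive");
  }
  return geneticVariance / phenotypicVariance;
};

// Example (made-up variances)
console.log(broadSenseHeritability(30, 10)); // 0.75
```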
An observational study of a genome-wide set of genetic variants in different individuals to see if any variant is associated with a trait.
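For a single variant, one simple way to test association is a chi-square test on a 2x2 table of allele counts in cases versus controls. The sketch below shows that one test only. It is not a full GWAS pipeline, which would also handle quality control, covariates, and multiple-testing correction. The `AlleleCounts` shape, the function name, and the counts are hypothetical.

```typescript
// 2x2 allele-count table: cases/controls by alternate/reference allele.
interface AlleleCounts {
  caseAlt: number;
  caseRef: number;
  controlAlt: number;
  controlRef: number;
}

// Pearson chi-square statistic for a 2x2 table (1 degree of freedom):
// N * (ad - bc)^2 / (row1 * row2 * col1 * col2)
const allelicChiSquare = (counts: AlleleCounts): number => {
  const { caseAlt, caseRef, controlAlt, controlRef } = counts;
  const total = caseAlt + caseRef + controlAlt + controlRef;
  const denominator =
    (caseAlt + caseRef) *
    (controlAlt + controlRef) *
    (caseAlt + controlAlt) *
    (caseRef + controlRef);
  if (denominator === 0) {
    throw new Error("Every row and column needs at least one observation");
  }
  const crossDifference = caseAlt * controlRef - caseRef * controlAlt;
  return (total * crossDifference ** 2) / denominator;
};

// Example (made-up counts)
console.log(
  allelicChiSquare({ caseAlt: 60, caseRef: 40, controlAlt: 40, controlRef: 60 })
); // 8
```

A larger statistic means the allele frequencies differ more between cases and controls than chance alone would suggest.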
A collection of molecular regulators that interact with each other and with other substances in the cell to govern the gene expression levels of mRNA and proteins.
A linked series of chemical reactions occurring within a cell.
A system structure that causes output from one node to eventually influence input to that same node.
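If a network is modelled as a directed graph, a node is part of a feedback loop when following its outgoing edges eventually leads back to it. The sketch below checks this with a depth-first search. The `RegulatoryGraph` type, the function name, and the toy gene names are assumptions for illustration.

```typescript
// Adjacency list: each node maps to the nodes it directly influences.
type RegulatoryGraph = Record<string, string[]>;

// Returns true if some path of one or more edges leads from `start` back to `start`.
const isInFeedbackLoop = (graph: RegulatoryGraph, start: string): boolean => {
  const visited = new Set<string>();
  const stack: string[] = [...(graph[start] ?? [])];
  while (stack.length > 0) {
    const node = stack.pop() as string;
    if (node === start) return true;
    if (visited.has(node)) continue;
    visited.add(node);
    stack.push(...(graph[node] ?? []));
  }
  return false;
};

// Example (toy network): A -> B -> C -> A forms a loop; D only feeds into it.
const toyNetwork: RegulatoryGraph = {
  geneA: ["geneB"],
  geneB: ["geneC"],
  geneC: ["geneA"],
  geneD: ["geneA"],
};
console.log(isInFeedbackLoop(toyNetwork, "geneA")); // true
console.log(isInFeedbackLoop(toyNetwork, "geneD")); // false
```

This only detects the presence of a loop. It does not say whether the feedback is positive or negative, which would require signed edges.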
A way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity.
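One classic approach to alignment is dynamic programming over all prefixes of the two sequences (the Needleman-Wunsch global alignment scheme). The sketch below computes only the optimal score, not the alignment itself. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example.

```typescript
// Global alignment score via dynamic programming, keeping one row at a time.
// Default scores are illustrative assumptions, not a recommended scheme.
const globalAlignmentScore = (
  a: string,
  b: string,
  match = 1,
  mismatch = -1,
  gap = -1
): number => {
  // prev[j]: best score for aligning the processed prefix of `a` with b[0..j)
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j * gap);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i * gap];
    for (let j = 1; j <= b.length; j++) {
      const diagonal = prev[j - 1] + (a[i - 1] === b[j - 1] ? match : mismatch);
      const up = prev[j] + gap; // gap in b
      const left = curr[j - 1] + gap; // gap in a
      curr.push(Math.max(diagonal, up, left));
    }
    prev = curr;
  }
  return prev[b.length];
};

// Example
console.log(globalAlignmentScore("ACGT", "ACGT")); // 4
console.log(globalAlignmentScore("ACGT", "AGT")); // 2 (three matches, one gap)
```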
A diagram that represents the evolutionary relationships among organisms.
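In code, such a tree is often stored as nested nodes and exchanged as text in Newick notation, where parentheses group descendants. The sketch below serializes a toy tree. The `TreeNode` interface and `toNewick` helper are hypothetical names, and branch lengths are left out.

```typescript
// Minimal tree node: leaves have a name, internal nodes have children.
interface TreeNode {
  name?: string;
  children?: TreeNode[];
}

// Serializes a subtree in Newick style (without the trailing semicolon).
const toNewick = (node: TreeNode): string => {
  const label = node.name ?? "";
  if (!node.children || node.children.length === 0) return label;
  return `(${node.children.map(toNewick).join(",")})${label}`;
};

// Example (toy tree): Human and Chimp share a more recent ancestor than either does with Mouse.
const toyTree: TreeNode = {
  children: [
    { children: [{ name: "Human" }, { name: "Chimp" }] },
    { name: "Mouse" },
  ],
};
console.log(`${toNewick(toyTree)};`); // ((Human,Chimp),Mouse);
```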
Cancer is a disease caused by uncontrolled division of abnormal cells in a part of the body.
A genetic problem caused by one or more abnormalities in the genome.
Diseases caused by a combination of genetic, environmental, and lifestyle factors.
Bioinformatics relies on algorithms to process biological data. Here are examples of standard operations using TypeScript.
Transcription produces an RNA copy of a DNA sequence. Given the coding strand, this amounts to replacing every Thymine (T) with Uracil (U).
const transcribe = (dna: string): string => dna.replace(/T/g, "U");
// Example
const dna = "ATCGATCG";
console.log(transcribe(dna)); // AUCGAUCG
DNA is double-stranded. The reverse complement gives the sequence of the opposite strand, read in its own 5'-to-3' direction. Base pairs: A-T, C-G.
const reverseComplement = (dna: string): string => {
  // Complement lookup for uppercase bases; anything else passes through unchanged.
  const pairs: Record<string, string> = { A: "T", T: "A", C: "G", G: "C" };
  return dna
    .split("")
    .reverse()
    .map((base) => pairs[base] || base)
    .join("");
};
// Example
console.log(reverseComplement("ATCG")); // CGAT
The percentage of nitrogenous bases in a DNA or RNA molecule that are either Guanine (G) or Cytosine (C). Higher GC content generally indicates higher thermal stability of the double helix.
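const getGCContent = (sequence: string): number => {
  // An empty sequence has no bases; return 0 instead of NaN (0 / 0).
  if (sequence.length === 0) return 0;
  // Count G and C in either case.
  const matches = sequence.match(/[GCgc]/g) || [];
  return (matches.length / sequence.length) * 100;
};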
// Example
console.log(getGCContent("ATCG")); // 50
console.log(getGCContent("GGCC")); // 100
Counts the positions at which two equal-length sequences differ, i.e. the number of substitutions required to change one string into the other. Useful for finding candidate Single Nucleotide Polymorphisms (SNPs) between aligned sequences.
const hammingDistance = (seq1: string, seq2: string): number => {
  if (seq1.length !== seq2.length) throw new Error("Sequences must be equal length");
  return seq1.split("").reduce((acc, base, i) =>
    base !== seq2[i] ? acc + 1 : acc, 0
  );
};
// Example
console.log(hammingDistance("GAGCCTACTAACGGGAT", "CATCGTAATGACGGCCT")); // 7
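The distance alone says how many positions differ, not where. When looking for candidate SNPs it can help to list the mismatching positions as well. This is a minimal sketch under the same equal-length assumption. `mismatchPositions` is a hypothetical helper, and positions are 0-based.

```typescript
// Returns the 0-based indices at which two equal-length sequences differ.
const mismatchPositions = (seq1: string, seq2: string): number[] => {
  if (seq1.length !== seq2.length) {
    throw new Error("Sequences must be equal length");
  }
  const positions: number[] = [];
  for (let i = 0; i < seq1.length; i++) {
    if (seq1[i] !== seq2[i]) positions.push(i);
  }
  return positions;
};

// Example
console.log(mismatchPositions("GATTACA", "GACTATA")); // [ 2, 5 ]
```

The length of the returned array equals the Hamming distance of the two sequences.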