Date of Award


Document Type

Honors Thesis

Degree Name

Bachelor of Science



Advisor/Committee Chair

Keith Derbyshire



Committee Member

Todd Gray

Committee Member

Gabriele Fuchs


This study examines the roles of a subset of short open reading frames (sORFs) in Mycobacterium smegmatis using targeted mutagenesis and subsequent examination of the phenotypes associated with sORF mutation or overexpression. sORFs are defined as stretches of nucleic acid encoding a protein of at most 50 amino acids. Genome annotation pipelines overlook sORFs encoding small proteins (sproteins); as a result, bacterial sproteins have gone unnoticed, even those expressed at levels similar to larger, well-described proteins. Recent advances in ribosome profiling and mass spectrometry have identified hundreds of previously unannotated sORFs, increasing the number of annotated genes in both the M. tuberculosis and M. smegmatis genomes. Previously studied sproteins carry out diverse functions within the cell, ranging from modulating enzymatic activity to stress response signaling. In the present study, annotated sORFs were inspected using a JBrowse genome viewer that displays RNA-seq and Ribo-seq data mapped to mycobacterial reference genomes, in order to determine active gene boundaries. The sORFs chosen for study are thought to be physiologically relevant based on their high expression level and their conservation within the M. smegmatis genome and across other mycobacterial species. Using targeted mutagenesis protocols, mutant sORFs were created and then assayed for reproducible phenotypic effects with a variety of molecular and physical assays. These assays aim to address the function of the synthesized sprotein itself and to examine its potential to participate in fundamental cellular processes. Studying sORFs and their encoded sproteins has the potential to provide insight into how they modulate biological functions and to identify novel functions that were not previously considered because of the proteins' seemingly negligible size.

Included in

Biology Commons