Similarity matrix of a wild-type versus a mutated version of a protein. Diagonal squares represent low similarities and therefore domain boundaries.
QGen Lab
• Paid research position in bioinformatics
• Using Evolutionary Scale Modeling (ESM) to tokenize the primary structures of proteins
• First-author project creating an ensemble model, RepIBP, able to predict protein origin and function with state-of-the-art performance and without the need for resource-intensive methods like multiple sequence alignment
• Leading development and performance optimization of an LSTM model that predicts protein domain regions given a chain of amino acids; testing on a benchmark dataset of ~5000 proteins