My project is a cancer drug response model that predicts the effect of a cancer drug on cell lines, xenografts, and patient samples. By learning from known data on how effective cancer drugs are against cancer cell lines, the model can aid drug repurposing and development efforts. The model also correctly predicted several hundred clinical outcomes, indicating its potential for clinical use. I was inspired to pursue cancer research and target the drug development pipeline by family members who struggled with cancer and by seeing firsthand the high price of specialized drugs.
This bioinformatics project centered on understanding biological data and manipulating it with computer science and mathematical algorithms. My experience in the wet lab allowed me to conceptualize the project’s framework with relative ease, but I ran into many roadblocks when transitioning to the mathematical and computer science implementation. Because I started with a limited computer science background, the project proved challenging: I needed to understand very complex algorithms while still building a foundation. This led to several simple errors while writing complex preprocessing code, and I remember spending weeks banging my head against the wall trying to understand TensorFlow, a software library for machine learning. Ultimately, this experience proved valuable because I learned how to explore a subject in both depth and breadth.
Returning to the project’s origins, I focused on this endeavor during the pandemic as an independent project, but I was able to attempt such a difficult project only because of the support of various mentors. Since freshman year, I have been part of the Burlingame Lab. Under the mentorship of Dr. Nancy Phillips, I learned how to critically read scientific papers, conduct strong, replicable research, and view scientific processes at both the molecular level and the cell-wide level. In my senior year, Jason Maynard helped me build a project from the ground up and learn about the various validation steps necessary for a biology project. This opportunity was granted by Al Burlingame, who allowed me to learn in a professional setting at just fifteen. In high school, Nicholas Jackson, my English teacher, helped me improve as a scientific writer, and the San Ramon Valley High School administration under Jason Krolikowski, Ann-Marie Walters, and Kirsten Drake created a flexible schedule to help me pursue research during the school year.
I started this project to improve the drug development pipeline by addressing the environmental impact, high cost, and long timeframe of drug development. The project has evolved to address these pain points from multiple perspectives. With regard to drug repurposing, the model has identified, and continues to identify, several FDA-approved drugs for repurposing in cancer treatment. Because these drugs have already gone through Phase 1 safety testing, repurposing them allows for a faster timeline and lower cost compared to traditional drug development. Similarly, the model can screen new drug candidates by providing an estimated effectiveness on various cancer cell lines, reducing the environmental footprint and high cost of manually validating each candidate. After training, the model requires only two pieces of data to estimate effectiveness: a drug structure and a gene expression profile. Thus, the model can also predict the effectiveness of a cancer drug on patient tissue samples and patient-derived xenografts. Initial results are promising, and with more data, this project can help doctors assess the best course of treatment for cancer patients based on their individual gene expression profiles, contributing to the dream of personalized medicine.