Keywords: Inert DNA sequences; Transcription factor binding; Gene editing; Clustering
Abstract:
It has become increasingly important to identify DNA sequences that are inert, meaning that they do not bind any transcription factor protein. This is especially important in gene editing, where it is useful to replace problematic DNA sequences without accidentally introducing new binding sites into the genome. Here, I used techniques such as clustering to identify DNA sequences in Protein Binding Microarray data that appear to bind transcription factor proteins relatively rarely. I then determined where those sequences occur most often in the genome and investigated the biological significance of those regions. I expected to see low overlap between the regions I selected and known transcription factor binding sites, but I observed the opposite. In the future, this phenomenon should be explored further to determine why inert DNA sequences appear to occur at a higher incidence within transcription factor binding sites.