3.3 Discussion
Our analysis of rearrangements between human and 12 diverse species shows that TADs are largely stable units of genomes, which are often reshuffled as a whole rather than disrupted by rearrangements. Furthermore, the decreased expression correlation between human and mouse orthologs in rearranged TADs shows that disruptions of TADs are associated with changes in gene regulation over large evolutionary time scales.
TADs exert their influence on gene expression regulation by determining the set of possible interactions of cis-regulatory sequences with their target promoters (Nora et al. 2012; Symmons et al. 2014; Schoenfelder et al. 2015). This might facilitate the cooperation of several sequences that is often needed for the complex spatiotemporal regulation of transcription (Andrey and Mundlos 2017). The disruption of these enclosed regulatory environments enables the recruitment of other cis-regulatory sequences and might prevent formerly established interactions (Montavon et al. 2012). The detrimental effects of such events have been shown in the study of diseases (Redin et al. 2017; Zepeda-Mendoza et al. 2017). There are also instances where pathogenic phenotypes could be specifically attributed to enhancers establishing contacts with promoters that were formerly out of reach because of intervening TAD boundaries (Ibn-Salem et al. 2014; Lupiáñez et al. 2015; Spielmann et al. 2012). This would explain the selective pressure to maintain TAD integrity over large evolutionary distances and why we observe higher gene expression conservation for human genes within TADs compared to genes outside TADs.
Disruptions of TADs by large-scale rearrangements change expression patterns of orthologs across tissues and these changes might be explained by the altered regulatory environment which genes are exposed to after rearrangement (Farré et al. 2015).
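To make the notion of expression correlation across tissues concrete, the following minimal sketch computes a Pearson correlation between the expression profiles of one hypothetical ortholog pair over matched tissues. The choice of Pearson correlation, the function name `pearson`, and all expression values are assumptions made for illustration; they are not taken from the analysis described here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and of squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a flat profile
    return cov / sqrt(var_x * var_y)


# Made-up expression values in four matched tissues (illustration only)
human_expr = [1.0, 2.0, 3.0, 4.0]
mouse_expr_conserved = [2.0, 4.0, 6.0, 8.0]   # same tissue pattern
mouse_expr_rearranged = [4.0, 1.0, 3.0, 2.0]  # diverged tissue pattern

r_conserved = pearson(human_expr, mouse_expr_conserved)
r_rearranged = pearson(human_expr, mouse_expr_rearranged)
print(r_conserved, r_rearranged)
```

In this toy case the ortholog with a preserved tissue pattern has a correlation of 1, whereas the diverged profile has a negative correlation. Comparing the distributions of such per-gene values between conserved and rearranged TADs is one way to quantify the divergence discussed above.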
Our results are largely consistent with the reported finding that many TADs correspond to clusters of conserved non-coding elements (GRBs) (Harmston et al. 2017). We observe a strong depletion of evolutionary rearrangements in GRBs and an enrichment at GRB boundaries. This is consistent with comparative genome analyses revealing that GRBs largely overlap with micro-syntenic blocks in Drosophila (Engström et al. 2007) and fish genomes (Dimitrieva and Bucher 2013). However, over 60% of human hESC TADs do not overlap GRBs (Harmston et al. 2017), raising the question of whether only a small subset of TADs is conserved. Interestingly, we also find a depletion of rearrangements in non-GRB TADs. This indicates that our rearrangement analysis also identifies conservation for TADs that are not enriched for CNEs. The high expression correlation of orthologs in conserved TADs suggests that the maintenance of expression regulation is important for most genes and probably even more crucial for developmental genes, which are frequently found in GRBs.
Previous work using comparative Hi-C analysis in four mammals revealed that insulation of TAD boundaries is robustly conserved at syntenic regions, illustrating this with a few examples of rearrangements between mouse and dog genomes, which were located in both species at TAD boundaries (Vietri Rudan et al. 2015). The results of our analysis of thousands of rearrangements between human and 12 other species confirmed and expanded these earlier observations.
The reliable identification of evolutionary genomic rearrangements is difficult. Especially for non-coding genomic features like TAD boundaries, it is important to use approaches that are not biased towards coding sequence. Previous studies identified rearrangements by interrupted adjacency of orthologous genes between two organisms (Vietri Rudan et al. 2015; Pevzner and Tesler 2003). However, such an approach assumes equal intergenic distances, an assumption that is violated at TAD boundaries, which in general have higher gene density (Dixon et al. 2012; Hou et al. 2012). To avoid this bias, we used whole-genome alignments. However, the low assembly quality of some genomes might introduce alignment problems and, potentially, false-positive rearrangement breakpoints.
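The gene-adjacency approach used in previous studies can be illustrated with a minimal sketch. It reports gene pairs that are neighbors in one genome but not in the other. The function name `adjacency_breakpoints`, the input layout, and the toy gene orders are assumptions made for illustration and do not reproduce any published implementation.

```python
def adjacency_breakpoints(order_a, order_b):
    """Return gene pairs adjacent in genome A but not adjacent in genome B.

    order_a, order_b: lists of (chromosome, ortholog_id) tuples sorted by
    genomic position; both genomes use the same ortholog identifiers.
    """
    # Unordered neighbor pairs on the same chromosome in genome B
    adjacent_in_b = set()
    for (chrom1, gene1), (chrom2, gene2) in zip(order_b, order_b[1:]):
        if chrom1 == chrom2:
            adjacent_in_b.add(frozenset((gene1, gene2)))

    # A neighbor pair in genome A that is lost in genome B marks a breakpoint
    breakpoints = []
    for (chrom1, gene1), (chrom2, gene2) in zip(order_a, order_a[1:]):
        if chrom1 == chrom2 and frozenset((gene1, gene2)) not in adjacent_in_b:
            breakpoints.append((gene1, gene2))
    return breakpoints


# Toy gene orders (hypothetical): genes C and D were moved and inverted
human = [("chr1", "A"), ("chr1", "B"), ("chr1", "C"), ("chr1", "D")]
mouse = [("chr2", "A"), ("chr2", "B"), ("chr5", "D"), ("chr5", "C")]

breaks = adjacency_breakpoints(human, mouse)
print(breaks)
```

Such a method can only place a breakpoint somewhere in the interval between two genes, so its resolution depends on the local gene density. This is the source of the bias described above and the reason for preferring whole-genome alignments.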
Rearrangements are created by DNA double-strand breaks (DSBs), which are not uniformly distributed in the genome. Certain genomic features, such as open chromatin, active transcription and certain histone marks, have been shown to be enriched at DSBs in somatic translocation sites (Roukos and Misteli 2014) and evolutionary rearrangements (Murphy et al. 2005; Hinsch and Hannenhalli 2006). Furthermore, induced DSBs and somatic translocation breakpoints are enriched at chromatin loop anchors (Canela et al. 2017). This raises the question of whether our finding of significantly enriched evolutionary rearrangement breakpoints at TAD boundaries could be explained by the molecular properties of the chromatin at TAD boundaries, rather than by the selective pressure to keep TAD function. Although we cannot distinguish the two explanations entirely, our gene expression analysis indicates stronger conservation of gene expression in conserved TADs and more divergent expression patterns in rearranged TADs. This supports a model in which disruptions of TADs are most often disadvantageous for an organism. Structural variations disrupting TADs can lead to misregulation of neighboring genes, as shown for genetic diseases (Ibn-Salem et al. 2014; Lupiáñez et al. 2015; Redin et al. 2017; Franke et al. 2016) and cancers (Hnisz et al. 2016b; Northcott et al. 2014; Weischenfeldt et al. 2016).
Interestingly, we observed higher gene expression conservation for human genes within TADs compared to genes outside TADs. The larger syntenic structure of TADs might conserve regulation, likely by maintaining the proximity of promoters and cis-regulatory sequences, while genes outside such frameworks are more exposed to changing genomic landscapes, presumably resulting in a greater susceptibility to the recruitment of regulatory sequences.
Apart from the described detrimental effects, our results suggest that TAD rearrangements occurred between the genomes of human and mouse and led to changes in the expression patterns of many orthologous genes. Since this is likely attributable to changing regulatory environments, it is also conceivable that some rearrangements led to a gain of function. Hence, TAD rearrangements might also provide a vehicle for evolutionary innovation. A single TAD reorganization has the potential to affect the regulation of a whole set of genes, in contrast to the more confined consequences of other types of mutations (Acemel et al. 2017). Since it is also believed that changes in cis-regulatory sequences of developmental genes play a major part in evolutionary innovation (Carroll 2008), the development of the enormous diversity of animal traits in evolution might have been promoted by the rearrangement of structural domains. This is consistent with a model in which new genes can arise by tandem duplication and are then relocated to other environments during evolution (Ibn-Salem et al. 2017). These changes might have facilitated significant leaps in morphological evolution, explaining the emergence of features that could not appear in small gradual steps. Following this hypothesis, TADs would not only constitute structural entities that perform the function of maintaining an enclosed regulatory landscape but could also be a driving force for change by exposing many genes at once to different genomic environments following single events of genomic rearrangement.