1.8 Aims of this thesis

Recent advances in genome-wide proximity ligation mapping revealed that interphase chromosomes are highly organized and structurally segregate into TADs. TADs have already been shown to be associated with diverse genomic functions such as histone modifications (Dixon et al. 2012; Sexton et al. 2012), replication timing (Pope et al. 2014), and gene expression correlation (Nora et al. 2012; Le Dily et al. 2014). We previously showed a significant association of TAD disruptions by large chromosomal deletions with clinical phenotypes, likely caused by an enhancer adoption mechanism (Ibn-Salem et al. 2014). However, it is still unclear how exactly the genome folds into TADs and what consequences this folding has for gene regulation during evolution and in genetic diseases. For example, it is not clear to what extent genes within the same TAD are expressed and regulated in a coordinated manner. Despite initial evidence of evolutionary conservation of TADs in homologous regions between human and mouse (Dixon et al. 2012; Vietri Rudan et al. 2015), there has been no systematic analysis of the stability of TADs during evolution. Furthermore, it is unclear whether TAD disruptions during evolution correlate with gene expression divergence between species. The accumulating evidence of the critical gene regulatory function of TADs raises the question of whether TADs play an essential role in genetic diseases (Spielmann and Mundlos 2016). More specifically, TADs might be used together with other genomic annotations, such as enhancers and their interactions with regulated genes, to interpret structural variations in patient genomes. Finally, an increasing molecular and mechanistic understanding of chromatin looping and TAD formation could be used to improve genome-wide contact maps and to computationally predict long-range interactions in diverse tissues and conditions.

Therefore, this thesis addresses the question of whether TADs represent only structural units of genomes or also essential functional building blocks in which genes can be regulated in a coordinated manner. By computationally integrating genome-wide chromatin interaction maps with diverse genomic datasets, including sequence conservation, enhancer activity, protein binding, gene expression, and clinical phenotypes, this work addresses the following questions:

Is the three-dimensional folding structure of genomes associated with co-regulation of functionally related genes?

  • How are paralogous genes distributed in the linear genome and in the three-dimensional genome architecture?
  • Are paralogs co-regulated with shared enhancers and located in the same TAD in genomes of human and other species?
  • What can we learn about the evolutionary history of genes and how they are created within the regulatory environments of TADs?

Are TADs functional building blocks of genomes that are subject to selective pressure during evolution?

  • Are human TAD regions conserved during evolution or disrupted by rearrangements when compared to other vertebrate genomes?
  • Do genes within TADs have more conserved expression profiles across different tissues?
  • Are disruptions of TADs during evolution associated with changes in gene expression profiles?

Can clinical phenotypes be explained by rearrangements affecting TADs and the regulation of relevant genes?

  • Can TADs be used to interpret gene regulatory effects of balanced chromosomal rearrangements in whole-genome sequenced patients?
  • How can we quantify the similarity of phenotypes observed in patients and phenotypes associated with genes to prioritize candidate genes?
  • Can we provide a computational tool that integrates functional genomic elements, chromatin interaction data, and TADs with patient phenotype data to predict the pathomechanisms of structural variations?

Can protein binding profiles and genomic sequence features predict chromatin looping interactions?

  • Does the cross-linking effect in ChIP-seq experiments provide characteristic signals at interacting chromatin loop anchors?
  • Does the genomic sequence encode features that are predictive of long-range chromatin interactions?
  • Can we provide a computational method to predict chromatin looping interactions in specific cell types and conditions of interest?
  • Which transcription factors are most predictive of, and possibly functionally involved in, chromatin looping?