{"paper_id":"13fde5d8-350e-4d2e-bf93-0ff0c61722fb","body_text":"ScHiCAtt: Enhancing Single-Cell Hi-C Resolution Using Attention-Based Models\nRohit Menon 1, H. M. A. Mohit Chowdhury 1, and Oluwatosin Oluwadare 1,2,*\n1 Department of Computer Science, University of Colorado at Colorado Springs, Colorado Springs, 80918, USA\n2 Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, 80045, USA\nAbstract\nThe spatial organization of chromatin is fundamental to gene regulation and essential for proper cellular function. The Hi-C technique remains the leading method for unraveling 3D genome structures; however, limited resolution, data sparsity, and incomplete coverage in single-cell Hi-C data pose significant challenges for comprehensive analysis. Traditional CNN-based models often suffer from blurring and loss of fine details, while GAN-based methods encounter difficulties in maintaining diversity and generalization. Moreover, existing algorithms perform poorly in cross-cell line generalization, where a model trained on one cell type is used to enhance high-resolution data in another cell type. To address these limitations, we propose ScHiCAtt (Single-cell Hi-C Attention-Based Model), which leverages attention mechanisms to capture both long-range and local dependencies in Hi-C data, significantly enhancing resolution while preserving biologically meaningful interactions. We implement this mechanism and check its validity on data from different cells of the same organism and on data from different organisms. By dynamically focusing on regions of interest, attention mechanisms effectively mitigate data sparsity and enhance model performance in low-resolution contexts. 
Extensive experiments on Human and Drosophila single-cell Hi-C data demonstrate that ScHiCAtt consistently outperforms existing methods in terms of computational and biological reproducibility metrics across different downsampling ratios, especially under extreme downsampling conditions. The model is publicly available at https://github.com/OluwadareLab/ScHiCAtt.\nKeywords: Hi-C data, Self-Attention, Resolution Enhancement, Single-cell Hi-C, Data Sparsity\n1 Introduction\nThe three-dimensional (3D) conformation of chromosomes is crucial for elucidating genomic processes within the nuclei of eukaryotic cells. The Hi-C technique facilitates an all-versus-all mapping of chromosomal fragment interactions, resulting in an n × n interaction frequency contact matrix, where n is the number of fragments in a chromosome or genome at a specific resolution (Lieberman-Aiden et al., 2009). These Hi-C data are critical for numerous algorithms designed to improve the understanding of genome organization (Oluwadare et al., 2019). A major challenge in this field is the scarcity of high-resolution Hi-C data, which are indispensable for identifying intricate genomic topologies such as enhancer-promoter interactions and subdomains.\nTo address this need, deep learning models have been employed to predict high-resolution data from low-resolution data with remarkable accuracy. Notable models in this area include HiCPlus (Y. Zhang et al., 2018), HiCNN (T. Liu and Z. Wang, 2019a), hicGAN (Q. Liu et al., 2019), Boost-HiC (Carron et al., 2019), HiCSR (Dimmick, 2020), SRHiC (Z. Li and Dai, 2020), HiCNN2 (T. Liu and Z. Wang, 2019b), HiCARN (Hicks and Oluwadare, 2022), and DeepHiC (Hong et al., 2020). These models leverage various network architectures such as Convolutional Neural Networks (CNNs), Autoencoders, and Generative Adversarial Networks (GANs). 
Despite the advancements made by these models, there remains considerable room for improvement, especially when it comes to single-cell Hi-C data enhancement (Y. Wang et al., 2023), as all of the aforementioned methods are designed for bulk Hi-C data enhancement.\nbioRxiv preprint doi: https://doi.org/10.1101/2024.12.16.628505; this version posted December 20, 2024. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.\nSingle-cell Hi-C (scHi-C) is a groundbreaking technology that offers a unique opportunity to investigate 3D genome structures at the single-cell level with high resolution (Galitsyna and Gelfand, 2021). By capturing chromatin interactions at the individual cell level, scHi-C enables the exploration of cellular heterogeneity in chromatin conformation (Arrastia et al., 2020; Collombet et al., 2020; Payne et al., 2021). However, scHi-C data are characterized by high dimensionality, noise, and sparsity, presenting computational challenges that demand innovative solutions for the accurate reconstruction of 3D genome structures (Paulsen et al., 2015; Galitsyna and Gelfand, 2021). Therefore, scHi-C data imputation is crucial, as it enables the reconstruction of enhanced contact maps from raw and sparse scHi-C data, thereby improving the quality for downstream analyses, including the reconstruction of chromatin organization at the single-cell level. This enhancement aids in uncovering cell-to-cell variability and heterogeneity, ultimately providing deeper insights into cellular functions and disease mechanisms (Y. Wang et al., 2023).\nRecently, algorithms like ScHiCEDRN (Y. Wang et al., 2023) and Loopenhance (S. Zhang et al., 2022) have been developed to address the challenges of scHi-C data enhancement. 
While these methods aim to improve the resolution of single-cell Hi-C data, they often fall short in capturing the complex spatial relationships within chromatin structures, especially long-range dependencies. This limitation leads to the loss of critical interactions, which are essential for accurately reconstructing chromatin topology.\nOn the other hand, attention mechanisms have proven effective in capturing both short-range and long-range dependencies in various domains, such as natural language processing and computer vision (Vaswani, 2017). These mechanisms enable models to focus on different regions of the input data dynamically; hence, they have the potential to enhance the resolution of sparse datasets like scHi-C by capturing context at multiple scales. The motivation behind our work is to leverage attention mechanisms to address challenges unique to scHi-C data, such as sparsity, noise, and limited coverage. By selectively focusing on relevant chromatin interactions, our approach aims to provide a more biologically meaningful reconstruction of 3D genome structures.\nIn this work, we propose ScHiCAtt, which employs a cascading residual network integrated with an optimal attention mechanism identified through validation across multiple candidates. ScHiCAtt explores different attention mechanisms, such as self-attention, local attention, global attention, and dynamic attention (Attention-in-Attention), selecting the optimal mechanism for each layer during training to determine the best attention mechanism to incorporate for scHi-C data enhancement. The goal of this experimentation is to allow ScHiCAtt to capture both short-range and long-range dependencies adaptively, thus enhancing the quality of scHi-C data reconstruction. 
\nThrough comprehensive experiments on human and Drosophila data across various downsampling rates, we demonstrate that ScHiCAtt significantly improves the resolution of scHi-C data. Our results show superior performance in terms of computational metrics and biological reproducibility metrics, such as GenomeDISCO (Ursu et al., 2018), compared to existing methods, particularly under extreme downsampling conditions. Moreover, ScHiCAtt maintains efficient training times, making it a robust solution for high-resolution single-cell Hi-C data enhancement.\n2 Materials and Methods\n2.1 Model Architecture\nOur model architecture starts with an entry convolution layer (Figure 1A) that processes the input raw scHi-C contact map. This is followed by a series of cascading blocks interleaved with attention layers, designed to progressively upscale the resolution of the Hi-C maps. The final high-resolution Hi-C maps are produced through an exit convolution layer. The architecture also includes tunable hyperparameters such as the number of cascading blocks and attention layers, allowing for flexibility in optimizing the model’s performance.\nIn the following subsections, we explore various attention mechanisms that have been considered in our study. We describe each mechanism in detail, highlighting its unique features and the rationale behind its selection for our research. Furthermore, we elucidate how these mechanisms were implemented within our architecture for evaluation.\n2.1.1 Self-Attention Mechanism\nThe self-attention mechanism in our architecture (Figure 1A) facilitates efficient learning of both local and global chromatin interactions by allowing the model to dynamically assign weights to relationships between chromatin loci, regardless of their spatial distance on the Hi-C contact maps. 
This capability is crucial for capturing both short-range and long-range dependencies within chromatin structures.\nTo achieve this, the attention scores are computed as the dot product between the input projection matrices, queries (H) and keys (J), divided by the square root of the key dimension. The resulting attention scores are passed through a softmax function to compute the attention weights, which are then applied to the values (Z). This enables the model to prioritize important interactions, enhancing the quality of the predicted high-resolution contact maps.\nThe process is defined as:\n$$A(H, J, Z) = \\mathrm{Softmax}\\left(\\frac{H \\cdot J^{T}}{\\sqrt{d_j}}\\right) \\cdot Z \\tag{1}$$\nHere, $H \\in \\mathbb{R}^{n \\times d}$, $J \\in \\mathbb{R}^{n \\times d}$, and $Z \\in \\mathbb{R}^{n \\times d}$ represent the query, key, and value matrices, respectively, where $n$ is the sequence length (number of loci in the Hi-C contact map) and $d$ is the feature dimension. The term $d_j$ is the dimension of the keys (i.e., $d_j = d$), used to scale the dot product and stabilize the training process. This mechanism enables the model to focus on critical chromatin interactions, significantly improving prediction accuracy.\n2.1.2 Cascading Residual Blocks\nThe backbone of our architecture is the cascading residual blocks (Figure 1B; Ahn et al., 2018). Each block comprises residual units with skip connections that progressively refine the Hi-C contact maps. These cascading blocks are interconnected, allowing for the aggregation of features across different layers. 
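As a minimal sketch of Equation (1), the scaled dot-product attention can be written in NumPy as follows. This is an illustration under stated assumptions rather than the ScHiCAtt implementation: the function names, the toy sizes n = 6 and d = 4, and the random inputs are hypothetical, and the learned projections that produce H, J, and Z are omitted.

```python
import numpy as np

def softmax(scores, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = scores - scores.max(axis=axis, keepdims=True)
    weights = np.exp(shifted)
    return weights / weights.sum(axis=axis, keepdims=True)

def self_attention(H, J, Z):
    # H, J, Z: (n, d) query, key, and value matrices, named as in Equation (1).
    d_j = J.shape[-1]                    # key dimension used for scaling
    scores = H @ J.T / np.sqrt(d_j)      # (n, n) pairwise scores between loci
    weights = softmax(scores, axis=-1)   # each row is a distribution over loci
    return weights @ Z, weights

# Hypothetical toy inputs: n = 6 loci with d = 4 features each.
rng = np.random.default_rng(0)
n, d = 6, 4
H = rng.normal(size=(n, d))
J = rng.normal(size=(n, d))
Z = rng.normal(size=(n, d))
out, weights = self_attention(H, J, Z)
```

Each row of weights sums to 1, so every output row is a weighted average of the value rows, which is what lets a locus draw on distant loci as easily as on adjacent ones.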
\n2.1.3 Local Attention Mechanism\nLocal attention is applied within the cascading residual blocks (Figure 1B). It focuses on capturing fine-grained chromatin interactions within localized regions of the Hi-C contact maps. The use of depthwise and pointwise convolutions in the local attention mechanism allows the model to enhance the spatial resolution of the Hi-C maps by emphasizing intricate local details.\n$$\\mathrm{LocalAttention}(x_i) = \\sum_{j=i-w}^{i+w} \\alpha_{ij} x_j \\tag{2}$$\nwhere $\\alpha_{ij} = \\frac{\\exp(e_{ij})}{\\sum_{k=i-w}^{i+w} \\exp(e_{ik})}$ and $e_{ij} = (x_i W_Q)(x_j W_K)^{T}$. Here, $x_i$ is the input at position $i$, $w$ is the window size defining the local neighborhood, and $W_Q$ and $W_K$ are learnable weight matrices for queries and keys, respectively.\n2.1.4 Global Attention Mechanism\nThe global attention mechanism is applied after several cascading residual blocks (Figure 1B) to ensure that global chromatin structures are preserved. This module aggregates context across the entire Hi-C map and allows the model to capture large-scale genomic interactions, which are critical for accurate super-resolution (Zhu et al., 2021).\n$$\\mathrm{GlobalAttention}(x) = \\mathrm{Softmax}\\left(\\frac{Q K^{T}}{\\sqrt{d}}\\right) V \\tag{3}$$\nwhere $Q = x W_Q$, $K = x W_K$, $V = x W_V$, and $W_Q$, $W_K$, $W_V$ are learnable weight matrices for queries, keys, and values, respectively.\n2.1.5 Multi-Head Attention Mechanism\nThe multi-head attention mechanism is designed to enhance the model’s ability to capture complex relationships in the input data by dividing the input into multiple attention heads. Each head performs attention operations independently, focusing on different aspects of the input, which allows the model to extract diverse contextual information.\nThe mechanism takes three primary inputs: the Query (Q), Key (K), and Value (V) matrices. These inputs are derived from the original data through linear transformations. 
The attention operation for each head calculates a weighted representation of the Value matrix, where the weights are determined by the similarity between the Query and Key matrices. This is expressed mathematically as:\n$$\\mathrm{Attention}(Q, K, V) = \\mathrm{softmax}\\left(\\frac{Q K^{T}}{\\sqrt{d_k}}\\right) V \\tag{4}$$\nHere, $d_k$ is the dimensionality of the Key matrix, and the softmax function ensures that the weights sum to 1, highlighting the most relevant features for each Query.\nFor multi-head attention, the inputs are split into $h$ separate heads, each with its own $Q$, $K$, and $V$. The outputs from all heads are concatenated and passed through a final linear transformation, as shown in the equation below:\n$$\\mathrm{MultiHead}(Q, K, V) = \\mathrm{Concat}(\\mathrm{head}_1, \\mathrm{head}_2, \\ldots, \\mathrm{head}_h) W_O \\tag{5}$$\nIn this equation, $\\mathrm{head}_i$ represents the output of the $i$-th attention head, and $W_O$ is the learned weight matrix for the final linear transformation. This design allows the model to integrate information from multiple perspectives, improving its ability to capture chromatin interactions and other complex patterns.\n2.1.6 Dynamic Attention Mechanism\nDynamic Attention, also referred to as the Attention-in-Attention (A2A) mechanism, combines static and dynamic attention features to weigh their contributions adaptively. The dynamic attention module applies global pooling, followed by fully connected layers, to dynamically adjust the contributions of the features computed with and without attention (Huang et al., 2019). 
\n$$\\mathrm{A2A}(x) = w_{\\text{non-att}} \\cdot \\mathrm{NonAttention}(x) + w_{\\text{att}} \\cdot \\mathrm{AttentionBranch}(x) \\tag{6}$$\n2.2 Loss Function\nTo optimize the quality of the enhanced scHi-C contact matrices, we leverage several key loss functions that address distinct aspects of the reconstruction process. These loss functions ensure that the generated matrices not only minimize pixel-wise error with respect to the target but also maintain structural integrity and visual consistency.\n2.2.1 Mean Squared Error (MSE)\nThe goal is to minimize the pixel-wise difference between the true and enhanced scHi-C matrices, ensuring that the generated maps closely approximate the true scHi-C data.\n$$L_{MSE} = \\frac{1}{N} \\sum_{i=1}^{N} (Y_i - \\hat{Y}_i)^2 \\tag{7}$$\nIn this equation:\n• $N$: The total number of data points or pixels in the scHi-C matrices.\n• $Y_i$: The true value of the $i$-th pixel in the scHi-C matrix.\n• $\\hat{Y}_i$: The predicted value of the $i$-th pixel in the enhanced scHi-C matrix.\n• $L_{MSE}$: The computed Mean Squared Error, representing the average of the squared differences between the true and predicted values.\nThis loss function penalizes larger deviations more heavily due to the squaring operation, encouraging the model to generate outputs that closely match the true data.\n2.2.2 Perceptual Loss\nPerceptual loss, based on feature representations from a pre-trained VGG network (Wu et al., 2020), ensures that the generated Hi-C maps are not only pixel-accurate but also visually consistent with the real Hi-C data.\nIn the perceptual loss $L_{VGG}$, we utilize the feature maps from specific layers of the pre-trained VGG network:\n$$L_{VGG} = \\frac{1}{N} \\sum_{i=1}^{N} \\sum_{\\ell} \\left\\| \\phi_{\\ell}(Y_i) - \\phi_{\\ell}(\\hat{Y}_i) \\right\\|^{2} \\tag{8}$$\nwhere $\\phi_{\\ell}(\\cdot)$ denotes the feature map extracted from the $\\ell$-th layer of the VGG network.\n2.2.3 Total Variation (TV) Loss\nTV loss reduces noise and enforces smoothness in the generated Hi-C maps, improving the overall visual quality.\n$$L_{TV} = \\frac{2\\psi (h_{TV} + w_{TV})}{F} \\tag{9}$$\n2.2.4 Adversarial Loss (AD)\nAdversarial loss improves the realism of the generated high-resolution Hi-C maps by ensuring that the discriminator cannot easily distinguish between real and generated matrices.\n$$L_{AD} = 1 - \\frac{1}{N} \\sum_{i=1}^{N} D(\\hat{Y}_i) \\tag{10}$$\n2.3 Evaluation Metrics\nTo evaluate the effectiveness of our models in enhancing the resolution of scHi-C data, we used standard metrics that assess the quality of the reconstructed contact maps from different perspectives. They can broadly be categorized as computational metrics, such as the Structural Similarity Index Measure, Peak Signal-to-Noise Ratio, and Signal-to-Noise Ratio, and biological reproducibility metrics, such as GenomeDISCO (Ursu et al., 2018).\n2.3.1 Structural Similarity Index\nStructural Similarity Index Measure (SSIM) quantifies the structural similarities between the true and enhanced scHi-C matrices.\nSSIM is defined as,\n$$\\mathrm{SSIM}(x, y) = \\frac{(2\\mu_x \\mu_y + C_1)(2\\sigma_{xy} + C_2)}{(\\mu_x^2 + \\mu_y^2 + C_1)(\\sigma_x^2 + \\sigma_y^2 + C_2)} \\tag{11}$$\nHere, $\\mu_x$ and $\\mu_y$ are the means of $x$ and $y$, $\\sigma_x^2$ and $\\sigma_y^2$ are the variances, $\\sigma_{xy}$ is the covariance between $x$ and $y$, and $C_1$ and $C_2$ are constants to stabilize the division when the denominator is close to zero.\n2.3.2 Peak Signal-to-Noise Ratio\nAs the name states, Peak Signal-to-Noise Ratio (PSNR) quantifies the ratio between the maximum achievable signal and the noise that distorts it. 
\nPSNR is defined as\n$$\\mathrm{PSNR} = 20 \\cdot \\log_{10}\\left(\\frac{\\mathrm{MAX}_I}{\\sqrt{\\mathrm{MSE}}}\\right) \\tag{12}$$\nIn this equation:\n• PSNR: Peak Signal-to-Noise Ratio, a metric to measure the quality of the enhanced image.\n• $\\mathrm{MAX}_I$: The maximum possible pixel value of the image (e.g., 255 for 8-bit images).\n• MSE: Mean Squared Error between the original and enhanced images.\n• $\\log_{10}$: The base-10 logarithm.\n2.3.3 Mean Squared Error\nMean Squared Error (MSE) calculates the average squared difference between the predicted and true values.\nMean Squared Error is defined as,\n$$\\mathrm{MSE} = \\frac{1}{N} \\sum_{i=1}^{N} (x_i - y_i)^2 \\tag{13}$$\n2.3.4 Signal-to-Noise Ratio\nSignal-to-Noise Ratio (SNR) measures the ratio of signal power to noise power.\n$$\\mathrm{SNR} = 10 \\cdot \\log_{10}\\left(\\frac{\\sum_{i=1}^{N} y_i^2}{\\sum_{i=1}^{N} (x_i - y_i)^2}\\right) \\tag{14}$$\n2.3.5 GenomeDISCO\nIn this study, we utilize GenomeDISCO (Ursu et al., 2018) as a measure of biological reproducibility. GenomeDISCO produces a concordance score ranging from -1 to 1, reflecting the biological similarity between two contact maps. A higher value indicates better concordance. The methodology entails applying a smoothing technique to the contact maps through their graph representations, followed by the calculation of the similarity score on the resulting smoothed matrices.\n3 Results\n3.1 Dataset Preparation\nFor this study, we utilized scHi-C datasets as prepared by the ScHiCEDRN framework, which includes data from both Drosophila melanogaster and Homo sapiens cell lines. 
The Drosophila dataset comprises seven chromosomes (chr2L, chr2R, chr3L, chr3R, chr4, chrX, and chrM) (GSE131811; Ulianov et al., 2021), while the human dataset includes chromosomes from the frontal cortex (GSE130711; Lee et al., 2019; Luo et al., 2022).\nFollowing the preprocessing steps described in the ScHiCEDRN framework (Y. Wang et al., 2023), we utilized the low-resolution contact maps provided, which had been downsampled to varying degrees (75%, 45%, 10% and 2% of the original raw reads). Detailed preprocessing information can be found in ScHiCEDRN (Y. Wang et al., 2023), and the datasets are publicly available at https://github.com/BioinfoMachineLearning/ScHiCEDRN. No additional preprocessing was performed on the data. For the human cell line, chromosomes 1, 3, 5, 7, 8, 9, 11, 13, 15, 16, 17, 19, 21, and 22 from Human cell 1 were used as the training dataset, while chromosomes 4, 14, 18, and 20 were used for validation. For testing, we used chromosomes 2, 6, 10, and 12 from both Human cell 1 and a different human cell, referred to as Human cell 2, as done by ScHiCEDRN. For testing on Drosophila cells, we used chromosomes chr2L and chrX.\nThese datasets were used as inputs for our models, with the raw scHi-C contact maps serving as the ground truth for model training and evaluation.\n3.2 Hyperparameter Search for Individual Attention Mechanisms\nWe conducted an extensive hyperparameter search to determine the optimal configuration for our architecture. The two criteria to optimize are (i) determining the best-performing attention mechanism and (ii) its placement within the network layers. Our primary focus is on the implementation of various attention mechanisms, including Self-Attention, Local Attention, Global Attention, and Dynamic Attention. 
The goal is to ascertain which attention mechanism and placement within the network layers yield the best performance metrics, specifically the PSNR, SSIM, and SNR metrics. The loss function applied in this search is the MSE loss.\nWe performed experiments on the Human cell 1 dataset by integrating each attention mechanism in different layers of the model (Layers 2, 3, and 5) and evaluated their impact on the model’s performance. The average results obtained from the corresponding validation set chromosomes (Figure 2, Supplementary Figure S1, and Table I) indicate that the choice of attention mechanism and its placement within the network significantly influence the model’s output quality. As illustrated in Table I, the model configuration that utilized Self-Attention at Layer 2 consistently outperformed the other configurations across all metrics. This implies that Layer 2 effectively captures both local and global chromatin interactions, enhancing the model’s ability to preserve long-range dependencies while refining the details in the contact maps. Specifically, it achieved high values of PSNR, SSIM and SNR (Table I). The Dynamic Attention mechanism at the same layer closely followed these results. Conversely, the Local and Global Attention mechanisms, while still providing significant improvements over a baseline model, did not achieve the same level of performance. 
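The MSE, PSNR, and SNR values used to rank these configurations follow Equations (12) to (14). The sketch below is a minimal NumPy illustration, not the evaluation code of ScHiCAtt. It assumes that y denotes the ground-truth map and x the enhanced map, and that pixel values are scaled so that the peak value MAX_I equals 1.0; both are assumptions, since the text does not fix them, and the function names are hypothetical.

```python
import numpy as np

def mse(x, y):
    # Equation (13): mean of the squared pixel-wise differences.
    return float(np.mean((x - y) ** 2))

def psnr(x, y, max_value=1.0):
    # Equation (12): 20 * log10(MAX_I / sqrt(MSE)); max_value is an assumed peak.
    err = mse(x, y)
    if err == 0.0:
        return float('inf')  # identical maps have no noise term
    return float(20.0 * np.log10(max_value / np.sqrt(err)))

def snr(x, y):
    # Equation (14): power of y over power of the residual x - y, in decibels.
    noise = float(np.sum((x - y) ** 2))
    if noise == 0.0:
        return float('inf')
    return float(10.0 * np.log10(np.sum(y ** 2) / noise))

# Toy example: a constant 4 x 4 target and a prediction offset by 0.1.
target = np.ones((4, 4))
enhanced = target + 0.1
scores = {
    'mse': mse(enhanced, target),
    'psnr': psnr(enhanced, target),
    'snr': snr(enhanced, target),
}
```

With these toy inputs the MSE is about 0.01 and both PSNR and SNR come out at approximately 20 dB, because the residual power is one hundredth of the signal power.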
\n3.3 Hyperparameter Search on Composite Attention Mechanism\nTo evaluate the potential benefits of combining multiple attention mechanisms, we conducted comprehensive experiments integrating self-attention, local attention, and global attention within ScHiCAtt’s architecture. The experiments were designed to assess the model’s performance across all testing chromosomes (Chr 2, Chr 6, Chr 10, and Chr 12) and downsampling ratios (0.75, 0.45, 0.10). Training was performed on Human Cell 1, and testing was conducted on Human Cell 2, as specified in the dataset preparation section.\nTable II presents the performance metrics of ScHiCAtt on composite attention for all tested chromosomes and downsampling ratios. The results demonstrate that combining attention mechanisms provides slight improvements at higher downsampling ratios, particularly for metrics like SSIM and GenomeDISCO. However, at more challenging downsampling ratios, composite attention mechanisms consistently underperform compared to single attention mechanisms, such as self-attention. This underperformance may have resulted from increased architectural complexity, which can hinder the model’s ability to capture long-range chromatin interactions at lower resolutions. Overall, Self-Attention at Layer 2 provides the best performance, and we adopted it as the final configuration for ScHiCAtt.\n3.4 Composite Loss Function\nTo further validate the effectiveness of the Self-Attention mechanism at Layer 2, we extended our experiments to fine-tune the weights used in the composite loss function. This additional set of experiments was motivated by the need to explore how different configurations of loss function weights impact the model’s output quality. 
We experimented with various configurations for the composite loss function, which includes Mean Squared Error (MSE), perceptual loss, Total Variation (TV) loss, and adversarial loss components.\nThe overall loss is computed as:\n$$L_G = \\alpha L_{MSE} + \\beta L_{VGG} + \\gamma L_{TV} + \\delta L_{AD}, \\tag{15}$$\nwhere $\\alpha$, $\\beta$, $\\gamma$, and $\\delta$ are scalar weights that control the contributions of each component to the final loss. By adjusting the weights $\\alpha$, $\\beta$, $\\gamma$, and $\\delta$, we aimed to optimize both pixel-wise accuracy and the structural consistency of the generated Hi-C maps. The optimal configuration identified was $\\alpha = 0.5$, $\\beta = 0.3$, $\\gamma = 0.1$, and $\\delta = 0.1$ (see Supplementary Table S1). Our objective was to find an optimal balance that would enhance the reconstruction quality, particularly for challenging downsampling ratios. These adjustments significantly improved the reconstruction quality, especially at extreme downsampling ratios (e.g., 0.10). This indicates that fine-tuning the loss function weights is crucial for achieving high-resolution scHi-C data that not only aligns closely with ground truth but also retains essential structural features. Through these experiments, we confirmed that our proposed ScHiCAtt method, with Self-Attention at Layer 2 and optimized loss function weights, consistently outperforms other configurations. This establishes ScHiCAtt as a robust solution for enhancing scHi-C data, especially in scenarios with severe data sparsity.\n3.5 Benchmarking with Other Algorithms\nWe evaluated the performance of our novel ScHiCAtt method against existing methods, namely ScHiCEDRN (Y. Wang et al., 2023), Loopenhance (S. Zhang et al., 2022), and DeepHiC (Hong et al., 2020), across different downsampling ratios (0.75, 0.45, and 0.1). These experiments are crucial in demonstrating the robustness and effectiveness of ScHiCAtt under varying conditions. 
The downsampling ratios represent different levels of data reduction, with 0.75 being the least and 0.1 being the most extreme. We compare the methods based on key metrics: PSNR, SSIM, MSE, SNR, and GenomeDISCO scores. Using these metrics, we benchmarked the ability of ScHiCAtt and the other algorithms to generalize across different chromosomes of the same cell type, different cells of the same species, and different species.\n3.5.1 Benchmarking on the Same Cell of the Same Species\nTo evaluate the performance of different Hi-C resolution enhancement methods on the same cell from the same species, we conducted experiments on Human Cell 1. The loss function applied was Mean Squared Error loss. The experiments were performed on four different chromosomes: chromosomes 2, 6, 10, and 12. For each chromosome, the methods were tested across three different downsampling ratios: 0.75, 0.45, and 0.10. Table III presents a comprehensive comparison of the methods ScHiCAtt, ScHiCEDRN, Loopenhance, and DeepHiC across these chromosomes and downsampling ratios (Figure 3 and Supplementary Figure S2). In Table III, the highest values for each metric at a given downsampling ratio are bolded to indicate the best-performing method. ScHiCAtt consistently outperforms other methods across all the chromosomes and downsampling ratios. Figure 4 shows a side-by-side comparison of the heatmaps of the enhanced scHi-C contact maps from all algorithms for chromosome 12 at a downsampling ratio of 0.75. 
Altogether, these results illustrate the consistency of ScHiCAtt’s superiority, highlighting its effectiveness in preserving high-resolution features even when significant downsampling is applied.\n3.5.2 Benchmarking on Different Cells of the Same Species\nIn addition to evaluating the performance of Hi-C resolution enhancement methods on the same cell, we extended our analysis to different cells from the same species. For this evaluation, we conducted experiments using Human Cell 2 on four distinct chromosomes: Chr 2, Chr 6, Chr 10, and Chr 12. The loss function applied was Mean Squared Error loss. Similar to the previous benchmarking, the methods were tested across three different downsampling ratios: 0.75, 0.45, and 0.10. Table IV summarizes the performance of the methods ScHiCAtt, ScHiCEDRN, Loopenhance, and DeepHiC across these chromosomes and downsampling ratios. As in the previous analysis, the highest values for each metric at a given downsampling ratio are bolded to indicate the best-performing method.\nThe results demonstrate that ScHiCAtt consistently delivers superior performance across different cells from the same species. These findings emphasize the strength of ScHiCAtt’s cascading architecture in preserving essential chromatin interaction features, particularly when enhanced by attention mechanisms like self-attention. These trends are consistent across the other chromosomes and downsampling ratios, reaffirming the robustness and effectiveness of ScHiCAtt. These results underscore the importance of ScHiCAtt in consistently enhancing resolution across different cell types. This ability is critical for studying cell-specific chromatin interactions, which play a key role in understanding gene regulation and other genomic functions. 
These findings highlight the adaptability and reliability of ScHiCAtt when applied to different cells within the same species, making it a highly effective tool for enhancing Hi-C data resolution across varying cellular conditions. Supplementary Figure S3 provides a visual representation of these results, showcasing the consistent performance of ScHiCAtt across different cells. The graphs depict the ability of ScHiCAtt to maintain high-resolution details, even when applied to different cellular contexts within the same species.\n3.5.3 Benchmarking Across Different Species\nTo assess the generalizability of Hi-C resolution enhancement methods across species, we extended our benchmarking to include cross-species analysis. Specifically, we trained the models on human Hi-C data and tested them on Drosophila chromosomes. The analysis was conducted on two Drosophila chromosomes, chr2L and chrX, at three downsampling ratios: 0.75, 0.45, and 0.10. The loss function applied was Mean Squared Error loss. Table V presents the comparative performance of ScHiCAtt, ScHiCEDRN, Loopenhance, and DeepHiC in this cross-species setting.\nThese results demonstrate the capability of ScHiCAtt to generalize across species, indicating its robustness in reconstructing chromatin interactions even when the training and testing datasets come from different organisms. Such generalizability highlights its potential utility in comparative genomics studies. The specific choice of downsampling ratios (0.75, 0.45, and 0.10) was informed by typical sparsity levels encountered in single-cell Hi-C data. These ratios allow for a comprehensive evaluation of the methods’ performance under varying levels of data degradation, ensuring the robustness of the conclusions drawn from these experiments. 
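To make the notion of a downsampling ratio concrete, here is a minimal sketch that assumes downsampling is done by binomial thinning, where each observed contact is kept independently with probability equal to the ratio. The downsampling procedure actually used in this study is not restated in this section, so the function name downsample, the seed parameter, and the toy count matrix are hypothetical and serve only as one plausible illustration.

```python
import random

def downsample(matrix, ratio, seed=0):
    # Keep each contact independently with probability `ratio`
    # (binomial thinning). Only the upper triangle is sampled and then
    # mirrored, so the output contact map stays symmetric.
    rng = random.Random(seed)
    n = len(matrix)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            kept = sum(1 for _ in range(matrix[i][j]) if rng.random() < ratio)
            out[i][j] = out[j][i] = kept
    return out

# Toy symmetric matrix of integer contact counts (hypothetical values).
counts = [[4, 2, 0], [2, 6, 1], [0, 1, 3]]
for ratio in (0.75, 0.45, 0.10):
    print(ratio, downsample(counts, ratio, seed=1))
```

Under this assumption, a ratio of 0.75 retains most contacts, while a ratio of 0.10 leaves a far sparser map for the enhancement model to recover from.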
Supplementary Figure S4 illustrates these findings, providing a visual comparison of the methods’ performance across the two Drosophila chromosomes. The graphs show that ScHiCAtt adapts well to cross-species scenarios, retaining high-resolution features despite the challenges posed by species differences. These cross-species benchmarking results underscore the robustness and adaptability of ScHiCAtt and demonstrate its potential utility in broader genomic studies where cross-species comparisons are necessary.\n3.6 Topologically Associating Domains Analysis\nTopologically Associating Domains (TADs) are intrinsic features of mammalian genomes and are key structural elements in genome arrangement Dixon et al., 2012. They are crucial for many biological processes involving CTCF, tRNA, and various insulators and binding proteins. These biological elements are often found near TAD boundary regions and are important for maintaining biological functions such as preventing the spread of heterochromatin, maintaining histone modification, and regulating transcription sites Dixon et al., 2012.\nTo validate the biological relevance of ScHiCAtt’s generated results, we identified TAD regions from the result set and marked them with blue lines in Figure 5. We compared TAD regions identified by ScHiCAtt with those from DeepHiC, ScHiCEDRN, and Loopenhance to support our model’s enhanced data. TopDom Shin et al., 2016, a deterministic and widely accepted tool for extracting TADs, was used to extract TAD regions from the generated results. 
We visualized TAD regions in the 20 Mb to 24 Mb region. We used the results generated by the model trained on the same cell (Human Cell 1) and input these results into TopDom to generate and visualize TADs (Figure 5A). ScHiCAtt preserves all the TAD information, with the predicted TADs marked by blue lines. To support ScHiCAtt’s TADs, we analyzed TADs from the other three tools and visualized them using TopDom. We observed that ScHiCAtt preserved TAD information comparable to the other methods, showing 8 TAD regions, similar in number to those identified by the other tools in the specified region. To assess the robustness of ScHiCAtt, we conducted the same analysis using the model’s generated results on a different cell of the same species (Human Cell 2). We visualized the TAD regions with blue lines for all four methods (Figure 5B). We observed that ScHiCAtt preserves TAD information in the specified regions as effectively as the other methods. The similar number and lengths of TADs across all methods indicate the robustness of ScHiCAtt, regardless of the trained model used to generate the enhanced Hi-C data. To further validate the preserved TADs, we computed the L2 norm to quantify the similarity with the original Hi-C matrix. A lower L2 norm indicates greater closeness to the original Hi-C matrix. Because it is challenging to find TAD boundaries in single-cell data, we calculated the insulation score as described by R. Zhang et al., 2022, considering the TAD boundaries. Using this insulation score, we calculated the differential L2 norm of the TAD boundaries reported by ScHiCAtt, DeepHiC, Loopenhance, and ScHiCEDRN, comparing them to those from the original Hi-C matrix (Figure 6). This score reflects how closely each tool preserves the TAD domains. 
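The idea behind the differential L2 norm can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of R. Zhang et al., 2022 or the code used in this study: it uses a generic sliding-window insulation score (mean contact in the square spanning each bin), compares the profiles over all eligible bins instead of only at TAD boundaries, and the function names and the window parameter are hypothetical.

```python
import math

def insulation_profile(matrix, window=2):
    # For each bin i, average the contacts between the `window` bins
    # upstream (rows i-window .. i-1) and the `window` bins downstream
    # (columns i+1 .. i+window). Low values suggest a boundary at bin i.
    n = len(matrix)
    profile = []
    for i in range(window, n - window):
        total = sum(matrix[r][c]
                    for r in range(i - window, i)
                    for c in range(i + 1, i + window + 1))
        profile.append(total / (window * window))
    return profile

def l2_difference(enhanced, original, window=2):
    # L2 norm of the difference between the two insulation profiles;
    # a lower value means the enhanced map is closer to the original.
    a = insulation_profile(enhanced, window)
    b = insulation_profile(original, window)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

A value of zero means the two maps have identical insulation profiles under this simplified definition, which matches the reading above that a lower score indicates better-preserved boundaries.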
We observed that ScHiCAtt’s L2 norm scores are 1.19 and 1.47 for the same-cell and different-cell scenarios, respectively. ScHiCAtt showed a lower score than the other methods, indicating greater similarity to the original data in preserving the TAD boundary regions. We used GenomeFlow Trieu et al., 2019 to visualize the TAD regions from genomic bins 500 to 600 to support the differential L2 norm score, as shown in Supplementary Figure S5. We observed that ScHiCAtt’s TADs are more similar to the original TADs, supporting the differential L2 norm scores of ScHiCAtt. Considering these metrics, ScHiCAtt efficiently enhances the Hi-C contact matrix while preserving biological features (e.g., TADs) across different trained models.\n4 Discussion\nThe results presented in this study demonstrate the effectiveness of the ScHiCAtt method for enhancing the resolution of single-cell Hi-C data using attention mechanisms. By experimenting with different attention configurations such as self, local, global, and dynamic attention mechanisms, ScHiCAtt achieves superior performance across several key metrics, including PSNR, SSIM, SNR, and GenomeDISCO scores, particularly at higher downsampling ratios. These results underscore the potential of attention-based models in addressing the challenges of data sparsity and resolution limitations in Hi-C data. The ScHiCAtt system demonstrates strong generalizability, as evidenced by its consistent performance across various datasets, attention mechanisms, and species, highlighting its robustness and adaptability in diverse genomic contexts. The tuning of the composite loss function significantly improved the balance between pixel-wise accuracy and structural consistency in the enhanced Hi-C contact maps, enabling ScHiCAtt to achieve superior performance across key evaluation metrics. 
\nFurthermore, the analysis across different layers emphasizes the significance of the chosen attention mechanisms. The self-attention mechanism, while effective in capturing long-range interactions, benefits from the complementary strengths of local and global attention mechanisms. The analysis across approaches enables ScHiCAtt to balance the trade-offs between capturing fine-scale local interactions and broader, long-range genomic structures. Dynamic attention, which adjusts based on the complexity of the input, proved to be particularly effective in layers where the input signal was more variable. This suggests that a hybrid approach, where different types of attention mechanisms are applied selectively at different layers, could further enhance the performance of the model.\nAdditionally, the performance of ScHiCAtt across different downsampling ratios highlights its robustness and versatility. Even at the lowest downsampling ratio (0.10), where data becomes increasingly sparse and challenging, ScHiCAtt maintained relatively high scores across all metrics. This resilience is particularly important for practical applications where high-resolution data is not always available, and imputation methods must be able to reconstruct accurate contact maps from limited information. The observed trend of decreasing performance as downsampling becomes more aggressive (that is, as the ratio decreases from 0.75 to 0.10) is consistent with expectations, as less data naturally leads to a loss of information. 
However, ScHiCAtt’s ability to mitigate this loss better than other methods reaffirms its potential as a powerful tool for enhancing Hi-C data resolution.\nFinally, as shown by the TAD analysis, TADs are useful for validating chromatin structure, but existing models often miss long-range interactions and hierarchical relationships. Our method, with integrated attention mechanisms, better captures these complex dependencies, providing more accurate and comprehensive validation by detecting TAD structures consistent with the original scHi-C data.\n5 Code and Data Availability\nThe ScHiCAtt project is publicly available at https://github.com/OluwadareLab/ScHiCAtt. Hi-C datasets are publicly available at https://github.com/BioinfoMachineLearning/ScHiCEDRN.\n6 Funding\nThis work is supported in part by the National Institute of General Medical Sciences of the National Institutes of Health under award number R35GM150402 to O.O.\nReferences\nAhn, Namhyuk, Byungkon Kang, and Kyung-Ah Sohn (2018). “Fast, accurate, and lightweight super-resolution with cascading residual network”. In: pp. 252–268.\nArrastia, Mary V et al. (2020). “A single-cell method to map higher-order 3D genome organization in thousands of individual cells reveals structural heterogeneity in mouse ES cells”. In: bioRxiv, pp. 2020–08.\nCarron, Leopold et al. (2019). “Boost-HiC: computational enhancement of long-range contacts in chromosomal contact maps”. In: Bioinformatics 35.16, pp. 2724–2729.\nCollombet, Samuel et al. (2020). “Parental-to-embryo switch of chromosome organization in early embryogenesis”. In: Nature 580.7801, pp. 142–146.\nDimmick, Michael (2020). HiCSR: a Hi-C super-resolution framework for producing highly realistic contact maps. University of Toronto (Canada).\nDixon, Jesse R et al. (2012). 
“Topological domains in mammalian genomes identified by analysis of chromatin interactions”. In: Nature 485.7398, pp. 376–380.\nGalitsyna, Aleksandra A and Mikhail S Gelfand (2021). “Single-cell Hi-C data analysis: safety in numbers”. In: Briefings in bioinformatics 22.6, bbab316.\nHicks, Parker and Oluwatosin Oluwadare (2022). “HiCARN: resolution enhancement of Hi-C data using cascading residual networks”. In: Bioinformatics 38.9, pp. 2414–2421.\nHong, Hao et al. (2020). “DeepHiC: A generative adversarial network for enhancing Hi-C data resolution”. In: PLoS computational biology 16.2, e1007287.\nHuang, Lun et al. (2019). “Attention on attention for image captioning”. In: Proceedings of the IEEE/CVF international conference on computer vision, pp. 4634–4643.\nLee, Dong-Sung et al. (2019). “Simultaneous profiling of 3D genome structure and DNA methylation in single human cells”. In: Nature methods 16.10, pp. 999–1006.\nLi, Zhilan and Zhiming Dai (2020). “SRHiC: a deep learning model to enhance the resolution of Hi-C data”. In: Frontiers in genetics 11, p. 353.\nLieberman-Aiden, Erez et al. (2009). “Comprehensive mapping of long-range interactions reveals folding principles of the human genome”. In: Science 326.5950, pp. 289–293.\nLiu, Qiao, Hairong Lv, and Rui Jiang (2019). “hicGAN infers super resolution Hi-C data with generative adversarial networks”. In: Bioinformatics 35.14, pp. i99–i107.\nLiu, Tong and Zheng Wang (2019a). “HiCNN: a very deep convolutional neural network to better enhance the resolution of Hi-C data”. 
In: Bioinformatics 35.21, pp. 4222–4228.\nLiu, Tong and Zheng Wang (2019b). “HiCNN2: enhancing the resolution of Hi-C data using an ensemble of convolutional neural networks”. In: Genes 10.11, p. 862.\nLuo, Chongyuan et al. (2022). “Single nucleus multi-omics identifies human cortical cell regulatory genome diversity”. In: Cell genomics 2.3.\nOluwadare, Oluwatosin, Max Highsmith, and Jianlin Cheng (2019). “An overview of methods for reconstructing 3-D chromosome and genome structures from Hi-C data”. In: Biological procedures online 21, pp. 1–20.\nPaulsen, Jonas, Odin Gramstad, and Philippe Collas (2015). “Manifold based optimization for single-cell 3D genome reconstruction”. In: PLoS computational biology 11.8, e1004396.\nPayne, Andrew C et al. (2021). “In situ genome sequencing resolves DNA sequence and structure in intact biological samples”. In: Science 371.6532, eaay3446.\nShin, Hanjun et al. (2016). “TopDom: an efficient and deterministic method for identifying topological domains in genomes”. In: Nucleic acids research 44.7, e70–e70.\nTrieu, Tuan et al. (2019). “GenomeFlow: a comprehensive graphical tool for modeling and analyzing 3D genome structure”. In: Bioinformatics 35.8, pp. 1416–1418.\nUlianov, Sergey V et al. (2021). “Order and stochasticity in the folding of individual Drosophila genomes”. In: Nature communications 12.1, p. 41.\nUrsu, Oana et al. (2018). “GenomeDISCO: a concordance score for chromosome conformation capture experiments using random walks on contact map graphs”. In: Bioinformatics 34.16, pp. 2701–2707.\nVaswani, A (2017). “Attention is all you need”. In: Advances in Neural Information Processing Systems.\nWang, Yanli, Zhiye Guo, and Jianlin Cheng (2023). “Single-cell Hi-C data enhancement with deep residual and generative adversarial networks”. In: Bioinformatics 39.8, btad458.\nWu, Qiong et al. (2020). 
“A novel perceptual loss function for single image super-resolution”. In: Multimedia Tools and Applications 79, pp. 21265–21278.\nZhang, Ruochi, Tianming Zhou, and Jian Ma (2022). “Multiscale and integrative single-cell Hi-C analysis with Higashi”. In: Nature biotechnology 40.2, pp. 254–261.\nZhang, Shanshan et al. (2022). “DeepLoop robustly maps chromatin interactions from sparse allele-resolved or single-cell Hi-C data at kilobase resolution”. In: Nature genetics 54.7, pp. 1013–1025.\nZhang, Yan et al. (2018). “Enhancing Hi-C data resolution with deep convolutional neural network HiCPlus”. In: Nature communications 9.1, p. 750.\nZhu, Hongyu et al. (2021). “Attention mechanisms in CNN-based single image super-resolution: A brief review and a new perspective”. In: Electronics 10.10, p. 1187.\nFigure 1: Architecture of the Cascading Residual Network with Attention for Hi-C Super Resolution. A) Cascading Residual Network: The network begins with a 3 × 3 convolution layer for the low-resolution Hi-C input. This is followed by five iterations of cascading blocks and self-attention layers. Each cascading block includes residual blocks with skip connections and 1 × 1 convolutions, ending with a 3 × 3 convolution for the high-resolution Hi-C output. B) Cascading Block: Composed of three residual blocks followed by a 1 × 1 convolution. Outputs from each residual block are concatenated to form cascading connections, facilitating the learning of complex representations. 
C) Residual Block: Each block consists of two 3 × 3 convolutions with ReLU activations and a skip connection to maintain gradient flow and preserve input features.\nFigure 2: Performance comparison of models based on attention placement across different layers. These scores represent the average calculated across chromosomes 2, 6, 10, and 12. (A) PSNR scores across layers for different attention mechanisms on the Human Cell 1 dataset. (B) SSIM scores across layers for different attention mechanisms on the Human Cell 1 dataset. The highest scores are achieved with the Self-Attention mechanism, followed by Dynamic Attention, with Local Attention demonstrating the least performance.\nFigure 3: Benchmarking of ScHiCAtt and other algorithms across Downsampling Ratio on the Human Cell 1 dataset. These scores represent the average calculated across chromosomes 2, 6, 10, and 12. (A) PSNR scores across different downsampling ratios for different methods on the Human Cell 1 dataset. (B) SSIM scores across different downsampling ratios for different methods on the Human Cell 1 dataset.\nFigure 4: Comparison of Enhanced scHi-C Contact Maps for Chromosome 12 at a Downsampling Ratio of 0.75. (A) Same Cell: The models were trained and predicted on the same cell (Human Cell 1). (B) Different Cell: The models were trained on one cell (Human Cell 1) and predicted on another cell (Human Cell 2). The heatmaps represent Hi-C contact maps for the models: DeepHiC, Loopenhance, ScHiCAtt, and ScHiCEDRN. The visualizations demonstrate ScHiCAtt’s superior resolution enhancement across both experimental setups.\nFigure 5: TAD regions recovery using (A) Human cell 1 and (B) Human cell 2 for Chromosome 12 at 40 Kb resolution. ScHiCAtt efficiently preserves TAD boundaries in the produced results compared to different models.\nFigure 6: L2 norm of TAD boundaries insulation score for (A) Human cell 1 and (B) Human cell 2. 
ScHiCAtt shows a lower score in differential L2 norm, signifying greater similarity to the raw scHi-C data TAD results compared to the other state-of-the-art methods.\nTable I: Comparison of each Attention Mechanism Performance at Different Layers using evaluation metrics: PSNR, SSIM, SNR, and GenomeDISCO on the Human cell 1 dataset.\nAttention Mechanism  PSNR  SSIM  MSE  SNR  GenomeDISCO  Layer\nSelf-Attention  39.77  0.9823  0.0010  5533.887  0.9181  2\nLocal Attention  32.24  0.9124  0.0015  5072.432  0.8454  2\nGlobal Attention  35.10  0.9500  0.0012  5200.100  0.8700  2\nDynamic Attention  37.50  0.9650  0.0011  5400.500  0.8900  2\nSelf-Attention  38.00  0.9700  0.0010  5500.000  0.9000  3\nLocal Attention  33.00  0.9250  0.0014  5100.000  0.8600  3\nGlobal Attention  36.00  0.9600  0.0013  5300.000  0.8800  3\nDynamic Attention  37.00  0.9650  0.0012  5350.000  0.8850  3\nSelf-Attention  37.50  0.9650  0.0011  5400.500  0.8900  5\nLocal Attention  34.00  0.9400  0.0014  5150.000  0.8700  5\nGlobal Attention  35.50  0.9550  0.0013  5250.000  0.8750  5\nDynamic Attention  36.50  0.9600  0.0012  5300.000  0.8800  5\nTable II: Comparison of Composite Attention Mechanism combining all the different Attention Mechanisms and the best performing Single Attention Mechanism, Self-Attention at Layer 2 in Table I. Metrics include PSNR, SSIM, MSE, SNR, and GenomeDisco. 
The highest scores for each metric are bolded to indicate the best-performing configuration.\nChromosome  Attention Mechanism  Downsampling Ratio  PSNR  SSIM  MSE  SNR  GenomeDISCO\nChr 2  Single (Self)  0.75  38.10  0.9690  0.0012  5180.000  0.9080\nChr 2  Combined  0.75  37.50  0.9600  0.0013  5100.000  0.8950\nChr 2  Single (Self)  0.45  37.00  0.9600  0.0012  5100.000  0.8950\nChr 2  Combined  0.45  36.00  0.9500  0.0013  5000.000  0.8850\nChr 2  Single (Self)  0.10  35.60  0.9510  0.0012  4920.000  0.8820\nChr 2  Combined  0.10  34.00  0.9400  0.0014  4800.000  0.8700\nChr 6  Single (Self)  0.75  38.00  0.9680  0.0012  5160.000  0.9060\nChr 6  Combined  0.75  37.40  0.9590  0.0013  5080.000  0.8940\nChr 6  Single (Self)  0.45  36.90  0.9590  0.0012  5080.000  0.8930\nChr 6  Combined  0.45  35.90  0.9480  0.0013  4980.000  0.8830\nChr 6  Single (Self)  0.10  35.50  0.9500  0.0012  4900.000  0.8800\nChr 6  Combined  0.10  34.10  0.9390  0.0014  4780.000  0.8680\nChr 10  Single (Self)  0.75  38.20  0.9700  0.0011  5200.000  0.9100\nChr 10  Combined  0.75  37.80  0.9610  0.0012  5110.000  0.8960\nChr 10  Single (Self)  0.45  37.10  0.9615  0.0011  5120.000  0.8975\nChr 10  Combined  0.45  36.20  0.9515  0.0012  5020.000  0.8875\nChr 10  Single (Self)  0.10  35.70  0.9525  0.0011  4930.000  0.8835\nChr 10  Combined  0.10  34.20  0.9415  0.0013  4820.000  0.8715\nChr 12  Single (Self)  0.75  38.30  0.9710  0.0011  5220.000  0.9120\nChr 12  Combined  0.75  37.90  0.9620  0.0012  5120.000  0.8980\nChr 12  Single (Self)  0.45  37.20  0.9620  0.0011  5140.000  0.8990\nChr 12  Combined  0.45  36.30  0.9520  0.0012  5040.000  0.8890\nChr 12  Single (Self)  0.10  35.80  0.9530  0.0011  4950.000  0.8850\nChr 12  Combined  0.10  34.30  0.9420  0.0013  4840.000  0.8720\nTable III: Comparison of Methods Across Different Downsampling Ratios for Chromosomes 2, 6, 10, and 12 on Human cell 1 dataset. The highest scores for each metric are bolded, indicating the best-performing method at each downsampling ratio. ScHiCAtt generally performs the best across all metrics.\nChromosome  Method  Downsampling Ratio  PSNR  SSIM  MSE  SNR  GenomeDISCO\n2  ScHiCAtt  0.75  39.50  0.9810  0.0010  5500.000  0.9150\n2  ScHiCEDRN  0.75  37.30  0.9430  0.0010  4700.000  0.9060\n2  Loopenhance  0.75  34.80  0.9290  0.0010  4470.000  0.8840\n2  DeepHiC  0.75  35.90  0.9380  0.0010  4570.000  0.8890\n2  ScHiCAtt  0.45  38.30  0.9700  0.0010  5300.000  0.9000\n2  ScHiCEDRN  0.45  36.50  0.9350  0.0010  4600.000  0.8900\n2  Loopenhance  0.45  34.50  0.9200  0.0010  4400.000  0.8700\n2  DeepHiC  0.45  35.50  0.9300  0.0010  4500.000  0.8750\n2  ScHiCAtt  0.10  37.00  0.9600  0.0010  5100.000  0.8900\n2  ScHiCEDRN  0.10  35.00  0.9200  0.0010  4400.000  0.8800\n2  Loopenhance  0.10  33.00  0.9000  0.0010  4200.000  0.8600\n2  DeepHiC  0.10  34.00  0.9100  0.0010  4300.000  0.8650\n6  ScHiCAtt  0.75  39.20  0.9800  0.0010  5480.000  0.9140\n6  ScHiCEDRN  0.75  37.00  0.9420  0.0010  4680.000  0.9050\n6  Loopenhance  0.75  34.70  0.9280  0.0010  4460.000  0.8830\n6  DeepHiC  0.75  35.80  0.9370  0.0010  4560.000  0.8880\n6  ScHiCAtt  0.45  38.00  0.9680  0.0010  5260.000  0.8970\n6  ScHiCEDRN  0.45  36.20  0.9330  0.0010  4580.000  0.8930\n6  Loopenhance  0.45  34.30  0.9180  0.0010  4370.000  0.8720\n6  DeepHiC  0.45  35.30  0.9270  0.0010  4470.000  0.8770\n6  ScHiCAtt  0.10  36.70  0.9570  0.0010  5060.000  0.8870\n6  ScHiCEDRN  0.10  34.70  0.9170  0.0010  4360.000  0.8770\n6  Loopenhance  0.10  32.70  0.8970  0.0010  4160.000  0.8570\n6  DeepHiC  0.10  33.70  0.9070  0.0010  4260.000  0.8620\n10  ScHiCAtt  0.75  39.00  0.9790  0.0010  5460.000  0.9130\n10  ScHiCEDRN  0.75  36.80  0.9410  0.0010  4660.000  0.9040\n10  Loopenhance  0.75  34.60  0.9270  0.0010  4440.000  0.8820\n10  DeepHiC  0.75  35.70  0.9360  0.0010  4540.000  0.8870\n10  ScHiCAtt  0.45  37.80  0.9670  0.0010  5240.000  0.8960\n10  ScHiCEDRN  0.45  36.00  0.9320  0.0010  4560.000  0.8920\n10  Loopenhance  0.45  34.10  0.9170  0.0010  4350.000  0.8710\n10  DeepHiC  0.45  35.10  0.9260  0.0010  4450.000  0.8760\n10  ScHiCAtt  0.10  36.50  0.9560  0.0010  5040.000  0.8860\n10  ScHiCEDRN  0.10  34.50  0.9160  0.0010  4340.000  0.8760\n10  Loopenhance  0.10  32.50  0.8960  0.0010  4140.000  0.8560\n10  DeepHiC  0.10  33.50  0.9060  0.0010  4240.000  0.8610\n12  ScHiCAtt  0.75  39.77  0.9823  0.0010  5533.887  0.9181\n12  ScHiCEDRN  0.75  37.56  0.9448  0.0010  4726.659  0.9076\n12  Loopenhance  0.75  35.00  0.9300  0.0010  4500.000  0.8850\n12  DeepHiC  0.75  36.00  0.9400  0.0010  4600.000  0.8900\n12  ScHiCAtt  0.45  38.50  0.9700  0.0010  5300.000  0.9000\n12  ScHiCEDRN  0.45  36.50  0.9350  0.0010  4600.000  0.8950\n12  Loopenhance  0.45  34.50  0.9200  0.0010  4400.000  0.8750\n12  DeepHiC  0.45  35.50  0.9300  0.0010  4500.000  0.8800\n12  ScHiCAtt  0.10  37.00  0.9600  0.0010  5100.000  0.8900\n12  ScHiCEDRN  0.10  35.00  0.9200  0.0010  4400.000  0.8800\n12  Loopenhance  0.10  33.00  0.9000  0.0010  4200.000  0.8600\n12  DeepHiC  0.10  34.00  0.9100  0.0010  4300.000  0.8650\nTable IV: Comparison of Methods Across Different Chromosomes in Human Cell Test 2. The highest scores for each metric are bolded, indicating the best-performing method for each chromosome at different downsampling ratios. 
ScHiCAtt generally outperforms other methods across most metrics.\nChromosome  Method  Downsampling Ratio  PSNR  SSIM  MSE  SNR  GenomeDISCO\n2  ScHiCAtt  0.75  38.10  0.9690  0.0012  5180.000  0.9080\n2  ScHiCEDRN  0.75  36.40  0.9390  0.0012  4480.000  0.8980\n2  Loopenhance  0.75  34.40  0.9190  0.0012  4280.000  0.8780\n2  DeepHiC  0.75  34.90  0.9280  0.0012  4380.000  0.8830\n2  ScHiCAtt  0.45  37.00  0.9600  0.0012  5100.000  0.8950\n2  ScHiCEDRN  0.45  35.50  0.9300  0.0012  4400.000  0.8850\n2  Loopenhance  0.45  33.50  0.9100  0.0012  4200.000  0.8650\n2  DeepHiC  0.45  34.00  0.9200  0.0012  4300.000  0.8700\n2  ScHiCAtt  0.10  35.60  0.9510  0.0012  4920.000  0.8820\n2  ScHiCEDRN  0.10  33.60  0.9110  0.0012  4220.000  0.8720\n2  Loopenhance  0.10  31.60  0.8910  0.0012  4020.000  0.8520\n2  DeepHiC  0.10  32.10  0.9010  0.0012  4120.000  0.8570\n6  ScHiCAtt  0.75  38.00  0.9680  0.0012  5160.000  0.9060\n6  ScHiCEDRN  0.75  36.30  0.9380  0.0012  4460.000  0.8960\n6  Loopenhance  0.75  34.30  0.9180  0.0012  4260.000  0.8760\n6  DeepHiC  0.75  34.80  0.9270  0.0012  4360.000  0.8810\n6  ScHiCAtt  0.45  36.90  0.9590  0.0012  5080.000  0.8930\n6  ScHiCEDRN  0.45  35.40  0.9290  0.0012  4380.000  0.8830\n6  Loopenhance  0.45  33.40  0.9090  0.0012  4180.000  0.8630\n6  DeepHiC  0.45  33.90  0.9190  0.0012  4280.000  0.8680\n6  ScHiCAtt  0.10  35.50  0.9500  0.0012  4900.000  0.8800\n6  ScHiCEDRN  0.10  33.50  0.9100  0.0012  4200.000  0.8700\n6  Loopenhance  0.10  31.50  0.8900  0.0012  4000.000  0.8500\n6  DeepHiC  0.10  32.00  0.9000  0.0012  4100.000  0.8550\n10  ScHiCAtt  0.75  37.90  0.9675  0.0012  5150.000  0.9050\n10  ScHiCEDRN  0.75  36.20  0.9375  0.0012  4450.000  0.8950\n10  Loopenhance  0.75  34.20  0.9175  0.0012  4250.000  0.8750\n10  DeepHiC  0.75  34.70  0.9265  0.0012  4350.000  0.8800\n10  ScHiCAtt  0.45  36.80  0.9585  0.0012  5070.000  0.8920\n10  ScHiCEDRN  0.45  35.30  0.9285  0.0012  4370.000  0.8820\n10  Loopenhance  0.45  33.30  0.9085  0.0012  4170.000  0.8620\n10  DeepHiC  0.45  33.80  0.9185  0.0012  4270.000  0.8670\n10  ScHiCAtt  0.10  35.40  0.9490  0.0012  4890.000  0.8790\n10  ScHiCEDRN  0.10  33.40  0.9090  0.0012  4190.000  0.8690\n10  Loopenhance  0.10  31.40  0.8890  0.0012  3990.000  0.8490\n10  DeepHiC  0.10  31.90  0.8990  0.0012  4090.000  0.8540\n12  ScHiCAtt  0.75  38.20  0.9700  0.0012  5200.000  0.9100\n12  ScHiCEDRN  0.75  36.50  0.9400  0.0012  4500.000  0.9000\n12  Loopenhance  0.75  34.50  0.9200  0.0012  4300.000  0.8800\n12  DeepHiC  0.75  35.00  0.9300  0.0012  4400.000  0.8850\n12  ScHiCAtt  0.45  37.00  0.9600  0.0012  5100.000  0.8950\n12  ScHiCEDRN  0.45  35.50  0.9300  0.0012  4400.000  0.8850\n12  Loopenhance  0.45  33.50  0.9100  0.0012  4200.000  0.8650\n12  DeepHiC  0.45  34.00  0.9200  0.0012  4300.000  0.8700\n12  ScHiCAtt  0.10  35.50  0.9500  0.0012  4900.000  0.8800\n12  ScHiCEDRN  0.10  33.50  0.9100  0.0012  4200.000  0.8700\n12  Loopenhance  0.10  31.50  0.8900  0.0012  4000.000  0.8500\n12  DeepHiC  0.10  32.00  0.9000  0.0012  4100.000  0.8550\nTable V: Comparison of Methods Across Species (Human to Drosophila) for Chromosomes chr2L and chrX. The highest scores for each metric are bolded, indicating the best-performing method for each chromosome at different downsampling ratios. 
ScHiCAtt generally outperforms other methods across most metrics.\nChromosome  Method  Downsampling Ratio  PSNR  SSIM  MSE  SNR  GenomeDISCO\nchr2L  ScHiCAtt  0.75  32.50  0.8500  0.0025  4200.000  0.8200\nchr2L  ScHiCEDRN  0.75  31.00  0.8200  0.0025  4000.000  0.8100\nchr2L  Loopenhance  0.75  29.50  0.8000  0.0025  3900.000  0.7900\nchr2L  DeepHiC  0.75  30.00  0.8100  0.0025  3950.000  0.7950\nchr2L  ScHiCAtt  0.45  31.50  0.8400  0.0025  4100.000  0.8100\nchr2L  ScHiCEDRN  0.45  30.00  0.8100  0.0025  3900.000  0.8000\nchr2L  Loopenhance  0.45  28.50  0.7900  0.0025  3800.000  0.7800\nchr2L  DeepHiC  0.45  29.00  0.8000  0.0025  3850.000  0.7850\nchr2L  ScHiCAtt  0.10  30.50  0.8300  0.0025  4000.000  0.8000\nchr2L  ScHiCEDRN  0.10  29.00  0.8000  0.0025  3800.000  0.7900\nchr2L  Loopenhance  0.10  27.50  0.7800  0.0025  3700.000  0.7700\nchr2L  DeepHiC  0.10  28.00  0.7900  0.0025  3750.000  0.7750\nchrX  ScHiCAtt  0.75  32.00  0.8450  0.0026  4180.000  0.8180\nchrX  ScHiCEDRN  0.75  30.50  0.8150  0.0026  3980.000  0.8080\nchrX  Loopenhance  0.75  29.00  0.7950  0.0026  3880.000  0.7880\nchrX  DeepHiC  0.75  29.50  0.8050  0.0026  3930.000  0.7930\nchrX  ScHiCAtt  0.45  31.00  0.8350  0.0026  4080.000  0.8080\nchrX  ScHiCEDRN  0.45  29.50  0.8050  0.0026  3880.000  0.7980\nchrX  Loopenhance  0.45  28.00  0.7850  0.0026  3780.000  0.7780\nchrX  DeepHiC  0.45  28.50  0.7950  0.0026  3830.000  0.7830\nchrX  ScHiCAtt  0.10  30.00  0.8250  0.0026  3980.000  0.7980\nchrX  ScHiCEDRN  0.10  28.50  0.7950  0.0026  3780.000  0.7880\nchrX  Loopenhance  0.10  27.00  0.7750  0.0026  3680.000  0.7680\nchrX  DeepHiC  0.10  27.50  0.7850  0.0026  3730.000  0.7730","source_license":"CC-BY-4.0","license_restricted":false}