{"file": "/docs/source/data_availability_and_access.rst", "title": "SMaHT Data Availability and Access", "status": "open", "options": {"filetype": "rst", "collapsible": false, "default_open": true, "convert_ext_links": true}, "consortia": [{"@id": "/consortia/358aed10-9b9d-4e26-ab84-4bd162da182b/", "uuid": "358aed10-9b9d-4e26-ab84-4bd162da182b", "status": "open", "@type": ["Consortium", "Item"], "display_title": "SMaHT", "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin"]}}], "identifier": "data_availability_and_access", "date_created": "2025-10-10T16:21:10.768348+00:00", "section_type": "Page Section", "submitted_by": {"error": "no view permissions"}, "last_modified": {"modified_by": {"error": "no view permissions"}, "date_modified": "2026-05-06T16:24:42.132869+00:00"}, "schema_version": "1", "submission_centers": [{"uuid": "9626d82e-8110-4213-ac75-0a50adf890ff", "status": "open", "@type": ["SubmissionCenter", "Item"], "@id": "/submission-centers/9626d82e-8110-4213-ac75-0a50adf890ff/", "display_title": "HMS DAC", "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin"]}}], "@id": "/static-sections/550230b3-34fe-42cf-8fb8-72fad10b9861/", "@type": ["StaticSection", "UserContent", "Item"], "uuid": "550230b3-34fe-42cf-8fb8-72fad10b9861", "principals_allowed": {"view": ["system.Everyone"], "edit": ["group.admin"]}, "display_title": "SMaHT Data Availability and Access", "content_as_html": "<div class=\"rst-container\"><h3>What are SMaHT Data?</h3><p>The Somatic Mosaicism across Human Tissues (SMaHT) Network aims to create a reference catalog of somatic mutations and their patterns across a full spectrum of tissue types from 150 non-diseased donors, applying multiple state-of-the-art experimental assays and computational methods.</p><p>The data generated from the SMaHT Network provides a comprehensive and rich public resource for genomic studies that aim to characterize somatic mosaicism and understand its role in human biology 
and pathology.</p><p>The primary molecular assays of the SMaHT Network include bulk whole genome sequencing (WGS) using both Illumina-based short-read, and PacBio- or ONT-based long-read sequencing platforms, as well as bulk whole transcriptome sequencing on Illumina (total RNA-Seq) and PacBio (Kinnex) platforms.</p><h3>SMaHT Data Access</h3><p>All data/metadata files generated from the SMaHT Network are available for download from the SMaHT Data Portal to the portal users with appropriate access. For self-registered Data Portal users who are not part of the SMaHT Network, the data are available after official data releases.</p><p><strong>Open Access</strong>: The open-access data/metadata files are available for download after a login as a SMaHT Network member as well as a self-registered Data Portal user who is not part of the SMaHT Network.</p><p><strong>Protected Access</strong>: All sequence data (DNA and RNA), inherited germline variant data, and full donor metadata files are protected-access data under dbGaP. Please see our <a class=\"reference external\" href=\"https://data.smaht.org/docs/access/getting-dbgap-access\" target=\"_blank\" rel=\"noopener noreferrer\">Protected Data Access page</a> for instructions on how to request access for the protected-access SMaHT data under dbGaP.</p><div class=\"admonition tip\"><p class=\"first admonition-title\">Data Release</p><p class=\"last\">Production Data are only available to members of the SMaHT Consortium at this time. 
Please check back for upcoming releases.</p></div><div class=\"line-block\"><div class=\"line\"><br/></div></div><br/><br/><div class=\"table-responsive\"><table class=\"data-availability-table table-sm\"><thead><tr><th></th><th><b>COLO829</b></th><th><b>HapMap Mixture</b></th><th><b>LB-LA Fibroblasts and Derived iPSCs</b></th><th><b>Benchmarking Donors</b></th><th><b>Production Donors</b></th></tr></thead><tbody><tr><td class=\"row-label\"><b>Sequence Data</b><br/> (FASTQ, BAM/CRAM)</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td></tr><tr><td class=\"row-label\"><b>Germline Variants</b><br/></td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td></tr><tr><td class=\"row-label\"><b>Somatic Variants</b><br/>(all variant types, from individual donor or aggregated samples)</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td><td class=\"open\">open</td></tr><tr><td class=\"row-label\"><b>Gene Expression Profile</b><br/></td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td><td class=\"open\">open</td></tr><tr><td class=\"row-label\"><b>Epigenetic Profile</b><br/></td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td><td class=\"open\">open</td></tr><tr><td class=\"row-label\"><b>Limited Donor Metadata</b><br/> (i.e., age, sex, hardy scale)</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td></tr><tr><td class=\"row-label\"><b>Full Donor Metadata</b><br/> (e.g., smoking status, environmental 
exposure, prior clinical history)</td><td class=\"not-applicable\">N/A</td><td class=\"not-applicable\">N/A</td><td class=\"not-applicable\">N/A</td><td class=\"protected\">protected</td><td class=\"protected\">protected</td></tr><tr><td class=\"row-label\"><b>Reference Files</b><br/> (e.g., Human genome reference, gene models, genome stratification into easy/difficult/extreme-to-map regions)</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td><td class=\"open\">open</td></tr><tr><td class=\"row-label\"><b>Histology</b><br/> (Aperio SVS images, histology and pathology data)</td><td class=\"not-applicable\">N/A</td><td class=\"not-applicable\">N/A</td><td class=\"not-applicable\">N/A</td><td class=\"not-applicable\">N/A</td><td class=\"open\">open</td></tr></tbody></table></div></div>", "content": ".. raw:: html\n    \n    <h3>What are SMaHT Data?</h3>\n\nThe Somatic Mosaicism across Human Tissues (SMaHT) Network aims to create a reference catalog of somatic mutations and their patterns across a full spectrum of tissue types from 150 non-diseased donors, applying multiple state-of-the-art experimental assays and computational methods. \n\nThe data generated from the SMaHT Network provides a comprehensive and rich public resource for genomic studies that aim to characterize somatic mosaicism and understand its role in human biology and pathology.   \n\nThe primary molecular assays of the SMaHT Network include bulk whole genome sequencing (WGS) using both Illumina-based short-read, and PacBio- or ONT-based long-read sequencing platforms, as well as bulk whole transcriptome sequencing on Illumina (total RNA-Seq) and PacBio (Kinnex) platforms.\n\n.. raw:: html\n    \n    <h3>SMaHT Data Access</h3>\n\nAll data/metadata files generated from the SMaHT Network are available for download from the SMaHT Data Portal to the portal users with appropriate access. 
For self-registered Data Portal users who are not part of the SMaHT Network, the data are available after official data releases.\n\n**Open Access**: The open-access data/metadata files are available for download after a login as a SMaHT Network member as well as a self-registered Data Portal user who is not part of the SMaHT Network.\n\n**Protected Access**: All sequence data (DNA and RNA), inherited germline variant data, and full donor metadata files are protected-access data under dbGaP. Please see our `Protected Data Access page <https://data.smaht.org/docs/access/getting-dbgap-access>`_ for instructions on how to request access for the protected-access SMaHT data under dbGaP.\n\n\n.. admonition:: Data Release\n   :class: tip\n\n   Production Data are only available to members of the SMaHT Consortium at this time. Please check back for upcoming releases.\n\n|\n\n.. raw:: html\n\n    <br />\n    <br />\n    <div class=\"table-responsive\">\n        <table class=\"data-availability-table table-sm\">\n            <thead>\n                <tr>\n                    <th></th>\n                    <th><b>COLO829</b></th>\n                    <th><b>HapMap Mixture</b></th>\n                    <th><b>LB-LA Fibroblasts and Derived iPSCs</b></th>\n                    <th><b>Benchmarking Donors</b></th>\n                    <th><b>Production Donors</b></th>\n                </tr>\n            </thead>\n            <tbody>\n                <tr>\n                    <td class=\"row-label\"><b>Sequence Data</b><br/> &lpar;FASTQ, BAM/CRAM&rpar;</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Germline Variants</b><br/></td>\n                    <td 
class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Somatic Variants</b><br/>&lpar;all variant types, from individual donor or aggregated samples&rpar;</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"open\">open</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Gene Expression Profile</b><br/></td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"open\">open</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Epigenetic Profile</b><br/></td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"open\">open</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Limited Donor Metadata</b><br/> &lpar;i.e., age, sex, hardy scale&rpar;</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                </tr>\n                <tr>\n              
      <td class=\"row-label\"><b>Full Donor Metadata</b><br/> &lpar;e.g., smoking status, environmental exposure, prior clinical history&rpar;</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"protected\">protected</td>\n                    <td class=\"protected\">protected</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Reference Files</b><br/> &lpar;e.g., Human genome reference, gene models, genome stratification into easy/difficult/extreme-to-map regions&rpar;</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                    <td class=\"open\">open</td>\n                </tr>\n                <tr>\n                    <td class=\"row-label\"><b>Histology</b><br/> &lpar;Aperio SVS images, histology and pathology data&rpar;</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"not-applicable\">N/A</td>\n                    <td class=\"open\">open</td>\n                </tr>\n            </tbody>\n        </table>\n    </div>", "filetype": "rst", "@context": "/terms/", "aggregated-items": {}, "validation-errors": []}