  1. IGB
  2. IGBF-1422

Find example data sets where soft clipping reveals polymorphisms

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      0.5
    • Sprint:
      Fall 2018 Sprint 3

      Description

      Find real soft clip data for testing purposes. The ideal dataset would have the following:
      - paired-end data
      - data with known structural variants
      - preferably aligned in BAM format
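
      As a rough way to screen a candidate file against the criteria above, the sketch below counts paired-end and soft-clipped alignments in SAM-format text. It relies only on two facts from the SAM specification: FLAG bit 0x1 marks a read from a paired template, and the CIGAR operation S marks soft-clipped bases. The function names, the min_clip parameter, and the example records are hypothetical and only for illustration; this is not how IGB itself handles soft clipping.

```python
import re

# CIGAR operations per the SAM specification; 'S' marks soft-clipped bases.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
FLAG_PAIRED = 0x1  # SAM FLAG bit: template has multiple segments (paired-end)


def soft_clipped_bases(cigar):
    """Return the total number of soft-clipped bases in a CIGAR string."""
    if cigar == "*":  # CIGAR unavailable
        return 0
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op == "S")


def summarize_sam(lines, min_clip=1):
    """Count total, paired, and soft-clipped alignments in SAM text lines.

    min_clip is a hypothetical threshold: an alignment counts as
    soft-clipped when it has at least this many clipped bases.
    """
    summary = {"total": 0, "paired": 0, "soft_clipped": 0}
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # SAM alignment records have 11 mandatory fields
        flag, cigar = int(fields[1]), fields[5]
        summary["total"] += 1
        if flag & FLAG_PAIRED:
            summary["paired"] += 1
        if soft_clipped_bases(cigar) >= min_clip:
            summary["soft_clipped"] += 1
    return summary


# Made-up records for illustration only (not taken from any real dataset).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t5S45M\t=\t300\t250\t*\t*",
    "r2\t0\tchr1\t200\t60\t50M\t*\t0\t0\t*\t*",
]
print(summarize_sam(example))
```

      For a real BAM file, the input lines could come from the text output of samtools view. A file with few paired or soft-clipped alignments is probably a poor test case for this issue.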

      One user who uses soft clipping in their research suggested the following links:
      http://jimb.stanford.edu/giab-resources/
      https://toolbox.google.com/datasetsearch

        Attachments

          Issue Links

            Activity

            aduong Anh Moss (Inactive) added a comment -

            Here is a link to a candidate dataset:
            https://figshare.com/articles/BAM_and_BAI_files/4530707/1

            Data posted by Reetta Holmila: "Massively parallel deep sequencing of plasma cfDNA methylation in HCC and controls" (categories: biomarkers)

            This page lists both the BAM and BAI files.

            ann.loraine Ann Loraine added a comment -

            Quick followup:

            Is this DNA-Seq data or bisulfite sequencing data?

            aduong Anh Moss (Inactive) added a comment -

            I looked over the data information, and it does appear to be bisulfite sequencing data. I found this paragraph detailing the data:

            "Massively parallel deep bisulfite sequencing of plasma cfDNA methylation in HCC
            Published on 09 Jan 2017 - 00:14 by Reetta Holmila
            In this study, we applied targeted massively parallel semiconductor sequencing to assess methylation on a panel of genes (FBLN1, HINT2, LAMC1, LTBP1, LTBP2, PSMA2, PSMA7, PXDN, TGFB1, UBE2L3, VIM and YWHAZ) in plasma circulating cell-free DNA (cfDNA) and to evaluate the potential of these genes as HCC biomarkers in two different series, one from France (42 HCC cases and 42 controls) and one from Thailand (42 HCC cases, 26 chronic liver disease cases and 42 controls). We also analyzed a set of HCC and adjacent tissues and liver cell lines to further compare with ‘The Cancer Genome Atlas’ (TCGA) data."

            Link to that page: https://figshare.com/projects/Massively_parallel_deep_bisulfite_sequencing_of_plasma_cfDNA_methylation_in_HCC/18185

            ann.loraine Ann Loraine added a comment - edited

            We also need to locate BAM files from sequencing of non-bisulfite-converted samples from individuals with structural variations.

            ann.loraine Ann Loraine added a comment -

            Un-assigning as Anh is moving on to her next rotation. Good luck with everything!

            ann.loraine Ann Loraine added a comment - edited

            Data sets from genome sequencing are saved in the IGB Dropbox, in the "Jira issues" folder for this issue.
            The data come from a Twitter thread with Brent Pedersen of Aaron Quinlan's lab at the University of Utah. See attached.
            Marking this as closed (for now), since we now have a good handle on how to get more data and where to find potential reviewers and representative users:

            • Brent Pedersen, Quinlan lab (working on structural variants)
            • Greenwood Genetics (human genetics clinic in South Carolina)
            • Canadian Bioinformatics workshop on genomic medicine instructors & attendees (attended by Nowlan & Ann, 2018)
            • Wake Forest human genetics and genomics faculty
            • Steve Chervitz Trutane (Personalis)
            • Clinical variant analysts (https://www.linkedin.com/in/shelly-sorrells-phd-2a46333b/)

              People

              • Assignee:
                aduong Anh Moss (Inactive)
                Reporter:
                aduong Anh Moss (Inactive)
              • Votes:
                0
                Watchers:
                2