
Searching for Genetic Clues in Autism with Family Trio Sequencing

This entry was cross-posted from DNAdigest on April 22, 2015.

Amazingly, whole genome sequencing now costs about 100,000 times less than it did a dozen years ago. If the Tesla Model S had followed this trajectory, you could buy one today for less than $1 USD. This steep decline puts genomics on par with desktop publishing or 3D printing: it has become something that you can affordably do yourself.

My wife, Kimberly, and I were excited about the prospect of having our genomes sequenced. Our daughter has autism, and like many parents of special needs children, we were eager to explore the underlying causes of her condition. We “got genomed” last year by enrolling in Illumina’s Understand Your Genome program. We received our whole genome sequencing (WGS) data, as well as limited predisposition and carrier screening for a number of Mendelian traits. As many DNAdigest readers know, the cost of WGS continues to drop, approaching the $1,000 genome that Illumina announced last year. Kimberly and I were intrigued to learn that we were both carriers of some rare genetic variants. Could our genetic idiosyncrasies be contributing to our daughter’s autism?

After being sequenced, I followed the lead of DNAdigest contributor Manuel Corpas and posted my whole genome sequence online. I decided to publish my genome without restrictions in an attempt to lead by example. In the future, platforms like Repositive will make it easier for consumers to share genomic information and maintain privacy.

Kimberly and I recently launched a project on experiment.com to crowdfund the whole genome sequencing of our adult daughter. In this project, we will look for genetic clues to her autism using family trio sequencing. Family trio sequencing is a powerful technique that can help explain genetic conditions by looking at differences in DNA between Mom, Dad and an affected child.
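To give a feel for what "looking at differences" can mean, here is a minimal Python sketch of a trio comparison. It is only an illustration, not the pipeline we will use: the function name, the labels, and the toy genotypes are all hypothetical, and real analyses work on VCF files with quality filtering that is skipped here. The sketch flags two patterns: an allele in the child that neither parent carries (a candidate de novo variant), and a child who is homozygous for an allele that both parents carry in a single copy.

```python
from typing import Dict, Optional, Tuple

Genotype = Tuple[str, str]  # two alleles at one site, e.g. ("A", "G")


def classify_site(child: Genotype, mother: Genotype, father: Genotype) -> Optional[str]:
    """Label one site in a trio (hypothetical helper for illustration only)."""
    # An allele seen in the child but in neither parent is a de novo candidate.
    novel = set(child) - set(mother) - set(father)
    if novel:
        return "candidate_de_novo"

    # Child has two copies of an allele that each parent carries once.
    child_hom = child[0] == child[1]
    mother_het = mother[0] != mother[1]
    father_het = father[0] != father[1]
    shared = child[0] in mother and child[0] in father
    if child_hom and mother_het and father_het and shared:
        return "homozygous_from_carrier_parents"

    return None


# Toy data (made up): site -> (child, mother, father) genotypes.
toy_trio: Dict[str, Tuple[Genotype, Genotype, Genotype]] = {
    "chr1:1000": (("A", "G"), ("A", "A"), ("A", "A")),
    "chr2:2000": (("T", "T"), ("C", "T"), ("C", "T")),
    "chr3:3000": (("C", "G"), ("C", "C"), ("G", "G")),
}

results = {site: classify_site(*gts) for site, gts in toy_trio.items()}
for site, label in results.items():
    print(site, label)
```

Most sites in a real trio look like the third one, where the child's alleles are ordinary combinations of the parents' alleles, which is why the handful of flagged sites are the interesting ones.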

We were thrilled when the sequencing project was funded on the first day. In the process, we received feedback from other parents who wanted to learn more about the technique, so we added a stretch goal to cover publishing costs in an open access journal. The research paper will document our findings, as well as explain how family trio sequencing can be used to search for answers to health conditions and rare diseases.

Information sharing can indeed be very personal, but we find the possibility of catalyzing new areas of health research compelling. With this project, we hope to find clues that will contribute, if only in a small way, to a growing body of genomics research that supports a broader explanation of autism.

Exploring Markets of Data for Personal Health Information

Consumers are willing to share health information with financial reward

Are some consumers willing to sell their personal health information? It looks like the answer is “yes.” This week, I presented a paper at the IEEE International Conference on Data Mining in Shenzhen, China. This paper summarized the results of an online survey about consumers’ willingness to share de-identified health information, and whether their attitudes would change if a financial reward were offered. Here’s the abstract:

To realize preventive and personalized medicine, large numbers of consumers must pool health information to create datasets that can be analyzed for wellness and disease trends. To date, consumers have been reluctant to share personal health information for a variety of reasons. To explore how financial rewards may influence data sharing, the concept of Markets of Data (MoDAT) is applied to health information. Results from a global online survey show that a previously uncovered group of consumers exists who are willing to sell their de-identified personal health information. Incorporating this information into existing health research databases has the potential to improve healthcare worldwide.

During the presentation, I argued that patient populations for both rare and common diseases can look similar, especially when looking at disease subtypes. When considering relatively common diseases such as diabetes, schizophrenia, and autism spectrum disorders, a single hospital in the U.S. will not see enough patients for a given disease subtype to make meaningful conclusions. On average, U.S.-based hospitals do not have enough patients to solve disease questions without sharing health information.

For this survey, a global panel of 400 participants was selected at random by AYTM, an online market research tool. Questions were based on a previous health information sharing survey, with additional questions about sharing with financial reward. I received 400 responses from 59 countries in less than two hours. U.S.-based respondents overwhelmingly believed that their health information was worth more than $1000, but the global average was around $250 when the U.S. was excluded. For these participants, both their motivation and the amount of data shared increased with financial reward. Keep in mind that these participants were paid to respond to the survey, so they represent a kind of self-selected group. Nevertheless, monetizing health information sharing produced a surprising result, demonstrating that an alternative source of health information may exist for research purposes.

Additional resources: Paper, Supplemental files, Slides

I uploaded my whole genome sequence data to the cloud.

I got genomed by Illumina.

In March 2014, my wife and I “got genomed” by enrolling in Illumina’s Understand Your Genome (UYG) program. UYG requires participants to order the whole genome sequencing (WGS) test from their physicians due to uncertainties surrounding the delivery of genomic results in the U.S. Illumina is careful to point out that the service “…has not been cleared or approved by the U.S. Food and Drug Administration” and “you will not receive medical results, or a diagnosis, or a recommendation for treatment.” Our family physician signed the request in November 2013, and we received our results in February. Fortunately, there were no surprises, but the UYG program only covers a limited set of Mendelian disorders for now. We flew to San Diego a few weeks later to listen to talks by genomic researchers and discuss our results with genetic counselors. As part of this one-day seminar, we each received an iPad Mini that was pre-loaded with our results, as well as a portable hard drive that contained our raw sequence data.

I received my WGS data on this encrypted hard drive (about 100GB).

After we arrived home, the next step was to find a public “home” for my sequence data (to share without restrictions). What I learned is that uploading your genome anywhere is a challenge, mostly because the dataset is so big.

I looked at Dropbox, Evernote and Figshare, but their storage models do not scale well for genomic data. I tried Sage Bionetworks, but the BAM file was too large to upload. I settled on Amazon Web Services (AWS) and created an anonymous FTP server using the Amazon Elastic Compute Cloud (EC2). (I spent a bunch of time working with Amazon’s Simple Storage Service (S3) using this article, but the 5GB file size limit of s3fs nixed that.)

About my whole genome sequence data

My genome data and results are now in the public domain, freely available to download under a Creative Commons (CC0) license. Uploading the data took two days over a 3Mbps connection, so you may want to read the clinical report and sample report instead.

  • BAM file checksum: 2529521235 (78.1GB uncompressed)
  • VCF file checksum: 4165261022 (2.4GB gzip compressed)
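
If you are wondering whether a download is worth the wait, here is a quick back-of-the-envelope estimate in Python. It is only a sketch under my own assumptions: that both files listed above are transferred, that 1GB means one billion bytes, and that the link sustains its full rated speed with no protocol overhead. The function name is made up for this example.

```python
def transfer_days(size_gb: float, link_mbps: float) -> float:
    """Estimate transfer time in days (assumes decimal GB and a fully used link)."""
    bits = size_gb * 1e9 * 8            # gigabytes -> bits
    seconds = bits / (link_mbps * 1e6)  # megabits per second -> bits per second
    return seconds / 86_400             # seconds in a day


bam_gb = 78.1  # BAM file size from the list above
vcf_gb = 2.4   # VCF file size from the list above
days = transfer_days(bam_gb + vcf_gb, link_mbps=3.0)
print(f"{days:.1f} days")
```

At 3Mbps this works out to roughly two and a half days, which is in line with what I saw when uploading. A faster connection shortens the wait proportionally.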

Questions about FTP? See this FAQ.

Now that I have my genome in the cloud, I’ll start playing with analysis tools like STORMSeq. Stay tuned!

Paper: Big Desire to Share Big Health Data


Today I presented this paper about sharing personal health data at the 2014 AAAI Spring Symposium Series, hosted at Stanford University. The paper, co-authored with Melanie Swan, summarized the results of an online survey to gauge consumer attitudes toward sharing health information. Here’s the abstract:

Sharing personal health information is essential to create next generation healthcare services. To realize preventive and personalized medicine, large numbers of consumers must pool health information to create datasets that can be analyzed for wellness and disease trends. Incorporating this information will not only empower consumers, but also enable health systems to improve patient care. To date, consumers have been reluctant to share personal health information for a variety of reasons, but attitudes are shifting. Results from an online survey demonstrate a strong willingness to share health information for research purposes. Building on these results, the authors present a framework to increase health information sharing based on trust, motivation, community, and informed consent.

The take-home messages from the paper are:

  1. Consumers are willing to share health data under the right conditions.
  2. Education seems to play a strong role.
  3. Consumers want to be connected to their data.
  4. Models are needed to encourage sharing.

My favorite part of the talk was explaining how I repeated the survey using an online market research tool. Our respondents were extremely educated — 59% had a Master’s level education or higher — so I wondered if education played a role in their willingness to share. In less than two hours, I posted the survey and received 100 responses (compared with the nine months it took to receive 128 IRB-consented responses). This time, about 20% of the respondents had a Master’s level education or higher, still higher than the US average of 10%, according to the US Census Bureau. Nevertheless, overall attitudes toward sharing were similar. In particular, respondents who were not willing to share their health information tended to have little or no college experience. Although both surveys operated on convenience samples, the results suggest that education plays a role, perhaps because education can change our perception of the risks and benefits associated with sharing health data. Interestingly, these results and conclusions were similar to those found in a recent report published by the Health Data Exploration project sponsored by the Robert Wood Johnson Foundation. More information about this project:

The survey is ongoing! It takes just five minutes, so please add your voice here.

Trust, but verify

Working with 23andMe exome data: my CF allele and the need for verification

This informative blog post from Dr. Jung Choi at Georgia Tech discusses how to use free, publicly available bioinformatics tools to interpret new exome sequence data from 23andMe. The post includes a response from 23andMe in the comments.

Some of the bioinformatics tools that Dr. Choi uses are:

The post highlights the challenges of mapping gene-protein interactions when reporting results.
