Exploring Markets of Data for Personal Health Information

Consumers are willing to share health information with financial reward

Are some consumers willing to sell their personal health information? It looks like the answer is “yes.” This week, I presented a paper at the IEEE International Conference on Data Mining in Shenzhen, China. The paper summarized the results of an online survey about consumers’ willingness to share de-identified health information, and whether their attitudes would change if a financial reward were offered. Here’s the abstract:

To realize preventive and personalized medicine, large numbers of consumers must pool health information to create datasets that can be analyzed for wellness and disease trends. To date, consumers have been reluctant to share personal health information for a variety of reasons. To explore how financial rewards may influence data sharing, the concept of Markets of Data (MoDAT) is applied to health information. Results from a global online survey show that a previously uncovered group of consumers exists who are willing to sell their de-identified personal health information. Incorporating this information into existing health research databases has the potential to improve healthcare worldwide.

During the presentation, I argued that patient populations for both rare and common diseases can look similar, especially when looking at disease subtypes. When considering relatively common diseases such as diabetes, schizophrenia, and autism spectrum disorders, a single hospital in the U.S. will not see enough patients for a given disease subtype to make meaningful conclusions. On average, U.S.-based hospitals do not have enough patients to solve disease questions without sharing health information.

For this survey, a global panel of 400 participants was selected at random by AYTM, an online market research tool. Questions were based on a previous health information sharing survey, with additional questions about sharing with financial reward. I received 400 responses from 59 countries in less than two hours. U.S.-based respondents overwhelmingly believed that their health information was worth more than $1000, but the global average was around $250 when the U.S. was excluded. For these participants, both their motivation and the amount of data shared increased with financial reward. Keep in mind that these participants were paid to respond to the survey, so they represent a kind of self-selected group. Nevertheless, monetizing health information sharing produced a surprising result, demonstrating that an alternative source of health information may exist for research purposes.

Additional resources: Paper, Supplemental files, Slides

I uploaded my whole genome sequence data to the cloud.

I got genomed by Illumina.

In March 2014, my wife and I “got genomed” by enrolling in Illumina’s Understand Your Genome (UYG) program. UYG requires participants to order this whole genome sequence (WGS) test from their physicians due to uncertainties surrounding the delivery of genomic results in the U.S. Illumina is careful to point out that the service “…has not been cleared or approved by the U.S. Food and Drug Administration” and “you will not receive medical results, or a diagnosis, or a recommendation for treatment.” Our family physician signed the request in November 2013, and we received our results in February. Fortunately, no surprises, but the UYG program only covers these Mendelian disorders for now. We flew to San Diego a few weeks later to listen to talks by genomic researchers and discuss our results with genetic counselors. As part of this one-day seminar, we each received an iPad Mini that was pre-loaded with our results, as well as a portable hard drive that contained our raw sequence data.

I received my WGS data on this encrypted hard drive (about 100GB).

After we arrived home, the next step was to find a public “home” for my sequence data (to share without restrictions). What I learned is that uploading your genome anywhere is a challenge, mostly because the dataset is so big.

I looked at Dropbox, Evernote, and Figshare, but their storage models do not scale well for genomic data. I tried Sage Bionetworks, but the BAM file was too large to upload. I settled on Amazon Web Services (AWS) and created an anonymous FTP server using the Amazon Elastic Compute Cloud (EC2). (I spent a bunch of time working with Amazon’s Simple Storage Service (S3) using this article, but the 5GB file size limit of s3fs nixed that.)

About my whole genome sequence data

My genome data and results are now in the public domain, freely available to download under a Creative Commons (CC0) license. Uploading the data took two days over a 3Mbps connection, so you may want to read the clinical report and sample report instead.

  • BAM file checksum: 2529521235 (78.1GB uncompressed)
  • VCF file checksum: 4165261022 (2.4GB gzip compressed)
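
As a rough check on the upload time mentioned above, here is a minimal sketch of the arithmetic. It assumes that both files were uploaded, that the sizes are decimal gigabytes, and that "3Mbps" means a sustained 3 megabits per second with no protocol overhead; none of these details are spelled out in the post, and the function name is only illustrative.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes over a link_mbps link.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbps = 1e6 bits/s)
    and ignores protocol overhead, retries, and throttling.
    """
    bits = size_gb * 1e9 * 8             # gigabytes -> bits
    seconds = bits / (link_mbps * 1e6)   # bits / (bits per second)
    return seconds / 3600

# File sizes from the list above; link speed from the post.
BAM_GB = 78.1
VCF_GB = 2.4
LINK_MBPS = 3.0

total_hours = transfer_hours(BAM_GB + VCF_GB, LINK_MBPS)
print(f"{total_hours:.1f} hours (~{total_hours / 24:.1f} days)")
```

Under these assumptions the estimate comes out to roughly two and a half days, which is in the same range as the "two days" figure above.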

Questions about FTP? See this FAQ.

Now that I have my genome in the cloud, I’ll start playing with analysis tools like STORMSeq. Stay tuned!

A step forward: Consent for Clinical DNA Sequencing at the Iowa Institute of Human Genetics

During a recent podcast on Mendelspod.com, Colleen Campbell at the Iowa Institute of Human Genetics (IIHG) described the process of introducing pharmacogenomic testing and clinical exome sequencing at the University of Iowa. The project started small, but included pharmacogenomic testing for clopidogrel, as well as whole exome sequencing (WES). At IIHG, WES is intended for diagnostic odyssey patients; patients with a large list of differential diagnoses (where WES is more economical than multiple, individual genetic tests); and patients with atypical presentations of disease. (Today, WES provides a diagnostic answer about 25% of the time.)

As part of the process, patients complete this plain language informed consent form that explains the benefits and risks associated with genetic testing. The form lets patients decide how to receive information about incidental and secondary findings. More importantly, the consent form lets patients easily contribute their health information for future research. Unless patients opt out, DNA samples and genetic data can be:

  • Compared with genetic information from others to improve future tests
  • Stored for future studies
  • Placed in a national repository (without identifying information)
  • Used to develop future products and services
  • Published in research studies (results, without identifying information)
  • Made into cell lines (from the DNA blood sample)

The consent form also lets patients opt in so that IIHG can use their genetic information in future research studies (beyond the original purpose for the test).

IIHG has done an exemplary job involving an entire community to integrate genomics into clinical practice. As hospital staff, patients, and the community are educated, genomic medicine will slowly begin to take root.

Note: I would not be surprised to see IIHG presenting their results at conferences over the next year, including AHIMA, AMIA, ANIA, ASHG, and HIMSS.

Paper: Big Desire to Share Big Health Data


Today I presented this paper about sharing personal health data at the 2014 AAAI Spring Symposium Series, hosted at Stanford University. The paper, co-authored with Melanie Swan, summarized the results of an online survey to gauge consumer attitudes toward sharing health information. Here’s the abstract:

Sharing personal health information is essential to create next generation healthcare services. To realize preventive and personalized medicine, large numbers of consumers must pool health information to create datasets that can be analyzed for wellness and disease trends. Incorporating this information will not only empower consumers, but also enable health systems to improve patient care. To date, consumers have been reluctant to share personal health information for a variety of reasons, but attitudes are shifting. Results from an online survey demonstrate a strong willingness to share health information for research purposes. Building on these results, the authors present a framework to increase health information sharing based on trust, motivation, community, and informed consent.

The take-home messages from the paper are:

  1. Consumers are willing to share health data under the right conditions.
  2. Education seems to play a strong role.
  3. Consumers want to be connected to their data.
  4. Models are needed to encourage sharing.

My favorite part of the talk was explaining how I repeated the survey using an online market research tool. Our respondents were highly educated (59% had a Master’s level education or higher), so I wondered if education played a role in their willingness to share. In less than two hours, I posted the survey and received 100 responses (compared with the nine months it took to receive 128 IRB-consented responses). This time, about 20% of the respondents had a Master’s level education or higher, still higher than the U.S. average of 10%, according to the U.S. Census Bureau. Nevertheless, overall attitudes toward sharing were similar. In particular, respondents who were not willing to share their health information tended to have little or no college experience. Although both surveys relied on convenience samples, the results suggest that education plays a role, perhaps because education can change our perception of the risks and benefits associated with sharing health data. Interestingly, these results and conclusions were similar to those found in a recent report published by the Health Data Exploration project sponsored by the Robert Wood Johnson Foundation. More information about this project:

The survey is ongoing! It takes just five minutes, so please add your voice here.

Autism Hackathon in San Francisco

This weekend I collaborated with Melanie Swan at the Autism Hackathon in San Francisco. Sponsored by Twilio and supported by Autism Speaks, this hackathon brought together 50+ developers and designers who created prototype applications for the autism community. At the end of the 24-hour event, a dozen teams presented 5-minute “pitches” for their ideas.

More here: http://www.autismspeaks.org/news/news-item/autism-speaks-and-twilio-team-hacking-autism

Our entry, “MindFlower,” is an “eLabor Marketplace for ASD Solvers.” Think about getting paid for solving puzzles like the ones in FoldIt–that’s the idea.

For more information about MindFlower, see these slides on slideshare.net

Note: MindFlower is just a concept, not an actual business or organization.

(Image credit: Kimberly Pickard)

Restless Legs Syndrome and Niacin Study #2: Quantified Self Meetup in San Francisco

I will be presenting results from my second self-tracking study later tonight at the Quantified Self San Francisco meetup, hosted at Microsoft.


By participating in this crowdsourced study on Genomera, I tested niacin supplementation as a potential treatment for Restless Legs Syndrome (RLS).


This experiment had two main differences from the first one. First, I tapered off my current medication, clonazepam, after ramping up with niacin. Second, I increased the daily niacin dose from 500mg to 2000mg, which meant that the ramp-up was also much longer.


I recorded some sliding scale measurements of RLS sensation, leg jerks, etc. in a spreadsheet (see above). Aggregated measurements are also available to Genomera’s members.


Like the last experiment, niacin did not improve my RLS symptoms, even at the higher dose. However, RLS severity was lower after tapering off clonazepam, perhaps due to the niacin. Since the first experiment, I also started taking an iron supplement to increase my ferritin level, which might also account for diminished RLS severity. As before, I saw my doctor after the experiment to discuss the results. We changed my medication to Mirapex, which is also commonly used to treat RLS. Compared to clonazepam, I feel more alert. The RLS symptoms remain under control, and amazingly, feeling returned to my sciatic nerve about one month ago; I can feel it all the way down to the top of my left big toe. I am unsure what this means, but given that I injured my back 30 years ago, it seems significant.

Finally, I wanted to mention that my psoriasis flared once I started taking niacin at 2.0g/day. Subsequently, I read several articles discouraging psoriatics from taking large doses of niacin.

Overall, this QS journey has been worth it. I learned more about my RLS, but more importantly, how to ask better questions that improved my health.

Link to slides on slideshare.net


Sage Synapse: A home for open medical data


I just posted my 23andMe data to Sage Synapse, a collaborative space that allows scientists to share and analyze data together. After authenticating with Synapse, you can access the data here: https://synapse.sagebase.org/#Synapse:syn1444765

Here’s a short video introduction to the Synapse platform:

I will be adding more data to Synapse in the near future.