Illumina short-read sequence data from targeted capture of 434 genes, including 150 chemosensory genes, in 104 individuals distributed across eight host races of the pea aphid. These data are NERC-funded but not held by the EIDC; they are archived in the European Bioinformatics Institute (EBI) Sequence Read Archive (SRA) under project accession PRJEB6325.
Publication date: 2015-05-15
Sequencing reads sorted by individual (Edinburgh Genomics tags) have been deposited in the EBI Sequence Read Archive (SRA) under project accession no. PRJEB6325 (individual accession numbers are given in Table S4 of the journal paper associated with this record). Some intermediate data sets and result files needed to reproduce parts of the results outside our pipeline (e.g. matrix of raw PRbp counts, matrix of raw alpha from the "optimalCaptureSegmentation" analysis, R data frames used in GLMMs, NJ tree showing data per capture pool and sequencing lane, extended figure 1 including all targeted loci) can be found on DRYAD, doi:10.5061/dryad.jf29v. The analytical framework set up to analyse CNV distribution (including CN estimation from PRbp, GLMMs, RF analyses and general statistics) is publicly available in a Git repository, https://github.com/lduvaux/Duvaux_et_al_2014, and is reusable under the conditions of the GNU General Public License (v3). Other miscellaneous scripts and command lines used to process sequencing results can be made available by the authors on request.
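As a convenience for readers who want to list the deposited runs, the following is a minimal sketch in Python, not part of the authors' pipeline. It assumes the public ENA Portal API "filereport" endpoint and the field names `run_accession`, `sample_accession` and `fastq_ftp`; the helper names (`build_filereport_url`, `parse_filereport`, `fetch_runs`) are hypothetical and introduced here for illustration only.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: ENA Portal API file-report endpoint (not stated in this record).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_filereport_url(accession, fields=("run_accession", "sample_accession", "fastq_ftp")):
    """Build a file-report query URL for a project accession such as PRJEB6325."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",        # one row per sequencing run
        "fields": ",".join(fields),  # assumed field names; adjust as needed
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Parse the tab-separated report into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))


def fetch_runs(accession):
    """Download and parse the run table for a project (requires network access)."""
    with urlopen(build_filereport_url(accession)) as response:
        return parse_filereport(response.read().decode("utf-8"))
```

Calling `fetch_runs("PRJEB6325")` should return one dictionary per run; where a run has paired files, the `fastq_ftp` value typically lists them separated by semicolons. Matching runs to individual aphids still relies on Table S4 of the associated paper.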