[1] DONG Linsong, FANG Ming, WANG Zhiyong*. Comparison of Two Strategies for Setting Hyper-parameters of Prior Distribution in Genomic Prediction[J]. Journal of Xiamen University (Natural Science), 2018, 57(04): 539-545. [doi:10.6043/j.issn.0438-0479.201709021]

Comparison of Two Strategies for Setting Hyper-parameters of Prior Distribution in Genomic Prediction

Journal of Xiamen University (Natural Science) [ISSN: 0438-0479 / CN: 35-1070/N]

Volume:
57
Issue:
2018, No. 04
Pages:
539-545
Section:
Research Article
Publication date:
2018-07-31

Article Info

Title:
Comparison of Two Strategies for Setting Hyper-parameters of Prior Distribution in Genomic Prediction
Article number:
0438-0479(2018)04-0539-07
Author(s):
DONG Linsong, FANG Ming, WANG Zhiyong*
Key Laboratory of Healthy Mariculture for the East China Sea, Ministry of Agriculture, Jimei University, Xiamen 361021, China
Keywords:
genomic selection; prior distribution; hyper-parameter setting; accuracy; simulation study
CLC number:
Q 38
DOI:
10.6043/j.issn.0438-0479.201709021
Document code:
A
Abstract:
Genomic selection is a breeding method that uses whole-genome marker information to estimate individuals' genomic breeding values, on which selection is then based. The statistical methods proposed for it fall mainly into best linear unbiased prediction (BLUP) and Bayesian methods. Each method relies on a prior assumption, so the parameters of the prior distribution must be set. Starting from the principle behind setting prior hyper-parameters, this study discusses how to set them under two strategies: standardizing and not standardizing single nucleotide polymorphism (SNP) genotypes. Seven prediction methods (ridge-regression BLUP (RRBLUP), BayesA, BayesB, BayesCπ, fast BayesB (FBayesB), fast MixP (FMixP) and MixP based on the Markov chain Monte Carlo algorithm (MMixP)) were used to estimate genomic breeding values under both strategies with the QTLMAS2012 simulated data. For any given prediction method, the prediction accuracies with and without standardization of the SNP genotypes were very similar. Because standardizing the genotypes is more generally applicable and highlights SNPs with larger effects, we suggest standardizing SNP genotypes first and then setting the prior hyper-parameters before estimating SNP effects.
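
To make the two strategies in the abstract concrete, the following is a minimal sketch, not the article's own code. It assumes conventions that are common in the genomic-prediction literature: genotypes coded 0/1/2, allele frequency p_j estimated from the sample, Hardy-Weinberg variance 2*p_j*(1-p_j) used for scaling, and a per-SNP prior variance chosen so that the SNP variances add up to a given total genetic variance. The function names, the toy matrix and the value of sigma_g2 are illustrative assumptions; the article's exact formulas are not reproduced on this page.

```python
# Sketch only: assumed conventions, not taken from the article.

# Toy genotype matrix (rows = individuals, columns = SNPs), coded 0/1/2.
GENOTYPES = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 0],
    [1, 1, 0],
]


def allele_freqs(genotypes):
    """Sample allele frequency p_j of the counted allele at each SNP."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]


def center(genotypes, p):
    """Strategy without standardization: subtract the mean 2*p_j only."""
    return [[g - 2 * pj for g, pj in zip(row, p)] for row in genotypes]


def standardize(genotypes, p):
    """Strategy with standardization: center, then divide by
    sqrt(2*p_j*(1-p_j)). Requires 0 < p_j < 1 for every SNP."""
    return [
        [(g - 2 * pj) / (2 * pj * (1 - pj)) ** 0.5 for g, pj in zip(row, p)]
        for row in genotypes
    ]


def marker_variance(sigma_g2, p, standardized):
    """Per-SNP prior variance implied by a total genetic variance sigma_g2,
    assuming every SNP effect shares the same prior variance."""
    if standardized:
        # Each standardized SNP has unit variance, so divide by SNP count.
        return sigma_g2 / len(p)
    # Each centered SNP has variance 2*p_j*(1-p_j) under Hardy-Weinberg.
    return sigma_g2 / sum(2 * pj * (1 - pj) for pj in p)


if __name__ == "__main__":
    p = allele_freqs(GENOTYPES)
    sigma_g2 = 1.0  # assumed total genetic variance, for illustration
    print("allele frequencies:", p)
    print("prior variance, not standardized:", marker_variance(sigma_g2, p, False))
    print("prior variance, standardized:", marker_variance(sigma_g2, p, True))
```

Under these assumptions both settings imply the same total genetic variance, which is one way to see why the two strategies can give similar predictions; the standardized form depends only on the number of SNPs, which is consistent with the abstract's remark that it is more generally applicable.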

Memo

Received: 2017-09-20; Accepted: 2018-05-03
Funding: Key Program of the National Natural Science Foundation of China (U1705231); National Marine Fish Industry Technology System project (CARS-47-G04); Major Project of the Xiamen Southern Oceanographic Center (14GZY70NF34)
*Corresponding author: zywang@jmu.edu.cn
Citation: DONG L S, FANG M, WANG Z Y. Comparison of two strategies for setting hyper-parameters of prior distribution in genomic prediction[J]. J Xiamen Univ Nat Sci, 2018, 57(4): 539-545. (in Chinese)
Last Update: 1900-01-01