The visualization ability can be an extremely useful feature in the application of elliptic Fourier (EF) analysis to the genomic prediction of a biological contour shape. In this study, we predicted biological contour shape based on genome-wide single nucleotide polymorphisms (SNPs). As a proof-of-concept study, we applied the method to the prediction of rice (Oryza sativa L.) grain shape. The contour shape of a brown rice grain was delineated by elliptic Fourier descriptors (EFDs).
Then the EFDs were predicted based on genome-wide marker polymorphisms.
We built prediction models using four methods, which were either linear or nonlinear and either single-dimensional or multi-dimensional regression methods. We compared the accuracy of the methods using multiple datasets of rice germplasm collections, which differed in marker density and sample size. The objectives of this study were (1) to propose a method for predicting rice grain shape delineated by EFDs based on genome-wide marker polymorphisms, (2) to assess the accuracy of the genomic prediction of rice grain shape, and (3) to ascertain an appropriate method for building a model for predicting rice grain shape.
We then discussed the potential of the proposed method for genomic selection of biological contour shape. We used two independent datasets to assess the potential of genomic prediction of rice grain shape. The first, dataset A, was drawn from the rice accessions that had been selected as representatives of the rice germplasm at the National Institute of Agrobiological Sciences (NIAS) Genebank [27].
The second, dataset B, was drawn from the rice accessions that had been used in a study by Zhao et al.
We also analyzed a third dataset, dataset C, which was a subset of dataset B used by Zhao et al. The imputation step was repeated, and the mean imputation scores were obtained by averaging the imputed genotypes over the replications, as proposed by Iwata and Jannink [32]. For dataset B, we analyzed the genotype data of the 1, SNPs that had been genotyped and analyzed by Zhao et al.
For dataset C, we analyzed the genotype data of the 36, SNPs that had been genotyped and analyzed by Zhao et al. Part of the genotypic data was imputed. Brown rice grain images were collected for the accessions in datasets A and B. For dataset A, we used the EFD data of the accessions, which had been analyzed in an earlier study [13].
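The averaging of imputed genotypes over replications can be illustrated with a minimal sketch. The function name `mean_imputation_scores`, the toy matrices, and the 1 / -1 genotype coding in the example are assumptions for illustration; the imputation procedure itself is not reproduced here.

```python
import numpy as np

def mean_imputation_scores(imputed_replicates):
    """Average imputed genotype matrices (accessions x markers) over replicates."""
    stack = np.stack([np.asarray(g, dtype=float) for g in imputed_replicates])
    return stack.mean(axis=0)

# Toy data (hypothetical): three imputation replicates for 2 accessions x 3 markers.
# Only the last marker of the first accession was missing, so only it varies.
replicates = [
    np.array([[1, -1, 1], [-1, -1, 1]]),
    np.array([[1, -1, -1], [-1, -1, 1]]),
    np.array([[1, -1, 1], [-1, -1, 1]]),
]
scores = mean_imputation_scores(replicates)
```

Observed genotypes keep their original values, while an imputed entry becomes a fractional score that reflects how consistently the replicates agreed.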
For dataset A, the EFDs of six grains were measured and recorded for each accession. For dataset B, images of four grains were available for each accession, and we measured the EFDs of the rice grains using the procedure described in the following section. As described above, dataset C was a subset of dataset B; therefore, the phenotypic data collected for dataset B were also used for dataset C. A quantitative description of rice grain shape was conducted as described in Iwata et al.
Rice grain contours were extracted by digital image analysis of the rice grain images. The extracted contour of each grain was represented as a sequence of x and y coordinates of boundary pixels on the contour. Taking the x and y coordinates of the pixel at contour length t from an arbitrary starting pixel, i.e., x(t) and y(t), as periodic functions of t, each can be expanded as a Fourier series, with coefficients a_n and b_n for x(t) and c_n and d_n for y(t). The constant terms of the two series give only the position of the contour, not its shape. Therefore, we ignored both coefficients in the following analysis. In this study, we approximated the contour coordinates of boundary pixels on the contour of a rice grain by the Fourier series with the first 20 harmonics (i.e., N = 20).
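As an illustration of how Fourier coefficients can be obtained from a sequence of boundary coordinates, the sketch below uses the standard piecewise-linear formulas of elliptic Fourier analysis. It is not the authors' implementation: the function name `elliptic_fourier_descriptors` and the circular test contour are assumptions, and the constant terms are simply not computed.

```python
import numpy as np

def elliptic_fourier_descriptors(contour, n_harmonics=20):
    """Return an (n_harmonics, 4) array of [a_n, b_n, c_n, d_n].

    contour: (P, 2) array of boundary points in tracing order; the contour is
    closed implicitly by joining the last point back to the first.
    """
    pts = np.asarray(contour, dtype=float)
    closed = np.vstack([pts, pts[:1]])
    steps = np.diff(closed, axis=0)              # (dx, dy) between neighbours
    dt = np.hypot(steps[:, 0], steps[:, 1])      # step lengths along the contour
    keep = dt > 0                                # drop repeated points
    steps, dt = steps[keep], dt[keep]
    t = np.concatenate([[0.0], np.cumsum(dt)])   # cumulative contour length
    T = t[-1]                                    # total contour length

    coeffs = np.zeros((n_harmonics, 4))
    for n in range(1, n_harmonics + 1):
        phi = 2.0 * np.pi * n * t / T
        dcos = np.cos(phi[1:]) - np.cos(phi[:-1])
        dsin = np.sin(phi[1:]) - np.sin(phi[:-1])
        k = T / (2.0 * n**2 * np.pi**2)
        coeffs[n - 1] = [
            k * np.sum(steps[:, 0] / dt * dcos),  # a_n
            k * np.sum(steps[:, 0] / dt * dsin),  # b_n
            k * np.sum(steps[:, 1] / dt * dcos),  # c_n
            k * np.sum(steps[:, 1] / dt * dsin),  # d_n
        ]
    return coeffs

# Hypothetical test contour: a circle of radius 2 centred away from the origin.
theta = np.linspace(0.0, 2.0 * np.pi, 720, endpoint=False)
circle = np.column_stack([5 + 2 * np.cos(theta), 5 + 2 * np.sin(theta)])
efd = elliptic_fourier_descriptors(circle, n_harmonics=20)
```

For this circle the first harmonic comes out close to (2, 0, 0, 2) and the higher harmonics close to zero, and the offset of the centre does not enter the coefficients, which matches the decision to ignore the constant terms.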
Error in the Fourier series approximation with the first N harmonics was calculated as the proportion of squared displacements between observed and approximated contour coordinates to the total sum of squares of the variations of observed contour coordinates. The numbers of pixels P differed between grains, depending on the size of grains and the scale of grain images.
For this study, we evaluated displacements between the observed and approximated coordinates equally over the entire contour.
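The error measure described above can be sketched as a ratio of two sums of squares, with every boundary point weighted equally. The function name `approximation_error` and the toy coordinates are assumptions for illustration.

```python
import numpy as np

def approximation_error(observed, approximated):
    """Squared displacements divided by the total sum of squares of the observed contour.

    Both inputs are (P, 2) arrays of x and y coordinates for the same P points.
    """
    obs = np.asarray(observed, dtype=float)
    app = np.asarray(approximated, dtype=float)
    squared_displacement = np.sum((obs - app) ** 2)
    total_sum_of_squares = np.sum((obs - obs.mean(axis=0)) ** 2)
    return squared_displacement / total_sum_of_squares

# Toy example (hypothetical): a square contour and an approximation shifted by 0.1 in x.
square = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0], [0.0, 2.0]])
err = approximation_error(square, square + np.array([0.1, 0.0]))
```

Because the ratio is taken per grain, it stays comparable between grains whose contours contain different numbers of pixels.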
If one were interested in local shape variation, one could assign a larger weight to squared displacements in the region of interest than to the remainder. Because the Fourier coefficients a_n, b_n, c_n, and d_n calculated as described above are not invariant to size, rotation, or the position of the starting point of the contour trace, they were standardized to be invariant to these factors according to the size and direction of the long axis of the first harmonic ellipse [25, 34]. The standardization can be performed mathematically.
It has been used in quantitative genetic analyses of biological shape [13, 35-38]. We set N to 20 in the present study.
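One common way to standardize Fourier coefficients on the first harmonic ellipse is sketched below: the starting point is moved onto the long axis, the contour is rotated so that this axis lies along x, and all coefficients are divided by its half-length. This is a sketch of that general procedure and may differ in detail from the standardization used in the study; `standardize_efd` is a hypothetical name.

```python
import numpy as np

def standardize_efd(coeffs):
    """Standardize an (N, 4) array of [a_n, b_n, c_n, d_n] on the first harmonic ellipse."""
    c = np.asarray(coeffs, dtype=float)
    a1, b1, c1, d1 = c[0]
    # Phase shift that puts the starting point on the long axis of the first ellipse.
    theta = 0.5 * np.arctan2(2.0 * (a1 * b1 + c1 * d1),
                             a1**2 + c1**2 - b1**2 - d1**2)
    out = np.empty_like(c)
    for i, row in enumerate(c):
        n = i + 1
        shift = np.array([[np.cos(n * theta), -np.sin(n * theta)],
                          [np.sin(n * theta), np.cos(n * theta)]])
        out[i] = (row.reshape(2, 2) @ shift).ravel()
    # Direction and half-length of the long axis after the phase shift.
    psi = np.arctan2(out[0, 2], out[0, 0])
    size = np.hypot(out[0, 0], out[0, 2])
    rotate = np.array([[np.cos(psi), np.sin(psi)],
                       [-np.sin(psi), np.cos(psi)]])
    for i in range(len(out)):
        out[i] = (rotate @ out[i].reshape(2, 2)).ravel() / size
    return out

# Hypothetical coefficients for five harmonics.
raw = np.random.default_rng(0).normal(size=(5, 4))
standardized = standardize_efd(raw)
```

After this step the first harmonic has a_1 = 1 and b_1 = c_1 = 0, so these three entries no longer carry information. A starting point at the opposite end of the long axis can still leave a sign ambiguity, which this sketch does not resolve.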
The dimensionality of the vector f of EFDs was therefore determined by N. Let the vector f_l denote the average values of the EFDs of six (dataset A) or four (dataset B) grains for the l-th accession of a rice germplasm collection. Based on these average values, the average contour coordinates for the l-th accession can be calculated by evaluating the Fourier series for x and y with the averaged coefficients. To visualize the grain shape variation observed in datasets A and B, we overlaid the average contours of all accessions in each dataset.
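A minimal sketch of how average contour coordinates could be computed from the averaged EFDs of several grains follows. It assumes the truncated series x(t) = sum of a_n cos(2 pi n t / T) + b_n sin(2 pi n t / T), with the analogous series in c_n and d_n for y(t); the function names and toy coefficients are hypothetical.

```python
import numpy as np

def contour_from_efd(efd, n_points=100, T=1.0):
    """Evaluate the truncated Fourier series at n_points values of t in [0, T)."""
    coef = np.asarray(efd, dtype=float).reshape(-1, 4)   # rows: [a_n, b_n, c_n, d_n]
    t = np.linspace(0.0, T, n_points, endpoint=False)
    n = np.arange(1, len(coef) + 1)
    phase = 2.0 * np.pi * np.outer(t, n) / T             # shape (n_points, N)
    x = np.cos(phase) @ coef[:, 0] + np.sin(phase) @ coef[:, 1]
    y = np.cos(phase) @ coef[:, 2] + np.sin(phase) @ coef[:, 3]
    return np.column_stack([x, y])

def average_contour(grain_efds, n_points=100):
    """Average the EFDs over grains (first axis), then reconstruct one contour."""
    mean_efd = np.mean(np.asarray(grain_efds, dtype=float), axis=0)
    return contour_from_efd(mean_efd, n_points)

# Toy accession (hypothetical): two grains, one harmonic each.
grains = np.array([[[1.0, 0.0, 0.0, 0.4]],
                   [[1.0, 0.0, 0.0, 0.6]]])
contour = average_contour(grains, n_points=4)
```

Plotting the output of `average_contour` for every accession on shared axes would give the kind of overlay described above.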
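The significance of among-accession variation relative to within-accession variation was also tested. Of the four methods, the first and second predict each coefficient of the EFDs separately: we built a prediction model g(x) for f_kl from the vector x of marker genotypes, where f_kl is the k-th entry of the vector f_l (i.e., the k-th EFD coefficient of the l-th accession).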
Each element of x had a value of 1 or -1 depending on the genotype of each marker. To build the prediction model g(x), we used ridge regression (RR) and nonlinear kernel ridge regression (KRR). In the KRR, the kernel includes a bandwidth parameter h. For this study, we chose h based on d_m, the median of the Euclidean distances of x among all pairs of accessions, as chosen by Crossa et al.
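The two single-coefficient methods can be sketched with NumPy as follows. Several details are assumptions rather than the study's settings: the Gaussian form of the kernel, the bandwidth rule h = 1 / d_m^2, the regularization value `lam`, and all function names. The simulated genotypes and trait are toy data.

```python
import numpy as np

def sq_dists(A, B):
    """Squared Euclidean distances between the rows of A and the rows of B."""
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.maximum(sq, 0.0)

def median_bandwidth(X):
    """Assumed rule: h = 1 / d_m^2, with d_m the median pairwise distance."""
    d = np.sqrt(sq_dists(X, X))
    d_m = np.median(d[np.triu_indices(len(X), k=1)])
    return 1.0 / d_m**2

def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression in dual form, convenient when markers outnumber accessions."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(Xc)), y_train - y_mean)
    return (X_test - x_mean) @ Xc.T @ alpha + y_mean

def kernel_ridge_predict(X_train, y_train, X_test, lam=1.0, h=None):
    """Kernel ridge regression with an assumed Gaussian kernel exp(-h * d^2)."""
    h = median_bandwidth(X_train) if h is None else h
    K = np.exp(-h * sq_dists(X_train, X_train))
    y_mean = y_train.mean()
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_train - y_mean)
    return np.exp(-h * sq_dists(X_test, X_train)) @ alpha + y_mean

# Toy data (hypothetical): 40 accessions, 30 markers coded 1 / -1, one EFD coefficient.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(40, 30))
y = X[:, 0] + 0.5 * X[:, 1]
rr_pred = ridge_predict(X[:30], y[:30], X[30:])
krr_pred = kernel_ridge_predict(X[:30], y[:30], X[30:])
```

Predicting each coefficient separately means calling one of these functions once per entry of the EFD vector.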
The third and fourth methods predict all the coefficients of the EFDs simultaneously: we built a prediction model g(x) for f_l, where f_l is the vector of EFDs of the l-th accession.
To build the prediction model g(x), we used ordinary (i.e., linear) and kernel partial least squares (PLS) regression. For the regression analysis, we used R scripts written by the first author. In the R scripts, we used the algorithm described by Rosipal and Trejo [42]. To determine the number of PLS components, we performed nested ten-fold cross-validation: within each fold of the ten-fold cross-validation used to evaluate prediction accuracy, we performed a further ten-fold cross-validation to determine the number of PLS components.
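The nested cross-validation can be sketched as follows. The sketch is in Python rather than R, uses a plain linear PLS fit written for illustration (the kernel variant of Rosipal and Trejo is not reproduced), and `max_comp=5` and all function names are assumptions.

```python
import numpy as np

def pls_fit(X, Y, n_comp):
    """Fit linear PLS regression; returns (x_mean, y_mean, coefficient matrix)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_comp):
        # Weight vector: dominant left singular vector of the cross-covariance.
        w = np.linalg.svd(Xr.T @ Yr, full_matrices=False)[0][:, 0]
        t = Xr @ w                                   # component scores
        tt = t @ t
        if tt < 1e-12:                               # nothing left to extract
            break
        p, q = Xr.T @ t / tt, Yr.T @ t / tt          # X and Y loadings
        Xr, Yr = Xr - np.outer(t, p), Yr - np.outer(t, q)   # deflation
        W.append(w); P.append(p); Q.append(q)
    if not W:
        return x_mean, y_mean, np.zeros((X.shape[1], Y.shape[1]))
    W, P, Q = (np.column_stack(m) for m in (W, P, Q))
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, Q.T)

def pls_predict(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean

def make_folds(n, k, rng):
    return np.array_split(rng.permutation(n), k)

def cv_error(X, Y, n_comp, folds):
    """Sum of squared prediction errors over the given folds."""
    err = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(len(X)), test)
        pred = pls_predict(pls_fit(X[train], Y[train], n_comp), X[test])
        err += np.sum((Y[test] - pred) ** 2)
    return err

def nested_cv_predict(X, Y, max_comp=5, k=10, seed=0):
    """Outer k-fold for accuracy; inner k-fold on each training set picks n_comp."""
    rng = np.random.default_rng(seed)
    pred, chosen = np.zeros_like(Y, dtype=float), []
    for test in make_folds(len(X), k, rng):
        train = np.setdiff1d(np.arange(len(X)), test)
        inner = make_folds(len(train), k, rng)       # same inner folds for every candidate
        errors = [cv_error(X[train], Y[train], m, inner) for m in range(1, max_comp + 1)]
        best = int(np.argmin(errors)) + 1
        chosen.append(best)
        pred[test] = pls_predict(pls_fit(X[train], Y[train], best), X[test])
    return pred, chosen

# Toy data (hypothetical): 60 accessions, 15 markers, 4 EFD coefficients.
rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(60, 15))
Y = X[:, :3] @ rng.normal(size=(3, 4)) + 0.1 * rng.normal(size=(60, 4))
pred, chosen = nested_cv_predict(X, Y)
```

Because the number of components is chosen using only the outer training accessions, the outer test accessions never influence model selection.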
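We set an upper limit on the number of PLS components. Using the Fourier series equations for the contour coordinates, predicted contours can be calculated from the predicted EFDs.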
We set T to 1. To evaluate the accuracy of the genomic prediction of grain shape, we performed cross-validation, in which we calculated the squared prediction error of grain shape of each accession as follows.
Integrating the displacement of x coordinates on a predicted contour from the corresponding x coordinates on an average contour in the l- th accession of a rice germplasm collection, we obtain the equations shown below. The integration of displacement in y coordinates of predicted and average contours in the l- th accession is calculable in the same way.
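One way to write these integrals explicitly is given here as a sketch. It assumes that the displacement is squared before integration and that both contours are truncated Fourier series with N harmonics and period T; hats mark predicted coefficients, and the index notation is an assumption. The cross terms vanish because sines and cosines are orthogonal over one period.

```latex
% Assumed series: x_l(t) = \sum_{n=1}^{N} [ a_{nl}\cos(2\pi n t/T) + b_{nl}\sin(2\pi n t/T) ],
% and likewise y_l(t) with coefficients c_{nl} and d_{nl}.
\int_0^T \bigl(\hat{x}_l(t) - x_l(t)\bigr)^2 \, dt
  = \frac{T}{2} \sum_{n=1}^{N} \Bigl[ (\hat{a}_{nl} - a_{nl})^2 + (\hat{b}_{nl} - b_{nl})^2 \Bigr]

\int_0^T \bigl(\hat{y}_l(t) - y_l(t)\bigr)^2 \, dt
  = \frac{T}{2} \sum_{n=1}^{N} \Bigl[ (\hat{c}_{nl} - c_{nl})^2 + (\hat{d}_{nl} - d_{nl})^2 \Bigr]
```

Under these assumptions the error in contour space reduces to a sum of squared differences between predicted and average coefficients, so no numerical integration is needed.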
Consequently, we can calculate the squared prediction error of grain shape of the l-th accession as the sum of the integrated displacements in both coordinates. The accuracy of predicted grain shapes was then measured by the Q^2 statistic (Eq. 3), which uses the means of the coefficients a_n, b_n, c_n, and d_n over all accessions. Q^2 represents the proportion of the variation explained by the prediction to the total variation in contour coordinates. As described in the previous subsection, we set the value of T to 1. To evaluate the prediction accuracy based on the squared prediction error of grain shape of each accession and the Q^2 statistic, we conducted cross-validation of two types: leave-one-out cross-validation and ten-fold cross-validation.
To evaluate the variation attributable to random splits of samples in the ten-fold cross-validation, we repeated ten-fold cross-validation 10 times on different splits of samples.
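The evaluation can be sketched as below. The form Q^2 = 1 - (sum of squared prediction errors) / (total variation around the mean coefficients) is an assumed reading of the statistic, the per-accession error uses the coefficient-difference form with T = 1, and all function names are hypothetical.

```python
import numpy as np

def squared_prediction_error(pred_efd, obs_efd, T=1.0):
    """Per-accession error: (T / 2) * sum of squared coefficient differences."""
    diff = np.asarray(pred_efd, dtype=float) - np.asarray(obs_efd, dtype=float)
    return 0.5 * T * np.sum(diff**2, axis=1)

def q2_statistic(pred_efd, obs_efd, T=1.0):
    """Assumed form: 1 - prediction error / variation around the mean over accessions."""
    obs = np.asarray(obs_efd, dtype=float)
    mean_efd = np.broadcast_to(obs.mean(axis=0), obs.shape)
    press = squared_prediction_error(pred_efd, obs, T).sum()
    total = squared_prediction_error(mean_efd, obs, T).sum()
    return 1.0 - press / total

def leave_one_out(n):
    """Yield each accession index once as a one-element test set."""
    for i in range(n):
        yield np.array([i])

def repeated_kfold(n, k=10, repeats=10, seed=0):
    """Yield (repeat index, test indices) for repeated k-fold splits."""
    rng = np.random.default_rng(seed)
    for r in range(repeats):
        for test in np.array_split(rng.permutation(n), k):
            yield r, test

# Toy check (hypothetical): two accessions, two coefficients each.
obs = np.array([[1.0, 0.0], [3.0, 0.0]])
q2_perfect = q2_statistic(obs, obs)
q2_mean_only = q2_statistic(np.tile(obs.mean(axis=0), (2, 1)), obs)
```

Computing Q^2 once per repeat from `repeated_kfold` gives a spread of values that reflects the effect of the random splits.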