Year: 2014 | Volume: 1 | Issue: 2 | Page: 67-80
Morphological assessment of embryo quality during assisted reproduction: A systematic review
Athanasios Papathanasiou1, Bayan Osmani2, Pek Joo Teoh3, Abha Maheshwari2
1 Department of Gynaecology, Bourn-Hall Clinic, Cambridge/Norwich, England, United Kingdom
2 Department of Embryology, Aberdeen Centre of Reproductive Medicine, Aberdeen, Scotland, United Kingdom
3 Department of Gynaecology, Concept Fertility Centre, Perth, Australia
|Date of Web Publication||13-Aug-2015|
Dr. Abha Maheshwari
Consultant Reproductive Medicine and Surgery, Aberdeen Centre of Reproductive Medicine, Aberdeen, Scotland
Source of Support: None, Conflict of Interest: None
Background: Various parameters of embryo morphology are routinely used to select the embryo(s) with maximum implantation potential during in vitro fertilization (IVF). Hence, there is a dilemma in clinical practice as to which morphological scoring system or test to use. We performed a systematic review to determine the predictive power as well as the clinical and cost-effectiveness of existing morphological tests of embryo quality described in an IVF setting. Materials and Methods: The preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines for systematic reviews were followed. A mixed-method analysis was performed. Qualitative and quantitative techniques were used to synthesize the final results. A narrative summary approach was used for initial data exploration and description, followed by the pooling of data, where appropriate, using Meta-DiSc software. Receiver operating characteristic (ROC) curves were plotted wherever appropriate, and the area under the curve (AUROC) was determined. Results: Day 3, day 5, and early cleavage (EC) assessments all had similar discriminatory value for predicting implantation (AUROC 0.66, 0.67, and 0.63, respectively). There was no evidence of improvement in pregnancy rates from routinely performing EC assessment. No studies were identified that determined the cost-effectiveness of any of the tests. Conclusions: All tests have low accuracy. They lack the discriminatory power to identify an embryo that will or will not lead to implantation. Appropriately designed studies are required to assess the predictive value and the clinical and cost-effectiveness of novel embryo scoring technologies.
Keywords: Embryo, implantation, pregnancy, quality, test
|How to cite this article:|
Papathanasiou A, Osmani B, Teoh PJ, Maheshwari A. Morphological assessment of embryo quality during assisted reproduction: A systematic review. Fertil Sci Res 2014;1:67-80
Background
Multiple pregnancies are the single biggest risk of assisted reproduction. Single embryo transfer (SET) has the potential to virtually eliminate multiple pregnancies. However, despite widespread promotion of SET, only 16.8% of the embryo transfers in the United Kingdom (UK) in 2011 were elective SETs (http://www.hfea.gov.uk/docs/HFEA_Fertility_Trends_and_Figures_2011_-_Annual_Register_Report.pdf). As a result, multiple pregnancy rates were still over 20%. One of the stated barriers for SET is our inability to select the optimal embryo for implantation.  By using standard morphological criteria, it may not be possible to select the best embryo at the cleavage stage. Extended culture has been suggested as a preferential method to select the best embryo. However, this has not eliminated multiple embryo transfers, and over 25% of the double embryo transfers (DETs) in the UK in 2011 were at the blastocyst stage. In addition, concerns have been recently raised about preterm labor in pregnancies subsequent to blastocyst transfer.  Moreover, the cumulative pregnancy rate per woman, after combined fresh and subsequent frozen transfers, is lower for blastocyst transfers compared to transfers at the cleavage stage.  Ideally, one would like to be able to determine the embryo with the best implantation potential by day 3, followed by transfer and freezing, in order to maximize cumulative pregnancy rates and minimize multiple pregnancy rates.
Numerous morphological parameters and scoring systems have been advocated to determine the embryo with implantation potential, a testament to the fact that there is no single best test. Theoretically, a combination of multiple scoring methods should improve a test's predictive value. However, a considerable amount of time and money may be spent on doing such tests. Moreover, there are concerns regarding the repeated handling of embryos that may be required when performing such tests: This may adversely affect the incubation and culture process and, subsequently, the outcome of in vitro fertilization (IVF). Hence, uncertainty still exists in clinical practice as to which scoring system to use and how effective these tests are.
We performed a systematic review to determine the predictive value, clinical effectiveness, and cost-effectiveness of the various morphology-based embryo scoring tests described in the literature. The purpose of this exercise was to provide evidence-based guidance on the predictive properties of individual tests or combinations of tests, so as to enable IVF practitioners to select the best embryos for transfer to the uterus or for freezing, with minimal disruption.
Materials and Methods
The preferred reporting items for systematic reviews and meta-analyses (PRISMA) guidelines for systematic reviews were followed. 
Data sources and literature search
The searches were performed in two steps. An initial literature search was performed (1988-February 2015) on Medline, Excerpta Medica dataBASE (EMBASE), Cochrane Central Register of Controlled Trials, Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Database of Abstracts of Reviews of Effects (DARE) for published studies (key words: "embryo quality", "embryo", "scoring", "zygote scoring", "cleavage scoring", "early cleavage scoring", "cumulative scoring", "implantation", "pregnancy", "ART"). This initial exercise helped in scoping which tests have been described in the literature. Once the tests were listed, the searches were repeated using key words specific for each test. There were no language restrictions. Relevant journals in the specialty (Human Reproduction, Human Reproduction Update, RBM Online, and Fertility and Sterility) were also searched for advance access publications. Cross-references from the included studies were hand-searched. Two review authors (AP, PT) independently conducted the searches and selected the studies to be included, while a third author (BO) conducted the searches in advance access publications. Repeat searches for each test, as identified from the first step, were undertaken by two authors (AM and AP). Articles were included according to predetermined criteria. Differences of opinion were resolved after team discussion. Data were extracted using predesigned tables. Care was taken to avoid duplication of data where two studies from the same authors used the same population.
The following inclusion and exclusion criteria were applied.
To determine predictive power
All published studies in which the predictive value of any morphology test of embryo quality was calculated were included if it was feasible to create a 2 × 2 table from the published data, i.e., a normal test and an abnormal test were defined, and the cases with positive and negative tests were compared with a reference standard. In studies where two tests were compared with a reference standard, data for each test were separately extracted.
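As an illustration only, the accuracy measures that follow from such a 2 × 2 table can be sketched in Python. The class name `TwoByTwo` and the counts in `example` are hypothetical and are not taken from any included study; the formulas are the standard definitions of sensitivity, specificity, and the positive likelihood ratio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one index test compared with one reference standard."""
    tp: int  # test positive, outcome achieved (e.g., implantation)
    fp: int  # test positive, outcome not achieved
    fn: int  # test negative, outcome achieved
    tn: int  # test negative, outcome not achieved

    def sensitivity(self) -> float:
        # Proportion of successful embryos that had a positive test
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of unsuccessful embryos that had a negative test
        return self.tn / (self.tn + self.fp)

    def lr_positive(self) -> float:
        # LR+ = sensitivity / (1 - specificity)
        return self.sensitivity() / (1.0 - self.specificity())


# Hypothetical counts, for illustration only.
example = TwoByTwo(tp=30, fp=70, fn=20, tn=80)
print(round(example.sensitivity(), 3),
      round(example.specificity(), 3),
      round(example.lr_positive(), 3))
```

A study that does not report enough information to fill all four cells cannot be represented this way, which is the practical meaning of the inclusion criterion above.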
To determine clinical and cost-effectiveness
All studies that compared outcomes in two groups (those who either had or did not have the test) were included.
Studies where blastocyst formation was used as the reference standard were excluded. We excluded studies evaluating invasive tests and those reporting on tests of oocyte and sperm quality. Conference abstracts and animal studies were also excluded.
Definition of reference standards
Implantation rate
This is defined as the number of gestational sacs on ultrasound per embryo transferred.
Clinical pregnancy rate
This is defined as the presence of a fetal heart beat on 7-week ultrasound per embryo transfer.
Live birth rate
This is defined as live birth per embryo transfer.
As this review addresses the predictive power of embryo grading systems, a "per transfer" denominator was considered to be appropriate.
To determine predictive power
For each test, data were extracted in 2 × 2 tables. Data were pooled if there were at least two studies that defined the positive and negative test in the same way and compared the test with the same reference standard. When implantation rates acted as the reference standard, pooling of studies was restricted to studies with SETs or where per-embryo data could be extracted. Meta-analysis was attempted wherever appropriate.
The results were organized by entering all data reported on each test from several studies together. Studies were tested for heterogeneity, and the I² index was calculated. Summary receiver operating characteristic (SROC) curves were produced wherever an inverse correlation was evident, based on a Spearman correlation coefficient between sensitivity and specificity of 0.6 or more. The Moses-Littenberg linear regression model was used. The area under the curve (AUROC) with standard error (SE) was calculated. When no SROC could be produced, positive likelihood ratios (LR+) were calculated and reported. The Meta-DiSc software was used. Subgroup analysis was performed using specific features of the test.
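The Moses-Littenberg approach can be sketched as follows. This is a minimal, unweighted illustration and not a reproduction of the Meta-DiSc implementation: for each study, D = logit(TPR) - logit(FPR) is regressed on S = logit(TPR) + logit(FPR), and the fitted line D = a + bS is converted back to a curve in ROC space. The function names, the 0.5 continuity correction applied to every cell, and the integration of the curve over the full range of false-positive rates are all assumptions made for this sketch.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def expit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fit_moses_littenberg(tables):
    """Fit D = a + b*S by ordinary least squares.

    tables: iterable of (tp, fp, fn, tn) counts, one tuple per study.
    A 0.5 continuity correction is added to every cell so that the
    logits stay finite when a cell is zero (an assumption of this sketch).
    """
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in tables:
        tpr = (tp + 0.5) / (tp + fn + 1.0)
        fpr = (fp + 0.5) / (fp + tn + 1.0)
        d_vals.append(logit(tpr) - logit(fpr))  # log diagnostic odds ratio
        s_vals.append(logit(tpr) + logit(fpr))  # proxy for the test threshold
    n = len(d_vals)
    mean_s = sum(s_vals) / n
    mean_d = sum(d_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx if sxx > 0 else 0.0
    a = mean_d - b * mean_s
    return a, b


def sroc_tpr(fpr: float, a: float, b: float) -> float:
    """Sensitivity on the fitted SROC curve at a given false-positive rate.

    Not defined for b == 1, where the back-transformation divides by zero.
    """
    return expit(a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(fpr))


def sroc_auc(a: float, b: float, steps: int = 10000) -> float:
    """Area under the SROC curve by the midpoint rule over FPR in (0, 1)."""
    width = 1.0 / steps
    return sum(sroc_tpr((i + 0.5) * width, a, b) for i in range(steps)) * width
```

With a = 0 and b = 0 the curve coincides with the diagonal and `sroc_auc` returns approximately 0.5, the value that corresponds to no discrimination.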
To determine clinical effectiveness
For each test, data were extracted in 2 × 2 tables and pooled if at least two studies had compared the same test. Data were pooled using RevMan 5.2 (Review Manager 2012, Cochrane Collaboration) to calculate the odds ratio (OR) of pregnancy, with its 95% confidence interval (CI). The intervention group received the embryo scoring test of interest, while the control group did not.
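For readers unfamiliar with how an OR is pooled across studies, a generic fixed-effect, inverse-variance calculation is sketched below, together with Cochran's Q and the I² index. The pooling model used within RevMan is not stated here, so this sketch may differ from the method actually applied; the function name `pooled_odds_ratio` is hypothetical, and the code assumes that no cell of any table is zero.

```python
import math


def pooled_odds_ratio(tables, z=1.96):
    """Fixed-effect inverse-variance pooling of log odds ratios.

    tables: list of (a, b, c, d) = (pregnant in test group, not pregnant
    in test group, pregnant in control group, not pregnant in control group).
    Returns (pooled OR, CI lower, CI upper, I-squared in percent).
    Assumes every cell is non-zero.
    """
    log_ors, weights = [], []
    for a, b, c, d in tables:
        log_ors.append(math.log((a * d) / (b * c)))
        # Weight = 1 / variance of the log OR (Woolf's formula)
        weights.append(1.0 / (1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d))
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_w
    se = math.sqrt(1.0 / total_w)
    # Cochran's Q and the I-squared index of heterogeneity
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(tables) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return (math.exp(pooled),
            math.exp(pooled - z * se),
            math.exp(pooled + z * se),
            i2)
```

A pooled 95% CI that includes 1 indicates no statistically significant difference in the odds of pregnancy between the tested and untested groups.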
Quality of studies
Quality assessment of the included studies was performed by three authors (AP, PT, and BO) using the quality assessment of diagnostic accuracy studies (QUADAS) tool. Any disagreement regarding the type and quality of the studies was resolved after discussion.
Results
[Table 1] lists the parameters and time of morphological assessment as described in the literature. For all morphology assessments, searches were simultaneously performed. Out of 56 articles, 28 studies were excluded with reasons; two studies had duplicate data. Most studies on morphological assessments were not necessarily designed to determine the predictive value of morphology, as morphology assessment is routine clinical practice in every embryology laboratory. However, we were able to extract data from these articles for prediction of implantation and/or pregnancy. We felt that it was important to assess the predictive value of morphological assessments, as this will put the newer tests into perspective. Morphology was assessed at various stages as follows:
Zygote scoring (pronuclear morphology)
A total of eight studies assessed the prediction of zygote scoring for embryo quality. Most were retrospective studies. The precise definitions of the index test varied among the included studies [Table 1]. Depending on the study, embryo transfers were performed on day 2, day 3, day 5, day 2, 3, or 5, or day 2 or 3.
Prediction of implantation
Six studies assessed the impact of zygote scoring systems on implantation rates. Except for one study, DETs were performed. Data from these studies were pooled. No heterogeneity (I² = 0%) was detected. The Spearman correlation coefficient was 0.829, so an SROC curve was constructed [AUROC 0.57 (SE 0.017)].
Prediction of pregnancy
Seven studies assessed the prediction of clinical pregnancy using a zygote scoring system. Data from these studies could be pooled. No heterogeneity (I² = 0%) was detected among the pooled studies. The Spearman correlation coefficient was 1, so an SROC curve was constructed [AUROC 0.58 (SE 0.023)] [Figure 1].
|Figure 1: SROC curve for prediction of clinical pregnancy by zygote scoring|
No studies that assessed the clinical or cost-effectiveness of performing zygote scoring were identified. Chen et al. compared zygote scoring to early cleavage (EC) in a prospective randomized controlled trial (RCT) and reported no significant differences with regard to pregnancy rates.
Value in clinical practice
As is evident from [Table 1], there is a lack of consensus among the included studies on the exact method necessary to evaluate pronuclear morphology. Even though pronuclear scoring is statistically better than chance at predicting pregnancy or implantation, it possesses limited accuracy (based on the low LR+) and discrimination (based on the low AUROC). Currently, there is no strong evidence for its routine use in clinical practice.
Day 2 morphology
Six studies were identified where day 2 morphology was assessed with regard to pregnancy or implantation [Table 2]. In all studies, embryo transfers were performed on day 2, except for one, where transfers were performed on day 3. The studies assessed morphology mainly by means of blastomere numbers, fragmentation, or multinucleation. Sjoblom et al. assessed morphology by an elaborate weighted score, which also examined other features, such as the zona pellucida thickness and the appearance of the cell cytoplasm, membrane, and perivitelline space. Holte et al. included symmetry of cleavage in their scoring criteria. The cutoff values for blastomere cell numbers, fragmentation, and multinucleation differed between studies.
Prediction of implantation
Three studies reported data on prediction of implantation. Significant heterogeneity was detected (I² = 91.5%), and the reported LR+ was 1.56 (95% CI 1.13-2.14). No SROC curve was constructed (Spearman correlation coefficient 0.5).
Prediction of pregnancy
Four studies reported on prediction of pregnancy. Significant statistical heterogeneity (I² = 83.7%) was detected. An SROC curve could be generated (Spearman correlation coefficient 0.8), and the AUROC was 0.61 (SE 0.05).
Prediction of live birth
Three studies reported on live birth. Significant statistical heterogeneity (I² = 77.2%) was detected. An SROC curve could be generated (Spearman correlation coefficient 1), and the AUROC was 0.66 (SE 0.08).
No studies that assessed the clinical effectiveness or cost-effectiveness of performing day 2 embryo morphology scoring were identified.
Value in clinical practice
Although there are no separate studies on clinical effectiveness, day 2 morphology assessment prior to embryo transfer is routine practice. However, performing day 2 morphology scoring as an extra test to select embryos for day 3 and beyond is not backed by current evidence as it possesses limited accuracy (based on the low LR+) and discrimination (based on the low AUROC).
Day 3 morphology
Data on the predictive power of day 3 morphology could be obtained from seven studies [Table 3]. Day 3 assessment was based on the number of blastomeres and the degree of fragmentation in all studies but one. As with day 2 morphology, cutoff points varied among studies. In one study, embryo quality was described using the terms good, fair, and poor; however, the participating clinics used their own criteria to classify embryos into these three grades. Embryo transfers were performed on day 3.
Prediction of implantation
Data could be extracted from five studies for prediction of implantation. There was significant statistical heterogeneity (I² = 79%). An SROC was plotted (Spearman correlation coefficient 0.9), and the AUROC was 0.66 (SE 0.05).
Prediction of pregnancy
Data could be extracted from three studies for prediction of clinical pregnancy. There was significant statistical heterogeneity among studies (I² = 90.3%). An SROC was plotted (Spearman correlation coefficient 0.8), and the AUROC was 0.68 (SE 0.08) [Figure 2].
|Figure 2: SROC curve for prediction of pregnancy by day 3 morphology scoring|
Prediction of live birth
Two studies reported on live birth. Significant statistical heterogeneity (I² = 92.7%) was detected. The LR+ was 1.29 (95% CI 0.87-1.92).
No studies that assessed the clinical effectiveness or cost-effectiveness of performing day 3 embryo morphology scoring were identified.
Value in clinical practice
Although there are no separate studies on clinical effectiveness, day 3 morphology assessment prior to embryo transfer is routine practice in all embryology laboratories. As a test it possesses limited accuracy (based on the low LR+) and discrimination (based on the low AUROC).
Day 5 morphology
Three studies were identified where the predictive value of blastocyst grading on implantation rates was assessed [Table 4]. Two studies were retrospective and one was prospective. All embryo transfers were performed on day 5 or day 6. SETs were performed exclusively in one study. All studies assessed the same parameters to estimate blastocyst quality: blastocyst expansion, inner cell mass appearance, and trophectoderm appearance.
Prediction of implantation
There was significant statistical heterogeneity among the studies (I² = 98.6%). An SROC was plotted (Spearman correlation coefficient 1), and the AUROC was 0.67 (SE 0.028) [Figure 3]. The pooled LR+ was 1.30 (95% CI 0.76-2.24).
Prediction of clinical pregnancy
Data for prediction of clinical pregnancy could only be extracted from one study. Data from this study showed a clinical pregnancy rate of 52.5% when a blastocyst or early blastocyst (considered as an embryo with implantation potential) was transferred.
Prediction of live birth
Only one study assessed live birth rates in association with blastocyst morphological grading. The authors found that the appearance of the trophectoderm correlates strongly with live birth rate.
No studies have separately assessed the clinical and cost-effectiveness of this test.
Value in clinical practice
Day 5 morphology assessment is performed routinely prior to embryo transfer at day 5. As a test it possesses limited accuracy (based on the low LR+) and discrimination (based on the low AUROC).
Cumulative embryo score
Six studies described the predictive value of performing cumulative embryo scoring (CES) [Table 5]. In only one study were the transfers SETs. Embryo transfers were performed on day 2, day 3, day 4, or day 5. A combination of zygote scoring, EC scoring, and day 2 and/or day 3 scoring was used. The exact methodology for calculating CES scores in each study is described in [Table 5]. Qian et al. (2008) compared two systems of cumulative scoring. A different scoring system was developed by each study, with different weighting given to various components. Hence, pooling of data was not deemed appropriate.
|Table 5: Studies assessing the predictive value of cumulative embryo scoring |
No studies have addressed the clinical effectiveness or cost-effectiveness of performing CES. It is, therefore, not possible to determine its value in routine clinical practice.
Embryo development rate
Two studies assessed the predictive value of embryo development rate assessment on implantation [Table 6]. In both studies, patients were treated in a natural cycle and day 3 SET was performed. However, they used entirely different criteria for identifying a good quality embryo. Hence, pooling of data was not deemed appropriate. Data for each individual study are provided in [Table 6].
|Table 6: Studies assessing the predictive value of assessing embryo development rate |
There are no relevant studies on clinical or cost-effectiveness. Currently, there is no clear evidence to justify the routine use of embryo development rate assessment in clinical practice.
Early cleavage
Shoukir et al. were the first to demonstrate that human embryos that had undergone their first cleavage cycle by 25 h post insemination achieved higher pregnancy rates during IVF. It is not clear why the time to first cell division varies among embryos; it could be related to the culture conditions as well as intrinsic factors of the oocyte and sperm, maturity issues, genetic competence, and metabolic activity. It has been suggested that metabolically fit embryos cleave earlier due to the availability of energy molecules, such as adenosine triphosphate (ATP), and their highly active mitochondria.
Using the search terms "early cleavage"; "IVF" or "ICSI" or "Assisted Conception"; and "embryo", 195 articles were identified, of which 65 abstracts were considered relevant. Full texts were obtained and, subsequently, 27 appropriate articles were identified. Twenty articles were included and seven articles were excluded. The full text of one article could not be accessed, and it was not included. Three more studies were found by cross-searching, and 22 articles were thus included in total [Table 7].
The characteristics of the included studies are detailed in [Table 8]. All were observational studies. Recruitment was consecutive in some studies. Blinding was not used in any of these studies. The presence of EC was defined as first cleavage by 25-27 h post insemination by most but not all. Some authors considered the presence of two cells, blastomeres, as EC, whereas others included any type of cleavage, such as the presence of one cell or the absence of two pronuclei. Some authors only evaluated the number of cells, while others also explored the symmetry of cell division [Table 7].
In eight studies, only EC embryos were transferred in the study group, while at least one EC embryo transfer was included in the study group in the remaining studies. The non-EC groups only included transfers of embryos with late cleavage. The time interval from IVF or intracytoplasmic sperm injection (ICSI) to assessment for EC was the same for all studies except for one.
Lundin et al.  included only one cycle per woman. Yang et al.  were the only ones that explored subgroups of agonist and antagonist treatment cycles. They found no difference in their antagonist treatment cycles. There was variation among the included studies in the stimulation regimens used, the starting dose of gonadotropins, and the media used for culture.
Van Montfoort et al. have three entries in the table, as they provided data separately for IVF and ICSI, and also for DET, where both embryos either had EC or no EC.
Prediction of implantation
In six studies, only SET was performed, and these were used to assess the test's predictive value for implantation. The Spearman correlation coefficient was considered satisfactory (0.89) for plotting an SROC curve. The AUROC was 0.63 (SE 0.02). There was significant statistical heterogeneity among the studies (I² = 88.5%).
Prediction of pregnancy
As all included studies made use of similar methodology to assess EC, they were appropriate for data pooling in order to determine the test's predictive value for pregnancy. The Spearman correlation coefficient was considered satisfactory (0.66) for plotting an SROC curve. The AUROC was 0.62 (SE 0.02) [Figure 4]. There was significant statistical heterogeneity among the studies (I² = 85.7%).
|Figure 4: SROC curve for prediction of pregnancy in the presence of early cleavage|
Subgroup analysis after excluding studies that transferred both EC and non-EC embryos did not alter the results.
Four studies determined the clinical effectiveness of performing EC assessment; their characteristics are summarised in [Table 9]. Two out of four studies secured prospective recruitment and random allocation for the two groups. In all four studies, baseline characteristics in both groups were similar. Pooling of data revealed no statistically significant heterogeneity. No statistically significant difference in the odds of achieving pregnancy was observed when comparing EC assessment with no EC assessment (OR 1.29, 95% CI 0.98-1.70) [Figure 5].
|Table 9: Studies evaluating clinical effectiveness of doing early cleavage |
No studies have been identified that have evaluated the cost-effectiveness of this test.
Value in clinical practice
As a test, EC assessment possesses limited accuracy (based on the low LR+) and discrimination (based on the low AUROC). In addition, the clinical effectiveness studies suggest that it is not an effective test. Based on the available evidence, assessment for EC is not recommended as a routine test in IVF practice.
According to the Alpha/European Society of Human Reproduction and Embryology (ESHRE) consensus, checking for EC should be performed 25-27 h post ICSI and 27-29 h post IVF. The included studies used a fixed time frame for assessing EC regardless of the fertilization technique used (IVF or ICSI). Moreover, in all studies except one, assessment was performed no later than 27 h, which is not appropriate after IVF treatment. There was also variation in the definition of EC within the included studies, ranging from 0 to 2 cells. The latest ESHRE consensus has agreed that the presence of 2 cells is required.
Discussion
Numerous tests for assessing embryo quality have been described in the literature. Our review has shown that none of the morphological assessments described has a high accuracy for identifying the embryos that have good implantation potential. At no point was morphology sufficiently discriminatory to exclude embryos from transfer or freezing. The predictive capacity for implantation and pregnancy is similar for day 3 and day 5 morphology assessments (AUROC of 0.66 and 0.67 for implantation, and 0.68 and 0.67 for pregnancy, respectively). There is currently no evidence of improvement in clinical pregnancy rates by routinely performing EC assessment during assisted conception treatment.
This is the first systematic review of morphological assessment of embryo quality for predicting outcomes during IVF. Two-step searches were performed to ensure that all tests described in the literature were included. We not only attempted to determine the predictive value of these tests but also explored clinical effectiveness and cost-effectiveness, as these aspects pertain to the application of any test in clinical practice.
This systematic review is based on observational data. Individual methodological differences, variation in design, inclusion or exclusion criteria as well as differences in the definition of the index tests and reference standards are inherent in systematic reviews of observational studies. In addition, there were a number of limitations.
Exclusion of studies
A number of studies were excluded as they had used development to blastocyst as the reference standard. Although it is assumed that the embryos that reach the blastocyst stage have proved their potential, it is an accepted fact that not all blastocysts implant. Moreover, a meta-analysis of RCTs showed that the cumulative pregnancy rate, after fresh and subsequent frozen embryo transfers, is higher if the transfer takes place on day 3, indicating that those who do not proceed to the blastocyst stage may indeed have embryos with implantation potential.  For this reason, we did not consider blastocyst development as an appropriate reference standard for this review.
An ideal study for a predictive test and its comparison to currently available studies
An ideal study testing a predictor of implantation/pregnancy should have a well-defined population, prospective and consecutive recruitment, blinding of those involved in assessing the test results and outcomes, adequate test description, predetermined normal and abnormal test values, and comparison with a gold standard such as live birth. An ideal study to determine the predictive power of any test of embryo quality in this case would have predetermined definitions of a good embryo and an inferior embryo. The ideal outcome should be live birth rate, but implantation and pregnancy rate would also be appropriate. Women should not have a combination of good and lower-quality embryos transferred at the same time. However, within this review, most of the available studies were retrospective, without consecutive recruitment. In a significant proportion of them, embryos of varying quality, as determined by the index test, were transferred together.
An ideal test and its comparison to currently available morphological assessment as test of embryo quality
An ideal test should be valid both internally and externally, reliable, replicable, discriminatory, cheap, easily available, simple to perform, and noninvasive. In addition, there should be a clear definition of what a normal or an abnormal test is. For any predictive test, it is important to consider what exactly is being predicted. In the present context, it would be implantation, pregnancy, or live birth. Assuming that a positive test result indicates a favorable prognosis, sensitivity reflects the ability of the test to identify all embryos that will result in implantation; specificity reflects its ability to exclude embryos that are not likely to implant; positive predictive value represents the probability of implantation when the index test is positive; and negative predictive value represents the probability of embryos not implanting if the index test is negative. The LR of a positive test quantifies how much more likely it is that a positive test will be found in an embryo that will implant than in an embryo that will not; the LR of a negative test indicates how much more likely it is that a negative test will be found in an embryo that will not implant than in an embryo that will. It is generally accepted that an LR+ of >10 represents a highly accurate test, an LR+ of 5-10 reflects a moderately accurate test, an LR+ of 2-5 indicates weak accuracy, an LR+ of 1-2 very weak accuracy, and an LR+ of 1 indicates no value in terms of predictive accuracy. The LRs of the reviewed tests were all low (range 0-2), indicating that these tests perform poorly in terms of prediction of implantation or pregnancy. The AUROC represents the ability of a test to discriminate between a positive and a negative outcome. By definition, an AUROC of 0.5 is consistent with a test that completely lacks discrimination: no better than tossing a coin.
None of the reviewed embryo scoring tests performed well in terms of discrimination, as shown by their respective AUROCs, all of which are less than 0.7 [Table 8].
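The LR+ bands quoted above can be written as a small lookup, shown here purely as an illustration. The function name `lr_positive_band` is hypothetical, and the handling of values that fall exactly on a boundary (and of values below 1) is an assumption, since the bands in the text share their end points.

```python
def lr_positive_band(lr_pos: float) -> str:
    """Map an LR+ value to the accuracy bands quoted in the text."""
    if lr_pos > 10:
        return "highly accurate"
    if lr_pos >= 5:
        return "moderately accurate"
    if lr_pos >= 2:
        return "weak accuracy"
    if lr_pos > 1:
        return "very weak accuracy"
    # LR+ of 1 (values below 1 are also placed here by assumption)
    return "no predictive value"


# Pooled LR+ values reported in the Results section of this review.
for label, value in [("day 2, implantation", 1.56),
                     ("day 3, live birth", 1.29),
                     ("day 5, implantation", 1.30)]:
    print(label, value, lr_positive_band(value))
```

All three pooled values fall in the "very weak accuracy" band, in line with the conclusion that these tests perform poorly as predictors.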
Cost-effectiveness of embryo assessment
Although the cost-effectiveness of performing any embryo assessment was not addressed by the studies mentioned above, there may well be implications in terms of staff time. For example, when assessing for EC, laboratory time schedules are likely to be affected if this stage of examination is introduced into everyday practice. Oocyte collections are usually planned during morning hours, with insemination being performed in the afternoon. According to the ESHRE consensus,  the ideal time to determine EC would then be during the evening hours of the day, which may have implications for staff time and subsequent costs incurred. Assessment for EC should lead to significant improvement in pregnancy rates in order to justify the extra effort.
When considering the benefits of a test that involves additional examinations alongside the standard visual assessments of the developing embryos, the temporary interruption inflicted on the culture environment, and the detriment this may cause to the treatment outcome, should be taken into account.
Implication for clinical practice
Based on current available evidence, there is no justification for using extra morphological assessments that involve taking embryos out of incubators and interfere with the embryo culture system. The need for a test of embryo quality has been recently questioned as, with improved freezing, one could perform a fresh transfer and freeze the rest for subsequent transfers.  With successful freezing techniques, the only significant drawback of such an approach would be the potential time delay to achieve pregnancy.
Implication for future research
Like other health interventions, a new diagnostic test should ideally pass through several stages of critical assessment. It should be biologically plausible. The test should also be clearly defined, including what constitutes a normal and an abnormal result. Appropriate reference standards should be used in the analysis of the test's sensitivity, specificity, and LRs. Both clinical effectiveness and cost-effectiveness should be demonstrated by high-quality prospective RCTs. Judged against these criteria, clinical trial evidence is lacking for many tests of embryo quality. One good example is time-lapse systems, which have already been advocated in clinical practice without their clinical or cost-effectiveness having been proven by appropriate studies. In a parallel example, preimplantation genetic screening had been shown to be of value by retrospective studies. However, when put to the test in RCTs, its usefulness was refuted; in fact, it was found to be detrimental. Therefore, further research in the form of appropriately designed RCTs is required before such novel modalities are introduced into routine clinical practice.
Conclusions
A large number of morphological assessments of embryo quality have been described in the literature, which in itself suggests that no ideal test exists. The accuracy of all these tests is low. Our review has also shown that none of these tests or combinations of tests has sufficient discriminatory power to exclude an embryo from embryo transfer. Newer techniques need to be further explored prior to their introduction into routine clinical practice.
Financial support and sponsorship
No external funding was sought for this systematic review.
Conflicts of interest
There are no conflicts of interest.
References
Maheshwari A, Griffiths S, Bhattacharya S. Global variations in the uptake of single embryo transfer. Hum Reprod Update 2011;17:107-20.
Dar S, Librach CL, Gunby J, Bisonnette F, Cowan L; IVF Directors Group of Canadian Fertility and Andrology Society. Increased risk of preterm birth in singleton pregnancies after blastocyst versus Day 3 embryo transfer: Canadian ART Register (CARTR) analysis. Hum Reprod 2013;28:924-8.
Glujovsky D, Blake D, Farquhar C, Bardach A. Cleavage stage versus blastocyst stage embryo transfer in assisted reproductive technology. Cochrane Database Syst Rev 2012;11:CD002118.
Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med 2009;6:e1000097.
Littenberg B, Moses LE. Estimating diagnostic accuracy from multiple conflicting reports: A new meta-analytic method. Med Decis Making 1993;13:313-21.
Zamora J, Abraira V, Muriel A, Khan K, Coomarasamy A. Meta-DiSc: A software for meta-analysis of test accuracy data. BMC Med Res Methodol 2006;6:31.
Nicoli A, Capodanno F, Rondini I, Valli B, Villani MT, Morini D, et al. Pronuclear morphology evaluation in in vitro fertilization (IVF)/intracytoplasmic sperm injection (ICSI) cycles: A retrospective clinical review. J Ovarian Res 2013;6:1.
Ludwig AK, Werner S, Diedrich K, Nitz B, Ludwig M. The value of pronuclear scoring for the success of IVF and ICSI-cycles. Arch Gynecol Obstet 2006;273:346-54.
Chen C, Kattera S. Comparison of pronuclear zygote morphology and early cleavage status of zygotes as additional criteria in the selection of day 3 embryos: A randomized study. Fertil Steril 2006;85:347-52.
Liu Q, Zhu G, Hu J, Wei Y, Ren X, Zhang H, et al. Relationship between pronuclear scoring and embryo quality and implantation potential in IVF-ET. J Huazhong Univ Sci Technolog Med Sci 2008;28:204-6.
Brezinova J, Oborna I, Svobodova M, Fingerova H. Evaluation of day one embryo quality and IVF outcome - A comparison of two scoring systems. Reprod Biol Endocrinol 2009;7:9.
Balaban B, Urman B, Isiklar A, Alatas C, Aksoy S, Mercan R, Mumcu A, Nuhoglu A. The effect of pronuclear morphology on embryo quality parameters and blastocyst transfer outcome. Hum Reprod 2001;16:2357-61.
Montag M, van der Ven H; German Pronuclear Morphology Study Group. Evaluation of pronuclear morphology as the only selection criterion for further embryo culture and transfer: Results of a prospective multicentre study. Hum Reprod 2001;16:2384-9.
Payne JF, Raburn DJ, Couchman GM, Price TM, Jamison MG, Walmer DK. Relationship between pre-embryo pronuclear morphology (zygote score) and standard day 2 or 3 embryo morphology with regard to assisted reproductive technique outcomes. Fertil Steril 2005;84:900-9.
Pelinck MJ, Hoek A, Simons AH, Heineman MJ, van Echten-Arends J, Arts EG. Embryo quality and impact of specific embryo characteristics on ongoing implantation in unselected embryos derived from modified natural cycle in vitro fertilization. Fertil Steril 2010;94:527-34.
Sjöblom P, Menezes J, Cummins L, Mathiyalagan B, Costello MF. Prediction of embryo developmental potential and pregnancy based on early stage morphological characteristics. Fertil Steril 2006;86:848-61.
Holte J, Berglund L, Milton K, Garello C, Gennarelli G, Revelli A, et al. Construction of an evidence-based integrated morphology cleavage embryo score for implantation potential of embryos scored and transferred on day 2 after oocyte retrieval. Hum Reprod 2007;22:548-57.
Lewin A, Schenker JG, Safran A, Zigelman N, Avrech O, Abramov Y, et al. Embryo growth rate in vitro as an indicator of embryo quality in IVF cycles. J Assist Reprod Genet 1994;11:500-3.
Jackson KV, Ginsburg ES, Hornstein MD, Rein MS, Clarke RN. Multinucleation in normally fertilized embryos is associated with an accelerated ovulation induction response and lower implantation and pregnancy rates in in vitro fertilization-embryo transfer cycles. Fertil Steril 1998;70:60-6.
Visser DS, Fourie FR. The applicability of the cumulative embryo score system for embryo selection and quality control in an in-vitro fertilization/embryo transfer programme. Hum Reprod 1993;8:1719-22.
Van Royen E, Mangelschots K, De Neubourg D, Laureys I, Ryckaert G, Gerris J. Calculating the implantation potential of day 3 embryos in women younger than 38 years of age: A new model. Hum Reprod 2001;16:326-32.
Vernon M, Stern JE, Ball GD, Wininger D, Mayer J, Racowsky C. Utility of the national embryo morphology data collection by the Society for Assisted Reproductive Technologies (SART): Correlation between day-3 morphology grade and live-birth outcome. Fertil Steril 2011;95:2761-3.
Qian YL, Ye YH, Xu CM, Jin F, Huang HF. Accuracy of a combined score of zygote and embryo morphology for selecting the best embryos for IVF. J Zhejiang Univ Sci B 2008;9:649-55.
Sajko MC, Vlaisavljevic V. Determination of criteria for the assessment of embryo quality with good implantation predictability on a model of unstimulated in vitro fertilization cycles - Prediction of implantation in unstimulated cycles. Zdravniski Vestnik 2011;80(Suppl):I-39.
Fisch JD, Sher G, Adamowicz M, Keskintepe L. The graduated embryo score predicts the outcome of assisted reproductive technologies better than a single day 3 evaluation and achieves results associated with blastocyst transfer from day 3 embryo transfer. Fertil Steril 2003;80:1352-8.
Check JH, Summers-Chase D, Yuan W, Horwath D, Wilson C. Effect of embryo quality on pregnancy outcome following single embryo transfer in women with a diminished egg reserve. Fertil Steril 2007;87:749-56.
Hill MJ, Richter KS, Heitmann RJ, Graham JR, Tucker MJ, DeCherney AH, et al. Trophectoderm grade predicts outcomes of single-blastocyst transfers. Fertil Steril 2013;99:1283-9.e1.
Rijnders PM, Jansen CA. The predictive value of day 3 embryo morphology regarding blastocyst formation, pregnancy and implantation rate after day 5 transfer following in-vitro fertilization or intracytoplasmic sperm injection. Hum Reprod 1998;13:2869-73.
Rehman KS, Bukulmez O, Langley M, Carr BR, Nackley AC, Doody KM, et al. Late stages of embryo progression are a much better predictor of clinical pregnancy than early cleavage in intracytoplasmic sperm injection and in vitro fertilization cycles with blastocyst-stage transfer. Fertil Steril 2007;87:1041-52.
Lan KC, Huang FJ, Lin YC, Kung FT, Hsieh CH, Huang HW, Tan PH, Chang SY. The predictive value of using a combined Z-score and day 3 embryo morphology score in the assessment of embryo survival on day 5. Hum Reprod 2003;18:1299-306.
Meseguer M, Herrero J, Tejera A, Hilligsøe KM, Ramsing NB, Remohí J. The use of morphokinetics as a predictor of embryo implantation. Hum Reprod 2011;26:2658-71.
Shoukir Y, Campana A, Farley T, Sakkas D. Early cleavage of in-vitro fertilized human embryos to the 2-cell stage: A novel indicator of embryo quality and viability. Hum Reprod 1997;12:1531-6.
Lundin K, Bergh C, Hardarson T. Early embryo cleavage is a strong indicator of embryo quality in human IVF. Hum Reprod 2001;16:2652-7.
Giorgetti C, Hans E, Terriou P, Salzmann J, Barry B, Chabert-Orsini V, et al. Early cleavage: An additional predictor of high implantation rate following elective single embryo transfer. Reprod Biomed Online 2007;14:85-91.
Sakkas D, Shoukir Y, Chardonnens D, Bianchi PG, Campana A. Early cleavage of human embryos to the two-cell stage after intracytoplasmic sperm injection as an indicator of embryo viability. Hum Reprod 1998;13:182-7.
Bos-Mikich A, Mattos AL, Ferrari AN. Early cleavage of human embryos: An effective method for predicting successful IVF/ICSI outcome. Hum Reprod 2001;16:2658-61.
Fenwick J, Platteau P, Murdoch AP, Herbert M. Time from insemination to first cleavage predicts developmental competence of human preimplantation embryos in vitro. Hum Reprod 2002;17:407-12.
Salumets A, Hydén-Granskog C, Mäkinen S, Suikkari AM, Tiitinen A, Tuuri T. Early cleavage predicts the viability of human embryos in elective single embryo transfer procedures. Hum Reprod 2003;18:821-5.
Van Montfoort AP, Dumoulin JC, Kester AD, Evers JL. Early cleavage is a valuable addition to existing embryo selection parameters: A study using single embryo transfers. Hum Reprod 2004;19:2103-8.
Emiliani S, Fasano G, Vandamme B, Vannin AS, Verdoodt M, Biramane J, et al. Impact of the assessment of early cleavage in a single embryo transfer policy. Reprod Biomed Online 2006;13:255-60.
Yang WJ, Hwu YM, Lee RK, Li SH, Fleming S. Early-cleavage is a reliable predictor for embryo implantation in the GnRH agonist protocols but not in the GnRH antagonist protocols. Reprod Biol Endocrinol 2009;7:20.
Tsai YC, Chung MT, Sung YH, Tsai TF, Tsai YT, Lin LY. Clinical value of early cleavage embryo. Int J Gynecol Obstet 2002;76:293-7.
Lee MJ, Lee RK, Lin MH, Hwu YM. Cleavage speed and implantation potential of early-cleavage embryos in IVF or ICSI cycles. J Assist Reprod Genet 2012;29:745-50.
Hammoud I, Vialard F, Casasnovas P, Lefebvre G, Vauthier-Brouzes D, Poirot C. How viable are zygotes in which the PN are still intact at 25 hours? Impact on the choice of embryo for transfer. Fertil Steril 2008;90:551-6.
Sakkas D, Percival G, D'Arcy Y, Sharif K, Afnan M. Assessment of early cleaving in vitro fertilized human embryos at the 2-cell stage before transfer improves embryo selection. Fertil Steril 2001;76:1150-6.
Hesters L, Prisant N, Fanchin R, Méndez Lozano DH, Feyereisen E, Frydman R, et al. Impact of early cleaved zygote morphology on embryo development and in vitro fertilization-embryo transfer outcome: A prospective study. Fertil Steril 2008;89:1677-84.
Fu J, Wang XJ, Wang YW, Sun J, Gemzell-Danielsson K, Sun XX. The influence of early cleavage on embryo developmental potential and IVF/ICSI outcome. J Assist Reprod Genet 2009;26:437-41.
Ciray HN, Karagenç L, Ulug U, Bener F, Bahçeci M. Early cleavage morphology affects the quality and implantation potential of day 3 embryos. Fertil Steril 2006;85:358-65.
Fancsovits P, Toth L, Takacs ZF, Murber A, Papp Z, Urbancsek J. Early pronuclear breakdown is a good indicator of embryo quality and viability. Fertil Steril 2005;84:881-7.
Ciray HN, Ulug U, Bahçeci M. Transfer of early-cleaved embryos increases implantation rate in patients undergoing ovarian stimulation and ICSI-embryo transfer. Reprod Biomed Online 2004;8:219-23.
Isiklar A, Mercan R, Balaban B, Alatas C, Aksoy S, Urman B. Early cleavage of human embryos to the two-cell stage. A simple, effective indicator of implantation and pregnancy in intracytoplasmic sperm injection. J Reprod Med 2002;47:540-4.
Alpha Scientists in Reproductive Medicine and ESHRE Special Interest Group of Embryology. The Istanbul consensus workshop on embryo assessment: Proceedings of an expert meeting. Hum Reprod 2011;26:1270-83.
Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. JAMA 1994;271:703-7.
Mastenbroek S, van der Veen F, Aflatoonian A, Shapiro B, Bossuyt P, Repping S. Embryo selection in IVF. Hum Reprod 2011;26:964-6.
Mastenbroek S, Twisk M, van der Veen F, Repping S. Preimplantation genetic screening: A systematic review and meta-analysis of RCTs. Hum Reprod Update 2011;17:454-66.