
CEP template 2012

Science of delivery
Commonwealth Governance Handbook 2014/15, p. 51

The same ratings are also declining in the education sector and have been since financial year 2001. Beyond the decline in monitoring and evaluation quality, there were many other factors judged to be more important: about half (seven of the 15) concerned design and about half concerned implementation or supervision. The design issues included: over-ambition in relation to the strength of political commitment; over-ambition in relation to the time period; inadequate readiness for implementation; weaknesses in technical design, including prior analytical work; and over-ambition or excessive complexity with regard to a country’s institutional or implementing capacity.

Analysis

Using the recently constructed Independent Evaluation Group (IEG) database of World Bank projects and their components, an initial analysis of the projects in the database was undertaken to see whether high monitoring and evaluation ratings correlate with high outcome ratings in World Bank education projects, as validated in the implementation completion report reviews.[2]

The results were compared using a two-by-two table in which projects with IEG outcome ratings of moderately satisfactory[3] and above formed one group, and those rated moderately unsatisfactory and below formed the other. These groups were compared against monitoring and evaluation quality ratings (quality ratings), split into the categories of modest[4] and above, and negligible. A total of 63 projects had both an IEG outcome rating and a monitoring and evaluation rating.

As seen in Figure 1, the majority of the projects had quality ratings of modest and above and outcome ratings of moderately satisfactory or above, while fewer than half of the projects with a quality rating of modest and above had outcome ratings of moderately unsatisfactory and below.
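The two-by-two grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the group labels ("MS+", "MU-", "modest+") and the sample records are all hypothetical, and the rating labels assume the six-point outcome scale and the four-point monitoring and evaluation scale used by IEG.

```python
from collections import Counter

# Rating scales, best first (assumed to match the IEG scales in the endnotes).
OUTCOME_SCALE = [
    "highly satisfactory", "satisfactory", "moderately satisfactory",
    "moderately unsatisfactory", "unsatisfactory", "highly unsatisfactory",
]
ME_SCALE = ["high", "substantial", "modest", "negligible"]


def outcome_group(rating):
    """Return 'MS+' for moderately satisfactory and above, otherwise 'MU-'."""
    cutoff = OUTCOME_SCALE.index("moderately satisfactory")
    # list.index raises ValueError for a label that is not on the scale.
    return "MS+" if OUTCOME_SCALE.index(rating) <= cutoff else "MU-"


def me_group(rating):
    """Return 'modest+' for modest and above, otherwise 'negligible'."""
    if rating not in ME_SCALE:
        raise ValueError(f"unknown monitoring and evaluation rating: {rating!r}")
    return "negligible" if rating == "negligible" else "modest+"


def two_by_two(projects):
    """Tally projects, given as (M&E rating, outcome rating) pairs, into a 2x2 table."""
    counts = Counter((me_group(me), outcome_group(out)) for me, out in projects)
    return {
        me: {out: counts[(me, out)] for out in ("MS+", "MU-")}
        for me in ("modest+", "negligible")
    }


# Hypothetical records for illustration only; these are not IEG data.
sample = [
    ("substantial", "satisfactory"),
    ("modest", "moderately satisfactory"),
    ("modest", "unsatisfactory"),
    ("negligible", "moderately unsatisfactory"),
]
table = two_by_two(sample)
```

Collapsing each scale into two groups before counting keeps the table small enough to read at a glance, at the cost of discarding the distinction between, for example, modest and substantial ratings.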
A breakdown of the 41 projects with moderately satisfactory and satisfactory outcome ratings, and modest and substantial quality ratings, shows that only ten projects had an outcome rating of satisfactory and a monitoring and evaluation rating of substantial, as shown in Figure 2. Although the highest possible monitoring and evaluation rating is ‘high’, none of the 63 projects analysed received it.

To test the statistical significance of the association between the quality rating and the IEG outcome rating, a chi-squared test was run; the results were statistically significant at the 95 per cent confidence level (see Figure 3). A chi-squared test is a commonly used statistical test of whether two categorical variables are associated. It asks whether the observed counts deviate from the counts expected if the two variables were independent by more than chance alone would explain. The expected counts are derived from the row and column totals of the table, and the chi-squared distribution is used to judge how likely a deviation of the observed size would be under independence.

In addition, a breakdown of the monitoring and evaluation ratings by region for this data set showed East Asia and the Pacific, and Europe and Central Asia, to have the highest ratings (see Figure 4). Of the 41 projects with moderately satisfactory and above outcome ratings, and modest and above quality ratings, 95 per cent of the associated component records had monitoring and evaluation as a component or mentioned monitoring and evaluation explicitly in their component descriptions.

Conclusions

The results that emerged from using a recently constructed database of World Bank projects and their components, and linking it to IEG ratings, show that better performing projects do have higher monitoring and evaluation ratings, and that the difference is statistically significant. However, even the better performing projects mainly have only a modest rating for monitoring and evaluation, suggesting room for improvement.
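For readers who want to check a significance claim of this kind, the following is a minimal sketch of a Pearson chi-squared test on a two-by-two table, written in plain Python. It is not the authors' code. The function name and the counts in the example call are hypothetical, and the sketch assumes no continuity correction, since the text does not say whether one was applied. With one degree of freedom, a statistic above roughly 3.84 corresponds to significance at the 95 per cent level.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns the test statistic and its p-value on one degree of freedom.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")

    observed = ((a, b), (c, d))
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected

    # A chi-squared variable with one degree of freedom is a squared standard
    # normal, so its upper tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts for illustration only; these are not the IEG figures.
statistic, p_value = chi_squared_2x2(20, 10, 5, 15)
significant_at_95 = p_value < 0.05
```

The chi-squared approximation becomes unreliable when expected cell counts are small (a common rule of thumb is below five), in which case an exact test may be a better choice.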
Initial case study analysis found specific design features that are common to the high performing projects with effective monitoring and evaluation. There are also added benefits when projects are designed to encourage experimentation and to take into account distinctive features of the local context. This analysis needs to be extended to the rest of the sample to see whether these features are common and, if so, the extent to which they help to explain the improved performance.

Endnotes

1 Impact evaluations point to three approaches that are most effective in improving learning outcomes. First, information reforms, such as providing school and student test scores, help to make better comparisons among schools, improve student performance and reduce school fees (Andrabi et al., 2014). Second, school-based management reforms, such as tracking students into separate classes based on prior achievement, increasing school autonomy and empowering parents, can be effective, although there are only a small number of rigorous studies and the metrics vary between studies. Third, teacher incentive reforms can have an impact, making teachers more accountable for results by linking tenure and/or pay to performance. Again, the evidence base on this is small, but promising (Bruns et al., 2011; Duflo et al., 2008).

2 The analysis was conducted using all the education projects with monitoring and evaluation quality ratings and outcome ratings that we found in both the IEG components database (privately available) and IEG’s publicly available database of World Bank project ratings. Only projects with both a monitoring and evaluation quality rating and an IEG outcome rating, and that appeared in both databases, were used for the analysis. A total of 63 projects fell into this category. Only this population of projects was used in the analysis to ensure that other researchers could replicate this exercise if desired.
The completion dates for the implementation completion report reviews all fall between 2007 and 2012.

3 Outcome ratings are on a six-point scale: highly satisfactory, satisfactory, moderately satisfactory, moderately unsatisfactory, unsatisfactory and highly unsatisfactory.

4 Monitoring and evaluation ratings are on a four-point scale: high, substantial, modest and negligible.

References

Andrabi, T., Das, J. and Khwaja, A. I., 2014. Report Cards: The Impact of Providing School and Child Test Scores on Educational Markets. [pdf] Harvard Kennedy School, Harvard University. Available at: www.hks.harvard.edu/fs/akhwaja/papers/RC_June2014.pdf [Accessed 1 December 2014].

Argyris, C. and Schön, D., 1974. Theory in Practice: Increasing Professional Effectiveness. San Francisco: Jossey-Bass. pp. 2–3.

Banerjee, A. and Duflo, E., 2011. Poor Economics: A Radical Rethinking of the Way to Fight Global Poverty. New York: Public Affairs.

