NG Guat Tin, PhD
Department of Applied Social Sciences
The Hong Kong Polytechnic University
An emphasis on programme evaluation among human service organizations occurs within the context of “the era of accountability,” in which the public and private sectors spearhead the drive towards accounting for the use of public resources and funding for results (Kettner, Moroney, & Martin, 2013, p. 4). In America, this dates back to the 1960s and 1970s, when evaluation served to justify new public policies or programmes, or the continuation of existing ones, whether they faced opposition to implementation or doubts about their effectiveness in addressing social problems such as poverty (Datta, 2011). Though evaluation is rooted in social science research methods, it was given an additional boost by new public management (also referred to as managerialism). Royse, Thyer, Padgett, and Logan (2006) referred to programme evaluation as “applied research used as part of the managerial process” (p. 11). Managerialism, adopted by public sector organizations in many Western nations since the late 1970s, heralded management concepts and tools designed for “auditing, control, regulation, assessment, inspection and evaluation” (Diefenbach, 2009, p. 899).
Beginning in the 1990s, however, there has been a movement away from measuring only the inputs (e.g. number of staff and volunteers) and outputs (e.g. frequency of activities and number of clients) of human services towards measuring client or participant outcomes, that is, changes in the knowledge, attitudes, skills, or conditions of clients and programme participants. More recent models shift the line of accountability even further, to measuring impact, that is, changes to the wider community (see the logic model in Kettner et al., 2013). Social workers and other professionals working in human service organizations are thus subject to the imperative to evaluate the programmes they have designed, implemented, and managed and, in more recent years, to track key performance indicators (KPIs) as well, through what is known as performance measurement. Performance measurement, however, usually includes indicators of outputs, quality, and financial aspects, in addition to outcome indicators.
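To make the logic-model levels concrete, consider a purely hypothetical adult literacy programme; the short Python sketch below simply lists illustrative indicators at each level (the indicator names are assumptions for illustration, not drawn from Kettner et al.):

```python
# Hypothetical logic-model indicators for an adult literacy programme,
# illustrating the input -> output -> outcome -> impact progression.
logic_model = {
    "inputs":   ["number of staff", "number of volunteers", "funding received"],
    "outputs":  ["classes held per month", "number of learners enrolled"],
    "outcomes": ["gain in learners' reading scores",        # change in skills
                 "learners reporting greater confidence"],  # change in attitudes
    "impact":   ["higher adult literacy rate in the wider community"],
}

for level, indicators in logic_model.items():
    print(f"{level}: {'; '.join(indicators)}")
```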
This paper focuses on outcome evaluation and outcome measurement among nonprofit agencies, which are touted as providing many benefits: (i) funders are able to know whether funds have been used effectively to assist programme participants and to decide whether to continue funding; (ii) agency management is able to track whether desirable targets have been attained and to account to funders and donors for how different programmes are performing; (iii) programme staff may use outcome data to modify programme implementation and improve services to their clients; (iv) programme participants can provide feedback as to whether they are being helped; and (v) donors will have information with which to decide whether to financially support a particular programme or agency.
This paper seeks to consider whether these benefits are realized, within the overarching goals of establishing accountability and programme effectiveness: a reality check on the hype generated by proponents of outcome evaluation and outcome measurement. The United Way of America (UWA) was instrumental in spearheading the development of outcome measurement among local united ways and nonprofit agencies, producing a manual, “Measuring program outcomes: A practical approach,” together with related training resources, in 1996 (widely disseminated but not updated since). There is less literature on whether outcome evaluation is really beneficial than on why and how to do outcome evaluation. Part of the reason for the initial emphasis on the why and how is that many agency executives and programme staff do not have the expertise to conduct outcome evaluation or to develop outcome measurement. Perhaps a more critical reason is that many executives and programme staff already carry heavy workloads in management and frontline work; finding slack time to collect, analyze, and utilize client data is not easy, and there is often no extra funding or human resources for evaluation work. It is not surprising that there is resistance to taking time and effort away from what these agencies and professional staff prefer to do: provide direct services to those seeking help. A study by the UWA (2000) showed that 46% of those who responded (N=298 agencies) agreed that implementing outcome measurement had “diverted resources from existing activities” (p. 6). Unfortunately, the study was not replicated in subsequent years, so it is not possible to ascertain changes in attitudes (if any) since then.
Nonprofit agencies would perhaps be more inclined to initiate and engage in outcome measurement or outcome evaluation if the benefits were substantive, or at least greater than the costs involved. Just as we ask agency executives and programme staff to demonstrate that their programmes are producing results, we should also ask whether outcome measurement and outcome evaluation have proven to be beneficial. Adopting a stakeholders’ perspective, the rest of this paper reviews and discusses the evidence (or lack thereof) that programme outcome measurement produces benefits for the various stakeholders: funders, agency managers, programme staff, programme participants, and donors.
Funders. Government agencies (e.g. through the Government Performance and Results Act of 1993 in America), funding bodies (e.g. united ways/community chests), and national organizations have been urging human service agencies to engage in evaluation, either in the form of programme evaluation or programme outcome measurement. The former attempts to assess the cause-and-effect relationships of programmes and is conducted less frequently, but in greater depth, whereas the latter tracks changes in clients on a regular, monitoring basis, without establishing what caused the changes (Hatry, 1997). Nonprofit agencies that depend on government contracts and donations from philanthropic foundations are finding that funders now require them to include programme monitoring or indicators of programme effectiveness in their funding proposals (Carman & Fredericks, 2008). How funding bodies use the evaluation data to guide funding decisions is not entirely clear, though one study suggests that the process can be elaborate and tedious.
Green, Ellis, and Lee (2005) documented an evaluation project in the city of Oakland, California, to develop a performance ranking tool (including items such as service costs, client satisfaction data, and client outcomes) to assist a planning and oversight committee in making decisions on renewed funding. They reported that of the 53 agencies being funded, only two-thirds were selected for continued funding the following year. They found, however, that the funding decisions were only moderately correlated with the agencies’ performance scores, implying that performance scores do not provide a strong indication of whether funding will be continued. This raises several questions: What other criteria do funders use to make funding decisions? What happens to the clients of programmes when funding is discontinued? What service responsibilities do funders, especially government agencies, have towards existing and potential clients in terms of alternative services, if they were to cut funding? Funders should keep their focus on service recipients and not just the funded agencies.
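Although the exact formula of the Oakland tool is not reproduced here, ranking tools of this kind typically combine normalized component scores into a weighted composite. The sketch below is a minimal illustration under that assumption; the component names, weights, and 0-1 scales are hypothetical, not the Oakland project’s actual tool:

```python
# Hypothetical composite performance score of the kind a funder's ranking
# tool might compute; weights and 0-1 scales are illustrative assumptions.
WEIGHTS = {
    "cost_per_client": 0.3,      # normalized cost, 0-1 (inverted below)
    "client_satisfaction": 0.3,  # mean satisfaction rating, scaled to 0-1
    "client_outcomes": 0.4,      # share of clients meeting outcome targets
}

def composite_score(agency: dict) -> float:
    """Weighted sum of normalized components, each on a 0-1 scale."""
    normalized = dict(agency)
    # Invert cost so that cheaper service contributes a higher score.
    normalized["cost_per_client"] = 1.0 - agency["cost_per_client"]
    return sum(WEIGHTS[k] * normalized[k] for k in WEIGHTS)

# Example: mid-range cost, high satisfaction, moderate outcomes.
agency = {"cost_per_client": 0.5, "client_satisfaction": 0.9, "client_outcomes": 0.6}
print(round(composite_score(agency), 2))  # 0.66
```

Even a transparent score of this kind leaves open the committee’s weighting choices, which is one reason scores alone cannot settle funding decisions.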
Though the study cited above and other evaluation studies (see Chelimsky, 1997) show that the threat of funding cuts is real, Hendricks and colleagues (2008), drawing on their extensive experience with programme evaluation and united ways, observed that termination of programme funding by united ways (in America) happens infrequently, because most do not use outcome results as the only factor in decision making. United ways have come to realize the many challenges in initiating and implementing outcome measurement: difficulties in imparting evaluation skills to both agency and united way staff; the requirement for ongoing technical assistance to funded agencies; effective use of the outcome data; and identification of relevant outcome indicators (Hendricks et al., 2008). Furthermore, funding criteria should include whether funded agencies are learning from and making improvements to their programmes, not only the attainment of performance targets (United Way of America, 2003). Since funders are major stakeholders in outcome measurement, they have several responsibilities in this era of better management of resources: (i) to reconsider what types of data and reports they require grantees to collect and submit; (ii) to provide training in database management, not just evaluation concepts; (iii) to provide training and technical assistance as ongoing support, rather than as a one-off activity, since evaluation skills take an extensive period to acquire; and (iv) to allocate specific funding to agencies for evaluation work (Carman & Fredericks, 2008; Hendricks et al., 2008; United Way of America, 2003). More important, funders themselves need training in how to use outcome data, as well as a system in place to receive those data (United Way of America, 2003).
Agency boards. Given resource constraints, agency boards have traditionally been more concerned with financial viability than with programme outcomes. But with the increasing expectation to operate programmes that produce favourable outcomes at lower costs, agency boards have to pay attention to evaluation data as well. Besides outcome data, there are various other types of data that may be more useful in monitoring and assessing programme effectiveness, such as community coverage, equity, process, effort (or outputs), cost-efficiency, and cost-effectiveness (see Kettner et al., 2013, for a review of data coverage). However, one downside of implementing comprehensive data collection is the additional resources it requires.
In this regard, client satisfaction surveys may be quite useful, as they are easy to administer and provide quick feedback. In a survey of nonprofit agencies (N=189) in the state of Indiana (USA), 67% reported the regular collection of client/participant satisfaction data (Carman & Fredericks, 2008). But relying on this source of data alone is inadequate, as satisfaction ratings do not indicate whether clients have improved. Furthermore, clients invariably report high levels of satisfaction with services provided, possibly motivated by gratitude for services received, fear of repercussions, or other reasons (Royse et al., 2006). As such, satisfaction surveys that ask only for quantitative ratings are of limited use for programme improvement, unless they are supplemented with open-ended questions asking for qualitative feedback (see Royse et al., 2006, for suggestions on how to improve the use of satisfaction surveys).
Programme staff. One articulated benefit of programme evaluation is that developing programme logic seems to be helpful in thinking through what the programme aims to achieve and how to go about achieving those aims (Hendricks et al., 2008). The UWA (2000) survey showed that 88%, 86%, and 76% of the respondents, respectively, agreed that implementing outcome measurement helped them to (i) focus staff on shared goals; (ii) clarify the purpose of the programme; and (iii) improve delivery of services. In a separate survey of nonprofit agencies (referred to earlier), the most frequent uses of evaluation data were to make changes in existing programmes (93%), submit reports to the board (82%), establish programmes, make decisions about staffing (68%), and report to funders (67%) (Carman & Fredericks, 2008). The researchers categorized the attitudes of nonprofit agencies toward evaluation into three groups: (i) evaluation seen as a resource drain and distraction; (ii) evaluation viewed as an external, promotional tool; and (iii) evaluation considered a strategic management tool. There seem to be agency and staff characteristics associated with these different views; for example, those who consider evaluation a drain and a distraction tend to be organizations with inadequate organizational capacity that require more specific training and a low-cost, user-friendly way of collecting evaluation data. Those with a more positive view may be in a better position to put in place a systematic and purposeful database management system that can generate meaningful reports for planning and regular programme reviews (Carman & Fredericks, 2008).
Programme staff, including social workers, should not simply follow the directions of funders in implementing outcome measurement but should argue for general and specific evaluation measures that are appropriate and meaningful for service improvement. This is not an easy task, because many aspects of social work and social services are difficult to measure, and it is tempting to opt for what is easy to count or assess. Nonetheless, social workers should be mindful of their professional mandate, beyond funding requirements, to engage in fruitful evaluation work. Historically, social workers have been called upon to evaluate their practice. As far back as 1931, Dr Richard C. Cabot (a medical doctor, educator, and pioneer of medical social services) appealed to social workers to “measure, evaluate, estimate, appraise” their results so as to provide better services to their clients (cited in Royse et al., 2006, p. 3). Hence, social workers should be more reflective about what to measure, how to measure, and when to measure, so that data collection and analysis become part of the workflow and of professional activity. Record keeping has traditionally been part of social work professional activity, particularly in casework, and with the availability of technology, social workers should be able to maintain this good practice.
Programme participants. Making outcome data available to potential programme users presumably enables them to make more informed decisions about whether to participate in a programme. The extent to which outcome data are disseminated to the general public, and hence to potential and current programme participants, is not well documented. One small study (N=36 organizations) in America reported that few organizations provide outcome information to clients, volunteers, or the public (Morley, Vinson, & Hatry, 2001). Even if it is known that a particular programme or agency provides poor service, clients may not have many alternatives. For example, local residents seeking services from Family Service Centres in Singapore are limited to the centre serving the geographical area in which they live. If there were more choices, clients could make better use of such information.
Donors. The UWA study (2000), referred to above, also reported that 88% of respondents agreed that outcome measurement had helped them to communicate programme results to board members, donors, and others. It should not be assumed, however, that all donors require hard data with which to decide whether to give or how much to give. For some, the cause itself is enough to warrant giving, for example, providing shelter and meals for homeless people. For others, personal testimonies from clients and their families about how a particular programme has helped them may provide a “human element” to outcome data. Quantitative data, however, may be useful for hard-nosed donors who require a report card for decision making. It may be more helpful to ask donors what type of data would be of decision-making value, so as not to waste human, technical, and financial resources that could be put to better use.
The discussion thus far suggests that the extent to which the benefits of outcome evaluation and outcome measurement are realized depends on the category of stakeholder; the benefits are most apparent for programme staff and, to some extent, funders and agency boards. Whatever the benefits, it is prudent to count the costs of evaluation training, collecting and disseminating evaluation data, and managing a database, to justify the work involved, which invariably consumes more resources and requires more expertise than funders may have foreseen.
Looking back at programme evaluation over the past decades, it is important to recognize that there are wide variations in the forms, design, complexity, and intensity of evaluation. It is therefore not surprising that funding agencies have varying requirements for the evaluation to be undertaken. Among the 1,300 local united ways in America, only an estimated 450 require agencies to measure programme outcomes; some started implementing outcome measurement and gave up, whilst others persevered (Hendricks et al., 2008). The concepts and practices of programme evaluation in general, and outcome evaluation in particular, have been diffused beyond America and adopted, in varying degrees, by nonprofit agencies in other countries. It is noteworthy that the client outcome management framework (and the enhanced programme evaluation system) of the National Council of Social Service in Singapore has been influenced by the UWA, requiring Community Chest-funded agencies to abide by the standard of outcome management.
Whatever evaluation framework national organizations or funding bodies choose, it is important for them to be realistic: establish what is within the scope and expertise of nonprofit agencies, recognizing that they range in staff size from fewer than ten to a few thousand and have different organizational capacities. They should not expect direct service nonprofit agencies to function as research institutions, producing evidence that their programmes work and distilling programmatic influence on the lives of their participants. Even if funders do not require agencies to use rigorous research designs (such as experimental and control groups, longitudinal data, etc.) to conclude that it was indeed the programme, and not something else, that produced the results, there are still difficulties in selecting reliable and valid outcome indicators and in tracking client changes over time (especially long-term results, where non-programmatic influences such as the health of the economy come into greater play). As such, the UWA’s approach is not to promote “outcome evaluation” but “outcome measurement,” with no attempt to determine causality or to explicate different levels of outcome achievement (Hendricks et al., 2008). The outcome indicators show whether clients’ condition, behavior, knowledge, or skills have improved, worsened, or remained the same; the changes may or may not be the result of participation in the programme.
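To make this distinction concrete, a minimal sketch of outcome measurement in this spirit might simply classify each client’s change on an indicator between intake and follow-up, with no claim about what caused the change. The Python below is purely illustrative; the indicator, scores, and tolerance are assumptions, not part of the UWA framework:

```python
# Minimal, hypothetical sketch of outcome measurement (not outcome
# evaluation): label each client's change on one indicator between
# intake and follow-up, making no causal claim about the programme.
def classify_change(intake: float, follow_up: float, tolerance: float = 0.0) -> str:
    """Label a client's change on a single outcome indicator."""
    if follow_up > intake + tolerance:
        return "improved"
    if follow_up < intake - tolerance:
        return "worsened"
    return "unchanged"

# Hypothetical literacy scores (0-100) at intake and follow-up.
clients = [(42, 55), (60, 58), (70, 70), (30, 45)]

counts = {"improved": 0, "worsened": 0, "unchanged": 0}
for intake, follow_up in clients:
    counts[classify_change(intake, follow_up)] += 1

print(counts)  # {'improved': 2, 'worsened': 1, 'unchanged': 1}
```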
In the current context of performance management, outcome measurement can easily become a numbers game, in which players tweak the system to look good publicly. Programmes are designed and implemented to meet certain objectives, mostly about helping participants improve in some areas of their lives, for example, increased literacy, healthier family relationships, or better child care. Outcome measurement is only one of the tools enabling programme managers and social workers to keep track of whether programme objectives have been achieved and, if not, what aspects of the programme need modification.
Dr Ng is a lecturer at the Hong Kong Polytechnic University. She was previously a lecturer at the National University of Singapore, with research interests in spousal violence, family caregiving (with a special focus on spousal carers), care of older persons, particularly those living alone, and cross-national comparison of government policies on aged care. Since moving to Hong Kong, she has developed new interests in social work education, the development of NGOs in China, and the professionalization of social work in China.
Carman, J. G., & Fredericks, K. A. (2008). Nonprofits and evaluation: Empirical evidence from the field. New Directions for Evaluation, 119, 51-71.
Chelimsky, E. (1997). The coming transformations in evaluation. In E. Chelimsky & W. R. Shadish (Eds.), Evaluation for the 21st century: A handbook (pp. 1-26). Thousand Oaks, CA: Sage Publications.
Datta, L.-e. (2011). Politics and evaluation: More than methodology. American Journal of Evaluation, 32(2), 273-294.
Diefenbach, T. (2009). New public management in public sector organizations: The dark side of managerialistic ‘enlightenment.’ Public Administration, 87(4), 892-909.
Green, R. S., Ellis, P. M., & Lee, S. S. (2005). A city initiative to improve the quality of life for urban youth: how evaluation contributed to effective social programming. Evaluation and Program Planning, 28, 83-94.
Hatry, H. P. (1997). Outcomes measurement and social services: Public and private sector perspectives. In E. J. Mullen & J. L. Magnabosco (Eds.), Outcomes measurement in the human services (pp. 3-19). Washington, DC: NASW Press.
Hendricks, M., Plantz, M. C., & Pritchard, K. J. (2008). Measuring outcomes of United Way-funded programs: Expectations and reality. New Directions for Evaluation, 119, 13-35.
Kettner, P. M., Moroney, R. M., & Martin, L. L. (2013). Designing and managing programs: An effectiveness-based approach. Los Angeles: Sage Publications.
Morley, E., Vinson, E., & Hatry, H. P. (2001). Outcome measurement in nonprofit organizations: Current practices and recommendations. Washington, DC: Independent Sector.
Posavac, E. J., & Carey, R. G. (2007). Program evaluation: Methods and case studies (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Royse, D., Thyer, B. A., Padgett, D. K., & Logan, T. K. (2006). Program evaluation: An introduction (4th ed.). Belmont, CA: Brooks/Cole.
Schalock, R. L., & Bonham, G. S. (2003). Measuring outcomes and managing for results. Evaluation and Program Planning, 26, 229-235.
United Way of America. (2000). Agency experiences with outcome measurement: Survey findings. Alexandria, VA: Author.
United Way of America. (2003). Indicators that a united way is prepared to plan for, implement, sustain, use and benefit from program outcome measurement. Alexandria, VA: Author.