Rationale and Criteria for the IBM Reuse Metrics

Jeffrey S. Poulin
Reuse Technology Support Center
International Business Machines Corporation

Abstract

This paper describes the issues addressed by the reuse metrics and return on investment model in place at IBM [29], [29], [30], [31]. The metrics are used in a return on investment analysis to develop a business case that estimates the financial benefit of the organizational reuse program. Recognizing the potential benefits of reuse has proven instrumental in inserting reuse into the IBM programming process.

The key to reuse metrics is the accurate reflection of effort saved. This paper defines the issues a reuse metrics model must address to achieve this goal. Foremost, it is important to distinguish the savings and benefits from those already gained through accepted software engineering techniques. The same metrics must also encourage reuse throughout the corporate software development process. This dual purpose reveals numerous issues that must be addressed before acceptance of the metrics by the programming community. Options and methods to resolve each of these issues are presented.

KEYWORDS: Reuse Metrics, Measuring Reuse, Software Reuse, Reuse Business Case, ROI, Software Measurement, Reuse Issues.

1.0 Overview

Software metrics are an important ingredient in effective software management. Unfortunately, the lack of an industry standard for reuse metrics is one of the major inhibitors to a coordinated reuse program [8]. Without a means to quantify the practice, development organizations are unable to judge their return on investment and are therefore reluctant to engage in an active reuse program. However, if metrics are used in a return on investment model to verify and demonstrate the substantial benefits of reuse, organizations may be more likely to realize such potential.

The traditional role of metrics is to assist management by quantifying the process of developing software, thereby giving management a means to assess the level of reuse within a site or a project. With an emerging technology, however, metrics must extend beyond their traditional role. Reuse metrics must also encourage the practice of reuse. Most organizations do not practice formal reuse or are reluctant to invest in a formal reuse program. Reuse metrics must assist in the technology insertion process by providing favorable process improvement statistics and by placing emphasis on activity conducive to reuse.
Reuse metrics are a special class of software metrics and are unique for several reasons. One is that, unlike the productivity metrics that are ingrained in most development organizations, reuse metrics seek to reward what is not done. Rewards are traditionally given to those who deliver the most function, which means productivity metrics encourage developers to write code. Reuse metrics seek the opposite; they seek to reward developers for not writing code.

Section one of this paper discusses the reason for metrics and general metric requirements. The remainder of the paper is in two parts. The first part describes what to measure as reuse. In effect, this defines, from the business case perspective, what should count as a Reused Source Instruction (RSI). The second part discusses how to measure reuse. This part describes the issues that arise when putting reuse metrics into practice.

2.0 Reuse Goals and Metric Criteria

2.1 Organizational goals

The first step in defining metrics is to establish organizational goals [6]. Perhaps the most controversial question is "What should be measured as reuse?" For example, reuse may be described as the building of applications from building blocks of reusable parts, porting, using operating systems, using high-level languages, or using spreadsheets, databases, and similar products. Although these tools and techniques may greatly increase the productivity of the average programmer, their use is not always considered "reuse." Ultimately, what to measure and report as reuse must be defined by the goals of the organization.

Because the IBM metrics are input to a Return on Investment (ROI) model, it is important that the metrics reflect effort saved, both by quantifying the level of reuse in an organization and by determining the investment value of reuse. Specifically, the IBM reuse metrics have the following goals:

1. Reuse metrics must reflect effort saved.
2. Reuse metrics must encourage reuse.

Along with the above two goals, the following three general definitions of reuse will be used throughout the paper. These definitions are commonly used to describe software reuse; each is necessary but not always sufficient. They will be useful in helping to resolve the numerous issues associated with applying metrics in an organization.

1. Reuse is the use of an existing component in a new context.
2. Reuse is the raising of the level of abstraction in programming.
3. Reuse is not writing what you would have written.

The real difficulty in reuse metrics is to be the judge of the third definition. Often, the question hinges on whether the programmer is using or reusing software. Furthermore, the situation is dynamic; as a technique becomes ingrained in a culture, it transitions in status from "novel" to "expected." What is reused today may be simply "used" three years from now.

The IBM reuse metrics provide input to business cases justifying investments in reuse. It is important to distinguish between the usual software development process and how the process changes with reuse. Therefore, resolving most of the issues in reuse metrics is a matter of focusing on the effort saved, or the value of reuse.

2.2 Reuse Metrics Criteria

Reuse metrics must establish an effective standard that may be implemented by development organizations. The data must be easily obtained, meaningful, and possible to implement in a uniform way.
The goals of the organization help define the questions to be asked and lead to the formulation of metrics. At IBM the goals are to evaluate the value of reuse and to encourage reuse where it does not exist. However, the ultimate ability to report any metric depends on numerous factors influencing the collectability of the required data. The realities of the software development process, organizational needs, and available data affect what can be reported; these factors actually shape the metrics as much as the goals and questions. Any useful metric must be based on common sense, providing as much useful information with as little cost as possible. Specifically [33], [11]:

1. The metrics should be compatible with the existing software development process.
2. The data needed to quantify the metrics should be easy to collect and normalize.
3. The metrics should be easy to understand, analyze, and interpret.
4. The cost of data collection, analysis, and reporting should be kept to a minimum.
5. Collecting the metric data should not adversely impact the process or products being measured.
6. The metrics should be objective and not subject to bias or distortion.
7. The metrics should be independent of implementation-specific details.
8. The metrics should help generate estimates of software cost, productivity, and quality.
9. The metrics should be informative; they should measure what you seek to measure and not be unduly influenced by other factors.

The observable data and metrics used by IBM adhere to these criteria; they are integrated into the development process and are based on data that have been collected by the corporation for many years.

3.0 Issues in Reuse Metrics

3.1 What to measure as Reuse

This section lists many of the issues surrounding the definition of reuse. The order of the issues is related to the difficulty of the issue, with the less esoteric issues presented first. Options for solving each issue are followed by the method and rationale used by IBM.

3.1.1 Use of base product (maintenance)

Most software spends 80% of its life in maintenance [15]. Maintenance both adds new function to a product and repairs errors from previous releases of the product. It has been argued that every new release of a product is new software which reuses most of the software created in the previous releases, and that software maintenance is therefore a form of reuse [4].

When reporting programming metrics on product releases, IBM excludes software from the product base.(1) The same is true for reuse metrics. This prevents reporting extremely high reuse levels whenever a new release consists of minor changes to a large product.

IBM reports programming metrics for new releases based on the new, changed, and deleted portion of the product. These are the Changed Source Instructions, or CSI. Reusable components that are completely new to the product contribute to the reuse level (RSI) for the release. A call, invocation, or include of a component previously appearing in the base software is considered CSI, but the source for the component does not appear in the metrics for the release. (The source appeared in the metrics for the release in which it was added.) This separates maintenance activities from what should be reported as reuse.

3.1.2 Use of Operating System Services

The use of operating system services is a form of reuse because the operating system raises the level of abstraction required to use the machine.
Programs that execute in an operating system environment (e.g., OS/2, MVS, VM) reuse the services provided by that environment [26]. However, the context in which the operating system functions does not change, and we clearly do not expect to have to write a new operating system for every new application. In short, use of an operating system is not generally considered reuse because we expect use of the operating system [5].

The approach taken by IBM is that, for application programming, operating system services are not considered part of the product and therefore do not affect programming metrics, including those for reuse.

---------------
(1) The product base consists of all code previously released in the product.

3.1.3 Use of tools

Text editors needed to input software, debuggers, compilers, and library systems for source code control are all necessary parts of software development. However, they are not "parts" of the product, nor are they normally delivered to customers with the product. Tools are used for their intended purpose, which is not a change in context. Furthermore, although use of a tool represents a savings in effort, we do not normally expect programmers to develop tools for every new application. The use of tools is not reuse.

We should distinguish between use of a tool and reuse of the source code of the tool. For example, parts of a debugging tool may be reused during the development of a test coverage tool. In general, however, tools are used to increase programmer productivity.

3.1.4 Use versus Reuse of components

Differentiating between "use" and "reuse" of a tool is important but perhaps less controversial than when the same rationale is applied to components. For example, suppose a programmer retrieves a subroutine from a reuse library and uses it without modification. This is reuse since it clearly meets the reuse criteria; e.g., it saved the programmer from having to write the subroutine. However, if the programmer repeatedly calls the subroutine, each call to the subroutine should not be reported as "reuse."

This is critical for accurate estimates of the benefits of reuse and return on investment analysis of projects. We expect organizations to develop subprograms for often-needed services and to use those subprograms; the part would have only been written once. Repeated calls to routines previously developed for a product or previously developed by the organization should not increase the reported level of reuse. Likewise, if the development language is object-oriented, class definitions may be reused, but instantiations of classes are "used."

In one actual example, a project reported 11 kloc of reuse on a relatively small application. Closer inspection revealed that 5120 lines of the 11 kloc were attributed to one 10-line reusable macro and that all 5120 lines were attributed to the same module. A code review revealed that the original code:

    Do i := 1 to 512
       MACRO(i);

consisted of 2 instructions (the DO..WHILE and the call to MACRO) and 10 reused instructions. However, the loop was automatically unrolled during optimization to yield:

    MACRO(1);
    MACRO(2);
    ....
    MACRO(511);
    MACRO(512);

and was reported as 512 source instructions and 5120 reused instructions. The latter is clearly neither an accurate reflection of productivity nor of reuse. For this reason, IBM considers that a part may be "used" by an organization numerous times, but a part can be "reused" by an organization only once.
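To make this counting rule concrete, the following minimal sketch (Python, with hypothetical department and part names) tallies RSI so that each part is credited at most once per organization, no matter how many times it is called or how a loop is expanded; the 10-line macro above would contribute 10 RSI, not 5120.

    # Sketch: credit each reused part once per organization (hypothetical data).
    from typing import Dict, List, Tuple

    # Each record is (organization, part_name, part_size_in_loc, times_called).
    uses: List[Tuple[str, str, int, int]] = [
        ("Dept-A", "MACRO",      10, 512),  # called 512 times, unrolled by the optimizer
        ("Dept-A", "stack_proc", 40,   3),
        ("Dept-B", "MACRO",      10,   1),  # a different organization: credited again
    ]

    def reused_source_instructions(records) -> Dict[str, int]:
        """Return RSI per organization, counting each part at most once."""
        counted = set()                      # (organization, part) pairs already credited
        rsi: Dict[str, int] = {}
        for org, part, size, _calls in records:
            if (org, part) not in counted:   # repeated calls do not add RSI
                counted.add((org, part))
                rsi[org] = rsi.get(org, 0) + size
        return rsi

    print(reused_source_instructions(uses))  # {'Dept-A': 50, 'Dept-B': 10}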
3.1.5 Use of pre-requisite products

Many utilities are available to programmers in addition to those provided by the operating system. Examples are database systems and graphics packages. Whereas reuse in the small consists of assembling components into applications, reuse in the large may consist of making use of services provided by spreadsheets and other tools [14].

The operative term is "tools." Although these tools raise the level of abstraction in programming, and thereby increase productivity and save having to write code to deliver equivalent function, their use is not "reuse" [5]. Not only is use of the tools expected, but claiming reuse of the database system source whenever use is made of the database is not an accurate reflection of reuse activity.

3.1.6 Application Generators

Some program tools are able to create large quantities of code from specifications given in tables, pictures, templates, or through fourth generation languages. This is sometimes called generative, or top-down, reuse, as opposed to compositional reuse, the bottom-up assembling of components [10]. Some consider the use of automatic program generators to be a form of reuse [24].

Clearly, if a programmer is able to graphically specify a user interface and have the corresponding code to create the interface automatically generated for him, this is an enormous advantage. Furthermore, this is code that the programmer would have written. However, although these tools are great productivity aids, their output cannot be compared with output produced using traditional methods.

One example reported by a government agency involved 104,000 lines of code (104 kloc) developed by three programmers in less than 90 days. Approximately 99 kloc was automatically generated user interface and similar code. It is unrealistic to compare the effort of this group with that of someone who is unable to use such tools. Likewise, the reported 95% level of reuse on the application is not a true reflection of reuse. To obviate the problems of such comparisons, IBM reports automatically generated code separately from traditionally developed code.

3.1.7 Use of high level languages

High level languages (HLLs) raise the level of abstraction needed to accomplish a task in much the same way as do abstract data types and class libraries. The advent of high level languages resulted in a fivefold increase in programming productivity [9]; this higher level of abstraction and reduced complexity of programs may be considered a form of reuse [24]. However, at IBM the use of HLLs is normal business practice and is treated similarly to other tools used during application development. Therefore, no reuse credit is given for using HLLs.

3.1.8 Code libraries

There are several sources of code for use in building applications. These sources may be arranged in a hierarchy based on how much their use is ingrained in a programming culture or programming language. Ultimately, the decision to classify a given component as reused or not depends on local considerations and the goals of the organization.

Use of utility libraries

The foundation of the programming library hierarchy consists of utilities that are essential or required to program any useful function in a given language. The stdio library in "C" and the math subroutine libraries of FORTRAN are examples. At IBM, use of these libraries is expected and considered a part of the language rather than a separate source of reusable code.
Use of standard libraries

Second on the hierarchy are the local libraries that every programmer must use and be familiar with to be a contributing member of a development group. An example is the memory allocation and deallocation functions provided to operating systems programmers. These libraries are the local equivalent of standard language libraries and are, in general, not considered to be reuse.

Domain-specific libraries

Third on the hierarchy are the domain-specific libraries. Organizations that have extensive experience in an application area or have completed a successful domain analysis [32] may develop specialized collections of standard routines that provide the reusable foundation for future applications. Example domain libraries are the interest rate functions provided to programmers of financial applications and the flight control functions provided to programmers of aerospace applications. Domain-specific libraries are closely related to local, standard libraries in that their use is expected within a group or domain.

Domain-specific libraries are often the result of a focus on reuse and can result in significant savings. Therefore, IBM encourages and recognizes the development and use of domain libraries as reuse. The tremendously successful levels of reuse at IBM in the avionics domain are examples of this practice [25].

Project libraries

Fourth, good software design results in a well-structured program, solid class definitions, or superior use of language features such as Ada generics or object inheritance. Some "reuse" will result because it is designed into the program or because language features are conducive to calling something reusable. This activity must be encouraged, while still recognizing that metrics must accurately reflect effort saved. Subroutines and macros exist because many functions are repetitive; objects and Abstract Data Types (ADTs) exist in languages because they are powerful productivity tools.

Claiming reuse credit for good design may be misplaced, but it still rewards good programming. Crediting reuse of a stack procedure in PL/I but not crediting use of the stack features built into the REXX command language may be inconsistent, but it, too, rewards good programming. Ultimately, the organization must decide what is good design and what is reuse. At IBM, resolving the issue requires deciding if the new context is outside the scope of normal good design and software engineering. If it passes this test, then use of the parts saved effort and becomes RSI.

Corporate reuse libraries

Finally, reusable assets may reside in a shared, corporate database of fully tested, documented, and maintained components. This is the ultimate in clearly defined reuse. The expected programming practice is simply to refer to the corporate reuse database at every software lifecycle stage for reusable information. Measuring RSI is simple; the part must come from the corporate database. However, although this is a long term goal of any reuse program, few are able to impose such a strict definition of reuse on their organizations.

3.1.9 White box versus Black box reuse

White box reuse is the copying and modifying of existing software to fit a new application; black box reuse is the use of unmodified software components. White box reuse is often referred to as code recovery or salvaging. It is a great aid, primarily during the development phase of a project.
Black box reuse is also called planned reuse; it consists of assembling or tailoring products from building blocks of reusable software. However, it is important to differentiate between white box reuse, which results in new software to maintain, and black box reuse, which does not.

The most fundamental division in how organizations practice reuse is between unplanned, white box reuse and planned, black box reuse. What distinguishes these two classes of reuse is when the reuse decision is made [34]. Reuse must be a planned part of the software lifecycle [19]. The goal of this planning is to identify the factors that normally change in later projects, such as:

1. Hardware or System Software,
2. User, Mission, or Installation, and,
3. Function or Performance.

Early design and analysis will result in components that can accommodate these changes without modification. There is an increased level of cost and quality in planned reuse that is paid for once, in the initial development of the component. However, this cost is quickly recovered by subsequent projects that are able to reuse the generalized software and by reduced support costs resulting from having only one base product to maintain.

Failing to plan for reuse, however, is the norm in traditional software development. In general, a software product is an original work except for the informal consideration of existing software for use in the new application. Although this informal use of previously developed software in new applications is widely practiced, the systematic reuse of existing code is not part of traditional software development methods.

At IBM it is important to distinguish between these two classes because the reported RSI determine the financial return of the reuse program. Over several development cycles, black box reuse provides the greatest cost and productivity benefits because there is only one resulting base product to maintain. Using these criteria, it is clear metrics should focus on black box reuse. It is also clear that although white box reuse is beneficial, the benefits are somewhat limited to the development phase and are much less than with reusing black boxes. Therefore, white box reuse must be treated separately from black box reuse.(2) Table 1 is a summary of the measurement status for white box and black box reuse.

+----------------------------------------------+
| Table 1. Which Reuse Techniques are Measured |
+-----------------------------+----------------+
| TYPE OF REUSE               | MEASURED?      |
+-----------------------------+----------------+
| White Box                   | No             |
+-----------------------------+----------------+
| Black Box                   | Yes            |
+-----------------------------+----------------+

---------------
(2) Some organizations choose to track the amount of white box reuse in their products to emphasize the amount of "total leverage" gained by copying and modifying old software, but code recovery is not included in the IBM reuse metrics.

3.1.10 Porting versus Reuse

One form of planned "reuse" is porting. However, porting is an unusual case for reuse metrics because it is already a standard part of the business planning of products. The resource estimates for a product are normally based on development of the product on one hardware platform or operating system. A relatively nominal amount of resources is then allocated for changes required to adapt to other environments.
Although porting is a form of reuse, reuse metrics should not claim these savings because porting is an established part of the business and development planning process. A further reason to exclude porting is that porting normally involves adapting a minor portion of a large product or simply recompiling an existing application to run on a new platform.

Including ported code in reuse metrics would cause misleading results in the form of unrealistically high measures of reuse activity. For example, an organization making small changes to a large base might report levels of "reuse" close to 100% if most of its product is simply recompiled,(3) whereas an organization performing an equal amount of labor on an original project might do very well to demonstrate reuse levels of 5-10%. To prevent this distortion, IBM does not include code porting in these reuse metrics. Organizations tasked with porting software separately track the amount of porting for which they are responsible.

---------------
(3) Note that since estimates show that only about 15% of applications consists of truly unique software, the theoretical maximum is approximately 85% [22].

3.1.11 Organizational boundaries

A theme has developed throughout the discussions accompanying the previous issues. This theme pervades the definition of reuse and the goals of the organization; e.g., to encourage reuse where it does not naturally exist. Since we expect programmers to use good software engineering principles in their programs, and to naturally share software within their development group or department, we consider this normal business practice.

One of our definitions of reuse is the use of a previously existing component in a new context. Normally, only the group which designed and maintains a component is aware of its utility and is therefore likely to use it in a new context or program. It is clear that the context of a reusable component is closely aligned with the group that develops and maintains the component. Next, we define reuse based on "who" uses the component.

Central to improving the practice of reuse is the understanding that good design and management are common within development organizations but less common between organizations. Communication, which is necessary for the simple exchange of information and critical to sharing software, becomes more difficult as the number of people involved grows and natural organizational boundaries emerge. Therefore, measurements must encourage reuse across these organizational boundaries.

A software component is reused when it is used by an organization that did not develop and does not maintain the component [3]. Software development organizations vary, but for measuring reuse a typical organization is a programming team, department, or functional group of about eight people or more. Also, although organizational size is a good indicator of how well communication within and between organizations takes place, functional boundaries are equally important. For example, a small programming team may qualify as an organization if it works independently.

For consistency, the type and size of the reporting organization is considered part of the metrics. This provides an informal check on the flexibility allowed in selecting the most appropriate boundary for the organization. Selection of an inappropriately small boundary would distort the value of the metrics upward, and an inappropriately large boundary would result in low reuse values.
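A minimal sketch of this boundary rule follows (Python, with hypothetical organization and part names); a use adds to RSI only when the using organization neither developed nor maintains the part.

    # Sketch of the organizational-boundary rule from [3]: a part counts as
    # reused only when the using organization did not develop it and does not
    # maintain it. Organization and part names are hypothetical.
    from dataclasses import dataclass

    @dataclass
    class Part:
        name: str
        size_loc: int
        developed_by: str
        maintained_by: str

    def counts_as_reuse(part: Part, using_org: str) -> bool:
        """True when the use crosses the organizational boundary."""
        return using_org not in (part.developed_by, part.maintained_by)

    interest_calc = Part("interest_calc", 250,
                         developed_by="Finance-Dept", maintained_by="Finance-Dept")

    print(counts_as_reuse(interest_calc, "Finance-Dept"))    # False: use within the owning group
    print(counts_as_reuse(interest_calc, "Insurance-Dept"))  # True: crosses the boundary, adds 250 RSI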
Changing the organizational boundary between reports would eliminate any possibility for comparisons and evaluation of the reuse program. Therefore, the organization is clearly indicated as part of the reuse measurement and is not changed between reporting periods.

3.1.12 Differences in organizations

Although organizations have some flexibility, subject to the above guidance, on how to define the boundary that a part must cross before it may be considered "reused," social pressures and concerns persist. Interestingly, these concerns are not confined to how organizations with one boundary definition compare with organizations that may choose a different boundary. In large organizations such as IBM there exists a diversity of expertise with regard to any technology, including reuse. Surprisingly, the concerns focus on the goal of encouraging reuse in organizations with varying levels of reuse process maturity [20], [23]. Furthermore, the concerns are more prevalent among organizations that are more advanced with regard to reuse.

The issue arises in one of two circumstances. Either the organization attempts to encourage formal reuse by adopting a very strict definition of reuse, or the programming culture has so fully accepted a model of reuse that a need exists to raise the definition to another level of abstraction. Several examples of the former exist in IBM. For example, if Site A is very mature with regard to reuse, it may take a very strict view of what should be RSI. Site A may view reuse as only the use of fully tested and supported components retrieved from the corporate reuse library. Site A may consider software reused within a product to be expected and consider it part of good design and normal development. However, an immature Site B may do well to achieve even modest levels of sharing within products; it may view reuse more liberally.

At IBM, this issue is somewhat resolved by the ability to define the boundary a component must cross to be considered reused. However, it is not explicitly addressed because the corporate goal is to ingrain reuse technology in all software development organizations. The focus is less on achieving absolute levels of reuse than on constant improvement. Therefore, organizations are encouraged to set reuse goals relative to their current level.
+---------------------------------------------+
| Table 2. Summary: What to measure as Reuse  |
+----------------------------+----------------+
| TYPE OF REUSE              | MEASURED?      |
+----------------------------+----------------+
| Maintenance                | No             |
+----------------------------+----------------+
| Operating System           | No             |
+----------------------------+----------------+
| Tools                      | No             |
+----------------------------+----------------+
| Prerequisite products      | No             |
+----------------------------+----------------+
| Application generators     | Separately     |
+----------------------------+----------------+
| High level languages       | No             |
+----------------------------+----------------+
| Utility libraries          | No             |
+----------------------------+----------------+
| Standard libraries         | Maybe          |
+----------------------------+----------------+
| Domain-specific libraries  | Yes            |
+----------------------------+----------------+
| Project libraries          | Yes            |
+----------------------------+----------------+
| Multiple Uses              | One time       |
+----------------------------+----------------+
| White box                  | No             |
+----------------------------+----------------+
| Black box                  | Yes            |
+----------------------------+----------------+
| Porting                    | Separately     |
+----------------------------+----------------+

3.2 How to measure Reuse

The following set of issues addresses the implementation of metrics. Many issues require implementation-specific solutions. As with the previous issues, the IBM approach is presented following a discussion of possible solutions.

3.2.1 Units of measurement

A good debate may easily be had over the most appropriate gauge of programmer productivity. Numerous units exist: Lines of Code (LOC), Function Points [2], [13], [17], semantic tokens, equivalent lines of assembler, number of methods, number of features [27], number of classes, etc. However, LOC are the standard unit in industry for a number of reasons [7]. Although lines of code have well known deficiencies as a unit of measure [18], they are simple to understand, easy to collect and compare, and difficult to distort. Furthermore, LOC are a good indicator of productivity in code and a good secondary indicator of work done in other phases [35]. Studies also show that methods such as function points and LOC are highly correlated [1].

The IBM metrics use LOC to quantify effort in software development. Since this unit of measurement is ingrained in the IBM culture, it was easy to adopt for reuse measurements. The only exception is for assessing the level of reuse in object-oriented projects, where the portion of reuse is based on the portion of reused classes. This exception is based on experiences in the internal OO community which indicate that this is an appropriate measurement, and on practical limitations as to the availability of data. For example, although the portion of methods reused would be a more accurate indicator, that information is not always available [12].

There are further actions that may be taken to increase confidence when using lines of code as a measure. One action is to use a standard code counting tool. Another action, taken with the metrics at IBM, is to use metrics derived from ratios or percentages of effort, and thereby eliminate the units of "LOC" from the metrics.
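As a small illustration of the ratio form, and assuming for the moment that the level of reuse is simply the reused portion of the total delivered units (the exact IBM definitions appear with Table 4 below), the same dimensionless percentage can be computed whether the underlying unit is LOC or function points:

    # Sketch: a unit-free reuse level, assuming level = reused units / total
    # delivered units. The same function works for LOC or for function points,
    # which is the point of reporting ratios rather than raw "LOC" values.
    def reuse_level(reused_units: float, total_units: float) -> float:
        """Return the reuse level as a percentage of total delivered units."""
        if total_units <= 0:
            raise ValueError("total_units must be positive")
        return 100.0 * reused_units / total_units

    print(reuse_level(reused_units=12_000, total_units=40_000))  # 30.0 (LOC)
    print(reuse_level(reused_units=75,     total_units=250))     # 30.0 (function points)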
The rationale is that if the units are a good indicator of overall effort [35], then the portion of units reused is a good indicator of effort saved.

3.2.2 Function written versus function used

It is absolutely essential when acquiring the observable data elements, especially RSI, to recognize when reuse actually saves effort. This requires the analyst to distinguish reuse from normal software engineering practices (e.g., structured programming) and to eliminate implementation-dependent options affecting the observable data (e.g., static versus dynamic subprogram expansion). For example, the programmer's decision to implement a system service as a subroutine, which is expanded at runtime, or as a macro, which is expanded at compile time, should not affect the reuse metric.

One of the criteria for good metrics is that they are implementation independent. The choice of using a subroutine versus a macro is a design decision usually resulting from many considerations well outside the realm of reuse. The decision to use macros should not be made because multiple in-line expansions increase the amount of reuse reported on a project.

The issue of separating the portion of a delivered product that is actually written from the portion created as a consequence of a tool or programming practice also arises when comparing productivity levels of various organizations. Where applications may be developed using automatic program generators, macro generators, Fourth Generation Languages, or similar techniques, the reported "productivity" of an organization may simply be incomparable with that of organizations unable to use such techniques. The effort of each type of group must be reported without losing sight of the tremendous leverage these techniques provide.

To meet this need, IBM developed the concept of a Used Instruction (UI) [16]. The total UI is the total function delivered to the customer, reported as if the entire software product had been written as in-line code (no macros, inheritance, 4GLs, etc.). The difference between the "function written" and the "function used" is an indicator of how well the program was designed and structured. The ratio of "function written" to RSI, or "function not written," is an indicator of the level of reuse. This simple concept preserves the accuracy and goals of the IBM metrics with regard to encouraging both productivity and reuse.

3.2.3 LOC that would not have been written

Reusable parts often contain additional code that exists to make the component more general and more reusable. This code would not have been written by the reuser had the reuser instead simply developed a custom part for the application. Should the RSI for an instance of reuse be based on what would have been written or on what was reused? If only a portion of the methods in a generic package or class library are used, should the RSI be based on the used LOC or on the total LOC in the package or class? The issue is further complicated in the case of variable macros, where the value of a parameter determines which of numerous macro options to instantiate. Each option may consist of only a small portion of the total code in the generalized macro.

Resolving the issue became a matter of returning to the goals and definitions of reuse. The prime motivator is to encourage reuse; including a small amount of general code in the RSI is a worthy investment.
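The sketch below (Python, with hypothetical part names and sizes) illustrates this policy: the RSI credited for an instance of reuse is the total size of the reused part, not just the portion a particular use happens to instantiate.

    # Sketch of the "credit the whole part" policy: RSI for a reuse instance is
    # the total size of the generalized part, even when only some of its
    # options or methods are instantiated. Names and sizes are hypothetical.
    generic_report_writer = {
        "total_loc": 400,                                    # full size of the part
        "options_loc": {"html": 120, "text": 60, "ps": 160}, # selectable options
    }

    def rsi_credit(part, options_used):
        """Return the RSI credited for one instance of reuse of this part."""
        instantiated = sum(part["options_loc"][o] for o in options_used)
        print(f"instantiated {instantiated} LOC, credited {part['total_loc']} RSI")
        return part["total_loc"]

    rsi_credit(generic_report_writer, ["text"])  # credits 400 RSI, not 60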
Although this may cause misleading reuse results in the case of variable macros, practical limits on our ability to gather the data and the additional cost of identifying these occurrences did not justify separate analysis.

3.2.4 Definition of "ship"

Metrics report the level of reuse on products that are shipped to customers. At IBM, "shipping" a product is a very specific term related to development, sales, and marketing. However, this term pertains to external customers only. There are numerous organizations that develop products exclusively for internal use. These organizations should also report the amount of reuse in their software, even though it is not technically "shipped." The reuse metrics at IBM apply equally to products intended for external and internal use.

3.2.5 When to report metrics

For completely accurate reporting, metrics should be effective as of the "ship date" for the product. This is the time the product is declared ready for general customer availability. This ensures the values for the metrics will not change with subsequent modifications to the product.

There are two disadvantages to this view. One is that it prevents the metrics from being used for in-process reviews and improvement. The second is that some products take years of planning and development before any deliverable is made available to customers. To provide feedback to the developers and motivate the reuse of components, it is desirable to report reuse levels throughout the development cycle. At IBM, in-process metrics are encouraged but must be identified as "in-process" so reviewers understand the number is not stable.

3.2.6 Reporting by quality of reuse

As previously discussed, IBM chose not to report the use of copied and modified components (white box) in the reuse metrics. However, IBM defines three levels of quality associated with reusable parts [21]. Parts may be used "as-is" or may be fully tested and rated as "certified" parts. The difference between quality levels is determined by the quality of the associated reuse information, documentation, testing suites, integration instructions, the level of testing achieved, and whether the component is supported by an external organization. Use of certified parts is encouraged because the long term benefits exceed those of reusing as-is parts, but there are fewer certified parts than as-is parts.

To encourage reuse of certified parts and to identify as-is parts that may be good candidates for certification, IBM asks organizations to optionally report the quality levels of reused components. Many other companies and government agencies rate the quality of software and may consider it advantageous to report reuse by quality level.

3.2.7 Availability of data

Two of the criteria for reuse metrics are that the data needed to quantify the metrics be easy to collect and normalize, and that the cost of data collection and reporting be kept to a minimum. In organizations where software process statistics are common, data may be easy to obtain. Tools are in place to measure the software development process, and databases are used for configuration management and version control of products. Where these do not exist, considerations must be made either to make the required investment to obtain the data or to use an alternative strategy to obtain equivalent results. An example of the latter strategy existed at several sites in IBM, where process data is routinely collected and reported by function point and not LOC.
In this case, it was simple to adjust the metric so that the level of reuse is reported as the portion of function points reused. This allowed the sites to adopt the metrics with minimal impact to their development process while still providing a reasonable representation of effort saved.

The IBM reuse metrics [28] are calculated from the following observable data elements, which have been in use within IBM for many years [16]. Observable data may usually be directly measured from the product. For example, the different classes of source instructions are directly measurable. Observable data may also be historical data, collected for a variety of reasons related to managing the software development process. Costs for software development and statistical error rates are examples of historical data. Detailed descriptions of each of the required observable data elements follow the summary in Table 3.

+-------------------------------------------------------------------------------------+
| Table 3. Observable data                                                             |
+------------------------------+----------------+----------------+--------------------+
| DATA ELEMENT                 | SYMBOL         | UNIT OF MEASURE| SOURCE             |
+------------------------------+----------------+----------------+--------------------+
| Shipped Source Instructions  | SSI            | LOC            | Direct Measurement |
+------------------------------+----------------+----------------+--------------------+
| Changed Source Instructions  | CSI            | LOC            | Direct Measurement |
+------------------------------+----------------+----------------+--------------------+
| Reused Source Instructions   | RSI            | LOC            | Direct Measurement |
+------------------------------+----------------+----------------+--------------------+
| Source Instructions Reused   | SIRBO          | LOC            | Direct Measurement |
| by Others                    |                |                |                    |
+------------------------------+----------------+----------------+--------------------+
| Software Development Cost    | Cost per LOC   | $/LOC          | Historical data    |
+------------------------------+----------------+----------------+--------------------+
| Software Development Error   | Error rate     | Errors/LOC     | Historical data    |
| Rate                         |                |                |                    |
+------------------------------+----------------+----------------+--------------------+
| Software Error Repair Cost   | Cost per Error | $/Error        | Historical data    |
+------------------------------+----------------+----------------+--------------------+

Shipped Source Instructions (SSI). The total lines of code in the product source files.

New and Changed Source Instructions (CSI). The total lines of code new or changed in a new release of a product.

Reused Source Instructions (RSI). The total lines not written but included in the source files. RSI includes only completely unmodified reused software components.

Source Instructions Reused By Others (SIRBO). The total lines of code that other products reuse from a product.

Software Development Cost. A historical average required for estimating reuse cost avoidance.

Software Development Error Rate. A historical average required for estimating maintenance cost avoidance.

Software Error Repair Cost. A historical average required for estimating maintenance cost avoidance.

3.2.8 Combining the data

The available data must be presented in a form that not only achieves the goals of the metrics, namely to reflect the level of reuse and to encourage reuse, but also meets the criteria for metrics. Specifically, the metrics must be easy to normalize, report, and interpret, and they must be objective. Organizations must examine their existing software process metrics and determine what their goals are relative to reuse.
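To show how these observable elements come together, the following sketch computes rough, illustrative versions of the three derived metrics defined in the next subsection and summarized in Table 4. The formulas, the example values, and the 0.8 relative-savings default are simplified assumptions based only on the "derived from" lists in Table 4; they are not the published IBM definitions, which appear in [29], [30], [31].

    # Illustrative sketch only: simplified, assumed formulas for the three
    # derived metrics, driven by the observable data elements of Table 3.
    from dataclasses import dataclass

    @dataclass
    class ObservableData:          # the elements of Table 3, for one product
        ssi: int                   # Shipped Source Instructions (LOC)
        csi: int                   # New and Changed Source Instructions (LOC)
        rsi: int                   # Reused Source Instructions (LOC)
        sirbo: int                 # Source Instructions Reused By Others (LOC)
        cost_per_loc: float        # $/LOC, historical
        errors_per_loc: float      # errors/LOC, historical
        cost_per_error: float      # $/error, historical

    def reuse_percent(d: ObservableData) -> float:
        # Assumed form: the reused portion of the delivered product.
        return 100.0 * d.rsi / (d.rsi + d.ssi)

    def reuse_cost_avoidance(d: ObservableData, rel_saving: float = 0.8) -> float:
        # Assumed form: development cost avoided (rel_saving is an assumed
        # default for the fraction of new-code cost avoided by reusing) plus
        # maintenance cost avoided through fewer injected errors.
        development = d.rsi * rel_saving * d.cost_per_loc
        maintenance = d.rsi * d.errors_per_loc * d.cost_per_error
        return development + maintenance

    def reuse_value_added(d: ObservableData) -> float:
        # Assumed form: credit both the reuse practiced (RSI) and the reuse
        # contributed to others (SIRBO), relative to the code shipped.
        return (d.ssi + d.rsi + d.sirbo) / d.ssi

    product = ObservableData(ssi=100_000, csi=30_000, rsi=20_000, sirbo=50_000,
                             cost_per_loc=50.0, errors_per_loc=0.01, cost_per_error=5_000.0)
    print(reuse_percent(product))         # ~16.7
    print(reuse_cost_avoidance(product))  # 800,000 + 1,000,000 = 1,800,000
    print(reuse_value_added(product))     # 1.7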
Examining these metrics and goals will help identify the questions that need to be asked to achieve the goals and, finally, indicate what metrics are appropriate. At IBM, we seek to achieve the goals outlined in the beginning of this paper. To help guarantee that the metrics were a reasonable reflection of the levels of reuse, we carefully defined "what is reuse." To achieve the goals of increasing the level of reuse and encouraging reuse within IBM we asked:

1. What is the current level of reuse?
2. What is this worth, in dollars?
3. How do I encourage people to build reusable software?

Three reuse metrics were developed to answer these questions. The metrics combine the observable data elements in an intuitive way. The first two metrics indicate the level of reuse activity in an organization as a percentage of products and by financial benefit. The third metric includes recognition for writing reusable code. The three metrics, which are summarized in Table 4, are [29]:

1. Reuse Percent: the primary indicator of the amount of reuse in a product or practiced in an organization. Reuse Percent is derived from SSI, CSI, and RSI.

2. Reuse Cost Avoidance: an indicator of reduced total product costs as a result of reuse in the product. Reuse Cost Avoidance is derived from SSI, CSI, RSI, error rates, software development cost (Cost per LOC), and maintenance costs (Cost per Error).

3. Reuse Value Added: an indicator of the leverage provided by practicing reuse and by contributing to the reuse practiced by others. Reuse Value Added is derived from SSI, RSI, and SIRBO.

+---------------------------------------------------------------------+
| Table 4. Derived Metrics                                            |
+-----------------------+--------+--------------------------+---------+
| METRIC                | SYMBOL | DERIVED FROM:            | UNIT OF |
|                       |        |                          | MEASURE |
+-----------------------+--------+--------------------------+---------+
| Reuse Percent         | Reuse% |                          | Percent |
|   o for products      |        |   o SSI, RSI             |         |
|   o product releases  |        |   o CSI, RSI             |         |
|   o organizations     |        |   o SSI, RSI             |         |
+-----------------------+--------+--------------------------+---------+
| Reuse Cost Avoidance  | RCA    | SSI or CSI, RSI,         | Dollars |
|                       |        | Cost/LOC, Errors/LOC,    |         |
|                       |        | Cost/Error               |         |
+-----------------------+--------+--------------------------+---------+
| Reuse Value Added     | RVA    | SSI, RSI, SIRBO          | Ratio   |
+-----------------------+--------+--------------------------+---------+

3.2.9 "Rolling-up" the numbers

It is relatively easy to gather data and report on the progress of autonomous products or organizations. It is much more difficult to summarize the results of numerous unrelated projects that may report to or be under the control of a single manager. For example, if every first-level manager owns a product, that manager may be able to calculate project metrics in a straightforward manner and report them to second-level management. However, what does the second-level manager report to third-level management? There are several solutions.

1. Vary the organizational boundary. The second-level manager only considers RSI to be from parts that were developed and are maintained outside the second-level organization. This means that although there may be high levels of reuse within the organization, the level of reuse may shrink when reported up the "chain of command." For example, if a site manager only reports RSI from components obtained from other sites, the site reuse percent may be at or near zero.
Not only is this not appreciated by the higher-level reporting managers, it is misleading because it does not reflect the level of reuse within the organization. Finally, it may be very difficult to obtain data at each level of organization, and the reporting process becomes much more complicated.

2. Average of all sub-organizations. This option has the advantage of being very easy to compute because it only requires the information reported by the sub-organizations. However, this method could be misleading because it does not consider the relative efforts of the sub-organizations. For example, if one of the sub-organizations has a very high reuse level but is responsible for very little product relative to the other sub-organizations, the second-line manager would report a reuse level in excess of what is really happening.

3. Sum SSI and unique RSI. In this option, each second-level manager simply adds the SSI reported by each of the reporting first-level organizations. However, only unique occurrences of reused components may contribute to the RSI for the second-level organization. As in the first option, this may cause the reported level of reuse to shrink. For example, consider a second-line organization that consists of three departments: Department A, B, and C. Department A develops a 100 LOC macro for use by all three departments. Departments B and C will both report 100 RSI from that macro. However, the second-line manager would only report 100 RSI. This option provides partial credit for reuse within the organization but requires the higher-level organizations to monitor the observable data used by the sub-organizations.

4. Sum SSI and RSI. This option provides a weighted average of the levels of reuse in each of the sub-organizations. Like option 3, the higher-level organizations must maintain more data. However, IBM uses this option because it is a fairly good indicator of the level of reuse in a hierarchical reporting structure.

4.0 Related Work

Since quantifying a process is an essential step in assessing its success and effectiveness, there are several measurement methods currently in use. However, the metrics in this paper are unique in the attention given to the definition of RSI and in attempting to present reuse as "real effort saved." Although [3] differentiates between reuse within an organization and reuse from sources external to the organization, no other paper, including those in the list of cited references, addresses how to measure the classes of reuse or provides a concentrated definition of RSI.

5.0 Future Work

The solutions presented reflect the current reuse measurement and ROI model at IBM. However, the software development business changes rapidly, and so, therefore, will the definition of RSI. IBM is already moving towards defining RSI as the use of unmodified, high-quality, externally supported parts retrieved from the corporate reuse library.

Application of the reuse metrics and collection of data continues. The metrics and IBM ROI model use actual values for reuse data where available; otherwise, default values based on industry experience are substituted in the equations. For example, actual costs to develop new code and standard software development defect and maintenance data are usually known, and defect data (errors) for reusable components are routinely gathered.
However, data needs to be continually collected and studied to compare with industry experience and to maintain the accuracy of the model.

This paper discusses reuse measurements for software only. Future work will include methods to quantify reuse in areas other than software (e.g., Design, Test Case, Information Development).

6.0 Conclusion

Sound business decisions based on accurate measurements are essential to the management of any process. This paper discusses the issues surrounding the development of reuse measurements; responsible and equitable resolution of the issues is critical to their acceptance. Technical leaders and managers demand responsible solutions because they insist that the values of the metrics be realistic, accurate, and form the basis for investment decisions. Management demands equitable solutions because it is judged and compared based on reuse performance and results. These issues define the rules for such comparisons.

The goals of the organization define reuse and how to address the numerous implementation issues. At IBM, the reuse goals are to provide reasonable representations of reuse activity and to encourage reuse. The IBM reuse metrics model consists of three metrics: Reuse Percent, Reuse Cost Avoidance, and Reuse Value Added. The metrics provide reliable input to the corporate reuse ROI model, where the benefits attributed to reuse are carefully defined. The metrics also serve to encourage reuse by providing feedback on the results of a reuse program and by highlighting the benefits of an organizational reuse effort.

7.0 Cited References

[1] Albrecht, A.J., and J.E. Gaffney, "Software Function, Source Lines of Code, and Development Effort Prediction: A Software Science Validation," IEEE Transactions on Software Engineering, SE-9, 1983, pp. 639-648.

[2] Albrecht, A.J., "Measuring Application Development Productivity," in Proceedings of the Joint IBM/SHARE/GUIDE Application Symposium, October 1979, pp. 83-92.

[3] Banker, Rajiv D., Robert J. Kauffman, Charles Wright, and Dani Zweig, "Automating Output Size and Reusability Metrics in an Object-Based Computer Aided Software Engineering (CASE) Environment," unpublished manuscript, 25 August 1991.

[4] Barnes, B.H. and T.B. Bollinger, "Making Reuse Cost Effective," IEEE Software, Vol. 8, No. 1, January 1991, pp. 13-24.

[5] Bollinger, T.B., and S.L. Pfleeger, "Economics of reuse: issues and alternatives," Information and Software Technology, Vol. 32, No. 10, December 1990, pp. 643-652.

[6] Basili, V.R., and D.M. Weiss, "A methodology for collecting valid software engineering data," IEEE Transactions on Software Engineering, Vol. SE-10, 1984, pp. 728-738.

[7] Boehm, B.W., "Improving Software Productivity," IEEE Computer, 20, 1987, pp. 43-57.

[8] Bowen, Gregory M., "An Organized, Devoted, Project-Wide Reuse Effort," Ada Letters, Vol. 12, No. 1, January/February 1992, pp. 43-52.

[9] Brooks, F.P., "No Silver Bullet," IEEE Computer, April 1987, pp. 10-19.

[10] Dabin, "Software Reuse and CASE tools," Proceedings of the International Computer Software and Applications Conference, 1991.
[11] Daskalantonakis, M.K., "A Practical View of Software Measurement and Implementation Experiences within Motorola," IEEE Transactions on Software Engineering, Vol. 18, No. 11, November 1992, pp. 998-1010.

[12] Devries, Peter, "Object Oriented Metrics," IBM Document (draft Version 1.0), December 23, 1992.

[13] Dreger, J.B., Function Point Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1989.

[14] Favaro, John, "What price reusability? A case study," Ada Letters, Vol. 11, No. 3, Spring 1991, pp. 115-124.

[15] "Software Engineering Strategies," Strategic Analysis Report, Gartner Group, Inc., April 30, 1991.

[16] "Corporate Programming Measurements (CPM)," V 4.0, IBM Internal Document, 1 November 1991.

[17] Gaffney, J.E. and R.D. Cruickshank, "A General Economics Model of Software Reuse," Proceedings of the International Conference on Software Engineering, Melbourne, Australia, 11-15 May 1992, pp. 327-337.

[18] Firesmith, D.G., "Managing Ada projects: the people issues," Proceedings of TRI-Ada '88, Charleston, WV, USA, 24-27 October 1988, pp. 610-619.

[19] Goldberg, Adele, "Reuse: Truth or Fiction," panel position, OOPSLA '92, Vancouver, Canada, 18-22 October 1992.

[20] Humphrey, Watts S., "Characterizing the Software Process: A Maturity Framework," IEEE Software, March 1988, pp. 73-79.

[21] "IBM Reuse Methodology: Qualification Standards for Reusable Components," IBM Document Number Z325-0683, 2 October 1992.

[22] Jones, T.C., "Reusability in Programming: A Survey of the State of the Art," IEEE Transactions on Software Engineering, Vol. SE-10, No. 5, September 1984.

[23] Kolton, Philip and Anita Hudson, "A Reuse Maturity Model," Position Paper at the 5th Annual Workshop on Software Reuse, Center for Innovative Technology, Herndon, VA, 18-22 November 1991.

[24] Krueger, C.W., "Software Reuse," Computing Surveys, Vol. 24, No. 2, June 1992, pp. 131-183.

[25] Margano, Johan, and Lynn Lindsey, "Software Reuse in the Air Traffic Control Advanced Automation System," paper for the Joint Symposia and Workshops: Improving the Software Process and Competitive Position, Alexandria, VA, 29 April-3 May 1991.

[26] McCullough, Paul, "Reuse: Truth or Fiction," panel position, OOPSLA '92, Vancouver, Canada, 18-22 October 1992.

[27] Mukhopadhyay, Tridas and Sunder Kekre, "Software Effort Models for Early Estimations of Process Control Applications," IEEE Transactions on Software Engineering, Vol. 18, No. 10, October 1992, pp. 915-924.

[28] Poulin, Jeffrey S. and W.E. Hayes, "IBM Reuse Methodology: Measurement Standards," IBM Corporation Document Number Z325-0682, 2 October 1992.

[29] Poulin, Jeffrey S., "Measuring Reuse," Proceedings of the 5th International Workshop on Software Reuse, Palo Alto, California, 26-29 October 1992.

[29] Poulin, Jeffrey S. and Joseph M. Caruso, "A Reuse Measurement and Return on Investment Model," Proceedings of the Second International Workshop on Software Reusability, Lucca, Italy, 24-26 March 1993, pp. 152-166.
Caruso "A Reuse Measurement and Return on Investment Model," Proceedings of the Second International Workshop __________________________________________________ on Software Reusability, Lucca, Italy, 24-26 March 1993, pp. 152-166. ________________________ [30] Poulin, Jeffrey S. and Joseph M. Caruso, "Determining the Value of a Corporate Reuse Program," Proceedings of the IEEE Computer Society _____________________________________________ International Software Metrics Symposium, Baltimore, MD, 21-22 May 1993. _________________________________________ [31] Poulin, Jeffrey S., Debera Hancock and Joseph M. Caruso, "The Business Case for Software Reuse," to appear in the IBM Systems Journal, Vol. 32, ____________________ No. 3., Fall 1993. [32] Prieto-Diaz, Ruben, "Domain Analysis for Reusability," Proceedings of _______________ COMPSAC '87, 1987, pp. 23-29. ____________ [33] Reifer, Donald J., "Reuse Metrics and Measurement- A Framework," pre- sented at the NASA/Goddard Fifteenth Annual Software Engineering Work- _________________________________________________________ shop, 28 November 1990. _____ [34] "Repository Guidelines for the Software Technology for Adaptable, Reli- able Systems (STARS) Program," CDRL Sequence Number 0460, 15 March 1989. [35] Tausworthe, R.C, "Information models of software productivity: limits on productivity growth," Journal of Systems and Software, Vol 19, No. 2, ________________________________ Oct, 1992, pp. 185-201. 8.0 Biography JEFFREY S. POULIN joined IBM's Reuse Technology Support Center, Poughkeepsie, New York, in 1991 as an advisory programmer. His primary responsibilities include developing and applying corporate standards for reusable component classification, certification, and measurements. Dr. Poulin has worked in the area of software reuse since 1985 and helped lead the development and acceptance of the IBM software reuse metrics. He partic- ipates in the IBM Corporate Reuse Council, the Association for Computing Machinery, and Vice-Chairs the Mid-Hudson Valley Chapter of the IEEE Computer Society. A Hertz Foundation Fellow, Dr. Poulin earned his Bachelors degree at the United States Military Academy at West Point, New York, and his Masters and Ph.D. degrees at Rensselaer Polytechnic Institute in Troy, New York. +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 10 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 12 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 16 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 18 File: ISSUES SCRIPT) DSMBEG323I STARTING PASS 2 OF 2. +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 10 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 12 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 16 File: ISSUES SCRIPT) +++EDF154E Value of WIDTH attribute on the TABLE tag exceeds page width. (Page 18 File: ISSUES SCRIPT)