Measuring Software Reusability

Jeffrey S. Poulin
Loral Federal Systems-Owego

Abstract

This paper examines various approaches to measuring software reusability. Knowing what makes software "reusable" can help us learn how to build new reusable components and help us to identify potentially useful modules in existing programs. The paper begins by establishing a taxonomy of approaches to reusability metrics based on their empirical or qualitative orientation. The paper then examines the disciplines, theories, and techniques used by numerous existing reusability measurement methods as they relate to the taxonomy. Recognizing that most of these methods focus exclusively on internal characteristics of components and ignore environmental factors, the paper challenges reusability researchers to incorporate domain attributes into their metrics. In fact, the application domain represents a critically important factor in whether or not we find a component reusable.
The research, framework, and conclusions should provide a useful reference for persons interested in ways to determine the reusability of software.

KEYWORDS: Software Reuse, Reusability Metrics.

1.0 Overview

1.1 Motivation

Many believe software reuse provides the key to enormous savings and benefits in software development; the U.S. Department of Defense alone could save $300 million annually by increasing its level of reuse by as little as 1% [1]. However, we have yet to identify a reliable way to quantify what we mean by "reusable" software. Such a measure may help us not only to learn how to build reusable components but also to identify reusable components among the wealth of existing programs.

Existing programs contain years of knowledge and experience gained from working in the application domain and meeting the organization's software needs. If we could extract this information efficiently, we could gain a valuable resource upon which to build future applications. Unfortunately, working with existing software poses several significant problems due to its inconsistent quality, style, documentation, and design. We need an economical way to find domain knowledge and useful artifacts within these programs.

1.2 Measuring reusability

This paper examines metrics used to determine the reusability of software components. In light of the recent emphasis on software reuse, numerous research efforts have attempted to quantify our ability to use a component in new contexts. Many of the known empirical methods use a version of complexity metrics to measure reusability. However, many other objective and subjective criteria contribute to the question: WHAT MAKES SOFTWARE REUSABLE?
One possible measure of a component's reusability comes from its success: how many other application modules access this common code? Other measures come from static code metrics generated automatically by a variety of commercial tools. Additional qualitative measures include: adherence to formatting standards and style guidelines, completeness of testing, and existence of supporting information such as integration instructions and design documentation. This paper examines these various approaches to measuring reusability; note that it distinguishes reusability metrics from those that define reuse [34], the level of reuse in an organization [33], or reuse Return on Investment (ROI) models [11], [35]. Where those reports focus on the level of reuse and its benefits, reusability metrics seek ways to identify the software that will bring those savings.

2.0 A Taxonomy of Reusability Metrics

Approaches to measuring reusability fall into two basic methods: empirical and qualitative. Because empirical methods depend on objective data, they have the desirable characteristic that a tool or analyst can usually calculate them automatically and cheaply. Qualitative methods generally rely on a subjective value attached to how well the software adheres to some guidelines or principles. Although this allows us to attach a value to an abstract concept, collecting qualitative data often requires substantial manual effort.

Within each method, the metrics tend to focus on one of two areas. In the first, the metrics address attributes unique to the individual module, such as the number of source statements it contains, its complexity, or the tests it successfully passed. In the second, the metrics take into account factors external to the module, such as the presence and/or quality of supporting documentation. When we refer to a module together with all of these supporting pieces of information, we call it a component.
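To make the module-oriented side of this distinction concrete, the simplest empirical metrics can be computed directly from source text. The sketch below is purely illustrative: the function name and the line-based counting rules are our own assumptions, not drawn from any of the surveyed methods, and a real metric tool would parse the language properly rather than match line prefixes.

```python
def module_metrics(source: str) -> dict:
    """Two naive module-oriented empirical metrics for C-like source:
    the source statement count and the comment-to-source ratio.
    Hypothetical and line-based; real tools use a language parser."""
    lines = [ln.strip() for ln in source.splitlines() if ln.strip()]
    # Treat lines opening or continuing a comment as comment lines.
    comments = [ln for ln in lines if ln.startswith(("//", "/*", "*"))]
    statements = [ln for ln in lines if ln not in comments]
    return {
        "source_statements": len(statements),
        "comment_ratio": len(comments) / max(len(statements), 1),
    }

demo = """\
/* Reads a record. */
int read_record(FILE *f, Record *r) {
    // check arguments first
    if (f == NULL) return -1;
    return fread(r, sizeof(Record), 1, f);
}
"""
print(module_metrics(demo))
```

A component-oriented metric would extend this by also scoring the artifacts that surround the module, such as its design documentation and test reports.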
In summary, these methods take the form shown in the following taxonomy:

TAXONOMY OF REUSABILITY METRICS

o EMPIRICAL METHODS
  - Module oriented
    -- Complexity based
    -- Size based
    -- Reliability based
  - Component oriented
o QUALITATIVE METHODS
  - Module oriented
    -- Style guidelines
  - Component oriented
    -- Certification guidelines
    -- Quality guidelines

2.1 Related work

We can relate reusability to portability, since both involve the use of a component in a new context [31]. A 1990 U.S. Department of Defense report concluded that the ultimate measure of portability comes from the number of source lines of code that a programmer must change to get a module to execute in a different environment. This study recommended developing a mathematical function to convert the amount of changed code to a portability value. However, the study went on to state that it could not draw conclusions on reusability factors.

Software complexity metrics reveal internal characteristics of a module, a collection of modules, or object-oriented programs [9]. Studies indicate that complex modules cost the most to develop and have the highest rates of failure. McCabe and Halstead developed the two most widely known complexity metrics:

o The McCabe Cyclomatic Complexity metric links the number of logical branches (decisions) in a module to the difficulty of programming [27]. McCabe melds a graph theory approach with software engineering: if you represent the logic structure of a module as a flowchart (a graph) and count the regions of the graph caused by program flow statements (do-while, if-then-else), the number of regions corresponds to the complexity of the program. If the number of regions, V(G), exceeds 10, the module may have too many changes of control.

o Halstead's Software Science metrics link studies on human cognitive ability to software complexity [15].
Halstead's approach parses the program or problem statement into tokens and classifies the tokens into operators (verbs, functions, procedures) and operands (nouns, variables, files). Equations based on these tokens give program complexity in terms of a variety of indicators, including estimated effort, program volume, and size.

Not surprisingly, basic software engineering principles address many of the aspects that might make software reusable. This particularly applies to those qualities that make software maintainable. One study based on this premise constructs a taxonomy of 92 attributes affecting the ease of maintaining software [46]. Among other findings, the study concludes that maintenance effort correlates highly with the Halstead complexity metrics, a finding corroborated by reusability researchers.

Program comprehension relates closely to program complexity [47]. Criteria and conditions for program comprehension show that we can translate theoretic numerical measures from software complexity back to empirical conditions. This means that we can describe program comprehension with empirical axioms. From the software reuse perspective, we could use these techniques to understand software for reuse and to identify potentially reusable components. In one example, a prototype tool uses candidate criteria to identify abstract data types in existing program code [7]. The tool applies the criteria in an experiment that analyzes five different programs for the purposes of (1) reverse engineering/re-engineering and (2) identifying and extracting reusable components from the programs.

Whether a programmer will choose to use an existing component in a new situation also depends on how quickly the programmer can assimilate what the component does and how to use it. Program understanding methods address this problem.
These methods attempt to present the important information about a component to the user in a way the user can quickly assess [28]. For example, recognizing that expert programmers organize the important information about a component into mental templates, Lin and Clancy developed a visual template containing this same information. Their study shows that by using a standard layout, a potential reuser can quickly scan the important aspects of a component, such as text descriptions, pseudocode, illustrations, and implementation information [24]. Understanding how good reusable software works not only helps the programmer learn how to write good reusable software, it increases the chances the programmer will use more of what already exists.

The discussion of what makes software reusable has taken place for a long time. In 1984, Matsumoto stressed qualities such as generality, definiteness (the degree of clarity or understandability), transferability (portability), and retrievability as the major characteristics leading to the reusability of a component [25]. Nevertheless, quantitative measures remain elusive. Next, we will discuss some of the reasons why.

2.2 Issues

One reason we find it so hard to develop reusability metrics comes from the fact that no one completely understands "design for reuse" issues [4]. Given that humans often do not agree on what makes a component reusable, obtaining an equation that quantifies the concept poses a significant challenge. To put it simply, we need to define reusability before we can quantify it.

To illustrate this point, Woodfield, Embley, and Scott conducted an experiment in which 51 developers had to assess the reusability of an Abstract Data Type (ADT) in 21 different situations [45]. They found developers untrained in reuse did poorly; the developers based their decisions on unimportant factors such as the size of the ADT and ignored important factors such as the effort needed to modify the ADT.
As a result, the study recommends developing tools and education that can help developers assess components for reuse, and suggests a reusability metric based on the effort needed to modify a component, as reflected by the number or percentage of operations to add or modify.

Not only must a reusability metric involve a wide variety of input parameters, it must define the inter-relationships among the parameters and their relative importance. The parameters must exhibit certain qualities such as statistical independence [13]. Well-formed metrics should not have any correlation between the elements that make up the metric [23], an issue Selby addresses as part of his statistical study [40].

Once we identify a potentially reliable metric, we must next look at causality. In other words, if we find that smaller modules get reused more often, does this mean the small size made the module more reusable, and that we should build all reusable modules small? Metrics must carefully separate the factors leading to the findings.

Finally, some metric values may give conflicting reuse information. For example, do we desire a low value or a high value for module complexity? Complex modules may indicate potential trouble spots and warrant additional testing, re-design, or further decomposition. However, some algorithms may have high complexity values independent of design. Does the reuser stay away from these modules, or does the reuser take advantage of the greater payoff that comes from not having to re-develop complex code?

3.0 Empirical Methods

The following methods primarily use objective, quantifiable attributes of software as the basis for a reusability metric. Most use module-oriented attributes, but the methods used to interpret the attributes vary greatly.

3.1 Prieto-Diaz and Freeman

In their landmark paper on faceted classification, Prieto-Diaz and Freeman identify five program attributes and associated metrics for evaluating reusability [37].
Their process model encourages white-box reuse and consists of finding candidate reusable modules, evaluating each, deciding which module the programmer can modify most easily, then adapting the module. In this model they identify four module-oriented metrics and a fifth metric used to modify the first four. The following list shows the five metrics and gives a description of each:

1. Program size. Reuse depends on a small module size, as indicated by lines of source code.

2. Program structure. Reuse depends on a simple program structure, as indicated by fewer links to other modules (low coupling) and low cyclomatic complexity.

3. Program documentation. Reuse depends on excellent documentation, as indicated by a subjective overall rating on a scale of 1 to 10.

4. Programming language. Reuse depends on the programming language to the extent that it helps to reuse a module written in the same language. If a reusable module in the same language does not exist, the degree of similarity between the target language and the one used in the module affects the difficulty of modifying the module to meet the new requirement.

5. Reuse experience. The experience of the reuser in the programming language and in the application domain affects the previous metrics, because every programmer views a module from their own perspective. For example, programmers will have different views of what makes a "small" module, depending on their background. This fifth metric serves to modify the values of the other metrics.

3.2 Selby

To derive measures of reusability, we must look at instances where reuse succeeded and try to determine why. Selby provides a statistical study of the reusability characteristics of software using data from a NASA software environment [40].
NASA used the production environment to develop ground support software in FORTRAN for controlling unmanned spacecraft. The study provides statistical evidence, based on non-parametric analysis of variance, on the contributions of a wide range of code characteristics. The study validated most of the findings listed below at the .05 level of confidence, showing that most modules reused without modification:

o Have a smaller size, generally less than 140 source statements.
o Have simple interfaces.
o Have few calls to other modules (low coupling).
o Have more calls to low-level system and utility functions.
o Have fewer input-output parameters.
o Have less human interaction (user interface).
o Have good documentation, as shown by the comment-to-source statement ratio.
o Have experienced few design changes during implementation.
o Took less effort to design and build.
o Have more assignment statements than logic statements per source statement.
o Do not necessarily have low code complexity.
o Do not depend on project size.

3.3 Chen and Lee

Although Selby's evidence did not find a statistically significant correlation between module complexity and reusability, other studies show such a link. In one example, Chen and Lee developed about 130 reusable C++ components and used these components in a controlled experiment to relate the level of reuse in a program to software productivity and quality [8]. In contrast to Selby, who worked with professional programmers, Chen and Lee's experiment involved 19 students who had to design and implement a small database system. The software metrics collected included the Halstead size, program volume, program level, estimated difficulty, and effort. They also included McCabe complexity and the Dunsmore live variable and variable span metrics [10]. They found that the lower the values for these complexity metrics, the higher the programmer productivity.
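The size and complexity attributes that recur in these studies are cheap to compute once a module has been tokenized. The sketch below is a simplified illustration only: the decision-keyword list and the naive token split are our own assumptions, and real McCabe and Halstead tools work from the parsed control-flow graph and language grammar.

```python
import math
import re

# Keywords counted as decision points; a simplifying assumption, since
# a real McCabe tool derives V(G) from the module's control-flow graph.
DECISIONS = {"if", "elif", "while", "for", "and", "or"}

def cyclomatic(tokens):
    """Approximate V(G) as one plus the number of decision points."""
    return 1 + sum(1 for t in tokens if t in DECISIONS)

def halstead_volume(operators, operands):
    """Halstead volume V = N * log2(n), where N counts all operator and
    operand occurrences and n counts the distinct symbols."""
    N = len(operators) + len(operands)
    n = len(set(operators)) + len(set(operands))
    return N * math.log2(n)

# A one-line module body, split naively into word and symbol tokens.
tokens = re.findall(r"\w+|[^\s\w]+", "if x > 0 and x < limit : y = x * 2")
operators = [t for t in tokens if not t[0].isalnum() or t in DECISIONS]
operands = [t for t in tokens if t not in operators]
print("V(G)   =", cyclomatic(tokens))
print("volume =", round(halstead_volume(operators, operands), 1))
```

Against Selby's findings, note that these two values need not move together: a module can be small yet decision-dense, which is one reason the studies treat size and complexity as separate attributes.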
3.4 Caldiera and Basili

Caldiera and Basili [6] state that basic reusability attributes depend on the qualities of correctness, readability, testability, ease of modification, and performance, but they acknowledge that we cannot directly measure or predict most of these attributes. Therefore, the paper proposes four candidate measures of reusability based largely on the McCabe and Halstead metrics. This module-oriented approach has the advantage that tools can automatically calculate all four metrics and a range of values for each:

1. Halstead's program volume. A module must contain enough function to justify the costs of retrieving and integrating it, but not so much function as to jeopardize quality.

2. McCabe's cyclomatic complexity. Like Halstead's volume, the acceptable values for the McCabe metric must balance cost and quality.

3. Regularity. Regularity measures the readability and the non-redundancy of a module's implementation by comparing the actual versus estimated values of Halstead's two length metrics. A clearly written module will have an actual Halstead length close to its theoretical Halstead length.

4. Reuse frequency. Reuse frequency indicates the proven usefulness of a module and comes from the number of static calls to the module.

The paper goes on to calculate these four metrics for software in nine example systems, noting that the four metrics show a high degree of statistical independence.

3.5 REBOOT

The ESPRIT-2 project called REBOOT (Reuse Based on Object-Oriented Techniques) developed a taxonomy of reusability attributes. As part of the taxonomy, they list four reusability factors, a list of criteria for each factor, and a set of metrics for each criterion [22].
Although some of the metrics depend on subjective items such as checklists, an analyst can compute many of the metrics directly from the code, such as complexity, fan-in/fan-out, and the comment-to-source-code ratio. The analyst combines the individual metric values into an overall value for reusability. The following list defines the four reusability factors; Table 1 gives the criteria and metrics for each factor.

o Portability. The ease with which someone can transfer the software from one computer system to another. Criteria include:
  - Modularity
  - Environment independence

o Flexibility. The number of choices a programmer has in determining the use of the component; also referred to as "generality." Criteria include:
  - Generality
  - Modularity

o Understandability. The ease with which a programmer can understand the component. Criteria include:
  - Code complexity
  - Self descriptiveness
  - Documentation quality
  - Module complexity

o Confidence. The subjective probability that a component will perform without failure over a specified time in a new environment. Criteria include:
  - Module complexity
  - Observed reliability
  - Error tolerance

+------------------------------------------------------------------+
| Table 1. The REBOOT reusability metrics for the four reusability |
| factors                                                          |
+---------------------+--------------------------------------------+
| CRITERIA            | METRIC                                     |
+---------------------+--------------------------------------------+
| Generality          | Generality checklist                       |
+---------------------+--------------------------------------------+
| Modularity          | Code / number of methods                   |
+---------------------+--------------------------------------------+
| Environment inde-   | Machine-dependent code / executable code   |
| pendence            | System-dependent code / executable code    |
+---------------------+--------------------------------------------+
| Code complexity     | Cyclomatic complexity                      |
+---------------------+--------------------------------------------+
| Self                | Comments / source code                     |
| descriptiveness     | Self-descriptiveness checklist             |
+---------------------+--------------------------------------------+
| Documentation       | Documentation / source code                |
| quality             | Documentation checklist                    |
+---------------------+--------------------------------------------+
| Module complexity   | Fan-in, fan-out                            |
|                     | Cyclomatic complexity                      |
+---------------------+--------------------------------------------+
| Reliability         | Total number of tests                      |
|                     | Number of observed errors                  |
+---------------------+--------------------------------------------+
| Error tolerance     | Error tolerance checklist                  |
+---------------------+--------------------------------------------+

3.6 Hislop

Hislop discusses three approaches to evaluating software: function, form, and similarity [19]. Evaluating software on function helps to select components based on what the component does. In fact, many parts collections use a hierarchical organization based on function. The second approach, form, characterizes the software based on observable characteristics such as size or structure. This approach lends itself well to code analysis tools that report on numbers of source statements or code complexity.
The third approach, similarity, compares modules and groups them based on shared attributes. Hislop's work uses ideas drawn from plagiarism detection, where instructors seek to identify cases of students "reusing" each other's programs. However, this type of analysis also helps in the study of reusability in a number of ways. First, a tool can identify potentially reusable modules in existing software by finding modules with a similarity metric close to those of successfully reused modules. Second, identifying groups of similar modules can help in domain analysis by showing opportunities for reuse. Third, the method can help automate reuse metrics by locating instances of informal reuse, with or without modification.

Hislop's prototype tool, SoftKin, consists of a data collector and a data analyzer. The collector parses the software and calculates measures of form for each module. The analyzer computes the similarity measures based on a variety of form metrics such as McCabe complexity and structure profile metrics.

3.7 Boetticher and Eichmann

Recognizing the difficulty of defining what humans seem to accept as an intuitive notion of reusability, Boetticher and Eichmann take an alternative approach to reusability metrics. They base their work on training neural networks to mimic a set of human evaluators [4], [5]. They conducted experiments that varied several neural network configurations and each network's ability to match the output provided by a group of four humans. Using an Ada repository, the study used commercial metric tools to generate over 250 code parameters, with the goal of determining the best possible association between the parameters and the human assessments. The input parameter selection contained code parameters representing module complexity, adaptability, and coupling.
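The idea of fitting a model to reproduce human reusability ratings can be sketched in miniature. The code below substitutes a single linear unit trained by gradient descent for the multi-layer neural networks of the actual experiments, and both the code parameters and the human scores are invented for illustration.

```python
# Each hypothetical sample pairs three normalized code parameters
# (size, cyclomatic complexity, coupling) with a human reusability
# rating in [0, 1]. All values are invented for illustration.
SAMPLES = [
    ((0.2, 0.1, 0.1), 0.9),
    ((0.9, 0.8, 0.7), 0.2),
    ((0.4, 0.3, 0.2), 0.7),
    ((0.8, 0.9, 0.6), 0.3),
]

def predict(w, b, x):
    # A single linear unit: a simplified stand-in for the networks
    # used in the Boetticher and Eichmann experiments.
    return b + sum(wi * xi for wi, xi in zip(w, x))

def train(samples, epochs=5000, rate=0.1):
    # Stochastic gradient descent on squared prediction error.
    w, b = [0.0, 0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in samples:
            err = predict(w, b, x) - y
            b -= rate * err
            w = [wi - rate * err * xi for wi, xi in zip(w, x)]
    return w, b

w, b = train(SAMPLES)
for x, y in SAMPLES:
    print(f"human={y:.1f}  model={predict(w, b, x):+.2f}")
```

A fitted model of this kind can then rank unseen components by predicted rating, which is the role the trained networks play in the study.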
The experiment proceeded in three phases, using input parameters significant in black box reuse (reuse without modification), white box reuse (reuse with modification), and grey box reuse (a combination of black box and white box reuse). Black box parameters included source statements, physical size, file size, and number of inputs/outputs. White box parameters included Halstead volume, cyclomatic complexity, coupling, and size. Grey box metrics combined the black box and white box inputs.

The experiment strove for a best fit through sensitivity analysis on the parameters selected for the training vectors, the neural network configuration, and the extent of network training. Although the black box and white box results showed little correlation with the expected outputs (.18 and .57, respectively), the grey box results correlated very well (.86) with the expert ratings. The experiment concluded that neural networks could serve as an economical, automatic tool to generate reusability rankings of software.

3.8 Torres and Samadzadeh

Torres and Samadzadeh conducted a study to determine whether information theory metrics and reusability metrics correlate [42]. Information theory measures information content as the level of entropy in a system (entropy reflects the degree of uncertainty or unpredictability in a system). This study examined the effects of two information theory metrics, entropy loading and control structure entropy, on software reusability.

Entropy loading reflects the degree of inter-process communication. Since the amount of communication required between parts of any system drives entropy up, information theory seeks ways to reduce entropy by designing systems with minimal communication between sub-systems. Applying this concept to software, the researchers measure the amount of communication required between modules and assign a value for entropy loading to each module.
Entropy loading corresponds to the software engineering concepts of coupling and cohesion; programs that possess small values for entropy loading should also possess properties consistent with good program structure and reusability.

Control structure entropy seeks to measure the complexity of a module's logic structure, as reflected by the number of if-then statements in the module. Like cyclomatic complexity, control structure entropy provides a value for module complexity.

An experiment that calculated the two information theory metrics on six programs (three in Ada, three in C) found that high entropy loading (coupling) had a negative effect on reuse, while low control structure entropy (complexity) had a positive effect. The study concluded that a possible relationship exists between information theory metrics and reusability metrics. Consequently, these metrics might help select the optimum reuse case from among different reuse candidates.

3.9 Mayobre

To help identify reusable workproducts in existing code, Mayobre describes a method called Code Reusability Analysis (CRA) [26]. CRA melds three reusability assessment methods and an economical estimation model to identify reusable components, resulting in a combination of automatic analysis and expert analysis to determine both technical reusability and economic value. CRA uses the Caldiera and Basili method as one of the three methods.

The second method, called the Domain Experience Based Component Identification Process (DEBCIP), depends on an extensive domain analysis of the problem area and uses a decision graph to help domain experts identify reusable components. The primary output of the DEBCIP is an estimate of the expected number of reuse instances for the component.

The third method, called the Variant Analysis Based Component Identification Process (VABCIP), also uses domain knowledge, but to a lesser degree.
It uses cyclomatic complexity metrics to estimate the specification distance between the existing module and the required module, giving an estimate of the effort needed to modify the component. The last step of the reusability analysis consists of estimating the Component Return On Investment, a process that compares the estimated costs of reuse with the expected benefits.

A test of this method with 7-8 engineers and 40k source statements in the data communication domain showed a very high correlation of about 88% between the Caldiera and Basili metrics and expert analysis. The full CRA including VABCIP had better results, but took up to four weeks to complete and required a domain expert, a domain analyst, and a software engineer familiar with software metrics.

3.10 NATO

The NATO Standard for Software Reuse Procedures recommends tracking four metrics as indicators of software quality and reusability [30]:

o Number of inspections. The number of times someone has considered the module for reuse.

o Number of reuses. The number of times someone has actually reused the module.

o Complexity. The complexity of the code, normally based on the McCabe complexity metric.

o Number of problem reports. The number of outstanding defects in the module.

The standard suggests these metrics provide a rough estimate of the reusability of a component and can help eliminate unsuitable candidates. For example, potential reusers should look for components with a high number of prior reuses, but remain skeptical of components with a high number of inspections and few actual reuses. The standard recommends reusing components with low complexity values and fewer known defects.

3.11 US Army Reuse Center

The Army Reuse Center (ARC) inspects all software submitted to the Defense Software Repository System (DSRS) [38].
As part of that inspection, each component undergoes a series of reusability evaluations [32]. The preliminary screening consists of coarse measures of module size in source statements, the estimated effort to reuse the component without modification, the estimated effort needed to develop a new component rather than reuse one, the estimated yearly maintenance effort, and the number of expected reuses of the component.

Following this screening, the ARC conducts several other assessments and tests. They calculate an initial and a final reusability metric using a commercially available Ada source code analysis tool. The initial analysis uses 31 metrics supplied by the tool; the final analysis uses a total of 150 metrics. Table 2 lists a subset of the 31 metrics used in the initial analysis by metric category, and gives the metric threshold values required by the ARC.

+------------------------------------------------------------------+
| Table 2. Partial list of ARC reusability metrics                 |
+-------------------+---------------------------------+------------+
| CATEGORY          | METRIC                          | THRESHOLD  |
|                   |                                 | VALUE      |
+-------------------+---------------------------------+------------+
| Anomaly Manage-   | Normal_Loops                    | 95%        |
| ment              | Constrained_Subtype             | 80%        |
|                   | Contraint_Error                 | 80%        |
|                   | Constrained_Numerics            | 90%        |
|                   | Constraint_Error                | 0%         |
|                   | Program_Error                   | 0%         |
|                   | Storage_Error                   | 0%         |
|                   | Numeric_Error                   | 0%         |
|                   | User_Exception_Raised           | 100%       |
+-------------------+---------------------------------+------------+
| Independence      | No_Missed_Closed                | 0%         |
|                   | Fixed_Clause                    | 100%       |
|                   | No_Pragma_Pack                  | 0%         |
|                   | No_Machine_Code_Stmt            | 100%       |
|                   | No_Impl_Dep_Pragmas             | 0%         |
|                   | No_Impl_Dep_Attrs               | 0%         |
+-------------------+---------------------------------+------------+

4.0 Qualitative Methods

Because finding and agreeing upon a purely objective reusability metric often proves difficult, many organizations provide subjective guidance on identifying or building reusable
software [12], [16], [17], [20], [29], [38]. These guidelines help ameliorate the problem of not knowing exactly how to define reusability by giving an intuitive description of what a reusable component ought to look like. The guidelines range in content from general discussions about designing for reuse to very detailed rules and specific design points. Usually module-oriented, they cover points such as formatting and style requirements. Although guidelines primarily belong to the general areas of designing for reuse or building for reuse, organizations may develop reusability metrics based on how well a component meets the published standard.

Rather than expand on the many available guidelines, this section presents some general "reusability" attributes and two examples showing how these attributes translate to reusability principles. Finally, this section ends with an example of a component-oriented approach.

4.1 General reusability attributes

Most sets of reusability guidelines have much in common: the value they add to their organizations comes from code-specific rules and from their level of detail. In general, the guidelines reflect the same software characteristics as those promoted by good software engineering principles [36]. This emphasizes the fact that reuse requires a focus on the basic problem of good software design and development. Table 3 gives a high-level summary of these software engineering concepts as seen by the Software Technology for Adaptable, Reliable Systems (STARS) Program [41].

+------------------------------------------------------------------+
| Table 3.
General attributes of reusable software | +----------------+-------------------------------------------------+ | ATTRIBUTE | DESCRIPTION | +----------------+-------------------------------------------------+ | Ease of under- | The component has thorough documentation, | | standing | including self-documenting code and in-line | | | comments. | +----------------+-------------------------------------------------+ | Functional | The component has all the required operations | | completeness | for the current requirement and any reasonable | | | future requirements. | +----------------+-------------------------------------------------+ | Reliability | The component consistently performs the adver- | | | tised function without error and passes | | | repeated tests across various hardware and | | | operating systems. | +----------------+-------------------------------------------------+ | Good error and | The component isolates, documents, and handles | | exception han- | errors consistently. It also provides a | | dling | variety of options for error response. | +----------------+-------------------------------------------------+ | Information | The component hides implementation details from | | hiding | the user, for example, internal variables and | | | their representation. It also clearly defines | | | the interfaces to other operations and data. | +----------------+-------------------------------------------------+ | High cohesion | The component does a specific, isolated func- | | and low cou- | tion with minimal external dependencies. | | pling | | +----------------+-------------------------------------------------+ | Portability | The component does not depend on unique hard- | | | ware or operating system services. | +----------------+-------------------------------------------------+ The application of these abstract concepts will in large part determine their success. 
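Although the guidance in Table 3 is qualitative, a team can still operationalize it as a simple review checklist. The sketch below, in Python, turns a reviewer's yes/no judgments into a rough score; the attribute names and the equal weighting are illustrative assumptions, not part of the STARS guidelines themselves.

```python
# Hypothetical checklist scorer for the attributes of Table 3.
# Equal weighting is an assumption; an organization could weight
# attributes by importance instead.

STARS_ATTRIBUTES = [
    "ease_of_understanding",
    "functional_completeness",
    "reliability",
    "error_handling",
    "information_hiding",
    "cohesion_and_coupling",
    "portability",
]

def checklist_score(assessment):
    """Return the fraction of Table 3 attributes a reviewer judged satisfied.

    `assessment` maps attribute name -> bool (a subjective yes/no rating).
    Unassessed attributes count as not satisfied.
    """
    satisfied = sum(1 for a in STARS_ATTRIBUTES if assessment.get(a, False))
    return satisfied / len(STARS_ATTRIBUTES)

rating = checklist_score({
    "ease_of_understanding": True,
    "reliability": True,
    "portability": True,
})
print(round(rating, 2))  # 3 of 7 attributes satisfied -> 0.43
```

A score produced this way inherits all the subjectivity of the underlying judgments; it merely makes the assessment repeatable across reviewers.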
In situations where one team provides shared software to the rest of a project, the products this team builds must be general enough for multiple uses and must carry enough supporting information to make them easily usable. If the team does not meet the needs of its customers, the customers will not use the software. The team must make the software general by carefully examining each customer requirement and abstracting the necessary detail.

4.2 The 3 C Model

The "3 C Model" of reusable software components takes its name from its three constituent design points [44]:

o Concept. What abstraction the component represents.
o Content. How the component implements the abstraction.
o Context. The environment in which the component operates.

The Concept of a component relates to its specification or interface; it gives a black-box perspective of the component's function. The Content relates to the actual algorithm, or code, that implements the function abstracted by the Concept. The Context refers to those parts of the environment that affect the use of the component; in other words, the Context defines the component's dependencies when used in another application.

The 3C Model seeks to isolate Concept-, Content-, and Context-specific dependencies from each other during the design and implementation of a module. By building a clear boundary between each, a reuser can modify one without affecting the others. For example, several module implementations may exist to serve the same interface, thereby allowing the reuser to select the best implementation to meet specific constraints or performance criteria. Successful implementation of the 3C Model would allow developers to treat modules like "software integrated circuits" by plugging them into sockets in an application framework.
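The separation the 3C Model prescribes can be sketched in any language that distinguishes interfaces from implementations. In the Python sketch below, the Stack abstraction, both implementations, and the client function are all invented for illustration; the model itself prescribes no particular language or mechanism.

```python
from abc import ABC, abstractmethod

# A minimal sketch of the 3C Model: Concept (interface), Content
# (interchangeable implementations), Context (client dependencies).

class Stack(ABC):                      # Concept: the abstraction's interface
    @abstractmethod
    def push(self, item): ...
    @abstractmethod
    def pop(self): ...

class ListStack(Stack):                # Content: one implementation
    def __init__(self):
        self._items = []
    def push(self, item):
        self._items.append(item)
    def pop(self):
        return self._items.pop()

class BoundedStack(Stack):             # Content: an alternative implementation
    def __init__(self, capacity):
        self._items, self._capacity = [], capacity
    def push(self, item):
        if len(self._items) >= self._capacity:
            raise OverflowError("stack full")
        self._items.append(item)
    def pop(self):
        return self._items.pop()

def drain(stack: Stack, items):        # Context: the client depends only on
    for i in items:                    # the Concept, so either Content can
        stack.push(i)                  # plug into this "socket"
    return [stack.pop() for _ in items]

print(drain(ListStack(), [1, 2, 3]))   # [3, 2, 1]
```

Because `drain` names only the Concept, swapping `ListStack` for `BoundedStack` requires no change to the client, which is exactly the "software integrated circuit" effect the model aims for.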
4.3 University of Maryland

Two example sets of coding guidelines to increase the reusability of Ada modules resulted from a set of reuse-related projects at the University of Maryland [2], [3]. These guidelines fall into two categories: those based on data binding and those based on transformation. Data binding measures the strength of data coupling between modules. Transformation formally quantifies the number and types of modifications required to adapt a module into something reusable. By writing modules to reduce data-binding dependencies and the number of expected changes required, a programmer can develop more reusable modules. The following guidelines reflect software engineering principles applied to Ada:

Reuse guidelines based on data bindings

o Avoid multiple-level nesting in any of the language constructs.
o Minimize use of the "use" clause.
o The interfaces of subprograms should use the appropriate abstraction for the parameters passed in and out.
o Components should not interact with their outer environment.
o Appropriate use of packaging can greatly enhance reusability.

Reuse guidelines based on transformation

o Avoid direct access into record components except in the same declarative region as the record type declaration.
o Minimize non-local access to array components.
o Keep direct access to data structures local to their declarations.
o Avoid the use of literal values except as constant value assignments.
o Avoid mingling resources with application-specific contexts.
o Keep interfaces abstract.

Providing and enforcing a component's compliance with completeness or quality standards also provides a way to enhance reusability. For a developer to use a software module efficiently, the developer must have access to other information such as design documents, integration instructions, test cases, and legal information [39].
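The data-binding idea behind the first set of guidelines above rests on counting references a module makes to names defined outside it. The sketch below applies that idea to Python source using the standard `ast` module, purely as an illustration; the Maryland work targets Ada and uses its own analysis tooling.

```python
import ast

# Illustrative data-binding counter: how many name references in a
# module's source resolve outside the module's own declarations?
# Fewer external references suggest looser data coupling.

def external_name_refs(source, local_names):
    """Count Name loads in `source` that are not defined locally."""
    tree = ast.parse(source)
    return sum(1 for node in ast.walk(tree)
               if isinstance(node, ast.Name)
               and isinstance(node.ctx, ast.Load)
               and node.id not in local_names)

module_src = "total = rate * hours + bonus"
print(external_name_refs(module_src, {"rate", "hours"}))  # 1: only `bonus`
```

A module rewritten to take `bonus` as a parameter instead of reaching into its outer environment would score zero here, which is the direction the guidelines push.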
Table 4 lists the kinds of supporting information IBM provides to potential reusers. A component receives one of three quality ratings based, in part, on the completeness and quality of the information supporting the reusable module.

+------------------------------------------------------------------+
| Table 4. Information helpful when reusing software               |
+----------------+-------------------------------------------------+
| ATTRIBUTE      | DESCRIPTION                                     |
+----------------+-------------------------------------------------+
| Abstract       | Provides a clear, concise description of the    |
|                | component.                                      |
+----------------+-------------------------------------------------+
| Change history | Describes changes to the code, who made them,   |
|                | the date of the changes, and why.               |
+----------------+-------------------------------------------------+
| Dependencies   | Describes prerequisite software and other soft- |
|                | ware the component uses.                        |
+----------------+-------------------------------------------------+
| Design         | Describes the internal design of the code and   |
|                | major design decisions.                         |
+----------------+-------------------------------------------------+
| Interfaces     | Describes all inputs, outputs, operations,      |
|                | exceptions, and any other side effects visible  |
|                | to the reuser.                                  |
+----------------+-------------------------------------------------+
| Legal          | Provides a summary of legal information and     |
|                | restrictions, such as license and copyright     |
|                | information.                                    |
+----------------+-------------------------------------------------+
| Performance    | Describes the time requirements, space require- |
|                | ments, and any performance considerations of    |
|                | the algorithm.                                  |
+----------------+-------------------------------------------------+
| Restrictions   | Lists any situations that limit the usability   |
|                | of the component, such as nonstandard compiler  |
|                | options and known side effects.                 |
+----------------+-------------------------------------------------+
| Sample         | Provides a usage scenario showing how the com-  |
|                | ponent applies to a specific problem.           |
+----------------+-------------------------------------------------+
| Test           | Contains information about the test history,    |
|                | procedures, results, and test cases.            |
+----------------+-------------------------------------------------+
| Usage          | Provides helpful information on how to inte-    |
|                | grate the component.                            |
+----------------+-------------------------------------------------+

5.0 A Common Model for Reusability

Table 5 summarizes the reusability metrics discussed in this paper. Most empirical methods take a module-oriented approach since modules provide many easily measurable attributes. Of these attributes, the following appear common to most of the empirical methods:

o Reusable modules have low module complexity.
o Reusable modules have good documentation (a high number of non-blank comment lines).
o Reusable modules have few external dependencies (low fan-in and fan-out, few input and output parameters).
o Reusable modules have proven reliability (thorough testing and low error rates).

The qualitative methods vary in their application of software engineering principles, code-specific implementation issues, and level of detail. When they assign a reusability value to a module, they typically base the value on a subjective assessment of how well a module complies with a set of guidelines. Component-oriented approaches not only specify code standards but also list the required supporting information that a developer must have to reuse a component effectively.

+------------------------------------------------------------------+
| Table 5.
Reusability metric summary                                         |
+---------------------------------+-------+-------+-------+-------+
| METHOD                          |   E   |   Q   |   M   |   C   |
+---------------------------------+-------+-------+-------+-------+
| Prieto-Diaz and Freeman         |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Selby                           |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Chen and Lee                    |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Caldiera and Basili             |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| REBOOT                          |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Hislop                          |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Boetticher and Eichmann         |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Torres and Samadzadeh           |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Mayobre                         |   ✓   |       |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| NATO (2 approaches)             |   ✓   |   ✓   |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| Army (2 approaches)             |   ✓   |   ✓   |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| STARS                           |       |   ✓   |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| U. Maryland                     |       |   ✓   |   ✓   |       |
+---------------------------------+-------+-------+-------+-------+
| IBM                             |       |   ✓   |       |   ✓   |
+---------------------------------+-------+-------+-------+-------+
| NOTE:                                                           |
|   E = Empirical                                                 |
|   Q = Qualitative                                               |
|   M = Module oriented                                           |
|   C = Component oriented                                        |
+-----------------------------------------------------------------+

6.0 Domain Considerations

Tracz observed that for programmers to reuse software, they must first find it useful [43].
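The common empirical profile summarized in section 5.0 — low complexity, good documentation, few external dependencies, proven reliability — can be combined into a screening predicate. In the sketch below the threshold values and the all-or-nothing combination are invented for illustration; none of the surveyed methods prescribes exactly these numbers.

```python
# Illustrative screening predicate over the four attributes common to
# the empirical methods. All thresholds are assumptions.

def looks_reusable(metrics):
    checks = [
        metrics["cyclomatic_complexity"] <= 10,       # low module complexity
        metrics["comment_ratio"] >= 0.20,             # adequate documentation
        metrics["fan_in"] + metrics["fan_out"] <= 7,  # few dependencies
        metrics["errors_per_kloc"] <= 1.0,            # proven reliability
    ]
    return all(checks)

module = {
    "cyclomatic_complexity": 6,
    "comment_ratio": 0.35,
    "fan_in": 2,
    "fan_out": 3,
    "errors_per_kloc": 0.4,
}
print(looks_reusable(module))  # True
```

Such a predicate captures only the internal, module-oriented view; as the discussion that follows argues, it says nothing about whether the module fits any domain.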
In some sense, researchers have fully explored most traditional methods of measuring reusability: complexity, module size, interface characteristics, etc. However, although many recognize the importance of the problem domain to reuse, few have linked this effect to the ultimate "usefulness" of a component. The usefulness of a component depends as much on the framework in which it fits as it does on the internal characteristics of the component. In other words, the real benefits of reuse occur following:

1. an analysis of the problem domain,
2. capture of the domain architecture, and
3. building or including components in that architecture.

Although low coupling, high cohesion, modularity, etc., give general indications as to the ease of use of a component, they cannot, by themselves, provide a generic measure of reusability. Reusability comes from adhering to a domain architecture and to the above software engineering principles. With the exception of [26], the above methods do not include any domain characteristics; their input parameters come from observable or readily obtainable data about the software component. If the reusability of a component depends on context, then reusability metrics must also include domain and environment characteristics. Research issues must include ways to quantify a domain's size, stability, and maturity [21].

6.1 Applying Metric Theory

The fact remains that most published work in software metrics, including reusability metrics, does not follow the important rules of measurement theory. Informally, any number we assign to a component's reusability must preserve the intuitive understanding and observations we make about that component; this understanding leads us to put a value on the metric. More formally, if we observe component A as less reusable than component B, then our reusability measurement, M, must preserve M(A) < M(B).
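This order-preservation requirement (measurement theory's representation condition) is easy to state operationally. In the sketch below, the components, the empirical ranking, and both candidate metrics are hypothetical; the point is only the validity check itself.

```python
from itertools import combinations

# Representation condition, sketched: a candidate reusability metric M
# must preserve the empirical ordering "x is less reusable than y".

empirical_rank = {"A": 1, "B": 2, "C": 3}   # observed: A < B < C in reusability

def preserves_order(M, rank):
    """True iff rank[x] < rank[y] implies M[x] < M[y] for all pairs."""
    return all(M[x] < M[y]
               for x, y in combinations(sorted(rank, key=rank.get), 2))

good_metric = {"A": 0.2, "B": 0.5, "C": 0.9}
bad_metric  = {"A": 0.7, "B": 0.5, "C": 0.9}   # inverts A and B

print(preserves_order(good_metric, empirical_rank))  # True
print(preserves_order(bad_metric, empirical_rank))   # False
```

The check is trivial once an empirical ordering exists; the hard part, as the next paragraphs argue, is that software engineers in different contexts do not agree on the ordering in the first place.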
The first problem in metric theory comes from our trying to put an empirical value on a poorly understood attribute. To make matters worse, the relations we define generally fall apart when we change contexts. We often find that a component we feel has very high reusability attributes in domain X and environment S has none of these characteristics in domain Y and environment T. Software engineers working in these two situations would never reach consensus on the reusability of this individual component. This makes the search for a general reusability metric seem futile for the same reasons Fenton discusses regarding the search for a general software complexity metric [14].

Metric theory tells us that we can look for metrics that assign an empirical value to specific attributes or views of reusability, but not an overall reusability rating. Furthermore, if research truly reveals that context affects our view of a component's reusability as much as the component's internal attributes do, then we may some day find a metric M that maps the tuple (internal attributes, domain attributes, environment attributes) to M.

7.0 Conclusion

Although reusability guidelines and module-oriented metrics provide an intuitive feel for the general reusability of a component, we need to work on proving that they actually reflect a component's reuse potential. Until researchers can agree on this issue, we will not develop a uniform metric. Existing techniques show a wide range of ways to address this problem, ranging from empirical to qualitative methods. So far, the results mostly indicate a need to first define a suitable scope for future research. This scope must include the effects of domain and environment when we talk about measuring the "reusability" of individual components.
8.0 Acknowledgements

This paper expands an earlier work with current research and input from recent discussions on reuse@wunet.wustl.edu. I would also like to thank Will Tracz and Marilyn Gaska of Loral Federal Systems-Owego for their insights during the preparation of this paper.

Author's address: poulinj@lfs.loral.com or MD 0124, Loral Federal Systems-Owego, New York 13827.

9.0 Cited References

[1] Anthes, Gary H., "Software Reuse Plans Bring Paybacks," Computerworld, Vol. 27, No. 49, pp. 73, 76.
[2] Bailey, John W. and Victor Basili, "Software Reclamation: Improving Post-Development Reusability," 8th Annual National Conference on Ada Technology, 1990.
[3] Basili, Victor R., H. Dieter Rombach, John Bailey, and Alex Delis, "Ada Reusability Analysis and Measurement," Empirical Foundations of Information and Software Science V, Atlanta, GA, 19-21 October 1988, pp. 355-368.
[4] Boetticher, G., K. Srinivas, and D. Eichmann, "A Neural Net-based Approach to Software Metrics," Proceedings of the 5th International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, 14-18 June 1993, pp. 271-274.
[5] Boetticher, Gary and David Eichmann, "A Neural Network Paradigm for Characterizing Reusable Software," Proceedings of the 1st Australian Conference on Software Metrics, 18-19 November 1993.
[6] Caldiera, Gianluigi and Victor R. Basili, "Identifying and Qualifying Reusable Software Components," IEEE Computer, Vol. 24, No. 2, February 1991, pp. 61-70.
[7] Canfora, G., Cimitile, A., Munro, M., and
Tortorella, M., "Experiments in Identifying Reusable Abstract Data Types in Program Code," Proceedings IEEE Second Workshop on Program Comprehension, Capri, Italy, 8-9 July 1993, pp. 36-45.
[8] Chen, Deng-Jyi and P.J. Lee, "On the Study of Software Reuse Using Reusable C++ Components," Journal of Systems and Software, Vol. 20, No. 1, January 1993, pp. 19-36.
[9] Chidamber, Shyam R. and Chris F. Kemerer, "Towards a Metrics Suite for Object Oriented Design," Proc. OOPSLA 1991, ACM Press, October 1991, pp. 197-211.
[10] Conte, S.D., H.E. Dunsmore, and V.Y. Shen, Software Engineering Metrics and Models, Benjamin Cummings, NY, 1986.
[11] Cruickshank, Robert D. and John E. Gaffney, Jr., "The Economics of Software Reuse," Software Productivity Consortium, SPC-92119-CMC, Version 01.00.00, September 1991.
[12] Edwards, Stephan, "An Approach for Constructing Reusable Software Components in Ada," Strategic Defense Organization Pub # Ada233 662, Washington, D.C., September 1990.
[13] Fenton, Norman E., Software Metrics: A Rigorous Approach, Chapman & Hall, London, UK, 1991.
[14] Fenton, Norman E., "Software Measurement: A Necessary Scientific Basis," IEEE Transactions on Software Engineering, Vol. SE-20, No. 3, March 1994, pp. 199-206.
[15] Halstead, Maurice H., Elements of Software Science, Elsevier North-Holland, New York, 1977.
[16] Hooper, James W. and Chester, Rowena O., "Software Reuse Guidelines," U.S. Army Institute for Research in Management Information, Communication, and Computer Sciences, ASQB-GI-90-015, April 1990.
[17] Hooper, James W. and Chester, Rowena O.,
Software Reuse Guidelines and Methods, Plenum Press, 1991.
[18] Griss, Martin and Will Tracz, eds., "WISR'92: 5th Annual Workshop on Software Reuse Working Group Reports," ACM SIGSOFT Software Engineering Notes, Vol. 18, No. 2, April 1993, pp. 74-85.
[19] Hislop, Gregory W., "Using Existing Software in a Software Reuse Initiative," The Sixth Annual Workshop on Software Reuse (WISR'93), Owego, New York, 2-4 November 1993.
[20] Hollingsworth, Joe, Software Component Design-for-Reuse: A Language Independent Discipline Applied to Ada, Ph.D. Thesis, Dept. of Computer and Information Science, The Ohio State University, Columbus, OH, 1992.
[21] Isoda, Sadahiro, "Experience Report on Software Reuse Project: Its Structure, Activities, and Statistical Results," Proceedings of the International Conference on Software Engineering, Melbourne, Australia, 11-15 May 1992, pp. 320-326.
[22] Karlsson, Even-Andre, Guttorm Sindre, and Tor Stalhane, "Techniques for Making More Reusable Components," REBOOT Technical Report #41, 7 June 1992.
[23] Kitchenham, Barbara and Kari Kansala, "Inter-item Correlations among Function Points," Proceedings of the IEEE Computer Society International Software Metrics Symposium, Baltimore, MD, 21-22 May 1993, pp. 11-15.
[24] Linn, Marcia C. and Michael J. Clancy, "The Case for Case Studies of Programming Problems," Communications of the ACM, Vol. 35, No. 3, March 1992, p. 121.
[25] Matsumoto, Yoshihiro, "Some Experience in Promoting Reusable Software Presentation in Higher Abstraction Levels," IEEE Transactions on Software Engineering, Vol. 10, No. 5, September 1984, pp. 502-513.
[26] Mayobre, Guillermo, "Using Code Reusability Analysis to Identify Reusable Components from the Software Related to an Application Domain," Fourth Annual Workshop on Software Reuse (WISR'91), Reston, VA, 18-22 November 1991.
[27] McCabe, T.J., "A Complexity Measure," IEEE Transactions on Software Engineering, Vol. SE-2, 1976, pp. 308-320.
[28] Musser, David R. and Alexander A. Stepanov, The Ada Generic Library, Springer-Verlag, New York, 1989.
[29] NATO, "Standard for the Development of Reusable Software Components," NATO Communications and Information Systems Agency, 18 August 1991.
[30] NATO, "Standard for Management of a Reusable Software Component Library," NATO Communications and Information Systems Agency, 18 August 1991.
[31] Pennell, James P., "An Assessment of Software Portability and Reusability for the WAM Program," Institute for Defense Analysis, Alexandria, VA, October 1990.
[32] Piper, Joanne C. and Wanda L. Barner, "The RAPID Center Reusable Components (RSCs) Certification Process," U.S. Army Information Systems Software Development Center - Washington, Ft. Belvoir, VA.
[33] Poulin, Jeffrey S. and Joseph M. Caruso, "A Reuse Measurement and Return on Investment Model," Proceedings of the Second International Workshop on Software Reusability, Lucca, Italy, 24-26 March 1993, pp. 152-166.
[34] Poulin, Jeffrey S., "Issues in the Development and Application of Reuse Metrics in a Corporate Environment," Fifth International Conference on Software Engineering and Knowledge Engineering, San Francisco, CA, 16-18 June 1993, pp. 258-262.
[35] Poulin, Jeffrey S., Debera Hancock, and Joseph M. Caruso, "The Business Case for Software Reuse," IBM Systems Journal, Vol. 32, No. 4, 1993, pp. 567-594.
[36] Pressman, R.S., Software Engineering: A Practitioner's Approach, McGraw-Hill, 1992.
[37] Prieto-Diaz, Ruben and Peter Freeman, "Classifying Software for Reusability," IEEE Software, Vol. 4, No. 1, January 1987, pp. 6-16.
[38] RAPID, "RAPID Center Standards for Reusable Software," U.S. Army Information Systems Engineering Command, 3451-4-012/6.4, October 1990.
[39] RIG Technical Committee on Asset Exchange Interfaces, "A Basic Interoperability Data Model for Reuse Libraries (BIDM)," Reuse Interoperability Group (RIG) Proposed Standard RPS-0001, 1 April 1993.
[40] Selby, Richard W., "Quantitative Studies of Software Reuse," in Software Reusability, Volume II, Ted J. Biggerstaff and Alan J. Perlis (eds.), Addison-Wesley, Reading, MA, 1989.
[41] STARS, "Repository Guidelines for the Software Technology for Adaptable, Reliable Systems (STARS) Program," CDRL Sequence Number 0460, 15 March 1989.
[42] Torres, William R. and Mansur H. Samadzadeh, "Software Reuse and Information Theory Based Metrics," Proc. 1991 Symposium on Applied Computing, Kansas City, MO, 3-5 April 1991, pp. 437-446.
[43] Tracz, Will, "Software Reuse Maxims," ACM Software Engineering Notes, Vol. 13, No. 4, October 1988, pp. 28-31.
[44] Tracz, Will, "A Conceptual Model for Megaprogramming," SIGSOFT Software Engineering Notes, Vol. 16, No. 3, July 1991, pp. 36-45.
[45] Woodfield, Scott N., David W. Embley, and Del T. Scott, "Can Programmers Reuse Software," IEEE Software, Vol. 4, No. 7, July 1987, pp. 168-175.
[46] Zhuo, Fang, Bruce Lowther, Paul Oman, and Jack Hagemeister, "Constructing and Testing Software Maintainability Assessment Models," Proceedings of the IEEE Computer Society International Software Metrics Symposium, Baltimore, MD, 21-22 May 1993, pp. 61-70.
[47] Zuse, H., "Criteria for Program Comprehension Derived from Software Complexity Metrics," Proceedings IEEE Second Workshop on Program Comprehension, Capri, Italy, 8-9 July 1993, pp. 8-16.