Measuring the Level of Reuse in Object-Oriented Development
Jeffrey S. Poulin
Loral Federal Systems
MD 0210, Owego, NY 13827
Tel: (607) 751-6899, fax: (607) 751-6025
Email: poulinj@lfs.loral.com
28 August 1995
Abstract
Defining what software to count as “reused” software constitutes
the most difficult and meaningful problem in reuse metrics. Experience reports routinely cite
impressive reuse levels and benefits due to reuse. However, the reader cannot trust a report without understanding
what the report counted; e.g., modified software, ported software, generated
software, etc. To date, only one model exists for defining a consistent method
of counting and reporting software reuse.
Although this model applies equally to various development paradigms,
including object-oriented programming, this paper clarifies the model with
respect to objects and expands on issues related to object-oriented reuse
metrics.
Keywords: Software Reuse, Reuse Metrics, Object-Oriented Programming
Workshop Goals: Learn and discuss current reuse issues and methods.
Working Groups: Useful and collectible metrics; Domain analysis and engineering; Software Architectures
To date, only one model exists for
defining a consistent method of counting and reporting software reuse
[Poulin93a]. This model methodically
makes recommendations as to what to count as a Reused Source Instruction
(RSI). The rationale for the model
applies equally to the functional, procedural, and object-oriented (OO)
paradigms because the rationale does not depend on any paradigm-specific
methods. However, recent work in OO
metrics often either fails to address this issue or stops at identifying the
need for a common counting model. As OO
metrics mature and the community identifies the most meaningful metrics and
ranges of values for each [Poulin93b], the metrics must include consistent
definitions to serve as a basis for comparing results.
The definition of Reused Source
Instruction (RSI) applies to all programming paradigms.
This paper describes the three primary OO
reuse methods and briefly discusses the benefits and drawbacks of each. It compares each of these methods to the
analogous practice in traditional procedural development. Finally, the paper presents some issues that
an organization should address when instituting reuse metrics for OO
development.
Expectations of high reuse levels provide
one of the most important motivations for OO software development. In fact, the OO paradigm provides three
primary ways to support reuse:
1. Classes. The encapsulation of function and data
provides for their use via multiple instantiation and in aggregation
relationships with other classes.
2. Inheritance. The extension of a class into subclasses by specializing the
existing class allows subclasses to share common features.
3. Polymorphism. The reduction of program complexity by providing the same
operation on several data types via several implementations for a common interface.
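The three mechanisms can be illustrated with a minimal Python sketch. The names Stack, BoundedStack, UndoLog, and report are hypothetical and chosen only for illustration; they do not come from any of the cited work.

```python
class Stack:
    """A class: encapsulates data (the items) with the functions that use it."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def size(self):
        return len(self._items)

    def describe(self):
        return f"stack of {self.size()} item(s)"


class BoundedStack(Stack):
    """Inheritance: specializes Stack and shares pop() and size() unchanged."""

    def __init__(self, limit):
        super().__init__()
        self._limit = limit

    def push(self, item):
        if self.size() >= self._limit:
            raise OverflowError("stack is full")
        super().push(item)

    def describe(self):
        # Polymorphism: same operation name, a different implementation.
        return f"bounded {super().describe()} (limit {self._limit})"


class UndoLog:
    """Aggregation: holds a Stack instance as a part instead of subclassing."""

    def __init__(self):
        self._history = Stack()

    def record(self, action):
        self._history.push(action)

    def undo(self):
        return self._history.pop()


def report(stacks):
    # The caller relies only on the common interface of the objects.
    return [s.describe() for s in stacks]
```

Whether any of these uses counts as reuse in the metric sense depends on where the Stack class came from, not on the mechanism used to access it.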
Each of these reuse approaches involves
some trade-offs in terms of application and desirability. These trade-offs include the “cleanliness” of the
reuse; in other words, whether or not each component has a traceable heritage
of evolution and consequently whether or not changes to a component will result
in changes throughout the system [Cockburn93].
For example, creating a subclass and overloading a method through
polymorphism can cause unexpected results at execution time in the case of an
unintentional name conflict. Modifying
source code, whether for the method implementation or simply the interface,
will propagate changes to all parts of the system using that method. Alternatively, simple sub-classing (without
polymorphism) or use of a method without modification of source code provides
for clean and safe reuse.
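As a hedged illustration of the name-conflict hazard, consider the following Python sketch. The classes Account and DisplayedAccount are hypothetical. The subclass author intends update to mean "refresh the display" and does not realize that the parent class already uses that name internally.

```python
class Account:
    def __init__(self, balance):
        self.balance = balance

    def update(self, amount):
        # Parent helper: applies a deposit or withdrawal to the balance.
        self.balance += amount

    def deposit(self, amount):
        # Parent behaviour depends on its own update() helper.
        self.update(amount)
        return self.balance


class DisplayedAccount(Account):
    def __init__(self, balance):
        super().__init__(balance)
        self.refreshes = 0

    def update(self, amount=0):
        # Intended as a new "refresh the display" operation, but the name
        # collides with Account.update, so it silently replaces it.
        self.refreshes += 1


plain = Account(100)
shown = DisplayedAccount(100)
print(plain.deposit(50))  # 150
print(shown.deposit(50))  # 100: the deposit is lost at execution time
```

The inherited deposit method has not changed, yet its behaviour has, which is why this kind of reuse is less "clean" than sub-classing that adds features without redefining existing ones.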
This view of OO reuse corresponds to the
testing requirements that inheritance and
polymorphism impose [Binder95]. For testing
purposes, each inherited method requires retesting and each possible binding
of a polymorphic component requires a separate test. The extent of testing depends on the “cleanliness” of the
reuse.
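One conservative way to enumerate the test cases that these two rules imply is sketched below in Python. This is an illustration under my own assumptions, not a procedure taken from [Binder95]: it treats every public method of every class, inherited or not, as a separate test case. The Shape classes and the helper names are hypothetical.

```python
import inspect


def public_methods(cls):
    """Names of public methods available on cls, inherited ones included."""
    return sorted(
        name
        for name, _ in inspect.getmembers(cls, inspect.isfunction)
        if not name.startswith("_")
    )


def required_tests(classes):
    """One (class, method) test case per method per class."""
    return [(cls.__name__, m) for cls in classes for m in public_methods(cls)]


class Shape:
    def area(self):
        return 0.0

    def label(self):
        # Calls area() polymorphically, so its result depends on the binding.
        return f"{type(self).__name__} with area {self.area()}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2


for case in required_tests([Shape, Square, Circle]):
    print(case)
```

Here label is written once but appears three times in the list, once for each class through which it can be bound. Cleaner reuse would let a tester argue that some of these cases are redundant.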
If the features of OO languages do not
guarantee reuse [Griss94], they certainly do not guarantee good
programming. In traditional procedural
development programmers should use language features such as macros, functions,
and procedures when they must repeatedly execute the same operations. The programmer may use techniques (depending
on the implementation language) such as variable macros (a method for
overloading macro definitions) or generic packages (a method for overloading
abstract data types in Ada).
We consider the use of these language
features and techniques “good programming” or “good design.” Likewise, we
expect OO programmers to use
the language features at their disposal (e.g., classes, inheritance, and
polymorphism) to abstract features common to objects and to minimize the amount
of code they must write and maintain.
In traditional development, we consider a
component “reused” when the
programmer avoided having to write software by means of obtaining it from
“someplace else.” Within small teams or organizations, we
expect the use of macros, functions, and procedures for common functions. Repeated calls to these components do not
count as reuse; calling components simply uses the features of procedural
languages. However, if the organization
obtains the software from another organization or product, we count it as reuse.
Likewise, a small team or organization may
build a class hierarchy in which subclasses incrementally specialize the common
aspects of parent classes. We expect
sub-classing, inheritance, and polymorphism in OO development. Of course, multiple instantiations of an
object do not count as reuse for the same reason multiple calls to a function
do not count. Reuse in OO programs
comes from the use of classes and methods obtained from other organizations or
products, thereby resulting in a savings of development effort.
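The counting rule described above can be sketched in Python as follows. The Component record, the organization names, and the sizes are hypothetical, and the reuse level is assumed here to be the reused size divided by the total size; the text above does not fix a formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    owner: str  # organization that developed the component
    size: int   # size in source instructions (e.g., lines of code)


def reuse_level(components, developing_org):
    """Return (RSI, total size, reuse fraction) for one organization.

    A component counts once, no matter how many times it is called,
    instantiated, or subclassed. It counts as reused only when it was
    developed outside developing_org.
    """
    unique = {c.name: c for c in components}.values()
    total = sum(c.size for c in unique)
    rsi = sum(c.size for c in unique if c.owner != developing_org)
    return rsi, total, (rsi / total if total else 0.0)


# Every use of a class in a program built by "team-a" (hypothetical data).
uses = [
    Component("Window", "team-a", 400),
    Component("Window", "team-a", 400),  # second instantiation, counted once
    Component("Dialog", "team-a", 250),  # in-house subclass of Window
    Component("String", "vendor", 350),  # licensed class library
    Component("String", "vendor", 350),  # used again, still counted once
]

print(reuse_level(uses, "team-a"))  # (350, 1000, 0.35)
```

Only the String class crosses the organizational boundary, so only its 350 source instructions count as reused.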
I call the issue of determining when an
organization has saved effort by using software from another organization or
product the boundary problem. In
practice, however, boundaries become obvious due to the structure of the
organization (the people) and/or the structure of the software (in most cases
management structure maps to the software structure). On small projects, “reused” classes come from pre-existing class
libraries, commercially licensed class libraries, or other projects. On large projects, the many organizations
working on the project will reuse each other’s software as well as reuse
software from commercial sources and reuse software perhaps developed for that
purpose by a team dedicated to building shared classes.
One issue involves counting “extra” code in
a reused class. In traditional
development, generalizing a component may require additional code. Take the case of a variable macro
instantiation. Do you count reuse based
on:
1. the code produced in that instantiation.
2. the average code produced over all possible instantiations.
3. the maximum code produced of all possible instantiations.
4. the macro source code needed for that instantiation including error checking
on input parameters and environment variables.
5. the total macro source, to include code for all possible macro instantiations
and the error checking for each, in which case you may give reuse credit for a
significant amount of software the programmer would not have written.
Likewise, generalizing objects may require
additional code to support many possible applications. Take the case of a reused class stack,
which contains numerous methods for testing the status of and manipulating the
stack. However, a developer may only need
the push and pop methods.
Do you count reuse based on:
· code used by the developer (push, pop)
· total code in the reused stack class.
In practice I have found that this
situation occurs rarely, in part because other constraints (such as on object
code size) restrict the practical ability to include more code than a developer
needs. In addition to encouraging the
use of pre-existing classes, this realization supports the counting of the
entire class versus trying to determine the subset of methods actually used.
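The two counting options for the stack example can be compared with a short Python sketch. The method names beyond push and pop and all of the sizes are hypothetical values chosen for illustration.

```python
# Size of each method in a reused stack class, in source instructions.
# All figures are hypothetical.
stack_methods = {
    "push": 12,
    "pop": 10,
    "peek": 6,
    "is_empty": 4,
    "is_full": 5,
    "clear": 7,
    "size": 4,
}

used_by_developer = {"push", "pop"}


def rsi_used_only(methods, used):
    """Count only the methods the developer actually calls."""
    return sum(size for name, size in methods.items() if name in used)


def rsi_whole_class(methods):
    """Count the entire reused class."""
    return sum(methods.values())


print(rsi_used_only(stack_methods, used_by_developer))  # 22
print(rsi_whole_class(stack_methods))                   # 48
```

The whole-class count needs only the class size, which an analyst can usually obtain directly. The used-only count needs a record of which methods each developer calls, which is much harder to collect.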
In fact, data collection affects reuse
metrics because an analyst must usually work with available or easily
obtainable data. For traditional
development, the reuse analyst must know the size of each reused component, the
total size of the software, and the developers and users of each
component. Likewise, the OO analyst
must know the size of each class, the total size of the program, and all the
developers and users of each class.
Currently, traditional metrics use “lines of code” for
measuring component size; although this measure holds for OO, other measures of
effort may include the total number of classes or the total number of methods.
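To show how the choice of size measure changes the reported level, the following Python sketch computes a reuse level from the same data under the three measures mentioned above. The ClassRecord fields and the figures are hypothetical, and the level is again assumed to be reused size divided by total size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassRecord:
    name: str
    reused: bool  # True if obtained from another organization or product
    loc: int      # lines of code
    methods: int  # number of methods


def reuse_levels(records):
    """Reuse fraction under each size measure, keyed by measure name."""
    measures = {
        "lines of code": lambda r: r.loc,
        "classes": lambda r: 1,
        "methods": lambda r: r.methods,
    }
    levels = {}
    for label, size_of in measures.items():
        total = sum(size_of(r) for r in records)
        reused = sum(size_of(r) for r in records if r.reused)
        levels[label] = reused / total if total else 0.0
    return levels


records = [
    ClassRecord("List", reused=True, loc=300, methods=20),
    ClassRecord("Parser", reused=False, loc=500, methods=10),
    ClassRecord("Report", reused=False, loc=200, methods=10),
]

for measure, level in reuse_levels(records).items():
    print(f"{measure}: {level:.0%}")
```

With these figures the same program reports 30% reuse by lines of code, 33% by classes, and 50% by methods, so a report must state which measure it used.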
The definition of reuse and what should
count as a Reused Source Instruction (RSI) (or reused objects) applies to
all programming paradigms. The use of language features such as:
· functions, procedures, generics, and (variable) macros in the procedural paradigm
· classes, inheritance, and polymorphism in the OO paradigm
does not, by itself, constitute “reuse” as much
as it represents a particular method to solve a problem. The rationale for defining reuse in any
paradigm depends not on language features but rather on an assessment of
whether someone saved effort through the use of software they obtained from
someplace else; in short, software they did not have to write.
The solution to determining reuse levels
in programs lies in resolving the boundary problem. We expect organizations to call procedures and inherit methods
that the organization developed for its own use. When an organization avoids development by using software it
obtained from someplace else, we say it “reused” the software.
Numerous references cite reuse as a
principal benefit of object-oriented technology. Kain admits the limited availability of quantitative evidence of
reuse in object-oriented programming and observes that few experience reports
say how they measure reuse [Kain94].
Kain cites evidence that some reports count reuse as the use of classes
and methods across applications (as explained above) and not, for example,
inheritance within an application.
Other reports count inheritance as reuse of the superclass; these differences
prevent the comparison of OO experience reports.
Although Kain recognizes the need to
evaluate OO reuse in the context of multiple teams and organizations (not
individual programmers), Henderson-Sellers claims reuse comes from sub-classing
via inheritance and from multiple instantiations of the same class [HS93]. Although he focuses on the reuse of classes
from reusable class libraries, he includes references to the reuse of classes from
one project to another and from classes built for reuse within the
organization. Henderson-Sellers
discusses OO reuse economics using a simple Cost-Benefit Model but he fails to
compare the model to most other work done in the field.
[Binder95] R.V. Binder. Testing Object-Oriented Systems: A Status Report. Crosstalk, 8(4):16-20, April 1995.
[Cockburn93] A.A. Cockburn. The Impact of Object-Orientation on Application Development. IBM Systems Journal, 32(3):420-444, 1993.
[Griss94] M.L. Griss. PANEL: Object-Oriented Reuse. In Proceedings of the Third International Conference on Software Reuse, 1-4 November 1994.
[HS93] B. Henderson-Sellers. The Economics of Reusing Library Classes. Journal of Object-Oriented Programming, 6(4):43-50, July-August 1993.
[Kain94] J.B. Kain. Making Reuse Cost Effective. Object Magazine, 4(3):48-54, June 1994.
[Poulin93a] J.S. Poulin. Issues in the Development and Application of Reuse Metrics in a Corporate Environment. In Proceedings of the 5th International Conference on Software Engineering and Knowledge Engineering, 16-18 June 1993.
[Poulin93b] J.S. Poulin. Panel: Metrics for Object-Oriented Software Development. In Proceedings of the 6th IBM Object-Oriented Software Development Conference, 19-23 July 1993.
Jeffrey S. Poulin works
as a Senior Programmer with Loral Federal Systems (formerly IBM Federal
Systems Company) in Owego, NY. As a
member of the Advanced Technology Group, he serves as the Principal
Investigator and leader of research into new open systems standards-compliant technologies
in the areas of distributed UNIX system management, networking, and
object-oriented software development.
His responsibilities in Owego include lead architect for the LFS reuse program,
technical lead for a contract in support of reuse across the U.S. Army, and the
reuse strategy for a major Army MIS development program.
From 1991 to 1993, Dr. Poulin worked in the
IBM Reuse Technology Support Center (RTSC) where he led the development and
acceptance of the IBM software reuse metrics and return on investment (ROI)
model. He also organized, edited,
contributed to, and published a complete library of resources on the IBM Reuse
Method. Active in many professional activities,
Dr. Poulin has published over 30 papers on software measurement and reuse. In addition to serving on numerous
conference committees and panels, he chaired the IEEE Computer Society 6th
Annual Workshop on Software Reuse (WISR’93) in November 1993. A Hertz Foundation Fellow, Dr. Poulin earned
his Bachelor's degree at the United States Military Academy at West Point and
his Master's and Ph.D. degrees at Rensselaer Polytechnic Institute in Troy, New
York.