Software metrics and measurement
Software is measured to:
- Establish the quality of the current product or process.
- Predict future qualities of the product or process.
- Improve the quality of a product or process.
- Determine the state of the project in relation to budget and schedule.
Measurements and Metrics
A measurement is an indication of the size, quantity, amount or dimension of a particular attribute of a product or process. For example, the number of errors in a system is a measurement.
A metric is a measurement of the degree to which any attribute belongs to a system, product or process. For example, the number of errors per person-hour would be a metric.
Thus, software measurement gives rise to software metrics.
Metrics are related to the four functions of management:
Metric Classification
Software metrics can be divided into two categories: product metrics and process metrics.
Product metrics are used to assess the state of the product, tracking risks and discovering potential problem areas. The team's ability to control quality is assessed.
Process metrics focus on improving the long term process of the team or organisation.
Goal Question Metric (GQM)
The GQM, developed by Dr. Victor Basili, defines a top-down, goal-oriented framework for software metrics. It approaches software measurement using a three-level model: conceptual, operational, and quantitative. At the conceptual level, goals are set prior to metrics collection. According to the GQM, organisational goals are understood to shape project goals. Organisational goals may be set by upper management or by organisation stakeholders. Project goals may be established through brainstorming sessions held by the project team. At the operational level, one or more questions are established for each goal which, when answered, indicate whether the goal has been achieved. Finally, at the quantitative level, each question has a set of data associated with it which allows it to be answered in a quantitative way.
The GQM is described as a seven step process. (Some authors give a different number of steps, leaving out step 7 or merging two steps into one). The first three steps are crucial and correspond to the main activities at each level in the model as described above.
Step 1. Develop a set of Goals
Develop goals at the corporate, division, or project level. These goals can be established from brainstorming sessions involving project team members, or they may be derived from organisational goals or from stakeholders' requirements.
Basili and Rombach provide a template for recording the purpose, perspective and environment that will add structure to the goal:
The [process, metric, product, etc] is [characterised, evaluated, understood, etc] in order to [understand,improve,engineer,etc] it.
The [cost,defects,changes,etc] are examined from the point of view of the [customer,manager,developer,etc]. E.g. the changes are examined from the developer's viewpoint.
The environment in which measurement takes place is evaluated in terms of people, process, problem factors, tools, and constraints.
Step 2. Develop a set of questions that characterise the goals.
From each goal a set of questions is derived which will determine if each goal is being met.
Step 3. Specify the Metrics needed to answer the questions.
From each question from step two it is determined what needs to be measured to answer each question adequately.
Step 4. Develop Mechanisms for data Collection and Analysis
It must be determined:
- Who will collect the data?
- When will the data be collected?
- How can accuracy and efficiency be ensured?
- Who will be the audience?
Step 5. Collect, Validate and Analyse the Data.
The data may be collected manually or automatically. Metrics data can be portrayed graphically to enhance understanding.
Step 6. Analyse in a Post Mortem Fashion
Data gathered is analysed and examined to determine its conformity to the goals. Based on the findings here recommendations are made for future improvements.
Step 7. Provide Feedback to Stakeholders
The last step, providing feedback to the stakeholders, is a crucial step in the measurement process. It is essentially the purpose behind the previous six steps. Feedback is often presented in the form of one goal per page with the questions, metrics and graphed results.
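The goal, question and metric levels described above map naturally onto a small tree structure. The following is a minimal sketch in Python, not part of the GQM itself: the class names, the `render_goal_page` helper and the example data are all hypothetical and chosen only to illustrate the "one goal per page" feedback format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metric:
    """Quantitative level: a measured value that helps answer a question."""
    name: str
    value: float
    unit: str = ""


@dataclass
class Question:
    """Operational level: a question whose answer indicates goal achievement."""
    text: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Goal:
    """Conceptual level: a goal stated from a particular viewpoint."""
    statement: str
    viewpoint: str
    questions: List[Question] = field(default_factory=list)


def render_goal_page(goal: Goal) -> str:
    """Render one goal with its questions and metrics as a plain-text page."""
    lines = [f"Goal: {goal.statement} (viewpoint: {goal.viewpoint})"]
    for number, question in enumerate(goal.questions, start=1):
        lines.append(f"  Q{number}: {question.text}")
        for metric in question.metrics:
            lines.append(f"    {metric.name} = {metric.value} {metric.unit}".rstrip())
    return "\n".join(lines)


# Hypothetical example data, invented purely for illustration.
goal = Goal(
    statement="Evaluate the change process in order to improve it",
    viewpoint="developer",
    questions=[
        Question(
            text="How quickly are change requests resolved?",
            metrics=[Metric("average resolution time", 3.5, "days")],
        ),
    ],
)
print(render_goal_page(goal))
```

Keeping the three levels as separate types makes it hard to record a metric that is not tied to a question, which is the discipline the top-down approach is meant to enforce.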
What is Six Sigma?
Sigma (σ) is a letter of the Greek alphabet that is used in statistics as the symbol and measure of process variation. The sigma scale relates characteristics such as defects per unit, parts per million defective and the probability of a failure. The concept of Six Sigma (6σ) began in 1986 as a statistically based method to reduce variation in electronic manufacturing processes at Motorola Inc. in the USA. Top management, led by CEO Robert Galvin, developed the concept, and in 1987 Galvin formulated the goal of "achieving Six-Sigma capability by 1992" in a memo to all Motorola employees (Bhote, 1989). The reduction in process variation stayed on track; between 1987 and 1997 cost savings totalled US$13 billion and labour productivity increased by 204% (Losianowycz, 1999).
Nowadays, as the founder for Six Sigma, Motorola University defines Six Sigma as (http://www.motorola.com/content.jsp?globalObjectId=3088):
“Six Sigma has evolved over the last two decades and so has its definition. Six Sigma has literal, conceptual, and practical definitions. At Motorola University, we think about Six Sigma at three different levels:
- As a metric
- As a methodology
- As a management system
Essentially, Six Sigma is all three at the same time.”
"...Six Sigma as a Metric: The term "Sigma" is often used as a scale for levels of 'goodness' or quality. Using this scale, 'Six Sigma' equates to 3.4 defects per one million opportunities (DPMO). Therefore, Six Sigma started as a defect reduction effort in manufacturing and was then applied to other business processes for the same purpose."
What is Software Six Sigma?
From the software process aspect, Six Sigma has become a top-down methodology or strategy to accelerate improvements in the software process and software product quality. It uses analysis tools and product metrics to evaluate the software process and software product quality.
DMAIC vs. DMADV
DMAIC and DMADV are two Six Sigma sub-methodologies.
DMAIC is an abbreviation of Define requirements, Measure performance, Analyse relationships, Improve performance, Control performance. It is the most widely used Six Sigma framework and is applied to improve existing processes. Many methods can be applied in each phase:
|Benchmark||7 basic tools||cause & effect diagrams||Design of experiments||Statistical control|
|Baseline contract/charter||Defect metrics||Failure mode & effects analysis||Modelling||Control charts|
|Kano model||Data collection forms||Decision & risk analysis||Tolerance||Time series methods|
|Voice of the customer||Sampling techniques||Statistical inference||Robust design||Procedural adherence performance|
|Voice of the business||Control charts||Preventive activities|
|Quality function deployment||Capability|
|Process flow map||Reliability analysis|
|Project management||Root cause analysis|
|"Management by fact"||System thinking|
DMADV is an abbreviation of Define requirements, Measure performance, Analyse relationships, Design solutions, Verify functionality. It is the problem-solving framework used within DFSS (Design for Six Sigma) projects.
Define & Measure
The relationship between Six Sigma level and DPMO
|Long-term yield (basically the percentage of successful output or operation)||Defects per million opportunities|
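The relationship between sigma level, long-term yield and DPMO can be computed from the normal distribution. The sketch below assumes the conventional 1.5 sigma long-term shift, which is not stated in the text above but is the convention under which Six Sigma corresponds to 3.4 DPMO; the function names are illustrative only.

```python
from statistics import NormalDist

# Assumption: the conventional 1.5 sigma long-term shift of the process mean.
LONG_TERM_SHIFT = 1.5


def dpmo(sigma_level: float, shift: float = LONG_TERM_SHIFT) -> float:
    """Defects per million opportunities for a given sigma level.

    Uses the one-sided tail of the standard normal distribution beyond
    (sigma_level - shift) standard deviations.
    """
    defect_probability = NormalDist().cdf(shift - sigma_level)
    return defect_probability * 1_000_000


def long_term_yield(sigma_level: float, shift: float = LONG_TERM_SHIFT) -> float:
    """Percentage of opportunities that are defect free."""
    return 100.0 * (1.0 - dpmo(sigma_level, shift) / 1_000_000)


for level in range(1, 7):
    print(f"{level} sigma: yield {long_term_yield(level):.5f}%, DPMO {dpmo(level):.1f}")
```

With these assumptions a sigma level of 6 gives roughly 3.4 DPMO, which agrees with the figure quoted from Motorola University.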
The purpose of process mapping is to help the project team define the project process and depict its inputs, outputs and units of activity. A process map can serve as an instruction manual or as a tool for facilitating detailed analysis and optimization of workflow and service delivery.
According to Stangenberg (9 July 2003), a good process map should:
1) Allow people unfamiliar with the process to understand the interaction of causes during the work-flow.
2) Contain additional information relating to the Six Sigma project, i.e. information per critical step about input and output variables, time, cost and DPU value.
Value stream mapping analytics is also widely used. It can produce Value Analysis, Lean Benchmarking, Value Stream Mapping, Cycle-time Variability Modelling and Discrete Event Simulation.
Analyse
The Analyse phase includes regression techniques, estimating, process modelling, statistical process control and Pareto analysis. A Pareto chart graphically summarises and displays the relative importance of the differences between groups of data.
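The data behind a Pareto chart is simply the categories sorted by frequency together with a running cumulative percentage. A minimal sketch follows; the defect categories and counts are hypothetical and exist only to show the calculation.

```python
def pareto_table(defect_counts):
    """Return (category, count, cumulative %) rows, largest category first."""
    total = sum(defect_counts.values())
    rows = []
    cumulative = 0
    ordered = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        cumulative += count
        rows.append((category, count, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical defect counts per category, invented for illustration.
defects = {
    "interface": 12,
    "logic": 45,
    "documentation": 8,
    "data handling": 25,
    "build": 10,
}

for category, count, cumulative_pct in pareto_table(defects):
    print(f"{category:15} {count:4} {cumulative_pct:6.1f}%")
```

Reading down the cumulative column shows how few categories account for most of the defects, which is where improvement effort is usually directed first.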
Six Sigma addresses problems in the organization. Using the DMAIC approach, the team starts with boundary analysis and qualitative analysis, collects data related to the problem, and then finds the root cause using metrics and analysis tools. Measurements and a solution are then put forward. Continuous control activities should accompany the improvement and optimization phases to ensure that the problem does not recur.
However, Six Sigma also has natural limitations. It uses statistical analysis tools to find defects in the current process, but the solution itself cannot be derived from these statistical analyses. The success rate is also higher when Six Sigma is used to manage small projects than when it is used to resolve large organizational problems.
- Recognizing Improvement Opportunities
- Making Effective Improvements:
  1. Reviews, Inspections, and effective checklist management
  2. Designing for defect prevention
  3. Using Standards
- Measuring Effectiveness
- Tracking Status and Managing Projects
- Managing Effort
- Quality Control Plan
- Managing Defects in Integration & Test
Six Sigma processes are executed by Six Sigma Green Belts and Six Sigma Black Belts, and are overseen by Six Sigma Master Black Belts (Six Sigma Dictionary):
- Six Sigma Green Belts: A Six Sigma practitioner trained in the methodology and tools needed to work effectively on a process improvement team. Green Belts may act as team members under the direction of a Black Belt or may lead their own less complex, high impact projects.
- Six Sigma Black Belts: A Six Sigma expert highly skilled in the application of rigorous statistical tools and methodologies to drive business process improvement.
- Six Sigma Master Black Belts: A Black Belt achieves "Master" status after demonstrating experience and impact over some period of time. Master Black Belts address the most complex process improvement projects and provide coaching and training to Black Belts and Green Belts.
Metrics in Project Estimation
Software metrics are a way of putting a value/measure on certain aspects of development allowing it to be compared to other projects. These values have to be assessed correctly otherwise they will not give accurate measurements and can lead to false estimations, etc.
Metrics are used to maintain control over the software development process. They allow managers to manage resources more efficiently so that the overall team is more productive. Examples of metrics include size projections such as Source Byte Size, Source Lines of Code, Function Points and GUI Features, and productivity projections such as productivity metrics.
Metrics can be used to measure the size, cost, effort and product quality of a project, as well as to put a value on the quality of the process followed and on personal productivity. Certain factors have to be taken into account when comparing different projects with each other using metrics. If one project was written in a different language, then the number of lines of code could be significantly different, or perhaps the larger project has many more errors and bugs in it. Measurements such as Function Points give a better indication because they reflect the functionality actually delivered by the project rather than the number of lines.
Using metrics companies can make much better estimations on the cost, length, effort, etc of a project which leads to them giving a more accurate view of the overall project. This better view will allow the company to bid for projects more successfully, make projects more likely to succeed and will greatly reduce the risk of late delivery, failure, being penalised for late delivery, bankruptcy, etc.
The processes that manage the project and its code can have issues, such as build failures or the need for patches, that affect the measured values. Following a quality management standard such as ISO 9000 can help alleviate these issues.
For smaller companies where customer retention is important, using metrics to better the software development process and improve on overall project quality, delivery time, etc will make the customers happy which may lead to continued business.
Source Byte Size
Source Byte Size is a measure of the actual size of the source code data (e.g. in Kilobytes). It measures the file size vs. packages, classes, methods, etc.
The overall byte size of the project can be estimated which would give a better indication of the type/size of hardware needed to use the software. This becomes an issue on systems/devices where they are of a limited nature (e.g. watch, washing machine, intelligent fridge, etc)
The byte size of the source code would vary greatly depending on which programming language was used to develop it. For example a program developed in Java could be much smaller than the same program coded in COBOL.
If we were to compare the executable byte size of programs written in different languages, we would see a difference too. A compiler translates source code into machine code or byte code, so different compilers produce different executable file sizes.
As more development is done and the project increases, the overall byte size of the project increases. This will give estimations on how much space is needed to store, backup, etc the source files and also the size of transfers over the network. Although this is not much of a problem these days with the cost of storage, transfer bandwidth, etc being so cheap, it is a metric that can be used for this type of estimation (storage).
As the byte size builds up, searching and indexing will take slightly longer every time it increases.
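Source byte size is straightforward to measure by walking the project directory and summing file sizes. The sketch below is one possible way to do this; the function name and the default list of file extensions are assumptions made for the example, not part of any standard definition.

```python
import os


def source_byte_size(root, extensions=(".py", ".java", ".c", ".h")):
    """Sum the size in bytes of all source files found under root.

    The extensions tuple is an assumption for this example; adjust it to
    the languages actually used in the project being measured.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(tuple(extensions)):
                total += os.path.getsize(os.path.join(dirpath, filename))
    return total


if __name__ == "__main__":
    size = source_byte_size(".")
    print(f"Source size: {size / 1024:.1f} KB")
```

Running the same measurement at regular intervals gives the growth trend that storage, backup and transfer estimates can be based on.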
Source Lines of Code (SLOC)
SLOC gives the size of the project by counting all the lines of source code in a project. It is used to measure effort both before as an estimate and after as an actual value. It comes from the days of FORTRAN and assembly coding.
SLOC gives a much clearer image to developers on the size of the project.
When code is written, integration and unit testing can be performed so measures of programming productivity and quality can be assessed.
Source lines of code on their own are not as meaningful as the other metrics. Just because one project has more lines of code than another does not make it more complex or give it a better quality rating. When the number of lines of code is used as a metric, it needs to be combined with other metrics such as product quality. Looking at product quality together with the number of lines of code gives a much better reflection of the overall project's quality, the ratio of good code to buggy code, the efficiency of the code, etc.
Source lines of code can also be measured from another point of view: the actual number of lines of code written in a specific amount of time. At the project level, this would typically be the overall number of lines of code written throughout the project within a specific amount of time.
If the SLOC metric is applied to an individual developer or a team, then the count is based on the lines written by that developer or team respectively.
When developers use auto-generated code (from GUI designers, frameworks, etc.) it can lead to incorrect productivity measures. These lines of code should be removed from the calculation to give a better indication of the actual number of lines written by the developer or team.
SLOC cannot be used directly to compare projects written in different programming languages, because different languages can take more or fewer lines to implement the same functions. The line counts need to be condensed or expanded so that more meaningful values can be compared. Function points give a better indication because the number of function points remains the same (i.e. a function point may take 10 lines in C# and 30 lines in FORTRAN, but there is still only one function). Since the number of lines of code differs even for the same functionality, the effort required may also be quite different.
Comparing projects by SLOC is far more useful when there are orders of magnitude between them.
There are two ways to measure lines of code:
- Physical SLOC = Lines of Source Code + Comments + Blank lines [1]
- Logical SLOC = Number of lines of functional code (“statements”)
[1] If blank lines are less than 25% of the section.
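The two definitions above can be approximated with a simple line classifier. This is a rough sketch under stated assumptions: comments are recognised only as whole lines starting with a single prefix (here `#`), and logical SLOC is approximated by counting non-blank, non-comment lines, whereas a real counter would parse statements. The function name is hypothetical.

```python
def count_sloc(source, comment_prefix="#"):
    """Return approximate physical and logical SLOC for a block of source text."""
    lines = source.splitlines()
    blank = sum(1 for line in lines if not line.strip())
    comments = sum(1 for line in lines if line.strip().startswith(comment_prefix))
    code = len(lines) - blank - comments

    # Blank lines count towards physical SLOC only when they make up
    # less than 25% of the section, as in the note above.
    blank_share = blank / len(lines) if lines else 0.0
    counted_blank = blank if blank_share < 0.25 else 0

    return {
        "physical": code + comments + counted_blank,
        "logical": code,  # approximation: one statement per code line
    }


sample = "\n".join([
    "# add two numbers",
    "",
    "def add(a, b):",
    "    return a + b",
    "",
    "print(add(1, 2))",
    "# done",
    "x = 1",
    "y = 2",
    "z = 3",
])
print(count_sloc(sample))
```

For the sample above, two of the ten lines are blank (20%), so they are included in the physical count.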
Names of SLOC measures:
- K-LOC – 1,000 Lines of Code
- K-DLOC - 1,000 Delivered Lines of Code
- K-SLOC - 1,000 Source Lines of Code
- M-LOC - 1,000,000 Lines of Code
- G-LOC - 1,000,000,000 Lines of Code
- T-LOC - 1,000,000,000,000 Lines of Code
[K – Kilo, M – Mega, G – Giga, T – Tera]
The Function Point metric was described by Allan J. Albrecht at IBM in 1979 and was officially released to the community in 1984. It is a relatively old technique for measuring a software module or property. Function Point Analysis (FPA) is a technique accepted by the International Organization for Standardization (ISO) for measuring the functional size of Information Systems (IS). The Function Point is the measurement unit, or software metric, of FPA, which is an end-user analysis of the functions needed for the software. FPA does not take into account the technology, programming language or tools used for the software project. Function Points are grouped into five types of functionality:
- Internal Logical Files – logical data that is maintained inside the application.
- External Interface Files – logical data that resides outside the application but is referenced by the application being measured.
- External Input – maintenance, management and processing of logical data that enters the application from peripherals and other external sources.
- External Output – logical output of data by the application.
- External Inquiries – requests and responses for external data retrieval.
Each of these types of functionality is assigned a weighting factor. The sum of the weighted counts is known as the Unadjusted Function Points (UFP). Fourteen General System Characteristics (GSCs) are then assessed, and summing the assessed GSC ratings gives the Degree of Influence (DI). The Technical Complexity Factor (TCF) is 0.65 + 0.01 * DI, and the Function Point count is UFP * TCF. Function Point metrics are a very effective way to measure the size of the software at the beginning of the development phase, once the needs and requirements of the software have been established. However, software engineering has developed considerably since the late 1970s, and as practice has moved from procedural to object-oriented programming, many aspects of function points have become outdated or less relevant to modern software development.
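The calculation described above (UFP, DI, TCF and the final count) can be sketched in a few lines. The text does not give the weighting factors, so the example assumes the commonly published average-complexity weights and a 0 to 5 rating for each GSC; the counts and ratings themselves are hypothetical.

```python
# Assumption: commonly published average-complexity weights per function type.
AVERAGE_WEIGHTS = {
    "external_inputs": 4,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 10,
    "external_interface_files": 7,
}


def unadjusted_function_points(counts, weights=AVERAGE_WEIGHTS):
    """UFP: each raw count multiplied by its weight, then summed."""
    return sum(counts[name] * weights[name] for name in counts)


def function_points(counts, gsc_ratings, weights=AVERAGE_WEIGHTS):
    """FP = UFP * TCF, where TCF = 0.65 + 0.01 * DI and DI = sum of GSC ratings."""
    if len(gsc_ratings) != 14:
        raise ValueError("exactly 14 General System Characteristics are expected")
    degree_of_influence = sum(gsc_ratings)
    tcf = 0.65 + 0.01 * degree_of_influence
    return unadjusted_function_points(counts, weights) * tcf


# Hypothetical counts and ratings, invented for illustration.
counts = {
    "external_inputs": 10,
    "external_outputs": 5,
    "external_inquiries": 4,
    "internal_logical_files": 3,
    "external_interface_files": 2,
}
gsc_ratings = [3] * 14  # each characteristic rated on an assumed 0 to 5 scale

print(unadjusted_function_points(counts))
print(round(function_points(counts, gsc_ratings), 2))
```

Because DI ranges from 0 to 70 under this rating scale, TCF adjusts the unadjusted count by at most 35% in either direction.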
Software development needed application extensibility and software reuse. Object-oriented languages such as Java and C++ make it possible to develop applications that are easy to change and to add new functionality by reusing existing resources. When reusing existing resources we have to manage the dependencies between the modules of the application and the changes that may propagate through those dependencies. These factors are described by software package metrics, which are derived from:
- Class Cohesion – a measurement of the unity of purpose in the object or class; elements within the class or object should serve a single purpose as far as possible.
- Coupling – the dependencies between two or more classes. Changing one class may force drastic changes in other classes that are highly coupled to it. With low coupling, changing one class does not require changes in other classes or modules, because modules communicate through well-established interfaces.
- Cardinality – the number of objects in the relationship between classes.
- Open-Closed Principle (OCP) – one of the most important principles of object-oriented design. OCP means that a module is open for extension but closed for modification (changing the source code of the module).
- Single Responsibility – a class should have as few reasons to change as possible. Changes may arise in the class's domain logic or in its format; separating the class into two or more classes, each with its own responsibility, means that a change to one responsibility does not affect the whole class or module.
- Dependency Inversion Principle (DIP) or Inversion of Control – depend upon abstractions, not upon concretions. Every dependency should target an interface or abstract class.
- Interface Segregation Principle – many client-specific interfaces are better than one general-purpose interface. If many clients depend on the functions of the same class, a change required by one client forces changes and recompilation for the others.
- Liskov Substitution Principle – subclasses or derived classes should be substitutable for their base classes; code that uses the base class should continue to function if an instance of a derived class is passed to it.
Many other principles also contribute to software package metrics.
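Coupling between modules can be quantified from a dependency graph. The sketch below uses one common formulation, which is an assumption on top of the text above: efferent coupling (outgoing dependencies), afferent coupling (incoming dependencies) and instability, defined as efferent / (afferent + efferent). The module names are hypothetical.

```python
def coupling_metrics(dependencies):
    """Compute afferent coupling, efferent coupling and instability per module.

    dependencies maps each module name to the set of modules it depends on.
    """
    metrics = {}
    for module, targets in dependencies.items():
        efferent = len(targets)
        afferent = sum(1 for other, deps in dependencies.items()
                       if other != module and module in deps)
        total = afferent + efferent
        # Instability is 0.0 for a module with no dependencies in either direction.
        instability = efferent / total if total else 0.0
        metrics[module] = {
            "afferent": afferent,
            "efferent": efferent,
            "instability": instability,
        }
    return metrics


# Hypothetical dependency graph, invented for illustration.
deps = {
    "ui": {"services", "model"},
    "services": {"model"},
    "model": set(),
}

for name, values in coupling_metrics(deps).items():
    print(name, values)
```

In this example `model` has an instability of 0.0 because other modules depend on it and it depends on nothing, so it should change rarely; `ui` has an instability of 1.0 and can change freely without affecting other modules.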
GUI Features
In the 1970s and 1980s command line interfaces were dominant. Things started to change with Windows, Icons, Menus and Pointing devices (WIMP), and the Graphical User Interface (GUI) replaced console input in the late 1980s. It is very difficult to measure the functionality of an application against its GUI. Much depends on how efficiently the algorithms that solve the problem are implemented when new functionality is added. The number of lines contained in the software, or the number of lines written, is not of great importance. Developers working on a project write on average 10 to 15 lines a day. What really matters is what those lines offer to the user: what functionality do they provide behind the graphical interface?
- It is very difficult to measure GUI metrics because user interfaces have special characteristics. Visibility in design plays a very important role. Functions should be visible to users, easy to find, and the steps that follow should be predictable on a common-sense basis.
- Feedback is related to the notion of visibility: how long should the user wait for a response, e.g. after pressing a button?
- Constraints are a graphical design principle that restrains users from performing particular operations, for example by fading out options on a menu or restricting access to other graphical representations. This minimizes the chance of making a wrong choice.
- Consistency means using similar elements to achieve similar tasks.
- Affordance is an important feature of GUI design: it should be obvious to the user how to use the application. For example, a button is highlighted when the mouse pointer moves over it, so the user readily understands that it can be pressed.
Software Engineering Productivity
Projecting productivity is all about measurements. In general these measurements involve the rate at which a software engineer produces software and the accompanying documentation. While quality is also an important aspect of the produced software, the measurement is not quality oriented. In essence the goal is to measure the useful functionality produced in a unit of time.
Metrics on productivity come in two main categories.
- Size-related measurements based on some output from the software process, e.g. lines of delivered source code, object code instructions, etc.
- Function-related measurements based on the functionality of the deliverables. This is expressed in so-called "function points".
Function points are determined on the basis of a combination of program characteristics such as external inputs and outputs, user interactions, external interfaces and files used by the system, followed by associating a weight with each of these characteristics. Finally a "function point count" is computed by multiplying each raw count by the weight and summing all values.
Challenges in Software Metrics
The biggest challenge in measurement lies in estimating the size and the total elapsed time assigned to programming. In addition, it can be difficult to define elements such as "a line of code" or to determine which programs are part of the system.
General Productivity estimates
Several studies conducted by researchers show the average number of lines of code that represents software productivity in systems and commercial applications. For a description of these studies and their results, see "Understanding Software Productivity".
Quality and productivity
Although metrics can provide estimates for productivity, they usually do not take quality into account. One can write thousands of lines of working code, but the challenge remains to determine how to measure the quality of this code in regard to productivity.
Cost estimations in general are based on a form of reference data, also known as Analogy Data. This reference data can be data from previous successful projects, consultancy data, available models such as mathematical relationships or parametric cost models, and rules of thumb in software cost estimation. The applicability of these forms of data depends on the current stage in the software life cycle. A combination of these methods can be especially useful in the early conceptual stages of development, when available models are combined with high-level reference data that provides a general overview of the concept. When the requirements and design become clearer at a later stage of the project, more specific functional decompositions are likely to become the main method of cost estimation.
In order to make a correct estimate of the costs involved in a project, it is important to make a breakdown of the required work elements. A good tool for this is a Work Breakdown Structure (WBS). The main initial efforts can be decomposed into the following sections:
- Software Management
- Software Development
- Software Systems Engineering
- Software Engineering
- Software Test Engineering
- System-Level Test Support
- Software Development Test Bed
- Software System-level Test Support
- Assembly, Test, Launch Operations (ATLO) Support for flight projects
- Software Quality Assurance
- Independent Verification and Validation (IV&V)
Suitable domain experts provide estimates for fine-grained tasks, and using suitable software and statistical tools a reliable measurement is achieved. The emphasis is on the knowledge and experience of the expert.
Lederer and Prasad categorise expert judgement as either intuitive or structured. Intuitive expert judgement relies solely on the experience and opinion of the expert. Structured judgement for cost estimation also relies on expert knowledge but validates estimates using historical data and statistical tools.
At the task level, work breakdown structures are often used to achieve a high degree of granularity of tasks. The chosen expert then provides a range of estimated values for the task (most likely, best case, worst case). Various statistical formulas are applied to these measures to ensure a reasonable result.
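One widely used formula for combining a range of expert estimates is the three-point (PERT) estimate. The text above does not name a specific formula, so its use here is an assumption; the task names and figures are hypothetical.

```python
import math


def three_point_estimate(best_case, most_likely, worst_case):
    """Return (expected value, standard deviation) using the PERT weighting."""
    expected = (best_case + 4 * most_likely + worst_case) / 6
    std_dev = (worst_case - best_case) / 6
    return expected, std_dev


def combine_tasks(estimates):
    """Sum expected values; combine standard deviations assuming independent tasks."""
    total_expected = sum(expected for expected, _ in estimates)
    total_std_dev = math.sqrt(sum(std_dev ** 2 for _, std_dev in estimates))
    return total_expected, total_std_dev


# Hypothetical task estimates in person-days: (best, most likely, worst).
tasks = {
    "design": (4, 6, 14),
    "implementation": (10, 15, 26),
}

per_task = [three_point_estimate(*values) for values in tasks.values()]
expected, std_dev = combine_tasks(per_task)
print(f"Expected effort: {expected:.1f} person-days (standard deviation {std_dev:.1f})")
```

Reporting the standard deviation alongside the expected value communicates the uncertainty of the estimate rather than hiding it in a single number.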
Jørgensen proposes the following steps for estimation experts:
- Evaluate estimation accuracy, but avoid high evaluation pressure
- Avoid conflicting estimation goals
- Ask the estimators to justify and criticize their estimates
- Avoid irrelevant and unreliable estimation information
- Use documented data from previous development tasks
- Find estimation experts with relevant domain background and good estimation records
- Estimate top-down and bottom-up, independently of each other
- Use estimation checklists
- Combine estimates from different experts and estimation strategies
- Assess the uncertainty of the estimate
- Provide feedback on estimation accuracy and development task relations
- Provide estimation training opportunities.
These ideas reflect some of the practices laid out in the Personal Software Process (PSP).
Estimation by Analogy
Use Estimation by Analogy (EBA) to identify completed projects and features that are similar to a new project and use that historical data to create estimates for the cost and effort of a newer project.
EBA hinges on identifying the 'analogies' between previous projects and planned projects. Some software applications exist which can help identify such projects (e.g. the ANGEL project). By identifying a feature or component that is of similar complexity to a previous feature, and making a reasonable judgement on the relative sizes of the features, an estimate can be arrived at.
Steve McConnell outlines a five step approach to EBA for a new project as follows:
- Get Detailed Size, Effort, and Cost Results for a Similar Previous Project
- Compare the Size of the New Project to a Similar Past Project
- Build Up the Estimate for the New Project's Size as a Percentage of the Old Project's Size
- Create an Effort Estimate Based on the Size of the New Project Compared to the Previous Project
- Check for Consistent Assumptions Across the Old and New Projects
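The arithmetic behind the steps above can be sketched as follows. The example assumes that effort scales linearly with size, which is a simplification that is only reasonable for projects of broadly similar size and kind; the part names, sizes and factors are hypothetical.

```python
def estimate_by_analogy(old_part_sizes, size_factors, old_effort):
    """Scale a past project's effort by the relative size of the new project.

    old_part_sizes: size of each part of the old project (e.g. lines of code).
    size_factors:   judged size of each new part relative to the old one.
    old_effort:     actual effort of the old project (e.g. staff months).
    Returns (estimated new size, size ratio, estimated new effort).
    """
    old_size = sum(old_part_sizes.values())
    new_size = sum(size * size_factors[part] for part, size in old_part_sizes.items())
    ratio = new_size / old_size
    # Assumption: effort grows linearly with size.
    return new_size, ratio, old_effort * ratio


# Hypothetical figures, invented for illustration.
old_parts = {"database": 5000, "user interface": 14000, "reports": 6000}
factors = {"database": 1.0, "user interface": 1.5, "reports": 0.5}

new_size, ratio, new_effort = estimate_by_analogy(old_parts, factors, old_effort=30)
print(f"New size: {new_size:.0f} LOC, ratio {ratio:.2f}, effort {new_effort:.1f} staff months")
```

The final step, checking for consistent assumptions across the old and new projects, cannot be automated in this way and remains a matter of judgement.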
A few key factors govern the accuracy of estimates:
- Break down the project into relatively detailed features. Measuring too few features will not provide enough accuracy, while too much detail risks introducing extraneous elements into the model.
- Judgment is required when calculating comparisons. Measurements of completed projects may need to be adjusted to account for complexity, ability or other factors that had a bearing on the outcome of that project.
- McConnell stresses that you should resist the temptation to inflate estimates to accommodate inherent inaccuracy in estimates. For example, if your estimates suggest a figure between 40 and 50 weeks, resist the temptation to communicate an effort of 55 weeks.
- McConnell, Steve. Software Estimation: Demystifying the Black Art. 2006.
- Decision Support Analysis for Software Effort Estimation by Analogy. Proceedings of the Third International Workshop on Predictor Models in Software Engineering, 2007, p. 6. ISBN 0-7695-2954-2.
Cost Estimation Tools
It is estimated that 75 major software cost estimation tools have been produced over the last decade.
Some Software Measurement tools are listed below:
- The COCOMO site provides numerous tools that support COCOMO-based measurement.
- The ANGEL project from Bournemouth University, based on research into the analogy method of estimation. The installer for the version of the software currently on the site requires JDK 1.4, although the software itself will run with Java 1.4 and above. To get around this problem, do the following:
- Save the angel.jar installation file to a directory
- Extract the contents using the command
jar xf angel.jar
- Locate the file copy.jar that was extracted using the previous command
- Extract the contents of copy.jar using the command
jar xf copy.jar
- Locate the bin directory and in there you will find batch files and shell scripts for running ANGEL.
- NASA's Cost Estimation Website provides various cost estimation models. While heavily focused on aerospace and aeronautics, it is still worth looking over.
- QSM provides a suite of software measurement products called SLIM tools. QSM is one of the leading providers of commercial software measurement tools.
- Basili, Victor R., et al. (1994). "Goal/Question/Metric Paradigm." Encyclopaedia of Software Engineering, volume 1. New York, NY: John Wiley and Sons. pp. 528–532.
- Lederer, Albert L., and Jayesh Prasad, 1992. "Nine Management Guidelines for Better Cost Estimating," Communications of the ACM, February 1992, pp. 51–59.
- Jørgensen, M. A Review of Studies on Expert Estimation of Software Development Effort. 2002
- McConnell, Steve. Software Estimation: Demystifying the Black Art. 2006, Chapter 11.
- How Software Estimation Tools Work, Jones, C., 2005 (http://www.compaid.com/caiinternet/ezine/capers-estimation.pdf - visited 28th Feb 2008)