Perforce Newsletter: The Head Revision
   PERFORCE SOFTWARE NEWSLETTER, FALL 2008

What’s Missing from Your Testing Strategy?

Using SCM to Reduce Redundancy and Find Defects.


Using SCM to Focus Testing

The challenge for testing, especially in a dynamic, iterative development environment, is to know what to test. In the early stages of a project, regular meetings of the designers, the developers, and the testing staff will make it clear where the focus should be. But as development proceeds, and as the project gets closer to being functionally complete, it becomes more difficult to know exactly what the developers have been working on. And if you're facing a big, collaborative project, with hundreds of thousands of lines of code and dozens of programmers all working on it at once, it will become increasingly difficult to know what needs to be tested. SCM can help.

Tracking Change in Real Time

As testing professionals, we know that a high percentage of the defects in a system are going to be in the parts of the code that have changed the most. This is exactly the kind of information an SCM system is designed to track.

With Perforce, the simplest way to keep on top of what has changed is to install the Review Daemon (available on the Related Software page of the Perforce web site) and set the “Reviews:” section of your user account to include all of the directories that contain files that you’re interested in. A user profile that includes a "reviews" entry might look like figure 1.

 User:     bruno
 Email:    bruno@CompanyX.com
 Update:   2005/10/11 13:05:15
 Access:   2007/02/03 15:13:55
 FullName: Bruno Sammartino
 Reviews:
	//CompanyX/ProjectY/main/...
	//CompanyX/ProjectY/r06.2/...

Figure 1

With this entry in his user profile, bruno will receive email whenever a change is made in the main development line or in the 6.2 release line for ProjectY.

Specifying directories for review has two immediate benefits. First, it provides assurance that any changes the programmers make won’t slip by unnoticed, and second, it lets the tester begin to get a feel for what features are implemented in what parts of the code. This knowledge becomes very important as the project proceeds. If you’re aware of the relationship between features and the source code that implements them, then you’re more likely to know what to do when you see a code change come through later, even if the description that the submitter entered for the change is cryptic or incomplete.

Reviewing Change History

A steady, unsolicited stream of email notifications describing individual change submissions is a useful thing; it does keep you on top of what's happening in the code you’re testing. But there's a limit to how helpful this email can be. If a milestone in the project is coming due, you might want to take a step back and get an overview of what’s been happening. Your SCM system can help with this, too. It’s easy to generate a report that summarizes changes that have been made to the code base; Perforce uses the p4 changes command for this purpose:

$ p4 changes

This command lists all the changes that have ever been made to any part of your source code repository, which is probably a little more information than you really want. But it's easy to narrow the scope. For example, you can limit the list to changes affecting a particular subset of the source tree by specifying a partial directory path:

$ p4 changes //CompanyX/ProjectY/main/...

The ellipsis at the end of this path is a Perforce wildcard: it tells the server to include every file in the last directory you specified and in any subdirectories beneath it.

You can further narrow the scope by specifying a date range. For example, if you want to see all of the changes made in the "main" directory tree of the ProjectY development branch between November 20, 2006 and November 28, 2006, you can use a command like this:

$ p4 changes //CompanyX/ProjectY/main/...@2006/11/20,2006/11/28

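Note that Perforce interprets a date given without a time as midnight at the start of that day, so the range above stops short of changes submitted during November 28 itself; to include them, end the range at 2006/11/29.

This command generates an inventory of test objectives. It's certainly not a complete list of everything that needs to be tested, but it is a complete list of everything that’s changed, and that's where the defects are likely to be.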

You can generate a report like this at any time just by typing a simple command. Or you can set up an automated script that will run the report on whatever schedule makes sense to you. It can email you the results or post them to an internal Web page, so you'll have a nice activity summary whenever you need it. Either way, now you have a better grip on what’s been changing and where you need to be especially thorough in your testing. Since the output from the p4 changes command includes the individual change submission numbers, you can drill down on any change of interest and see as much information as you need about it, right down to the individual lines of the individual files that were updated by any given change.
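
As one way to automate such a report, the following Ruby sketch turns p4 changes output into a per-user activity summary. It is an illustration only: the names CHANGE_LINE, parse_change_line, and summarize_by_user, the regular expression, and the report layout are assumptions rather than part of any Perforce tool. The sample lines are copied from the p4 changes output shown in figure 4; in real use the text would come from running p4 changes on whatever schedule suits you.

```ruby
# Sketch: summarize "p4 changes" output by submitter.
# All names here are illustrative assumptions.

# Matches lines such as:
#   Change 776 on 2005/01/27 by ines@ines-rose 'Add hooks ...'
CHANGE_LINE = /\AChange (\d+) on (\S+) by ([^@\s]+)@(\S+) '(.*)'\z/

# Parse one line of "p4 changes" output into a hash, or return nil
# if the line does not look like a change summary.
def parse_change_line(line)
  m = CHANGE_LINE.match(line.strip)
  return nil unless m
  { number: m[1].to_i, date: m[2], user: m[3], client: m[4],
    description: m[5].strip }
end

# Count the changes submitted by each user, busiest first
# (ties are broken alphabetically by user name).
def summarize_by_user(changes_output)
  counts = Hash.new(0)
  changes_output.each_line do |line|
    change = parse_change_line(line)
    counts[change[:user]] += 1 if change
  end
  counts.sort_by { |user, count| [-count, user] }
end

# Sample input copied from figure 4. In real use this text would be
# the captured output of a "p4 changes" command.
sample = <<~OUT
  Change 776 on 2005/01/27 by ines@ines-rose 'Add hooks to simplify debugging '
  Change 775 on 2005/01/27 by gale@gale-cedar 'First set of user documents '
  Change 768 on 2005/01/21 by quinn@quinn-dev-azalea 'Add jamgraph image'
  Change 766 on 2005/01/21 by quinn@quinn-dev-azalea 'Merge changes '
OUT

summarize_by_user(sample).each do |user, count|
  puts format("%-10s %3d", user, count)
end
```

Keeping the parsing separate from the command that produces the text makes it easy to reuse the same summary for an email message or an internal Web page.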

Using Scripts to Summarize Change

When you're deep into a big project, a stream of email reports that detail individual change submissions might be too much data to be useful. Often what you really need is a summary. Using simple scripts, you can mine the information from your SCM system and summarize it in ways that you find useful. For example, it's not too hard to write a script that pulls the raw data from p4 changes commands and generates a useful summary report.

Let’s take a look at an example of a simple script that can tell you something about the volatility of your source code.

The Ruby script volatility.rb runs a series of commands and produces a report that shows which directories have seen a high rate of change and which have not. The script takes a parameter in the form of a depot directory path, analyzes all of the recent changes that have affected files in that path, counts the number of files that were touched by each of those changes, and prints a list of the directories containing those files, from the most volatile—the ones whose files have been changed the most often—to the least.

Figure 2 shows what the output from a run of volatility.rb might look like.

 $ volatility.rb //CompanyX/ProjectY/main/...
 Change summary: Most recent 100 changes for //CompanyX/ProjectY/main/...

 //CompanyX/ProjectY/main/ModuleA                76.2%
 //CompanyX/ProjectY/main/ModuleC                16.7%
 //CompanyX/ProjectY/main/ModuleX                 5.6%
 //CompanyX/ProjectY/main/ModuleY                 1.5%

 Total files affected: 324

Figure 2

The last one hundred changes made in the //CompanyX/ProjectY/main path in our source tree touched 324 individual files, and 76.2 percent of them were in ModuleA of the project. Clearly, the developers assigned to this module have been busy, and we can assume that it needs to be thoroughly tested. Conversely, we know that any directories under this path that aren't listed here were not touched by any of those one hundred changes.

How the script works

The script starts by running p4 changes:

 # ---------------------------------------------------------------
 # Run the "p4 changes" command to get a list of the target
 # changelist numbers.

 def run_changes_command( depot_path, changelist_count )
  cmd = sprintf("p4 changes -m %s %s", changelist_count,
     depot_path)
  changes_out = `#{cmd}`
  return changes_out
 end
 # ---------------------------------------------------------------

Figure 3

The depot_path was entered as a parameter on the command line, and the changelist_count is the number of recent changes to include in the analysis. If no changelist_count is specified on the command line, the script will report the most recent one hundred changes that affected files in the specified directory path.
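
The argument handling described above might look something like the following sketch. The full volatility.rb source is not reproduced in this article, so the names DEFAULT_CHANGELIST_COUNT, parse_arguments, and changes_command, the usage message, and the validation rules are all assumptions made for illustration.

```ruby
# Sketch of the command-line handling described in the text.
# Names and validation rules are illustrative assumptions.

DEFAULT_CHANGELIST_COUNT = 100

# Return [depot_path, changelist_count] from an ARGV-style array.
# The count is optional and defaults to 100.
def parse_arguments(argv)
  depot_path, count_arg = argv
  if depot_path.nil? || !depot_path.start_with?("//")
    raise ArgumentError, "usage: volatility.rb //depot/path/... [count]"
  end
  # Integer() raises ArgumentError for non-numeric input.
  count = count_arg.nil? ? DEFAULT_CHANGELIST_COUNT : Integer(count_arg)
  raise ArgumentError, "count must be positive" unless count > 0
  [depot_path, count]
end

# Build the command as an argument list rather than a single string,
# so the depot path is never interpreted by the shell.
def changes_command(depot_path, changelist_count)
  ["p4", "changes", "-m", changelist_count.to_s, depot_path]
end

path, count = parse_arguments(["//CompanyX/ProjectY/main/..."])
puts changes_command(path, count).join(" ")
```

An argument list like this can be handed to IO.popen(command, &:read) to capture the output without going through a shell, which avoids surprises if a path contains spaces or shell metacharacters.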

The raw output from this command is shown in figure 4.

    $ p4 changes -m 100 //CompanyX/ProjectY/main/...
Change 776 on 2005/01/27 by ines@ines-rose 'Add hooks to simplify debugging '
Change 775 on 2005/01/27 by gale@gale-cedar 'First set of user documents '
Change 768 on 2005/01/21 by quinn@quinn-dev-azalea 'Add jamgraph image'
Change 766 on 2005/01/21 by quinn@quinn-dev-azalea 'Merge changes '
  ...

Figure 4

Next, the script looks at each line of that output, extracts the changelist number, and calls p4 describe (see figure 5).

 # ---------------------------------------------------------------
 # Take one line of output from "p4 changes", run "p4 describe -s"
 # on the change number, and peel out the list of filenames
 # affected by it. 
 def get_file_lines( change_line )
  # The changelist number is the second whitespace-separated field,
  # e.g. "Change 776 on 2005/01/27 by ..."
  change_number = change_line.split[1]
  describe_output = `p4 describe -s #{change_number}`
  file_lines = ""
  describe_output.each_line { |theline|
   # Affected files are listed on lines that begin with "... //"
   file_lines << theline if theline.start_with?("... //")
  }
  return file_lines
 end
 # ---------------------------------------------------------------

Figure 5

By default, the output from a p4 describe command includes the changelist description, a list of the files affected by the change, and a block of text in Unix diff format showing the individual changes made to each file affected by the change. Since our script isn’t interested in the individual file differences, we exclude that output with the "-s" (summary) option. The output for the first change is shown in figure 6.

   $ p4 describe -s 776

 Change 776 by ines@ines-rose on 2005/01/27 08:58:34 

   Add hooks to simplify debugging in ModuleA 

 Affected files ... 

 ... //CompanyX/ProjectY/main/ModuleA/main_module.cpp#5 edit
 ... //CompanyX/ProjectY/main/ModuleA/main_module.h#5 edit
 ... //CompanyX/ProjectY/main/ModuleA/sub_module.cpp#16 edit


Figure 6

The volatility.rb script captures the file list for each changelist submission and appends it to an internal array of file paths. That array is then sorted, and the number of occurrences of each file is counted and added to a running total for each directory. Finally, the totals are sorted from highest to lowest, percentages are calculated, and the summary is printed.
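
The counting step can be sketched roughly as follows. This is not the volatility.rb source: the names directory_of and volatility are assumptions, the sketch groups files by their immediate parent directory (the real script may group at a different level), and the sample lines are modeled on the p4 describe -s output format with a made-up file in ModuleC added for contrast.

```ruby
# Sketch of the counting step: directory totals and percentages.
# Names are illustrative; the real volatility.rb may differ.

# Extract the depot directory from a "p4 describe -s" file line such as
#   ... //CompanyX/ProjectY/main/ModuleA/main_module.cpp#5 edit
# String operations are used instead of File.dirname, which may
# collapse the leading "//" of a depot path on some platforms.
def directory_of(file_line)
  depot_file = file_line.strip.sub(/\A\.\.\. /, "").sub(/#\d+ \S+\z/, "")
  depot_file.sub(%r{/[^/]*\z}, "")
end

# Count file lines per directory. Returns two values: a list of
# [directory, percentage] pairs sorted from most to least volatile,
# and the total number of file lines counted.
def volatility(file_lines)
  totals = Hash.new(0)
  file_lines.each_line do |line|
    next unless line.start_with?("... //")
    totals[directory_of(line)] += 1
  end
  total = totals.values.sum
  ranked = totals.sort_by { |dir, count| [-count, dir] }
                 .map { |dir, count| [dir, 100.0 * count / total] }
  [ranked, total]
end

# Hypothetical sample input in "p4 describe -s" format.
sample = <<~LINES
  ... //CompanyX/ProjectY/main/ModuleA/main_module.cpp#5 edit
  ... //CompanyX/ProjectY/main/ModuleA/main_module.h#5 edit
  ... //CompanyX/ProjectY/main/ModuleA/sub_module.cpp#16 edit
  ... //CompanyX/ProjectY/main/ModuleC/util.cpp#2 edit
LINES

ranked, total = volatility(sample)
ranked.each { |dir, pct| puts format("%-40s %5.1f%%", dir, pct) }
puts "Total files affected: #{total}"
```

Because a hash keeps a running total per directory, the sketch skips the intermediate sort of the file list that the article describes; the final ranking is the same either way.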

Other Available Information

This script is just one example of how your SCM system can help guide your testing efforts. But if you think about the kind of information that an SCM system manages—all of the details about which files were changed, when, how, and by whom—you see that much of this information is of interest to testers.

If your SCM system is integrated with your defect-tracking system, you have a genuine mother lode of testing and quality data. In addition to identifying which areas in the source code have been changing a lot, which individual files have been modified frequently, and how extensive those modifications have been, you can track defect fixes back to the files where the errors occurred. Then you can pinpoint the parts of your code that have been responsible for higher numbers of defects; those parts are likely to have more defects going forward, too. You could find out how many defects your testing team caught before your last release went out and compare that to the number that your customers subsequently found, and you could compare those numbers from one release to the next and get a sense for whether your processes are improving or going downhill.
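
As a small illustration of the release-to-release comparison, the sketch below computes the share of each release's known defects that were caught before it shipped. Everything in it is assumed for the example: the name detection_percentage, the release labels, and the defect counts are made up, and in practice the numbers would come from your own defect-tracking system.

```ruby
# Sketch: compare defects caught in testing with defects found by
# customers, release by release. All figures below are made up.

# Percentage of a release's known defects that were caught before
# it shipped. Returns nil when no defects are known at all.
def detection_percentage(found_in_testing, found_by_customers)
  total = found_in_testing + found_by_customers
  return nil if total.zero?
  100.0 * found_in_testing / total
end

# Hypothetical counts per release line.
releases = {
  "r06.1" => { testing: 80, customers: 20 },
  "r06.2" => { testing: 90, customers: 10 },
}

releases.each do |name, counts|
  pct = detection_percentage(counts[:testing], counts[:customers])
  puts format("%-6s %5.1f%% of known defects caught before release", name, pct)
end
```

A rising percentage from one release to the next suggests that testing is catching more of the defects before customers do; a falling one is a prompt to look more closely.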

Conclusion

We've only touched the tip of the proverbial iceberg with regard to SCM as a tool for testers, but it's clear that SCM isn’t just for developers and release managers anymore. As testers, if we take advantage of the information that can be found in our SCM systems and use it to help guide our testing efforts at each phase of the development process, we'll not only be more effective at finding defects, but we may also find that our development colleagues will come to see us in a somewhat different light. As we learn more about how the source code that implements our products is organized, which files are associated with which features, and which modules are more—or less—prone to defects, we may find that we are seen less as the bottleneck to getting a release out and more as valued contributors to the success of each new product or release.

The volatility.rb script is available for download.

William W. White currently directs the QA/test, build/release, and web infrastructure teams at Perforce Software. He has more than 20 years of experience in software application design and development, porting, technical writing, and QA.

A version of this article first appeared in CIE in August 2008, and Better Software in July 2007.

Copyright © 2008 Perforce Software, Inc.