State transition testing is applicable for any software that has defined states and events that cause transitions between those states (e.g., changing screens). State transition testing can be used at any level of testing. Embedded software, web software, and any type of transactional software are good candidates for this type of testing. Control systems, e.g., traffic light controllers, are also good candidates.
Determining the states is often the most difficult part of defining the state table or diagram. When the software has a user interface, the various screens that are displayed for the user are often used to define the states. For embedded software, the states may be dependent upon the states that the hardware will experience.
Besides the states themselves, the basic unit of state transition testing is the individual transition, also known as a 0-switch. Simply testing all transitions will find some kinds of state transition defects, but more may be found by testing sequences of transitions. A sequence of two successive transitions is called a 1-switch; a sequence of three successive transitions is a 2-switch, and so forth. (These switches are sometimes alternatively designated as N-1 switches, where N represents the number of transitions that will be traversed. A single transition, for instance (a 0-switch), would be a 1-1 switch.)
As with other types of test techniques, there is a hierarchy of levels of test coverage. The minimum acceptable degree of coverage is to have visited every state and traversed every transition. 100% transition coverage (also known as 100% 0-switch coverage or 100% logical branch coverage) will guarantee that every state is visited and every transition is traversed, unless the system design or the state transition model (diagram or table) is defective. Depending on the relationships between states and transitions, it may be necessary to traverse some transitions more than once in order to execute other transitions a single time.
The term “n-switch coverage” relates to the number of transitions covered. For example, achieving 100% 1-switch coverage requires that every valid sequence of two successive transitions has been tested at least once. Testing these sequences may expose some types of failures that 100% 0-switch coverage would miss.
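The switch counts above can be sketched programmatically. The state machine below (screens as states) is purely hypothetical, as are all of the names; the function enumerates the valid n-switch sequences by chaining transitions whose start state matches the previous transition's end state.

```python
# Hypothetical state machine for illustration: states are screens,
# transitions are (from_state, event, to_state) triples.
transitions = [
    ("Login",   "ok",     "Home"),
    ("Login",   "fail",   "Login"),
    ("Home",    "browse", "Catalog"),
    ("Catalog", "back",   "Home"),
    ("Home",    "logout", "Login"),
]

def switch_sequences(transitions, n):
    """All valid sequences of n+1 consecutive transitions (an n-switch):
    each transition must start in the state the previous one ended in."""
    seqs = [[t] for t in transitions]          # 0-switch: single transitions
    for _ in range(n):
        seqs = [s + [t] for s in seqs for t in transitions
                if t[0] == s[-1][2]]           # chain on matching end state
    return seqs

print(len(switch_sequences(transitions, 0)))   # 5 single transitions
print(len(switch_sequences(transitions, 1)))   # 9 two-transition sequences
```

100% 1-switch coverage for this model would require test cases exercising all nine two-transition sequences, which in turn guarantees 100% 0-switch coverage.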
“Round-trip coverage” applies to situations in which sequences of transitions form loops. 100% round-trip coverage is achieved when all loops from any state back to the same state have been tested. This must be tested for all states that are included in loops.
For any of these approaches, a still higher degree of coverage will attempt to include all invalid transitions. Coverage requirements and covering sets for state transition testing must identify whether invalid transitions are included.
Types of Defects
Typical defects include incorrect processing in the current state that is a result of the processing that occurred in a previous state, incorrect or unsupported transitions, states with no exits and the need for states or transitions that do not exist. During the creation of the state machine model, defects may be found in the specification document. The most common types of defects are omissions (there is no information regarding what should actually happen in a certain situation) and contradictions.
Combinatorial Testing Techniques
Combinatorial testing is used when testing software with several parameters, each one with several values, which gives rise to more combinations than are feasible to test in the time allowed. The parameters must be independent and compatible in the sense that any value of any parameter can be combined with any value of any other parameter. Classification trees allow some combinations to be excluded, if certain values are incompatible. This does not assume that the combined parameters won’t affect each other; they very well might, but they should affect each other in acceptable ways.
Combinatorial testing provides a means to identify a suitable subset of these combinations to achieve a predetermined level of coverage. The number of items to include in the combinations can be selected by the Test Analyst, including single items, pairs, triples or more. There are a number of tools available to aid the Test Analyst in this task (see www.pairwise.org for samples). These tools either require the parameters and their values to be listed (pairwise testing and orthogonal array testing) or to be represented in a graphical form (classification trees). Pairwise testing is a method applied to testing pairs of variables in combination. Orthogonal arrays are predefined, mathematically accurate tables that allow the Test Analyst to substitute the items to be tested for the variables in the array, producing a set of combinations that will achieve a level of coverage when tested. Classification tree tools allow the Test Analyst to define the size of combinations to be tested (i.e., combinations of two values, three values, etc.).
The problem of having too many combinations of parameter values manifests in at least two different situations related to testing. Some test cases contain several parameters each with a number of possible values, for instance a screen with several input fields. In this case, combinations of parameter values make up the input data for the test cases. Furthermore, some systems may be configurable in a number of dimensions, resulting in a potentially large configuration space. In both these situations, combinatorial testing can be used to identify a subset of combinations, feasible in size.
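A greedy pairwise selection can illustrate how the combination count shrinks. The parameter names and values below are hypothetical, and the greedy heuristic is one simple approach (real pairwise tools use more sophisticated algorithms): repeatedly pick the full combination that covers the most not-yet-covered value pairs.

```python
from itertools import combinations, product

# Hypothetical configuration space for illustration.
parameters = {
    "browser": ["Chrome", "Firefox", "Safari"],
    "os":      ["Windows", "Linux"],
    "locale":  ["en", "de", "fr"],
}

def pairwise_tests(parameters):
    """Greedy sketch of pairwise (2-wise) test selection."""
    names = list(parameters)
    # Every pair of values for every pair of parameters must be covered.
    uncovered = {((a, va), (b, vb))
                 for a, b in combinations(names, 2)
                 for va in parameters[a] for vb in parameters[b]}
    all_combos = [dict(zip(names, vals))
                  for vals in product(*parameters.values())]

    def pairs_of(combo):
        return {((a, combo[a]), (b, combo[b]))
                for a, b in combinations(names, 2)}

    tests = []
    while uncovered:
        # Pick the combination covering the most still-uncovered pairs.
        best = max(all_combos, key=lambda c: len(pairs_of(c) & uncovered))
        uncovered -= pairs_of(best)
        tests.append(best)
    return tests

tests = pairwise_tests(parameters)
print(len(tests))   # far fewer than the 18 exhaustive combinations
```

Here exhaustive testing needs 3 × 2 × 3 = 18 combinations, while 100% pairwise coverage is reached with roughly half that number.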
For parameters with a large number of values, equivalence class partitioning, or some other selection mechanism may first be applied to each parameter individually to reduce the number of values for each parameter, before combinatorial testing is applied to reduce the set of resulting combinations.
These techniques are usually applied to the integration, system and system integration levels of testing.
The major limitation with these techniques is the assumption that the results of a few tests are representative of all tests and that those few tests represent expected usage. If there is an unexpected interaction between certain variables, it may go undetected with this type of testing if that particular combination is not tested. These techniques can be difficult to explain to a non-technical audience as they may not understand the logical reduction of tests.
Identifying the parameters and their respective values is sometimes difficult. Finding a minimal set of combinations to satisfy a certain level of coverage is difficult to do manually. Tools usually are used to find the minimum set of combinations. Some tools support the ability to force some (sub-) combinations to be included in or excluded from the final selection of combinations. This capability may be used by the Test Analyst to emphasize or de-emphasize factors based on domain knowledge or product usage information.
There are several levels of coverage. The lowest level of coverage is called 1-wise or singleton coverage. It requires that each value of every parameter be present in at least one of the chosen combinations. The next level of coverage is called 2-wise or pairwise coverage. It requires that every pair of values of any two parameters be included in at least one combination. This idea can be generalized to n-wise coverage, which requires every sub-combination of values of any set of n parameters be included in the set of selected combinations. The higher the n, the more combinations are needed to reach 100% coverage. Minimum coverage with these techniques is to have one test case for every combination produced by the tool.
Types of Defects
The most common type of defects found with this type of testing is defects related to the combined values of several parameters.
Use Case Testing
Use case testing provides transactional, scenario-based tests that should emulate usage of the system. Use cases are defined in terms of interactions between the actors and the system that accomplish some goal. Actors can be users or external systems.
Use case testing is usually applied at the system and acceptance testing levels. It may be used for integration testing depending on the level of integration and even component testing depending on the behavior of the component. Use cases are also often the basis for performance testing because they portray realistic usage of the system. The scenarios described in the use cases may be assigned to virtual users to create a realistic load on the system.
In order to be valid, the use cases must convey realistic user transactions. This information should come from a user or a user representative. The value of a use case is reduced if the use case does not accurately reflect activities of the real user. An accurate definition of the various alternate paths (flows) is important for the testing coverage to be thorough. Use cases should be taken as a guideline, but not a complete definition of what should be tested as they may not provide a clear definition of the entire set of requirements. It may also be beneficial to create other models, such as flow charts, from the use case narrative to improve the accuracy of the testing and to verify the use case itself.
Minimum coverage of a use case is to have one test case for the main (positive) path, and one test case for each alternate path or flow. The alternate paths include exception and failure paths. Alternate paths are sometimes shown as extensions of the main path. Coverage percentage is determined by taking the number of paths tested and dividing that by the total number of main and alternate paths.
Types of Defects
Defects include mishandling of defined scenarios, missed alternate path handling, incorrect processing of the conditions presented and awkward or incorrect error reporting.
User Story Testing
In some Agile methodologies, such as Scrum, requirements are prepared in the form of user stories which describe small functional units that can be designed, developed, tested and demonstrated in a single iteration. These user stories include a description of the functionality to be implemented, any non-functional criteria, and also include acceptance criteria that must be met for the user story to be considered complete.
User stories are used primarily in Agile and similar iterative and incremental environments. They are used for both functional testing and non-functional testing. User stories are used for testing at all levels with the expectation that the developer will demonstrate the functionality implemented for the user story prior to handoff of the code to the team members with the next level of testing tasks (e.g., integration, performance testing).
Because stories are little increments of functionality, there may be a requirement to produce drivers and stubs in order to actually test the piece of functionality that is delivered. This usually requires an ability to program and to use tools that will help with the testing such as API testing tools. Creation of the drivers and stubs is usually the responsibility of the developer, although a Technical Test Analyst also may be involved in producing this code and utilizing the API testing tools. If a continuous integration model is used, as is the case in most Agile projects, the need for drivers and stubs is minimized.
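A stub for a not-yet-integrated dependency can be as small as the sketch below. Everything here is hypothetical (the `place_order` function and the payment service interface are invented for illustration); it only shows the pattern of substituting a canned response so a story can be tested in isolation.

```python
from unittest.mock import Mock

# Hypothetical function under test for the story "user can place an order".
# The real payment service is not yet integrated, so a stub stands in.
def place_order(cart_total, payment_service):
    result = payment_service.charge(cart_total)
    return "confirmed" if result["status"] == "ok" else "declined"

stub = Mock()
stub.charge.return_value = {"status": "ok"}   # canned (stubbed) response

print(place_order(49.99, stub))               # confirmed
stub.charge.assert_called_once_with(49.99)    # driver-side verification
```

The same `Mock` object doubles as a lightweight driver: it records the call so the test can verify that the story's code invoked the dependency correctly.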
Minimum coverage of a user story is to verify that each of the specified acceptance criteria has been met.
Types of Defects
Defects are usually functional in that the software fails to provide the specified functionality. Defects are also seen with integration issues of the functionality in the new story with the functionality that already exists. Because stories may be developed independently, performance, interface and error handling issues may be seen. It is important for the Test Analyst to perform both testing of the individual functionality supplied as well as integration testing anytime a new story is released for testing.
A domain is a defined set of values. The set may be defined as a range of values of a single variable (a one-dimensional domain, e.g., “men aged over 24 and under 66”), or as ranges of values of interacting variables (a multi-dimensional domain, e.g., “men aged over 24 and under 66 AND with weight over 69 kg and under 90 kg”). Each test case for a multi-dimensional domain must include appropriate values for each variable involved.
Domain analysis of a one-dimensional domain typically uses equivalence partitioning and boundary value analysis. Once the partitions are defined, the Test Analyst selects values from each partition that represent a value that is in the partition (IN), outside the partition (OUT), on the boundary of the partition (ON) and just off the boundary of the partition (OFF). By determining these values, each partition is tested along with its boundary conditions.
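For the one-dimensional age domain from the earlier example (“over 24 and under 66”, i.e., valid values 25 through 65), the four value types can be derived mechanically. The helper function below is an illustrative sketch, assuming integer values with a step of 1.

```python
# Hypothetical one-dimensional domain: age is valid if 25 <= age <= 65.
LOW, HIGH = 25, 65

def domain_test_values(low, high, step=1):
    """ON: on the boundary; OFF: just off the boundary;
    IN: inside the partition; OUT: clearly outside it."""
    return {
        "ON":  [low, high],
        "OFF": [low - step, high + step],
        "IN":  [(low + high) // 2],
        "OUT": [low - 10 * step, high + 10 * step],
    }

values = domain_test_values(LOW, HIGH)
print(values["ON"], values["OFF"])   # [25, 65] [24, 66]
```

Each resulting value becomes a test input, covering both the partition and its boundary conditions.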
With multi-dimensional domains the number of test cases generated by these methods rises exponentially with the number of variables involved, whereas an approach based on domain theory leads to a linear growth. Also, because the formal approach incorporates a theory of defects (a fault model), which equivalence partitioning and boundary value analysis lack, its smaller test set will find defects in multi-dimensional domains that the larger, heuristic test set would likely miss. When dealing with multi-dimensional domains, the test model may be constructed as a decision table (or “domain matrix”). Identifying test case values for multi-dimensional domains above three dimensions is likely to require computational support.
Domain analysis combines the techniques used for decision tables, equivalence partitioning and boundary value analysis to create a smaller set of tests that still cover the important areas and the likely areas of failure. It is often applied in cases where decision tables would be unwieldy because of the large number of potentially interacting variables. Domain analysis can be done at any level of testing but is most frequently applied at the integration and system testing levels.
Doing thorough domain analysis requires a strong understanding of the software in order to identify the various domains and potential interaction between the domains. If a domain is left unidentified, the testing can be significantly lacking, but it is likely that the domain will be detected because the OFF and OUT variables may land in the undetected domain. Domain analysis is a strong technique to use when working with a developer to define the testing areas.
Minimum coverage for domain analysis is to have a test for each IN, OUT, ON and OFF value in each domain. Where there is an overlap of the values (for example, the OUT value of one domain is an IN value in another domain), there is no need to duplicate the tests. Because of this, the actual tests needed are often less than four per domain.
Types of Defects
Defects include functional problems within the domain, boundary value handling, variable interaction issues and error handling (particularly for the values that are not in a valid domain).
Sometimes techniques are combined to create test cases. For example, the conditions identified in a decision table might be subjected to equivalence partitioning to discover multiple ways in which a condition might be satisfied. Test cases would then cover not only every combination of conditions, but also, for those conditions which were partitioned, additional test cases would be generated to cover the equivalence partitions. When selecting the particular technique to be applied, the Test Analyst should consider the applicability of the technique, the limitations and difficulties, and the goals of the testing in terms of coverage and defects to be detected. There may not be a single “best” technique for a situation. Combined techniques will often provide the most complete coverage assuming there is sufficient time and skill to correctly apply the techniques.
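The combination described above can be sketched as follows. The conditions and partition values are hypothetical: one decision-table condition ("customer age") is partitioned into equivalence classes, and each resulting representative value is combined with the other condition's options.

```python
from itertools import product

# Hypothetical decision-table condition "age", expanded into equivalence
# partitions, each with one representative value.
age_partitions = {"minor": 12, "adult": 35, "senior": 70}
account_types = ["standard", "premium"]   # the other table condition

# Each rule combination, expanded per partition representative:
test_cases = [
    {"age_class": cls, "age": age, "account": acct}
    for (cls, age), acct in product(age_partitions.items(), account_types)
]
print(len(test_cases))   # 3 partitions x 2 account types = 6 test cases
```

Without the partitioning step the table would have only one "age" condition; with it, every combination of conditions is still covered, but each partition is exercised as well.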
Using Defect-Based Techniques
A defect-based test design technique is one in which the type of defect sought is used as the basis for test design, with tests derived systematically from what is known about the type of defect. Unlike specification-based testing which derives its tests from the specification, defect-based testing derives tests from defect taxonomies (i.e., categorized lists) that may be completely independent from the software being tested. The taxonomies can include lists of defect types, root causes, failure symptoms and other defect-related data. Defect-based testing may also use lists of identified risks and risk scenarios as a basis for targeting testing. This test technique allows the tester to target a specific type of defect or to work systematically through a defect taxonomy of known and common defects of a particular type. The Test Analyst uses the taxonomy data to determine the goal of the testing, which is to find a specific type of defect. From this information, the Test Analyst will create the test cases and test conditions that will cause the defect to manifest itself, if it exists.
Defect-based testing can be applied at any testing level but is most commonly applied during system testing. There are standard taxonomies that apply to multiple types of software. This non-product specific type of testing helps to leverage industry standard knowledge to derive the particular tests. By adhering to industry-specific taxonomies, metrics regarding defect occurrence can be tracked across projects and even across organizations.
Multiple defect taxonomies exist and may be focused on particular types of testing, such as usability. It is important to pick a taxonomy that is applicable to the software being tested, if any are available. For example, there may not be any taxonomies available for innovative software. Some organizations have compiled their own taxonomies of likely or frequently seen defects. Whatever taxonomy is used, it is important to define the expected coverage prior to starting the testing.
The technique provides coverage criteria which are used to determine when all the useful test cases have been identified. As a practical matter, the coverage criteria for defect-based techniques tend to be less systematic than for specification-based techniques in that only general rules for coverage are given and the specific decision about what constitutes the limit of useful coverage is discretionary. As with other techniques, the coverage criteria do not mean that the entire set of tests is complete, but rather that defects being considered no longer suggest any useful tests based on that technique.
Types of Defects
The types of defects discovered usually depend on the taxonomy in use. If a user interface taxonomy is used, the majority of the discovered defects would likely be user interface related, but other defects can be discovered as a byproduct of the specific testing.
Defect taxonomies are categorized lists of defect types. These lists can be very general and used to serve as high-level guidelines or can be very specific. For example, a taxonomy for user interface defects could contain general items such as functionality, error handling, graphics display and performance. A detailed taxonomy could include a list of all possible user interface objects (particularly for a graphical user interface) and could designate the improper handling of these objects, such as:
- Text field
  - Valid data is not accepted
  - Invalid data is accepted
  - Length of input is not verified
  - Special characters are not detected
  - User error messages are not informative
  - User is not able to correct erroneous data
  - Rules are not applied
- Date field
  - Valid dates are not accepted
  - Invalid dates are not rejected
  - Date ranges are not verified
  - Precision data is not handled correctly (e.g., hh:mm:ss)
  - User is not able to correct erroneous data
  - Rules are not applied (e.g., ending date must be greater than starting date)
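A taxonomy like the one above can be kept as structured data so that test conditions are generated from it systematically rather than ad hoc. The sketch below uses an abbreviated, illustrative subset of the list.

```python
# Illustrative subset of a UI defect taxonomy, kept as structured data.
ui_taxonomy = {
    "Text field": [
        "Valid data is not accepted",
        "Invalid data is accepted",
        "Length of input is not verified",
        "Special characters are not detected",
    ],
    "Date field": [
        "Valid dates are not accepted",
        "Invalid dates are not rejected",
        "Date ranges are not verified",
    ],
}

# One test condition per (object, defect type) entry:
conditions = [f"Check: {obj} - {defect}"
              for obj, defects in ui_taxonomy.items()
              for defect in defects]
print(len(conditions))   # 7 conditions from this taxonomy subset
```

Tracking which entries have a corresponding test condition gives a simple, countable coverage measure for defect-based testing.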
There are many defect taxonomies available, ranging from formal taxonomies that can be purchased to those designed for specific purposes by various organizations. Internally developed defect taxonomies can also be used to target specific defects commonly found within the organization. When creating a new defect taxonomy or customizing one that is available, it is important to first define the goals or objectives of the taxonomy. For example, the goal might be to identify user interface issues that have been discovered in production systems or to identify issues related to the handling of input fields.
To create a taxonomy:
- Create a goal and define the desired level of detail
- Select a given taxonomy to use as a basis
- Define the values and common defects experienced in the organization and/or from outside practice
The more detailed the taxonomy, the more time it will take to develop and maintain it, but it will result in a higher level of reproducibility in the test results. Detailed taxonomies can be redundant, but they allow a test team to divide up the testing without a loss of information or coverage.
Once the appropriate taxonomy has been selected, it can be used for creating test conditions and test cases. A risk-based taxonomy can help the testing focus on a specific risk area. Taxonomies can also be used for non-functional areas such as usability, performance, etc. Taxonomy lists are available in various publications, from IEEE, and on the Internet.
Experience-based tests utilize the skill and intuition of the testers, along with their experience with similar applications or technologies. These tests are effective at finding defects but are not as appropriate as other techniques for achieving specific test coverage levels or producing reusable test procedures. In cases where system documentation is poor, testing time is severely restricted, or the test team has strong expertise in the system to be tested, experience-based testing may be a good alternative to more structured approaches. Experience-based testing may be inappropriate for systems requiring detailed test documentation, high levels of repeatability, or an ability to precisely assess test coverage.
When using dynamic and heuristic approaches, testers normally use experience-based tests, and testing is more reactive to events than pre-planned testing approaches. In addition, execution and evaluation are concurrent tasks. Some structured approaches to experience-based tests are not entirely dynamic; i.e., the tests are not created entirely at the same time as the tester executes them.
Note that although some ideas on coverage are presented for the techniques discussed here, experience-based techniques do not have formal coverage criteria.
When using the error guessing technique, the Test Analyst uses experience to guess the potential errors that might have been made when the code was being designed and developed. When the expected errors have been identified, the Test Analyst then determines the best methods to use to uncover the resulting defects. For example, if the Test Analyst expects the software will exhibit failures when an invalid password is entered, tests will be designed to enter a variety of different values in the password field to verify if the error was indeed made and has resulted in a defect that can be seen as a failure when the tests are run.
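The password example can be sketched as a small test. Both `check_password` and the guessed inputs are hypothetical; the point is the shape of the technique: guess the errors a developer might make (unhandled empty, whitespace, oversized, or non-string input) and design inputs that would expose each one.

```python
# Hypothetical stand-in for the system under test: a password validator.
def check_password(pw):
    return isinstance(pw, str) and 8 <= len(pw) <= 64 and not pw.isspace()

# Inputs chosen by error guessing - each targets a guessed developer error.
guessed_inputs = [
    None,          # missing value not handled
    "",            # empty string accepted
    " " * 8,       # whitespace-only string accepted
    "short",       # minimum length not enforced
    "a" * 1000,    # maximum length not enforced
]

# Each guessed input should be rejected, not crash the system.
for pw in guessed_inputs:
    assert check_password(pw) is False, pw
print("all guessed error inputs rejected")
```

If any of these assertions failed (or raised an exception), the guessed error would have manifested as a real defect, confirming the hunch.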
In addition to being used as a testing technique, error guessing is also useful during risk analysis to identify potential failure modes.
Error guessing is done primarily during integration and system testing, but can be used at any level of testing. This technique is often used with other techniques and helps to broaden the scope of the existing test cases. Error guessing can also be used effectively when testing a new release of the software to test for common mistakes and errors before starting more rigorous and scripted testing. Checklists and taxonomies may be helpful in guiding the testing.
Coverage is difficult to assess and varies widely with the capability and experience of the Test Analyst. It is best used by an experienced tester who is familiar with the types of defects that are commonly introduced in the type of code being tested. Error guessing is commonly used, but is frequently not documented and so may be less reproducible than other forms of testing.
When a taxonomy is used, coverage is determined by the appropriate data faults and types of defects. Without a taxonomy, coverage is limited by the experience and knowledge of the tester and the time available. The yield from this technique will vary based on how well the tester can target problematic areas.
Types of Defects
Typical defects are those defined in the particular taxonomy or “guessed” by the Test Analyst, which might not have been found by specification-based testing.
When applying the checklist-based testing technique, the experienced Test Analyst uses a high-level, generalized list of items to be noted, checked, or remembered, or a set of rules or criteria against which a product has to be verified. These checklists are built based on a set of standards, experience, and other considerations. A user interface standards checklist employed as the basis for testing an application is an example of a checklist-based test.
Checklist-based testing is used most effectively in projects with an experienced test team that is familiar with the software under test or familiar with the area covered by the checklist (e.g., to successfully apply a user interface checklist, the Test Analyst may be familiar with user interface testing but not the specific software under test). Because checklists are high-level and tend to lack the detailed steps commonly found in test cases and test procedures, the knowledge of the tester is used to fill in the gaps. By removing the detailed steps, checklists are low maintenance and can be applied to multiple similar releases. Checklists can be used for any level of testing. Checklists are also used for regression testing and smoke testing.
The high-level nature of the checklists can affect the reproducibility of test results. It is possible that several testers will interpret the checklists differently and will follow different approaches to fulfill the checklist items. This may cause different results, even though the same checklist is used. This can result in wider coverage, but reproducibility is sometimes sacrificed. Checklists may also result in overconfidence regarding the level of coverage that is achieved, since the actual testing depends on the tester’s judgment. Checklists can be derived from more detailed test cases or lists and tend to grow over time. Maintenance is required to ensure that the checklists cover the important aspects of the software being tested.
The coverage is as good as the checklist but, because of the high-level nature of the checklist, the results will vary based on the Test Analyst who executes the checklist.
Types of Defects
Typical defects found with this technique include failures resulting from varying the data, the sequence of steps or the general workflow during testing. Using checklists can help keep the testing fresh as new combinations of data and processes are allowed during testing.
Exploratory testing is characterized by the tester simultaneously learning about the product and its defects, planning the testing work to be done, designing and executing the tests, and reporting the results. The tester dynamically adjusts test goals during execution and prepares only lightweight documentation.
Good exploratory testing is planned, interactive, and creative. It requires little documentation about the system to be tested and is often used in situations where the documentation is not available or is not adequate for other testing techniques. Exploratory testing is often used to augment other testing and to serve as a basis for the development of additional test cases.
Exploratory testing can be difficult to manage and schedule. Coverage can be sporadic and reproducibility is difficult. Using charters to designate the areas to be covered in a testing session and time-boxing to determine the time allowed for the testing is one method used to manage exploratory testing. At the end of a testing session or set of sessions, the test manager may hold a debriefing session to gather the results of the tests and determine the charters for the next sessions. Debriefing sessions are difficult to scale for large testing teams or large projects.
Another difficulty with exploratory sessions is to accurately track them in a test management system. This is sometimes done by creating test cases that are actually exploratory sessions. This allows the time allocated for the exploratory testing and the planned coverage to be tracked with the other testing efforts.
Since reproducibility may be difficult with exploratory testing, this can also cause problems when needing to recall the steps to reproduce a failure. Some organizations use the capture/playback capability of a test automation tool to record the steps taken by an exploratory tester. This provides a complete record of all activities during the exploratory session (or any experience-based testing session). Digging through the details to find the actual cause of the failure can be tedious, but at least there is a record of all the steps that were involved.
Charters may be created to specify tasks, objectives, and deliverables. Exploratory sessions are then planned to achieve those objectives. The charter may also identify where to focus the testing effort, what is in and out of scope of the testing session, and what resources should be committed to complete the planned tests. A session may be used to focus on particular defect types and other potentially problematic areas that can be addressed without the formality of scripted testing.
Types of Defects
Typical defects found with exploratory testing are scenario-based issues that were missed during scripted functional testing, issues that fall between functional boundaries, and workflow related issues. Performance and security issues are also sometimes uncovered during exploratory testing.
Applying the Best Technique
Defect- and experience-based techniques require the application of knowledge about defects and other testing experiences to target testing in order to increase defect detection. They range from “quick tests” in which the tester has no formally pre-planned activities to perform, through pre-planned sessions to scripted sessions. They are almost always useful but have particular value in the following circumstances:
- No specifications are available
- There is poor documentation of the system under test
- Insufficient time is allowed to design and create detailed tests
- Testers are experienced in the domain and/or the technology
- Diversity from scripted testing is a goal to maximize test coverage
- Operational failures are to be analyzed
Defect- and experience-based techniques are also useful when used in conjunction with specification-based techniques, as they fill the gaps in test coverage that result from systematic weaknesses in those techniques. As with the specification-based techniques, there is no one perfect technique for all situations. It is important for the Test Analyst to understand the advantages and disadvantages of each technique and to be able to select the best technique or set of techniques for the situation, considering the project type, schedule, access to information, skills of the tester, and other factors that can influence the selection.