
What Are Factorial Experiments and Why Can They Be Helpful?

Jesse A. Berlin, ScD; Johnson & Johnson Global Epidemiology, Titusville, New Jersey

Kaplan and colleagues 1 report the findings of 2 studies that were part of a randomized clinical trial that aimed to determine whether a combined intervention for adolescents, using both light exposure during sleep and cognitive behavioral therapy, would encourage them to get to sleep earlier than usual, thereby increasing total sleep time. They conducted 2 similar, but somewhat different, studies.

In the first study, participants were randomly assigned to receive either 3 weeks of light flashes (light alone) or a sham light intervention. The light flashes were brief light pulses administered in 3-millisecond bursts delivered 20 seconds apart, starting 3 hours before the targeted wake time for the individual participant. In the second study, participants were randomized to a combination of light plus cognitive behavioral therapy or sham light therapy plus cognitive behavioral therapy. The main outcomes were self-reported sleep times, momentary ratings of evening sleepiness, and subjective measures of sleepiness and sleep quality.

Kaplan et al 1 found that, in study 1, light therapy alone did not change sleep timing. However, in the second study, light plus behavioral therapy significantly moved sleep onset approximately 50 minutes earlier, on average, and increased nightly sleep time by approximately 43 minutes. There were also improvements in several secondary outcomes.

This combination of studies is similar in some respects to a factorial experiment but lacks certain unique advantages of factorial studies. The purpose of this commentary is to elaborate on those potential advantages of factorial studies, referring back to the article by Kaplan et al 1 for context.

Factorial designs are used to test more than 1 experimental factor (whence the name) in the context of a single study. In the studies by Kaplan and colleagues, 1 the 2 experimental factors were light and cognitive therapy. The first study addressed the efficacy of light in the absence of behavioral therapy. The second study tested the efficacy of light in the presence of behavioral therapy. The 2 studies were conducted sequentially. In a traditional factorial experiment, participants would have been randomized to 1 of 4 groups: light alone, cognitive therapy alone, both light and cognitive therapy, and neither. Another way to think of the design is that participants would be randomized to light vs sham; then, within each of those groups, they would be randomized again to cognitive behavioral therapy or no therapy.
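To make the 2 x 2 structure concrete, the following short sketch (in Python, with hypothetical participant IDs) shows how participants might be randomized to the 4 cells of such a design. It is purely illustrative and is not the procedure used by Kaplan et al.

```python
import random
from itertools import product

# The two experimental factors and their levels (a 2 x 2 factorial).
factors = {
    "light": ["sham light", "light flashes"],
    "therapy": ["no CBT", "CBT"],
}

# The 4 cells: every combination of factor levels.
cells = list(product(factors["light"], factors["therapy"]))

random.seed(0)
participants = [f"P{i:03d}" for i in range(1, 21)]  # hypothetical IDs

# Simple (unblocked) randomization: each participant is assigned to one cell.
assignments = {p: random.choice(cells) for p in participants}

for p, (light, therapy) in assignments.items():
    print(p, light, "+", therapy)
```

In practice a trial would typically use blocked or stratified randomization rather than simple random assignment, but the 4-cell structure is the same.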

The potential for statistical interaction, ie, when the effect of an intervention depends on the presence or absence of another intervention, is both a strength and a limitation. The strength is that the question of differential effects may actually be of scientific and clinical interest. The limitation is that if the interest of the investigators is solely in the effects of each intervention, assuming that there is no interaction, then the statistical power to address those individual questions is potentially compromised. 2 Again, in the studies by Kaplan et al, 1 the question is whether light flashes have different effects in the presence or absence of behavioral therapy.

In this situation, the investigators were not interested in the effects of behavioral therapy alone. Nonetheless, the studies as conducted confounded several design features, which slightly complicates the interpretation of their findings. Just by virtue of doing separate studies, the opportunity for increased variability is introduced. The expectation in conducting randomized trials is that results can vary from study to study even when the designs are similar. The fact that the studies were conducted sequentially allows for further variability due to calendar time. It may be that neither of these sources of variability would be meaningful for these studies, but the design does not allow disaggregation of these extraneous features. The timing of the administration of light flashes also changed (starting 3 hours before wake time in study 1 vs 2 hours before wake time in study 2). In this case, different findings about the efficacy of light therapy could be (and probably are) due to the presence of behavioral therapy, but it is also possible, especially given the small sample size in the second study, that unrelated study-to-study variability could have produced the findings (not to mention the change in the timing of the administration of light flashes).

The use of factorial designs in medicine is not new. A very famous example was the Physicians’ Health Study, the design of which is discussed in a methodological article. 3 The investigators’ goal was to test 2 hypotheses in the same study: the efficacy of aspirin in the prevention of cardiovascular disease and of beta carotene in the prevention of cancer among US physicians.

For beta carotene, 4 with follow-up over nearly 13 years, the trial found no significant decrease in cancer risk overall, or in risk of prostate, colon, or lung cancer. For myocardial infarction, 5 there was a 44% reduction in risk (relative risk, 0.56; 95% CI, 0.45-0.70). There was no reduction in overall mortality (relative risk, 0.96; 95% CI, 0.60-1.54).

Given the potential advantages of factorial experiments, one might wonder why they are not used more frequently. As noted above, there can be reduced statistical power for testing the effects of each treatment when there is a statistical interaction between treatments characterized by 1 treatment (either presence or absence) reducing the effect of the other. An example of this is provided in the study by Jaki and Vasileiou. 2 For any study with more than 1 treatment, there may also be logistical complexities introduced by the need for multiple placebos, with the associated burden on trial participants, who are asked to take multiple medications. My own view is that factorial studies are underused, for the reasons I have described. In particular, the ability to eliminate confounding between study, calendar time, and the 2 factors should be very appealing. Furthermore, if we are confident about the lack of interaction, the factorial design can be very efficient (because we have 2 studies done at the same time). When we are actually interested in testing the interaction, the factorial design offers a great opportunity. That said, Jaki and Vasileiou 2 suggest the use of an alternative design that accommodates multiple treatments and note that these other designs can be equally, or more, statistically efficient. The theme is the same, though, that testing multiple treatments in the same study is advantageous compared with doing separate studies.

In summary, the authors’ conclusions about the efficacy of light flashes are likely to be correct (within the limits of self-reported outcomes, which they note). That said, when designing studies that address more than 1 question, when possible, one should consider the potential advantages of unconfounding those factors by conducting a factorial experiment.

Published: September 25, 2019. doi:10.1001/jamanetworkopen.2019.11917

Open Access: This is an open access article distributed under the terms of the CC-BY License. © 2019 Berlin JA. JAMA Network Open.

Corresponding Author: Jesse A. Berlin, ScD, Johnson & Johnson Global Epidemiology, 1125 Trenton-Harbourton Road, J3, Zone 1, Titusville, NJ 08560 ( [email protected] ).

Conflict of Interest Disclosures: None reported.


Berlin JA. What Are Factorial Experiments and Why Can They Be Helpful? JAMA Netw Open. 2019;2(9):e1911917. doi:10.1001/jamanetworkopen.2019.11917



Explore Psychology

What Is a Factorial Design? Definition and Examples



A factorial design is a type of experiment that involves manipulating two or more variables. While simple psychology experiments look at how one independent variable affects one dependent variable, researchers often want to know more about the effects of multiple independent variables.


How a Factorial Design Works

Let’s take a closer look at how a factorial design might work in a psychology experiment:

  • The independent variable is the variable of interest that the experimenter will manipulate.
  • The dependent variable is the variable that the researcher then measures.

By doing this, psychologists can see if changing the independent variable results in some type of change in the dependent variable.

For example, imagine that a researcher wants to do an experiment looking at whether sleep deprivation hurts reaction times during a driving test. If she were only to perform the experiment using these variables–the sleep deprivation being the independent variable and the performance on the driving test being the dependent variable–it would be an example of a simple experiment.

However, let’s imagine that she is also interested in learning if sleep deprivation impacts the driving abilities of men and women differently. She has just added a second independent variable of interest (sex of the driver) into her study, which now makes it a factorial design.

Types of Factorial Designs

One common type of experiment is known as a 2×2 factorial design. In this type of study, there are two factors (or independent variables), each with two levels.

The number of numbers tells you how many independent variables (IVs) there are in an experiment, while the value of each number tells you how many levels there are for that independent variable.

So, for example, a 4×3 factorial design would involve two independent variables with four levels for one IV and three levels for the other IV.

Advantages of a Factorial Design

One of the big advantages of factorial designs is that they allow researchers to look for interactions between independent variables.

An interaction is a result in which the effect of one experimental manipulation depends upon the experimental manipulation of another independent variable.

Example of a Factorial Design

For example, imagine that researchers want to test the effects of a memory-enhancing drug. Participants are given one of three different drug doses, and then asked to either complete a simple or complex memory task.

The researchers note that the effects of the memory drug are more pronounced with the simple memory tasks, but not as apparent when it comes to the complex tasks. In this 3×2 factorial design, there is an interaction effect between the drug dosage and the complexity of the memory task.

Understanding Variable Effects in Factorial Designs

So if researchers are manipulating two or more independent variables, how exactly do they know which effects are linked to which variables?

“It is true that when two manipulations are operating simultaneously, it is impossible to disentangle their effects completely,” explain authors Breckler, Olson, and Wiggins in their book Social Psychology Alive .

“Nevertheless, the researchers can explore the effects of each independent variable separately by averaging across all levels of the other independent variable. This procedure is called looking at the main effect.”
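A minimal sketch of that averaging procedure, using hypothetical cell means for the sleep-deprivation driving example described earlier (the numbers are invented for illustration):

```python
import pandas as pd

# Hypothetical cell means from a 2x2 study (sleep deprivation x sex of driver).
data = pd.DataFrame({
    "sleep_deprived": ["no", "no", "yes", "yes"],
    "sex":            ["male", "female", "male", "female"],
    "reaction_ms":    [520, 510, 605, 640],   # made-up mean reaction times
})

# Main effect of sleep deprivation: average across both levels of sex.
print(data.groupby("sleep_deprived")["reaction_ms"].mean())

# Main effect of sex: average across both levels of sleep deprivation.
print(data.groupby("sex")["reaction_ms"].mean())
```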

Examples of Factorial Designs

A university wants to assess the starting salaries of their MBA graduates. The study looks at graduates working in four different employment areas: accounting, management, finance, and marketing.

In addition to looking at the employment sector, the researchers also look at gender. In this example, the employment sector and gender of the graduates are the independent variables, and the starting salary is the dependent variable. This would be considered a 4x2 factorial design.

Researchers want to determine how the amount of sleep a person gets the night before an exam impacts performance on a math test the next day. But the experimenters also know that many people like to have a cup of coffee (or two) in the morning to help them get going.

So, the researchers decided to look at how the amount of sleep and caffeine influence test performance. 

The researchers then decided to look at three levels of sleep (4 hours, 6 hours, and 8 hours) and only two levels of caffeine consumption (2 cups versus no coffee). In this case, the study is a 3×2 factorial design.

Baker TB, Smith SS, Bolt DM, et al. Implementing clinical research using factorial designs: A primer. Behav Ther. 2017;48(4):567-580. doi:10.1016/j.beth.2016.12.005

Collins LM, Dziak JJ, Li R. Design of experiments with multiple independent variables: a resource management perspective on complete and reduced factorial designs. Psychol Methods. 2009;14(3):202-224. doi:10.1037/a0015826

Haerling Adamson K, Prion S. Two-by-two factorial design. Clin Simul Nurs. 2020;49:90-91. doi:10.1016/j.ecns.2020.06.004

Watkins ER, Newbold A. Factorial designs help to understand how psychological therapy works. Front Psychiatry. 2020;11:429. doi:10.3389/fpsyt.2020.00429


Factorial Design


What Is A Factorial Design?

Factorial design is a statistical experimental design used to investigate the effects of two or more independent variables (factors) on a dependent variable. By manipulating the levels of the factors and measuring the resulting impact on the dependent variable, researchers can identify each factor's unique contribution as well as their combined or interactive effects.


Factorial designs are especially useful when investigating interactions between variables. They allow researchers to explore how one factor's effects may depend on the levels of another factor. This can provide valuable insight into the mechanisms driving the observed effects and helps identify potential moderators or mediators of the relationship between variables.


  • Factorial designs allow researchers to investigate the impacts of independent variables on a dependent variable in a single experiment. This can save time and resources and provide a nuanced understanding of the relationships between variables.
  • It can identify the main effects and interaction effects between independent variables. This provides insights into the unique contributions of each variable and how they interact with one another.
  • It can increase the statistical power of a study by manipulating multiple independent variables. This improves the likelihood of detecting meaningful effects.

Factorial designs are robust and widely used experimental designs in research. Their origins trace back to the early 20th century, with the development of analysis of variance (ANOVA) and the work of Ronald A. Fisher.

Factorial designs are relevant because they allow researchers to investigate the effects of multiple factors on a dependent variable in a controlled setting. By manipulating the levels of the factors, researchers can determine the unique contribution of each factor, as well as any interactive effects that may exist between them. This information helps develop interventions or treatments tailored to a population's needs or context.

For example, a 2x2 factorial design would involve testing two factors, each with two levels. This would result in four different conditions (2x2=4), representing all possible combinations of the levels of the two factors. By manipulating the levels of the factors in each condition and measuring the effect on the dependent variable, researchers can determine the main effects of each factor and any interaction effects between them.

Factorial designs can be classified into several types, based on the number of independent variables (factors) and levels used in the experiment. Some common types include:

  • 2x2 factorial design: It involves two independent variables, each with two levels. It is popular in psychological research for investigating the effects of two factors on behavior or an outcome.
  • 3x3 factorial design: It involves two independent variables, each with three levels. It helps investigate the effects of multiple factor levels on behavior or an outcome and can be particularly useful in medical research.
  • Mixed factorial design: This design involves at least one independent variable manipulated within subjects (i.e., each participant experiences all levels of the variable) and at least one independent variable manipulated between subjects (i.e., each participant experiences only one level of the variable).
  • Nested factorial design: This design involves one independent variable whose levels are nested within the levels of another independent variable. For instance, a study on different types of therapy for depression might have one independent variable that represents the type of therapy (cognitive-behavioral therapy, psychoanalytic therapy, etc.) and another independent variable that represents the therapist administering the treatment, with therapists nested within therapy type.
  • Fractional factorial design: This design involves testing only a subset of the possible combinations of levels of the independent variables. This can be useful when resources are limited or when testing all possible combinations would be impractical.

Let us understand it better with the help of some examples:

Suppose a study on the effects of different types of online learning environments and study strategies on academic performance in college students is carried out. The study could use a 2x2 factorial design, with two independent variables (learning environment and study strategy) and two levels of each independent variable.

The two learning environments could be synchronous online learning (i.e., live classes with real-time interaction) and asynchronous online learning (i.e., pre-recorded lessons with discussion boards). The two study strategies could be self-regulated learning (i.e., self-paced and self-directed study) and collaborative learning (i.e., group work and peer feedback).

Participants would be randomly assigned to one of four groups: synchronous learning with self-regulated study, synchronous learning with collaborative study, asynchronous learning with self-regulated study, or asynchronous learning with collaborative study. The dependent variable would be the participants' academic performance, measured by their grades in a specific course.

By manipulating the levels of the learning environment and study strategy and measuring their combined and individual effects on academic performance, this study could provide valuable insights into the most effective approaches to online learning for college students.

A study published in The Lancet in 2021 investigated the effects of different COVID-19 vaccines on the immune response in older adults, who are at higher risk of illness and death from COVID-19.

The study used a 2x2 factorial design. Participants were assigned to groups based on two independent variables: vaccine type (Pfizer-BioNTech or Oxford-AstraZeneca) and dosing interval (3 or 12 weeks between doses). The dependent variable was the level of antibodies against the SARS-CoV-2 virus in the participants' blood.

The study results showed that the Pfizer-BioNTech vaccine produced a higher antibody response than the Oxford-AstraZeneca vaccine, regardless of the dosing interval. However, the dosing interval also significantly affected the antibody response, with a longer gap between doses resulting in higher levels of antibodies in both vaccine groups.

Advantages And Disadvantages

The advantages are as follows:

  • Ability to investigate multiple factors: Factorial designs allow researchers to investigate the effects of several independent variables on a dependent variable in a single experiment, which can save time and resources.
  • Identification of main effects and interactions: They enable researchers to identify the main effects of each independent variable and any interaction effects between them, providing a more nuanced understanding of the relationships between variables.
  • Increased statistical power: By manipulating multiple independent variables, factorial designs can increase the statistical power of a study and improve the likelihood of detecting meaningful effects.
  • Flexibility: Factorial designs can be adapted to a wide range of research questions and are used in many fields, including psychology, education, medicine, and engineering.

The disadvantages are as follows:

  • Increased complexity: Using multiple independent variables can make the results more complex to interpret, particularly when interaction effects are present.
  • Increased sample size requirements: A larger sample size is needed to adequately power a factorial design than a simpler design with fewer independent variables.
  • Potential for confounding: The presence of interaction effects can make it challenging to determine which independent variable is responsible for observed effects, potentially confounding the results.
  • Limited generalizability: The specific combinations of conditions tested in a factorial experiment may not be generalizable to other contexts.

Frequently Asked Questions (FAQs)

What is the difference between a main effect and an interaction effect?

A main effect refers to the impact of a single independent variable on the dependent variable. In contrast, an interaction effect refers to the combined effect of two or more independent variables on the dependent variable. In a factorial design, both main effects and interaction effects can be present.

What is the difference between a between-subjects and a within-subjects factorial design?

In a between-subjects factorial design, each participant is exposed to only one combination of the levels of the independent variables. In a within-subjects factorial design, each participant is exposed to all possible combinations of the levels of the independent variables.

Can factorial designs be used in observational research?

Yes, factorial designs can be helpful in both observational and experimental research. In observational studies, the independent variables are not manipulated, but their effects on the dependent variable can still be investigated using a factorial design.


Teach yourself statistics

What is a Full Factorial Experiment?

This lesson describes full factorial experiments. Specifically, the lesson answers four questions:

  • What is a full factorial experiment?
  • What causal effects can we test in a full factorial experiment?
  • How should we interpret causal effects?
  • What are the advantages and disadvantages of a full factorial experiment?

What is a Factorial Experiment?

A factorial experiment allows researchers to study the joint effect of two or more factors on a dependent variable . Factorial experiments come in two flavors: full factorials and fractional factorials. In this lesson, we will focus on the full factorial experiment, not the fractional factorial.

Full Factorial Experiment

A full factorial experiment includes a treatment group for every combination of factor levels. Therefore, the number of treatment groups is the product of factor levels. For example, consider the full factorial design shown below:

              A1                       A2
       B1     B2     B3        B1     B2     B3
C1    Grp 1  Grp 2  Grp 3     Grp 4  Grp 5  Grp 6
C2    Grp 7  Grp 8  Grp 9     Grp 10 Grp 11 Grp 12
C3    Grp 13 Grp 14 Grp 15    Grp 16 Grp 17 Grp 18
C4    Grp 19 Grp 20 Grp 21    Grp 22 Grp 23 Grp 24

Factor A has two levels, factor B has three levels, and factor C has four levels. Therefore, this full factorial design has 2 x 3 x 4 = 24 treatment groups.

Full factorial designs can be characterized by the number of treatment levels associated with each factor, or by the number of factors in the design. Thus, the design above could be described as a 2 x 3 x 4 design (number of treatment levels) or as a three-factor design (number of factors).

Fractional Factorial Experiments

The other type of factorial experiment is a fractional factorial. Unlike full factorial experiments, which include a treatment group for every combination of factor levels, fractional factorial experiments include only a subset of possible treatment groups.

Causal Effects

A full factorial experiment allows researchers to examine two types of causal effects: main effects and interaction effects. To facilitate the discussion of these effects, we will examine results (mean scores) from three 2 x 2 factorial experiments:

Experiment I: Mean Scores

       A1   A2
B1      5    2
B2      2    5

Experiment II: Mean Scores

       C1   C2
D1      5    4
D2      4    1

Experiment III: Mean Scores

       E1   E2
F1      5    3
F2      3    1

Main Effects

In a full factorial experiment, a main effect is the effect of one factor on a dependent variable, averaged over all levels of other factors. A two-factor factorial experiment will have two main effects; a three-factor factorial, three main effects; a four-factor factorial, four main effects; and so on.

How to Measure Main Effects

To illustrate what is going on with main effects, let's look more closely at the main effects from Experiment I:

Assuming there were an equal number of observations in each treatment group, we can compute the main effect for Factor A as shown below:

Effect of A at level B1 = A2B1 - A1B1 = 2 - 5 = -3

Effect of A at level B2 = A2B2 - A1B2 = 5 - 2 = +3

Main effect of A = (-3 + 3) / 2 = 0

And we can compute the main effect for Factor B as shown below:

Effect of B at level A1 = A1B2 - A1B1 = 5 - 2 = +3

Effect of B at level A2 = A2B2 - A2B1 = 2 - 5 = -3

Main effect of B = (+3 - 3) / 2 = 0
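The same arithmetic can be verified with a few lines of code; the cell means below are those of Experiment I.

```python
# Cell means from Experiment I (rows = B1, B2; columns = A1, A2).
a1b1, a2b1 = 5, 2
a1b2, a2b2 = 2, 5

effect_A_at_B1 = a2b1 - a1b1                              # -3
effect_A_at_B2 = a2b2 - a1b2                              # +3
main_effect_A = (effect_A_at_B1 + effect_A_at_B2) / 2     # 0

effect_B_at_A1 = a1b2 - a1b1                              # +3
effect_B_at_A2 = a2b2 - a2b1                              # -3
main_effect_B = (effect_B_at_A1 + effect_B_at_A2) / 2     # 0

print(main_effect_A, main_effect_B)   # 0.0 0.0
```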

In a similar fashion, we can compute main effects for Experiment II (see Problem 1 ) and Experiment III (see Problem 2 ).

Warning: In a full factorial experiment, you should not attempt to interpret main effects until you have looked at interaction effects. With that in mind, let's look at interaction effects for Experiments I, II, and III.

Interaction Effects

In a full factorial experiment, an interaction effect exists when the effect of one independent variable depends on the level of another independent variable.

When Interactions Are Present

The presence of an interaction can often be discerned when factorial data are plotted. For example, the charts below plot mean scores from Experiment I and from Experiment II:

[Interaction plot: Experiment I mean scores]

[Interaction plot: Experiment II mean scores]

In Experiment I, consider how the dependent variable score is affected by level A1 versus level A2. In the presence of B1, the dependent variable score is bigger for A1 than for A2. But in the presence of B2, the reverse is true - the dependent variable score is bigger for A2 than for A1.

In Experiment II, level C1 is associated with a little bit bigger dependent variable score in the presence of D1; but a much bigger dependent variable score in the presence of D2.

In both charts, the way that one factor affects the dependent variable depends on the level of another factor. This is the definition of an interaction effect. In charts like these, the presence of an interaction is indicated by non-parallel plotted lines.

Note: These charts are called interaction plots. For guidance on creating and interpreting interaction plots, see Interaction Plots .

When Interactions Are Absent

Now, look at the chart below, which plots mean scores from Experiment III:

[Interaction plot: Experiment III mean scores]

In this chart, E1 has the same effect on the dependent variable, regardless of the level of Factor F. At each level of Factor F, the dependent variable is 2 units bigger with E1 than with E2. So, in this chart, there is no interaction between Factors E and F. And you can tell at a glance that there is no interaction, because the plotted lines are parallel.
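Because the original charts are not reproduced here, the following sketch (matplotlib, using the Experiment I and Experiment III cell means) shows how such interaction plots can be drawn: non-parallel or crossing lines indicate an interaction, parallel lines indicate none.

```python
import matplotlib.pyplot as plt

# Cell means: Experiment I shows an interaction, Experiment III does not.
exp1 = {"A1": [5, 2], "A2": [2, 5]}   # values at B1, B2
exp3 = {"E1": [5, 3], "E2": [3, 1]}   # values at F1, F2

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

for level, means in exp1.items():
    ax1.plot(["B1", "B2"], means, marker="o", label=level)
ax1.set_title("Experiment I (lines cross: interaction)")
ax1.legend()

for level, means in exp3.items():
    ax2.plot(["F1", "F2"], means, marker="o", label=level)
ax2.set_title("Experiment III (parallel lines: no interaction)")
ax2.legend()

plt.tight_layout()
plt.show()
```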

Number of Interactions

The number of interaction effects in a full factorial experiment is determined by the number of factors. A two-factor design (with factors A and B) has one two-way interaction (the AB interaction). A three-factor design (with factors A, B, and C) has one three-way interaction (the ABC interaction) and three two-way interactions (the AB, AC, and BC interactions).

A general formula for finding the number of interaction effects (NIE) in a full factorial experiment is:

NIE = kC2 + kC3 + ... + kCk

where kCr is the number of combinations of k things taken r at a time, k is the number of factors in the full factorial experiment, and r is the number of factors in the interaction term.

Note: If you are unfamiliar with combinations, see Combinations and Permutations .
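That count is easy to check in code; a short sketch using Python's math.comb:

```python
from math import comb

def n_interaction_effects(k: int) -> int:
    """Number of interaction effects in a full factorial with k factors."""
    return sum(comb(k, r) for r in range(2, k + 1))

print(n_interaction_effects(2))  # 1  (AB)
print(n_interaction_effects(3))  # 4  (AB, AC, BC, ABC)
print(n_interaction_effects(4))  # 11
```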

How to Interpret Causal Effects

Recall that the purpose of conducting a full factorial experiment is to understand the joint effects (main effects and interaction effects) of two or more independent variables on a dependent variable. When a researcher looks at actual data from an experiment, small differences in group means are expected, even when independent variables have no causal connection to the dependent variable. These small differences might be attributable to random effects of unmeasured extraneous variables .

So the real question becomes: Are observed effects significantly bigger than would be expected by chance - big enough to be attributable to a main or interaction effect rather than to an extraneous variable? One way to answer this question is with analysis of variance. Analysis of variance will test all main effects and interaction effects for statistical significance. Here is how to interpret the results of that test:

  • If no effects (main effects or interaction effects) are statistically significant, conclude that the independent variables do not affect the dependent variable.
  • If a main effect is statistically significant, conclude that the main effect does affect the dependent variable.
  • If an interaction effect is statistically significant, conclude that the interaction factors act in combination to affect the dependent variable.

Recognize that it is possible for factors to affect the dependent variable, even when the main effects are not statistically significant. We saw an example of that in Experiment I.

In Experiment I, both main effects were zero; yet, the interaction effect is dramatic. The moral here is: Do not attempt to interpret main effects until you have looked at interaction effects.

Note: To learn how to implement analysis of variance for a full factorial experiment, see ANOVA With Full Factorial Experiments .
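As a minimal sketch of such an analysis (in Python with statsmodels; the data frame, column names, and response values below are hypothetical, chosen to mimic the crossover pattern of Experiment I), a model with both main effects and the interaction can be fit and tested as follows.

```python
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Hypothetical raw data: one row per subject, with factor levels and a response.
df = pd.DataFrame({
    "A": ["A1", "A1", "A2", "A2"] * 5,
    "B": ["B1", "B2", "B1", "B2"] * 5,
    "y": [5.1, 2.2, 1.9, 5.3, 4.8, 1.8, 2.1, 4.9, 5.0, 2.0,
          2.3, 5.2, 5.2, 2.1, 1.8, 5.1, 4.9, 2.4, 2.0, 5.0],
})

# Fit a model with both main effects and the A x B interaction, then run ANOVA.
model = ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(model, typ=2))
```

In data like these, the interaction term would dominate while the main effects are near zero, which is exactly the Experiment I situation described above.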

Advantages and Disadvantages

Analysis of variance with a full factorial experiment has advantages and disadvantages. Advantages include the following:

  • The design permits a researcher to examine multiple factors in a single experiment.
  • The design permits a researcher to examine all interaction effects.
  • The design requires subjects to participate in only one treatment group.

Disadvantages include the following:

  • When the experiment includes many factors and levels, sample size requirements may be excessive.
  • The need to include all treatment combinations, regardless of importance, may waste resources.

Test Your Understanding

Problem 1

The table below shows results (mean scores) from a 2 x 2 factorial experiment with Factors C and D:

       C1   C2
D1      5    4
D2      4    1

Assuming equal sample size in each treatment group, what is the main effect for both factors?

(A) -2 (B) 3.5 (C) 4 (D) 7 (E) 14

The correct answer is (A). We can compute the main effect for Factor C as shown below:

Effect of C at level D1 = C2D1 - C1D1 = 4 - 5 = -1

Effect of C at level D2 = C2D2 - C1D2 = 1 - 4 = -3

Main effect of C = (-1 + -3) / 2 = -2

And we can compute the main effect for Factor D as shown below:

Effect of D at level C1 = C1D2 - C1D1 = 4 - 5 = -1

Effect of D at level C2 = C2D2 - C2D1 = 1 - 4 = -3

Main effect of D = (-1 + -3) / 2 = -2

Problem 2

The table below shows results (mean scores) from a second 2 x 2 factorial experiment with Factors E and F:

       E1   E2
F1      5    3
F2      3    1

Assuming equal sample size in each treatment group, what is the main effect for both factors?

(A) -12 (B) -2 (C) 0 (D) 3 (E) 4

The correct answer is (B). We can compute the main effect for Factor E as shown below:

Effect of E at level F1 = E2F1 - E1F1 = 3 - 5 = -2

Effect of E at level F2 = E2F2 - E1F2 = 1 - 3 = -2

Main effect of E = (-2 + -2) / 2 = -2

And we can compute the main effect for Factor F as shown below:

Effect of F at level E1 = E1F2 - E1F1 = 3 - 5 = -2

Effect of F at level E2 = E2F2 - E2F1 = 1 - 3 = -2

Main effect of F = (-2 + -2) / 2 = -2

Consider the interaction plot shown below. Which of the following statements are true?

(A) There is a non-zero interaction between Factors A and B. (B) There is zero interaction between Factors A and B. (C) The plot provides insufficient information to describe the interaction.

The correct answer is (B). At every level of Factor B, the difference between A1 and A2 is 3 units. Because the effect of Factor A is constant (always 3 units) at every level of Factor B, there is no interaction between Factors A and B.

Note: The parallel pattern of lines in the interaction plot indicates that the AB interaction is zero.

5 Reasons Factorial Experiments Are So Successful

Last week we began an experimental design trying to get at how to drive the golf ball the farthest off the tee by characterizing the process and defining the problem. The next step in our DOE problem-solving methodology is to design the data collection plan we’ll use to study the factors in the experiment.

We will construct a full factorial design, fractionate that design to half the number of runs for each golfer, and then discuss the benefits of running our experiment as a factorial design.


The four factors in our experiment and the low / high settings used in the study are:

  • Club Face Tilt (Tilt) – Continuous Factor: 8.5 degrees & 10.5 degrees
  • Ball Characteristics (Ball) – Categorical Factor: Economy & Expensive
  • Club Shaft Flexibility (Shaft) – Continuous Factor: 291 & 306 vibration cycles per minute
  • Tee Height (TeeHght) – Continuous Factor: 1 inch & 1 3/4 inch

To develop a full understanding of the effects of 2 – 5 factors on your response variables, a full factorial experiment requiring 2^k runs (k = number of factors) is commonly used. Many industrial factorial designs study 2 to 5 factors in 4 to 16 runs (2^(5-1) runs, the half fraction, is the best choice for studying 5 factors) because 4 to 16 runs is not unreasonable in most situations. The data collection plan for a full factorial consists of all combinations of the high and low settings for each of the factors. A cube plot, like the one created for our golf experiment, is a good way to display the design space the experiment will cover.
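To make the data collection plan concrete, here is a short Python sketch (not Minitab output) that enumerates all 2^4 = 16 runs of the golf experiment's full factorial:

```python
from itertools import product

# Low / high settings for the four factors in the golf experiment.
levels = {
    "Tilt":    [8.5, 10.5],             # degrees
    "Ball":    ["Economy", "Expensive"],
    "Shaft":   [291, 306],              # vibration cycles per minute
    "TeeHght": [1.0, 1.75],             # inches
}

runs = list(product(*levels.values()))
print(len(runs))                         # 16 runs in the 2^4 full factorial
for run in runs:
    print(dict(zip(levels.keys(), run)))
```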

There are a number of good reasons for choosing this data collection plan over other possible designs. The details are discussed in  many   excellent   texts . Here are my top five.

1. Factorial and fractional factorial designs are more cost-efficient.

Factorial and fractional factorial designs provide the most run efficient (economical) data collection plan to learn the relationship between your response variables and predictor variables. They achieve this efficiency by assuming that each effect on the response is linear and therefore can be estimated by studying only two levels of each predictor variable.

After all, it only takes two points to establish a line.

2. Factorial designs estimate the interactions of each input variable with every other input variable.

Often the effect of one variable on your response is dependent on the level or setting of another variable. The effectiveness of a college quarterback is a good analogy. A good quarterback can have good skills on his own. However, a great quarterback will achieve outstanding results only if he and his wide receiver have synergy. As a combination, the results of the pair can exceed the skill level of each individual player. This is an example of a synergistic interaction. Complex industrial processes commonly have interactions, both synergistic and antagonistic, occurring between input variables. We cannot fully quantify the effects of input variables on our responses unless we have identified all active interactions in addition to the main effects of each variable. Factorial experiments are specifically designed to estimate all possible interactions.   

3. Factorial designs are orthogonal.

We analyze our final experiment results using least squares regression to fit a linear model for the response as a function of the main effects and two-way interactions of each of the input variables. A key concern in least squares regression arises if the settings of the input variables or their interactions are correlated with each other. If this correlation occurs, the effect of one variable may be masked or confounded with another variable or interaction making it difficult to determine which variables actually cause the change in the response. When analyzing historical or observational data, there is no control over which variable settings are correlated with other input variable settings and this casts a doubt on the conclusiveness of the results. Orthogonal experimental designs have zero correlation between any variable or interaction effects specifically to avoid this problem. Therefore, our regression results for each effect are independent of all other effects and the results are clear and conclusive.
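The orthogonality property can be verified numerically: in a -1/+1 coded two-level full factorial, every main-effect column and every two-way interaction column is uncorrelated with every other. A small sketch:

```python
import numpy as np
from itertools import product

# Coded 2^4 full factorial: each factor at -1 (low) and +1 (high).
design = np.array(list(product([-1, 1], repeat=4)))        # 16 runs x 4 factors
a, b, c, d = design.T

# Add the two-way interaction columns (elementwise products).
columns = np.column_stack([a, b, c, d, a*b, a*c, a*d, b*c, b*d, c*d])

# All pairwise correlations between effect columns are zero (identity matrix).
print(np.round(np.corrcoef(columns, rowvar=False), 10))
```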

4. Factorial designs encourage a comprehensive approach to problem-solving.

First, intuition leads many researchers to reduce the list of possible input variables before the experiment in order to simplify the experiment execution and analysis. This intuition is wrong. The power of an experiment to determine the effect of an input variable on the response is reduced to zero the minute that variable is removed from the study (in the name of simplicity). Through the use of fractional factorial designs and experience in DOE, you quickly learn that it is just as easy to run a 7 factor experiment as a 3 factor experiment, while being much more effective.

Second, factorial experiments study each variable’s effect over a range of settings of the other variables. Therefore, our results apply to the full scope of all the process parameter settings rather than just specific settings of the other variables. Our results are more widely applicable to all conditions than the results from studying one variable at a time.

5. Two-level factorial designs provide an excellent foundation for a variety of follow-up experiments.

This will lead to the solution to your process problem. A fold-over of your initial fractional factorial can be used to complement an initial lower resolution experiment, providing a complete understanding of all your input variable effects. Augmenting your original design with axial points results in a response surface design to optimize your response with greater precision. The initial factorial design can provide a path of steepest ascent / descent to move out of your current design space into one with even better response values. Finally, and perhaps most commonly, a second factorial design with fewer variables and a smaller design space can be created to better understand the highest potential region for your response within the original design space.

I hope this short discussion has convinced you that any researcher in academics or industry will be well rewarded for the time spent learning to design, execute, analyze, and communicate the results from factorial experiments. The earlier in your career you learn these skills, the … well, you know the rest.

For these reasons, we can be quite confident about our selection of a full factorial data collection to study the 4 variables for our golf experiment. Each golfer will be responsible for executing only one half of the runs, called a half fraction, of the full factorial. Even so, the results for each golfer can be analyzed independently as a complete experiment.
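One standard way to construct such a half fraction is to split the full factorial on the defining relation I = ABCD; the sketch below illustrates that split. This is a reasonable construction under standard DOE practice, not necessarily the exact allocation used in the golf study.

```python
import numpy as np
from itertools import product

design = np.array(list(product([-1, 1], repeat=4)))   # full 2^4 factorial, 16 runs

# Split on the sign of the four-way product ABCD (defining relation I = ABCD).
abcd = design.prod(axis=1)
golfer_1_runs = design[abcd == +1]    # one half fraction (8 runs)
golfer_2_runs = design[abcd == -1]    # the complementary half fraction (8 runs)

print(len(golfer_1_runs), len(golfer_2_runs))   # 8 8
```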

In my next post, I’ll answer the question: How do we calculate the number of replicates needed for each set of run conditions from each golfer so that our results have a high enough power that we can be confident in our conclusions? Many thanks to  Toftrees Golf Resort  and  Tussey Mountain  for use of their facilities to conduct our golf experiment.

Catch Up with the other Golf DOE Posts:

  • Part 1: A (Golf) Course in Design of Experiments
  • Part 3: Mulligan? How Many Runs Do You Need to Produce a Complete Data Set?
  • Part 4: ANCOVA and Blocking: 2 Vital Parts to DOE
  • Part 5: Concluding Our Golf DOE: Time to Quantify, Understand and Optimize


METHODS article

Factorial designs help to understand how psychological therapy works.

Edward R. Watkins*

  • College of Life and Environmental Sciences, University of Exeter, Exeter, United Kingdom

A large amount of research time and resources are spent trying to develop or improve psychological therapies. However, treatment development is challenging and time-consuming, and the typical research process followed—a series of standard randomized controlled trials—is inefficient and sub-optimal for answering many important clinical research questions. In other areas of health research, recognition of these challenges has led to the development of sophisticated designs tailored to increase research efficiency and answer more targeted research questions about treatment mechanisms or optimal delivery. However, these innovations have largely not permeated into psychological treatment development research. There is a recognition of the need to understand how treatments work and what their active ingredients might be, and a call for the use of innovative trial designs to support such discovery. One approach to unpack the active ingredients and mechanisms of therapy is the factorial design as exemplified in the Multiphase Optimization Strategy (MOST) approach. The MOST design allows identification of the active components of a complex multi-component intervention (such as CBT) using a sophisticated factorial design, enabling the development of more efficient interventions and elucidating their mechanisms of action. The rationale, design, and potential advantages of this approach will be illustrated with reference to the IMPROVE-2 study, which uses a fractional factorial design to investigate which elements (e.g., thought challenging, activity scheduling, compassion, relaxation, concreteness, functional analysis) within therapist-supported internet-delivered CBT are most effective at reducing symptoms of depression in 767 adults with major depression. By using this innovative approach, we can first begin to work out what components within the overall treatment package are most efficacious on average, allowing us to build an overall more streamlined and potent therapy. This approach also has potential to distinguish the role of specific versus non-specific common treatment components within treatment.

Introduction: The Need to Understand How Psychological Therapies Work

Psychological treatments for mental health disorders have been robustly established as proven and evidence-based interventions through multiple clinical trials and meta-analyses ( 1 – 3 ). Nonetheless, there is a pressing need to further improve psychological interventions: even the best treatments do not work for everyone. Many patients do not have sustained improvement, and treatments need to be scaled up to tackle the global burden of mental health ( 4 ). For example, psychological treatments for depression only achieve remission rates of 30%–40% and have limited sustained efficacy (at least 50% relapse and recurrence) ( 1 , 5 ). Further, it is estimated that current treatments, if delivered optimally, would only reduce the burden of depression by one third ( 6 ). As such, psychological treatments for depression need to be significantly enhanced.

One pathway to improving the efficacy and effectiveness of therapies is to develop our understanding of how complex psychological interventions work. Despite determining that a number of psychological treatments are effective, for example, cognitive-behavioral therapy (CBT), we still do not know how psychological treatments work. There is little evidence on the precise mechanisms through which psychological treatments work or what are the active ingredients of treatments ( 7 – 10 ), especially for disorders involving general distress such as depression and generalized anxiety disorder. Historically, there has been little progress in specifying the active ingredients of CBT for depression, and as a consequence, there have been no significant gains in the effectiveness of CBT for depression for over 40 years.

Resolving the active mechanisms and active ingredients of psychological interventions has been repeatedly identified as a major priority for research ( 4 , 7 , 10 , 11 ). For example, the Institute of Medicine (2015) highlighted the need to identify the key elements of psychosocial interventions that causally drive their effects ( 11 ).

To be clear, we distinguish between the active components of therapy, operationalized as the active elements or ingredients within a therapy that produce clinical benefit, which could be therapist-based, client activities, specific techniques, or related to therapy structure and delivery, versus the active mechanisms of the therapy, operationalized as the underlying change processes that causally underpin therapeutic benefit. While active components will necessarily impact on one or more active mechanisms, knowing the most effective components of a therapy is distinct from knowing how this component leads to symptom change [i.e., its underlying mechanism(s)]. For example, in CBT, identifying behavioral activation as an active therapy component does not necessarily confirm that the mechanism-of-action is behavioral as behavioral activation may work through changing cognitions.

Understanding the mechanisms or the active components of psychological treatments is important because either potentially enables the development of more direct, precise, potent, simpler, briefer, and effective treatments. Understanding the active components of a psychological therapy is necessary in order to parse and distil the therapy to focus on what is essential and most engaging to patients.

Psychological treatments are complex interventions, typically made up of multiple elements and components, including the particular content and techniques of the therapy, the interaction between the therapist and patient, the structure of the therapy, and the mode and organization of delivery, each of which potentially acts via distinct mechanisms. Therapy is thus a complex multifactorial process. Any or none of these factors could contribute to the efficacy of an intervention, alone or in interaction with the other factors. It is therefore critical to determine the beneficially active, inactive, or inert, and iatrogenic components within an intervention so that the intervention can be honed to become optimally effective, by focusing on the active elements and by removing irrelevant or unhelpful elements ( 12 ).

Relatedly, if we know the active mechanisms of an intervention, we may be able to adapt the intervention or develop novel approaches to more directly target this mechanism and, thereby, increase the efficacy of the intervention.

Because of the high prevalence of common mental health problems, there is also a scalability gap because there are not sufficiently available therapists to tackle the global burden of poor mental health ( 13 ). It is therefore critical that ways are found to make treatments more efficient, scalable, and easier to train and disseminate. Understanding the underlying components of therapy and being able to remove unnecessary elements may make psychological therapies more effective and more cost-effective by streamlining and simplifying the treatment. For example, the same treatment benefit could be achieved from fewer sessions, enabling a greater volume of patients to be treated for the same volume of therapists. Understanding the critical active components of therapy will also help to adapt treatments for the alternative delivery means that are necessary for increased scalability (for example, to convert for self-help, lay provision, or digital interventions), without losing the core elements needed for efficacy. Understanding how therapy works will also make it easier to effectively train and disseminate therapies, facilitating wider treatment coverage. This understanding may also help to identify moderators of treatment outcome and more effectively personalize therapy to each individual.

Common Versus Specific Treatment Factors

One key issue with respect to resolving the underlying mechanisms underpinning the efficacy of psychological treatments concerns the question of whether treatment works through specific versus non-specific common factors ( 8 , 14 ). Specific factors are procedures or techniques arising from the particular therapy approach, such as those typically described in structured treatment manuals, for example, cognitive restructuring in CBT; exposure in CBT for anxiety disorders. Common (or non-specific) factors are those that are hypothesized to be common across all psychological interventions. The most important of these include a positive and genuine relationship between the therapist and patient, engendering positive expectancies and hope in the patient, and a convincing rationale that explains the symptoms experienced and gives credible reasons for the treatment to be helpful ( 15 ). There is a long-standing and still unresolved debate between those who propose that psychotherapies mainly work through specific factors versus those who propose that psychotherapies mainly work through common factors.

One argument made in support of common factors is that different specific psychotherapies are generally not found to differ in efficacy, although this does not logically rule out that treatments may work via different mechanisms ( 16 ). A recent review concludes that there is as yet no conclusive evidence that either common or specific factors can be considered a validated working mechanism for psychotherapy, in other words, the evidence is insufficient to determine the role of either ( 8 ).

The relative contribution of common versus specific factors in the efficacy of psychological interventions has important implications for how therapists should be trained, how therapies should be delivered, and for how treatment services should be organized. If the substantive part of the treatment effect is due to common factors, then therapy training should predominantly emphasize therapists learning how to develop a strong therapeutic relationship, develop a rationale etc. In parallel, therapy research should focus on understanding how to strengthen positive common factor effects. However, if specific factors are important then these also need to be emphasized in training and delineated in further research. Furthermore, the increasing importance of specific factors indicates a potentially greater need for discriminating and selecting therapy to match the individual clinical presentation.

Methodologies to Examine the Mechanisms of Psychotherapy

Comparative Randomized Controlled Trials

One reason for limited progress in understanding the mechanisms of psychological treatments is the focus on parallel group comparative randomized controlled trials (RCTs). Parallel group RCTs are the gold standard for establishing if an intervention works more than another intervention or against a control and the best means for establishing the relative efficacy of one treatment intervention versus another. However, they are not designed for investigating the specific mechanisms of how interventions work or identifying the active components of therapy. Because comparative RCTs can only compare the overall effects of each intervention package, they are not intended to and unable to provide information about the performance of the individual elements within complex multifactorial interventions. In standard comparative RCTs, all of the multiple treatment components and factors in an intervention package and their hypothetical mechanisms are aggregated and confounded together in the comparison of one treatment versus another. As a consequence, this design is unable to test specific main effects of treatment components nor any possible synergistic or antagonistic interactions between individual treatment components, limiting advances in mechanistic understanding. If an RCT finds one treatment better than another, we do not know which components made a difference; if there is no difference, we do not know whether there are any components that effected an improvement.

This limitation of standard comparative RCTs also applies to their ability to resolve the relative contribution of specific versus common factors. One major issue concerns the difficulty of finding an adequate control arm to compare against a putative active treatment in order to distinguish the role of specific versus non-specific factors. Some comparative RCTs and meta-analyses have found that one therapy outperforms another ( 17 , 18 ), which proponents of specific factors have cited as evidence for specific treatment effects. However, proponents of the common factors model have counter-argued that the comparison treatments used are sometimes not bona fide therapies, defined as viable treatments that are based on psychological principles and delivered by trained therapists, and thus that the comparison is not a fair one. When comparisons are made between bona fide therapies, no differences in efficacy are found ( 19 ).

Relatedly, other designs have compared an active treatment to a psychotherapy placebo or attentional control, on the argument that any differential benefit observed for the active treatment must then be due to specific factors, because the effects of the attentional control can only be due to common factors. However, most psychotherapy placebos do not control for all of the common factors hypothesized to operate in therapy, and thus any difference found between a placebo and an active treatment could be due to specific factors, common factors, or some combination thereof ( 20 ). For example, it is hard to generate psychotherapy placebos that are exactly matched to active treatments in rationale and credibility without the placebo itself becoming a bona fide treatment. Similarly, psychotherapy placebos tend to differ from active treatments in the structure of the therapy, for example, the number and duration of sessions, the training of the therapist, the format of therapy, and the range of topics covered. A meta-analysis of comparative trials found larger effect sizes between active treatments and structurally inequivalent placebos than between active treatments and structurally equivalent placebos, for which the differences were negligible ( 20 ). These difficulties in finding matched placebo controls or bona fide comparison interventions have limited the conclusions that can be reached about the relative contribution of specific or common factors in parallel RCTs.

Attempts have also been made within RCTs to determine mechanisms by examining changes in putative mediators. For example, in trials of CBT, measures of change in negative thinking are examined as a mediator of symptom change. However, these mediational approaches are necessarily limited because they are indirect and correlational ( 7 ). Even if an intervening variable is found to statistically account for the relationship between the treatment and its outcome, this does not provide strong evidence of a mechanism of change, because it does not support a strong causal inference that the mediator influences the outcome. In such associations, the mediator may be a proxy for another variable or variables, and there may be an unknown or unmeasured variable related to both the outcome and the mediator. Ultimately, direct experimental manipulation of the relevant factor is required for strong causal inference, and this is not possible for multiple elements of psychological interventions within a parallel group comparative RCT.

Component Study Designs

One experimental approach that has been used to examine the specific elements of psychological interventions is the component study ( 9 ), in which the full intervention is compared with the intervention with at least one component removed (a dismantling study) or in which a component is added to an existing intervention to test whether it improves outcomes (an additive study) ( 21 ). In principle, this approach can enable a strong causal inference that a component has a direct effect on outcome if there is a significant difference in outcomes between the variant of the therapy with a component and the variant without that component.

Nonetheless, there are limitations of component designs. First, and critically, the component design does not necessarily test the main effect of a component, that is, the difference between the mean response in the presence of a particular component and the mean response in its absence, collapsing over the levels of all remaining factors. This can be illustrated with reference to a seminal dismantling study, the dismantling study of CBT for depression by Jacobson and colleagues ( 22 ). In this study, patients with depression were randomized to the full CBT treatment package (behavioral activation, cognitive restructuring to modify negative automatic thoughts, and work on core schema), to behavioral activation plus cognitive restructuring, or to behavioral activation alone, with 50 patients in each arm. No significant differences were found among the three versions, leading some observers to suggest that behavioral activation alone is sufficient for the effects of CBT on depression. However, it is important to recognize that all versions of the treatment involved behavioral activation: as a consequence, the trial tests, for example, the effect of cognitive restructuring in the context of behavioral activation versus behavioral activation alone. It can only tell us the effect of a component in the context of the other components. Thus, the effects estimated are only simple effects of each component, with the remaining components set to one specific level. For cognitive restructuring, this design only reveals its effect in the presence of behavioral activation; it does not test the main effect of cognitive restructuring, that is, whether the presence of cognitive restructuring has a treatment effect relative to its absence. Similarly, because there is no condition without behavioral activation, it is not possible to estimate the main effect of behavioral activation.

Second, the component design assumes that there is no interaction between the components, that is, that the effect of one component is independent of the presence or absence of other components. This may not always be a realistic assumption. For example, it is possible that behavioral activation and cognitive restructuring either complement each other or are antagonistic to each other.

Third, there is a concern that most component studies are insufficiently powered to detect a difference between two potentially active treatment arms. For example, it has been estimated that, assuming a minimal clinically important difference for depression of d=0.24, a trial would need 274 participants in each condition.

The Factorial Approach

We propose the use of factorial and fractional factorial designs as an alternative methodological approach to standard comparative RCTs and component designs, one that has advantages over both for resolving the active components of psychotherapy. Factorial experiments allow one to estimate the main effects of factors and the interactions among factors ( 23 – 27 ).

Factorial designs systematically experimentally manipulate multiple components or factors of interest. Indeed, factorial designs are commonly used to test the role of different factors simultaneously in experimental psychology. As such, they meet the requirement for delineating active components raised by multiple commentators ( 8 , 10 , 14 ). For example, the Institute of Medicine (2015, p3-10) recently proposed that “determination of which elements are critical depends on testing of the presence or absence of individual elements in rigorous study designs,” which is exactly what a factorial design delivers.

To give a clinical example, if the Jacobson and colleagues dismantling study of CBT for depression were redesigned as a full factorial study, patients would be randomized across three factors: presence or absence of behavioral activation (BA+ vs BA-), presence or absence of cognitive restructuring (CR+ vs CR-), and presence or absence of work on core schema (CS+ vs CS-). Patients would thus be randomized, in balanced fashion, across 8 treatment cells reflecting all possible combinations: all three elements (BA+, CR+, CS+); 2 of the 3 elements (BA+, CR+, CS-; BA+, CR-, CS+; BA-, CR+, CS+); 1 of the 3 elements (BA+, CR-, CS-; BA-, CR+, CS-; BA-, CR-, CS+); or none of the elements (BA-, CR-, CS-). This design can test the main effect of each factor, as well as their interactions, by comparing the mean effects of combined sets of cells against each other. For example, comparing all 4 cells with BA against all 4 cells without BA tests the main effect of behavioral activation. The difference from the dismantling design is clear: the dismantling design includes only 3 of these 8 combinations (BA+, CR-, CS-; BA+, CR+, CS-; BA+, CR+, CS+), which limits it to testing simple effects.
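
To make the contrast with the dismantling design concrete, the following sketch enumerates the eight cells of such a 2x2x2 design and computes each main effect as the difference between the mean of the cells with a component present and the mean of the cells with it absent. The cell means are invented for illustration (they are not data from the Jacobson trial), and the factor labels simply follow the abbreviations used above.

```python
# Illustrative sketch of a 2x2x2 factorial design for three CBT components.
# The cell means below are invented for illustration; they are not trial data.
from itertools import product

import numpy as np

factors = ["BA", "CR", "CS"]  # behavioral activation, cognitive restructuring, core schema

# One row per cell: +1 means the component is present, -1 means it is absent.
cells = list(product([+1, -1], repeat=3))

# Hypothetical mean improvement (e.g., drop in a depression score) per cell,
# simulated with a built-in benefit when behavioral activation is present.
rng = np.random.default_rng(0)
cell_means = {cell: rng.normal(loc=5.0 + 2.0 * (cell[0] == 1), scale=1.0) for cell in cells}

def main_effect(factor_index):
    """Mean of cells where the factor is present minus mean where it is absent."""
    present = [m for cell, m in cell_means.items() if cell[factor_index] == +1]
    absent = [m for cell, m in cell_means.items() if cell[factor_index] == -1]
    return np.mean(present) - np.mean(absent)

for i, name in enumerate(factors):
    print(f"Main effect of {name}: {main_effect(i):+.2f}")

# A dismantling design, by contrast, includes only the cells
# (+1,-1,-1), (+1,+1,-1), (+1,+1,+1): BA is never absent, so only simple
# effects (e.g., CR in the presence of BA) can be estimated, never main effects.
```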

Factorial designs have been used extensively in engineering to optimize processes. In the last decade, they have been used to good effect in behavioral health, for example, in enhancing interventions for HIV care and prevention ( 28 ) and smoking cessation ( 29 , 30 ). This approach seems well-suited to furthering the understanding of psychological treatments and has recently been adopted in several trials ( 31 , 32 ). We believe that factorial designs have advantages for investigating how psychotherapy works that overcome many of the disadvantages noted earlier for comparative RCTs and component trials, as we outline throughout this paper.

A fractional factorial design is a variation on the factorial design that employs a systematic approach to reduce the number of experimental conditions to allow a more manageable study, at the cost of allowing only main effects and a pre-specified set of interactions to be tested. Fractional factorial designs require the assumption that higher-order interactions are negligible in size, because they are confounded, or aliased, with lower-order effects.

The IMPROVE-2 Study as an Example of a Factorial Design

We illustrate the use of a fractional factorial design to identify the active ingredients and mechanisms of an intervention with respect to a specific example, the IMPROVE-2 study (Implementing Multifactorial Psychotherapy Research in Online Virtual Environments) [see ( 32 ) for further detail]. The IMPROVE-2 study is a Phase III randomized, single-blind, balanced fractional factorial trial based in England and conducted on the internet. Adults with depression (operationalized as Patient Health Questionnaire-9 scores ≥ 10), recruited directly from the internet and from a UK National Health Service Improving Access to Psychological Therapies service, were randomized across seven experimental factors, each reflecting the presence versus absence of a specific treatment component within internet-delivered CBT guided by an online therapist (activity scheduling, functional analysis, thought challenging, relaxation, concreteness training, absorption, self-compassion training), using a 32-condition balanced fractional factorial design of Resolution IV (2^(7-2)) (see Table 1 ).


Table 1 Experimental groups of the IMPROVE-2 fractional factorial design.

All components involved brief, prescribed online support from a therapist to improve retention and adherence: secure online written feedback was provided at the end of each completed module (typically fortnightly), with the option for additional secure messaging between therapist and patient. Therapist feedback highlighted positive steps made, encouraged participants to continue practicing previously introduced components, addressed questions and homework, and pointed out areas to focus on in the next module. Therapists were low-intensity Psychological Wellbeing Practitioners and an experienced clinical psychologist.

The IMPROVE-2 trial used a fractional factorial design to retain the benefits of a factorial design while making the study more logistically manageable and feasible to deliver: the fractional design reduces the total number of conditions from 128 to 32. Each component has two "levels" to be compared: present or absent, that is, the respective treatment modules are either provided or not provided in the internet platform. IMPROVE-2 therefore tests the main effects and selected interactions of these 7 components within internet CBT for depression in order to determine its active ingredients. We first outline the general framework used for this study, the Multiphase Optimization Strategy (MOST), and then explore the particular benefits and methodological issues of using the factorial design to study psychotherapy.

The Multiphase Optimization Strategy (MOST)

Within IMPROVE-2, the factorial design is used as one stage within a wider framework for improving interventions, the Multiphase Optimization Strategy (MOST) ( 33 – 38 ). MOST, rooted in engineering, agriculture, and behavioral science, is a principled and comprehensive framework for optimizing and evaluating behavioral interventions.

MOST consists of three stages: a preparation stage, in which the relevant factors and components to be investigated are identified; an optimization stage, in which a factorial experiment is used to evaluate the main effects and interactions of each factor; and an evaluation stage, in which an optimized intervention based on the results of the factorial experiment is tested in an RCT. MOST has been used to enhance treatments for smoking cessation, with earlier factorial designs identifying active components ( 29 ) that were then combined into a novel intervention, which outperformed recommended standard care in an RCT ( 39 ). MOST is well-validated ( 29 , 30 , 34 , 40 ) and recommended within the Medical Research Council Complex Intervention guidelines ( 41 , 42 ). A key advantage is greater experimental efficiency, with a focus on identifying "active ingredients" versus inactive or extraneous components before moving on to large-scale comparative trials, so that fewer overall resources are required to answer the research questions in the long run than with the traditional approach ( 43 ). However, to date, MOST has not been applied to psychological interventions for mental health.

The IMPROVE-2 trial is one of the first attempts to apply the MOST approach to psychological interventions, having so far worked through the preparation and optimization phases. It combines the MOST approach with an internet delivery format for CBT to build treatment reach, scalability, and increased treatment coverage into the optimized treatment from the start, as the goal is to develop an optimized and scalable evidence-based treatment. Another benefit of internet-delivered therapy is that treatment content can be standardized and fixed, and written therapist responses can be closely demarcated, reducing unwanted "drift" from treatment protocols. This helps prevent potential contamination between different treatment components, an important consideration for a factorial design.

The Preparation Stage in MOST

During the preparation stage, a conceptual model for the intervention is developed, and discrete, distinct intervention components are selected. These components are then pilot tested for acceptability, feasibility, evidence of effectiveness, and ease of implementation, and refined as needed. MOST also involves identifying the optimization criterion, the operational definition of the target change that is used to judge the optimal intervention, subject to resource or other constraints. For example, this might be the greatest symptom improvement obtainable for a particular cost or within a particular duration of treatment.

With respect to the IMPROVE-2 study, a previous feasibility study (IMPROVE-1) established that it was feasible to maintain treatment integrity and fidelity across randomization into multiple treatment conditions and to avoid contamination across conditions. Because the IMPROVE-2 study is focused on determining which ingredients of internet CBT are most effective for treating major depression in adults, the optimization criterion was operationally defined as the largest reduction in depressive symptoms, indexed by change in Patient Health Questionnaire-9 (PHQ-9) ( 44 ) scores as the primary outcome.

Components Within the Psychological Intervention

A key step within the preparation phase is to identify the components to be targeted. When planning a factorial study, the best components to choose are those that are related to a specific conceptual model; that are distinct from each other in content, approach, or delivery method; that have some evidence of efficacy; that can be administered independently, that is, where one component is not dependent on another for delivery; and that are each hypothesized to address one or two theoretical mediators. In essence, it is important that components can be distinguished from each other in a meaningful way and that they are conceptually related to different mechanisms.

The elements or components selected can be at different levels of analysis and abstraction. The level selected will depend on the specific question or conceptual model. For example, for CBT, the components chosen could relate to the main hypothesized theoretical mechanisms of change and their associated elements, such as activity monitoring and scheduling and detecting and testing automatic thoughts. Alternatively, the components could relate to lower-level, more discrete elements within the treatment techniques such as the behavioral change techniques outlined in a recent taxonomy ( 45 ). These behavioral change techniques include behaviors such as self-monitoring, goal-setting, and feedback, which are common across different CBT components as well as other psychotherapy modalities. Alternatively, the components could relate to process-related aspects of therapy such as whether the intervention is therapist-supported versus unsupported, or structural aspects, such as the frequency of treatment sessions.

IMPROVE-2 illustrates the selection of components to be examined. Consistent with the principles above, the IMPROVE-2 study chose treatment components that were conceptually and operationally distinct from each other, so that each could be evaluated independently. Because this is the first attempt to disentangle the active components within CBT for depression, components were chosen that were clearly distinct and that could be linked to the main theorized mechanisms of action in CBT. These components were operationalized at a relatively high level (e.g., thought challenging to reflect cognitive theories of change; activity scheduling to reflect behavioral theories of change) rather than in terms of the more fine-grained behavioral change taxonomy, because the goal was to determine the core components relating to key theoretical conceptualizations of CBT and to maximize the likelihood of finding a positive effect. If, for example, thought challenging were found to be a strong active ingredient, further studies could dissect which elements, including more specific behavioral change techniques, are critical to its effects. Three of the components chosen had been identified as elements of CBT for depression using a Delphi technique ( 46 ): applied relaxation; activity monitoring and scheduling; and detecting and reality testing automatic thoughts. A further component, functional analysis, is a mainstay of behavioral approaches to depression, including behavioral activation ( 47 ). Three components related to recent treatment innovations in CBT derived from experimental research ( 48 , 49 ), each hypothesized to target a distinct mechanism arising from a different theoretical model: self-compassion, concreteness training, and absorption. The components selected relate to three theoretical accounts of how CBT might work: a behavioral account, a cognitive account, and a self-regulation account.

Three components related to behavioral models of depression and of how CBT works. Depression has been hypothesized to result from a reduction in response-contingent positive reinforcement ( 50 ), in which the individual with depression experiences less reward and sense of agency as a consequence of changing circumstances (e.g., loss), poor skills, or avoidance and withdrawal. Within the behavioral conceptualization, activity scheduling is hypothesized to increase response-contingent positive reinforcement by increasing the frequency of positive reinforcement through building up positive activities. This treatment component provides psychoeducation about the negative effects of avoidance, includes questionnaires to help patients identify their own patterns of avoidance, provides guidance on activity scheduling to build up positive activities and reduce avoidance (e.g., breaking plans into smaller steps; specifying when and where to implement activities), and includes exercises in which participants generate their own activity plans.

In parallel, functional analysis seeks to determine the functions of desired and unwanted behaviors and the contexts under which they do and do not occur, and thereby to find ways to systematically increase or reduce these behaviors: their antecedents, consequences, and variability are explored, and then the environment is altered to remove antecedent stimuli that trigger unwanted behaviors and/or incompatible, constructive alternative responses to these antecedents are practiced. This approach is based on Behavioral Activation (BA) ( 51 ) and rumination-focused CBT ( 49 ) approaches to depression. More specifically, functional analysis is proposed to target habitual avoidance and rumination by identifying antecedent cues, controlling exposure to these cues, and practicing alternative responses to them ( 52 ).

Absorption training is also hypothesized to increase response-contingent positive reinforcement, in this case by increasing direct contact with positive reinforcers. Absorption training teaches an individual to mentally engage with and become immersed in what he or she is doing in the present moment, to improve direct connection with the experience and enhance contact with positive reinforcers. It is designed to overcome the effects of detachment and rumination, which can prevent an individual from experiencing the benefits of doing positive activities. When delivered within the internet treatment, patients complete a behavioral experiment using audio-recorded exercises to compare visualizations of memories of being absorbed versus not being absorbed in a task, practice generating a more absorbed mind-set using downloadable audio exercises, and identify absorbing activities.

Two components within the factorial design are based on a cognitive conceptualization of depression, in which the negative thinking characteristic of depression is hypothesized to play a causal role in its onset and maintenance, and in which reducing negative thinking is therefore hypothesized to be an active mechanism in treating depression ( 53 , 54 ). Central within CBT for depression is the use of thought challenging, or cognitive restructuring, to reduce negative thinking ( 55 ), and this forms one component in the IMPROVE-2 trial. The internet treatment module that delivers the thought challenging component involves psychoeducation about negative automatic thoughts and cognitive distortions, vignettes of identifying and challenging negative thoughts, and written exercises in which patients practice identifying and then challenging negative thoughts using thought records.

The other cognitive-based component involves concreteness training, based on an intervention found to reduce symptoms of depression in a previous RCT ( 48 ) and derived from experimental research indicating the benefits of shifting into a concrete processing style ( 56 , 57 ). Within the IMPROVE-2 trial, the internet treatment module that delivers this component involves psycho-education about depression, rumination, and overgeneralization, a behavioral experiment using audio-recorded exercises to compare abstract versus concrete processing styles, and downloadable audio exercises to practice thinking about negative events in a concrete way. Unlike thought challenging, concreteness training does not test the accuracy or veridicality of negative thoughts but rather trains patients to focus on the specific and distinctive details, context, sequence (“How did it happen?”), and sensory features of upsetting events to reduce overgeneralization and improve problem-solving. Concreteness training is therefore hypothesized to specifically reduce the overgeneralization cognitive bias identified as important in depression ( 53 , 58 ).

The remaining treatment components are hypothesized to directly improve emotional regulation. Relaxation is hypothesized to improve self-regulation by targeting physiological arousal and tension. In IMPROVE-2, a variant of progressive muscle relaxation and breathing exercises was used to reduce physiological arousal and tension in response to warning signs, based on trial evidence that this intervention alone reduces depression ( 48 ). The treatment component introduces a rationale for relaxation, provides an online relaxation exercise as a behavioral experiment to test if it reduces tension, and a downloadable relaxation exercise.

Self-compassion training is proposed to activate the soothing and safeness emotional system, hypothesized to be downregulated in depression ( 59 ). Recent research has highlighted the potential benefit of increasing self-compassion in treatments for depression ( 49 , 60 – 62 ), although self-compassion training has not yet been directly tested within a full-scale clinical trial for patients with major depression. Within this treatment component, patients read psychoeducation about compassion, including useful self-statements to encourage and support oneself; complete a behavioral experiment comparing their own self-talk with how they talk to others; try an audio-recorded exercise, downloadable for further practice, in which they visualize past experiences of self-compassion to activate this mind-set and test its benefits; and identify activities they would do more of, and less of, in order to be kinder to themselves.

The Optimization Stage of MOST: Factorial Experiments and Their Benefits

The second stage of MOST involves optimization of the intervention, typically through a component selection experiment (sometimes called a component screening experiment) using a factorial or fractional factorial design. This factorial experiment is used to determine the individual effects of each component and any interactions between components. It is important to note that this step could involve multiple experiments and an iterative process of further refining the intervention. For example, if the first component screening experiment identified statistically significant moderators of treatment outcome, such as mode of treatment delivery or location of treatment, a further experiment could be conducted in which the moderators are introduced as factors into the factorial design, so that they are directly manipulated to enable stronger causal inference about their contribution to outcome.

Advantages of Factorial Design

There are at least four advantages to the use of a factorial design in resolving how therapy works and what its active mechanisms are.

Advantage 1: Directly Testing Individual Components and Their Interactions

The factorial experiment provides direct evidence about the effects and interactions of individual components within a treatment package, which is necessary for methodically enhancing and simplifying complex interventions ( 41 ). It can test each individual component and determine its main effect. Critically, it can also determine possible interactions between components, which other experimental designs are unable to do. Thus, a factorial design has distinct advantages when one needs to determine whether the presence of one component enhances or reduces the effect of another. This approach enables us to identify the active components of therapy and to select active and reject inactive/counter-productive components or elements. By comparing the presence versus absence of each component, this factorial design can examine the main effect of each component on the primary outcome, for example, testing whether thought challenging reduces symptoms of depression.

With respect to the IMPROVE-2 study, it is important to note that despite the many trials of CBT for depression, no trials have directly tested the main effect of each of the selected treatment components—for example, does thought challenging have a direct effect on reducing depression relative to no thought challenging? This design therefore provides the first fully-powered test of the main effects of these ingredients of CBT for depression. Table 1 describes the specific combinations of the two-level intervention factors in the experimental design.

To illustrate how the factorial design works, consider Table 1 . Main effects and interactions are estimated from aggregates across experimental conditions. For each main effect, half of the study population is randomized to one level of the factor (e.g., conditions 9–16 and 25–32, presence of concreteness training) and half to the other level (e.g., conditions 1–8 and 17–24, absence of concreteness training). Therefore, the main effect of concreteness training can be determined by comparing the average effect of conditions 9–16 and 25–32 against that of conditions 1–8 and 17–24.
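
The sketch below illustrates this aggregation for the concreteness training factor. The condition groupings follow Table 1 as described above, but the outcome values are simulated purely for illustration and are not IMPROVE-2 data.

```python
# Sketch of estimating a main effect by aggregating experimental conditions.
# The grouping of conditions follows the text (concreteness training present
# in conditions 9-16 and 25-32); the outcomes are simulated, not trial data.
import numpy as np

rng = np.random.default_rng(42)

# Simulated mean change in PHQ-9 for each of the 32 conditions (illustrative).
condition_outcomes = {c: rng.normal(loc=-6.0, scale=1.5) for c in range(1, 33)}

present = list(range(9, 17)) + list(range(25, 33))   # conditions 9-16, 25-32
absent = list(range(1, 9)) + list(range(17, 25))     # conditions 1-8, 17-24

main_effect = (np.mean([condition_outcomes[c] for c in present])
               - np.mean([condition_outcomes[c] for c in absent]))
print(f"Estimated main effect of concreteness training: {main_effect:+.2f} PHQ-9 points")
```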

Technically, the IMPROVE-2 study is an internet-delivered component selection experiment with seven experimental factors, each evaluated at two levels (presence of the component, coded as +1, versus absence, coded as -1; effect coding), using a 32-condition balanced fractional factorial design of Resolution IV (2^(7-2)). Effect coding is used because it ensures that main effects and interactions are independent.

A full factorial design with seven factors would have required 2^7 = 128 conditions, which was deemed impractical and too complex to program and administer, and thus a fractional factorial design was chosen. For IMPROVE-2, a 2^(7-2) fractional factorial design was selected, which reduces the number of experimental conditions by a factor of four, down to 32 conditions. While the full factorial design necessarily includes all possible combinations of all factors, within a fractional factorial design the researcher has to strategically and carefully select a subset of the available experimental conditions.

The first consideration when selecting the subset of experimental conditions is statistical: the design must remain balanced, with every factor occurring an equal number of times at each of its two levels, and with all factors orthogonal to each other. This necessarily limits the potential configurations of subsets available. Such designs can be mapped out using factorial design tables ( 63 ) or statistical packages (e.g., PROC FACTEX in SAS).
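
To illustrate what such a design table or package produces, the sketch below constructs a generic balanced 2^(7-2) design from one standard pair of Resolution IV generators (F = ABCD and G = ABDE, a textbook choice that is not necessarily the one used in IMPROVE-2) and checks the balance and orthogonality properties just described; the factor letters are placeholders rather than the IMPROVE-2 components. With these particular generators, each main effect is aliased only with two 4-way interactions and one 5-way interaction, a pattern of the kind described below.

```python
# Sketch: constructing a balanced 2^(7-2) fractional factorial design.
# Generators F = ABCD and G = ABDE are a standard Resolution IV choice taken
# from the design-of-experiments literature; IMPROVE-2's generators may differ.
from itertools import product

import numpy as np

# Full factorial in five "base" factors A..E: 2^5 = 32 runs, effect coded +/-1.
base = np.array(list(product([-1, 1], repeat=5)))
A, B, C, D, E = base.T

# Define the two remaining factors from the generators.
F = A * B * C * D
G = A * B * D * E

design = np.column_stack([A, B, C, D, E, F, G])  # 32 runs x 7 factors

# Balance: each factor occurs 16 times at +1 and 16 times at -1.
print("Runs at the +1 level per factor:", (design == 1).sum(axis=0))

# Orthogonality of the effect-coded factors: off-diagonal dot products are zero.
print("Max off-diagonal |dot product|:",
      np.abs(design.T @ design - 32 * np.eye(7)).max())

# Because the defining words (ABCDF, ABDEG, CEFG) all have length four or more,
# this is a Resolution IV design: main effects are not aliased with any
# two-way interaction, only with three-way and higher-order interactions.
```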

The second key consideration is to select the subset of experimental conditions that maximizes the ability to estimate the main effects and interactions of highest priority for the research question. Typically, estimating the main effects of the intervention components is the priority. In a fractional factorial design, some of the main effects will be confounded (typically referred to as "aliased" in the factorial literature) with higher-order interactions, and thus the subset of experimental conditions needs to be carefully selected so that main effects are aliased only with higher-order interactions judged less likely to be significant (e.g., 3-way or 4-way interactions) or of less theoretical interest.

For IMPROVE-2, the selected design allows the estimation of all main effects and several pre-specified 2-factor interactions among the seven intervention factors; in statistical terminology, it is a Resolution IV design because main effects are only aliased with 3-way and higher interactions. This means that if a potential effect is observed for a particular component, technically the observed effect is due to the sum of the main effect itself and the specific aliased higher-order interactions, i.e., the estimated lower-order effect may include contribution from these higher-order effects. For example, the main effect of concreteness is aliased with the 4-way interaction of functional analysis by compassion by absorption by thought challenging, and the 4-way interaction of functional analysis by compassion by relaxation by activity scheduling and the 5-way interaction of absorption by concreteness by relaxation by thought challenging by activity scheduling. Thus, the actual effect observed is due to the sum of the main effect plus the 4-way and 5-way interactions. If this comparison is significant, the most likely explanation is that the presence of concreteness training produces better treatment outcomes than the absence of concreteness training although we cannot rule out in the fractional design that configurations of 4 and 5 components, albeit unlikely, could contribute to this effect. In interpreting the results, the assumption is that the 3-way and higher interactions are highly likely to be negligible, based on extensive research and principles within factorial experiment research ( 27 , 63 ). Although in most cases this assumption is reasonable, it may not always apply.

In designing the study, several 2-way interactions were pre-specified as being of particular interest, where it was hypothesized that components might interact with each other, and the design was explicitly chosen so that these 2-way interactions were only aliased with 3-way or 4-way interactions, which we typically expect to be negligible. For example, it was hypothesized that activity scheduling and absorption treatment components may have a positive synergistic effect because the former increases the number of positive activities engaged in, whereas the latter increases the potential absorption and connection with these activities. Similarly, it was hypothesized that thought challenging and self-compassion components may have a positive synergistic effect because thought challenging helps individuals to look logically for evidence against and alternatives to negative self-critical thoughts, while self-compassion encourages a more kindly and tolerant approach to tackle self-criticism.

One choice within the design of the fractional factorial is whether or not it includes the experimental condition in which all intervention components are set to the low level or absent, i.e., a no-treatment control. For the purposes of investigating the active ingredients of therapy, this condition is not necessarily required, since the logic of the factorial experiment is not to compare all the conditions directly with each other, as we would in a comparative RCT, but rather to identify the active components by aggregating mean effects across each factor.

For IMPROVE-2, the fractional factorial design explicitly excluded the condition in which participants receive no treatment components. This has several potential advantages. First, because there is no no-treatment or treatment-as-usual condition, the design and trial were suitable for use in a clinical service, where it would not be possible or ethical to randomize patients to receive no active treatment. Second, because all participants are randomized to active treatment, they are more likely to remain engaged in the trial and not to judge that they are receiving the "inferior option," as can sometimes occur in control conditions.

Within the IMPROVE-2 fractional factorial design, all participants were randomized to receive at least one component of CBT and, in the majority of cases, 3 or 4 components. Based on the experience of the IMPROVE-1 feasibility study, in which many patients completed only their first few treatment modules, the IMPROVE-2 trial counterbalanced the order in which the modules delivering each treatment component were received in the internet platform, ensuring that each component was received equally often across participants as they progressed through the therapy. In this way, the number and order of treatment components were equivalent between the high (presence) and low (absence) levels of each factor. Of course, this leaves open the question of whether the order in which treatment components are received matters: given the iterative nature of the MOST approach, the effect of sequencing treatment components on efficacy could be a further question for a subsequent component screening experiment.

Advantage 2: Manipulation of Hypothesized Mechanisms and Examination of Individual Mediators

The factorial design allows research on working mechanisms and mediators that supports strong causal inference, because each factor associated with a hypothesized specific mechanism is manipulated, and the effect of manipulating this factor can be tested directly on secondary measures indexing the putative mediator. The design also enables examination of the mediators of each individual intervention component, because each factor is manipulated independently. For example, this design can test whether the presence of a thought challenging component has a main effect on reducing self-reported negative thinking relative to the absence of thought challenging, and whether this change in thinking mediates change in depression.

To maximize this opportunity to test mediators, the IMPROVE-2 trial required all patients to complete a series of self-report questionnaires indexing all of the putative mediators across all the treatment components, at baseline, at each follow-up assessment (12 weeks and 6 months post-randomization), and after each completed treatment module. For each treatment component, the putative mediator was the primary mechanism that the component is hypothesized to most strongly influence: rumination (5-item Brooding scale) ( 64 ) for the functional analysis component; overgeneralization (adapted Attitudes to Self Scale-Revised) ( 58 ) for the concreteness component; self-compassion (Self-Compassion Scale) ( 65 ) for the self-compassion component; negative thinking (Automatic Thoughts Questionnaire) ( 66 ) for the thought challenging component; increased behavioral activity and reduced avoidance (Behavioral Activation for Depression Scale Short-form) ( 67 ) for the activity scheduling component; and absorption and engagement in positive activities, adapted from measures of "flow" ( 68 ), for the absorption component. Mediational analyses can then test the hypothesis that each treatment component works primarily through its hypothesized mediator, using the analytical approach outlined by Kraemer et al. ( 69 ) and modern causal inference methods. In addition, IMPROVE-2 will investigate potential moderation of the treatment components by site, age, sex, severity of depression, co-morbid illness, and antidepressant use. This design enables us to test whether manipulating a particular component influences the underlying process it is hypothesized to change, and whether that process in fact mediates symptom change. Because all putative mediators are assessed for all components, we can also test whether components influence other processes, for example, whether components targeting behavior also change cognition, or vice versa.
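
As a rough illustration of what such a mediation test involves, the sketch below fits a simple regression-based mediation model for one component using simulated data. The variable names are hypothetical, the data are invented, and the model is deliberately simpler than the Kraemer et al. ( 69 ) and modern causal inference approaches specified for the trial.

```python
# Minimal mediation sketch for one component: does assignment to thought
# challenging (effect coded +1/-1) change negative thinking, and is that change
# associated with change in depression? Simulated data; not IMPROVE-2 results.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 736
tc = rng.choice([-1, 1], size=n)                     # thought challenging absent/present
neg_thinking = -0.3 * tc + rng.normal(size=n)        # putative mediator (change score)
phq9_change = 0.5 * neg_thinking - 0.1 * tc + rng.normal(size=n)  # outcome

# Path a: effect of the randomized component on the mediator.
path_a = sm.OLS(neg_thinking, sm.add_constant(tc)).fit()

# Path b and direct effect c': mediator -> outcome, adjusting for the component.
X = sm.add_constant(np.column_stack([tc, neg_thinking]))
path_b = sm.OLS(phq9_change, X).fit()

print("a  (component -> mediator):", round(path_a.params[1], 3))
print("b  (mediator -> outcome):  ", round(path_b.params[2], 3))
print("c' (direct effect):        ", round(path_b.params[1], 3))
```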

Advantage 3: Improved Delineation of Specific Versus Common Treatment Factors

The factorial design provides a stronger test of the relative contribution of specific versus non-specific common treatment factors than existing designs. As noted earlier, most control comparisons are inadequate for disentangling specific from non-specific treatment effects because of the difficulty of creating psychotherapy placebos (attentional controls) that match a bona fide psychotherapy for credibility, rationale, and structure. The factorial design overcomes this limitation because, for any treatment component (e.g., the relaxation component in IMPROVE-2), the aggregate of the conditions in which it is present (Table 1, conditions 17–32) is equivalent to the aggregate of the 16 conditions in which it is absent (Table 1, conditions 1–16) for treatment credibility, structure, delivery, rationale, therapist contact, therapist content and techniques, and therapist allegiance, except for the specific treatment component itself. Moreover, these aggregates are also matched for all of the other six treatment components, since these are balanced in the design. The evaluation of the main effect of relaxation involves comparing the average effect of the conditions in which relaxation is present with that of the conditions in which it is absent. This design therefore provides the strongest control condition available and one that is able to disentangle specific from non-specific common treatment factors. More specifically, this approach is a rigorous test of whether there are specific treatment effects arising from particular treatment components in addition to any non-specific factors common across the treatment components. If there is a significant main effect for any component in IMPROVE-2, then this is strong evidence for a specific treatment effect above and beyond all the non-specific common therapy factors present in CBT. The nature of the non-specific factors tested will depend on the specific components compared in the trial design: because IMPROVE-2 exclusively examines components within internet CBT, it confounds non-specific factors common across therapies (e.g., therapeutic alliance, rationale) with those specific to internet CBT and common to all components (e.g., self-monitoring; homework). A different study that took components from different treatment interventions could better delineate non-specific effects common to all therapies. This approach would not rule out some contribution of common factors to treatment outcome, as common factors would be matched across the two levels of the factor, but it would provide definitive evidence for a specific treatment effect. Conversely, if none of the components were found to have a significant main effect (assuming sufficient power), this would suggest that any treatment benefit was due to common factors.

Advantage 4: Factorial Designs Are Efficient and Economical

Factorial designs are efficient and economical compared to alternative designs such as individual experiments and single factor designs because they often require substantially fewer trials and participants to achieve the same statistical power for component effects, producing significant savings in recruitment, time, effort and resources ( 23 , 43 ).

For example, as an alternative to the factorial design used in IMPROVE-2, a research program could investigate each of the components separately in seven individual experiments, or conduct a comparative RCT or a component trial (dismantling or additive design). For IMPROVE-2, it was assumed that the minimal clinically important difference (MCID) would be a small effect size (Cohen's d, or standardized mean difference, = 0.2) for the main effect of an individual treatment component, or for an interaction between components, on pre-to-post change in depression. An alpha of 0.1 was chosen because this is recommended for component selection experiments to decrease the risk of Type II relative to Type I error when selecting treatment components, that is, to avoid prematurely ruling out potentially active treatment components ( 23 , 36 ). To detect an MCID of d = 0.20 with 80% power at α = 0.10 for each treatment component comparison, a total sample size of N = 632 was required (NQuery 7.0). Because participants provide at least five repeated measures on the primary outcome, latent growth curve modeling can be used, which was conservatively estimated to reduce the required sample size by 30% relative to using only the first and last time points in an analysis of covariance; numbers were then increased to account for an estimated 40% dropout post-treatment, giving a required total sample of N = 736 for the fractional factorial design.
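
The sketch below reproduces this sample-size reasoning approximately using statsmodels. Because the original calculation used NQuery 7.0 and a latent growth curve assumption, the printed figures will be close to, but not identical with, the reported N = 632 and N = 736.

```python
# Approximate reconstruction of the stated sample-size reasoning.
# The original used NQuery 7.0, so these figures are close to, but not
# identical with, the reported N = 632 and N = 736.
import math

from statsmodels.stats.power import TTestIndPower

# A main-effect contrast compares the half of the sample randomized to one
# level of a factor against the half randomized to the other level.
n_per_level = TTestIndPower().solve_power(effect_size=0.20, alpha=0.10,
                                          power=0.80, alternative="two-sided")
n_complete = 2 * math.ceil(n_per_level)

n_with_growth_model = n_complete * 0.70               # ~30% saving from latent growth modeling
n_to_recruit = math.ceil(n_with_growth_model / 0.60)  # inflate for 40% attrition
print(n_complete, n_to_recruit)
```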

However, the same MCID, power, and attrition assumptions apply to all other trial designs. Thus, each individual experiment would need 736 participants to be adequately powered to examine a single component: conducting seven separate experiments to investigate the seven components would require N = 5,152, or seven times as many participants as the factorial experiment. A parallel comparative RCT comparing each component against the others and against a no-treatment control would have 8 arms requiring 368 participants each, thus N = 2,944, or four times as many participants as the factorial experiment. Similar calculations apply for component experiments: for example, a dismantling study that compares a full treatment package (all seven treatment components combined) with incrementally dismantled packages, each with a further component removed (i.e., all components minus compassion; all components minus compassion and absorption, etc.), would have 7 arms (assuming there is no no-treatment control), each requiring 368 participants, thus N = 2,576, or 3.5 times as many participants as the factorial design.
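
For completeness, the comparative totals quoted above can be re-derived from the stated per-arm and per-experiment requirements, as in the short sketch below (all numbers are taken from the preceding paragraph).

```python
# Re-deriving the comparative sample-size totals quoted in the text.
n_factorial = 736   # one fractional factorial experiment (all seven components)
n_per_arm = 368     # participants per arm for a single pairwise comparison

print("Seven separate experiments:", 7 * n_factorial)   # 5,152
print("Eight-arm comparative RCT: ", 8 * n_per_arm)     # 2,944
print("Seven-arm dismantling trial:", 7 * n_per_arm)    # 2,576
```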

Factorial and fractional factorial designs are efficient and economical because, rather than making direct comparisons between individual experimental conditions as in the other designs, they compare means based on aggregated combinations of experimental conditions. To illustrate within IMPROVE-2, as indicated in Table 1, the main effect of concreteness training is estimated by comparing the aggregate of conditions 9–16 and 25–32, where it is present, against the aggregate of conditions 1–8 and 17–24, where it is absent; the main effect of relaxation is estimated by comparing the aggregate of conditions 1–16 against that of conditions 17–32; the main effect of thought challenging is estimated by comparing the aggregate of conditions 1, 4, 6, 7, 10, 11, 13, 16, 17, 20, 22, 23, 26, 27, 29, and 32 against that of conditions 2, 3, 5, 8, 9, 12, 14, 15, 18, 19, 21, 24, 25, 28, 30, and 31; and so on. In this way, all participants contribute to every effect estimate: the design effectively recycles each participant by placing him or her at one level of every factor. As such, the full sample size can be used to estimate each of the main effects, making this design efficient with respect to power and sample size.

The Evaluation Stage of MOST

The third stage in MOST is the evaluation of the optimized intervention. An optimized intervention is systematically built from the results of the factorial experiment by including the active components with the strongest effect sizes relative to the pre-specified optimization criterion and excluding weak, inert, or antagonistic components. This optimized intervention is then tested against the standard evidence-based treatment in a parallel comparative RCT. Thus, to be clear, the MOST approach retains the parallel comparative RCT as the best method to evaluate one treatment package against another, but adds the factorial design as the most efficient means to investigate the treatment components. In this way, the MOST framework uses rigorous design to identify the active elements of a treatment, build a potentially better therapy, and then test whether it is an improvement on existing active treatments.

IMPROVE-2 has not yet reached the optimized intervention and evaluation stage. Nonetheless, the logic is clear: based on the results of the IMPROVE-2 factorial experiment, a refined internet CBT treatment package would be produced by retaining the treatment components with the largest effect sizes for depression and removing those with minimal or even negative effect sizes. Both the Pareto principle and prior MOST studies suggest that there will be variability in the effect sizes of different components and their interactions, that not all components will contribute to the therapeutic benefit of CBT, and, indeed, that many will have negligible effect sizes ( 30 ). As such, it should be possible to concentrate the therapy elements to make CBT more potent and, at a minimum, more effective.

This process also considers any interactions between components. For example, if there were a significant positive two-way interaction between two components, such that adding one component to the other produced larger treatment effects than either alone, then both factors might be included in the treatment package. In contrast, if there were a significant negative, antagonistic interaction between two components, such that together the treatment benefit was less than that of either alone, the component with the weaker positive main effect would probably be removed from the treatment package.

If the estimated effect size of the optimized intervention from the component selection experiment looked favorable, the optimized intervention would then be tested against an established internet CBT for depression treatment package, to test whether these modifications improve treatment outcome. If the optimized intervention looked unlikely to outperform existing treatments in the modeling of the treatment estimates, or was found not to be superior in a subsequent comparative RCT, then the MOST logic is that further iterations through the three phases are needed. If this approach indicates that some but not all components within internet CBT for depression have a significant effect in reducing depression, it will lead to the building of better therapies that focus on the active ingredients and discard inert or iatrogenic elements.

Potential Limitations

The IMPROVE-2 trial is only one illustration of how the factorial approach could be used to delineate the active components of psychological therapies. As is true of any single study, it has specific limitations. First, it is relatively complex in utilizing seven components. This has the advantage of testing multiple putative active ingredients at once, but carries the risk that main treatment effects may be diluted in such a complex design. Adequate testing of treatment components in a factorial design requires each component to be delivered with a sufficient difference between its presence and absence to provide a fair test of its main effect. Because the components in IMPROVE-2 each reflect exposure to specific treatment content and techniques, participants need to receive a sufficient dose of the respective content and techniques, that is, to complete the relevant modules and practice the relevant behaviors. We sought to achieve this by delivering each component as a distinct module completed over several weeks, whose content and techniques are then referenced, checked, and practised in all subsequent modules and explicitly referred to in the subsequent written feedback from the therapist, to maintain their ongoing use. This means that the "dose" of treatment elements should be comparable to proven internet CBT treatments and sufficient for testing the main effects.

Nonetheless, there are alternative approaches to this issue. One way to increase treatment dose would be a simpler design with fewer treatment components, each running over multiple modules. Another is to test process-focused components, such as the degree or nature of therapist support (e.g., support versus no support), or structural components, such as the frequency of treatment sessions (e.g., weekly versus twice weekly), both of which involve keeping therapy content constant. Such designs straightforwardly deliver a sufficient difference between the presence and absence of the treatment component. Of course, selecting different components necessarily tests different hypotheses about the active ingredients of therapy. At this point, it remains an empirical question which of these different components contributes most to treatment outcome, and each approach is equally valid. This is why we strongly advocate multiple factorial trials testing these different dimensions, so that therapy can be enhanced systematically.

Related to this limitation, IMPROVE-2 used a fractional factorial design, which carries the risk of main effects being confounded with higher-order effects. While this risk is deemed very low, because 3-way and 4-way interactions are unlikely to be significant, a full factorial design would avoid this assumption. A full factorial would be more suitable for designs utilizing fewer components.

A further limitation of the IMPROVE-2 design is that all the components utilize a CBT framework and include generic CBT elements such as self-monitoring, planning, homework and homework review, Socratic review, building new activities, collaboration with the therapist, and a common CBT rationale focusing on thoughts and behavior. As such, if we were to find no main effects for any of the treatment components, we could not determine to what extent any treatment benefit observed was due to non-specific effects common across therapies (such as therapist alliance and remoralization) or to non-specific effects particular to CBT. Nonetheless, this design still provides a better matched control for investigating specific main effects than prior designs and can test whether there are any specific main effects. Either pattern of findings (one or more specific main effects of treatment components versus no main effects) would still be an advance on our current knowledge and could then be explored further within the MOST framework.

We have reviewed the importance of better understanding the mechanisms and active ingredients of psychological treatments in order to refine, condense, and strengthen the potency and effectiveness of these treatments. We have shown that standard comparative RCTs and component trials have limitations for determining the specific contributions of individual treatment components within a psychological treatment package and for inferring causality concerning treatment mechanisms. We have shown how factorial and fractional factorial trials can overcome these limitations and have the particular advantages of directly testing individual components and their interactions, of examining individual mediators and experimentally manipulating hypothesized mechanisms, of distinguishing specific from common treatment factors, and of being economical and efficient with respect to sample size and resources.

This approach has been illustrated with respect to the IMPROVE-2 trial ( 32 ), which will provide the first examination of the underlying active treatment components within internet CBT for depression. Identifying the active components of therapy will enhance our understanding of therapeutic mechanisms and potentially enable the systematic building of more effective interventions. The IMPROVE-2 trial has completed the recruitment, treatment, and follow-up stages, with 767 adult patients with depression recruited, and statistical analyses are underway. It is anticipated that these analyses will significantly extend our understanding of how CBT works. We believe that this innovative approach may provide a useful means to address recent calls for rigorous study designs to determine which elements within psychological interventions are core active components ( 4 , 7 , 10 , 11 ).

Data Availability Statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics Statement

The study protocol for IMPROVE-2 was reviewed and approved by the South West National Research Ethics Committee, NHS National Research Ethics Committee SW Frenchay (reference number 14/SW/1091, 30/4/2015). The trial sponsor is the University of Exeter; the contact person is Gail Seymour, Research Manager.

Author Contributions

EW and AN both designed, prepared, and delivered the IMPROVE-2 study. EW prepared the first draft of the manuscript, AN commented on the draft, and both EW and AN finalized the manuscript.

Funding

Funding for the IMPROVE-2 study was provided by grants from the Cornwall NHS Partnership Foundation Trust and South West Peninsula Academic Health Research Network. The funding sponsors did not participate in the study design; collection, management, analysis, and interpretation of data; or writing of the report. They did not participate in the decision to submit the report for publication, nor did they have ultimate authority over any of these activities.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors wish to thank all the staff within Cornwall NHS Partnership Foundation Trust who supported this research and all the patients who volunteered to participate in the IMPROVE-2 study.

1. Cuijpers P, Karyotaki E, Weitz E, Andersson G, Hollon SD, van Straten A. The effects of psychotherapies for major depression in adults on remission, recovery and improvement: A meta-analysis. J Affect Disord (2014) 159:118–26. doi: 10.1016/j.jad.2014.02.026


2. Cuijpers P, Turner EH, Mohr DC, Hofmann SG, Andersson G, Berking M, et al. Comparison of psychotherapies for adult depression to pill placebo control groups: a meta-analysis. Psychol Med (2014) 44(4):685–95. doi: 10.1017/S0033291713000457

3. Cuijpers P, Sijbrandij M, Koole S, Huibers M, Berking M, Andersson G. Psychological treatment of generalized anxiety disorder: A meta-analysis. Clin Psychol Rev (2014) 34(2):130–40. doi: 10.1016/j.cpr.2014.01.002

4. Holmes EA, Craske MG, Graybiel AM. A call for mental-health science. Nature. (2014) 511(7509):287–9. doi: 10.1038/511287a

5. Hollon SD, Munoz RF, Barlow DH, Beardslee WR, Bell CC, Bernal G, et al. Psychosocial intervention development for the prevention and treatment of depression: Promoting innovation and increasing access. Biol Psychiatry (2002) 52(6):610–30. doi: 10.1016/S0006-3223(02)01384-7

6. Andrews G, Issakidis C, Sanderson K, Corry J, Lapsley H. Utilising survey data to inform public policy: Comparison of the cost-effectiveness of treatment of ten mental disorders. Br J Psychiatry (2004) 184:526–33. doi: 10.1192/bjp.184.6.526

7. Kazdin AE. Mediators and mechanisms of change in psychotherapy research. In: Annual Review of Clinical Psychology, vol. 3. Palo Alto: Annual Reviews (2007). p. 1–27.


8. Cuijpers P, Reijnders M, Huibers MJH. The role of common factors in psychotherapy outcomes. Annu Rev Clin Psychol (2018) 15(5):20. doi: 10.1146/annurev-clinpsy-050718-095424


9. Cuijpers P, Cristea IA, Karyotaki E, Reijnders M, Hollon SD. Component studies of psychological treatments of adult depression: A systematic review and meta-analysis. Psychother Res (2019) 29(1):15–29. doi: 10.1080/10503307.2017.1395922

10. Holmes EA, Ghaderi A, Harmer CJ, Ramchandani PG, Cuijpers P, Morrison AP, et al. The Lancet Psychiatry Commission on psychological treatments research in tomorrow's science. Lancet Psychiatry (2018) 5(3):237–86. doi: 10.1016/S2215-0366(17)30513-8

11. Institute of Medicine. Psychosocial interventions for mental and substance use disorders: A framework for establishing evidence-based standards. Washington: The National Academies Press; (2015).

12. Kazdin AE. Evidence-based psychotherapies I: qualifiers and limitations in what we know. South Afr J Psychol (2014) 44(4):381–403. doi: 10.1177/0081246314533750

13. Kazdin AE, Blase SL. Rebooting psychotherapy research and practice to reduce the burden of mental illness. Perspect Psychol Sci (2011) 6(1):21–37. doi: 10.1177/1745691610393527

14. Mulder R, Murray G, Rucklidge J. Common versus specific factors in psychotherapy: opening the black box. Lancet Psychiatry (2017) 4(12):953–62. doi: 10.1016/S2215-0366(17)30100-1

15. Wampold BE. How important are the common factors in psychotherapy? An update. World Psychiatry (2015) 14(3):270–7. doi: 10.1002/wps.20238

16. DeRubeis RJ, Brotman MA, Gibbons CJ. A conceptual and methodological analysis of the nonspecifics argument. Clin Psychol-Sci Pract (2005) 12(2):174–83. doi: 10.1093/clipsy.bpi022

17. Gloaguen V, Cottraux J, Cucherat M, Blackburn IM. A meta-analysis of the effects of cognitive therapy in depressed patients. J Affect Disord (1998) 49(1):59–72. doi: 10.1016/S0165-0327(97)00199-7

18. Marcus DK, O'Connell D, Norris AL, Sawaqdeh A. Is the Dodo bird endangered in the 21st century? A meta-analysis of treatment comparison studies. Clin Psychol Rev (2014) 34(7):519–30. doi: 10.1016/j.cpr.2014.08.001

19. Wampold BE, Minami T, Baskin TW, Tierney SC. A meta-(re)analysis of the effects of cognitive therapy versus ‘other therapies’ for depression. J Affect Disord (2002) 68(2-3):159–65. doi: 10.1016/S0165-0327(00)00287-1

20. Baskin TW, Tierney SC, Minami T, Wampold BE. Establishing specificity in psychotherapy: A meta-analysis of structural equivalence of placebo controls. J Consulting Clin Psychol (2003) 71(6):973–9. doi: 10.1037/0022-006X.71.6.973

21. Bell EC, Marcus DK, Goodlad JK. Are the Parts as Good as the Whole? A Meta-Analysis of Component Treatment Studies. J Consulting Clin Psychol (2013) 81(4):722–36. doi: 10.1037/a0033004

22. Jacobson NS, Dobson KS, Truax PA, Addis ME, Koerner K, Gollan JK, et al. A component analysis of cognitive-behavioral treatment for depression. J Consulting Clin Psychol (1996) 64(2):295–304. doi: 10.1037/0022-006X.64.2.295

23. Collins LM, Dziak JJ, Kugler KC, Trail JB. Factorial experiments: Efficient tools for evaluation of intervention components. Am J Prev Med (2014) 47(4):498–504. doi: 10.1016/j.amepre.2014.06.021

24. Baker TB, Smith SS, Bolt DM, Loh WY, Mermelstein R, Fiore MC, et al. Implementing Clinical Research Using Factorial Designs: A Primer. Behav Ther (2017) 48(4):567–80. doi: 10.1016/j.beth.2016.12.005

25. Dziak JJ, Nahum-Shani I, Collins LM. Multilevel Factorial Experiments for Developing Behavioral Interventions: Power, Sample Size, and Resource Considerations. Psychol Methods (2012) 17(2):153–75. doi: 10.1037/a0026972

26. Chakraborty B, Collins LM, Strecher VJ, Murphy SA. Developing multicomponent interventions using fractional factorial designs. Stat Med (2009) 28(21):2687–708. doi: 10.1002/sim.3643

27. Collins LM, Dziak JJ, Li RZ. Design of Experiments With Multiple Independent Variables: A Resource Management Perspective on Complete and Reduced Factorial Designs. Psychol Methods (2009) 14(3):202–24. doi: 10.1037/a0015826

28. Gwadz MV, Collins LM, Cleland CM, Leonard NR, Wilton L, Gandhi M, et al. Using the multiphase optimization strategy (MOST) to optimize an HIV care continuum intervention for vulnerable populations: a study protocol. BMC Public Health (2017) 17(1):383. doi: 10.1186/s12889-017-4279-7

29. Piper ME, Fiore MC, Smith SS, Fraser D, Bolt DM, Collins LM, et al. Identifying effective intervention components for smoking cessation: a factorial screening experiment. Addiction (2016) 111(1):129–41. doi: 10.1111/add.13162

30. Strecher VJ, McClure JB, Alexander GL, Chakraborty B, Nair VN, Konkel JM, et al. Web-based smoking-cessation programs - Results of a randomized trial. Am J Prev Med (2008) 34(5):373–81. doi: 10.1016/j.amepre.2007.12.024

31. Uwatoko T, Luo Y, Sakata M, Kobayashi D, Sakagami Y, Takemoto K, et al. Healthy Campus Trial: a multiphase optimization strategy (MOST) fully factorial trial to optimize the smartphone cognitive behavioral therapy (CBT) app for mental health promotion among university students: study protocol for a randomized controlled trial. Trials (2018) 19(1):353. doi: 10.1186/s13063-018-2719-z

32. Watkins E, Newbold A, Tester-Jones M, Javaid M, Cadman J, Collins LM, et al. Implementing multifactorial psychotherapy research in online virtual environments (IMPROVE-2): study protocol for a phase III trial of the MOST randomized component selection method for internet cognitive-behavioural therapy for depression. BMC Psychiatry (2016) 16:13. doi: 10.1186/s12888-016-1054-8

33. Wilbur J, Kolanowski AM, Collins LM. Utilizing MOST frameworks and SMART designs for intervention research. Nurs Outlook (2016) 64(4):287–9. doi: 10.1016/j.outlook.2016.04.005

34. Collins LM, Baker TB, Mermelstein RJ, Piper ME, Jorenby DE, Smith SS, et al. The Multiphase Optimization Strategy for Engineering Effective Tobacco Use Interventions. Ann Behav Med (2011) 41(2):208–26. doi: 10.1007/s12160-010-9253-x

35. Collins LM, Murphy SA, Strecher V. The Multiphase Optimization Strategy (MOST) and the Sequential Multiple Assignment Randomized Trial (SMART) - New methods for more potent eHealth interventions. Am J Prev Med (2007) 32(5):S112–S8. doi: 10.1016/j.amepre.2007.01.022

36. Collins LM, Murphy SA, Nair VN, Strecher VJ. A strategy for optimizing and evaluating behavioral interventions. Ann Behav Med (2005) 30(1):65–73. doi: 10.1207/s15324796abm3001_8

37. Spring B, Phillips SM, Hoffman SA, Millstein RA, Collins LM. Optimization Experiments In The Field: The Most Framework Through 3 Clinical Trials. Ann Behav Med (2018) 52:S180–S. doi: 10.1093/abm/kay013

38. Collins LM, Kugler KC, Gwadz MV. Optimization of Multicomponent Behavioral and Biobehavioral Interventions for the Prevention and Treatment of HIV/AIDS. AIDS Behav (2016) 20:S197–214. doi: 10.1007/s10461-015-1145-4

39. Piper ME, Cook JW, Schlam TR, Jorenby DE, Smith SS, Collins LM, et al. A randomized controlled trial of an optimized smoking treatment delivered in primary care. Ann Behav Med (2018) 52(10):854–64. doi: 10.1093/abm/kax059

40. Schlam TR, Fiore MC, Smith SS, Fraser D, Bolt DM, Collins LM, et al. Comparative effectiveness of intervention components for producing long-term abstinence from smoking: a factorial screening experiment. Addiction (2016) 111(1):142–55. doi: 10.1111/add.13153

41. Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M. Developing and evaluating complex interventions: the new Medical Research Council guidance. Br Med J (2008) 337(7676). doi: 10.1136/bmj.a1655

42. Medical Research Council. Developing and evaluating complex interventions. (2008). https://mrc.ukri.org/documents/pdf/complex-interventions-guidance/

43. Collins LM, Chakraborty B, Murphy SA, Strecher V. Comparison of a phased experimental approach and a single randomized clinical trial for developing multicomponent behavioural interventions. Clin Trials (2009) 6(1):5–15. doi: 10.1177/1740774508100973

44. Kroenke K, Spitzer RL, Williams JBW. The PHQ-9 - Validity of a brief depression severity measure. J Gen Intern Med (2001) 16(9):606–13. doi: 10.1046/j.1525-1497.2001.016009606.x

45. Abraham C, Michie S. A taxonomy of behavior change techniques used in interventions. Health Psychol (2008) 27(3):379–87. doi: 10.1037/0278-6133.27.3.379

46. Roth AD, Pilling S. Using an Evidence-Based Methodology to Identify the Competences Required to Deliver Effective Cognitive and Behavioural Therapy for Depression and Anxiety Disorders. Behav Cogn Psychother (2008) 36(2):129–47. doi: 10.1017/S1352465808004141

47. Dimidjian S, Barrera M, Martell C, Munoz RF, Lewinsohn PM. The origins and current status of behavioral activation treatments for depression. In: Nolen-Hoeksema S, Cannon TD, Widiger T, editors. Annual Review of Clinical Psychology, vol. 7. Palo Alto: Annual Reviews (2011). p. 1–38.

48. Watkins ER, Taylor RS, Byng R, Baeyens C, Read R, Pearson K, et al. Guided self-help concreteness training as an intervention for major depression in primary care: a Phase II randomized controlled trial. Psychol Med (2012) 42(7):1359–71. doi: 10.1017/S0033291711002480

49. Watkins ER, Mullan E, Wingrove J, Rimes K, Steiner H, Bathurst N, et al. Rumination-focused cognitive-behavioural therapy for residual depression: phase II randomised controlled trial. Br J Psychiatry (2011) 199(4):317–22. doi: 10.1192/bjp.bp.110.090282

50. Ferster CB. Functional Analysis Of Depression. Am Psychol (1973) 28(10):857–70. doi: 10.1037/h0035605

51. Jacobson NS, Martell CR, Dimidjian S. Behavioral activation treatment for depression: Returning to contextual roots. Clin Psychol-Sci Pract (2001) 8(3):255–70. doi: 10.1093/clipsy.8.3.255

52. Watkins ER, Nolen-Hoeksema S. A Habit-Goal Framework of Depressive Rumination. J Abnormal Psychol (2014) 123(1):24–34. doi: 10.1037/a0035540

53. Beck AT. Cognitive Therapy and the Emotional Disorders. New York: International Universities Press; (1976).

54. Beck AT. Depression: Clinical, Experimental and Theoretical Aspects. New York: Hoeber; (1967).

55. Beck AT, Rush AJ, Shaw BF, Emery G. Cognitive Therapy of Depression. New York: Guilford Press; (1979).

56. Watkins ER. Constructive and unconstructive repetitive thought. Psychol Bull (2008) 134(2):163–206. doi: 10.1037/0033-2909.134.2.163

57. Watkins E, Moberly NJ, Moulds ML. Processing mode causally influences emotional reactivity: distinct effects of abstract versus concrete construal on emotional response. Emotion. (2008) 8(3):364–78. doi: 10.1037/1528-3542.8.3.364

58. Carver CS. Generalization, adverse events, and development of depressive symptoms. J Pers (1998) 66(4):607–19. doi: 10.1111/1467-6494.00026

59. Gilbert P. The origins and nature of compassion focused therapy. Br J Clin Psychol (2014) 53(1):6–41. doi: 10.1111/bjc.12043

60. Matos M, Duarte C, Duarte J, Pinto-Gouveia J, Petrocchi N, Basran J, et al. Psychological and Physiological Effects of Compassionate Mind Training: a Pilot Randomised Controlled Study. Mindfulness (2017) 8(6):1699–712. doi: 10.1007/s12671-017-0745-7

61. Gilbert P, Procter S. Compassionate mind training for people with high shame and self-criticism: Overview and pilot study of a group therapy approach. Clin Psychol Psychother (2006) 13(6):353–79. doi: 10.1002/cpp.507

62. Leary MR, Tate EB, Adams CE, Allen AB, Hancock J. Self-compassion and reactions to unpleasant self-relevant events: the implications of treating oneself kindly. J Pers Soc Psychol (2007) 92(5):887–904. doi: 10.1037/0022-3514.92.5.887

63. Wu CJ, Hamada MS. Experiments: planning, analysis, and optimization. 3rd ed. New York: John Wiley & Sons; (2011).

64. Treynor W, Gonzalez R, Nolen-Hoeksema S. Rumination reconsidered: A psychometric analysis. Cogn Ther Res (2003) 27(3):247–59. doi: 10.1023/A:1023910315561

65. Raes F, Pommier E, Neff KD, Van Gucht D. Construction and Factorial Validation of a Short Form of the Self-Compassion Scale. Clin Psychol Psychother (2011) 18(3):250–5. doi: 10.1002/cpp.702

66. Hollon SD, Kendall PC. Cognitive self-statements in depression: Development of an automatic thoughts questionnaire. Cogn Ther Res (1980) 4(4):383–95. doi: 10.1007/BF01178214

67. Manos RC, Kanter JW, Luo W. The Behavioral Activation for Depression Scale Short Form: Development and Validation. Behav Ther (2011) 42(4):726–39. doi: 10.1016/j.beth.2011.04.004

68. Engeser S, Rheinberg F. Flow, performance and moderators of challenge-skill balance. Motiv Emot (2008) 32(3):158–72. doi: 10.1007/s11031-008-9102-4

69. Kraemer HC, Wilson GT, Fairburn CG, Agras WS. Mediators and moderators of treatment effects in randomized clinical trials. Arch Gen Psychiatry (2002) 59(10):877–83. doi: 10.1001/archpsyc.59.10.877

Keywords: factorial, mechanism, psychotherapy, specific factors, common factors, cognitive behavioral therapy

Citation: Watkins ER and Newbold A (2020) Factorial Designs Help to Understand How Psychological Therapy Works. Front. Psychiatry 11:429. doi: 10.3389/fpsyt.2020.00429

Received: 19 June 2019; Accepted: 27 April 2020; Published: 14 May 2020.


Copyright © 2020 Watkins and Newbold. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Edward R. Watkins, [email protected]

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.


Factorial Experimental Design: A Comprehensive Guide For Researchers

  • September 30, 2021


Factorial design is a statistical method used in experimental research that helps you study the effects of multiple factors simultaneously. It allows you to investigate the influence of multiple variables on survey outcomes in a systematic manner. 

By capturing interactions between factors, factorial experimental designs provide insights into how these variables combine to produce certain outcomes, which can support a more nuanced interpretation of your survey findings.

Additionally, factorial designs are efficient for data collection. The method allows you to examine multiple factors within the same study, reducing the need to conduct separate experiments for each variable of interest.

In this blog, we will look at how this experimental design is applied in the real world and explore its advantages.


What is Factorial Experimental Design?

Some experiments involve the study of the effects of multiple factors. For such studies, the factorial experimental design is very useful. A full factorial design, also known as a fully crossed design, refers to an experimental design that consists of two or more factors, with each factor having multiple discrete possible values or “levels”.

Using this design, all the possible combinations of factor levels can be investigated in each replication. Although several factors can affect the variable being studied in factorial experiments, this design specifically aims to identify the main effects and the interaction effects among the different factors.


Example of factorial design in the real world

Let’s assume a company wants to launch a new product and is conducting a market research survey to understand the preferences of its target consumers. The objective of the survey is to identify the factors that may influence consumers’ purchase decisions and to use the insights to optimize the product’s features accordingly.

You can design the survey using factorial design principles by identifying the key factors that may influence purchase decisions, such as price, product features, marketing/advertising channels, brand reputation, and brand awareness.

You can manipulate each factor at different levels, such as high or low prices, marketing/advertising platforms, and product features, resulting in multiple conditions. 

After gathering the survey responses, you can apply statistical analysis to examine the main effects of each factor and any interactions between factors.

This provides valuable insights into which factors have the greatest impact on consumer preferences.
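As a rough illustration of how such factor levels cross, the hypothetical Python sketch below (the factor names and levels are invented for this example) enumerates every condition of a small full factorial version of the survey.

```python
# Hypothetical sketch: three survey factors at two levels each
# give 2 x 2 x 2 = 8 conditions in a full factorial design.
from itertools import product

factors = {
    "price": ["low", "high"],
    "features": ["basic", "premium"],
    "ad_channel": ["social media", "email"],
}

conditions = list(product(*factors.values()))
for i, combo in enumerate(conditions, start=1):
    print(i, dict(zip(factors.keys(), combo)))
print(f"{len(conditions)} conditions in total")
```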

Components of factorial experimental design

To understand the factorial experimental design, you must be well-acquainted with the following terms:

1. Factors: 

This is a broad term used to describe the independent variable that is manipulated in the experiment by the researcher or through selection. 

2. Main Effects: 

The main effect of a factor refers to the change produced in response to a change in the level of the factor. Therefore, the effect of factor A is the difference between the average response at A1 and A2.

3. Interaction: 

An interaction between factors occurs when the difference in response between the levels of one factor is not the same at all levels of the other factor (a worked numerical sketch of main effects and an interaction follows at the end of this section).

There are three main types of interactions:

  • Antagonistic Interaction: Occurs when the main effects are non-significant but the interaction is significant; the two independent variables tend to reverse each other's effects.
  • Synergistic Interaction: Occurs when the higher level of one independent variable enhances the effect of the other.
  • Ceiling Effect Interaction: Occurs when the higher level of one independent variable reduces the differential effect of the other variable.

When there is a large interaction, the main effects have little practical meaning, as a significant interaction often masks the significance of the main effects.
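To make the definitions of main effects and interactions above concrete, here is a minimal numerical sketch in Python using made-up cell means from a 2 x 2 design (factors A and B, each at two levels).

```python
# Minimal numerical sketch with made-up cell means from a 2 x 2 factorial.
import numpy as np

# Rows = levels of A (A1, A2); columns = levels of B (B1, B2).
cell_means = np.array([
    [10.0, 12.0],   # A1B1, A1B2
    [14.0, 22.0],   # A2B1, A2B2
])

main_effect_A = cell_means[1].mean() - cell_means[0].mean()          # avg at A2 - avg at A1
main_effect_B = cell_means[:, 1].mean() - cell_means[:, 0].mean()    # avg at B2 - avg at B1

# Interaction: the effect of B at A2 minus the effect of B at A1.
interaction = (cell_means[1, 1] - cell_means[1, 0]) - (cell_means[0, 1] - cell_means[0, 0])

print(f"Main effect of A: {main_effect_A}")   # 7.0
print(f"Main effect of B: {main_effect_B}")   # 5.0
print(f"A x B interaction: {interaction}")    # 6.0
```

In this made-up example the effect of B is larger at A2 than at A1, which is exactly the situation in which interpreting main effects alone can be misleading.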


Types of factorial design

There are three main types of factorial designs, namely “Within Subject Factorial Design”, “Between Subject Factorial Design”, and “Mixed Factorial Design”. 

1. Within Subject Factorial Design: In this factorial design, all of the independent variables are manipulated within subjects.  

2. Between Subject Factorial Design: In the Between Subject Factorial Design, the subjects are assigned to different conditions and each subject only experiences one of the experimental conditions. 

3. Mixed Factorial Design: This design is most commonly used in the study of psychology. It is named the ‘Mixed Factorial Design’ because it has at least one Within Subject variable and one Between Subject variable.

Advantages of factorial experimental design

The following are a few advantages of using the factorial experimental design: 

1. Efficient: 

When compared to one-factor-at-a-time (OFAT) experiments, factorial designs are significantly more efficient and can provide more information at a similar or lower cost. They can also help identify optimal conditions more quickly than OFAT experiments can.

2. Comprehensive results: 

Researchers can use a factorial design to estimate the effect of a factor at several levels of the other factors. This yields conclusions that are valid over a range of different experimental conditions.

3. Flexibility: 

The factorial design offers flexibility in experimental design by allowing you to manipulate multiple variables simultaneously and evaluate various combinations of factor levels. The flexibility enables you to explore complex relationships between factors in a controlled setting.


4. Increased statistical efficiency: 

This experimental design is statistically efficient: it yields more precise estimates of main effects and interactions from fewer observations, because every observation contributes to the estimate of every factor's effect. This preserves the sensitivity to detect effects without increasing the sample size (a rough numerical sketch follows the final advantage below).

5. Enhanced external validity: 

Factorial experimental design enhances the external validity or generalizability of research findings by systematically manipulating multiple factors. When you employ factorial design in surveys, it provides insights that represent the complexities and nuances present in the natural environment. This increases the applicability of the survey findings to a broader population.
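The efficiency argument from point 4 can be sketched with some back-of-the-envelope arithmetic; the numbers below are hypothetical and assume that only main effects are of interest, with no power adjustment for interactions.

```python
# Rough arithmetic sketch (hypothetical numbers): every participant in a
# factorial trial contributes to estimating every main effect, whereas
# separate two-arm trials each need their own sample.
n_per_comparison = 128      # participants needed for one adequately powered two-arm comparison
n_factors = 3               # number of intervention components to test

separate_trials = n_factors * n_per_comparison      # one two-arm trial per component
factorial_trial = n_per_comparison                  # same N reused for each main effect

print(f"Separate two-arm trials: {separate_trials} participants")   # 384
print(f"Single 2^3 factorial:    {factorial_trial} participants")   # 128
```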


Factorial design is a powerful tool for survey research. By incorporating its principles into your survey, you can enhance the quality of the research findings, leading to more informed and actionable insights.


Full Factorial Design: Comprehensive Guide for Optimal Experimentation

June 4th, 2024

The full factorial design is a robust and informative experimentation approach used across industries.

At its core, this systematic method examines the collective effects of multiple factors on an outcome simultaneously.

Considering all combinations of factor levels provides a holistic understanding that goes beyond individual effects, illuminating the intricate relationships that shape the behavior of complex systems.

The design's strength lies in realistically capturing situations where variables interact in non-additive or nonlinear ways. Accounting for these interplays guards against oversimplification and supports better-informed decisions and process improvements.

From manufacturing to research, consider the occasions where exhaustively investigating factor relationships can meaningfully address a specific problem or opportunity.

Key Highlights

  • A comprehensive exploration of factor effects and interactions
  • Robust methodology for process understanding and optimization
  • Applications across diverse industries, including manufacturing, pharmaceuticals, and marketing
  • Rigorous statistical analysis techniques, such as ANOVA and regression modeling
  • Insights into factor-level optimization for enhanced process performance
  • Evaluation of alternative designs, including fractional factorial and response surface methodologies
  • Best practices for experimental design, data collection, and analysis

Introduction to Full Factorial Design

The full factorial design stands out as a comprehensive and robust approach, enabling researchers and practitioners to unlock valuable insights and drive meaningful change across diverse industries.

The full factorial design is a systematic way to investigate the effects of multiple factors on a response variable simultaneously.

Image: Full Factorial Design

By considering all possible combinations of factor levels, this experimental strategy provides a holistic understanding of not only the individual factor effects but also the intricate interactions that can shape outcomes in complex systems.

A full factorial design is an experimental design that considers the effects of multiple factors simultaneously on a response variable.

It involves manipulating all possible combinations of the levels of each factor, enabling researchers to determine the main effects of individual factors as well as their interactions statistically.

This comprehensive approach ensures that no potential interaction is overlooked, providing a complete picture of the system under investigation.

The key benefits of employing a full factorial design include:

  • Main effects: Researchers can identify which factors have the most significant impact on the response variable, allowing them to focus their efforts on the most influential factors.
  • Interaction effects: By accounting for interactions between factors, full factorial designs reveal how the effect of one factor depends on the level of another factor, providing insights into the complex relationships within the system.
  • Optimization: With a comprehensive understanding of the main effects and interactions, researchers can estimate the optimal settings for the independent variables, leading to the best possible outcome for the response variable.

The versatility of the full factorial design makes it a valuable tool across various industries, including:

  • Manufacturing: Optimizing processes, improving product quality, and reducing defects by identifying key variables and their interactions.
  • Pharmaceuticals: Formulating and developing drugs by assessing the effects of factors such as excipient concentrations, drug particle size, and processing conditions on bioavailability, stability, and release profiles.
  • Marketing: Optimizing promotional strategies by evaluating the effects of factors like ad content, media channels, target audience segments, and pricing on consumer response.

Fundamentals of Design of Experiments (DOE)

Before delving into the intricacies of full factorial design, it is essential to understand the fundamental principles of Design of Experiments (DOE), a systematic approach to investigating the relationships between input variables (factors) and output variables (responses).

DOE provides a structured framework for planning, executing, and analyzing experiments, ensuring reliable and insightful results.

Understanding Factors and Levels

In a DOE study, the variables that are manipulated or controlled by the experimenter are known as independent variables or factors.

These can be further classified into:

  • Numerical factors: Variables that can take on a range of numerical values, such as temperature, pressure, or time.
  • Categorical factors: Variables that have distinct, non-numerical levels, such as material type or production method.

The outcome or characteristic of interest that is measured and analyzed in an experiment is referred to as the response variable or dependent variable.

Common examples include product yield, strength, purity, or customer satisfaction.

Each independent variable in a DOE study can be set at different levels or values.

The choice of factor levels is crucial, as it determines the range of conditions under which the experiment is conducted.

Carefully selecting factor levels ensures that the study captures the relevant region of interest and provides meaningful insights into the system’s behavior.

Principles of DOE

Replication refers to the practice of repeating the same experimental run multiple times under identical conditions.

This allows researchers to estimate the inherent variability in the experimental process and ensures the reliability of the results by providing a measure of experimental error.

Randomization is the process of randomly assigning experimental runs to different factor level combinations.

This helps to mitigate the potential impact of nuisance variables and ensures that any observed effects can be attributed to the factors under investigation, rather than uncontrolled sources of variation.

Blocking is a technique used to account for known sources of variability in an experiment, such as differences in equipment, operators, or environmental conditions.

By grouping experimental runs into homogeneous blocks, researchers can isolate and quantify the effects of these nuisance variables, ensuring more precise estimates of the factor effects.
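As a small illustration of replication and randomization (a sketch only; the run counts are arbitrary), the Python snippet below builds a replicated 2 x 2 design and then randomizes the run order.

```python
# Sketch: a 2 x 2 design with 3 replicates per cell, executed in randomized
# order so that time-related nuisance variation is not confounded with the factors.
import itertools
import random

random.seed(0)
runs = [combo for combo in itertools.product(["low", "high"], repeat=2) for _ in range(3)]
random.shuffle(runs)  # randomized run order

for i, (factor_a, factor_b) in enumerate(runs, start=1):
    print(f"Run {i}: A={factor_a}, B={factor_b}")
```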

Types of Full Factorial Designs

Full factorial designs can be classified into different types based on the number of levels for each factor and the nature of the factors themselves.

Understanding the various types of full factorial designs is crucial for selecting the appropriate experimental strategy to address specific research questions or process optimization objectives.

2-Level Full Factorial Design

The 2-level full factorial design, where each factor has two levels (typically labeled as “low” and “high”), is commonly employed in screening experiments.

These experiments aim to identify the most significant factors influencing the response variable, allowing researchers to focus their efforts on the most promising factors in subsequent, more in-depth investigations.

By evaluating the main effects and interactions in a 2-level full factorial design, researchers can determine which factors have a statistically significant impact on the response variable.

This information is invaluable in prioritizing factors for further optimization or confirming their negligible influence, thereby streamlining the overall experimental process.
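The following is a minimal sketch of such a screening analysis in Python, using simulated data with arbitrary coefficients: it builds the 2^3 design matrix in coded units and estimates each main effect as the difference between the mean response at the high and low levels of that factor.

```python
# Sketch of a 2^3 screening experiment with simulated data: factors A and C
# are active, B is not; each main effect is the high-minus-low mean difference.
import itertools
import numpy as np

rng = np.random.default_rng(0)
X = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)  # columns A, B, C

# Hypothetical true model (coefficients chosen purely for illustration).
y = 50 + 4 * X[:, 0] + 0 * X[:, 1] + 2 * X[:, 2] + rng.normal(0, 0.5, len(X))

for name, col in zip("ABC", X.T):
    effect = y[col == 1].mean() - y[col == -1].mean()   # mean(y at +1) - mean(y at -1)
    print(f"Main effect of {name}: {effect:+.2f}")
```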

3-Level Full Factorial Design

Unlike the 2-level design, which assumes a linear relationship between factors and the response variable, the 3-level full factorial design allows for the investigation of quadratic effects.

These nonlinear effects can be important in scenarios where the response variable exhibits curvature or a peak/valley behavior within the explored factor ranges.

By incorporating three levels for each factor, researchers can model the curvature in the response surface more accurately.

This enhanced understanding of the system’s behavior enables more precise optimization and provides insights into potential optimal operating regions or factor level combinations.

Mixed-Level Full Factorial Design

In many real-world applications, experiments may involve a combination of categorical factors (e.g., material type, production method) and continuous factors (e.g., temperature, pressure).

The mixed-level full factorial design accommodates this scenario by allowing researchers to investigate the effects of both types of factors simultaneously, providing a comprehensive understanding of the system.

Analyzing Full Factorial Design Experiments

Once the experimental data has been collected, the next step is to analyze the results to gain insights into the main effects, interactions, and optimal factor level combinations.

Several statistical techniques are employed in the analysis of full factorial experiments, each serving a specific purpose and providing valuable information for process understanding and optimization.

Analysis of Variance (ANOVA)

Analysis of Variance (ANOVA) is a powerful statistical tool used to determine the significance of main effects (individual factor effects) and interaction effects (combined effects of multiple factors) on the response variable.

By partitioning the total variability in the data into components attributable to each factor and their interactions, ANOVA enables researchers to identify the most influential factors and their relationships.

ANOVA also provides a framework for hypothesis testing, allowing researchers to assess whether the observed effects are statistically significant or simply due to random variability.

This rigorous approach ensures that conclusions drawn from the experimental data are statistically valid and reliable.
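As a hedged sketch of what this looks like in practice (assuming the pandas and statsmodels packages are available), the snippet below runs a two-way ANOVA on simulated data from a replicated 2 x 2 factorial.

```python
# Sketch: two-way ANOVA on simulated 2 x 2 factorial data with 20 replicates
# per cell; the table reports F tests for the main effects of A and B and the
# A:B interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
rows = []
for a in (-1, 1):
    for b in (-1, 1):
        for _ in range(20):
            y = 10 + 2 * a + 1 * b + 1.5 * a * b + rng.normal(0, 1)
            rows.append({"A": a, "B": b, "y": y})
df = pd.DataFrame(rows)

model = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(anova_lm(model, typ=2))
```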

Regression Analysis

Regression analysis is another essential tool in the analysis of full factorial experiments.

It involves fitting a mathematical model to the experimental data, relating the response variable to the independent variables (factors) and their interactions.

This model can be used to predict the response variable for any combination of factor levels within the experimental region.

Once a suitable regression model has been obtained, researchers can employ optimization techniques to identify the factor level combinations that maximize or minimize the response variable.

These techniques often involve solving the regression equation subject to relevant constraints, enabling the determination of optimal operating conditions for the process under investigation.
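Here is a minimal sketch of that idea (simulated data, ordinary least squares via numpy): fit a main-effects-plus-interaction model, then evaluate the fitted model at each corner of the coded design region to find the best predicted setting.

```python
# Sketch of regression-based optimization over a 2 x 2 coded design region
# using simulated, purely illustrative data.
import itertools
import numpy as np

rng = np.random.default_rng(2)
corners = np.array(list(itertools.product([-1, 1], repeat=2)), dtype=float)
X_runs = np.repeat(corners, 10, axis=0)                                  # 10 replicates per corner
y = 20 + 3 * X_runs[:, 0] - 2 * X_runs[:, 1] + 1.5 * X_runs[:, 0] * X_runs[:, 1]
y = y + rng.normal(0, 1, len(y))

# Model matrix: intercept, A, B, A*B; fit by least squares.
M = np.column_stack([np.ones(len(X_runs)), X_runs[:, 0], X_runs[:, 1], X_runs[:, 0] * X_runs[:, 1]])
beta, *_ = np.linalg.lstsq(M, y, rcond=None)

# Predicted response at each corner of the design region.
preds = {tuple(int(v) for v in c): float(beta @ [1.0, c[0], c[1], c[0] * c[1]]) for c in corners}
best = max(preds, key=preds.get)
print("Fitted coefficients:", np.round(beta, 2))
print("Best predicted setting (A, B):", best, "->", round(preds[best], 2))
```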

Graphical Analysis & Full Factorial Design

Graphical analysis is a powerful tool for visualizing and interpreting the results of full factorial experiments.

Interaction plots are particularly useful for examining the presence and nature of interactions between factors.

These plots display the response variable as a function of one factor at different levels of another factor, allowing researchers to identify and understand complex relationships within the system.

Main effects plots, on the other hand, illustrate the individual impact of each factor on the response variable, providing a visual representation of the main effects.

These plots can aid in quickly identifying the most influential factors and assessing the relative importance of each factor in the experimental domain.
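A small sketch of an interaction plot using matplotlib is shown below (the cell means are made up): non-parallel lines indicate that the effect of factor A differs across levels of factor B.

```python
# Sketch: interaction plot from made-up cell means; non-parallel lines
# suggest an A x B interaction.
import matplotlib.pyplot as plt

a_levels = ["low", "high"]
means_b_low = [10, 14]    # mean response at B = low, for A = low/high
means_b_high = [12, 22]   # mean response at B = high, for A = low/high

plt.plot(a_levels, means_b_low, marker="o", label="B = low")
plt.plot(a_levels, means_b_high, marker="o", label="B = high")
plt.xlabel("Factor A")
plt.ylabel("Mean response")
plt.title("Non-parallel lines suggest an A x B interaction")
plt.legend()
plt.show()
```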

Advantages and Limitations of Full Factorial Design

While the full factorial design offers numerous advantages in terms of comprehensiveness and insight, it is important to recognize its limitations and potential drawbacks.

Understanding both the strengths and limitations of this experimental approach is crucial for making informed decisions and optimizing the trade-offs between resource allocation and the desired level of process understanding.

Advantages of Full Factorial Design

One of the primary advantages of the full factorial design is its ability to provide comprehensive insights into the system under investigation.

By considering all possible factor combinations, researchers can obtain a complete picture of the main effects, interactions, and potential curvature in the response surface, leading to a thorough understanding of the process dynamics.

Unlike some experimental designs that may overlook or confound interactions, the full factorial design explicitly accounts for interactions between factors.

This capability is particularly valuable in complex systems where the effect of one factor may depend on the level of another factor, allowing researchers to unravel these intricate relationships and optimize processes accordingly.

With a comprehensive understanding of the main effects and interactions, full factorial experiments enable researchers to estimate the optimal settings for the independent variables, leading to the best possible outcome for the response variable.

This optimization potential is invaluable in various industries, where process efficiency, product quality, and cost-effectiveness are paramount.

Limitations of Full Factorial Design

One of the primary limitations of the full factorial design is its resource-intensive nature.

As the number of factors and levels increases, the number of experimental runs required grows exponentially, leading to higher costs, longer experimental durations, and greater logistical challenges.

Related to the resource-intensive aspect, full factorial designs often require large sample sizes to ensure statistical validity and reliable estimates of main effects and interactions.

This can be particularly challenging in situations where resources are limited or experimental conditions are difficult to replicate.

The comprehensiveness of the full factorial design can also lead to an overwhelming amount of data, especially when dealing with numerous factors and levels.

Analyzing and interpreting such large datasets can be a daunting task, requiring advanced statistical techniques and computational resources.

Alternative Designs and Extensions

While the full factorial design is a powerful and comprehensive experimental strategy, some alternative designs and extensions can be employed depending on the specific requirements and constraints of the research or industrial application.

These alternative approaches can offer trade-offs between experimental complexity, resource requirements, and the level of information obtained.

Fractional Factorial Designs

Fractional factorial designs are a class of experimental designs that involve studying only a carefully chosen fraction of the full factorial design.

By sacrificing the ability to estimate certain higher-order interactions, fractional factorial designs can significantly reduce the number of experimental runs required, making them more resource-efficient.

Fractional factorial designs are particularly useful in screening experiments, where the primary goal is to identify the most influential factors before conducting more detailed investigations.

These designs can help researchers prioritize their efforts and allocate resources more effectively.
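The run-count savings are easy to quantify; the numbers below are purely illustrative.

```python
# Illustrative arithmetic: a full 2^7 factorial needs 128 runs,
# while a 2^(7-2) quarter fraction needs only 32.
n_factors = 7
full_runs = 2 ** n_factors
quarter_fraction_runs = 2 ** (n_factors - 2)
print(f"Full factorial:   {full_runs} runs")              # 128
print(f"Quarter fraction: {quarter_fraction_runs} runs")  # 32
```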

Response Surface Methodology

Response Surface Methodology (RSM) is a collection of statistical techniques used to model and optimize processes with multiple input variables.

The Central Composite Design (CCD) is a widely used RSM design that combines a factorial design with additional axial and center points, allowing for the estimation of quadratic effects and potential curvature in the response surface.

Another popular RSM design is the Box-Behnken Design, which is a spherical, rotatable, or nearly rotatable design.

This design is particularly efficient for exploring quadratic response surfaces and optimizing processes with three or more factors.

Unlike the Central Composite Design, the Box-Behnken Design does not include any points at the vertices of the cubic region defined by the upper and lower limits of the factors.

Taguchi Methods

Developed by Genichi Taguchi, the Taguchi methods are a set of techniques for robust parameter design and quality improvement.

One of the key elements of the Taguchi approach is the use of orthogonal arrays, which are a special class of fractional factorial designs.

Orthogonal arrays allow for the simultaneous investigation of multiple factors with a minimal number of experimental runs, making them an attractive option when resources are limited.

The Taguchi methods emphasize the concept of robust parameter design, which aims to identify factor-level combinations that minimize the variability in the response variable while achieving the desired target value.

This approach is particularly valuable in manufacturing and product development, where robustness to environmental and operational variations is critical for maintaining consistent performance and quality.

The full factorial design stands as a potent experimental strategy.

Considering every factor combination gives a complete picture of effects, interactions, and nonlinearities.

This understanding supports better process comprehension and optimization across many sectors.

As computational capabilities and statistical techniques advance, factorial applications are likely to expand.

Machine learning can help analyze experimental data efficiently, while adaptive designs can recalibrate based on real-time insights.

Sustainability priorities may also shape which factors are prioritized, maximizing resource efficiency and reducing environmental impact.

Full factorial designs offer a rigorous yet flexible way to extract rich information from complex systems.

Although exhaustive, the resulting knowledge can drive meaningful improvements in performance, quality, and workflows.

By following best practices, judiciously considering alternative designs, and keeping up with methodological innovations, researchers and practitioners can realize the full value of factorial experimentation in their respective fields.

Its capacity to illuminate intricate relationships deserves recognition and prudent application wherever circumstances permit comprehensive experimentation.


Factorial Designs Help to Understand How Psychological Therapy Works


Large amounts of research time and resources are spent trying to develop or improve psychological therapies. However, treatment development is challenging and time-consuming, and the typical research process followed (a series of standard randomized controlled trials) is inefficient and sub-optimal for answering many important clinical research questions. In other areas of health research, recognition of these challenges has led to the development of sophisticated designs tailored to increase research efficiency and answer more targeted research questions about treatment mechanisms or optimal delivery. However, these innovations have largely not permeated into psychological treatment development research. There is a recognition of the need to understand how treatments work and what their active ingredients might be, and a call for the use of innovative trial designs to support such discovery. One approach to unpacking the active ingredients and mechanisms of therapy is the factorial design, as exemplified in the Multiphase Optimization Strategy (MOST) approach. The MOST design allows identification of the active components of a complex multi-component intervention (such as CBT) using a sophisticated factorial design, enabling the development of more efficient interventions and elucidating their mechanisms of action. The rationale, design, and potential advantages of this approach will be illustrated with reference to the IMPROVE-2 study, which uses a fractional factorial design to investigate which elements (e.g., thought challenging, activity scheduling, compassion, relaxation, concreteness, functional analysis) within therapist-supported internet-delivered CBT are most effective at reducing symptoms of depression in 767 adults with major depression. By using this innovative approach, we can begin to work out which components within the overall treatment package are most efficacious on average, allowing us to build a more streamlined and potent overall therapy. This approach also has the potential to distinguish the role of specific versus non-specific common treatment components within treatment.

Introduction: The Need to Understand How Psychological Therapies Work

Psychological treatments for mental health disorders have been robustly established as evidence-based interventions through multiple clinical trials and meta-analyses ( 1 – 3 ). Nonetheless, there is a pressing need to further improve psychological interventions: even the best treatments do not work for everyone. Many patients do not have sustained improvement, and treatments need to be scaled up to tackle the global burden of poor mental health ( 4 ). For example, psychological treatments for depression only achieve remission rates of 30%–40% and have limited sustained efficacy (at least 50% relapse and recurrence) ( 1 , 5 ). Further, it is estimated that current treatments, if delivered optimally, would only reduce the burden of depression by one third ( 6 ). As such, psychological treatments for depression need to be significantly enhanced.

One pathway to improving the efficacy and effectiveness of therapies is to develop our understanding of how complex psychological interventions work. Despite having determined that a number of psychological treatments are effective, for example, cognitive-behavioral therapy (CBT), we still do not know how psychological treatments work. There is little evidence on the precise mechanisms through which psychological treatments work or on what the active ingredients of treatments are ( 7 – 10 ), especially for disorders involving general distress such as depression and generalized anxiety disorder. Historically, there has been little progress in specifying the active ingredients of CBT for depression, and as a consequence, there have been no significant gains in the effectiveness of CBT for depression for over 40 years.

Resolving the active mechanisms and active ingredients of psychological interventions has been repeatedly identified as a major priority for research ( 4 , 7 , 10 , 11 ). For example, the Institute of Medicine (2015) highlighted the need to identify the key elements of psychosocial interventions that causally drive their effects ( 11 ).

To be clear, we distinguish between the active components of therapy, operationalized as the active elements or ingredients within a therapy that produce clinical benefit, which could be therapist-based, client activities, specific techniques, or related to therapy structure and delivery, versus the active mechanisms of the therapy, operationalized as the underlying change processes that causally underpin therapeutic benefit. While active components will necessarily impact on one or more active mechanisms, knowing the most effective components of a therapy is distinct from knowing how this component leads to symptom change [i.e., its underlying mechanism(s)]. For example, in CBT, identifying behavioral activation as an active therapy component does not necessarily confirm that the mechanism-of-action is behavioral as behavioral activation may work through changing cognitions.

Understanding either the mechanisms or the active components of psychological treatments is important because each potentially enables the development of more direct, precise, potent, simpler, briefer, and more effective treatments. Understanding the active components of a psychological therapy is necessary in order to parse and distil the therapy down to what is essential and most engaging to patients.

Psychological treatments are complex interventions, typically made up of multiple elements and components, including the particular content and techniques of the therapy, the interaction between the therapist and patient, the structure of the therapy, and the mode and organization of delivery, each of which potentially acts via distinct mechanisms. Therapy is thus a complex multifactorial process. Any or none of these factors could contribute to the efficacy of an intervention, alone or in interaction with the other factors. It is therefore critical to determine which components within an intervention are beneficially active, which are inactive or inert, and which are iatrogenic, so that the intervention can be honed to become optimally effective by focusing on the active elements and removing irrelevant or unhelpful ones ( 12 ).

Relatedly, if we know the active mechanisms of an intervention, we may be able to adapt the intervention or develop novel approaches to more directly target this mechanism and, thereby, increase the efficacy of the intervention.

Because of the high prevalence of common mental health problems, there is also a scalability gap: there are not enough available therapists to tackle the global burden of poor mental health ( 13 ). It is therefore critical that ways are found to make treatments more efficient, scalable, and easier to train and disseminate. Understanding the underlying components of therapy and being able to remove unnecessary elements may make psychological therapies more effective and more cost-effective by streamlining and simplifying the treatment. For example, the same treatment benefit could be achieved from fewer sessions, enabling a greater number of patients to be treated by the same number of therapists. Understanding the critical active components of therapy will also help to adapt treatments for the alternative delivery means that are necessary for increased scalability (for example, conversion to self-help, lay provision, or digital interventions), without losing the core elements needed for efficacy. Understanding how therapy works will also make it easier to effectively train and disseminate therapies, facilitating wider treatment coverage. This understanding may also help to identify moderators of treatment outcome and to personalize therapy more effectively to each individual.

Common Versus Specific Treatment Factors

One key issue with respect to resolving the underlying mechanisms underpinning the efficacy of psychological treatments concerns the question of whether treatment works through specific versus non-specific common factors ( 8 , 14 ). Specific factors are procedures or techniques arising from the particular therapy approach, such as those typically described in structured treatment manuals, for example, cognitive restructuring in CBT; exposure in CBT for anxiety disorders. Common (or non-specific) factors are those that are hypothesized to be common across all psychological interventions. The most important of these include a positive and genuine relationship between the therapist and patient, engendering positive expectancies and hope in the patient, and a convincing rationale that explains the symptoms experienced and gives credible reasons for the treatment to be helpful ( 15 ). There is a long-standing and still unresolved debate between those who propose that psychotherapies mainly work through specific factors versus those who propose that psychotherapies mainly work through common factors.

One argument made in support of common factors is that different specific psychotherapies are generally not found to differ in efficacy, although this does not logically rule out that treatments may work via different mechanisms ( 16 ). A recent review concludes that there is as yet no conclusive evidence that either common or specific factors can be considered a validated working mechanism for psychotherapy; in other words, the evidence is insufficient to determine the role of either ( 8 ).

The relative contribution of common versus specific factors to the efficacy of psychological interventions has important implications for how therapists should be trained, how therapies should be delivered, and how treatment services should be organized. If the substantive part of the treatment effect is due to common factors, then therapy training should predominantly emphasize therapists learning how to develop a strong therapeutic relationship, present a convincing rationale, and so on. In parallel, therapy research should focus on understanding how to strengthen positive common factor effects. However, if specific factors are important, then these also need to be emphasized in training and delineated in further research. Furthermore, the more important specific factors prove to be, the greater the potential need to discriminate among therapies and select the therapy that matches the individual clinical presentation.

Methodologies to Examine the Mechanisms of Psychotherapy

Comparative Randomized Controlled Trials

One reason for limited progress in understanding the mechanisms of psychological treatments is the focus on parallel group comparative randomized controlled trials (RCTs). Parallel group RCTs are the gold standard for establishing whether an intervention outperforms another intervention or a control condition, and they are the best means for establishing the relative efficacy of one treatment versus another. However, they are not designed for investigating how interventions work or for identifying the active components of therapy. Because comparative RCTs can only compare the overall effects of each intervention package, they are neither intended nor able to provide information about the performance of the individual elements within complex multifactorial interventions. In standard comparative RCTs, all of the multiple treatment components and factors in an intervention package, and their hypothetical mechanisms, are aggregated and confounded together in the comparison of one treatment versus another. As a consequence, this design cannot test the specific main effects of treatment components or any possible synergistic or antagonistic interactions between individual components, limiting advances in mechanistic understanding. If an RCT finds one treatment better than another, we do not know which components made the difference; if there is no difference, we do not know whether any components produced an improvement.

This limitation of standard comparative RCTs also applies to their ability to resolve the relative contribution of specific versus common factors. One major issue concerns the difficulty in finding an adequate control arm to compare against a putative active treatment to distinguish the role of specific versus non-specific factors. Some comparative RCTs and meta-analyses have found that one therapy has outperformed another therapy ( 17 , 18 ), which proponents of specific factors have argued as evidence for specific treatment effects. However, proponents of the common factors model have counter-argued that sometimes the comparison treatments used are not bona fide therapies, defined as viable treatments that are based on psychological principles and delivered by trained therapists, and thus that this is not a fair comparison. When comparisons are made between bona fide therapies, no differences in efficacy are found ( 19 ).

Relatedly, other designs have compared an active treatment to a psychotherapy placebo or attentional control, on the argument that any differential beneficial effect observed for the active treatment must then be due to specific factors, as the effects of the attentional control can only be due to common factors. However, most psychotherapy placebos do not control for all the potential common factors hypothesized in therapy, and thus any difference found between a placebo and an active treatment could be due to either specific or common factors or some combination thereof ( 20 ). For example, it is hard to generate psychotherapy placebos that are exactly matched to active treatments in therapy rationale and credibility without the placebo itself becoming a bona fide treatment. Similarly, psychotherapy placebos tend to differ from active treatments with respect to the structure of the therapy, for example, the number and duration of sessions, training of the therapist, format of therapy, and range of topics covered. A meta-analysis of comparative trials found larger effect sizes between active treatments and structurally inequivalent placebos than between active treatments and structurally equivalent placebos, for which differences were negligible ( 20 ). These difficulties in finding matched placebo controls or bona fide comparison interventions have limited the conclusions that can be reached about the relative contribution of specific or common factors examined in parallel RCTs.

Attempts have also been made in RCTs to determine mechanisms by examining changes in putative mediators. For example, in trials of CBT, measures of change in negative thinking are examined as a mediator of symptom change. However, these mediational approaches are necessarily limited because they are still indirect and correlational ( 7 ). Even if an intervening variable is found to statistically account for the relationship between the treatment and its outcome, this does not provide strong evidence of a mechanism of change, because it does not support a strong causal inference that the mediator influences outcome. In such associations, the mediator may be a proxy for another variable or variables, and there may be an unknown or unmeasured variable that is related to both the outcome and the mediator. Ultimately, direct experimental manipulation of the relevant factor is required for strong causal inference, and this is not possible for multiple elements of psychological interventions within a parallel group comparison RCT.

Component Study Designs

One experimental approach that has been used to examine the specific elements of psychological interventions is the component study ( 9 ), in which the full intervention is compared with the intervention with at least one component removed (a dismantling study) or in which a component is added to an existing intervention to test whether it improves outcomes (an additive study) ( 21 ). In principle, this approach can enable a strong causal inference that a component has a direct effect on outcome if there is a significant difference in outcomes between the variant of the therapy with a component and the variant without that component.

Nonetheless, there are limitations of component designs. First and critically, the component design does not necessarily test the main effect of a component, that is, the difference between the mean response in the presence of a particular component and the mean response in its absence, collapsing over the levels of all remaining factors. This can be illustrated with reference to one of the seminal dismantling studies, the dismantling study of CBT for depression by Jacobson and colleagues ( 22 ). In this study, patients with depression were randomized to the full CBT treatment package including behavioral activation, cognitive restructuring to modify negative automatic thoughts, and work on core schema, or to behavioral activation plus cognitive restructuring, or to the behavioral activation element alone, with 50 patients in each arm. No significant difference was found between the three versions, leading some observers to suggest that behavioral activation alone is sufficient for the effects of CBT on depression. However, it is important to realize that all versions of the treatment involved behavioral activation: as a consequence, for example, the trial is testing the effect of cognitive restructuring in the context of behavioral activation versus behavioral activation alone. It can only tell us the effect of that component in the context of the other component. Thus, the effects estimated are only the simple effects of each component with the remaining component set to one specific level. For example, for cognitive restructuring, this design only reveals the effect of cognitive restructuring in the presence of behavioral activation. It does not test the main effect of cognitive restructuring, i.e., whether the presence of cognitive restructuring has a treatment effect relative to its absence. Similarly, because there is no condition without behavioral activation, it is not possible to estimate the direct main effect of behavioral activation.

Second, the component design assumes that there is no interaction between the components, that is, that the effect of one component is independent of the presence or absence of other components. This may not always be a realistic assumption. For example, it is possible that behavioral activation and cognitive restructuring either complement each other or are antagonistic to each other.

Third, there is a concern that most component studies are not sufficiently powered to detect a difference between two potentially active treatment arms. For example, it has been estimated that, assuming a minimal clinically important difference for depression of d = 0.24, a trial would need 274 participants in each condition.
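
For readers who wish to check this figure, the following sketch (in Python, using the statsmodels package) reproduces the per-arm sample size approximately, under the assumption, not stated above, of a conventional two-arm comparison with a two-sided alpha of .05 and 80% power.

```python
# Approximate check of the per-arm sample size quoted above.
# Assumptions (not stated in the text): two-arm comparison, two-sided
# alpha = .05, power = .80, equal allocation.
import math
from statsmodels.stats.power import TTestIndPower

n_per_arm = TTestIndPower().solve_power(effect_size=0.24, alpha=0.05, power=0.80)
print(math.ceil(n_per_arm))  # approximately 274 participants per condition
```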

The Factorial Approach

We propose the use of factorial and fractional factorial designs as an alternative methodological approach to standard comparative RCTs and component designs, which has advantages over both for resolving the active components of psychotherapy. Factorial experiments allow one to explore main effects of factors and interactions among factors ( 23 – 27 ).

Factorial designs systematically experimentally manipulate multiple components or factors of interest. Indeed, factorial designs are commonly used to test the role of different factors simultaneously in experimental psychology. As such, they meet the requirement for delineating active components raised by multiple commentators ( 8 , 10 , 14 ). For example, the Institute of Medicine (2015, p3-10) recently proposed that “determination of which elements are critical depends on testing of the presence or absence of individual elements in rigorous study designs,” which is exactly what a factorial design delivers.

To give a clinical example, if the dismantling study of CBT for depression by Jacobson and colleagues were redesigned as a full factorial study, patients would be randomized across three factors [presence or absence of behavioral activation (BA+ vs BA−); presence or absence of cognitive restructuring (CR+ vs CR−); presence or absence of work on core schema (CS+ vs CS−)]. This means that patients would be randomized, in a balanced way, across 8 treatment cells reflecting all of the possible combinations: all three elements (BA+:CR+:CS+); 2 of the 3 elements (BA+:CR+:CS−; BA+:CR−:CS+; BA−:CR+:CS+); 1 of the 3 elements (BA+:CR−:CS−; BA−:CR+:CS−; BA−:CR−:CS+); or none of these elements (BA−:CR−:CS−). This design can test the main effect of each factor, as well as their interactions, by comparing the mean effects of combined sets of cells against each other. For example, comparing all 4 cells with BA against all 4 cells without BA tests the main effect of behavioral activation. The contrast with the dismantling design is clear: the dismantling design includes only 3 of these 8 combinations (BA+:CR−:CS−; BA+:CR+:CS−; BA+:CR+:CS+), which limits it to testing simple effects only.
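
As an illustrative sketch (in Python, with invented cell means rather than trial data), the following code shows how the eight cells of such a 2 × 2 × 2 design would be combined to estimate a main effect, and why the three-cell dismantling design cannot do so.

```python
# Sketch of the hypothetical 2 x 2 x 2 redesign described above.
# Cell means are invented solely to show how cells are combined; they are
# not results from the Jacobson et al. trial.
from itertools import product
import numpy as np

rng = np.random.default_rng(0)
factors = ["BA", "CR", "CS"]
cells = list(product([+1, -1], repeat=3))                # 8 cells; +1 = present, -1 = absent
cell_mean = {cell: rng.normal(10, 2) for cell in cells}  # invented mean improvement per cell

def main_effect(i):
    """Mean of the 4 cells with factor i present minus the 4 cells with it absent."""
    present = np.mean([m for c, m in cell_mean.items() if c[i] == +1])
    absent = np.mean([m for c, m in cell_mean.items() if c[i] == -1])
    return present - absent

for i, name in enumerate(factors):
    print(name, round(main_effect(i), 2))

# The dismantling design observes only the cells (+1,-1,-1), (+1,+1,-1) and
# (+1,+1,+1): BA is never absent, so its main effect cannot be estimated, and
# the remaining contrasts are simple effects in the presence of BA.
```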

Factorial designs have been used extensively in engineering to optimize processes. In the last decade, they have been used to good effect in behavioral health, for example, in enhancing interventions for HIV care and prevention ( 28 ) and smoking cessation ( 29 , 30 ). This approach seems well-suited to furthering the understanding of psychological treatments and has recently been adopted in several trials ( 31 , 32 ). We believe that factorial designs have advantages for investigating how psychotherapy works that overcome many of the disadvantages noted earlier for comparative RCTs and component trials, as we will outline throughout this paper.

A fractional factorial design is a variation on the factorial design that employs a systematic approach to reduce the number of experimental conditions to allow a more manageable study, at the cost of allowing only main effects and a pre-specified set of interactions to be tested. Fractional factorial designs require the assumption that higher-order interactions are negligible in size, because they are confounded, or aliased, with lower-order effects.
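
A toy example (in Python; deliberately much smaller than the designs discussed later) makes the idea of aliasing concrete: in a half fraction of a three-factor design built with the generator C = A × B, the column for factor C is identical to the column for the A × B interaction, so the two effects cannot be separated within that fraction.

```python
# Toy half fraction of a 2^3 design (not the IMPROVE-2 design), built with
# the generator C = A * B.
import numpy as np

A = np.array([-1, +1, -1, +1])
B = np.array([-1, -1, +1, +1])
C = A * B   # generator: only 4 of the 8 possible runs are used

# The column for C coincides with the column for the A-by-B interaction,
# so the main effect of C is aliased with that interaction in this fraction.
print(np.array_equal(C, A * B))  # True
```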

The IMPROVE-2 Study as an Example of a Factorial Design

We illustrate the use of a fractional factorial design to identify the active ingredients and mechanisms of an intervention with respect to a specific example, the IMPROVE-2 study (Implementing Multifactorial Psychotherapy Research in Online Virtual Environments) [see ( 32 ) for further detail]. The IMPROVE-2 study is a Phase III randomized, single-blind, balanced fractional factorial trial based in England and conducted on the internet. Adults with depression (operationalized as Patient Health Questionnaire-9 scores ≥ 10), recruited directly from the internet and from a UK National Health Service Improving Access to Psychological Therapies service, were randomized across seven experimental factors, each reflecting the presence versus absence of a specific treatment component within therapist-guided internet-delivered CBT (activity scheduling, functional analysis, thought challenging, relaxation, concreteness training, absorption, self-compassion training), using a 32-condition balanced fractional factorial design (2^(7−2), Resolution IV) (see Table 1).

Table 1. Experimental groups of the IMPROVE-2 fractional factorial design.

Condition | Functional analysis | Concreteness training | Compassion | Absorption | Relaxation | Activity scheduling | Thought challenging
1 | no | no | no | no | no | yes | yes
2 | yes | no | no | no | no | no | no
3 | no | no | yes | no | no | no | no
4 | yes | no | yes | no | no | yes | yes
5 | no | no | no | yes | no | yes | no
6 | yes | no | no | yes | no | no | yes
7 | no | no | yes | yes | no | no | yes
8 | yes | no | yes | yes | no | yes | no
9 | no | yes | no | no | no | no | no
10 | yes | yes | no | no | no | yes | yes
11 | no | yes | yes | no | no | yes | yes
12 | yes | yes | yes | no | no | no | no
13 | no | yes | no | yes | no | no | yes
14 | yes | yes | no | yes | no | yes | no
15 | no | yes | yes | yes | no | yes | no
16 | yes | yes | yes | yes | no | no | yes
17 | no | no | no | no | yes | no | yes
18 | yes | no | no | no | yes | yes | no
19 | no | no | yes | no | yes | yes | no
20 | yes | no | yes | no | yes | no | yes
21 | no | no | no | yes | yes | no | no
22 | yes | no | no | yes | yes | yes | yes
23 | no | no | yes | yes | yes | yes | yes
24 | yes | no | yes | yes | yes | no | no
25 | no | yes | no | no | yes | yes | no
26 | yes | yes | no | no | yes | no | yes
27 | no | yes | yes | no | yes | no | yes
28 | yes | yes | yes | no | yes | yes | no
29 | no | yes | no | yes | yes | yes | yes
30 | yes | yes | no | yes | yes | no | no
31 | no | yes | yes | yes | yes | no | no
32 | yes | yes | yes | yes | yes | yes | yes

Every factor occurs an equal number of times at its high and low levels (i.e., the design is balanced) and all factors are orthogonal to each other. Each effect estimate involves all 32 of the conditions in Table 1, thereby maintaining the power associated with all participants. In this Resolution IV design, all main effects are aliased only with 3-way and higher-order interactions, and all 2-way interactions are aliased with other 2-way and higher-order interactions; interpretation rests on the assumption that non-negligible 3-way and higher-order interactions are unlikely. In contrast, in a standard RCT all main effects and interactions of the treatment components are confounded with one another.
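
These properties can be checked directly from Table 1. The sketch below (in Python) encodes the 32 rows of the table, confirms that each component appears in exactly 16 conditions and that all pairs of factor columns are orthogonal, and shows one example of the aliasing: the thought-challenging column coincides with the product of four other columns, i.e., a 4-way interaction, as expected in a Resolution IV fraction. The transcription of the table is ours and should be checked against the published design.

```python
# Check of the balance, orthogonality, and aliasing of the Table 1 design.
# Each string is one condition; letters follow the column order of Table 1:
# functional analysis, concreteness training, compassion, absorption,
# relaxation, activity scheduling, thought challenging ("y" = present).
import numpy as np

rows = [
    "nnnnnyy", "ynnnnnn", "nnynnnn", "ynynnyy", "nnnynyn", "ynnynny",
    "nnyynny", "ynyynyn", "nynnnnn", "yynnnyy", "nyynnyy", "yyynnnn",
    "nynynny", "yynynyn", "nyyynyn", "yyyynny", "nnnnyny", "ynnnyyn",
    "nnynyyn", "ynynyny", "nnnyynn", "ynnyyyy", "nnyyyyy", "ynyyynn",
    "nynnyyn", "yynnyny", "nyynyny", "yyynyyn", "nynyyyy", "yynyynn",
    "nyyyynn", "yyyyyyy",
]
X = np.array([[+1 if ch == "y" else -1 for ch in row] for row in rows])

# Balance: each factor is present in exactly 16 of the 32 conditions.
print((X == +1).sum(axis=0))   # [16 16 16 16 16 16 16]

# Orthogonality: the cross-product matrix is 32 on the diagonal, 0 elsewhere.
print(X.T @ X)

# Aliasing example: the thought-challenging column (index 6) equals the
# product of the functional analysis, concreteness, compassion, and
# absorption columns, so that main effect is aliased with a 4-way
# interaction, consistent with a Resolution IV fraction.
print(np.array_equal(X[:, 6], X[:, 0] * X[:, 1] * X[:, 2] * X[:, 3]))  # True
```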

All components involved brief, prescribed online therapist support to improve retention and adherence, in which secure online written feedback was provided at the end of each completed module (typically fortnightly), with the option of additional secure messaging between therapist and patient. Therapist feedback highlighted positive steps made, encouraged participants to continue practicing previously introduced components, addressed questions and homework, and pointed out areas to focus on in the next module. Therapists were low-intensity Psychological Wellbeing Practitioners and an experienced clinical psychologist.

The IMPROVE-2 trial used a fractional factorial design to retain the benefits of a factorial design while making the study more logistically manageable and feasible to deliver: this fractional factorial design reduces the total number of conditions from 128 to 32. Each component has two “levels” to be compared in the fractional factorial design: either present or absent, i.e., the respective treatment modules are either provided or not provided in the internet platform. IMPROVE-2 therefore tests the main effects and selected interactions for these 7 components within internet CBT for depression to determine the active ingredients of internet CBT. We first outline the general framework used for this study—the Multiphase Optimization Strategy (MOST)—and then explore the particular benefits and methodological issues of using the factorial design to study psychotherapy.

The Multiphase Optimization Strategy (MOST)

Within IMPROVE-2, the factorial design is used as one stage within a wider framework for improving interventions—the Multiphase Optimization Strategy (MOST) ( 33 – 38 ) approach. MOST, rooted in engineering, agriculture, and behavioral science, is a principled and comprehensive framework for optimizing and evaluating behavioral interventions ( 33 – 38 ).

MOST consists of three stages: a preparation stage, in which the relevant factors and components to be investigated are identified; an optimization stage, in which a factorial experiment is used to evaluate the main effects and interactions of each factor; and an evaluation stage, in which an optimized intervention based on the results of the previous trial is tested in an RCT. MOST has been used to enhance treatments for smoking cessation, with earlier factorial designs identifying active components ( 29 ), which were then combined into a novel intervention that outperformed recommended standard care in an RCT ( 39 ). MOST is well-validated ( 29 , 30 , 34 , 40 ) and recommended within the Medical Research Council Complex Intervention guidelines ( 41 , 42 ). A key advantage is greater experimental efficiency, with a focus on identifying "active ingredients" versus "inactive" or extraneous components before moving on to large-scale comparative trials, resulting in fewer overall resources being required to answer the research questions in the long run than with the traditional approach ( 43 ). However, to date, MOST has not been applied to psychological interventions for mental health.

The IMPROVE-2 trial is one of the first attempts to apply the MOST approach to psychological interventions, building on the preparation and optimization phases so far. It incorporates the MOST approach with an internet delivery format for CBT to build in treatment reach, scalability, and increased treatment coverage for the optimized treatment from the start, as the goal is to develop an optimized and scalable evidence-based treatment. Another benefit of using such an internet-delivered therapy is that treatment content can be standardized and fixed, and written therapist responses can be closely demarcated, reducing unwanted “drift” from treatment protocols. This helps prevent potential contamination between different treatment components, which is an important consideration for a factorial design.

The Preparation Stage in MOST

During the preparation stage, a conceptual model for the intervention is developed, and discrete and distinct intervention components are selected. These components are then pilot tested for acceptability, feasibility, evidence of effectiveness, and ease of implementation, and refined as needed. MOST also involves the identification of the optimization criterion, which is the operational definition of the target change sought that is used to judge the optimal intervention, subject to resource or other constraints. For example, this might be greatest symptom improvement that could be obtained for a particular cost or for a particular duration of treatment.

With respect to the IMPROVE-2 study, a previous feasibility study (IMPROVE-1) established that it was feasible to maintain treatment integrity and fidelity across randomization into multiple treatment conditions and to avoid contamination across treatment conditions. Because the IMPROVE-2 study is focused on determining the ingredients of internet-CBT that are most effective for treating major depression in adults, the operational definition of the optimization criterion was the largest reduction in depressive symptoms, indexed by change in Patient Health Questionnaire-9 (PHQ-9) scores ( 44 ) as the primary outcome.

Components Within the Psychological Intervention

A key step within this preparation phase is to identify the components that are to be targeted. When planning a factorial study, the best components to choose are those that relate to a specific conceptual model; are distinct from each other in content, approach, or delivery method; have some evidence of efficacy; can be independently administered (i.e., one component is not dependent on another for delivery); and are hypothesized to address one or two theoretical mediators. In essence, it is important that components can be distinguished from each other in a meaningful way and that they are conceptually related to different mechanisms.

The elements or components selected can be at different levels of analysis and abstraction. The level selected will depend on the specific question or conceptual model. For example, for CBT, the components chosen could relate to the main hypothesized theoretical mechanisms of change and their associated elements, such as activity monitoring and scheduling and detecting and testing automatic thoughts. Alternatively, the components could relate to lower-level, more discrete elements within the treatment techniques such as the behavioral change techniques outlined in a recent taxonomy ( 45 ). These behavioral change techniques include behaviors such as self-monitoring, goal-setting, and feedback, which are common across different CBT components as well as other psychotherapy modalities. Alternatively, the components could relate to process-related aspects of therapy such as whether the intervention is therapist-supported versus unsupported, or structural aspects, such as the frequency of treatment sessions.

IMPROVE-2 illustrates the selection of components to be examined. Consistent with the principles above, the IMPROVE-2 study chose treatment components that were conceptually and operationally distinct from each other, so that each can be evaluated independently. As the first attempt to disentangle the active components within CBT for depression, components were chosen that were clearly distinct and that could be linked to the main theorized mechanisms of action in CBT. These components were operationalized at a relatively high level (e.g., thought challenging to reflect cognitive theories of change; activity scheduling to reflect behavioral theories of change) rather than in terms of the more localized behavioral change taxonomy, because the goal was to determine the core components relating to key theoretical conceptualizations of CBT and to maximize the likelihood of finding a positive effect. If, for example, thought challenging were found to be a strong active ingredient, then further studies could dissect which elements, including more specific behavioral change techniques, are critical to its effects. Three of the components chosen had been identified as elements of CBT for depression using a Delphi technique ( 46 ): applied relaxation; activity monitoring and scheduling; detecting and reality testing automatic thoughts. A further component, functional analysis, is a mainstay of behavioral approaches to depression including behavioral activation ( 47 ). Three components related to recent treatment innovations in CBT derived from experimental research ( 48 , 49 ), with each hypothesized to specifically target distinct mechanisms arising from different theoretical models: self-compassion, concreteness training, and absorption. The components selected relate to three theoretical accounts of how CBT might work: a behavioral account, a cognitive account, and a self-regulation account.

Three components related to behavioral models of depression and of how CBT works. Depression has been hypothesized to result from a reduction in response-contingent positive reinforcement ( 50 ), in which the individual with depression experiences less reward and sense of agency as a consequence of changing circumstances (e.g., loss), poor skills, or avoidance and withdrawal. Within the behavioral conceptualization, activity scheduling is hypothesized to increase response-contingent positive reinforcement by increasing the frequency of positive reinforcement through building up positive activities. This treatment component provides psychoeducation about the negative effects of avoidance, includes questionnaires to help patients identify their own patterns of avoidance, provides guidance on activity scheduling to build up positive activities and reduce avoidance (e.g., breaking plans into smaller steps; specifying when and where to implement activities), and includes exercises in which participants generate their own activity plans.

In parallel, functional analysis seeks to determine the functions and contexts under which desired and unwanted behaviors do and do not occur and, thereby, to find ways to systematically increase or reduce these behaviors by exploring their antecedents, consequences, and variability, and then either altering the environment to remove antecedent stimuli that trigger unwanted behaviors and/or practicing incompatible and constructive alternative responses to these antecedents. This approach is based on Behavioral Activation (BA) ( 51 ) and rumination-focused CBT ( 49 ) approaches to depression. More specifically, functional analysis is proposed to target habitual avoidance and rumination by identifying antecedent cues, controlling exposure to these cues, and practicing alternative responses to them ( 52 ).

Absorption training is also hypothesized to increase response-contingent positive reinforcement by increasing direct contact with positive reinforcers. Absorption training is focused on teaching an individual to mentally engage and become immersed in what he or she is doing in the present moment to improve direct connection with the experience and enhance contact with positive reinforcers. It is designed to overcome the effects of detachment and rumination which can prevent an individual experiencing the benefits of doing positive activities. When delivered within the internet treatment, patients complete a behavioral experiment using audio-recorded exercises to compare visualizations of memories of being absorbed versus not being absorbed in a task, practice generating a more absorbed mind-set using downloadable audio exercises, and identify absorbing activities.

Two components within the factorial design are based on a cognitive conceptualization of depression, in which the negative thinking characteristic of depression is hypothesized to play a causal role in the onset and maintenance of depression and, thereby, reducing negative thinking is hypothesized to be an active mechanism in treating depression ( 53 , 54 ). Central within CBT for depression is the use of thought challenging or cognitive restructuring to reduce negative thinking ( 55 ), and this forms one component in the IMPROVE-2 trial. The internet treatment module that delivers the thought challenging component involves psychoeducation about negative automatic thoughts and cognitive distortions, vignettes of identifying and challenging negative thoughts, and written exercises in which patients practice identifying and then challenging negative thoughts using thought records.

The other cognitive-based component involves concreteness training, based on an intervention found to reduce symptoms of depression in a previous RCT ( 48 ) and derived from experimental research indicating the benefits of shifting into a concrete processing style ( 56 , 57 ). Within the IMPROVE-2 trial, the internet treatment module that delivers this component involves psycho-education about depression, rumination, and overgeneralization, a behavioral experiment using audio-recorded exercises to compare abstract versus concrete processing styles, and downloadable audio exercises to practice thinking about negative events in a concrete way. Unlike thought challenging, concreteness training does not test the accuracy or veridicality of negative thoughts but rather trains patients to focus on the specific and distinctive details, context, sequence (“How did it happen?”), and sensory features of upsetting events to reduce overgeneralization and improve problem-solving. Concreteness training is therefore hypothesized to specifically reduce the overgeneralization cognitive bias identified as important in depression ( 53 , 58 ).

The remaining treatment components are hypothesized to directly improve emotional regulation. Relaxation is hypothesized to improve self-regulation by targeting physiological arousal and tension. In IMPROVE-2, a variant of progressive muscle relaxation and breathing exercises was used to reduce physiological arousal and tension in response to warning signs, based on trial evidence that this intervention alone reduces depression ( 48 ). The treatment component introduces a rationale for relaxation, provides an online relaxation exercise as a behavioral experiment to test whether it reduces tension, and provides a downloadable relaxation exercise.

Self-compassion training is proposed to activate the soothing and safeness emotional system, hypothesized to be downregulated in depression ( 59 ). Recent research has highlighted the potential benefit of increasing self-compassion in treatments for depression ( 49 , 60 – 62 ), although self-compassion has not yet been directly tested within a full-scale clinical trial for patients with major depression. Within this treatment component, patients read psychoeducation about compassion, including useful self-statements to encourage and support oneself; complete a behavioral experiment that compares their own self-talk with how they talk to others; try an audio-recorded exercise (downloadable for further practice) in which they visualize past experiences of self-compassion to activate this mind-set and test its benefits; and identify activities they would do more of, and activities they would do less of, to be kinder to themselves.

The Optimization Stage of MOST: Factorial Experiments and Their Benefits

The second stage of MOST involves optimization of the intervention, typically through a component selection experiment (sometimes called a component screening experiment), using a factorial or fractional factorial design. This factorial experiment is used to specifically determine the individual effects of each component and any interactions between components. It is important to note that this step could involve multiple experiments and an iterative process of further refining the intervention. For example, if the first component screening experiment observed statistically significant moderators of treatment outcome, such as mode of treatment delivery or location of treatment, a further experiment could be conducted in which the moderators are introduced as factors into the factorial experiment so that they are directly manipulated to enable stronger causal inference about their potential contribution to outcome.

Advantages of Factorial Design

There are at least four advantages to the use of a factorial design in resolving how therapy works and what its active mechanisms are.

Advantage 1: Directly Testing Individual Components and Their Interactions

The factorial experiment provides direct evidence about the effects and interactions of individual components within a treatment package, which is necessary for methodically enhancing and simplifying complex interventions ( 41 ). It can test each individual component and determine its main effect. Critically, it can also determine possible interactions between components, which other experimental designs are unable to do. Thus, a factorial design has distinct advantages when one needs to determine whether the presence of one component enhances or reduces the effect of another. This approach enables us to identify the active components of therapy and to select active and reject inactive/counter-productive components or elements. By comparing the presence versus absence of each component, this factorial design can examine the main effect of each component on the primary outcome, for example, testing whether thought challenging reduces symptoms of depression.

With respect to the IMPROVE-2 study, it is important to note that despite the many trials of CBT for depression, no trials have directly tested the main effect of each of the selected treatment components—for example, does thought challenging have a direct effect on reducing depression relative to no thought challenging? This design therefore provides the first fully-powered test of the main effects of these ingredients of CBT for depression. Table 1 describes the specific combinations of the two-level intervention factors in the experimental design.

To illustrate how the factorial design works, consider Table 1. Main effects and interactions are estimated from aggregates across experimental conditions. For each main effect, half of the study population is randomized to one level of the factor (e.g., conditions 9–16 and 25–32, in which concreteness training is present) and half is randomized to the other level (e.g., conditions 1–8 and 17–24, in which concreteness training is absent). Therefore, the main effect of concreteness training can be determined by comparing the average effect of conditions 9–16 and 25–32 versus conditions 1–8 and 17–24.

Technically, the IMPROVE-2 study is an internet-delivered component selection experiment with seven experimental factors, each evaluated at two levels (presence of the component, effect coded as +1, versus absence, coded as −1), using a 32-condition balanced fractional factorial design (2^(7−2), Resolution IV). Effect coding is used because it ensures that main effects and interactions are independent.
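
To make the estimation concrete, the following sketch (in Python, with simulated rather than trial data) regresses an invented outcome on seven effect-coded factor columns; with a balanced, orthogonal design, each coefficient corresponds to half the difference between the mean outcome when a component is present and when it is absent.

```python
# Sketch of main-effect estimation with effect coding, using simulated data.
# Assignment here is random rather than following the 32 Table 1 conditions,
# which is sufficient to illustrate the analysis.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 736
X = rng.choice([-1, +1], size=(n, 7))                          # 7 effect-coded components
true_effects = np.array([0.0, 1.0, 0.0, 0.5, 0.0, 1.5, 2.0])   # invented, not trial results
y = 8 + X @ true_effects + rng.normal(0, 5, size=n)            # simulated improvement scores

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.params.round(2))   # intercept, then one coefficient per component
```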

A full factorial design with seven factors would have required 2^7 = 128 conditions, which was deemed impractical and too complex to program and administer, and thus a fractional factorial design was chosen. For IMPROVE-2, a 2^(7−2) fractional factorial design was chosen, which reduces the number of experimental conditions by a factor of four, down to 32 conditions. While the full factorial design necessarily includes all possible combinations of all factors, within a fractional factorial design the researcher has to strategically and carefully select a subset of the available experimental conditions.

The first consideration when selecting the subset of experimental conditions is statistical: the design must remain balanced, with every factor occurring an equal number of times at each of its two levels, and with all factors orthogonal to each other. This necessarily limits the potential configurations of subsets available. These designs can be mapped out using factorial design tables ( 63 ) or statistical packages (e.g., PROC FACTEX in SAS).

The second key consideration is to select the subset of experimental conditions that maximizes the ability to estimate the main effects and interactions that are of highest priority for the research question. Typically, estimating the main effects of the intervention components is a priority. In a fractional factorial design, some of the main effects are going to be confounded (typically referred to as "aliased" in the factorial literature) with higher-order interactions, and thus the subset of experimental conditions needs to be carefully selected so that the main effects are only aliased with higher-order interactions that are judged less likely to be significant (e.g., 3-way or 4-way interactions) or of less theoretical interest.

For IMPROVE-2, the selected design allows the estimation of all main effects and several pre-specified 2-factor interactions among the seven intervention factors; in statistical terminology, it is a Resolution IV design because main effects are only aliased with 3-way and higher interactions. This means that if a potential effect is observed for a particular component, technically the observed effect is due to the sum of the main effect itself and the specific aliased higher-order interactions, i.e., the estimated lower-order effect may include a contribution from these higher-order effects. For example, the main effect of concreteness is aliased with the 4-way interaction of functional analysis by compassion by absorption by thought challenging, the 4-way interaction of functional analysis by compassion by relaxation by activity scheduling, and the 5-way interaction of absorption by concreteness by relaxation by thought challenging by activity scheduling. Thus, the actual effect observed is due to the sum of the main effect plus the 4-way and 5-way interactions. If this comparison is significant, the most likely explanation is that the presence of concreteness training produces better treatment outcomes than the absence of concreteness training, although we cannot rule out in the fractional design that configurations of 4 or 5 components, albeit unlikely, could contribute to this effect. In interpreting the results, the assumption is that the 3-way and higher interactions are highly likely to be negligible, based on extensive research and principles within factorial experiment research ( 27 , 63 ). Although in most cases this assumption is reasonable, it may not always apply.

In designing the study, several 2-way interactions were pre-specified as being of particular interest, where it was hypothesized that components might interact with each other, and the design was explicitly chosen so that these 2-way interactions were only aliased with 3-way or 4-way interactions, which we typically expect to be negligible. For example, it was hypothesized that activity scheduling and absorption treatment components may have a positive synergistic effect because the former increases the number of positive activities engaged in, whereas the latter increases the potential absorption and connection with these activities. Similarly, it was hypothesized that thought challenging and self-compassion components may have a positive synergistic effect because thought challenging helps individuals to look logically for evidence against and alternatives to negative self-critical thoughts, while self-compassion encourages a more kindly and tolerant approach to tackle self-criticism.

One choice within the design of the fractional factorial is whether or not it includes the experimental condition in which all intervention components are set to the low level or absent, i.e., a no-treatment control. For the purposes of investigating the active ingredients of therapy, this condition is not necessarily required, since the logic of the factorial experiment is not to compare all the conditions directly with each other, as we would in a comparative RCT, but rather to identify the active components by aggregating mean effects across each factor.

For IMPROVE-2, the fractional factorial design explicitly excluded the condition in which participants receive no treatment components. This has several potential advantages. First, it means that there is no no-treatment or treatment-as-usual condition, so the design and trial were suitable for use in a clinical service, where it would not be possible or ethical to randomize patients to receive no active treatment. Second, because all participants are randomized to active treatment, they are more likely to remain engaged in the trial and less likely to judge that they are receiving the "inferior option," as can sometimes occur in control conditions.

Within the IMPROVE-2 fractional factorial design, all participants were randomized to receive at least one component of CBT and, in the majority of cases, 3 or 4 components. Based on the experience of the IMPROVE-1 feasibility study, in which many patients only completed their first few treatment modules, the IMPROVE-2 trial counterbalanced the order in which the treatment modules delivering each component were received in the internet platform, to ensure that, as patients progressed through the therapy, each component was received equally often across all participants. In this way, the number and order of treatment components were equivalent between the high (presence) and low (absence) levels of each factor. Of course, this leaves open the question of whether the order in which treatment components are received matters: given the iterative nature of the MOST approach, the effect of sequencing treatment components on efficacy could be a further question for a subsequent component screening experiment.

Advantage 2: Manipulation of Hypothesized Mechanisms and Examination of Individual Mediators

The factorial design allows research on working mechanisms and mediators that supports strong causal inference, because each factor associated with a hypothesized specific mechanism is manipulated, and the effect of manipulating this factor can be tested directly on secondary measures indexing the putative mediator. The design also enables examination of the mediators of each individual intervention component, because each factor is manipulated independently. For example, this design can test whether the presence of a thought challenging component has a main effect on reducing self-reported negative thinking relative to the absence of thought challenging, and whether this change in thinking mediates change in depression.

To maximize this opportunity to test mediators, the IMPROVE-2 trial required all patients to complete a series of self-report questionnaires indexing all the putative mediators across all the treatment components at baseline, after each completed treatment module, and at each follow-up assessment (12 weeks and 6 months post-randomization). For each treatment component, the putative mediator was related to the primary mechanism that the component is hypothesized to most strongly influence: rumination (5-item Brooding scale) ( 64 ) for the functional analysis component; overgeneralization (adapted Attitudes to Self Scale – Revised) ( 58 ) for the concreteness component; self-compassion (Self-Compassion Scale) ( 65 ) for the self-compassion component; negative thinking (Automatic Thoughts Questionnaire) ( 66 ) for the thought challenging component; increased behavioral activity and reduced avoidance (Behavioral Activation for Depression Scale Short-form) ( 67 ) for the activity scheduling component; and absorption and engagement in positive activities, adapted from measures of "flow," for the absorption component ( 68 ). Mediational analyses can then be used to test the hypothesis that each treatment component works primarily through its hypothesized mediator, using the analytical approach outlined by Kraemer et al. ( 69 ) and modern causal inference methods. In addition, IMPROVE-2 will investigate potential moderation of the treatment components by site, age, sex, severity of depression, co-morbid illness, and antidepressant use. This design enables us to test whether manipulating a particular component influences the underlying process it is hypothesized to change, and whether that process in fact mediates symptom change. By assessing all putative mediators for all components, we can also test whether components influence other processes, e.g., whether components targeting behavior change cognition, or vice versa.
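
The following simplified sketch (in Python, with simulated data) illustrates the general logic of such a mediation test for one component; it is a product-of-coefficients illustration only, not the Kraemer et al. analysis specified for the trial, and all variable names and values are invented.

```python
# Simplified mediation sketch for one component (simulated data only).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 736
thought_challenging = rng.choice([-1, +1], size=n)     # effect-coded component
# Putative mediator: reduction in negative thinking, influenced by the component.
mediator = 1.0 * thought_challenging + rng.normal(0, 2, size=n)
# Outcome: improvement in depression, driven here (by construction) by the mediator.
improvement = 2.0 * mediator + rng.normal(0, 4, size=n)

# Path a: effect of the component on the mediator.
a = sm.OLS(mediator, sm.add_constant(thought_challenging)).fit().params[1]
# Path b: effect of the mediator on the outcome, adjusting for the component.
exog = sm.add_constant(np.column_stack([thought_challenging, mediator]))
b = sm.OLS(improvement, exog).fit().params[2]
print("indirect (a x b) effect:", round(a * b, 2))
```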

Advantage 3: Improved Delineation of Specific Versus Common Treatment Factors

The factorial design provides a stronger test of the relative contribution of specific versus non-specific common treatment factors than existing designs. As noted earlier, the majority of control comparisons are inadequate for disentangling specific from non-specific treatment effects because of the difficulty in creating psychotherapy placebos (attentional controls) that match a bona fide psychotherapy for credibility, rationale, and structure. However, the factorial design overcomes this limitation because, for any treatment component (e.g., the relaxation component in IMPROVE-2), the aggregate of the conditions where it is present (Table 1, conditions 17–32) is equivalent to the aggregate of the 16 conditions where it is absent (Table 1, conditions 1–16) in treatment credibility, structure, delivery, rationale, therapist contact, therapist content and techniques, and therapist allegiance, except for the specific treatment component itself. Moreover, these conditions are also matched in aggregate for all of the other six treatment components, since these are balanced in the design. The evaluation of the main effect of relaxation involves the comparison of the average effect for the conditions where relaxation is present versus the conditions where relaxation is absent. This design therefore provides the strongest control condition available and one that is able to disentangle specific from non-specific common treatment factors. More specifically, this approach is a rigorous test of whether there are specific treatment effects arising from particular treatment components in addition to any non-specific factors common across the treatment components. If there is a significant main effect for any component in IMPROVE-2, then this is strong evidence for a specific treatment effect above and beyond all the non-specific common therapy factors present in CBT. The nature of the non-specific factors tested will depend on the specific components compared in the trial design: because IMPROVE-2 exclusively examines components within internet-CBT, it confounds non-specific factors common across therapies (e.g., therapeutic alliance, rationale) with those specific to internet-CBT and common to all components (e.g., self-monitoring; homework). A different study that took components from different treatment interventions could better delineate non-specific effects common to all therapies. This approach would not rule out some contribution of common factors to treatment outcome, as common factors would be matched across the two levels of the factor, but a significant main effect would nonetheless be definitive evidence for a specific treatment effect. Conversely, if none of the components were found to have a significant main effect (assuming sufficient power), this would suggest that any treatment benefit was due to common factors.

Advantage 4: Factorial Designs Are Efficient and Economical

Factorial designs are efficient and economical compared to alternative designs such as individual experiments and single factor designs because they often require substantially fewer trials and participants to achieve the same statistical power for component effects, producing significant savings in recruitment, time, effort and resources ( 23 , 43 ).

For example, as an alternative to the factorial design used in IMPROVE-2, a research program could investigate each of the components separately in seven individual experiments, or conduct a comparative RCT or a component trial (dismantling or additive design). For IMPROVE-2, it was assumed that the Minimal Clinically Important Difference (MCID) would be a small effect size (Cohen's d, or standardized mean difference, = 0.2) for the main effect of an individual treatment component, or an interaction between components, on pre-to-post change in depression. An alpha of 0.1 was chosen, as this is recommended for component selection experiments to decrease the relative risk of Type II to Type I error when selecting treatment components, i.e., to avoid prematurely ruling out potentially active treatment components ( 23 , 36 ). To detect an MCID of d = 0.20 with 80% power at α = 0.10 for each component main effect, a sample size of N = 632 was required (NQuery 7.0). Because participants provide at least five repeated measures on the primary outcome, latent growth curve modeling can be used, which was conservatively estimated to reduce the required sample size by 30% relative to using only the first and last time points as in an analysis of covariance; numbers were then increased to account for an estimated 40% attrition post-treatment, giving a required total sample of N = 736 for the fractional factorial design.
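
The arithmetic behind these figures can be reconstructed approximately as follows (Python sketch; the original calculation used NQuery 7.0, so the numbers below are close to, but not identical with, those quoted).

```python
# Approximate reconstruction of the sample-size calculation described above.
import math
from statsmodels.stats.power import TTestIndPower

# Main effect of a component: half the sample at each level of the factor.
n_per_level = TTestIndPower().solve_power(effect_size=0.20, alpha=0.10, power=0.80)
print(2 * math.ceil(n_per_level))             # roughly 620, close to the quoted N = 632

# Apply the assumed 30% efficiency gain from latent growth curve modelling,
# then inflate for the assumed 40% post-treatment attrition.
print(round(632 * (1 - 0.30) / (1 - 0.40)))   # ~737, close to the quoted N = 736
```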

However, the same MCID, power, and attrition considerations apply to all the other trial designs. Thus, each individual experiment would need 736 participants to be adequately powered to examine its component: conducting seven separate experiments to investigate the seven components would require N = 5,152, or seven times as many participants as the factorial experiment. A parallel comparative RCT comparing each of the components against each other and against a no-treatment control would have 8 arms and require 368 participants per arm, thus requiring N = 2,944, or four times as many participants as the factorial experiment. Similar calculations apply to component experiments; for example, a dismantling study that compares a full treatment package (all seven treatment components combined) with incrementally dismantled packages, each with a component removed (i.e., all components minus compassion; all components minus compassion and absorption; etc.), would have 7 arms (assuming there is no no-treatment control), each requiring 368 participants, requiring N = 2,576, or 3.5 times as many participants as the factorial design.

Factorial and fractional factorial designs are efficient and economical because, rather than making direct comparisons between experimental conditions as in the other designs, the factorial design compares means based on aggregate combinations of experimental conditions. To illustrate within IMPROVE-2, as indicated in Table 1, the estimate of the main effect of concreteness training is based on comparing the aggregate of conditions 9–16 and 25–32, where it is present, versus the aggregate of conditions 1–8 and 17–24, where it is absent; the estimate of the main effect of relaxation is based on comparing the aggregate of conditions 1–16 versus the aggregate of conditions 17–32; the estimate of the main effect of thought challenging is based on comparing the aggregate of conditions 1, 4, 6, 7, 10, 11, 13, 16, 17, 20, 22, 23, 26, 27, 29, and 32 versus the aggregate of conditions 2, 3, 5, 8, 9, 12, 14, 15, 18, 19, 21, 24, 25, 28, 30, and 31; and so on. In this way, all participants are involved in every effect estimate: the design effectively recycles each participant by placing each participant at one of the levels of every factor. As such, the full sample can be used to estimate each of the main effects, making this design efficient in terms of power and sample size.

The Evaluation Stage of MOST

The third stage in MOST is the evaluation of the optimized intervention. An optimized intervention is systematically built from the results of the factorial experiment by including the active components with the strongest effect sizes relative to the pre-specified optimization criterion and excluding weak, inert, or antagonistic components. This optimized intervention is then tested against the standard evidence-based treatment in a parallel comparative RCT. Thus, to be clear, the MOST approach still retains the parallel comparative RCT as the best method to evaluate one treatment package against another, but adds the factorial design as the most efficient means to investigate the treatment components. In this way, the MOST framework uses rigorous design to identify the active elements of a treatment, build a potentially better therapy, and then test whether it is an improvement on existing active treatments.

IMPROVE-2 has not yet reached the optimized-intervention and evaluation stages. Nonetheless, the logic is clear: based on the results of the IMPROVE-2 factorial experiment, a refined internet CBT treatment package would be produced by retaining those treatment components that had the largest effect sizes for depression and removing those components that had minimal or even negative effect sizes. Both the Pareto principle and prior MOST studies suggest that there will be variability in the treatment effect sizes of different components and their interactions, that not all components will contribute to the therapeutic benefit of CBT, and, indeed, that many will have negligible effect sizes ( 30 ). As such, it should be possible to concentrate the therapy on its active elements to make CBT more potent or, at a minimum, more efficient.

This process also considers any potential interactions between components. For example, if there were a significant positive two-way interaction between two components, such that adding one component to the other produced larger treatment effects than either on its own, then both components would be retained in the treatment package. In contrast, if there were a significant negative, antagonistic interaction between two components, such that together the treatment benefit was less than either on its own, the component with the weaker positive main effect would probably be removed from the treatment package.
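A minimal sketch of this selection logic, using hypothetical effect-size estimates purely for illustration (a real decision would also weigh the pre-specified optimization criterion, uncertainty in the estimates, and delivery cost), might look as follows.

def select_components(main_effects, interactions, threshold=0.0):
    """Toy rule: keep components with positive main effects, then drop the weaker
    partner of any antagonistic (negative) interaction."""
    keep = {c for c, d in main_effects.items() if d > threshold}
    for (a, b), d in interactions.items():
        if d < 0 and a in keep and b in keep:
            keep.discard(a if main_effects[a] < main_effects[b] else b)
    return keep

# Hypothetical standardized effect sizes, for illustration only.
mains = {"concreteness": 0.30, "relaxation": 0.05, "thought_challenging": 0.20}
inters = {("relaxation", "thought_challenging"): -0.15}
print(select_components(mains, inters))  # {'concreteness', 'thought_challenging'}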

If the estimated effect size of the optimized intervention from the component selection experiment looked favorable, this optimized intervention would then be tested against an established internet CBT for depression treatment package, to determine whether the modifications improved treatment outcome. If the optimized intervention looked unlikely to outperform existing treatments in the modeling of the treatment estimates, or was found not to be superior in a subsequent comparative RCT, the MOST logic is that further iterations through the three phases are needed. If this approach indicates that some but not all components within internet CBT for depression have a significant effect on reducing depression, it will lead to the building of better therapies that focus on the active ingredients and discard inert or iatrogenic elements.

Potential Limitations

The IMPROVE-2 trial is only one illustration of how the factorial approach could be used to delineate the active components of psychological therapies. As is true for any single study, it has specific limitations. First, it is relatively complex, testing seven components. This has the advantage of examining multiple putative active ingredients at once, but carries the risk that, with such a complex design, main treatment effects may be diluted. Adequate testing of treatment components in a factorial design requires each component to be delivered with a sufficient difference between its presence and absence to provide a fair test of its main effect. Because the components in IMPROVE-2 each reflect exposure to specific treatment content and techniques, participants need to receive a sufficient dose of the respective content and techniques, that is, to complete the relevant modules and practice the relevant behaviors. We sought to achieve this by delivering each component as a distinct module completed over several weeks, whose content and techniques are then referenced, checked, and practiced in all subsequent modules and explicitly referred to in the subsequent written feedback from the therapist, to maintain their ongoing use. This meant that the “dose” of treatment elements should be comparable to proven internet CBT treatments and sufficient for testing the main effects.

Nonetheless, there are alternative approaches to tackling this issue. One way to increase treatment dose would be a simpler design with fewer treatment components, each running over multiple modules. Another is to test process-focused components such as the degree or nature of therapist support (e.g., support versus no support), or structural components such as the frequency of treatment sessions (e.g., weekly versus twice weekly), both of which involve keeping therapy content constant. Such designs straightforwardly deliver a sufficient difference between the presence and absence of the treatment component. Of course, selecting different components necessarily tests different hypotheses about the active ingredients of therapy, and at this point it remains an empirical question which of these components contributes most to treatment outcome. Each approach is equally valid. This is why we strongly advocate multiple factorial trials testing these different dimensions, so that we can systematically enhance therapy.

Related to this limitation, IMPROVE-2 used a fractional factorial design, which raises the potential risk of main effects being confounded (aliased) with higher-order interactions. Although this risk is deemed very low, because three-way and four-way interactions are unlikely to be significant, a full factorial design would avoid this assumption. A full factorial would be more suitable for designs with fewer components.
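The following short sketch (a generic Python example, not the actual IMPROVE-2 design) shows where the risk comes from: in a half fraction of a 2^3 design defined by the generator I = ABC, the column used to estimate the main effect of C is numerically identical to the column for the A-by-B interaction, so the two effects cannot be separated.

import itertools
import numpy as np

# Full 2^3 design in A, B, C (coded -1/+1): 8 runs.
full = np.array(list(itertools.product([-1, 1], repeat=3)))
A, B, C = full[:, 0], full[:, 1], full[:, 2]

# Half fraction defined by the generator I = ABC (keep the runs where A*B*C == +1): 4 runs.
half = full[A * B * C == 1]

print(np.array_equal(full[:, 2], full[:, 0] * full[:, 1]))  # False: in the full design, C and AxB are distinct columns
print(np.array_equal(half[:, 2], half[:, 0] * half[:, 1]))  # True: in the half fraction, C equals AxB, so they are aliased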

A further limitation of the IMPROVE-2 design is that all the components share a CBT framework and include generic CBT elements such as self-monitoring, planning, homework and homework review, Socratic review, building new activities, collaboration with the therapist, and a common CBT rationale focusing on thoughts and behavior. As such, if we were to find no main effects for any of the treatment components, we could not determine to what extent any observed treatment benefit was due to non-specific effects common across therapies (such as therapeutic alliance and remoralization) or to non-specific effects particular to CBT. Nonetheless, this design still provides a better matched control for investigating specific main effects than prior designs and for testing whether there are any specific main effects. Either pattern of findings (one or more specific main effects of treatment components versus no main effects) would be an advance on current knowledge and could then be explored further within the MOST framework.

We have reviewed the importance of better understanding the mechanisms and active ingredients of psychological treatments in order to refine, condense, and strengthen the potency and effectiveness of these treatments. We have shown that standard comparative RCTs and component trials have limitations for determining the specific contributions of individual treatment components within a psychological treatment package and for inferring causality concerning treatment mechanisms. We have shown how factorial and fractional factorial trials can overcome these limitations, with the particular advantages of directly testing individual components and their interactions, of examining individual mediators and experimentally manipulating hypothesized mechanisms, of distinguishing specific factors from common treatment factors, and of being economical and efficient with respect to sample size and resources.

This approach has been illustrated with respect to the IMPROVE-2 trial (32), which will provide the first examination of the underlying active treatment components within internet CBT for depression. Understanding the active components of therapy will enhance our understanding of therapeutic mechanisms and potentially enable the systematic building of more effective interventions. The IMPROVE-2 trial has completed the recruitment, treatment, and follow-up stages, with 767 adult patients with depression recruited, and statistical analyses are underway. It is anticipated that these analyses will significantly extend our understanding of how CBT works. We believe that this innovative approach may provide a useful means of addressing recent calls for rigorous study designs to determine which elements within psychological interventions are core active components (4, 7, 10, 11).

Ethics Statement

The study protocol for IMPROVE-2 was reviewed and approved by the NHS National Research Ethics Committee South West (Frenchay) (reference number 14/SW/1091; 30 April 2015). The trial sponsor is the University of Exeter; the contact person is Gail Seymour, Research Manager.

Author Contributions

EW and AN both designed, prepared, and delivered the IMPROVE-2 study. EW prepared the first draft of the manuscript, AN commented on the draft, and both EW and AN finalized the manuscript.

Funding for the IMPROVE-2 study was provided by grants from the Cornwall NHS Partnership Foundation Trust and the South West Peninsula Academic Health Research Network. The funding sponsors did not participate in the study design; the collection, management, analysis, or interpretation of data; or the writing of the report. They did not participate in the decision to submit the report for publication, nor did they have ultimate authority over any of these activities.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

The authors wish to thank profusely all the staff within Cornwall NHS Partnership Foundation Trust who supported this research and all the patients who volunteered to participate in the IMPROVE-2 study.


Design and Implementation of a Novel UAV-Assisted LoRaWAN Network


1. Introduction

1.1. Background and Significance

1.2. Literature Review

1.2.1. UAV-Assisted LoRa Network without LoRaWAN Specification

1.2.2. UAV-Assisted LoRaWAN Network

  • UAV Carrying End-Device as the Payload
  • UAV Carrying Gateway as the Payload

1.3. Motivation and Contribution

  • The UAVs only serve as flight carriers, and their communication capabilities are not used to assist LoRaWAN networks.
  • The altitude advantage of UAVs was leveraged to facilitate the deployment of LoRaWAN networks in complex environments. However, the integrated solution of “UAV + Remote Controller + Server”, which can enhance communication reliability and efficiently extend the coverage of the LoRaWAN network, was not considered.
  • Information interaction between UAVs and LoRaWAN gateways was not realized, including transparent data forwarding, clock synchronization, and GPS location acquisition, which limits the application scalability of existing UAV-assisted LoRaWAN networks.
  • Based on an analysis of the specific requirements of a UAV-assisted LoRaWAN network system, a UAV-assisted LoRaWAN network system architecture is proposed, effectively expanding LoRaWAN network coverage through the integrated solution of “UAV + Remote Controller + Server”.
  • A LoRaWAN gateway prototype, highly integrated with the UAV, has been developed to enhance the LoRaWAN network through the UAV's altitude and line-of-sight advantages. Various UAV resources are made available to the LoRaWAN gateway through the multiple interfaces of a PSDK adapter board, including UART, the PPS signal, and USB Type-C. In addition, a forwarding program called the UAV packet forwarder was designed to manage the forwarding task and UAV resources.
  • A relay solution based on the remote controller was proposed to further extend the coverage of the UAV. Using the Wi-Fi or cellular network and the UAV's communication resources, an MSDK-based app was developed to realize the relay features. Through monitoring and transparent forwarding for both the UAV and the LoRaWAN server, the coverage of the LoRaWAN network is effectively extended with a single relay.
  • A performance evaluation and a positioning demonstration were carried out in the real world with the proposed network. The superiority of the UAV-assisted LoRaWAN network is verified in the typical LoRaWAN applications of both data collection and positioning.

2. LoRaWAN Network and UAV Payload Development Technology

2.1. LoRaWAN Network

2.1.1. LoRaWAN End-Devices

2.1.2. LoRaWAN Gateway

2.1.3. LoRaWAN Server

  • ChirpStack Gateway Bridge
  • MQTT broker

2.2. UAV Payload Development Technology

2.2.1. Payload Software Development Kit (PSDK)

2.2.2. Mobile Software Development Kit (MSDK)

2.2.3. Integrated Solution of “UAV + Remote Controller + Server”

3. System Design

3.1. System Requirement Analysis

3.2. System Architecture

  • LoRaWAN End-Devices
  • UAV Gateway
  • Remote Controller
  • Cloud Center Server
  • Application Server

3.3. System Workflow

4. System Implementation

4.1. UAV Gateway Design

4.1.1. Hardware Design

  • PSDK adapter board
  • LoRaWAN gateway

4.1.2. Software Design

  • Thread Up, Thread Down and GWMP-based data forwarding
  • Thread GPS and clock synchronization

4.2. Remote Controller Relay Design

5. Performance Evaluation

5.1. Network Coverage and Communication Reliability Evaluation

5.1.1. Experimental Setting

  • Node_1 is deployed about 400 m southeast of the ground gateway, representing the urban propagation environment.
  • Node_2 is deployed about 400 m southwest of the ground gateway, representing the vegetation propagation environment.
  • Node_3 is deployed about 1.3 km southwest of the ground gateway, representing a complex propagation environment over long distances.
  • In group 1, the ground gateway is utilized to demonstrate the performance of the ground LoRaWAN network.
  • In group 2, the ground gateway is replaced by the UAV gateway and remote controller to construct a UAV-assisted LoRaWAN network. In addition, three scenarios have been designed to analyze the effect of gateway altitude on network performance, where the UAV gateway is deployed at heights of 30 m, 60 m, and 90 m, respectively.
  • In group 3, the UAV gateway is deployed at the center of three end-devices, which is 800 m away from the remote controller, aiming to verify the relay feature of the remote controller.

5.1.2. Evaluation Indicators

5.1.3. Experimental Results and Discussion

5.2. Positioning Demonstration

5.2.1. TDOA Positioning Configuration

5.2.2. TDOA Algorithm Implementation

Positioning Method Based on TDOA (Time Difference of Arrival)
Input: timestamps of a packet arriving at 3 different gateways; the GPS coordinates of the gateways.
Output: calculated End-Device location.
(1) Acquire the information from the Center Server.
(2) Randomly choose a gateway as the reference gateway.
(3) Set the coordinates of the reference gateway to (0, 0) on a two-dimensional plane.
(4) Project the GPS coordinates of the other gateways onto the two-dimensional plane.
(5) Calculate the distance differences from the gateways to the End-Device based on the timestamp differences, in order to formulate the hyperbolic equations.
(6) If a calculated gateway-to-End-Device distance difference is an outlier, return to step (1); otherwise, output the End-Device location calculated from the hyperbolic equations.
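A rough Python sketch of steps (3) to (6), assuming the arrival timestamps and the planar (projected) gateway coordinates are already available, is given below; the speed-of-light constant, the example values, and the use of a least-squares solver are illustrative assumptions rather than details taken from the paper.

import numpy as np
from scipy.optimize import least_squares

C = 299_792_458.0  # assumed propagation speed (speed of light, m/s)

def tdoa_locate(gateway_xy, timestamps):
    # gateway_xy: (N, 2) planar coordinates with the reference gateway first, at (0, 0)
    # timestamps: packet arrival times in seconds, in the same order as gateway_xy
    gateway_xy = np.asarray(gateway_xy, dtype=float)
    timestamps = np.asarray(timestamps, dtype=float)
    dd = C * (timestamps - timestamps[0])  # range differences define the hyperbolic equations

    def residuals(p):
        dist = np.linalg.norm(gateway_xy - p, axis=1)
        return (dist - dist[0]) - dd

    return least_squares(residuals, x0=np.array([1.0, 1.0])).x

# Example with three gateways (coordinates in metres) and illustrative timestamps.
print(tdoa_locate([[0, 0], [500, 0], [0, 400]], [1.0000000, 1.0000005, 1.0000004]))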

5.2.3. Positioning Result and Discussion

6. Conclusions

Author Contributions

Data Availability Statement

Conflicts of Interest



[Table: comparison of prior work on UAV-assisted LoRa networks by architecture/protocol (Point-to-Point, Custom, LoRaWAN) and by the integrated design of the UAV with the LoRa end-device/gateway (carrying a LoRaWAN module, carrying LoRaWAN end-devices, or carrying a LoRaWAN gateway), with the backhaul used (cellular, Wi-Fi, or local storage) where applicable; the per-reference entries are not recoverable here. The present work is distinguished by the integrated solution of “UAV + Remote Controller + Server”.]
Comparison of PSDK and MSDK:
  • Development device: UAV payload (PSDK) versus mobile device (MSDK).
  • Hardware platform: mainstream embedded hardware platforms such as STM32 and Raspberry Pi (PSDK) versus Nexus devices connected to the remote controller, or the remote controller itself (MSDK).
  • Software platform: Linux, ROS, and RTOS (PSDK) versus iOS and Android (MSDK).
  • UAV channel resources: bidirectional wired communication between the user payload and the UAV (PSDK) versus bidirectional wireless communication between the remote controller and the UAV (MSDK).
  • Specific functions: power supply and time synchronization (PSDK); networking capability and UAV flight control (MSDK).
  • General functions: GPS information subscription, camera data acquisition and control, flight parameter subscription, etc.
Transmission configuration (Configuration: Value):
  • Bandwidth: 125 kHz
  • Code rate: CR_4_5
  • Spreading factor: 11
  • Transmission power: 17 dBm
  • Payload: 28 bytes
  • Mode: Class A
  • Antenna gain: 2 dBi
  • Transmission period: 10 s

Source: Zhao, H.; Tang, W.; Chen, S.; Li, A.; Li, Y.; Cheng, W. Design and Implementation of a Novel UAV-Assisted LoRaWAN Network. Drones 2024, 8, 520. https://doi.org/10.3390/drones8100520


Analysis, modelling, and optimization of force in ultra-precision hard turning of cold work hardened steel using the CBN tool

  • Technical Paper
  • Open access
  • Published: 24 September 2024
  • Volume 46, article number 624 (2024)


  • Ogutu Isaya Elly (ORCID: orcid.org/0009-0000-9857-1431) 1,
  • Ugonna Loveday Adizue 1,2,
  • Amanuel Diriba Tura 1,
  • Balázs Zsolt Farkas 1 &
  • M. Takács 1

The machinability of high-performance materials such as superalloys, composites, and hardened steel has been a major challenge due to their mechanical, physical, and chemical properties, which give them inherently complex machining characteristics. Additionally, the majority of machinability tests conducted on these materials have been carried out on conventional and less precise lathes, based on Taguchi, composite, and other designs of experiments that do not exploit all the possible combinations of cutting parameters. This work reports an investigation of ultra-precision hard turning (UHT) of cold work hardened AISI D2 steel of 62 HRC, based on a full factorial design of experiments and carried out on an ultra-precision lathe. A theoretical analysis of the force components generated is reported. Modelling of the process, based on the resultant force, is also reported through a machine learning model. The model was developed from the experimental data and statistically evaluated with validation data. Its average MAPE values of 1.47%, 4.81%, and 10.66% for training, testing, and validation, respectively, attest to its robustness. The excellent coefficient of determination (R^2) values also support the model's robustness. Multi-objective optimization was conducted to optimize material removal rate (MRR), resultant force, and vibration simultaneously. For sustainable and efficient UHT, an optimal cutting velocity (158.8 m/min), feed (0.125 mm/rev), and depth of cut (0.074 mm) were proposed to generate the optimal resultant force (224.8 N), MRR (2603.6 mm^3/min), and vibration (0.03 m/s^2) simultaneously. These results can be beneficial in planning UHT processes for high-performance materials.


1 Introduction

High-performance metals, viz. superalloys, hardened steels, and titanium and its alloys, are known for their high creep and corrosion resistance, low thermal conductivity, hot hardness, chemical inertness, and resistance to thermal shock. These properties have created very high demand for them in the fabrication of parts for manufacturing industries such as the automotive, aeronautical, and marine sectors. At the same time, the same properties make them difficult to machine. For instance, their hardness has been reported to accelerate the tool wear process [1] and to introduce workpiece burnouts, which affect the quality of the final product [2]. Cold work hardened AISI D2 steel is an essential hard metal for making extrusion dies, slitting cutters, wire dies, burnishing rolls, gauges, and master tools. Its machinability has been greatly impeded by its mechanical properties and, as a result, its application in part production is lower than it could be. Therefore, characterizing its machinability is an area that needs thorough research.

Efforts by manufacturing stakeholders have led to the development of several machining approaches for hardened AISI D2 steel. For a long time, grinding was the primary method of fabricating parts from hard metals [3]. However, its low material removal rate, high power consumption, long set-up times, and inability to produce complex geometries rendered it unreliable for hard machining [4]. This gave way to better methods such as hard turning [5]. Hard turning refers to the single-point cutting of metals with a Rockwell hardness (HRC) of 45 and above, primarily in the range 58–68 HRC [6, 7], and is reliable for near-net-shape and finish machining. Ultra-precision hard turning (UHT) is characterized by the deployment of ultra-precision machine tools, cutting tools, and machining conditions, and by the production of parts with tight dimensional and geometrical tolerances [8]. UHT is known for using small feeds and depths of cut, enabling it to achieve form accuracy of less than 1 µm and surface roughness of less than 0.1 µm. These accuracy levels are superior to those of traditional hard turning methods [9].

Tool design and fabrication for the UHT process has been one of the critical research areas in hard machining. Efforts have been made to fabricate strong, tough tools with high thermal shock resistance and chemical inertness at high temperatures. Cermet tools [10], ceramic tools, coated carbide inserts [11], and cubic boron nitride (CBN) tools [12] are some of the tools that have found recognition for use in UHT. However, due to their strength, toughness, and hardness (second only to diamond), CBN tools and their variants (PCBN) are the best suited for hard turning. In addition, their high melting temperature of 2730 °C and stability up to around 2000 °C have made them the preferred choice for hard turning processes [13]. Works by [14, 15] and [16] have shown the successful application of CBN tools in the hard turning of hardened steels.

The machinability of hardened AISI D2 can be further understood and improved through the prediction of machinability indices [17]. Force is a very significant machinability index for determining process stability, and it correlates strongly with almost all the other performance indices [18]. Force prediction has long been reported from analytical, numerical, and experimental (big-data-model-based) points of view. Recently, however, industries have been moving rapidly toward intelligent machining to enhance productivity. Machine learning technologies have therefore been deployed to boost production in line with Industry 4.0 guidelines [19]. Machine learning (ML) models can be supervised, unsupervised, or reinforcement models. Supervised ML models such as Gaussian process regression (GPR), random forest (RF) regression, support vector machines (SVM), polynomial regression, gradient boosted trees (GBT), the adaptive neuro-fuzzy inferencing system (ANFIS), decision trees (DT), and artificial neural networks (ANN) have been the subject of interest in the bid to embrace intelligent machining. These ML models have been deployed largely for tool wear monitoring and for forecasting machinability metrics, both online and offline [20]. Process monitoring and prediction give prior knowledge of the process outcome, which is informative for design and planning. It is through proper design and planning that downtimes and production costs are reduced, improving production efficiency.

In addition to prediction, the optimization of machinability indices is key to the planning and execution of sustainable and profitable production of AISI D2 steel products. Optimization can be single-objective as well as multi-objective [21]. Single-objective optimization has been faulted for overlooking some of the process kinematics and the interdependence of machinability indices, and hence for not capturing the dynamic nature of processes. Multi-objective optimization finds a balance between selected machinability indices; hence, it can minimize vibration and acoustic emission while maximizing material removal rate, tool life, etc. Ant colony optimization, teacher–learner-based optimization (TLBO), particle swarm optimization (PSO), the grey relational method, multi-objective particle swarm optimization (MOPSO), the firefly algorithm, the reinforcement dung beetle algorithm, and the genetic algorithm (GA) are some of the heuristic and metaheuristic optimization methods that have been reported thus far [22, 23, 24, 25].

Kumar et al. [ 26 ] studied the workability of heat-treated AISI D2 steel based on the cutting tool’s flank wear, product surface quality, and chip–tool interface temperature. They developed RSM and ANN models for forecasting the three machinability indices. Vahid et al. [ 27 ] developed an ACO (adaptive control optimization) system that relied on ANN and GP (geometrical programming) for prediction, and PSO for optimization when hard turning AISI D2 steel. According to Tabassum et al. [ 28 ], ANFIS prediction is superior to that of ANN and RSM when forecasting force from the hard turning of AISI 1045 (56 HRC) in both high-pressure coolant and dry cutting conditions. Non-dominated sorting genetic algorithm II (NSGA-II) was used by Meddour et al. [ 29 ] to find the optimal values of surface roughness and cutting forces simultaneously when hard turning AISI 4140 (60 HRC) using a mixed ceramics cutting tool. In addition, they developed RSM and ANN models for prediction. Similar work on ML modelling of AISI D2 processes was reported by Adizue et al. [ 30 ]. Further works on ML modelling have been reported by [ 31 , 32 , 33 , 34 ].

Parameter correlation studies by Patel and Gandhi [ 35 ], when hard turning AISI D2 steel using the CBN tool, showed that depth of cut was the most significant parameter in force generation. According to Rafighi et al. [ 17 ], variation of cutting edge radius had the most significant impact on cutting force when hard turning AISI D2 steel. They noted that the ceramic tool used led to a lower magnitude of forces than CBN inserts. According to [ 36 ], there is no correlation between the cutting velocity and cutting force when machining AISI D2 steel using the CBN tool.

Based on the literature review, there is a need for robust prognostic models and optimization methods to effectively characterize the UHT of AISI D2 steel using the CBN tool on the basis of the generated forces. Most of the reported works have been based on Taguchi designs of experiments and therefore fail to investigate all the possible parameter combinations, leading to incomplete and potentially inaccurate conclusions. Thus, there is a need for more investigations based on full factorial experiments that exhaust all possible parameter combinations.

Due to the dynamic nature of the UHT process, no standard machine learning model or optimization method has been established for force prediction when hard turning AISI D2 steel using the CBN tool, despite the prediction superiority of such models over analytical and numerical methods [37]. The literature on machine learning modelling and optimization of the resultant force during UHT of AISI D2 steel is scarce. Hence, there is a need either to develop new, robust, and accurate models or to evaluate and improve the existing ones for application to AISI D2 steel UHT using CBN tools.

Based on these gaps, this work analyses the kinematics of force component generation during UHT processes. In addition, it investigates the prediction of the resultant force using ANFIS during UHT of AISI D2 steel. The investigation is based on a full factorial design of experiments, with an extra set of experiments providing validation data for checking model robustness. Lastly, multi-objective particle swarm optimization (MOPSO) is used to optimize the resultant force together with the material removal rate (MRR) and vibration, to establish optimal parameters for the sustainable and efficient production of parts made from cold work hardened AISI D2.

2 Materials and methods

2.1 Design of experiments

This work sought to characterize the UHT of cold work hardened AISI D2 steel through analysis, prediction, and optimization of force components based on full factorial experiments. Therefore, a theoretical analysis of the force components is conducted, and the resultant force is predicted through ANFIS modelling. The multi-objective optimization method is deployed to determine the optimal cutting parameters and resultant force.

UHT was conducted on a hollow, cold work hardened AISI D2 steel workpiece (62 HRC) of 12 mm internal and 60 mm external diameters and a length of 40 mm (Fig.  1 ). The internal hole provided an exit for the tool because as the diameter of the stock reduced, the machine tool’s rotational speed needed to be increased to maintain the constant cutting velocity. The hole, therefore, compensated for the limited rotational speed of the machine tool. AISI D2 steel was chosen because of its high-dimensional stability, good resistance to temper softening, and high compressive strength when hardened. These properties enhance its application in the development of extrusion dies and slitting cutters.

Figure 1. Workpiece of cold work hardened AISI D2 steel applied for the experiments

A single-tip, uncoated, full-face-layer CBN insert with 50% CBN content, a 2 µm grain size, and a ceramic binder was chosen due to its exceptional performance in high-speed hard machining processes. The CBN insert used was a SECO DCGW11T308S-01020-L1-B CBN010, which is specifically intended for finish hard turning operations owing to its excellent wear resistance, good thermal shock resistance, chemical inertness when machining steel, hot hardness and strength, good thermal conductivity, and high homologous temperature. It had two cutting edges, a clearance angle of 7°, a cutting edge effective length of 3.5 mm, an insert corner radius of 0.8 mm, a thickness of 3.75 mm, a weight of 0.0004 kg, and an included angle of 55°. Figure 2 shows the CBN insert.

Figure 2. CBN insert used for the experiment

The machining process took place on a high-rigidity, ultra-precision CNC lathe (Hemburg-Mikroturn, 50 CNC) with a repetitive accuracy of ± 1 µm, maximum spindle speed of 6000 rpm, and a positional accuracy of 1 µm/150 mm. As shown in Fig.  3 , the workpiece was held on the machine with an ultra-precision three-jaw pneumatic clamp, and its concentricity was set using a dial gauge indicator. A full factorial design of experiments was deployed to explore all the possible parameter combinations. As shown in Table  1 , three cutting parameters (cutting velocity (m/min), feed (mm/rev), and depth of cut (mm)) with three levels were investigated. The cutting parameters’ levels were chosen relative to the manufacturer’s handbook, UHT lathe parameters, CBN tool geometry, and previous research work on hardened steel turning.

Figure 3. Experimental set-up

The three levels of each of the three parameters therefore gave 27 runs, which were conducted in random order. The 27 runs were carried out three times to ensure the accuracy of the final data, bringing the total number of experiments to 81 runs. A further nine runs with unique parameter combinations were conducted to validate the models.
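For reference, such a run list can be generated mechanically; the Python sketch below enumerates the 27 parameter combinations and the three replicates. The level values shown are placeholders for illustration only, since the actual levels are specified in Table 1 rather than in the text.

import itertools
import random

# Placeholder levels (assumed for illustration); the actual three levels per parameter are given in Table 1.
cutting_velocity = [100, 125, 150]   # m/min
feed = [0.075, 0.100, 0.125]         # mm/rev
depth_of_cut = [0.02, 0.06, 0.10]    # mm

runs = list(itertools.product(cutting_velocity, feed, depth_of_cut))  # 27 combinations (3^3)
design = runs * 3                                                     # 3 replicates -> 81 runs
random.seed(1)
random.shuffle(design)                                                # randomized run order

print(len(runs), len(design))  # 27 81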

During the face hard turning process, three runs with different parameter combinations were carried out on the surface of the workpiece cross-section (see Fig. 4). When transitioning from one run to the next, the spindle was rotated five times without any cutting action. During these non-cutting phases, the machine tool's controller adjusted the cutting parameters to those of the next run in the G code used. After machining a set of three runs, the machined face was flattened before machining the next set. This was repeated until all 90 runs had been completed.

Figure 4. Schematic representation of the face hard turning strategy

A three-axis Kistler dynamometer (Kistler, Type 9257A) was deployed to measure the signals of passive force ( F p ), cutting force ( F c ), and feed force ( F f ) components. The directions of the force components are shown in Fig.  5 . A transducer (Kistler, Type 9257) was applied for signal amplification. The force signals in Figs. 6 , 7 , and 8 show the individual signals for each run. These signals are separated by moments of change in parameter combination, as is shown in Fig.  8 . During these moments of change in parameter combination, the workpiece was rotated five times without any tool feed.

Figure 5. Force components

Figure 6. Cutting force (Fc) signal

Figure 7. Feed force (Ff) signal

Figure 8. Passive force (Fp) signal

2.2 Adaptive neuro-fuzzy inferencing system (ANFIS)

This algorithm is a hybrid of two well-known machine learning approaches, the artificial neural network (ANN) and the fuzzy logic system (FLS). The hybrid exploits the strengths of its constituent algorithms to establish a powerful and flexible model for complex nonlinear problems. Figure 9 shows the general ANFIS architecture. During modelling, the input layer receives the variables that are to be used for the prediction process. The membership functions (MFs) for each of the variables are then developed in the fuzzy layer. Equations (1) and (2) denote the nodal functions in the fuzzy layer.

Figure 9. ANFIS architecture

The membership functions are denoted by a_j and b_t. C_{1,t} represents the output from the t-th node of the fuzzy layer. Nodes in the product layer generate the input parameters as per Eq. (3).

Normalization takes place in the third layer. The normalization function, denoted by Eq. (4), converts the firing strengths to values between 0 and 1.

W_t is determined by the firing strength of node t, denoted by ω_t. Equation (5) gives the output of the fuzzy system.

Here, p_t, q_t, and r_t are the consequent parameters, whereas x and y represent the prediction variables. The summation at the defuzzification layer is summarized by Eq. (6).

Training of the model involves the adjustment of the consequent parameters and MFs and is conducted by optimization algorithms such as backpropagation, least-squares methods, and PSO [38], based on a cost function. The output values obtained after solving Eq. (6) are crisp numerical values that are suitable for data prediction and decision-making. ANFIS was chosen on the basis of the excellent results reported when it was deployed by [39, 40] and [41].
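Equations (1)–(6) themselves are not reproduced in this extract. For orientation, the standard first-order Sugeno ANFIS layer equations, which the description above follows, take the form below (our notation, so the symbols may differ slightly from the paper's):

O^{1}_{i} = \mu_{A_i}(x), \quad O^{1}_{i+2} = \mu_{B_i}(y), \quad i = 1, 2   (fuzzification layer)
O^{2}_{t} = w_t = \mu_{A_t}(x)\,\mu_{B_t}(y)   (product layer: firing strengths)
O^{3}_{t} = \bar{w}_t = \frac{w_t}{w_1 + w_2}   (normalization layer)
O^{4}_{t} = \bar{w}_t f_t = \bar{w}_t (p_t x + q_t y + r_t)   (consequent layer)
O^{5} = \sum_t \bar{w}_t f_t = \frac{\sum_t w_t f_t}{\sum_t w_t}   (output/defuzzification layer)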

2.3 Multi-objective particle swarm optimization (MOPSO)

MOPSO is an improved version of the heuristic particle swarm optimization (PSO) proposed by Coello and Lechunga [42] for multi-objective problem optimization. The MOPSO process begins with the initialization of a population of particles. Each particle is identified by its position (P_j^k), velocity (V_j^k), and fitness. Each particle's position is a potential solution to the problem at hand. To optimize the hard turning process, the particles are taken to be combinations of the predictors, P_j = (v_c,j, f_j, a_p,j), whereas the velocities are V_j = (v_j1, v_j2, v_j3). The new position of the j-th particle in iteration k + 1, P_j^{k+1}, is updated by the new velocity, V_j^{k+1}, as shown by Eq. (7). Equation (8) shows the updating of the previous velocity, V_j^k. The new position is substituted into the objective function to determine the fitness value. This fitness value is compared with the previous values and, if it is superior, it is considered the personal best; otherwise, the previous fitness is kept as the best. The best individual position in the population is considered the global best position.

Here, P_best^k is the personal best position, P_global^k is the best overall (global) position, ω is the inertia weight, c_1 is the personal learning coefficient, c_2 is the global learning coefficient, and r_1 and r_2 are random values in the range [0, 1].
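Equations (7) and (8) are likewise not reproduced in this extract. The standard PSO update rules that this description matches are, in the notation defined above (a reconstruction rather than a quotation of the paper's equations):

P_j^{k+1} = P_j^{k} + V_j^{k+1}
V_j^{k+1} = \omega V_j^{k} + c_1 r_1 \left( P_{best}^{k} - P_j^{k} \right) + c_2 r_2 \left( P_{global}^{k} - P_j^{k} \right)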

As an extension of PSO, MOPSO deploys the concept of Pareto optimality. It involves creating an external repository for each particle at each iteration to store its non-dominated solutions. Another repository is created for the entire population at each iteration, and one global repository is kept for the entire population over the whole optimization process. A selection criterion is used to choose the best non-dominated solution from the first two repositories. A leader, or global solution, is later chosen from the global repository. Figure 10 shows the process of executing MOPSO.

Figure 10. Flowchart of the MOPSO algorithm

MOPSO was chosen for this work because of its fast convergence, great compatibility, high efficiency, and feasibility [ 43 ].
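A minimal sketch of the non-dominated (Pareto) filtering used to maintain such repositories, assuming all objectives are expressed as minimization, could look like the following Python fragment (illustrative only; the objective values are hypothetical).

def dominates(a, b):
    # a dominates b if it is no worse in every objective and strictly better in at least one (minimization)
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(solutions):
    # return the non-dominated subset of a list of objective vectors
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions if other is not s)]

# Hypothetical objective vectors: (resultant force in N, negated MRR in mm^3/min, vibration in m/s^2)
candidates = [(224.8, -2603.6, 0.03), (300.0, -2500.0, 0.05), (210.0, -1800.0, 0.02)]
print(pareto_front(candidates))  # the second candidate is dominated and is filtered out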

3 Results and discussion

The signal extraction process was performed in MATLAB R2022b through custom scripts for all 81 runs and the additional 9 validation runs. Numerical signal values were obtained from arbitrary ranges within the relevant run signal, as shown in Fig. 7. The non-cutting ranges shown in Figs. 6 and 7 contain signals originating from machine excitation (due to tool acceleration towards the workpiece) and from vibrations due to chuck rotation; these non-cutting signals were not considered during the numerical extraction of signal values. The force component and vibration values in Table 2 are the averages extracted from the signals of the three experimental replicates. The F_r and MRR values were determined using Eqs. (9) and (10), respectively.
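Equations (9) and (10) are not included in this extract. The conventional textbook definitions of the resultant force and the material removal rate in turning are given below for orientation; whether the paper's Eqs. (9) and (10) take exactly this form cannot be confirmed from the extract.

F_r = \sqrt{F_c^2 + F_f^2 + F_p^2}
\mathrm{MRR} = 1000 \, v_c \, f \, a_p \quad [\mathrm{mm^3/min}], \text{ with } v_c \text{ in m/min}, \; f \text{ in mm/rev}, \; a_p \text{ in mm}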

3.1 Force analysis

Generally, as shown in Table 2, the passive force had the largest magnitude. This is a distinctive characteristic of hard turning processes and can be attributed to the exceptional circumstances of chip formation in hard turning (e.g., the spring-back effect) [35]. The spring-back effect results from the elastic and plastic deformation that the workpiece experiences during machining [1]. According to Rath et al. [44], chip formation in steel hard turning results from crack initiation on the surface of the workpiece. The crack is initiated ahead of the tool and propagates as the compressive stresses induced by the tool increase due to the feed force, forming saw-tooth chips. In the wake of chip formation, the workpiece experiences some elastic recovery. This elastic recovery (spring-back) raises the magnitude of the passive force. According to Bartaya et al. [45], the high passive force can also be associated with the low depth of cut relative to the CBN insert's cutting edge radius and with the negative rake angle of the tool.

Compared with conventional turning, the magnitude of the cutting forces in hard turning is smaller. This is probably due to the residual thermal stresses induced by the high temperatures generated and to the low feed rates (relative to cutting speed) associated with this process. The high temperatures soften the workpiece during a given machining pass. As a result, subsequent passes (either close to the previous pass or overlapping part of it, owing to the small feeds and relatively high cutting speeds) tend to occur on softer surfaces. The low cutting forces can also result from the low depths of cut and feeds deployed in hard turning processes [46].

The analysis of variance in Table 3 indicates that all three cutting parameters were significant in resultant force generation, as their p-values were below the 0.05 significance level. Feed has the highest contribution (69.74%), cutting velocity (3.79%) the lowest, and among the interactions that of feed and depth of cut has the strongest influence on resultant force generation. From Figs. 11 and 12, an increase in the depth of cut leads to an increase in the resultant force. This can be attributed to an increase in the cutting edge angle, which increases the coefficient of friction and, subsequently, the compressive stresses exerted by the tool. Similarly, the feed correlates directly with the resultant force, as shown in Figs. 12 and 13. This is probably due to the increase in the theoretical chip area with increasing feed; changes in the chip area correlate directly with the coefficient of friction, leading to significant changes in the magnitude of the generated force components. Cutting velocity and resultant force show an inverse correlation (Figs. 11 and 13). This can be attributed to the increase in temperature at the primary shear zone with increasing cutting velocity. The increasing temperature softens the workpiece ahead of the tool, leading to lower compressive stresses exerted by the tool and hence a lower resultant force [47].

Fig. 11 Variation of resultant force with depth of cut and cutting velocity at a constant feed of 0.125 mm/rev

Fig. 12 Variation of resultant force with depth of cut and feed at a constant cutting speed of 125 m/min

Fig. 13 Variation of resultant force with cutting velocity and feed at a constant depth of cut of 0.06 mm

As shown in Table  4 , all the cutting parameters and their two-factor interactions were significant in determining the material removal rate, with reported p-values of 0.000. Feed was the most significant parameter, with a contribution of 60.10%; depth of cut had the least influence at 8.45%, and cutting velocity contributed 21.63%. Among the interactions, that of velocity and feed had the strongest influence on MRR. It can be inferred from Figs. 14 , 15 , and 16 that all the cutting parameters correlate positively with MRR. Increasing cutting velocity likely raised the temperature at the primary cutting zone, softening the workpiece and making it easier and quicker to machine. Likewise, increasing the feed enlarged the chip area and hence the volume of material removed per unit time. Similarly, increasing the depth of cut implied greater engagement between the workpiece and the cutting tool, which translated to a larger volume of material removed.

Fig. 14 Variation of MRR with cutting velocity and feed at a constant depth of cut of 0.06 mm

Fig. 15 Variation of MRR with feed and depth of cut at a constant cutting velocity of 125 m/min

Fig. 16 Variation of MRR with cutting velocity and depth of cut at a constant feed of 0.125 mm/rev

3.2 Tool vibration analysis

The correlation between vibration and the cutting parameters is rather stochastic, and no clear trend is observed in Figs. 17 , 18 , and 19 . This can be attributed to the complexity of the vibration signal: besides tool vibration, the accelerometer often records signals from other, unwanted sources within the machining environment, and this noise can be difficult to remove during signal processing. The ANOVA (Table  5 ), however, shows that feed and its interaction with cutting velocity are significant in determining tool vibration, with contributions of 62.42% and 31.42%, respectively. At a significance level of 0.05, feed has a p-value of 0.001, whereas its interaction with cutting velocity has a p-value of 0.02. The strong influence of feed can be attributed to its direct correlation with the cutting force and the subsequent direct correlation between cutting force and vibration: as the theoretical chip area grows with increasing feed, the generated force increases owing to the higher compressive stresses exerted by the advancing tool. The significant effect of the feed–velocity interaction on vibration can be attributed to possible tool wear, since feed and cutting velocity together determine the coefficient of friction and the cutting temperature at the shear zone and thereby influence tool wear progression; tool vibration, in turn, varies with the tool wear state. Figure  20 shows the vibration signals.
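The single vibration value per run reported in Table 2 has to be reduced from the full accelerometer record. The sketch below shows one common reduction, the RMS over the cutting window; the paper's exact reduction is not restated here, and the signal, sampling rate, and window indices are placeholders.

```python
import numpy as np

# Illustrative sketch: dummy accelerometer record and an assumed cutting window.
fs = 25_000                                    # sampling rate [Hz] (assumed)
t = np.arange(0, 6, 1 / fs)
acc = 0.02 * np.sin(2 * np.pi * 850 * t) + np.random.normal(0, 0.005, t.size)  # m/s^2

cut = slice(int(1.5 * fs), int(5.0 * fs))      # steady cutting range (assumed); excludes
                                               # tool approach and chuck-rotation noise

acc_cut = acc[cut] - np.mean(acc[cut])         # remove any DC offset
vib_rms = np.sqrt(np.mean(acc_cut**2))         # scalar vibration level [m/s^2]
print(f"vibration (RMS) = {vib_rms:.3f} m/s^2")
```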

Fig. 17 Variation of vibration with cutting velocity and feed at a constant depth of cut of 0.06 mm

Fig. 18 Variation of vibration with cutting velocity and depth of cut at a constant feed of 0.125 mm/rev

Fig. 19 Variation of vibration with depth of cut and feed at a constant velocity of 125 m/min

Fig. 20 Vibration signal

3.3 ANFIS model development and evaluation

The ANFIS model for resultant force prediction was developed in MATLAB R2022b using the data in Table  2 . Training and testing were conducted with data from 19 and 8 experimental runs, respectively, and data from a further nine runs were later used for model validation. The training process involved varying the number of membership functions (MFs) allocated to the input and output parameters, the MF type, the training algorithm (backpropagation or hybrid, i.e. backpropagation combined with the least-squares method), and the fuzzy inference system (FIS) type (Sugeno or Mamdani). These variations produced different models, whose suitability was gauged using the model performance parameters: the mean absolute percentage error (MAPE) and the correlation coefficient, R. The adopted model comprised four Gaussian MFs for one input and two for each of the other two inputs; its output was assigned a constant MF, and it was based on the Sugeno inference system. The model was trained with the hybrid algorithm, and its output was computed as the weighted average of the rules. Figure  21 shows the model's structure.
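The following minimal sketch illustrates the zero-order Sugeno inference that the adopted structure performs at prediction time: Gaussian memberships on the three inputs, rule firing strengths formed by products, and a weighted average of constant rule consequents. All membership-function parameters and consequents below are placeholders, not the trained values, and the hybrid least-squares/backpropagation training itself is omitted.

```python
import numpy as np
from itertools import product

def gauss(x, c, s):
    # Gaussian membership degree of x for a MF with centre c and width s.
    return np.exp(-0.5 * ((x - c) / s) ** 2)

# Gaussian MF parameters (centre, width) per input: four MFs assumed for feed,
# two each for cutting velocity and depth of cut (placeholder values).
mfs = {
    "f":  [(0.05, 0.03), (0.10, 0.03), (0.15, 0.03), (0.20, 0.03)],
    "vc": [(100.0, 30.0), (150.0, 30.0)],
    "ap": [(0.04, 0.02), (0.08, 0.02)],
}
rules = list(product(range(4), range(2), range(2)))     # 16 rules, one per MF combination
consequents = np.linspace(150.0, 300.0, len(rules))     # placeholder constant outputs [N]

def predict_fr(f, vc, ap):
    # Firing strength of each rule = product of its three membership degrees.
    w = np.array([
        gauss(f, *mfs["f"][i]) * gauss(vc, *mfs["vc"][j]) * gauss(ap, *mfs["ap"][k])
        for i, j, k in rules
    ])
    # Defuzzification: weighted average of the constant rule consequents.
    return float(np.dot(w, consequents) / np.sum(w))

print(predict_fr(f=0.125, vc=125.0, ap=0.06))
```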

Fig. 21 ANFIS model structure

The results of the model performance analysis in Table  6 showed that the developed model was highly reliable for prediction. According to [ 48 ], the accuracy of a model is considered low if its MAPE is above 50%, satisfactory if it is between 20 and 50%, good if it is between 10 and 20%, and high if it is below 10%. Similarly, a correlation coefficient of 1 indicates a strong positive correlation between the measured and predicted values, a value of −1 a strong negative correlation, and a value of 0 a weak correlation. The MAPE of the model on the training and test data was 1.47% and 4.81%, respectively, whereas the R² values were 0.9 for both. The R² of 0.8 on the validation data indicated a strong positive correlation between the measured and predicted resultant forces, further supported by a MAPE of 10.66% on the validation data. The low MAPE and high R² (close to 1) on the validation data indicated that the model was robust enough to predict the resultant force during UHT of cold work hardened AISI D2 steel using the CBN tool. Figure  22 demonstrates the prediction ability of the developed model, and similar results are observed with the validation data, as shown in Fig.  23 .
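For completeness, the two evaluation metrics can be computed as follows; the arrays are placeholder values, not the study's data.

```python
import numpy as np

measured  = np.array([210.0, 245.0, 188.0, 260.0])   # measured resultant forces [N] (dummy)
predicted = np.array([205.0, 250.0, 195.0, 248.0])   # model predictions [N] (dummy)

# Mean absolute percentage error.
mape = 100.0 * np.mean(np.abs((measured - predicted) / measured))

# Coefficient of determination R^2 = 1 - SS_res / SS_tot.
ss_res = np.sum((measured - predicted) ** 2)
ss_tot = np.sum((measured - np.mean(measured)) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MAPE = {mape:.2f} %, R^2 = {r2:.3f}")
```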

Fig. 22 Comparison of measured against ANFIS model predicted values

Fig. 23 Comparison between validation and ANFIS-predicted data

3.4 MOPSO results

A relatively high MRR in UHT, alongside optimal values of the other process parameters, increases the production rate by lowering the lead time per product. Therefore, maximizing MRR relative to the other machinability metrics supports effective and efficient production [ 49 ]. Tool vibration is also a very important parameter in determining production quality and cost, as it correlates with the cutting force, surface roughness, tool wear, and MRR, among other process parameters. Sahu et al. [ 50 ] demonstrated that surface roughness increases with tool vibration, and Guleria et al. [ 51 ] used a tool-vibration signal processing technique to classify surface roughness. Tool vibrations introduce chatter marks on the product surface, lowering surface quality, and vibration is also a critical cause of tool failure and breakage. Keeping tool vibration low is therefore essential for the stability of the machining process and for obtaining quality products. This work, consequently, optimized the UHT resultant force together with MRR and vibration, with the aim of maximizing MRR while minimizing both the resultant force and the vibration.

MOPSO sought the cutting parameter values that yield optimal magnitudes of the resultant force, MRR, and vibration. The MOPSO algorithm was coded in MATLAB R2022b using Eqs. ( 11 ), ( 12 ), and ( 13 ) as the objective functions; these equations were developed with a response surface methodology approach in Minitab 21. The algorithm was tuned using the parameters in Table  7 .
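Although the exact coefficients of Eqs. (11)–(13) come from the Minitab fit and are not restated here, the general recipe is a full second-order (quadratic) response surface fitted per response. The sketch below shows that recipe with scikit-learn as a stand-in; the data file and column names are assumptions.

```python
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Assumed columns: vc [m/min], f [mm/rev], ap [mm], Fr [N], MRR [mm^3/min], vib [m/s^2].
df = pd.read_csv("uht_runs.csv")
X = df[["vc", "f", "ap"]].to_numpy()

def fit_rsm(y):
    # Standard RSM model: linear, squared, and interaction terms of the three parameters.
    model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False), LinearRegression())
    return model.fit(X, y)

fr_model  = fit_rsm(df["Fr"].to_numpy())       # objective 1: resultant force
mrr_model = fit_rsm(df["MRR"].to_numpy())      # objective 2: material removal rate
vib_model = fit_rsm(df["vib"].to_numpy())      # objective 3: tool vibration

print(fr_model.predict([[125.0, 0.125, 0.06]]))  # evaluate one candidate point
```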

Figure  24 shows the distribution of particles in the search space during the MOPSO iterations. The red and black circular marks in the search space represent successive Pareto fronts, and the best compromise (leader) solution was selected from the final Pareto front. Table 8 lists the optimal values of the cutting parameters (cutting velocity, feed, and depth of cut) and of the response parameters.
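The core of this selection step is non-dominated (Pareto) filtering of candidate solutions, sketched below for the three objectives used here (minimise resultant force and vibration, maximise MRR). The swarm is random, and the velocity updates, repository grid, and mutation of a full MOPSO are omitted.

```python
import numpy as np

# Random candidate solutions in objective space (placeholder values).
rng = np.random.default_rng(0)
swarm = np.column_stack([
    rng.uniform(150, 400, 200),    # Fr  [N]           -> minimise
    rng.uniform(500, 3000, 200),   # MRR [mm^3/min]    -> maximise
    rng.uniform(0.01, 0.20, 200),  # vibration [m/s^2] -> minimise
])

# Convert to a pure minimisation problem by negating the MRR column.
obj = swarm * np.array([1.0, -1.0, 1.0])

def dominates(a, b):
    """True if solution a dominates b: no worse in all objectives, better in at least one."""
    return np.all(a <= b) and np.any(a < b)

# Keep only solutions that no other solution dominates (the Pareto front).
pareto = np.array([
    not any(dominates(obj[j], obj[i]) for j in range(len(obj)) if j != i)
    for i in range(len(obj))
])
front = swarm[pareto]
print(f"{front.shape[0]} non-dominated solutions out of {len(swarm)}")
```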

Fig. 24 Distribution of particles (possible solutions) in the search space

The large MRR value is desirable for enhancing the production rate of hardened AISI D2 steel components in an industrial setting. Since vibration correlates directly with the cutting forces [ 12 ], the low vibration value attained in this work indicates that the obtained resultant force is optimal for ensuring machine stability during UHT of AISI D2 steel within the investigated parameter range.

4 Conclusion

A comprehensive literature review on UHT of hardened AISI D2 steel is presented in this work, together with an in-depth analysis of the variation of the force components during UHT of AISI D2 steel. An ANFIS machine learning model for force prediction was developed and evaluated, and the resultant force was then optimized together with MRR and vibration using MOPSO. The following conclusions were drawn:

The proposed ANFIS model was developed, and its performance was evaluated using the mean absolute percentage error (MAPE) and the coefficient of determination (R²). With a MAPE of 10.66% and an R² of 0.8 on the validation data, the model's predictions were highly satisfactory (Table  6 ). The model can therefore be incorporated into digital twin models for monitoring machining stability during UHT of AISI D2 steel.

From the optimization process, a feed of 0.125 mm/rev, a cutting velocity of 158.8 m/min, and a depth of cut of 0.074 mm will result in stable UHT of hardened AISI D2 steel when using the CBN tool. The corresponding optimal resultant force, MRR, and vibration were 224.8 N, 2603.6 mm³/min, and 0.03 m/s² (Table  8 ). These values are expected to improve the machining rate and lower machining costs on the shop floor.

All the cutting parameters are significant in resultant force generation during UHT of AISI D2 steel, with feed being the most significant (69.74% contribution) and cutting velocity a minor contributor (Table  3 ).

The passive force is the largest force component during UHT of hardened AISI D2 steel with the CBN tool, possibly due to the spring-back effect, whereas the feed and cutting forces have low magnitudes, probably due to the high temperatures generated at the primary shear zone (Table  2 ). Knowledge of the magnitudes of the force components can assist in tool design and tool selection for UHT processes.

Feed and cutting velocity are the most significant factors in determining MRR and vibration during UHT, with the influence of feed dominating that of cutting velocity.

The reported findings therefore offer valuable insight into the kinematics, prediction, and optimization of UHT processes, and the results can contribute to the development of precision engineering in the machining of difficult-to-cut metals and related materials.

4.1 Future directions

The authors recommend that future studies be carried out on composite materials, with a focus on the chip removal mechanisms relative to the composites' structures. A digital twin model for monitoring machining stability, among other machinability indices, can also be developed using the AI model presented here.

Data availability

The dataset used in this work is available from the corresponding author upon reasonable request.

Code availability

The code used in this work is available from the corresponding author upon reasonable request.

Abbreviations

UHT: Ultra-precision hard turning
MAPE: Mean absolute percentage error
CBN: Cubic boron nitride
MRR: Material removal rate
MOPSO: Multiple objective particle swarm optimization
R²: Coefficient of determination
ANFIS: Adaptive neuro-fuzzy inference system
Fr: Resultant force
Tool vibration

Alexander JPD, Grzesik AW, Arrazola PJ, Lamikiz A, Viktor PA, Fernandez J, Azkona I, Norberto L (2011) Machining of hard materials. Springer, London. https://doi.org/10.1007/978-1-84996-450-0

Kara F, Karabatak M, Ayyildiz M, Nas E (2020) Effect of machinability, microstructure and hardness of deep cryogenic treatment in hard turning of AISI D2 steel with ceramic cutting. J Market Res 9(1):969–983. https://doi.org/10.1016/j.jmrt.2019.11.037

Pimenov DY, Gasiyarov VR, Gupta MK (2018) Multi-Objective optimization for grinding of AISI D2 steel with Al 2 O 3 wheel under MQL. Materials. https://doi.org/10.3390/ma11112269

He B, Ding S, Shi Z (2019) A survey of methods for detecting metallic grinding burn. Measurement 1(134):426–439

Jiang L, Wang D (2019) Finite-element-analysis of the effect of different wiper tool edge geometries during the hard turning of AISI 4340 steel. Simul Mod Pract Theory 1(94):250–263

He K, Gao M, Zhao Z (2019) Soft computing techniques for surface roughness prediction in hard turning: A literature review. IEEE Access. 3(7):89556–89569

Kishawy HA, Hosseini A (2019) Machining difficult-to-cut materials. Mater Mach Tribol 10:973–978

Zhao L, Zhang J, Zhang J, Dai H, Hartmaier A, Sun T (2023) Numerical simulation of materials-oriented ultra-precision diamond cutting: review and outlook. Int J Extrem Manuf 5(2):022001. https://doi.org/10.1088/2631-7990/acbb42

Kundrák J, Karpuschewski B, Gyani K, Bana V (2008) Accuracy of hard turning. J Mater Process Technol 202(1–3):328–338. https://doi.org/10.1016/j.jmatprotec.2007.09.056

Yıldırım ÇV, Şirin Ş, Kıvak T, Sarıkaya M (2022) A comparative study on the tribological behavior of mono & proportional hybrid nanofluids for sustainable turning of AISI 420 hardened steel with cermet tools. J Manufact Process 1(73):695–714

Kumar R, Pandey A, Panda A, Mallick R, Sahoo K (2001) Grey-fuzzy hybrid optimization and cascade neural network modelling in hard turning of AISI D2 steel. International Journal of Integrated Engineering 4:189–207. https://publisher.uthm.edu.my/ojs/index.php/ijie/article/view/5936

Kumar S et al (2023) Hard turning of AISI D2 steel with cubic boron nitride cutting inserts. Mater Today Procee 72:2002–2006. https://doi.org/10.1016/j.matpr.2022.07.338

Wang H, Deng F, Zhang Z, Xie H, He X, Wang H (2021) Study on the properties and fracture mode of pure polycrystalline cubic boron nitride with different particle sizes. Int J Refract Metals Hard Mater 1(95):105446

Sahinoglu A, Rafighi M (2021) Machinability of hardened AISI S1 cold work tool steel using cubic boron nitride. Sci Iran. https://doi.org/10.24200/sci.2021.55772.4398

Özdemir M, Rafighi M, Al AM (2023) Comparative evaluation of coated carbide and CBN inserts performance in dry hard-turning of AISI 4140 Steel using taguchi-based grey relation analysis. Coatings 13(6):979

Rafighi M, Özdemir M, Das A, Das SR (2022) Machinability investigation of cryogenically treated hardened AISI 4140 alloy steel using CBN insert under sustainable finish dry hard turning. Surf Rev Lett 29(04):2250047. https://doi.org/10.1142/S0218625X22500470

Rafighi M, Özdemir M, Al Shehabi S, Kaya MT (2021) sustainable hard turning of high chromium AISI D2 tool steel using CBN and ceramic inserts. Trans Indian Inst Metals 74(7):1639–1653. https://doi.org/10.1007/s12666-021-02245-2

Patel VD, Gandhi AH (2019) Modeling of cutting forces considering progressive flank wear in finish turning of hardened AISI D2 steel with CBN tool. Int J Adv Manuf Technol 104:503–516. https://doi.org/10.1007/s00170-019-03953-2

Nogueira ML, Greis NP, Shah R, Davies MA, Sizemore NE (2022) Machine learning classification of surface fracture in ultra-precision diamond turning using CSI intensity map images. J Manuf Syst 64(May):657–667. https://doi.org/10.1016/j.jmsy.2022.04.011

Pimenov DYu, Bustillo A, Mikolajczyk T (2018) Artificial intelligence for automatic prediction of required surface roughness by monitoring wear on face mill teeth. J Intell Manuf 29(5):1045–1061. https://doi.org/10.1007/s10845-017-1381-8

Xavior MA, Jeyapandiarajan P (2018) Multi-Objective optimization during hard turning of AISI D2 steel using grey relational analysis. Mater Today Procee 5(5):13620–13627. https://doi.org/10.1016/j.matpr.2018.02.359

Asadi R, Yeganefar A, Niknam SA (2019) Optimization and prediction of surface quality and cutting forces in the milling of aluminum alloys using ANFIS and interval type 2 neuro fuzzy network coupled with population-based meta-heuristic learning methods. Int J Adv Manuf Technol 105(5–6):2271–2287. https://doi.org/10.1007/s00170-019-04309-6

Kuntoğlu M et al (2021) Parametric optimization for cutting forces and material removal rate in the turning of AISI 5140. Machines 9(5):1–20. https://doi.org/10.3390/machines9050090

Zhu X, Ni C, Chen G, Guo J (2023) Optimization of tungsten heavy alloy cutting parameters based on rsm and reinforcement dung beetle algorithm. Sensors 23(12):5616. https://doi.org/10.3390/s23125616

Kumar R, Pandey A, Sahoo AK, Rafighi M (2022) Investigation of machinability performance in turning of Ti–6Al–4V ELI Alloy using firefly algorithm and GRNN approaches. Surf Rev Lett 29(06):2250075. https://doi.org/10.1142/S0218625X22500755

Kumar R, Sahoo AK, Das RK, Panda A, Mishra PC (2018) Modelling of flank wear, surface roughness and cutting temperature in sustainable hard turning of AISI D2 Steel. Proced Manufact 20:406–413. https://doi.org/10.1016/j.promfg.2018.02.059

Pourmostaghimi V, Zadshakoyan M, Badamchizadeh MA (2020) Intelligent model-based optimization of cutting parameters for high quality turning of hardened AISI D2. Artif Intell Eng Design Anal Manufact AIEDAM 34(3):421–429. https://doi.org/10.1017/S089006041900043X

Tabassum R, Prianka S, Dhar NR (2022) Estimation of machining responses in hard turning under dry and HPC conditions using different AI based and statistical techniques. Int J Int Design Manufact (IJIDeM) 16(4):1705–1725. https://doi.org/10.1007/s12008-022-00964-4

Meddour I, Yallese MA, Bensouilah H, Khellaf A, Elbah M (2018) Prediction of surface roughness and cutting forces using RSM, ANN, and NSGA-II in finish turning of AISI 4140 hardened steel with mixed ceramic tool. Int J Adv Manuf Technol 97:1931–1949. https://doi.org/10.1007/s00170-018-2026-6

Adizue UL, Tura AD, Isaya EO, Farkas BZ, Takács M (2023) Surface quality prediction by machine learning methods and process parameter optimization in ultra-precision machining of AISI D2 using CBN tool. Int J Adv Manuf Technol 129(3–4):1375–1394. https://doi.org/10.1007/s00170-023-12366-1

He K, Xu Q, Jia M (2015) Modeling and predicting surface roughness in hard turning using a Bayesian inference-based model. IEEE Trans Autom Sci Eng 12(3):1092–1103. https://doi.org/10.1109/TASE.2014.2369478

Sharma R (2019) Effect of cutting conditions on surface roughness and cutting forces in hard turning of AISI 4340 steel. Int J Adv Res Ideas Innov Technol 5(2):778–782

Sahi SS (2022) Review on optimization of hard turning. Adv Appl Math Sci 21(9):5085–5100

Colantonio L, Equeter L, Dehombreux P, Ducobu F (2021) A systematic literature review of cutting tool wear monitoring in turning by using artificial intelligence techniques. Machines 9(12):351

Patel VD, Gandhi AH (2019) Analysis and modeling of surface roughness based on cutting parameters and tool nose radius in turning of AISI D2 steel using CBN tool. Measurement 1(138):34–38

Takacs M, Farkas BZ (2014) Hard cutting of AISI D2 steel. In: 3rd International Conference on Mechanical Engineering and Mechatronics, no. 176, pp 1–7

Soori M, Arezoo B, Dastres R (2023) Machine learning and artificial intelligence in CNC machine tools, a review. Sustain Manufact Serv Econom 1(2):100009. https://doi.org/10.1016/j.smse.2023.100009

Jamli MR et al (2017) Comparison of adaptive neuro fuzzy inference system and response surface method in prediction of hard turning output responses. Journal of Advanced Manufacturing Technology, Special Issue AMET, pp 153–164. https://jamt.utem.edu.my/jamt/article/view/4887

Masoudi S, Sima M, Tolouei-Rad M (2018) Comparative study of ann and anfis models for predicting temperature in machining. J Eng Sci Technol 13(1):211–225

Abbas AT, Alata M, Ragab AE, El Rayes MM, El Danaf EA (2017) Prediction model of cutting parameters for turning high strength steel grade-H: comparative study of regression model versus ANFIS. Adv Mater Sci Eng 2017(1):2759020

Sen B, Mandal UK, Mondal SP (2017) Advancement of an intelligent system based on ANFIS for predicting machining performance parameters of Inconel 690–A perspective of metaheuristic approach. Measurement 1(109):9–17

Coello CAC (2002) MOPSO: a proposal for multiple objective particle swarm optimization. In: Proceedings of the 2002 Congress on Evolutionary Computation CEC'02 (Cat. No. 02TH8600), pp 1051–1056. https://doi.org/10.1109/CEC.2002.1004388

Manav O, Chinchanikar S, Gadge M (2018) Multi-performance optimization in hard turning of AISI 4340 Steel using Particle Swarm Optimization technique. Mater Today Procee 5(11):24652–24663. https://doi.org/10.1016/j.matpr.2018.10.263

Rath D, Panda S, Pal K (2018) Prediction of surface quality using chip morphology with nodal temperature signatures in hard turning Of AISI D3 steel. Mater Today Procee 5(5):12368–12375. https://doi.org/10.1016/j.matpr.2018.02.215

Bartarya G, Choudhury SK (2012) State of the art in hard turning. Int J Mach Tools Manuf 53(1):1–14. https://doi.org/10.1016/j.ijmachtools.2011.08.019

Alp H, Çiçek A, Uçak N (2020) The effects of CryoMQL conditions on tool wear and surface integrity in hard turning of AISI 52100 bearing steel. J Manuf Process 56(May):463–473. https://doi.org/10.1016/j.jmapro.2020.05.015

Yılmaz B, Karabulut Ş, Güllü A (2020) A review of the chip breaking methods for continuous chips in turning. J Manufact Process 1(49):50–69

Pant P, Chatterjee D (2020) Prediction of clad characteristics using ANN and combined PSO-ANN algorithms in laser metal deposition process. Surface Int 1(21):100699

Kaladhar M (2019) Evaluation of hard coating materials performance on machinability issues and material removal rate during turning operations. Measurement 135:493–502. https://doi.org/10.1016/j.measurement.2018.11.066

Sahu NK, Andhare AB, Andhale S, Abraham RR (2018) Prediction of surface roughness in turning of Ti-6Al-4V using cutting parameters, forces and tool vibration. IOP Conf Series Mater Sci Eng 346:012037. https://doi.org/10.1088/1757-899X/346/1/012037

Guleria V, Kumar V, Singh PK (2022) Classification of surface roughness during turning of forged EN8 steel using vibration signal processing and support vector machine. Eng Res Express 4(1):015029. https://doi.org/10.1088/2631-8695/ac57fa

Acknowledgements

The authors acknowledge the support offered by the National Research, Development, and Information Office (NRDIO) through projects: Research on prime exploitation of the potential provided by industrial digitalization (ED_18-2018-0006) and transient deformation, thermal, and tribological processes at fine machining of metal surfaces of high hardness (OTKA-K-132430). In addition, the authors acknowledge the support of the National Laboratory of Artificial Intelligence, which NRDIO supports under the Ministry for Innovation and Technology. We acknowledge the BME administration for the permission to use the facilities at the Engineering workshop.

Funding

Open access funding provided by Budapest University of Technology and Economics. National Research, Development, and Information Office (NRDIO), ED_18-2018-0006.

Author information

Authors and Affiliations

Faculty of Mechanical Engineering, Department of Manufacturing Science and Engineering, Budapest University of Technology and Economics, Budapest, Hungary

Ogutu Isaya Elly, Ugonna Loveday Adizue, Amanuel Diriba Tura, Balázs Zsolt Farkas & Márton Takács

Department of Engineering Research Development and Production, Projects Development Institute (PRODA), Enugu, Nigeria

Ugonna Loveday Adizue

Contributions

Research conceptualization and design were carried out by all the authors. Balázs Farkas configured the ultra-precision lathe and the other data-collection equipment. Ogutu Isaya Elly, Ugonna Loveday Adizue, and Amanuel Diriba Tura carried out material preparation, data collection, and analysis. Ogutu Isaya Elly prepared the first draft of this paper and compiled subsequent versions based on the other authors' comments and suggestions to produce the final document. Márton Takács supervised the entire research. The final draft was proofread and approved by all the authors.

Corresponding author

Correspondence to Ogutu Isaya Elly .

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Consent to participate

All authors participated in the research work reported in this paper.

Consent to publication

All authors agreed to publish the findings of this research.

Additional information

Technical Editor: Lincoln Cardoso Brandao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Elly, O.I., Adizue, U.L., Tura, A.D. et al. Analysis, modelling, and optimization of force in ultra-precision hard turning of cold work hardened steel using the CBN tool. J Braz. Soc. Mech. Sci. Eng. 46 , 624 (2024). https://doi.org/10.1007/s40430-024-05167-4

Received : 05 February 2024

Accepted : 22 August 2024

Published : 24 September 2024

DOI : https://doi.org/10.1007/s40430-024-05167-4

Keywords

  • AISI D2 steel
  • Machine learning models
