2.1. Single Factor
2.1.1. Description:
Single factor controlled experiments are characterized by having only
one independent variable that is manipulated [3]. This is the simplest
form of a controlled experiment. Once the experiment has been planned,
participants should be randomly selected for participation and then
randomly assigned to the treatment groups. If participants are not
selected and assigned randomly, experimenter bias can enter the results.
The next step is to have the participants complete the planned tasks.
Care should be taken that each participant receives the same
instructions and follows the same procedure; otherwise extraneous error
is introduced that weakens confidence in the results. Once the data have
been collected, statistical analyses are conducted to determine whether
the treatments caused a significant difference in the outcome measures
[2]. The choice of test depends on the number of treatments (levels) of
the independent variable: with only two levels a t-test can be used, but
with more than two levels a one-way ANOVA must be used [2]. The
following examples show what a table and chart of a single factor
experiment would look like. The treatments of the independent variable
are across the top of the table and the dependent variable is on the
left.
|               | Small Screen | Large Screen |
| Reading Speed | 10.2 s       | 8.3 s        |
This single factor experiment has two treatments of the independent
variable: small and large screen size. Looking at the graph, there
appears to be a difference in reading speed between the two screen
sizes. A t-test should be conducted to determine whether this difference
is statistically significant.
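As a minimal sketch of the t-test mentioned above, the following computes
Student's pooled-variance t statistic for two independent groups using
only the Python standard library. The per-participant reading times are
hypothetical values invented for illustration (chosen so that the group
means match the 10.2 s and 8.3 s in the table); the function name
`pooled_t` is also an assumption of this sketch. The critical value is
the standard two-tailed table value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Return (t, df) for Student's independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical reading times in seconds (illustration only, not study data).
small_screen = [10.9, 9.8, 10.4, 9.6, 10.3]
large_screen = [8.8, 7.9, 8.5, 8.0, 8.3]

t_stat, df = pooled_t(small_screen, large_screen)
T_CRIT_DF8 = 2.306  # two-tailed critical t for df = 8 at alpha = 0.05
significant = abs(t_stat) > T_CRIT_DF8
print(f"t({df}) = {t_stat:.2f}, significant at 0.05: {significant}")
```

With real data, a statistics package would normally report an exact
p-value instead of a comparison against a tabled critical value.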
|                | Small Font | Medium Font | Large Font |
| Scanning Speed | 9.5 s      | 6.5 s       | 8.4 s      |
By examining this graph, it appears that the medium font size has the
fastest scanning speed. Because there are three treatments of the
independent variable (small, medium, and large font size), a one-way
ANOVA should be conducted. A one-way ANOVA is used when there is only
one independent variable; it determines whether at least one of the
treatment means differs significantly from the others. If there is a
significant difference, post-hoc tests should be conducted to determine
which treatment mean is different from the others [2].
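The one-way ANOVA described above can be sketched as follows, again with
the standard library only. The individual scanning times are
hypothetical (invented so that the group means match 9.5 s, 6.5 s, and
8.4 s in the table), and `one_way_anova` is an illustrative name, not a
library function. The F ratio compares the variability between the
treatment means with the variability inside the groups.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Between-groups sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scanning times in seconds (illustration only, not study data).
small_font = [9.9, 9.1, 9.6, 9.2, 9.7]
medium_font = [6.8, 6.1, 6.6, 6.3, 6.7]
large_font = [8.7, 8.0, 8.5, 8.2, 8.6]

f_stat, df_b, df_w = one_way_anova([small_font, medium_font, large_font])
F_CRIT_2_12 = 3.89  # critical F for (2, 12) degrees of freedom at alpha = 0.05
print(f"F({df_b},{df_w}) = {f_stat:.1f}, significant: {f_stat > F_CRIT_2_12}")
```

A significant F only says that some difference exists; the post-hoc
tests mentioned above are still needed to locate it.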
2.1.2. Advantages/Disadvantages
The single factor experiment is considered to be statistically and
structurally elegant because it is easy to control and because simple,
robust statistics can be performed on the data [3]. It is also easy to
extend the independent variable to more groups. The major drawback of
this type of controlled experiment is that it is not very efficient.
Experiments can take a lot of time, money, and effort to complete, so it
is desirable to examine several variables in one experiment rather than
look at each one separately in many experiments. By using more than one
independent variable in an experiment, the resources required can
essentially be cut in half compared to conducting several separate
experiments.
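The "cut in half" claim can be illustrated with simple arithmetic. The
sketch below assumes, purely for illustration, that 20 participants are
wanted at each treatment level and that each variable has two levels;
these numbers are not from the source.

```python
PER_LEVEL = 20  # hypothetical number of participants wanted per treatment level

# Two separate single-factor experiments, each with two treatment levels.
separate_total = 2 * (2 * PER_LEVEL)

# One 2 x 2 multi-factor experiment: each level of a variable spans two
# cells, so 10 participants per cell still gives 20 per level.
per_cell = PER_LEVEL // 2
factorial_total = 2 * 2 * per_cell

print(f"separate experiments: {separate_total} participants")
print(f"combined experiment:  {factorial_total} participants")
```

Under these assumptions the combined design needs half as many
participants while keeping the same number of observations per level.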
2.1.3. Examples
Example 1:
Icons in graphical user interfaces can be used in many different ways
and for many different purposes. But are icons always better than text
alone, or than a mix of icons and text? The question is one of
generality. At one extreme it might be argued that icons are inherently
more attractive than text and should be used whenever possible. At the
other extreme one might argue that icons are useful only for functions
that are inherently visual, or for systems that include a complete
graphical metaphor such as a virtual desktop. Douglas et al. [4] tried
to answer this question by performing a controlled experiment with three
different interfaces: (1) icons only, (2) icons with command names, and
(3) command names only. The experiment measured user preference for each
type of interface.
Independent Variable: Interface. Three treatments:
1. Icons only
2. Icons and command names
3. Command names only
Dependent Variable: User preference rating
Results:
|                        | Icons only | Icons & Command name | Command names only |
| User Preference Rating | 3.92       | 5.55                 | 5.67               |
Figure 1. User preference ratings on each of the interfaces
Figure 2. User preference ratings vs. interfaces
Example 2:
Author's Interactive Design Dialogue Environment (AIDE) is an
interactive tool for human-computer interface implementation. With this
tool, the user can implement an interface by directly manipulating and
defining objects rather than by the traditional method of writing source
code. A controlled experiment was conducted by Deborah Hix [5] to
empirically evaluate the usefulness of such a tool.
Hypothesis: Creating and modifying an interface is faster and easier
with AIDE than by writing source code in a programming language.
Subjects: A group of 3 expert AIDE users and a group of 3 expert 'C'
programmers were chosen. Expert subjects were chosen to avoid the issue
of training.
Tasks: 1. Interface creation 2. Interface modification.
Both groups of subjects were given the same written description of the
interface, including sketches and a textual explanation. They were told
that the task had two parts, creation and modification, and that they
would be given the creation task first and then, once the experimenter
had verified it for correctness, the modification task.
Independent Variable:
Mechanism for creating the user interface. Two treatments: 1. AIDE
2. Programming language
Dependent Variables:
1. Length of time taken to create the user interface
2. Length of time taken to modify the user interface
Results:
|                               | AIDE   | Programming Language |
| Creation task (mean time)     | 43 min | 168 min              |
| Modification task (mean time) | 29 min | 63 min               |
Figure 3. Table summarizing the time taken to complete each task using
the two different mechanisms
The difference between the two groups was significant for the creation
task, t(4) = 11.9, p<0.005, and for the modification task, t(4) = 9.9,
p<0.005.
Figure 4. Mean time vs. task for each mechanism
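The "4" in t(4) above is the degrees of freedom, which for an
independent-samples t-test is the total number of subjects minus two.
The short check below assumes that this is the test that was reported
(the source does not name the variant) and compares the reported t
values with the standard two-tailed critical value at the 0.05 level.

```python
n_aide, n_programmers = 3, 3     # group sizes reported in the study
df = n_aide + n_programmers - 2  # df for an independent-samples t-test

T_CRIT_DF4 = 2.776  # two-tailed critical t for df = 4 at alpha = 0.05
reported_t = {"creation": 11.9, "modification": 9.9}  # values from the text

# A reported t larger than the critical value is significant at that level.
exceeds = {task: t > T_CRIT_DF4 for task, t in reported_t.items()}
print(f"df = {df}, exceeds critical value: {exceeds}")
```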
2.2. Multi Factor
2.2.1. Description
Multi-factor controlled experiments involve two or more independent
variables [3]. The same procedures used for the single factor experiment
are used for the multi-factor experiment. This design is used more often
than the single factor design because of the greater variety of research
questions that can be answered with it. With this design, the data will
always have to be analyzed with ANOVA [2]. Whether a two-way, three-way,
or higher-order ANOVA is used depends on the number of independent
variables. The following examples show what tables and charts for a
2 x 2 and a 2 x 3 multi-factor experiment would look like. The
independent variables are on the top and on the left, while the values
of the dependent variable are inside the cells.
|             | Small screen | Large screen |
| Practice    | 10.2 s       | 8.3 s        |
| No practice | 7.8 s        | 12.6 s       |
In this example, it appears that the practice group read faster when
using the large screen, while the no-practice group read faster when
using the small screen. This is known as an interaction effect, i.e. the
effect of one independent variable depends on the level of the other;
here the two groups perform oppositely on the two treatments [2]. In
graph form, an interaction effect appears as lines that are not
parallel, and it is most obvious when, as here, the two lines cross. If
the lines were parallel, it would mean that screen size affected both
groups in the same way. A two-way ANOVA can determine whether the main
effects and the interaction are statistically significant.
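The crossing pattern can also be read directly from the cell means. The
sketch below uses the four values from the 2 x 2 table and computes the
effect of screen size separately for each group; the variable and
function names are illustrative. It describes the pattern only and does
not replace the two-way ANOVA.

```python
# Cell means in seconds, taken from the 2 x 2 table.
means = {
    ("practice", "small"): 10.2,
    ("practice", "large"): 8.3,
    ("no practice", "small"): 7.8,
    ("no practice", "large"): 12.6,
}


def screen_effect(group):
    """Large-screen mean minus small-screen mean for one group."""
    return means[(group, "large")] - means[(group, "small")]


practice_effect = screen_effect("practice")        # negative: faster on large
no_practice_effect = screen_effect("no practice")  # positive: slower on large

# If screen size affected both groups equally, this contrast would be zero.
interaction_contrast = practice_effect - no_practice_effect

# Opposite signs mean the two lines cross on the graph.
crossover = practice_effect * no_practice_effect < 0
print(f"{practice_effect:.1f} {no_practice_effect:.1f} "
      f"{interaction_contrast:.1f} {crossover}")
```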
|             | Small font | Medium font | Large font |
| Training    | 9.5 s      | 6.8 s       | 8.4 s      |
| No training | 8.9 s      | 6.1 s       | 8.2 s      |
In this example, both groups performed similarly on all three treatments
of font size. Both groups had the fastest scanning speed with the medium
font and slower speeds with the small and large fonts. As can be seen
from the graph, there is no apparent interaction effect (the lines are
roughly parallel and never cross). A two-way ANOVA can determine whether
the differences in performance are significant [2].
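As a rough numerical companion to the graph, the marginal means behind
each possible main effect can be computed from the 2 x 3 table. The
sketch below uses only the six cell means from the table; the names are
illustrative, and whether the differences are significant is still a
question for the two-way ANOVA.

```python
from statistics import mean

# Cell means in seconds, taken from the 2 x 3 table.
cells = {
    "training": {"small": 9.5, "medium": 6.8, "large": 8.4},
    "no training": {"small": 8.9, "medium": 6.1, "large": 8.2},
}
fonts = ["small", "medium", "large"]

# Marginal mean per font size (averaged over groups) and per group
# (averaged over font sizes).
font_means = {f: mean([cells[g][f] for g in cells]) for f in fonts}
group_means = {g: mean(list(cells[g].values())) for g in cells}

# Gap between the two lines at each font size; a sign change would mean
# that the lines cross.
gaps = {f: cells["training"][f] - cells["no training"][f] for f in fonts}
lines_cross = min(gaps.values()) < 0 < max(gaps.values())

fastest_font = min(font_means, key=font_means.get)
print(font_means, group_means, lines_cross, fastest_font)
```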
2.2.2. Advantages/Disadvantages
The greatest benefit of multi-factor experiments is the ability to
analyze interaction effects between the independent variables [3]. Not
only can you see whether there was an overall difference because of the
screen size (i.e. a main effect), but you can also see whether the
practice and no-practice groups performed differently from each other
with respect to the screen sizes (i.e. an interaction effect). As can be
seen from the first example, the no-practice group had a faster reading
speed with the small screen and the practice group had a faster reading
speed with the large screen. In general, if the lines of the two groups
cross on the graph, there is an interaction effect. As can be seen in
the second example, the two lines do not cross and therefore no
interaction effect is apparent. Both the training and no-training groups
had the fastest scanning speeds when the medium font was used. It
appears that there is a main effect for font size because the medium
font always produced the fastest scanning times. It is unclear just from
examining the graph whether there is a main effect for training because
there is not a great gap between the two lines (i.e. the groups seem to
be performing similarly).
Another benefit of multi-factor designs is that they are considered to
be more efficient than single-factor designs [3]. Not only can you
examine more independent variables, but combining them through the
various interactions allows for greater generalization to other
situations.
A disadvantage of this design compared with the others is that it can
become too complicated if too many independent variables are explored at
once [2]. Also, any design with more than three independent variables
becomes very difficult to analyze without a statistics program to help
with the analyses.
2.2.3. Examples
Example 3.
A menu is a list with a limited number of options. Gary Perlman [6]
conducted a controlled experiment to study how menu length, menu
ordering, and menu items affect search time.
Null Hypothesis: Menu length, menu ordering, and menu items have no
effect on search time.
Independent Variables:
1. Menu list. Two treatments: a. Numbers from 1 to 20 b. Names starting
with the letters 'a' through 't'
2. Menu length. Four treatments: a. 5 b. 10 c. 15 d. 20
3. List type. Two treatments: a. Sorted b. Random
Dependent Variable:
1. Search time
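The three independent variables listed above define a 2 x 4 x 2 design.
As a small sketch, the full set of treatment combinations can be
enumerated with `itertools.product`; the dictionary keys and level
labels are illustrative names, not identifiers from the study. The
degrees of freedom for each main effect (number of levels minus one)
correspond to the first number inside the F(...) values reported in the
results.

```python
from itertools import product

# Independent variables and their treatments, as listed above.
factors = {
    "menu_list": ["numbers 1-20", "names a-t"],
    "menu_length": [5, 10, 15, 20],
    "list_type": ["sorted", "random"],
}

# Every combination of one treatment per variable is one condition.
conditions = list(product(*factors.values()))

# Degrees of freedom for each main effect: number of levels minus one.
effect_df = {name: len(levels) - 1 for name, levels in factors.items()}

print(f"{len(conditions)} conditions, main-effect df: {effect_df}")
```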
Results:
1. Finding words took longer than finding numbers, F(1,28) = 11.7,
p<0.01
2. Sorted lists were easier to search than random lists, F(1,28) =
10.05, p<0.001
3. It took longer to find items in longer lists than in shorter lists,
F(3,84) = 113.86, p<0.001
Figure 5. Response time vs. List Length
Example 4.
Menus have been a popular method for accessing information in computer
systems, but human short-term memory limitations reduce performance
efficiency for deeper hierarchies. Scrolling offers an alternative way
of accessing information. Sarah J. Swierenga [7] conducted a controlled
experiment to compare the relative efficiency of scrolling and menus as
alternative access methods.
Independent Variables:
1. Access method (four treatments): a. Menuing (previous menu, next
menu, main menu, etc.) b. Line by line c. Half screen (12 lines)
d. Full screen (24 lines)
2. Word familiarity (two treatments): a. Familiar words b. Unfamiliar
words
Dependent Variable:
1. Mean total task time
Results:
1. For familiar words, the effect of access method was significant,
F(3,40) = 18.27, p<0.0001
2. For unfamiliar words, the effect of access method was significant,
F(3,40) = 69.38, p<0.0001
3. Menuing was the fastest technique, followed by line-by-line, full
screen, and half screen scrolling
Figure 6. Means for access method by word familiarity on mean total
task time
2.3. Quasi-experimental
2.3.1. Description:
A quasi-experimental design is a controlled experiment without all the
control [3]. In essence, what is lacking is random assignment to groups.
Quasi-experiments are very similar to true experiments but use naturally
formed or pre-existing groups. For example, a study of the performance
differences between two age groups on a certain interface would be
considered a quasi-experiment because the age groups are naturally
formed. It is impossible to assign people randomly to young and old age
groups because age is predetermined. Another characteristic of
quasi-experimental designs is that the testing environment may not be as
controlled. Instead of being done in a lab setting, the testing may be
done on a job site or in someone's home.
2.3.2. Advantages/Disadvantages
The advantage of quasi-experiments is that they are easier to implement
[3]. It is much easier to use groups that are already formed than to
have to worry about randomization. The main disadvantage of this design
is that it is inferior in terms of internal validity. The threats to
internal validity inherent in this design are selection bias and the
interaction of selection and maturation of the participants. Because the
participants were not randomly assigned, it is impossible to know
whether the changes that occurred were due to the treatment or to
differences and changes in the individuals. Further, you must be careful
in making statements of causality because of the lack of total control.
For example, a quasi-experiment could be used to look for a trend in the
effect of technology in schools. Select a school that does not have
computers and evaluate the students' performance. Then give the students
in the same school access to computers, but allow them to choose whether
or not to participate in computer classes. Finally, compare the
performance of the students who use the computers with that of those who
do not.
Case studies are somewhat different from traditional controlled
experiments, but they can still fall under the same category. This
design typically involves one person, on whom many observations are made
[3]. The experimenter chooses one behavior (the dependent variable) and
measures it repeatedly. This is usually accomplished with what is called
a time-series design, i.e. the participant is observed before, during,
and after the independent variable is introduced. The goal is to examine
one person and his or her behavior very closely. More complicated
designs involve turning the independent variable on and off several
times to observe the effects. In these situations the independent
variable is usually something like a different font size or a different
organization of information.
The great advantage of case studies is the amount of control the
experimenter has over the situation because there is only one person.
Another advantage is the potential to get ample, detailed data [3]. A
challenge in the case study design is to get a good baseline of behavior
before the independent variable is introduced. The behavior must be
stable before the different font or different organization is
introduced; otherwise you will not be able to determine whether the
change in behavior is due to chance or to the independent variable.
Other disadvantages include poor generalization to other people or
groups, experimenter bias in the selection of the individual to be
observed, and the lack of robust statistics generally used. In many
cases, experimenters will examine the data on a graph and judge whether
a significant change has occurred [2]. This is known as "eye-balling"
the data and is not a reliable or valid statistical method.
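As a small step beyond eye-balling, the phases of a time-series design
can at least be summarized numerically. The sketch below uses
hypothetical task times for one participant, and the phase names, helper
functions, and the 0.5 s stability tolerance are all assumptions made
for illustration. It checks that the baseline is stable and reports the
mean for each phase; it is a descriptive summary, not a statistical
test.

```python
from statistics import mean


def phase_means(observations):
    """Average the dependent variable within each phase of a time series."""
    by_phase = {}
    for phase, value in observations:
        by_phase.setdefault(phase, []).append(value)
    return {phase: mean(values) for phase, values in by_phase.items()}


def baseline_is_stable(values, tolerance):
    """Treat the baseline as stable if its range is within the tolerance."""
    return max(values) - min(values) <= tolerance


# Hypothetical task times in seconds for a single participant.
observations = [
    ("baseline", 12.1), ("baseline", 11.8), ("baseline", 12.0), ("baseline", 12.2),
    ("treatment", 9.6), ("treatment", 9.4), ("treatment", 9.5),
    ("withdrawal", 11.9), ("withdrawal", 12.0), ("withdrawal", 12.1),
]

means = phase_means(observations)
baseline_values = [v for phase, v in observations if phase == "baseline"]
stable = baseline_is_stable(baseline_values, tolerance=0.5)
print(f"baseline stable: {stable}, phase means: {means}")
```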