Evaluation

 

Evaluation Background

The purpose of this usability test is to evaluate the Gesture Menu, which I implemented for Assignment #2. In this experiment, I compared the Kinect “Swipe” gesture with a conventional game pad controller of the kind used with the Xbox or PlayStation. Although the result might seem obvious intuitively, I formulated a few hypotheses and tested them with statistical analysis. The test measures how test subjects respond in three areas: efficiency, accuracy, and emotional response.

Hypothesis

  1. Users will complete their tasks in less time with the game pad controller than with the Kinect gesture. In addition, users will be more likely to produce misses or errors with the “Swipe” gesture.
  2. For the “Complex” task, results will show lower performance in terms of completion time as well as the number of misses and errors.

Identified Variables

Independent variables

  • Gesture control scheme: Kinect (swipe gesture) and Game Pad
  • Menu depth (simple and complex)

Dependent variables

  • Task Completion Time
  • Error Rate (user selects incorrect menu item)
  • Misses (users perform selection gesture but system does not recognize it)

Design Experiment and Run

Setup Environment

  • I used “JoyToKey” to control the Gesture Menu with a regular PlayStation 3 game controller, and mapped the controller to the mouse.

Task Level

  • Simple: Movies => Rango => Movies => Limitless => (4 swipe gestures with 2 levels of depth)
  • Complex: Music => Novalima – Afro => Bandolero => Apps => Google+ => Music => Philipp Glass -The Hours <= Music => Novalima – Afro => Machete => (10 swipe gestures with 3 levels of depth)

Subjects

  • I recruited 5 participants, all graduate students at UC Berkeley majoring in Mechanical Engineering and Mathematics.
  • Before starting the experiments, I gave brief instructions on the menu structure and on how to control the two input devices.
  • Each participant performed 2 sets of experiments with the 2 input devices (Kinect swipe gesture and console game controller). Participants used the devices in different orders to reduce order effects.
  • I also recorded their performance in order to capture completion time, error rates, and miss rates.
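The exact ordering scheme is not described above, so the following is only a minimal sketch of one common way to vary device order: alternating the starting device from one participant to the next. The participant labels and the `assign_orders` helper are hypothetical.

```python
from itertools import cycle

DEVICES = ("Kinect", "Game Pad")

def assign_orders(participants):
    """Alternate the starting device so each order is used about equally often."""
    orders = cycle([DEVICES, DEVICES[::-1]])
    return {p: next(orders) for p in participants}

# Hypothetical participant labels, for illustration only.
plan = assign_orders(["P1", "P2", "P3", "P4", "P5"])
for participant, order in plan.items():
    print(participant, "->", " then ".join(order))
```

With an odd number of participants, one order is necessarily used once more than the other.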

Questionnaires

  • After the experiments, I asked participants to fill out a questionnaire (Google Form) designed to gather qualitative data.
  • I used a Likert scale to evaluate user preference for the two input methods, plus an open-ended question asking participants to reflect on the relative advantages and disadvantages of the two techniques they compared.

Statistical Analysis

I collected data from 5 participants across 4 conditions (Simple task with Kinect, Simple task with Game Pad, Complex task with Kinect, and Complex task with Game Pad). From this data, I produced graphs and descriptive statistics comparing task completion time and error/miss rates for the two input types. I also used t-tests and ANOVA as inferential statistics to determine whether the observed differences are significant. For this analysis I used R, an open-source statistical analysis tool.

Quantitative analysis

As hypothesized, the game pad controller performed better than the Kinect gesture. In particular, the spread of completion times for the Kinect swipe gesture is much wider than for the game pad controller, which suggests that the game pad gave users a more consistent experience than Kinect did. A paired t-test on completion time between Game Pad and Kinect also supports the hypothesis: the difference is significant at p < .01 (p-value: 8.325e-06, mean of the differences: 29.8). Error and miss counts showed the same pattern (p-value for Errors: 0.00575, p-value for Misses: 0.0005164). All three differences are therefore statistically significant at the .01 level.
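To make the paired t-test concrete, here is a minimal Python sketch of how the test statistic is formed from per-participant differences. It is not the R script used for this analysis, and the numbers are made up for illustration. It stops at the t statistic and degrees of freedom; the p-value would come from a t distribution (for example via R's `t.test(..., paired = TRUE)`).

```python
import math
import statistics

def paired_t(a, b):
    """Return (mean difference, t statistic, degrees of freedom) for paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t_stat = mean_diff / (sd_diff / math.sqrt(len(diffs)))
    return mean_diff, t_stat, len(diffs) - 1

# Hypothetical completion times, NOT the study's data.
kinect = [30.0, 27.0, 35.0, 29.0, 41.0]
game_pad = [10.0, 9.0, 12.0, 8.0, 11.0]

mean_diff, t_stat, df = paired_t(kinect, game_pad)
print(f"mean difference = {mean_diff:.1f}, t = {t_stat:.2f}, df = {df}")
```

Because each participant used both devices, the test works on the differences within each pair rather than on the two groups separately.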

Median values with Standard Deviation for two input methods

                        Errors        Misses        Completion Time
  Game Pad (Simple)     0.0 (0.0)     0.0 (0.0)     9.50 (2.32)
  Game Pad (Complex)    0.0 (0.42)    0.0 (0.0)     23.50 (5.64)
  Game Pad (Overall)    0.0 (0.31)    0.0 (0.0)     16.00 (8.45)
  Kinect (Simple)       0.0 (0.71)    2.0 (1.63)    29.00 (6.59)
  Kinect (Complex)      1.5 (1.51)    3.0 (4.29)    61.50 (23.76)
  Kinect (Overall)      0.5 (1.26)    2.0 (3.37)    39.5 (27.23)
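Each cell in a table like this is a median followed by a standard deviation in parentheses. The sketch below shows that computation in Python, assuming the sample standard deviation (n - 1 denominator, which is what R's `sd()` computes). The data and the `median_and_sd` helper are hypothetical.

```python
import statistics

def median_and_sd(values):
    """Return (median, sample standard deviation) for one condition."""
    return statistics.median(values), statistics.stdev(values)

# Hypothetical completion times per condition, NOT the study's data.
samples = {
    "Game Pad (Simple)": [8.0, 9.0, 10.0, 12.0, 9.0],
    "Kinect (Simple)": [25.0, 29.0, 31.0, 38.0, 27.0],
}

for condition, times in samples.items():
    med, sd = median_and_sd(times)
    print(f"{condition}: {med:.2f} ({sd:.2f})")
```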



P-Value and Mean of the difference from the paired t-test

                                     Errors                     Misses                     Completion Time
  Kinect – GamePad (simple task)     0.05218 (0.5) (p > .05)    0.003772 (2) (p < .05)     8.301e-06 (16.2) (p < .01)
  Kinect – GamePad (complex task)    0.03319 (1.3) (p < .05)    0.01145 (4.3) (p < .05)    0.000306 (43.4) (p < .01)

I also ran the paired t-test separately for each task difficulty (Simple and Complex). All comparisons were statistically significant except one: the error rate for the simple task narrowly missed significance (p = 0.05218). Even so, the results indicate that users perform differently with the two input methods in terms of error rates, miss rates, and completion time.

Bar graphs with error bars that show 95% confidence intervals
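The half-width of a 95% error bar on a mean is the critical t value times the standard error. The Python sketch below illustrates this with a small hard-coded table of critical values; the data and the `ci95` helper are hypothetical, and this is not the code that produced the graphs.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t for a few degrees of freedom.
T_CRIT_95 = {4: 2.776, 9: 2.262, 19: 2.093}

def ci95(values):
    """Return (lower, upper) bounds of the 95% confidence interval of the mean."""
    n = len(values)
    df = n - 1
    if df not in T_CRIT_95:
        raise ValueError(f"no critical value stored for df={df}")
    mean = statistics.mean(values)
    half_width = T_CRIT_95[df] * statistics.stdev(values) / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical completion times (10 observations), NOT the study's data.
times = [9.0, 10.0, 8.0, 12.0, 11.0, 9.0, 10.0, 13.0, 7.0, 11.0]
lower, upper = ci95(times)
print(f"mean = {statistics.mean(times):.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

With small samples the t critical value is noticeably larger than 1.96, so the error bars are wider than a normal approximation would suggest.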

Are the differences between the group means for Controller and Task complexity statistically significant? To answer this question, I conducted two-way ANOVA tests for completion time, error rate, and miss rate.

First, the ANOVA output indicated that the completion time differences between Controllers (p-value: 9.324e-09) and between Tasks (p-value: 3.862e-08) were both statistically significant (p < .01). The interaction between Controller and Task was also statistically significant (p-value = 0.001727), so in addition to the two main effects, the size of the controller effect depends on task complexity.

In addition to completion time, I also ran ANOVA tests for Errors and Misses. For error rate, the differences between Controllers and between Tasks were both statistically significant, but there was no statistically significant interaction between Controller and Task (p-value = 0.149835). Lastly, for miss rate, only Controller was statistically significant (p < .05); Task and the interaction between the two independent variables were not significant (p > .05).
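For readers who want to see where the two main effects and the interaction come from, here is a minimal Python sketch of the sums of squares in a balanced two-way ANOVA. It treats every observation as independent, which is a simplification for a study in which each participant used both devices, and it returns F statistics only, not p-values. The data are made up, and the exact R call used for this analysis is not shown in this post (a typical form would be `aov(time ~ controller * task)`).

```python
import statistics

def two_way_anova(cells):
    """cells maps (level_a, level_b) -> observations; the design must be balanced.

    Returns F statistics for factor A, factor B, and the A x B interaction.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    if len(cells) != len(levels_a) * len(levels_b) or any(len(v) != n for v in cells.values()):
        raise ValueError("balanced, fully crossed design required")

    grand = statistics.mean([x for v in cells.values() for x in v])
    mean_a = {a: statistics.mean([x for b in levels_b for x in cells[(a, b)]]) for a in levels_a}
    mean_b = {b: statistics.mean([x for a in levels_a for x in cells[(a, b)]]) for b in levels_b}
    mean_cell = {k: statistics.mean(v) for k, v in cells.items()}

    # Sums of squares for each main effect, the interaction, and the residual.
    ss_a = n * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = n * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = n * sum((mean_cell[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                    for a in levels_a for b in levels_b)
    ss_err = sum((x - mean_cell[k]) ** 2 for k, v in cells.items() for x in v)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_err = len(levels_a) * len(levels_b) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "A": (ss_a / df_a) / ms_err,
        "B": (ss_b / df_b) / ms_err,
        "A:B": (ss_ab / (df_a * df_b)) / ms_err,
    }

# Hypothetical completion times, NOT the study's data.
cells = {
    ("Game Pad", "Simple"): [9.0, 10.0, 11.0],
    ("Game Pad", "Complex"): [22.0, 24.0, 26.0],
    ("Kinect", "Simple"): [27.0, 30.0, 33.0],
    ("Kinect", "Complex"): [55.0, 60.0, 65.0],
}
f_stats = two_way_anova(cells)
for effect, f_value in f_stats.items():
    print(f"F({effect}) = {f_value:.2f}")
```

Here factor A is the controller and factor B is the task. A large interaction F means the gap between controllers differs between the simple and complex tasks.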

Qualitative analysis

After test subjects finished their tasks, I asked them a few questions on a Likert scale. All participants had some experience with game pad controllers, but nobody had used Kinect before. Consistent with the wide gap between input methods in the quantitative results, all participants preferred the game pad controller and gave it the highest rating for both comfort and satisfaction. This is largely because they had difficulty navigating the menu structure with Kinect. For the Kinect interface, two participants answered “Slightly satisfied” and two answered “Neutral”. One participant, who was not accustomed to the “Swipe” gesture, chose “Dissatisfied”. The following are comments from the open-ended question about the advantages and disadvantages of each input method.

  • Kinect… it’s very hard to control.
  • Kinect requires more physical energy to operate tasks than game pad.
  • Kinect swipe gesture often recognize unintended gesture and it require a lot of attentions in order to stop unintended actions.
  • It needs some time for me to get used to using Kinect, but it would be fun and convenient for someone is familiar with it.

Conclusion and Discussion

In terms of quantitative analysis, the game pad controller gave users a more reliable and consistent experience than the “Swipe” gesture. I also noticed that most participants had many more misses and errors with Kinect swipe gestures than with the game pad controller. These results imply that our ‘Gesture Menu’ needs a lot of improvement: the menu layout was not efficient for the ‘Swipe’ gesture, and our ‘Swipe’ gesture recognition demands overly strict input from users.

In addition, swipe gestures seem very intuitive because many mobile devices use them on their touch screens. However, the swipe gesture has a few critical drawbacks. It requires more physical energy and attention when the hand is positioned above the head or below the belly button. Also, users often pull their hand back quickly to prepare for the next swipe, and this return motion triggers unintended actions.
