As a rule, usability is estimated through the time needed to solve a task, which is taken as the basic characteristic of interface effectiveness. Illustrative as this approach is, it explains very little. Indeed, let respondents solve the same task in two competing interfaces in times t_{1} and t_{2}, with t_{1} < t_{2}. Certainly, the first interface is more effective. But why? Is the reason related to the hardware, to the interaction logic, or to the influence of interface commands? The task solution time does not answer these questions. Our approach is based on other indexes and addresses several conceptual problems at once.
Firstly, we do not measure time and instead use more subtle interaction models. This lets us evaluate the idea, concept, and logic of a system rather than the finished, implemented system. This gives a considerable advantage in design: testing is done at the early stages of development, which substantially reduces costs, and, what is more, the measurement results suggest successful solutions for the following stages.
Secondly, our measurements are not tied to a concrete project. Once an interface scheme has been measured, it can be used in different parallel projects without repeated testing.
Finally, our method is very simple.
Scenarios diagrams
The set of interface states is described as a graph whose nodes are system states and whose edges are the actions available to the user in each state. The graph of states is reduced by cutting every path after the first wrong choice within the given task. The result is the so-called states diagram (see Fig. 1a), which still contains both right and wrong scenarios of the task solution. We mark the user's incorrect choices in grey, which singles out the trajectories of the right scenarios and gives the scenarios diagram (see Fig. 1b).
Each respondent action is encoded by one letter, so the trajectory of the user's movement through the interface is described by a sequence of letters, a "word". The choices in every state are encoded by Latin letters, beginning with A and moving clockwise in alphabetical order. Thus, to describe any trajectory it is enough to mark the first point A on the arc of the starting choices (see Fig. 1b).
In the diagram (Fig. 1b) the successful scenarios are AAAA, BCB, DCB, and FAC. Unsuccessful scenarios are BA, C, DCC, etc. There are 19 unsuccessful scenarios in all, as many as there are grey end points in the diagram, and 4 successful scenarios, as many as there are black end points.
For the experiment, the screens of the system are imitated by paper printouts based on the scenarios diagram: each sheet corresponds to a system state, and its content corresponds to the actions possible in that state. This is the material for the game that the experiment moderator plays with the respondents.
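The word encoding lends itself to a simple mechanical check. The following is a minimal sketch, not the authors' tooling: it assumes that the four successful scenarios named in the text fully determine the right trajectories, and it treats any single-letter departure from a right trajectory as a grey end point. The function names `right_prefixes` and `classify` are hypothetical, and the sketch does not know how many choices each state really offers.

```python
# Successful scenarios named in the text for the scenarios diagram.
SUCCESSFUL = {"AAAA", "BCB", "DCB", "FAC"}


def right_prefixes(scenarios):
    """Return every prefix of a successful scenario, including the empty start state."""
    return {s[:i] for s in scenarios for i in range(len(s) + 1)}


def classify(word, scenarios=SUCCESSFUL):
    """Classify a word of uppercase choice letters against the diagram."""
    # States that lie on a right trajectory but are not yet the goal.
    unfinished = right_prefixes(scenarios) - set(scenarios)
    if word in scenarios:
        return "successful"
    if word in unfinished:
        return "in progress"
    # A single wrong choice made from a state on a right trajectory:
    # under the assumption above, this is a grey end point.
    if word and word[:-1] in unfinished:
        return "unsuccessful"
    return "not in diagram"


for w in ["BCB", "BA", "C", "DCC", "DC"]:
    print(w, classify(w))
```

With these assumptions, BA, C, and DCC are classified as unsuccessful, in agreement with the examples in the text, while DC is still in progress.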
The game
Two persons take part in the experiment: the respondent and the moderator. The moderator knows the scenarios diagram, i.e. which actions in each state are right and which are wrong; the respondent does not. The respondent is asked to solve the task in all possible ways, from the most natural to the quite exotic, using as few actions as possible in each.


Figure 1. Interface diagrams: (a) the states diagram; (b) the scenarios diagram.
Each action of the respondent can be right or wrong. If the respondent takes a wrong action, the moderator tells him so and the respondent makes another attempt. If he takes a right action, the moderator tells him that as well, but the respondent understands that this is not the only right action, because he knows that a problem usually has several right solutions. When the goal is reached, the respondent returns to the initial state and tries to find a new solution.
Eventually there comes a point when the respondent decides that he has found enough solutions and that there are no others. The game is then finished.
Experimental protocol
While playing, the moderator keeps a record. The protocol contains notes on all the respondent's choices, their correctness, and the order in which they were made. These series of actions are the object of analysis: they describe the manner in which the user interacts with the interface.
Each user action is encoded by one letter according to the scenarios diagram. A series of letters forms the sequence of actions that makes up a successful scenario. During the game the respondent also makes mistakes, i.e. takes wrong actions. Unlike right actions, these are encoded by lowercase letters.
For example, from the diagram in Fig. 1b we can get the following experimental protocol:
scenario 1: AAAA,
scenario 2: BaCB,
scenario 3: DbCB,
scenario 4: eBeDCB (repeats scenario 2),
unsuccessful attempts: eFa.
We can see that the respondent regards the scenario along line A as the most natural. Along lines B and D the respondent makes mistakes. The scenario along line B is repeated twice, and the scenario along line F is not found at all.
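Because right actions are uppercase and wrong actions lowercase, a protocol entry can be split into its scenario and its mistake count mechanically. The sketch below is only an illustration: `parse_entry` is a hypothetical helper, and it counts every uppercase letter as one productive action, so it does not model the rule that a cycle of actions requiring no choice counts as a single action.

```python
def parse_entry(entry):
    """Split one protocol word into (scenario, productive actions, mistakes)."""
    # Uppercase letters are right actions; together they spell the scenario.
    scenario = "".join(ch for ch in entry if ch.isupper())
    # Lowercase letters are wrong actions.
    mistakes = sum(1 for ch in entry if ch.islower())
    return scenario, len(scenario), mistakes


# The first three scenarios of the example protocol.
protocol = ["AAAA", "BaCB", "DbCB"]
for entry in protocol:
    print(entry, parse_entry(entry))
```

For the entry BaCB this gives the scenario BCB with three productive actions and one mistake, i.e. four actions in total.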
Numeric characteristics of the interface
The game-based experimental procedure yields several interface characteristics at once from the experimental protocols. Two of them are numeric: the length of way S and the degree of falsity P.


Figure 2. Scenarios weights: (a) uniform weights distribution; (b) nonuniform weights distribution.
Let us begin with the length of way. It could simply be defined as the arithmetic mean of the lengths of all theoretically possible scenarios that lead to the task solution. However, a user may know only a few of the possible successful scenarios and may actually use only one or two. Consideration must therefore be given first of all to the lengths of these actively used scenarios. Let n_{1} actions, incorrect ones included, be spent on the first scenario, n_{2} on the second, and so on. We assign a weight w_{i} to each length n_{i}, treating it as the probability of using that scenario in a real situation. These probabilities are not uniformly distributed: larger probabilities correspond to earlier scenarios and smaller ones to later scenarios (see Fig. 2b). We assign the largest weight to the first scenario and the smallest to the last, assuming that the scenario following the last one has zero weight and that the weights are interpolated linearly. (In fact this dependence is hardly linear; most likely it is exponential or logarithmic. However, even linear interpolation is closer to reality than a uniform distribution.)
It is easy to obtain a formula for the weight of each scenario and to apply the weight coefficients when averaging the way length:
$$S = \sum_{i=1}^{k} w_{i} n_{i}, \quad \text{where} \quad w_{i} = \frac{2(k+1-i)}{k(k+1)}.$$
Here n_{i} is the number of actions the respondent spends on the ith scenario (both productive and mistaken), w_{i} is the weight of that scenario calculated from its number i, and k is the number of successful scenarios found by the respondent.
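The weighted length of way can be sketched in a few lines. This is a minimal illustration, assuming that linear interpolation with zero weight for the scenario after the last one means weights proportional to k, k-1, ..., 1, normalised to sum to one. The names `linear_weights` and `way_length` are hypothetical.

```python
from fractions import Fraction


def linear_weights(k):
    """Weights proportional to k, k-1, ..., 1, normalised so that they sum to 1."""
    total = k * (k + 1) // 2
    return [Fraction(k + 1 - i, total) for i in range(1, k + 1)]


def way_length(action_counts):
    """Weighted mean of per-scenario action counts, given in order of discovery."""
    weights = linear_weights(len(action_counts))
    return sum(w * n for w, n in zip(weights, action_counts))


# Action counts of three scenarios, mistakes included.
print(linear_weights(3))
print(way_length([2, 4, 4]))
```

Exact fractions are used so that the weights can be compared with hand calculations without rounding noise. Swapping in an exponential or logarithmic weighting would only require replacing `linear_weights`.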
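The degree of falsity shows how often the user makes mistakes. It is defined as the ratio of the number of wrong actions to the total number of actions:
$$P = \frac{\sum_{i=1}^{k} m_{i}}{\sum_{i=1}^{k} (n_{i} + m_{i})},$$
where n_{i} is the number of productive actions in the ith scenario (here, unlike in the length of way S, mistakes are counted separately), m_{i} is the number of mistakes made by the respondent in the ith scenario, and k, the summation limit, is the number of successful scenarios found by the respondent.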
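The ratio of wrong actions to all actions translates directly into code. The sketch below assumes the reading given in the text: each found scenario contributes a pair of productive actions and mistakes, and the ratio is unweighted. The function name `degree_of_falsity` is hypothetical.

```python
def degree_of_falsity(scenarios):
    """Ratio of wrong actions to all actions.

    scenarios: list of (productive, mistakes) pairs, one per successful
    scenario found by the respondent.
    """
    wrong = sum(m for _, m in scenarios)
    total = sum(n + m for n, m in scenarios)
    if total == 0:
        raise ValueError("no actions recorded")
    return wrong / total


# Three scenarios: one without mistakes, two with one mistake each.
print(degree_of_falsity([(2, 0), (3, 1), (3, 1)]))
```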
In the example considered, the respondent found three successful scenarios: first along line A, then along line B, and finally along line D. Their weights are:
$$w_{1} = \frac{1}{2}, \quad w_{2} = \frac{1}{3}, \quad w_{3} = \frac{1}{6}.$$
The respondent spent 2 actions on the first scenario[1]. The second and third scenarios required 4 actions each, including one incorrect action. Thus,
$$S = 2 \cdot \frac{1}{2} + 4 \cdot \frac{1}{3} + 4 \cdot \frac{1}{6} = 3,$$
i.e. the respondent spent on average 3 actions per scenario[2]. The second numeric characteristic of the interface, the degree of falsity, is calculated as the ratio of the number of wrong actions to the total number of actions:
$$P = \frac{0 + 1 + 1}{2 + 4 + 4} = 0.2.$$
Statistical analysis
Thus, the game experiment gives the researcher a two-dimensional array of numeric data S_{i}, P_{i}, to which a whole range of well-known statistical methods is applicable: estimating distribution parameters, or comparing two samples with the Rosenbaum Q-test or the Mann-Whitney U-test [2].
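As an illustration of the kind of comparison meant here, the Mann-Whitney U statistic for two samples of S values can be computed by direct pair counting. This is a minimal sketch under stated assumptions: the sample values are invented for the example and are not experimental data, the function name `mann_whitney_u` is hypothetical, and the significance decision (comparing the smaller U with a tabulated critical value) is left out.

```python
def mann_whitney_u(xs, ys):
    """Return (U_x, U_y) by pair counting; a tie counts as one half."""
    u_x = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u_x += 1.0
            elif x == y:
                u_x += 0.5
    # The two statistics always sum to the number of pairs.
    u_y = len(xs) * len(ys) - u_x
    return u_x, u_y


# Hypothetical way lengths S for respondents on two interfaces.
interface_1 = [3.0, 2.5, 4.0]
interface_2 = [4.5, 5.0, 4.0]
u_1, u_2 = mann_whitney_u(interface_1, interface_2)
print(min(u_1, u_2))
```

Pair counting is quadratic in the sample sizes, which is acceptable for the small groups of respondents typical of such experiments. A rank-based implementation gives the same statistic.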
References:
1. Raskin, J. The Humane Interface: New Directions for Designing Interactive Systems. Addison-Wesley, 2000. 233 p.
2. Sidorenko, E.V. Mathematical Methods of Psychology. St. Petersburg, 2004. 350 p.
[1] A cycle of actions that does not require a choice is counted as one action.
[2] If we used the arithmetic mean, we would get a larger value, S = 3.33.