affective tone in subjects as a whole. These, in the order in which they appeared in my original list, were: Woman, Dance, Proud, Habit, Pray, Money, Despise, War, Child, Marry, Fight, Family, Name, Afraid, Love, Kiss, State, Happy, Wound, Divorce. I replaced these by the following 20 words which I judged less likely to arouse intense affective tone in the average subject: Window, Pay, Mountain, Justice, Hat, Paint, Wild, Month, Brown, Dog, Help, Apple, Waste, Fast, Purpose, Knife, House, Coal, Fire, Hotel. The list then ran as follows:

Each time that I tested a given subject I called out the words of the list in a different order. Thus I first gave them in the order shown above; next backwards; then in the order 1, 3, 5, 7, ... 99, 2, 4, 6, 8, ... 100; for the third test I used the order 100, 98, 96, ... 2, 99, 97, 95, ... 1; and similar systematic alterations of order were made for each test. There were several reasons for doing this. In the first place I wished to eliminate, as far as possible, any effects due to perseveration, and reversing the order of the words is calculated to do this to some extent. Secondly, I feared that if I always used the same order the subjects would soon begin to remember which word was coming next, and this would be apt to interfere with the success of the experiment. Thirdly, some subjects have a tendency to 'settle down' in the course of the experiment and to give smaller reactions towards the end than at the beginning, while others behave in the opposite way. By varying the order of the words such sources of error can be minimised.

In order to eliminate the danger of the results being unduly influenced by tests, the absolute magnitude of whose reactions might happen to be abnormally large or small, I adopted the same 'percentage method' which I used in my experiments on nonsense syllables.
That is to say, I expressed each reaction as a percentage of the arithmetic mean of the series to which it belonged; each series, therefore, was of equal weight in determining the final results no matter what the absolute magnitude of its mean reaction might be.

The tests on each subject were carried out at intervals of two or three days. In order to ascertain the average consistency of individual subjects (the extent, that is to say, to which an individual's reactions on one occasion resembled his reactions on another), I divided the tests for each subject into two equal groups, taking the first three, four or five tests, as the case might be, as one group and the last three, four or five as the other group. Thus for subject P 1 the first group consisted of tests 1, 2 and 3 and the other of tests 4, 5 and 6, while for subject P 5 one group consisted of tests 1, 2, 3, 4 and 5 and the other of tests 6, 7, 8, 9 and 10. In each such group I computed the mean percentage reaction for each word in the series. For example: I also calculated the mean percentage-reaction for all the tests, as shown above.

In order to ascertain what kind of effect is produced by using the means of two such groups of tests for each subject, instead of relying on a single pair of tests, and thus to gain some idea of how many tests it would be desirable to use in order to obtain reliable results in future work of this nature, I worked out the coefficients of correlation between the deflections given by the first test and the second test respectively in the case of each subject. The results were: If these values are compared with those obtained from the correlation of the means of the groups it will be seen that the effect of taking the mean of several tests as a basis of calculation is greatly to increase the correlation and to eliminate the discrepancies between individuals.
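The 'percentage method' and the division of a subject's tests into two groups can be illustrated by a short sketch in Python. It is offered only as an illustration under stated assumptions, not as the original calculation: the function names, the representation of a test as a mapping from word to deflection, and the use of the Pearson product-moment coefficient are all assumptions, and no figures from the experiment are reproduced.

```python
from statistics import mean


def to_percentages(deflections):
    """Express each reaction as a percentage of the mean of its own series."""
    series_mean = mean(deflections)
    return [100.0 * d / series_mean for d in deflections]


def word_means(tests):
    """Mean percentage reaction per word over several tests.

    Each test is a dict mapping word -> deflection (hypothetical layout);
    the order of presentation may differ from test to test.
    """
    per_test = []
    for test in tests:
        words = list(test)
        percentages = to_percentages([test[w] for w in words])
        per_test.append(dict(zip(words, percentages)))
    return {w: mean(p[w] for p in per_test) for w in per_test[0]}


def pearson(xs, ys):
    """Pearson product-moment correlation (assumed to be the coefficient meant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def split_half_correlation(tests):
    """Correlate the word means of the first half of the tests with the second half."""
    half = len(tests) // 2
    m1 = word_means(tests[:half])
    m2 = word_means(tests[half:])
    words = sorted(m1)
    return pearson([m1[w] for w in words], [m2[w] for w in words])
```

Because every series is rescaled by its own mean, a test in which all deflections happen to be ten times larger contributes exactly the same percentages as one in which they are small, which is the equal weighting described above.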
I next calculated the coefficient of correlation between the means of the two groups (M1 and M2) for each subject and obtained the following figures: The mean of these coefficients of correlation is + ·68. If they be weighted in proportion to the number of observations on which each is based the weighted mean is + ·685. This value is important; it is the mathematical expression of the extent to which an average subject agrees with himself, so to speak, over a period of the duration here involved (i.e. about 3-4 weeks).

The next step was to ascertain the extent to which subjects agree with each other. To ascertain this I worked out the coefficient of correlation between the mean percentage-reactions of all tests (Mg) for each subject with every other subject. The resulting figures were: The mean of these values is + ·08. If they be weighted in proportion to the product of the number of observations on which each series correlated is based the weighted mean is + ·09.

It will be noticed that with one exception (subject P 3 with P 4)1 the correlation between any two subjects is very markedly lower than that between the two groups of any single individual subject. This is what we should expect on general grounds; for, if we eliminate words of universal appeal from the list, the affective state evoked by any word in a given subject must be a product of that subject's personal experience: and the experience of every individual is unique. In accordance with the ordinary laws of probability we should expect to find certain proportions of abnormally high and low values in each class of correlation (i.e. 'individuals with themselves' and 'individuals with each other') but the majority of values in each should approximate to the mean. We thus find the very high value of + ·98, for subject P 1, and the very low value of + ·42, for subject P 6, in the first class; and the very high value of + ·46, for subjects P 3 and P 4, in the second.
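The two weighted means mentioned above follow a simple rule, which may be sketched as below. This is a minimal illustration under assumptions: the function names are hypothetical, the text does not say precisely what is counted as an 'observation', and the numbers in the usage lines are invented for the purpose and are not the experimental figures.

```python
def weighted_mean(coefficients, weights):
    """Mean of correlation coefficients, each weighted by its number of observations."""
    total_weight = sum(weights)
    return sum(r * w for r, w in zip(coefficients, weights)) / total_weight


def pair_weight(n_first, n_second):
    """Weight for a correlation between two subjects: the product of the
    numbers of observations behind each of the two series (as the text states)."""
    return n_first * n_second


# Invented illustration: two subjects with themselves, weighted by observations.
self_agreement = weighted_mean([0.5, 1.0], [1, 3])

# Invented illustration: one pair of subjects based on 3 and 5 observations.
weight_for_pair = pair_weight(3, 5)
```

A coefficient resting on more observations thus pulls the weighted mean further towards itself than one resting on few.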
If we had at our disposal a sufficiently large number of values to give us the frequency distributions of values in the two classes we should doubtless obtain two overlapping curves of the approximate form shown in Fig. 1. The one would have its maximum at approximately + ·7, the other at about + ·1. The precise position of the maximum would depend, inter alia, upon the number of words of universal appeal which the list contained. If there were none the maximum of the dotted curve would be exactly at 0 and it would, presumably, be symmetrical, while that of the full curve would be at about + ·6. Any increase in the number of universally exciting words would shift the maxima towards the right and, incidentally, bring them closer together; for if the list were composed exclusively of 'universal' words the element of individuality would ex hypothesi be eliminated and the curves would coincide with a maximum at + 1.0, becoming vertical straight lines in the process.

From such curves it would be possible to calculate the precise chance that a given coefficient of correlation between two series of reactions of unknown origin arose from correlating the reactions of the same individual or of two different individuals. For practical purposes, however, such refinements are unnecessary; we may say with considerable assurance that in general the correlation of individuals with themselves is about + ·60 to + ·70 while the correlation between different individuals is not likely to be greater than + ·2. The relevance of this conclusion to possible future investigations will be dealt with later.

1 This is almost wholly due to two words, 'sad' and 'waste,' which greatly excited both subjects: without these the figure would be about + ·09.

It is necessary to give, at this stage, a few observations as to the experimental conditions under which this work was done and the probable reliability of the results obtained.
I experienced a good deal of difficulty from cold weather which prevailed during part of the work and which was aggravated by the coal strike. I found that when subjects were cold and their skins dry and contracted they generally gave unsatisfactory reactions. Sometimes they refused to react at all and I was obliged to discontinue and to postpone several tests on this account. When they did react they generally gave very small deflections with a distinct tendency towards an 'all-or-none' type of reaction. That is to say they