# WARNING

THIS DOCUMENT IS IN DEVELOPMENT AND DESCRIBES FUTURE VERSIONS OF CARET

# Descriptive Statistics

Descriptive statistics summarize the data with measures such as the mean (average), median (middle value), mode (most common value), standard deviation, and variance. When computing the standard deviation, one must know whether the data values represent the entire population, in which case division is by N (the number of items), or a sample of the population, in which case division is by N - 1.

## Population Descriptive Statistics

• Population Mean $\mu = \frac{\sum_{i=1}^N x_i}{N}$
• Population Standard Deviation $\sigma = \sqrt{\frac{\sum_{i=1}^N (x_i - \mu)^2}{N}}$ OR $\sigma = \sqrt{\frac{\sum_{i=1}^N x_i^2 - \frac{(\sum_{i=1}^N x_i)^2}{N}}{N}}$
• Population Variance = $\sigma^2$
• Standard Deviation of the Mean $SD_{\overline{x}} = \frac{\sigma}{\sqrt{N}}$

## Sample Descriptive Statistics

• Sample Mean $M = \frac{\sum_{i=1}^N x_i}{N}$
• Sample Standard Deviation $S = \sqrt{\frac{\sum_{i=1}^N (x_i - M)^2}{N-1}}$ OR $S = \sqrt{\frac{\sum_{i=1}^N x_i^2 - \frac{(\sum_{i=1}^N x_i)^2}{N}}{N-1}}$
• Sample Variance = $S^2$
• Standard Error of the Mean $SE_{\overline{x}} = \frac{S}{\sqrt{N}}$
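
As a minimal sketch of the difference between the population and sample formulas, the following Python function computes the mean, variance, standard deviation, and standard deviation (or error) of the mean for a plain list of values. The function name `descriptive_stats` and its `population` flag are illustrative and are not part of Caret.

```python
import math

def descriptive_stats(values, population=False):
    """Return mean, variance, SD, and SD (or SE) of the mean for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    # Divide by N for a whole population, by N - 1 for a sample.
    variance = sum_sq_dev / (n if population else n - 1)
    sd = math.sqrt(variance)
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "sd_of_mean": sd / math.sqrt(n),  # SD of the mean (population) or SEM (sample)
    }

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
population_stats = descriptive_stats(data, population=True)
sample_stats = descriptive_stats(data)
```

For this data the sum of squared deviations is 32, so the population variance is 32 / 8 while the sample variance is 32 / 7.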

## Miscellaneous Descriptive Statistics

• Z-Score $Z = \frac{x_i - \mu}{\sigma}$
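
A small sketch of the Z-score formula, using the population mean and standard deviation of the values themselves. The function name `z_scores` is illustrative only.

```python
import math

def z_scores(values):
    """Convert each value to a Z-score using the population mean and SD."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    # A sigma of zero (all values identical) would raise ZeroDivisionError here.
    return [(x - mu) / sigma for x in values]

scores = z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```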

# Inferential Statistic Tests

The purpose of an inferential statistical test is to take the input files, perform the test at each node, and create a new file containing one or more statistical measurements (F, T, Z, etc.) at each node.

## Performing Inferential Statistical Tests in Caret

Inferential statistical tests in Caret are performed on metric or surface shape files. All of the data (metric or shape files) must be on a co-registered surface so that all data files have the same number of nodes and each node number i is "in register" across subjects (i.e., all subjects' surfaces have undergone surface-based registration using Caret, Freesurfer, CIVET, or other software).

The goal is to find clusters (regions) that are statistically different between the groups of input data. That is, within such a cluster one can reject the null hypothesis, which states that the metric/shape values at each node are essentially the same across the groups.

The steps in Caret are:

1. Run the input files through an inferential statistical test.
2. Perform a randomization process to identify the p-value for a cluster based upon the surface area of the cluster.
3. Assign P-Values to the output of the original inferential statistical test.

## Parametric Inferential Tests

For parametric tests, the data are assumed to follow a specific probability distribution, typically the normal (Gaussian) distribution.

### ANOVA (Analysis of Variance), One Way

A one-way ANOVA determines if the mean values at each node for two or more groups of subjects are statistically different.

K = Number of Groups

N = Total number of observations

$N_i$ = Number of items in group i.

$S_i^2$ = Variance for group i.

Mean of group i, $M_i = \frac{\sum_{j=1}^{N_i} x_j} {N_i}$

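Grand Mean, $GM = \frac{\sum_{i=1}^K M_i} {K}$

Variance of distribution of means, $S_M^2 = \frac{\sum_{i=1}^K (M_i - GM)^2} {K - 1}$

Mean Square between groups, $MS_{Between} = (S_M^2)(n)$, where $n = N / K$ is the number of items in each group (these formulas assume that all groups are the same size)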

Mean Square within groups, $MS_{Within} = \frac{\sum_{i=1}^K S_i^2} {K}$

$F = \frac{MS_{Between}} {MS_{Within}}$

$df_{Between}$ (numerator) $= K - 1$

$df_{Within}$ (denominator) $= N - K$
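
The following sketch computes the one-way ANOVA F statistic for a single node from the group means and group variances. It assumes that all groups are the same size, so that the between-groups mean square is the variance of the group means multiplied by the group size and the within-groups mean square is the average of the group variances. The function name `one_way_anova` is illustrative.

```python
from statistics import mean, variance

def one_way_anova(groups):
    """Return (F, df_between, df_within) for equal-sized groups of values."""
    k = len(groups)
    n = len(groups[0])  # items per group
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    group_means = [mean(g) for g in groups]
    # statistics.variance divides by (count - 1), i.e. K - 1 for the means.
    ms_between = variance(group_means) * n
    ms_within = mean([variance(g) for g in groups])
    f_value = ms_between / ms_within
    return f_value, k - 1, k * n - k

f_value, df_between, df_within = one_way_anova(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
)
```

In this example the group means are 2, 3, and 4, each group variance is 1, and the total number of observations is 9.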

### T-Test, One-Sample

A one-sample T-Test determines if the mean value at each node is statistically different from a specified value $\mu$, often zero. Here M is the sample mean, $s^2$ is the sample variance, and N is the number of subjects.

t = $\frac{\mathrm{M} - \mu}{\sqrt{\frac{\mathrm{s}^2}{N}}}$
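
A sketch of the one-sample T-Test applied at each node. It assumes the data are held as one list of node values per subject, which is only an illustrative layout and not a Caret data structure.

```python
import math
from statistics import mean, variance

def one_sample_t_per_node(subjects, mu=0.0):
    """subjects[s][i] is the value for subject s at node i; returns one t per node."""
    n = len(subjects)
    t_values = []
    for node_values in zip(*subjects):  # all subjects' values at one node
        # A node with zero variance would raise ZeroDivisionError here.
        standard_error = math.sqrt(variance(node_values) / n)
        t_values.append((mean(node_values) - mu) / standard_error)
    return t_values

subjects = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0], [5.0, 18.0]]
t_values = one_sample_t_per_node(subjects, mu=0.0)
```

Each t value has N - 1 degrees of freedom.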

### T-Test, Paired (Dependent Means)

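A paired T-Test determines if the mean at each node is statistically different for two measurements (X and Y) on one group of subjects. The test is a one-sample T-Test on the differences: $s^2$ is the sample variance of the differences $x_i - y_i$ and $\mu$ is the hypothesized mean difference, usually zero.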

$\overline{D} = \frac{\sum_{i=1}^N (x_i - y_i)}{N}$

t = $\frac{\overline{D} - \mu}{\sqrt{\frac{\mathrm{s}^2}{N}}}$
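
A sketch of the paired T-Test for the values at a single node, treating it as a one-sample test on the per-subject differences. The name `paired_t` and the default hypothesized difference of zero are assumptions made for illustration.

```python
import math
from statistics import mean, variance

def paired_t(x, y, mu=0.0):
    """Return (t, df) for paired measurements x[i], y[i] on the same subjects."""
    if len(x) != len(y):
        raise ValueError("paired measurements must have the same length")
    differences = [xi - yi for xi, yi in zip(x, y)]
    n = len(differences)
    t = (mean(differences) - mu) / math.sqrt(variance(differences) / n)
    return t, n - 1

t_paired, df_paired = paired_t([5.0, 7.0, 9.0, 6.0], [3.0, 4.0, 8.0, 5.0])
```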

### T-Test, Two-Sample (Independent Means)

A two-sample T-Test determines if the means at each node for two groups of subjects are statistically different.

#### Equal (Pooled) Variances

$S^2 = \frac{ \sum_{i=1}^{N_1} (x_i - \overline{x}_1)^2 + \sum_{j=1}^{N_2} (x_j - \overline{x}_2)^2} {N_1 + N_2 - 2}$

$t = \frac{\overline{x}_1 - \overline{x}_2} { \sqrt{S^2(\frac{1}{N_1} + \frac{1}{N_2})} }$

$df = N_1 + N_2 - 2$
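
A sketch of the pooled-variance T-Test for the values of two groups at a single node. The function name `pooled_t` is illustrative.

```python
import math
from statistics import mean

def pooled_t(group1, group2):
    """Return (t, df) for two independent groups assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean(group1), mean(group2)
    ss1 = sum((x - m1) ** 2 for x in group1)  # sum of squared deviations, group 1
    ss2 = sum((x - m2) ** 2 for x in group2)  # sum of squared deviations, group 2
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    t = (m1 - m2) / math.sqrt(pooled_variance * (1.0 / n1 + 1.0 / n2))
    return t, df

t_pooled, df_pooled = pooled_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
```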

#### Unequal (Unpooled) Variances

$S_1^2 = \frac{\sum_{i=1}^{N_1} (x_i - \overline{x}_1)^2} {N_1 - 1}$

$S_2^2 = \frac{\sum_{j=1}^{N_2} (x_j - \overline{x}_2)^2} {N_2 - 1}$

$t = \frac{\overline{x}_1 - \overline{x}_2} {\sqrt{\frac{S_1^2}{N_1} + \frac{S_2^2}{N_2} }}$

$d\mathit{f} = \frac{(\frac{S_1^2}{N_1} + \frac{S_2^2}{N_2})^2} {\frac{(\frac{S_1^2}{N_1})^2}{N_1 - 1} + \frac{(\frac{S_2^2}{N_2})^2}{N_2 - 1} }$
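
A sketch of the unpooled-variance test, including the fractional degrees of freedom from the formula above. The function name `unpooled_t` is illustrative.

```python
import math
from statistics import mean, variance

def unpooled_t(group1, group2):
    """Return (t, df) for two independent groups without assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    a = variance(group1) / n1  # S1^2 / N1
    b = variance(group2) / n2  # S2^2 / N2
    t = (mean(group1) - mean(group2)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

t_unpooled, df_unpooled = unpooled_t(
    [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0]
)
```

With equal group sizes the t value matches the pooled test, but the degrees of freedom are smaller because the two group variances differ.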

## Non-Parametric (Distribution Free) Inferential Statistic Tests

For non-parametric tests, no assumptions are made about the distribution of the data.

# P-Value Determination

In Caret, P-Values are determined using cluster-based techniques.

## Supra-Thresholding

With cluster-based thresholding, the user must select a threshold, or cut-off value. Clusters are formed from connected groups of nodes whose metric values are greater than or equal to the threshold. There is no one "correct" threshold value. Lower thresholds result in larger clusters and higher thresholds result in smaller clusters.
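
A sketch of forming clusters for a positive threshold. It assumes the surface topology is available as a list of neighbor node numbers for each node. The representation and the name `find_clusters` are illustrative, not Caret's own.

```python
from collections import deque

def find_clusters(values, neighbors, threshold):
    """Return lists of connected node numbers whose values are >= threshold."""
    above = [v >= threshold for v in values]
    seen = [False] * len(values)
    clusters = []
    for start in range(len(values)):
        if not above[start] or seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        cluster = []
        while queue:  # breadth-first flood over supra-threshold neighbors
            node = queue.popleft()
            cluster.append(node)
            for nb in neighbors[node]:
                if above[nb] and not seen[nb]:
                    seen[nb] = True
                    queue.append(nb)
        clusters.append(sorted(cluster))
    return clusters

# Six nodes connected in a chain: 0-1-2-3-4-5.
chain = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
stat = [3.0, 2.5, 0.5, 2.2, 0.1, 4.0]
high = find_clusters(stat, chain, threshold=2.0)
low = find_clusters(stat, chain, threshold=0.5)
```

With the threshold at 2.0 the chain splits into three small clusters. Lowering it to 0.5 merges nodes 0 through 3 into one larger cluster.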

## Threshold-Free Cluster Enhancement (TFCE)

With threshold-free cluster enhancement, the user does not need to select a threshold, or cut-off value.

Consider a spherical representation of the cortex with radius R. For each node adjust its radius by adding the node's raw statistical value. For each node with a positive raw statistical value, modify the radius of the surrounding nodes, so, no matter what path is followed to a node with a statistical value less than or equal to zero, the radius of the node decreases. What remains is the "supporting section". Essentially, "jagged terrain" becomes a roughly conical shape.

Positive and negative metrics are handled separately. When determining positive TFCE, all metrics less than zero are set to zero. When determining negative TFCE, all metrics greater than zero are set to zero, the negatives are then multiplied by -1, and the positive TFCE algorithm is used.

The following algorithm uses a breadth-first search.

• For each triangle in the fiducial surface, calculate its surface area. Scale the triangle's area by the average of the areal distortion at the triangle's three nodes.
• Copy the original metric column with statistical values at each node.
• If negative searching, multiply all metric values by -1.
• Set all nodes with metric less than zero to zero.
• For all nodes (each node in turn is the start node):
  • Copy metric column.
  • Mark all nodes with positive value as unvisited and all nodes with zero values as visited.
  • Add start node number to queue.
  • Loop until queue is empty
    • Get current node from queue.
    • If node unvisited
      • Mark node visited.
      • If the node's metric value is greater than zero
        • Get node's neighbors sorted by smallest metric value. All new neighbors must be BELOW any previous neighbors.
        • For all neighbor nodes
          • If the neighbor is unvisited
            • Set neighbor's metric value to minimum of its value and the current node's metric value.
            • Add the neighbor to the queue.
  • For all triangles with visited nodes, multiply the triangle's surface area by the triangle's height. We will estimate the height by using the average of the original statistical values at each of the triangle's three nodes. Add these TFCE scores to the start node's TFCE value.
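
The steps above, read literally for a single start node, can be sketched as follows. This is not a reference TFCE implementation and several points are assumptions: the triangle areas are taken as already computed and distortion-scaled, "triangles with visited nodes" is read as triangles whose three nodes are all visited, and the "original statistical values" are taken to be the values after negatives are set to zero. The name `tfce_score_for_start` and the list-based mesh representation are illustrative.

```python
from collections import deque

def tfce_score_for_start(start, values, neighbors, triangles, triangle_areas):
    """Score for one start node, following the draft breadth-first steps."""
    original = [max(v, 0.0) for v in values]  # negatives set to zero
    work = list(original)                     # per-start copy of the metric
    visited = [v == 0.0 for v in work]        # zero-valued nodes start as visited
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if visited[node]:
            continue
        visited[node] = True
        if work[node] > 0.0:
            # Neighbors are processed from smallest to largest metric value.
            for nb in sorted(neighbors[node], key=lambda n: work[n]):
                if not visited[nb]:
                    work[nb] = min(work[nb], work[node])
                    queue.append(nb)
    score = 0.0
    for (a, b, c), area in zip(triangles, triangle_areas):
        if visited[a] and visited[b] and visited[c]:  # assumed reading
            height = (original[a] + original[b] + original[c]) / 3.0
            score += area * height
    return score

# Two triangles sharing the edge between nodes 1 and 2.
mesh_neighbors = [[1, 2], [0, 2, 3], [0, 1, 3], [1, 2]]
mesh_triangles = [(0, 1, 2), (1, 2, 3)]
mesh_areas = [1.0, 2.0]
mesh_values = [2.0, 1.0, 1.0, 0.0]
score_from_peak = tfce_score_for_start(0, mesh_values, mesh_neighbors, mesh_triangles, mesh_areas)
score_from_zero = tfce_score_for_start(3, mesh_values, mesh_neighbors, mesh_triangles, mesh_areas)
```

Calling the function once per node gives the full TFCE column. Each call is independent of the others, so the outer loop over start nodes is a natural place to use multiple threads.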

My concern is that performing the above algorithm on many nodes may be slow. When considering the randomization process, the time required may be significant. For example, a user on the "neuro-mult-comp" list claims FSL's randomise requires more than three days for 5000 iterations using TFCE. An algorithm that runs randomization in parallel will help.

Visualize as a flat surface where Z = metric value

## Randomization

Randomization testing is used to determine the P-Values. In this process, the subjects are pooled and then randomly split into groups. The statistical test is performed on this new grouping and the largest cluster is identified. This process is iterated many times and the largest cluster from each grouping is saved. At the end, the largest clusters are sorted by decreasing surface area. For a p-value of 0.05, the surface area at the 95th percentile of these largest clusters (the area exceeded by only 5% of them) is the surface area a cluster needs to reach 0.05 significance.

The original statistical metric column's clusters are assigned p-values based upon their ranking, by surface area, in the sorted, randomly-created clusters.
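
One way to read the ranking step is that a cluster's p-value is the fraction of randomized largest-cluster areas that are at least as large as the cluster's own area. The sketch below follows that reading. The function name `cluster_p_values` and the use of "greater than or equal" for ties are assumptions.

```python
def cluster_p_values(observed_areas, randomized_max_areas):
    """P-value per observed cluster from the randomized largest-cluster areas."""
    n = len(randomized_max_areas)
    p_values = []
    for area in observed_areas:
        # Count the randomizations whose largest cluster is at least this big.
        at_least_as_large = sum(1 for r in randomized_max_areas if r >= area)
        p_values.append(at_least_as_large / n)
    return p_values

randomized = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
p_values = cluster_p_values([95.0, 45.0, 5.0], randomized)
```

With only ten randomizations the smallest p-value that can be reported is 0.1, which is why the number of iterations limits the attainable significance.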

### Randomization With One Group of Subjects

When there is one group of subjects, such as in a one-sample T-Test, it is not possible to randomize among groups. So, the randomization is performed by randomly flipping the signs of the values for each subject. The statistical test is then run on each of these randomizations and the largest clusters are identified.
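
A sketch of one sign-flipping randomization. It assumes a single random sign is drawn per subject and applied to all of that subject's node values, and that the data are held as one list of node values per subject. Both are illustrative choices.

```python
import random

def sign_flip_subjects(subjects, rng):
    """Return a copy of the data with each subject's values randomly negated."""
    flipped = []
    for node_values in subjects:
        sign = rng.choice((-1.0, 1.0))  # one sign for the whole subject
        flipped.append([sign * v for v in node_values])
    return flipped

rng = random.Random(12345)  # fixed seed so the randomization is repeatable
original_data = [[1.0, -2.0], [3.0, 4.0], [-5.0, 6.0]]
randomized_data = sign_flip_subjects(original_data, rng)
```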

### Randomization With Multiple Groups of Subjects

With multiple groups of subjects, all of the subjects are placed into a pool. Subjects are then randomly drawn from the pool and placed into new groups. The statistical tests are then run on each of these randomizations and the largest clusters are identified.

When creating the randomized groups, random draws may produce a grouping that was already created. Keeping track of the groupings already used prevents such repeats. The order of subjects within a group does not matter: {A, B} and {B, A} are two different permutations (orderings) of the same combination (set), so they count as the same group.
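
A sketch of drawing randomized groupings without repeats. It assumes subjects are identified by hashable labels and that a grouping is defined by which subjects land in which group, ignoring order within a group. The function name `random_groupings` and the `max_attempts` guard are illustrative.

```python
import random

def random_groupings(subjects, group_sizes, iterations, rng):
    """Return `iterations` distinct random splits of subjects into groups."""
    seen = set()
    groupings = []
    max_attempts = 1000 * iterations  # guard against asking for too many groupings
    for _ in range(max_attempts):
        if len(groupings) == iterations:
            break
        pool = list(subjects)
        rng.shuffle(pool)
        groups, start = [], 0
        for size in group_sizes:
            # frozenset ignores order, so {A, B} and {B, A} compare equal.
            groups.append(frozenset(pool[start:start + size]))
            start += size
        key = tuple(groups)
        if key not in seen:
            seen.add(key)
            groupings.append([sorted(g) for g in groups])
    if len(groupings) < iterations:
        raise ValueError("could not find enough distinct groupings")
    return groupings

splits = random_groupings(["A", "B", "C", "D"], (2, 2), 6, random.Random(0))
```

Four subjects split into two labeled groups of two allow only six distinct groupings, so asking for a seventh would raise the error.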

gifti_statistics is a command line program that performs statistical operations on GIFTI surface data files.

## Descriptive Statistical Operations

Descriptive statistic operations are performed on the values associated with each node. The parameters may be specified in any order.

-data-file <data-file-name.gii> \
-output-file <output-file-name.gii> \
<descriptive-statistics>

There may be more than one input file. Specify each input data file with the -data-file option.

Population Descriptive Statistics

• -pop-mean <new-population-mean-data-array-name>
• -pop-sd <new-population-standard-deviation-data-array-name>
• -pop-var <new-population-variance-data-array-name>
• -pop-sdm <new-population-standard-deviation-of-the-mean>

Sample Descriptive Statistics

• -sample-mean <new-sample-mean-data-array-name>
• -sample-sd <new-sample-standard-deviation-data-array-name>
• -sample-var <new-sample-variance-data-array-name>
• -sample-sem <new-sample-standard-error-of-the-mean-data-array-name>

## Inferential Statistical Operations

-data-file-group <group-name> <data-file-name.gii> \
-output-file <output-file-containing-statistics.gii> \
<inferential-test> \
<surface-information> \
[P-Value-Determination] \
[Variance-Smoothing]

inferential-test is one of:

• -inferential-t-test-one-sample <mean-value>
• -inferential-t-test-two-sample-pooled-variance
• -inferential-t-test-two-sample-unpooled-variance
• -inferential-t-test-paired
• -inferential-anova-one-way

### P-Value Determination

P-Values may be determined using either cluster-based thresholding or threshold-free cluster enhancement.

#### P-Value Methods

• -cluster cluster-negative-threshold cluster-positive-threshold
• -tfce <positive | negative | both>

#### P-Value Options

• -iterations specifies the number of iterations. The default is 50.
• -permute During the randomization process, this option ensures that each random grouping is different from any previous grouping.
• -random-seed <value-large-positive-integer> This is the seed that initializes the random number generation sequence.

#### P-Value Output

• -output-dof new-data-array-name
• -output-pvalue new-data-array-name
• -output-one-minus-p-value data-array-name
• -output-randomization filename.gii* This new file will contain the randomized data.

The P-Value, 1-P-Value, and the DOF are added to the output file.

### Variance Smoothing

• -variance-smoothing iterations strength

Iterations must be greater than zero and strength ranges from 0.0 to 1.0.

### Surface Information

The user may specify either a gifti surface file or a coordinate file and a topology file.

• -surface filename.gii.surf
• -coordinate filename.gii.coord -topology filename.gii.topo

If the surface has been distorted in some way, a gifti data array may contain data that corrects for the distortion.

• -surface-distortion functional_distortion.gii* data-array-name-or-number

If the computer being used has either multiple processors or multi-core processors, running multiple threads will likely reduce the execution time of some operations.

The number of threads must be one or larger.

• Translate the statistics code from C++ to Java. I have found Java development much easier than C++ primarily due to far better error checking earlier in the process and a much smarter development environment. Most of the existing statistics algorithms will convert from C++ to Java fairly quickly.
• Development of TFCE, in particular, will be faster if done in Java.
• Multithreading will be easier to use.
• Input files must be in GIFTI format. However, if REALLY needed, Caret formats can be read. caret_command can convert files between Caret and GIFTI formats.
• Use NIFTI intent codes for newly created metric columns
• Rewriting the statistical operations will simplify the addition of new statistical methods.

# References

## Journal Articles

• Nonparametric Permutation Tests for Functional Neuroimaging: A Primer with Examples. Thomas E. Nichols and Andrew P. Holmes. Human Brain Mapping 2002 15(1)
• Threshold-Free Cluster Enhancement: Addressing problems of smoothing, threshold dependence and localisation in cluster inference. Stephen M. Smith and Thomas E. Nichols. NeuroImage 2009 44(1)