TY - GEN
T1 - Sparse clustering with resampling for subject classification in PET amyloid imaging studies
AU - Bi, Wenzhu
AU - Tseng, George C.
AU - Weissfeld, Lisa A.
AU - Price, Julie C.
PY - 2011
Y1 - 2011
N2 - Sparse k-means clustering (Sparse-kM) can exclude uninformative variables and yield reliable, parsimonious clustering results, especially when p≫n. In this work, Sparse-kM and data resampling were combined to identify the variables of greatest interest and to define confidence levels for the clustering. The method was evaluated by statistical simulation and applied to PiB PET amyloid imaging data to identify normal control (NC) subjects with (+) or without (-) evidence of amyloid, i.e., PiB(+/-). Simulations. A dataset of n=60 observations (3 groups of 20) and p=500 variables was generated for each simulation run; only 50 variables truly differed across groups. The dataset was resampled 20 times, Sparse-kM was applied to each sample, and average variable weights were calculated. Probabilities of cluster membership, also called confidence levels, were computed for each of the n=60 observations. Simulations were performed 250 times. The 50 truly different variables were identified by variable weights that were 13-32 times greater than those of the 450 uninformative variables. Human Data. For the PiB PET dataset, images (ECAT HR+, 10-15 mCi, 90 min) were acquired for 64 cognitively normal subjects (74.1±5.4 yrs). Parametric PiB distribution volume ratio images were generated (Logan method, cerebellum reference) and normalized to the MNI template (SPM8) to produce a dataset of n=64 subjects and p=343,099 voxels/image. The dataset was resampled 10 times and Sparse-kM was applied. The resulting average voxel weight image indicated cortical areas of greatest interest, including the precuneus and frontal cortex, which are key areas linked to early amyloid deposition. Seven of the 64 subjects were identified as PiB(+) and 47 as PiB(-) with confidence = 90%; one additional subject was PiB(+) at lower confidence (80%), and the remaining 9 subjects were PiB(-) at confidence levels of 50-70%. In conclusion, Sparse-kM with resampling can help to establish confidence levels for clustering when p≫n and may be a promising method for revealing informative voxels/spatial patterns that distinguish levels of amyloid load, including load at the transitional amyloid +/- boundary.
UR - http://www.scopus.com/inward/record.url?scp=84858651751&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84858651751&partnerID=8YFLogxK
U2 - 10.1109/NSSMIC.2011.6152564
DO - 10.1109/NSSMIC.2011.6152564
M3 - Conference contribution
AN - SCOPUS:84858651751
SN - 9781467301183
T3 - IEEE Nuclear Science Symposium Conference Record
SP - 3108
EP - 3111
BT - 2011 IEEE Nuclear Science Symposium and Medical Imaging Conference, NSS/MIC 2011
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2011 IEEE Nuclear Science Symposium and Medical Imaging Conference, NSS/MIC 2011
Y2 - 23 October 2011 through 29 October 2011
ER -