HegemanLab/w4mclassfilter_galaxy_wrapper: W4M Data Subset tool for Galaxy

Main Author: Arthur Eschenlauer
Format: info software eJournal
Published: 2019
Online Access: https://zenodo.org/record/3468566
Table of Contents:
  • Description
    The W4M Data Subset tool selects subsets of samples, features, or data values for further analysis. The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4M), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1093/bioinformatics/btu813]. The tool produces as output the same trio of datasets, modified as described below.

    This tool performs several operations to address data issues that may impede downstream statistical analysis:
      • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
      • Features that are missing from either variableMetadata or dataMatrix are eliminated.
      • Features and samples that have zero variance are eliminated.
      • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
      • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
      • Negative intensities are replaced by zeros.
      • If desired, the values in the dataMatrix may be log-transformed.
      • If desired, each missing value in dataMatrix is replaced with zero or the median value observed for the corresponding feature.

    This tool can also perform several operations to reduce the number of samples or features to be analyzed:
      • Samples may be eliminated by filtering on a designated "sample class" column in sampleMetadata.
      • Features may be eliminated by specifying the minimum or maximum value (or both) allowable in columns of variableMetadata.
      • Features may be eliminated by specifying the minimum or maximum intensity (or both) that at least one sample must show for each feature in dataMatrix ("range of row-maximum for each feature").
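As a rough illustration of the operations listed above, the following Python sketch applies a few of them to small in-memory dictionaries. It is not the tool's own implementation. The function names (`clean`, `filter_samples_by_class`, `filter_features_by_row_max`), the nested-dictionary layout, the `include` flag, and the inclusive bounds are all assumptions made for this example. The sketch also assumes a rectangular matrix with no missing values and checks variance in a single pass.

```python
from statistics import pvariance


def clean(data_matrix, sample_metadata, variable_metadata):
    """Drop unmatched or zero-variance samples/features; clamp negatives to zero.

    data_matrix: {feature: {sample: intensity}} (hypothetical layout).
    sample_metadata / variable_metadata: {name: {column: value}}.
    """
    # Keep only names present in both the matrix and its metadata, sorted.
    features = sorted(set(data_matrix) & set(variable_metadata))
    matrix_samples = set().union(*(row.keys() for row in data_matrix.values()))
    samples = sorted(matrix_samples & set(sample_metadata))

    # Replace negative intensities with zero.
    matrix = {f: {s: max(0.0, data_matrix[f][s]) for s in samples} for f in features}

    # Drop zero-variance features, then zero-variance samples (one pass each).
    features = [f for f in features
                if samples and pvariance(list(matrix[f].values())) > 0]
    samples = [s for s in samples
               if features and pvariance([matrix[f][s] for f in features]) > 0]

    return (
        {f: {s: matrix[f][s] for s in samples} for f in features},
        {s: sample_metadata[s] for s in samples},
        {f: variable_metadata[f] for f in features},
    )


def filter_samples_by_class(sample_metadata, class_column, classes, include=False):
    """Keep (include=True) or drop (include=False) samples whose class is listed."""
    return {s: row for s, row in sample_metadata.items()
            if (row[class_column] in classes) == include}


def filter_features_by_row_max(data_matrix, minimum=None, maximum=None):
    """Keep features whose highest intensity lies within [minimum, maximum]."""
    kept = {}
    for feature, row in data_matrix.items():
        top = max(row.values())
        if (minimum is None or top >= minimum) and (maximum is None or top <= maximum):
            kept[feature] = row
    return kept


# Made-up example data: M1 has zero variance, M5 and s3 lack metadata.
data_matrix = {
    "M2": {"s2": 4.0, "s1": -1.0, "s3": 7.0},
    "M1": {"s2": 5.0, "s1": 5.0, "s3": 5.0},
    "M3": {"s2": 1.0, "s1": 2.0, "s3": 3.0},
    "M5": {"s2": 9.0, "s1": 8.0, "s3": 7.0},
}
sample_metadata = {"s1": {"class": "case"}, "s2": {"class": "blank"}}
variable_metadata = {"M1": {"rt": 10.0}, "M2": {"rt": 20.0},
                     "M3": {"rt": 30.0}, "M4": {"rt": 40.0}}

dm, smd, vmd = clean(data_matrix, sample_metadata, variable_metadata)
```

Because `clean` drops any sample that is absent from the sample metadata, passing it the output of `filter_samples_by_class` is one way to mimic applying the tool again to a smaller subset.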
    The W4M Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

    Changes in version 0.98.13 (note that version number 0.98.12 was skipped)
    This version in the Galaxy toolshed: https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/38f509903a0b
    New features:
      • Support enhancement https://github.com/HegemanLab/w4mclassfilter/issues/4, "add and test no-imputation and centering-imputation functions":
          • Support no imputation.
          • Support imputing missing feature intensities as the median intensity for the corresponding feature.
    Internal modifications:
      • Use v0.98.13 of the w4mclassfilter bioconda package.
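The imputation choices named in the change notes (no imputation, zero, or per-feature median) can be sketched in Python as follows. This is an illustration only: the function name `impute`, the method labels `"none"`, `"zero"`, and `"median"`, the use of `None` for a missing value, and the fallback to zero when a feature has no observed values are all assumptions, not the package's actual interface.

```python
from statistics import median


def impute(data_matrix, method="zero"):
    """Fill missing (None) intensities in {feature: {sample: value}}.

    method is a hypothetical label: "none", "zero", or "median".
    """
    if method not in {"none", "zero", "median"}:
        raise ValueError(f"unknown imputation method: {method}")

    result = {}
    for feature, row in data_matrix.items():
        if method == "none":
            # Leave missing values untouched.
            result[feature] = dict(row)
            continue
        observed = [v for v in row.values() if v is not None]
        # Median of the observed values for this feature; assumed fallback
        # of zero when nothing was observed.
        fill = median(observed) if (method == "median" and observed) else 0.0
        result[feature] = {s: (fill if v is None else v) for s, v in row.items()}
    return result


# Made-up example: sample s2 has no measured intensity for feature M1.
example = {"M1": {"s1": 1.0, "s2": None, "s3": 5.0}}
filled = impute(example, method="median")
```

Median imputation keeps an imputed value within the range actually observed for that feature, whereas zero imputation treats a missing value as an absent signal; which is appropriate depends on why the values are missing.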