State of health (SOH) is a key parameter for assessing whether a lithium-ion battery is suitable for secondary-use applications. Machine-learning-based SOH estimation has attracted great attention in recent years and holds potential for battery informatization and cloud-based battery management. This paper presents a comprehensive study of data-driven SOH estimation methods. A new classification of health indicators (HIs) is proposed, in which HIs are divided into measured variables and calculated variables. To illustrate the significance of data preprocessing, four noise reduction methods are assessed in the HI extraction process, and three feature selection methods (filter-based, wrapper-based, and fusion-based) are applied to select HI subsets. Four widely used machine learning algorithms, namely artificial neural network, support vector machine, relevance vector machine, and Gaussian process regression, are applied and compared. To evaluate estimation performance in potential real-world use in the coming big-data era, the three HI selection methods and four machine learning methods are evaluated using three public data sets and two estimation strategies. The results show that the combination of the fusion-based selection method and Gaussian process regression achieves the best overall estimation performance in terms of both accuracy and computational efficiency.
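
As a rough illustration of what a filter-based HI selection step can look like, the sketch below ranks candidate HIs by the absolute Pearson correlation between each HI and SOH, and keeps those above a threshold. This is a minimal sketch under stated assumptions, not the procedure used in this study: the choice of Pearson correlation as the filter criterion, the threshold value, the HI names (`cc_charge_time`, `ambient_temp`, `constant`), and the toy numbers are all hypothetical and serve only to make the example runnable.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # A constant series carries no information about SOH.
        return 0.0
    return cov / (norm_x * norm_y)


def filter_select(his, soh, threshold=0.9):
    """Keep HIs whose |correlation| with SOH is at least `threshold`.

    `his` maps an HI name to its per-cycle values; `threshold` is an
    assumed, illustrative cutoff. Returns the selected names (strongest
    first) and the score of every candidate.
    """
    scores = {name: abs(pearson(values, soh)) for name, values in his.items()}
    selected = sorted(
        (name for name, score in scores.items() if score >= threshold),
        key=lambda name: -scores[name],
    )
    return selected, scores


# Toy data (hypothetical, for illustration only): SOH over five cycles
# and three candidate HIs sampled at the same cycles.
soh = [1.00, 0.97, 0.94, 0.90, 0.85]
his = {
    "cc_charge_time": [3600, 3500, 3390, 3250, 3070],  # shrinks as SOH drops
    "ambient_temp": [25.0, 24.0, 26.0, 25.0, 25.0],    # unrelated to SOH
    "constant": [1, 1, 1, 1, 1],                       # no variation at all
}

selected, scores = filter_select(his, soh)
print(selected)
```

A filter of this kind scores each HI independently of the downstream estimator, which makes it cheap to compute; a wrapper-based method would instead score HI subsets by training the estimator on them, at a higher computational cost.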