Gradient Boosted Decision Trees
Overview
Like last year, the most commonly used algorithms were linear and logistic regression, followed closely by decision trees and random forests. Among more complex methods, gradient boosting machines and convolutional neural networks were the most popular approaches.
We also saw strong year-over-year growth in the use of large language models such as transformer networks (BERT, GPT-3, etc.).
Python-based tools continue to dominate among machine learning frameworks.
Like last year, Scikit-learn, a Swiss Army knife applicable to most projects, is the top tool, with over 80% of data scientists using it. TensorFlow and Keras, notably used in combination for deep learning, were each selected in about half of the data scientists' survey responses. The gradient boosting library XGBoost is fourth, with about the same usage as in 2020 and 2019.
The most popular of the new tools added to the survey this year is Hugging Face, reaching over 10%.
Although PyTorch is used less frequently overall, we continue to see strong year-over-year growth in its use.
Scikit-learn is the most popular ML framework while PyTorch has been growing steadily year-over-year
Algorithms
XGBoost
LightGBM
CatBoost
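All three libraries listed above are built around gradient boosting over decision trees. As a rough illustration of that shared idea only, and not of how any of these libraries is actually implemented, the sketch below boosts depth-1 regression trees (stumps) on a single numeric feature with squared-error loss. The function names (`fit_stump`, `fit_gbdt`, `predict_gbdt`), the default parameters, and the toy data are all hypothetical and chosen purely for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: pick the threshold minimizing squared error.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbdt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop: each stump is fitted to the current residuals.

    For squared-error loss the residuals equal the negative gradient.
    """
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        predictions = [p + learning_rate * predict_stump(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, learning_rate, stumps


def predict_gbdt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data (made up): a noisy step from about 1 to about 3.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = fit_gbdt(xs, ys)
baseline_mse = mean_squared_error(ys, [sum(ys) / len(ys)] * len(ys))
boosted_mse = mean_squared_error(ys, [predict_gbdt(model, x) for x in xs])
```

On this toy data the boosted ensemble should fit the training points better than the constant mean prediction. Production libraries add much more on top of this loop, which the sketch does not attempt to cover.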