Top 5 Open-Source XGBoost Algorithm Projects to Study in 2023

Mrinal Walia
5 min read · Dec 23, 2022


This article will teach you the top 5 open-source XGBoost Algorithms and Repositories on GitHub.


XGBoost (eXtreme Gradient Boosting) is a popular supervised learning algorithm and an efficient open-source implementation of gradient-boosted decision trees (GBDT). Boosting means sequentially combining a set of weak learners into a single strong learner to reduce training error.

There are many classes of boosting algorithms, but today we will focus on the XGBoost algorithm.
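
To make the boosting idea above concrete, here is a minimal, self-contained sketch (standard library only, not XGBoost itself): each round fits a weak learner (a one-split "stump") to the residuals of the ensemble so far, so the combined model steadily reduces training error.

```python
# Toy gradient boosting for regression: fit stumps to residuals, round by round.

def fit_stump(xs, residuals):
    """Find the split that best predicts the residuals with two constants."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

def boost(xs, ys, rounds=20, lr=0.5):
    """Combine `rounds` weak stumps, each trained on the current residuals."""
    pred = [0.0] * len(xs)
    stumps = []
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, pred)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        pred = [p + lr * stump(x) for p, x in zip(pred, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 0, 0, 1, 1, 1, 1]   # a step function
model = boost(xs, ys)
print(model(1), model(6))        # close to 0 and 1 respectively
```

Real XGBoost adds regularization, second-order gradients, and clever tree-building on top of this basic loop.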

Note: In this article, we will look at some excellent open-source XGBoost projects/repositories that you can use in your own work in 2023. To read more about each, I recommend following the link given along with the project.

Affiliate Courses from DataCamp

Learning isn't just about being more competent at your job; it is so much more than that. DataCamp allows me to learn without limits.

DataCamp lets you take courses on your own time and learn the fundamental skills you need to transition to a successful career.

DataCamp has taught me to pick up new ideas quickly and apply them to real-world problems. While I was in my learning phase, DataCamp hooked me on everything in the courses, from the course content and TA feedback to meetup events and the professors' Twitter feeds.

Here are some of my favorite courses I highly recommend you to learn from whenever it fits your schedule and mood. You can directly apply the concepts and skills learned from these courses to an exciting new project at work or at your university.

Extreme Gradient Boosting with XGBoost

Ensemble Methods in Python

Data Scientist with Python

Data Scientist with R

Machine Learning Scientist with R

Machine Learning Scientist with Python

Machine Learning for Everyone

Data Science for Everyone

Coming back to the topic:

1. xgboost

GitHub: https://github.com/dmlc/xgboost

Official Document: https://xgboost.ai/

Stars: 23.6K

Forks: 8.5K

xgboost is a scalable, portable, and distributed gradient-boosted decision tree library with interfaces for several programming languages, including Python, R, Java, Scala, and C++. It is compatible with big data platforms such as Hadoop, DataFlow, Spark, Flink, and Dask.

This is the official open-source repository of xgboost, developed and maintained by an active community of contributors across the globe.

XGBoost is an optimized implementation of gradient boosting, and it is designed to be:

  • Highly Efficient
  • Highly Flexible
  • Highly Portable
  • Supports Multiple Languages
  • Distributed on Cloud
  • High Performance

Note: Contributions help make the library better for everyone. If you can help in any form, please check out the project's Community Page here.

2. deepdetect

GitHub: https://github.com/jolibrain/deepdetect

Official Document: https://www.deepdetect.com/

Stars: 2.4K

Forks: 556


deepdetect is a deep learning API and server written in C++14. It supports the XGBoost algorithm alongside other popular frameworks such as Caffe, PyTorch, TensorFlow, Dlib, TSNE, NCNN, and TensorRT.

The deepdetect library provides a web platform for training and managing your machine-learning models. The authors' goal is to make deep learning easy and straightforward to work with, and the project's support for backend libraries including XGBoost secures it a spot on this list.


Some cool features of this library are:

  • Easy to set up
  • Ready for different applications
  • Web UI support
  • Fast server
  • Comes with different neural network templates
  • Trains in a few hours and with small datasets
  • Comes with ready-to-use models for multiple tasks
  • Fully open-source ecosystem
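
deepdetect is driven by JSON over HTTP. The sketch below only builds a service-creation payload of the general shape described in the project's documentation; the exact field names and the model repository path are assumptions, and no server call is actually made:

```python
# Build a hypothetical deepdetect service-creation payload selecting the
# xgboost backend. Field names follow the general shape of deepdetect's
# JSON API but should be checked against the official docs.
import json

service = {
    "description": "xgboost classifier",        # free-text description
    "mllib": "xgboost",                         # backend library to use
    "type": "supervised",
    "parameters": {"input": {"connector": "csv"}},
    "model": {"repository": "/path/to/model"},  # hypothetical path
}
payload = json.dumps(service)
print(payload)

# In a real deployment this would be sent to the running server, e.g.:
#   requests.put("http://localhost:8080/services/myxgb", data=payload)
```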

3. AlphaPy

GitHub: https://github.com/ScottfreeLLC/AlphaPy

Official Document: https://alphapy.readthedocs.io/en/latest/

Stars: 827

Forks: 167


AlphaPy is an automated machine-learning framework written in Python. It builds and runs models using the Scikit-learn, Keras, XGBoost, LightGBM, and CatBoost libraries.

AlphaPy allows you to do the following tasks:

  • Generate blended and stacked ensemble models
  • Design models for investigating the markets with MarketFlow
  • Predict sporting events with SportFlow
  • Develop trading systems and portfolios using pyfolio
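
AlphaPy drives its pipelines from YAML configuration files rather than direct code. As a plain illustration of the stacking technique it automates (not AlphaPy's own API), here is a scikit-learn sketch that stacks two base models under a meta-learner:

```python
# Stacked ensemble: base models' predictions feed a logistic-regression
# meta-learner, which learns how to blend them.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("dt", DecisionTreeClassifier(max_depth=3, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # meta-learner blends base predictions
)
stack.fit(X_tr, y_tr)
print(f"stacked accuracy: {stack.score(X_te, y_te):.2f}")
```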

4. XGBoostLSS

GitHub: https://github.com/StatMixedML/XGBoostLSS

Stars: 275

Forks: 36


XGBoostLSS is an extension of the XGBoost framework that predicts the entire conditional distribution of univariate and multivariate responses.

XGBoostLSS has support for the following:

  • Multi-Target Regression
  • Estimation of the Gamma Distribution
  • Full Predictive Distribution via Expectile Regression
  • Automatic Derivation of Gradients & Hessians
  • Pruning During Hyperparameter optimization

The library is written in Python and estimates all distributional parameters simultaneously. Its support for multi-target regression allows a multivariate response and its dependencies to be modeled. To understand the output of XGBoostLSS, you will want to learn SHapley Additive exPlanations (SHAP). You can install the package in Python using the command below:

$ pip install git+https://github.com/StatMixedML/XGBoostLSS.git
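
XGBoostLSS predicts a full conditional distribution rather than a single point estimate. As a rough stand-in for that idea (not XGBoostLSS's own API), scikit-learn's gradient boosting with quantile loss yields conditional quantiles that bracket the response:

```python
# Two quantile-loss boosters give a conditional 80% prediction interval,
# illustrating distributional (rather than point) prediction.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=500)  # noisy target

lo = GradientBoostingRegressor(loss="quantile", alpha=0.1).fit(X, y)  # 10th percentile
hi = GradientBoostingRegressor(loss="quantile", alpha=0.9).fit(X, y)  # 90th percentile

x_new = np.array([[5.0]])
print(f"80% interval at x=5: [{lo.predict(x_new)[0]:.2f}, {hi.predict(x_new)[0]:.2f}]")
```

XGBoostLSS goes further by fitting all parameters of a chosen distribution (e.g. a Gaussian's mean and variance) in one model.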

5. XGBoost.jl

GitHub: https://github.com/dmlc/XGBoost.jl

Official Documentation: https://dmlc.github.io/XGBoost.jl/dev/

Stars: 244

Forks: 113


XGBoost.jl is a Julia interface to the popular XGBoost library. The package is efficient, reportedly more than ten times faster than some existing gradient-boosting packages, and it is extensible, letting users define their own objectives.

The package is the Julia wrapper of the xgboost gradient-boosting library and uses xgboost_jll to ship the prebuilt xgboost binaries.

BONUS

As a bonus, I am adding a link to a curated list of gradient-boosting research papers (up to October 2022) along with their implementations.

Awesome Gradient Boosting Papers: GitHub (835 Stars & 147 Forks)


If you enjoyed reading this article, we likely share similar interests and are (or will be) in similar industries. Let's connect via LinkedIn and GitHub; please do not hesitate to send a contact request!

Subscribe 📧 For Weekly Tech Nuggets! 💻

Written by Mrinal Walia

I'm a Data Scientist with a goal-driven creative mindset and a passion for learning and innovating.
