Battle of ML/DL frameworks: stand-alone vs. distributed platforms

Taken from: https://github.com/szilard/benchm-ml/

Will steep improvements in algorithms, plus falling hardware costs (CPU, memory, disk), make the distributed approach irrelevant?

Pleased indeed to see @h2oai and #XGBoost moving up in @thomaswdinsmore 's 2018 magic quadrant (which joke aside is indeed "based on the tools data scientists actually use" and more relevant than Gartner's). Also pleased to see Spark moving down 😂 https://t.co/d44pILyKRl pic.twitter.com/OqTZ5zTwqa
— Szilard Pafka (@SzilardPafka) February 26, 2018

IMHO, at this time the winner is h2oai, which delivers impressive performance in stand-alone mode and also supports distributed platforms (i.e., running atop Spark via H2O Sparkling Water). I was surprised to learn that Stanford's statistics maestros, Tibshirani and Hastie, are its advisors. Their book is the best statistics book I have ever read.

A nice picture for calibrating which skills we, as data scientists, should let go of and which we should invest in.
