The analytic function of MySQL

I just realized that the newest version of MySQL (8.0) supports Oracle-style “analytic functions”, while its sister project MariaDB got them two years ago.
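For readers who have not met analytic (window) functions yet, here is a minimal sketch of what they do. It is not from the original post: the `sales` table, its columns, and the `running_totals` helper are made up for illustration, and it runs against SQLite (3.25 or newer) through Python's built-in `sqlite3` module as a stand-in for a MySQL server. The `SUM(...) OVER (...)` and `RANK() OVER (...)` syntax shown is standard SQL and should carry over to MySQL 8.0.

```python
import sqlite3


def running_totals(rows):
    """Return (region, amount, running_total, rank_in_region) per sale.

    Hypothetical helper: the table and column names are illustrative only.
    Needs SQLite 3.25+ for window function support.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    query = """
        SELECT region,
               amount,
               -- cumulative sum inside each region, smallest amount first
               SUM(amount) OVER (PARTITION BY region ORDER BY amount)
                   AS running_total,
               -- rank inside each region, largest amount first
               RANK() OVER (PARTITION BY region ORDER BY amount DESC)
                   AS rank_in_region
        FROM sales
        ORDER BY region, amount
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals([("east", 10), ("east", 30), ("west", 20)]):
        print(row)
```

Unlike `GROUP BY`, the window version keeps every input row and simply attaches the aggregate to it, which is what makes these functions handy for running totals and per-group rankings.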

Deep learning inference on an Android phone

The demo that you’ll always see at any AI/IoT expo or exhibition! The difference is that I’m using my Android phone instead of a PC or a development board like a Raspberry Pi. The application can be downloaded from Google Play (not mine!). Its name is Object Detector and Classifier (Try it! Believe me, it’s fun!). It uses MobileNet v2 …

TensorFrames: TensorFlow + Spark

Combining the best data-intensive solution (Apache Spark) with the best compute-intensive approach (TensorFlow on GPUs) results in TensorFrames. The speedup is remarkable. Hopefully, I can get a multi-GPU cluster to play with. See the Spark Summit EU talk by Tim Hunter.

Battle of ML/DL framework on stand-alone vs. distributed platform

Will steep improvements in algorithms plus falling hardware costs (CPU, memory, disk) make the distributed approach irrelevant? IMHO, at this time, the winner is h2oai, which gives impressive performance in stand-alone mode and also supports distributed platforms (i.e., atop Spark using H2O Sparkling Water). I was so surprised that Stanford’s statistics maestro, Tibshirani & …

Dataiku: flexible data science tools

As the previous post showed, the flexibility offered by data science tools greatly reduces performance, i.e., execution speed. Fortunately, Dataiku, a data science tool, provides multiple ways to aggregate big data: 1) using the built-in building blocks; 2) using a custom R script with the built-in I/O blocks; or 3) using an independent custom R …

Google Cloud Speech API

Very often I need to check whether a video contains certain keywords that I am looking for. That is very difficult to do unless I watch the entire video or download the subtitles. However, most videos do not come with subtitles. I then found transcription services on the Internet, but the …

Flexibility vs. Speed

Data science tools such as RapidMiner, Dataiku, and KNIME offer a lot of flexibility and provide easy-to-understand building blocks that abstract data processing functions. They allow data analysts to implement a business case quickly. However, this comes at a price: slower execution due to variable transfer between tasks. Here is the trial. Aggregating 100 …

Why do we need a smart home?

This is why a smart home is needed: to understand your home better and to communicate with pros in their own language. My dialog with the dishwasher company (DC): Me: Hi, I’d like to inform you about my dishwasher’s problem. DC: Sure, please give me your address (by address, they know what type of the …

CPU vs. GPU

Inspired by the benchmark from Matt Dowle (https://h2oai.github.io/db-benchmark/), I ran a comparison against a GPU (details: https://lnkd.in/e7iHg7N). For processing big data, a K20 GPU with 2 GB of memory is slightly better than a 20-core 2.6 GHz Xeon CPU with 125.8 GB of RAM, and even much better in some tests 🙂 Of course, the performance comes at a price. Thanks to OmniSci …

Google Colab vs. Microsoft Azure Notebooks

Although I have known about these services for a while, I only recently paid attention to two “serverless” notebook services in the cloud: Google Colab and Microsoft Azure Notebooks. Here are my short reviews. <<Google Colab>> 1. Only supports Python (currently 3.6.7 and 2.7.15). You can install packages through pip directly from the notebook. No way …