January 25, 2020 Home Lab I like to learn something new every day, whether or not it is related to my PhD research (big data value creation). I have investigated…
January 24, 2020 Benchmark Python’s Dataframe: Pandas vs. Datatable vs. PySpark SQL Setup. Machine: 16-thread Xeon 2.6 GHz, 32 GB RAM, NVMe PCIx16. System: Ubuntu 16.04, Spark 2.4.4, Python 3.7.4, Pandas 0.25.1, Datatable 0.10.1. Data: 100…
June 3, 2019 Apache Hadoop: What is that & how to install and use it? (Part 2) Part 2: How to install a standalone Hadoop Now, we are going to install a standalone Hadoop. The easiest way is to use VM…
June 3, 2019 Apache Hadoop: What is that & how to install and use it? (Part 1) Next: How to install a standalone Hadoop Part 1: Understanding Apache Hadoop as a Big Data Distributed Processing & Storage Cluster In the last…
June 2, 2019 High Performance Computing (HPC) TU Delft High performance computing (HPC): What is it and what is it used for? I am so grateful to have access to the rich resources of…
June 2, 2019 Repository of public datasets For anyone looking for datasets for their project.
June 2, 2019 Standardized patterns for improving the data quality of big data Abstract: Data seldom create value by themselves. They need to be linked and combined from multiple sources, which can often come with variable data…
June 2, 2019 Arista, Linux-based networking devices For years, the networking industry has been dominated by Cisco and its operating system, IOS. IOS, IMHO, is not designed to be customized by the…
June 2, 2019 Container on ARM SBCs Most single-board computers (SBCs) today are powered by ARM. Containerization on SBCs like Raspberry Pi or Orange Pi brings a great deal of flexibility in a…
June 2, 2019 Saving electricity by suspending idle servers Electricity is a (very) expensive resource in Europe. By putting idle servers into sleep/suspend mode, I can save 80% of the power…