📊 Databricks Update: Read Excel Files Natively (No Additional Libraries)

Feature status: Public Preview (Beta) – Databricks Runtime 17.1+. News date: 3 December 2025. Azure Databricks has launched built-in (native) support for reading, parsing, and querying Excel files (.xls and .xlsx). This feature removes the need for the external libraries that have long been a hassle for data engineers. 🚧 Problem Statement: Excel Format Friction in Big Data. Excel files are the standard in …

📊 DATA ANALYTICS DISSECTED: A Complete 20-Step Guide (Concepts + Technical) 🥩📈

Many people think a Data Analyst's job is just opening Excel or making charts in Tableau. DEAD WRONG! 🙅‍♂️ That is only a small part. The real process is a 20-step marathon. If you are only good at visualization but don't understand the business (Step 1) or model evaluation (Step 14), your analysis won't get used. Based on the workflow in the image, let's dissect …

Apache Hadoop: What is that & how to install and use it? (Part 2)

Part 2: How to install a standalone Hadoop. Now, we are going to install a standalone Hadoop. The easiest way is to use a VM sandbox provided by vendors such as Hortonworks/Cloudera and MapR. However, since the sandbox has many components (not only Hadoop, but also HBase, Spark, Hive, Oozie, etc.), it requires substantial resources (4 …

Apache Hadoop: What is that & how to install and use it? (Part 1)

Next: How to install a standalone Hadoop. Part 1: Understanding Apache Hadoop as a Big Data Distributed Processing & Storage Cluster. In the last post, I discussed when we prefer a distributed approach such as Hadoop or Spark over a monolithic one. In this article, I discuss Apache Hadoop in more detail. This …

Standardized patterns for improving the data quality of big data

Abstract: Data seldom create value by themselves. They need to be linked and combined from multiple sources, which can often come with variable data quality. The task of improving data quality is a recurring challenge. In this paper, we use a case study of a large telecom company to develop a generic process pattern model …

Spin up Oracle database in minutes using Docker

Oracle Database is perhaps still one of the most sought-after skills today. As far as I know, compared to rivals such as MySQL and Postgres, its installation requires substantial effort. The application itself also consumes a lot of memory and storage. Today, Oracle is available as a ready-to-use container. In just minutes, an Oracle instance could be …

Blockchain in KLM @Blockchain Expo 2018

Since early 2018, KLM has used blockchain technology in its backend to ease data sharing among parties in the operation & maintenance lifecycle.

IBM Blockchain Demo @IoT Expo 2018

It was interesting to see IBM's demo of how blockchain will change the way people trade coffee through smart contracts among supply chain actors. Some captured slides: https://www.dropbox.com/sh/p1wuu2fvl0ikx63/AAABVkF7ULHReUUW4dMfwZvEa?dl=0

Google Conversation atop Translate API

I have met many Indonesians with great potential, but a lack of English made them less confident in showing their capabilities. I just noticed Android's conversation service atop Google Translate, which could reduce the hassle. Just by connecting an Android device to a Bluetooth earphone, we can watch two people confidently speaking in their mother tongues. This is an example: Waverly …