The Start

During the Lunar New Year in February last year, I created chDB to solve an efficiency problem with machine learning model sample data that I was facing at the time. Of course, compared to everything the creators of ClickHouse have done so far, chDB is just a tiny hack on ClickHouse local.
Running Everywhere

Despite its many imperfections, chDB quickly gained a lot of fans, which surprised me.
Rocket Engine on a Bicycle

Before officially starting the chDB journey, I think it’s best to give a brief introduction to ClickHouse. In recent years, “vectorized engines” have been particularly popular in the OLAP database community. The main reason is that CPUs keep gaining more SIMD instructions, which greatly accelerate aggregation, sort, and join operations over large amounts of data in OLAP scenarios. ClickHouse has made very detailed optimizations in many areas, “vectorization” among them, as can be seen from its optimization work on lz4 and memcpy.
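
To make the "vectorized" idea a bit more concrete, here is a minimal Python sketch of row-at-a-time versus column-at-a-time aggregation. It is not ClickHouse code: the function names, the sample data, and the batch size of 256 are all made up for illustration, and Python itself will not emit SIMD instructions. The point is only the data layout: a column stored as one contiguous typed buffer and processed in batches is the shape of work that a native engine can hand to SIMD instructions.

```python
from array import array

# Row-oriented layout: one tuple per record (id, value).
# Hypothetical sample data, chosen so the sums are exact in floating point.
rows = [(i, float(i % 7)) for i in range(1000)]


def sum_row_at_a_time(records):
    """Aggregate by visiting every record and picking out one field."""
    total = 0.0
    for _id, value in records:  # one step per row, fields interleaved in memory
        total += value
    return total


# Column-oriented layout: the "value" column as one contiguous buffer of doubles.
values = array("d", (value for _id, value in rows))


def sum_column_in_batches(column, batch_size=256):
    """Aggregate one column batch by batch.

    Each batch is a contiguous run of same-typed values, which is the kind
    of input a native engine can process with SIMD instructions.
    """
    total = 0.0
    for start in range(0, len(column), batch_size):
        batch = column[start:start + batch_size]
        total += sum(batch)  # one call handles the whole batch
    return total


row_total = sum_row_at_a_time(rows)
column_total = sum_column_in_batches(values)
print(row_total, column_total)
```

Both functions compute the same total; what differs is how the data is laid out and how much work is done per step, which is where a vectorized engine gets its speed.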
I’m a heavy user of Apache Superset and also a code contributor to it. Running Superset is the only reason I have Docker (still a VM inside?) installed on my MacBook, and I think it is too heavy.
Superset pushes most of the heavy work onto the database side, so I was wondering whether it might be possible to build a Superset.app that makes Superset easier to use on my MacBook.
My technical stack is mainly backend. Some keywords: