Published in Data Engineering

[ClickHouse] Query data directly from CSV without loading it into the DB

One of the magical things about the ClickHouse analytics database is that it can query and analyse data directly from a CSV file.

Create a basic View

First, we have to create a basic view over the file. For that, we need to know the data type of each column.

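-- View over the CSV file; only the definition is stored, not the rows.
-- On clickhouse-server the file must be located under user_files_path.
CREATE VIEW sales_view AS
SELECT * FROM file(
    '/data/sales.csv',
    'CSV',
    '`order_id` Int64, `sales_amt` Float64'
);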

This creates a view in the current database.
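
To check that only a definition was stored, the view can be inspected through ClickHouse metadata. This is a minimal sketch that assumes the view was created in the current database; the exact output depends on the server version.

```sql
-- Show the stored definition: it should contain the SELECT over file(...), not any data.
SHOW CREATE TABLE sales_view;

-- Look the object up in the system catalog; a plain view is expected to report the engine `View`.
SELECT name, engine
FROM system.tables
WHERE database = currentDatabase()
  AND name = 'sales_view';
```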

Query the view

After this, it acts like a normal view. You can query the CSV file through it.

SELECT sum(`sales_amt`) FROM sales_view;
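
Filters and several aggregates can be combined in the same way. The sketch below uses only the two columns declared in the view; the threshold of 100 is an arbitrary value chosen for illustration.

```sql
-- Summary statistics over the CSV rows, restricted to larger sales.
SELECT
    count()          AS orders,
    sum(`sales_amt`) AS total_sales,
    avg(`sales_amt`) AS avg_sale,
    max(`sales_amt`) AS largest_sale
FROM sales_view
WHERE `sales_amt` > 100;
```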

The magic

The magic here is that the data is not loaded into the database storage; it stays only in the CSV file. Since ClickHouse is an analytical database, we can still run complex queries over it with low latency, although every query has to read and parse the file again, so query time grows with the file size.

If the CSV file is updated, the change is reflected in the next query.
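
One way to see this is to count the rows before and after editing the file. The sketch assumes the file is edited outside ClickHouse, and the appended row `1001,25.5` is made up for illustration.

```sql
-- 1. Count the rows currently in the CSV file.
SELECT count() AS row_count FROM sales_view;

-- 2. Outside ClickHouse, append a line such as `1001,25.5` to /data/sales.csv.

-- 3. Run the same query again; the count should be one higher,
--    with no reload or refresh step in between.
SELECT count() AS row_count FROM sales_view;
```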
