The official documentation of the exajload utility only mentions IMPORT from various files. However, it is possible to IMPORT not only from files, but also from standard input. This can be useful when building complex ETL pipelines and applying dynamic transformations before the IMPORT.
- Create table:
```sql
CREATE OR REPLACE TABLE users (
    user_id DECIMAL(18,0),
    user_name VARCHAR(255),
    register_dt DATE,
    last_visit_ts TIMESTAMP,
    is_female BOOLEAN,
    user_rating DECIMAL(10,5),
    user_score DOUBLE,
    status VARCHAR(50)
);
```
- Download the Exasol JDBC driver from the Downloads section and extract it to get the exajload binary.
- Download the file with test data: users.csv (a few sample rows are sketched after this list).
- Import data using the following command:
```bash
cat users.csv | ./exajload \
    -c 'localhost:8563' \
    -u 'SYS' \
    -P 'exasol' \
    -s 'EXASECRETS' \
    -presql 'TRUNCATE TABLE users' \
    -sql 'IMPORT INTO users FROM LOCAL CSV FILE '\''/dev/stdin'\'' ROW SEPARATOR = '\''LF'\'''
```
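For reference, users.csv rows follow the column order of the table above. The sample below is made up for illustration only:

```
1,alice,2018-01-15,2018-03-02 11:22:33,TRUE,4.12345,0.847,ACTIVE
2,bob,2017-11-30,2018-02-28 08:15:00,FALSE,3.50000,0.312,PENDING
```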
You may use the special value /dev/stdin to read from STDIN instead of a normal file.
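Because the data now flows through a pipe, any stream-processing tool can sit between the source and exajload. Here is a minimal sketch of such a dynamic transformation, assuming the same table and credentials as above; the awk step uppercases the eighth column (status) before the IMPORT:

```bash
# Hypothetical on-the-fly transformation: uppercase the status column
# of each CSV row before streaming it into exajload.
cat users.csv \
    | awk -F',' 'BEGIN { OFS="," } { $8 = toupper($8); print }' \
    | ./exajload \
        -c 'localhost:8563' \
        -u 'SYS' \
        -P 'exasol' \
        -s 'EXASECRETS' \
        -presql 'TRUNCATE TABLE users' \
        -sql 'IMPORT INTO users FROM LOCAL CSV FILE '\''/dev/stdin'\'' ROW SEPARATOR = '\''LF'\'''
```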
Compression with pipes
It is also possible to use compression while importing from STDIN. To make it work, we have to add the .gz extension to the FILE value, since compression is detected from the file extension.
You may do it using a simple trick with symlinks:
```bash
ln -s /dev/stdin stdin.gz
```
Now you have a pseudo-file called stdin.gz, which can be used as a valid FILE value for exajload.
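You can check that the symlink resolves where you expect (the owner, date, and other details in the output below are illustrative and will differ on your machine):

```bash
$ ls -l stdin.gz
lrwxrwxrwx 1 user user 10 Jan  1 12:00 stdin.gz -> /dev/stdin
```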
Let's try to compress the data file on the fly and IMPORT it as a gzip-encoded stream:
```bash
gzip -c users.csv | ./exajload \
    -c 'localhost:8563' \
    -u 'SYS' \
    -P 'exasol' \
    -s 'EXASECRETS' \
    -presql 'TRUNCATE TABLE users' \
    -sql 'IMPORT INTO users FROM LOCAL CSV FILE '\''stdin.gz'\'' ROW SEPARATOR = '\''LF'\'''
```
This technique can help you reduce the amount of traffic transferred over the network. It can also save CPU resources: if the source data is already compressed, you can stream it as-is and skip an unnecessary decompression step.
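For example, if the data already arrives as a gzip file (a hypothetical users.csv.gz below), you can pipe it through untouched and let the database handle the decompression:

```bash
# The gzip stream is passed through as-is; nothing is decompressed
# or recompressed on the client side.
cat users.csv.gz | ./exajload \
    -c 'localhost:8563' \
    -u 'SYS' \
    -P 'exasol' \
    -s 'EXASECRETS' \
    -presql 'TRUNCATE TABLE users' \
    -sql 'IMPORT INTO users FROM LOCAL CSV FILE '\''stdin.gz'\'' ROW SEPARATOR = '\''LF'\'''
```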