What is OmniSci?
OmniSci is the world's fastest SQL engine. It helps you find hidden insights beyond the reach of mainstream analytics by letting you query and visually explore large datasets at extreme speed. OmniSci can query billions of rows in milliseconds and ingest data at very high rates by harnessing the massively parallel computing power of both CPUs and GPUs, which makes it well suited to big, high-velocity data.
The OmniSci platform integrates with existing CPU-based legacy analytics systems, providing the speed and scale needed for Big Data processing. It gives analysts and data scientists no-lag data processing, with no need for pre-aggregation or downsampling.
Break down silos with the only open platform that unites analytics, data science, and location intelligence workflows. Leverage native SQL, interactive visual analytics, extensive PyData stack integrations, and a powerful machine learning framework. Interactively query, visualize, and power data science workflows over billions of records with OmniSci.
Note: If the container fails with the error "./bin/initdb: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory", add --gpus=all to the docker run command.
EULA and User Documentation
Sample nvidia-docker run:
docker run --runtime=nvidia \
  -v $HOME/omnisci-docker-storage:/omnisci-storage \
  -p 6273-6280:6273-6280 \
  omnisci/omnisci-os-cuda
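The note about the libcuda.so.1 error suggests adding --gpus=all. A minimal sketch of that variant follows, assuming Docker 19.03 or later with the NVIDIA Container Toolkit installed. The function name start_omnisci and its optional storage-directory argument are hypothetical, introduced only for illustration; the image, volume target, and port range are taken from the sample run.

```shell
# Sketch: the sample run, requesting GPUs with --gpus=all instead of
# --runtime=nvidia. start_omnisci is a hypothetical helper name.
start_omnisci() {
  # Optional first argument: host directory for OmniSci storage.
  # Defaults to the directory used in the sample run.
  storage_dir="${1:-$HOME/omnisci-docker-storage}"

  docker run --gpus=all \
    -v "$storage_dir:/omnisci-storage" \
    -p 6273-6280:6273-6280 \
    omnisci/omnisci-os-cuda
}

# Usage (requires Docker and an NVIDIA GPU on the host):
#   start_omnisci
#   start_omnisci /data/omnisci-storage
```

Wrapping the command in a function keeps the storage path in one place; running the docker run line directly with --gpus=all works the same way.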
The OmniSci Open Source edition does not require a license, and does not include OmniSci Immerse.
Accessing the OmniSci Container
Note the CONTAINER ID of the running OmniSci container, which the following command lists:
docker container ls
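To avoid copying the ID by hand, it can be captured in a shell variable. The sketch below assumes the container was started from the omnisci/omnisci-os-cuda image used in the sample run; the function name omnisci_container_id and the variable CONTAINER_ID are hypothetical. The --filter and --format options are standard docker container ls options.

```shell
# Sketch: print the ID of the first running container created from the
# sample image. Assumes the image name from the sample run command;
# adjust the ancestor filter if your image or tag differs.
omnisci_container_id() {
  docker container ls \
    --filter "ancestor=omnisci/omnisci-os-cuda" \
    --format '{{.ID}}' | head -n 1
}

# Usage (requires a running OmniSci container):
#   CONTAINER_ID="$(omnisci_container_id)"
#   echo "$CONTAINER_ID"
```

If several OmniSci containers are running, only the first one listed is returned, so check the full docker container ls output in that case.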
Importing a Sample Dataset
OmniSci ships with two sample datasets of airline flight information collected in 2008, and one dataset of New York City tree census information collected in 2015.
To install the sample data, run the following command, replacing <container-id> with the CONTAINER ID of your OmniSci container:
docker exec -it <container-id> ./insert_sample_data
When prompted, choose whether to insert dataset 1 (7,000,000 rows), dataset 2 (10,000 rows), or dataset 3 (683,000 rows).
Enter the dataset number to download, or 'q' to quit:
#   Dataset                  Rows   Table Name            File Name
1.  Flights (2008)           7M     flights_2008_7M       flights_2008_7M.tar.gz
2.  Flights (2008)           10k    flights_2008_10k      flights_2008_10k.tar.gz
3.  NYC Tree Census (2015)   683k   nyc_trees_2015_683k   nyc_trees_2015_683k.tar.gz
Connecting to the OmniSci Command Line (OmniSQL)
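Connect to OmniSci Core by entering the following command, replacing <container-id> with the CONTAINER ID of your OmniSci container. When prompted, enter the password (the default is HyperInteractive):
docker exec -it <container-id> /omnisci/bin/omnisql
password: ••••••••••••••••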
Enter a SQL query such as the following. This example uses dataset 2 (flights_2008_10k):
omnisql> SELECT origin_city AS "Origin", dest_city AS "Destination", AVG(airtime) AS "Average Airtime" FROM flights_2008_10k WHERE distance < 175 GROUP BY origin_city, dest_city;
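The same kind of query can also be run without an interactive session by piping the statement into omnisql. This is a minimal sketch, not the documented workflow: run_omnisci_query is a hypothetical helper, and it assumes that omnisql accepts the password through its -p option and that the default password is still in use.

```shell
# Sketch: send one SQL statement to omnisql inside the container.
# run_omnisci_query is a hypothetical helper: $1 is the container ID,
# $2 is the SQL statement (terminated with a semicolon).
run_omnisci_query() {
  container_id="$1"
  sql="$2"

  # -i keeps stdin open so the statement can be piped in; -t is omitted
  # because no terminal is attached. -p supplies the default password
  # (assumption: change it if your password differs).
  printf '%s\n' "$sql" |
    docker exec -i "$container_id" /omnisci/bin/omnisql -p HyperInteractive
}

# Usage (requires a running container with dataset 2 loaded):
#   run_omnisci_query "$CONTAINER_ID" \
#     'SELECT COUNT(*) FROM flights_2008_10k WHERE distance < 175;'
```

Passing a password on the command line exposes it in the process list, so this form is best kept to local experiments with the default credentials.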