OmniSci Open Source

Description
OmniSci (formerly MapD) is the pioneer in GPU-accelerated analytics, redefining speed and scale in big data querying and visualization. The OmniSci platform is used to find insights in data beyond the limits of mainstream analytics tools. Now you can easily build your own instance of the OmniSci Core with OmniSci Open Source edition.
Publisher
OmniSci
Latest Tag
cuda10-4.8.1
Modified
April 4, 2023
Compressed Size
1.26 GB
Multinode Support
No
Multi-Arch Support
No

What is OmniSci?


OmniSci is the world's fastest SQL engine which helps you find hidden insights beyond the reach of mainstream analytics, by letting you query and visually explore large datasets at extreme speed. OmniSci can query up to billions of rows in milliseconds, and is capable of unprecedented ingestion speeds, making it the ideal SQL engine for the era of big, high-velocity data while harnessing the massive parallel computing power of CPUs and GPUs alike.

The OmniSci platform integrates with existing CPU-based legacy analytics systems, bridging them to the speed and scale needed for big data processing. OmniSci gives analysts and data scientists no-lag data processing, with no need for pre-aggregation or downsampling.

Break down silos with the only open platform that unites analytics, data science and location intelligence workflows. Leverage native SQL, interactive visual analytics, extensive PyData stack integrations, and a powerful machine learning framework. Interactively query, visualize, and power data science workflows over billions of records with OmniSci.

Note: If the container fails with the error "./bin/initdb: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory", add --gpus=all to the docker run command.

EULA and User Documentation


  1. EULA
  2. Container Installation Guide
  3. OmniSci Release Notes

Running OmniSci


Sample nvidia-docker run:

docker run --runtime=nvidia \
  -v $HOME/omnisci-docker-storage:/omnisci-storage \
  -p 6273-6280:6273-6280 \
  omnisci/omnisci-os-cuda

The OmniSci Open Source edition does not require a license and does not include OmniSci Immerse.
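
The note earlier on this page suggests --gpus=all when libcuda.so.1 cannot be loaded. A minimal sketch of the same run command with that flag in place of --runtime=nvidia follows. It assumes Docker 19.03 or later with the NVIDIA Container Toolkit installed; the function name run_omnisci_with_gpus is hypothetical and only wraps the command for reuse.

```shell
# Hypothetical wrapper: the sample run command from this page,
# with --gpus=all swapped in for --runtime=nvidia.
run_omnisci_with_gpus() {
  docker run --gpus=all \
    -v "$HOME/omnisci-docker-storage:/omnisci-storage" \
    -p 6273-6280:6273-6280 \
    omnisci/omnisci-os-cuda
}
```

Call run_omnisci_with_gpus from the shell to start the container in the foreground.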

Accessing the OmniSci Container


List the running containers with the following command and note the CONTAINER ID of the OmniSci container:

     docker container ls
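
Rather than copying the ID by hand, it can be captured in a shell variable. The sketch below assumes the container was started from the omnisci/omnisci-os-cuda image used in the sample run command; the helper name omnisci_container_id is hypothetical.

```shell
# Hypothetical helper: print the ID of the first running container
# created from the given image (defaults to the image used on this page).
omnisci_container_id() {
  # --filter ancestor=IMAGE keeps only containers created from IMAGE;
  # --format '{{.ID}}' prints only the short container ID, one per line.
  docker container ls \
    --filter "ancestor=${1:-omnisci/omnisci-os-cuda}" \
    --format '{{.ID}}' | head -n 1
}
```

For example, CONTAINER_ID="$(omnisci_container_id)" stores the ID for later docker exec commands. If more than one matching container is running, only the first one listed is returned.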

Importing a Sample Dataset


OmniSci ships with two sample datasets of airline flight information collected in 2008, and one dataset of New York City tree census information collected in 2015.

To install the sample data, run the following command.

docker exec -it <container-id> ./insert_sample_data

When prompted, choose whether to insert dataset 1 (7,000,000 rows), dataset 2 (10,000 rows), or dataset 3 (683,000 rows). Enter the dataset number to download, or 'q' to quit:

#   Dataset                  Rows   Table Name            File Name
1.  Flights (2008)           7M     flights_2008_7M       flights_2008_7M.tar.gz
2.  Flights (2008)           10k    flights_2008_10k      flights_2008_10k.tar.gz
3.  NYC Tree Census (2015)   683k   nyc_trees_2015_683k   nyc_trees_2015_683k.tar.gz

Connecting to the OmniSci Command Line (OmniSQL)


Connect to OmniSci Core by entering the following command (default password is HyperInteractive):

docker exec -it <container-id> /omnisci/bin/omnisql
password: ••••••••••••••••

Enter a SQL query such as the following. This example uses dataset 2 (flights_2008_10k):

omnisql> SELECT origin_city AS "Origin", dest_city AS "Destination", AVG(airtime) AS
"Average Airtime" FROM flights_2008_10k WHERE distance < 175 GROUP BY origin_city,
dest_city;
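
For scripted use, a query can also be piped to omnisql instead of typed at the prompt. This is a sketch under two assumptions that are not stated on this page: that omnisql accepts the password through a -p option, and that it reads SQL from standard input. The helper name omnisql_query is hypothetical.

```shell
# Hypothetical helper: run one SQL statement non-interactively.
#   $1 = container ID, $2 = SQL text (terminated with a semicolon)
# Assumes omnisql takes the password via -p and reads SQL from stdin.
# HyperInteractive is the default password noted above.
omnisql_query() {
  # -i keeps stdin open for the pipe; -t is omitted because a pipe is not a TTY.
  printf '%s\n' "$2" | docker exec -i "$1" /omnisci/bin/omnisql -p HyperInteractive
}
```

For example, omnisql_query "$CONTAINER_ID" "SELECT COUNT(*) FROM flights_2008_10k;" would count the rows of dataset 2, provided that dataset has been inserted.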

Contact OmniSci


If you have general enquiries or requests, please email sales@omnisci.com. You can also visit the OmniSci community forum to learn more.