March 25, 2021



Natural Language Instead of SQL



SQL is the most popular programming language for communicating with databases. It has been around for decades, and many people are familiar with it. However, just like any other programming language, SQL requires one to adhere to a strict syntax.


With advances in natural language processing, it is now possible to break free from the constraints of syntax. Compare the following:


SELECT sum(order_value), year, product FROM sales_transactions WHERE (order_value BETWEEN 1000 AND 5000) AND category = 'electronics' GROUP BY year, product


Vs


year and product wise order value for order value > 1000 <5000 electronics


This is what Darwin, an NLP-based business intelligence and analytics product, makes possible.


When data analysis gets complex, nested SQL queries become difficult to comprehend. With Darwin, analysis is always sequential, so the flow is much easier to follow. Compare the following:


SELECT min(order_value), year FROM (SELECT sum(order_value) AS order_value, year, product FROM sales_transactions WHERE (order_value BETWEEN 1000 AND 5000) AND category = 'electronics' GROUP BY year, product) table1 GROUP BY year


Vs


Step 1 - year and product wise order value for order value > 1000 <5000 electronics

Step 2 - min order value by year
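To make the comparison concrete, here is a minimal sketch in Python using the standard sqlite3 module. It does not show how Darwin works internally; it only illustrates that the nested query and a two-step, sequential analysis produce the same result. The sample rows and the step1 temporary table are made up for illustration, and the inner sum is given the alias order_value so that the outer query can refer to it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_transactions "
    "(order_value INTEGER, year INTEGER, product TEXT, category TEXT)"
)
# Made-up sample rows, for illustration only.
rows = [
    (1200, 2019, "phone", "electronics"),
    (1800, 2019, "phone", "electronics"),
    (4500, 2019, "laptop", "electronics"),
    (2500, 2020, "phone", "electronics"),
    (3000, 2020, "laptop", "electronics"),
    (1500, 2020, "laptop", "electronics"),
    (6000, 2020, "laptop", "electronics"),  # filtered out: above 5000
    (2000, 2020, "sofa", "furniture"),      # filtered out: other category
]
conn.executemany("INSERT INTO sales_transactions VALUES (?, ?, ?, ?)", rows)

# The nested form: one query wrapped inside another.
nested = conn.execute("""
    SELECT min(order_value), year FROM (
        SELECT sum(order_value) AS order_value, year, product
        FROM sales_transactions
        WHERE (order_value BETWEEN 1000 AND 5000) AND category = 'electronics'
        GROUP BY year, product
    ) table1
    GROUP BY year
    ORDER BY year
""").fetchall()

# Step 1: year and product wise order value, stored as an intermediate result.
conn.execute("""
    CREATE TEMP TABLE step1 AS
    SELECT sum(order_value) AS order_value, year, product
    FROM sales_transactions
    WHERE (order_value BETWEEN 1000 AND 5000) AND category = 'electronics'
    GROUP BY year, product
""")

# Step 2: min order value by year, computed on the output of step 1.
sequential = conn.execute(
    "SELECT min(order_value), year FROM step1 GROUP BY year ORDER BY year"
).fetchall()

print(nested)
print(sequential)
```

Both approaches return the same rows; the sequential version simply names the intermediate result, which is what makes each step readable on its own.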


Darwin also makes it simple to merge and concatenate datasets. This enables one to perform multiple levels of transformation on datasets, all with the ease of natural language text.
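For readers unfamiliar with the two operations, the sketch below shows what they mean on small in-memory datasets in plain Python. The function names concatenate and merge, the datasets, and their columns are hypothetical and chosen for illustration; they are not Darwin's API.

```python
# Hypothetical datasets: each is a list of rows, each row a dict of columns.
sales_2019 = [
    {"product": "phone", "order_value": 3000},
    {"product": "laptop", "order_value": 4500},
]
sales_2020 = [
    {"product": "phone", "order_value": 2500},
]
products = [
    {"product": "phone", "category": "electronics"},
    {"product": "laptop", "category": "electronics"},
]


def concatenate(*datasets):
    """Stack datasets that share the same columns on top of each other."""
    return [dict(row) for dataset in datasets for row in dataset]


def merge(left, right, key):
    """Inner-join two datasets on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Combine each left row with every right row that has the same key value.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


all_sales = concatenate(sales_2019, sales_2020)         # 3 rows
enriched = merge(all_sales, products, key="product")    # adds "category"

for row in enriched:
    print(row)
```

Concatenation adds rows, while merging adds columns by matching rows on a key. Chaining the two, as above, is one example of a multi-level transformation.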


Darwin uses its own framework, rather than SQL, for analysing and transforming datasets. It stores data on disk in a highly compressed, efficient format and loads only the relevant data into memory when required. This enables lightning-fast performance: it takes just 2 seconds to analyse 10 million rows of data.
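One common way to get this kind of behaviour is to store each column separately and compressed, so that a query touches only the columns it needs. The sketch below illustrates that general idea in Python; it is an assumption made for illustration, not a description of Darwin's actual storage format. The file layout, the .col extension, and the helper names write_columns and read_columns are all hypothetical.

```python
import json
import os
import tempfile
import zlib


def write_columns(directory, table):
    """Write each column of the table to its own compressed file."""
    for name, values in table.items():
        payload = zlib.compress(json.dumps(values).encode("utf-8"))
        with open(os.path.join(directory, name + ".col"), "wb") as f:
            f.write(payload)


def read_columns(directory, names):
    """Load and decompress only the requested columns."""
    loaded = {}
    for name in names:
        with open(os.path.join(directory, name + ".col"), "rb") as f:
            loaded[name] = json.loads(zlib.decompress(f.read()).decode("utf-8"))
    return loaded


# Made-up table, stored column by column.
table = {
    "year": [2019, 2019, 2020],
    "product": ["phone", "laptop", "phone"],
    "order_value": [3000, 4500, 2500],
    "category": ["electronics", "electronics", "electronics"],
}

with tempfile.TemporaryDirectory() as directory:
    write_columns(directory, table)
    # "order value by year" needs only two of the four columns.
    needed = read_columns(directory, ["year", "order_value"])

totals = {}
for year, value in zip(needed["year"], needed["order_value"]):
    totals[year] = totals.get(year, 0) + value

print(totals)
```

Because the product and category files are never opened, the memory and I/O cost of the query depends on the columns used rather than on the full width of the dataset.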



To learn more about Darwin, click here.


To try out Darwin, click here.